WO2004090864B1 - Method and apparatus for the encoding and decoding of speech - Google Patents

Method and apparatus for the encoding and decoding of speech

Info

Publication number
WO2004090864B1
Authority
WO
WIPO (PCT)
Prior art keywords
speech
lsfs
pvq
parameters
frames
Prior art date
Application number
PCT/IN2004/000060
Other languages
French (fr)
Other versions
WO2004090864A3 (en)
WO2004090864A2 (en)
Inventor
Preeti Rao
Original Assignee
Indian Inst Technology Bombay
Preeti Rao
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Indian Inst Technology Bombay, Preeti Rao
Publication of WO2004090864A2
Publication of WO2004090864A3
Publication of WO2004090864B1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G10L19/032 Quantisation or dequantisation of spectral components
    • G10L19/038 Vector quantisation, e.g. TwinVQ audio
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/06 Determination or coding of the spectral characteristics, e.g. of the short-term prediction coefficients
    • G10L19/07 Line spectrum pair [LSP] vocoders

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
  • Transmission Systems Not Characterized By The Medium Used For Transmission (AREA)

Abstract

Methods and apparatus for encoding speech for communication to a decoder, which reproduces the speech signal. The speech signal is represented by the parameters of a speech model, and a specific quantisation scheme is used for each parameter, with novel quantisation schemes for the spectral amplitudes. The spectral amplitudes are represented by line spectral frequencies (LSFs) and gain. The LSF vector is split into sub-vectors for quantisation by SN-PVQ and frame-fill interpolation. The low-frequency split vector is quantised by an SN-PVQ scheme, and the high-frequency split vector by SN-PVQ in the even-numbered frames and by frame-fill interpolation in the odd-numbered frames. Optionally, all LSF sub-vectors can be quantised by SN-PVQ. Further, the gain parameters of two frames are jointly quantised. Together, these yield an encoder and decoder system for speech coding with communication-quality output speech at bit rates below 2 kbps.
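
As a rough illustration of the frame-fill step used for the high-frequency sub-vector in odd-numbered frames, the Python sketch below assumes that the skipped sub-vector is approximated by a weighted mix of the quantised sub-vectors of the two neighbouring frames, and that only the index of the best weight is transmitted. The weight table WEIGHTS, the squared-error criterion and the function names are assumptions made for illustration; the source does not specify them.

```python
# Hypothetical table of interpolation weights (2-bit index); not taken from the source.
WEIGHTS = [0.25, 0.5, 0.75, 1.0]


def interpolate(prev_lsf, next_lsf, index):
    """Decoder side: rebuild the skipped sub-vector from its two neighbours."""
    w = WEIGHTS[index]
    return [w * p + (1.0 - w) * n for p, n in zip(prev_lsf, next_lsf)]


def choose_index(target, prev_lsf, next_lsf):
    """Encoder side: pick the weight index whose interpolation is closest to the target."""
    def squared_error(i):
        rec = interpolate(prev_lsf, next_lsf, i)
        return sum((t - r) ** 2 for t, r in zip(target, rec))
    return min(range(len(WEIGHTS)), key=squared_error)


if __name__ == "__main__":
    prev_q = [0.2, 0.3]   # quantised sub-vector of the previous (even) frame
    next_q = [0.4, 0.5]   # quantised sub-vector of the next (even) frame
    target = [0.3, 0.4]   # unquantised sub-vector of the odd frame
    idx = choose_index(target, prev_q, next_q)
    print(idx, interpolate(prev_q, next_q, idx))
```

Under these assumptions the odd-numbered frame costs only the bits of the interpolation index for this sub-vector, which is where the bit-rate saving would come from.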

Claims

AMENDED CLAIMS [received by the International Bureau on 16 December 2004 (16.12.04); original claim 1 amended; remaining claims unchanged (1 page)]
1. A novel method of coding speech signals to achieve communication quality at bit rates less than 2 kbps involving quantisation of spectral parameters of speech signals, as obtained from frame based analysis of speech, by the combination of prediction and interpolation.
2. The method of encoding as claimed in claim 1, comprising the steps of:
    (a) Processing the speech signal to divide it into speech frames each representing a fixed time interval of speech;
    (b) Processing the speech frames to obtain the parameters of a speech model including spectral parameters;
    (c) Representing the spectral parameters by means of LPCs and gain;
    (d) Converting the LPCs to LSFs;
    (e) Quantising and encoding the LSFs by a combination of SN-PVQ and frame-fill interpolation;
    (f) Joint quantisation of the gains of a pair of frames;
    (g) Quantising the remaining parameters of the model.
3. The method of decoding as claimed in claim 1, comprising the steps of:
    (a) Reconstructing the quantised LSFs from the flag and codebook indices using SN-PVQ reconstruction;
    (b) Reconstructing the interpolated LSFs from the interpolation index and the neighbouring frames' reconstructed LSFs;
    (c) Converting the LSFs to LPCs after optionally correcting for stability;
    (d) Reconstructing the gains of two frames from the indices and the gain codebook;
    (e) Reconstructing the remaining model parameters;
    (f) Synthesizing a speech signal from the decoded parameters.
4. The method of encoding and decoding as claimed in Claims 1-3 wherein the vector of LSFs is divided into sub-vectors, each of which is quantised independently, either by SN-PVQ or by a combination of SN-PVQ and frame-fill interpolation.
5. A quantisation method for the split LSF sub-vectors as claimed in Claims 1-4 comprising the steps of:
    (a) Forming the corresponding mean-removed vector;
    (b) Searching the SN codebook for the best matched codevector and associated index based on a weighted Euclidean distance metric;
    (c) Forming the error vector as the difference between the mean-removed vector and its first-order predicted value from the previous quantised frame;
    (d) Searching the PVQ codebook to find the best matched error codevector and associated index based on a weighted Euclidean distance metric;
    (e) Determining the mode that yields the minimum distortion and setting the flag bit accordingly.
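
The mode selection in claim 5 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the codebooks, the weights w, the prediction coefficient rho and all function names are hypothetical, and the first-order prediction is assumed to be a scalar multiple of the previous frame's quantised mean-removed vector.

```python
def weighted_dist(a, b, w):
    """Weighted squared Euclidean distance between two vectors."""
    return sum(wi * (ai - bi) ** 2 for ai, bi, wi in zip(a, b, w))


def nearest(codebook, target, w):
    """Return (index, distortion) of the codevector closest to target."""
    idx = min(range(len(codebook)), key=lambda i: weighted_dist(codebook[i], target, w))
    return idx, weighted_dist(codebook[idx], target, w)


def sn_pvq_encode(lsf, mean, prev_q, sn_cb, pvq_cb, w, rho=0.5):
    """Return (flag, index): flag 0 selects the SN codebook, flag 1 the PVQ codebook."""
    x = [l - m for l, m in zip(lsf, mean)]          # (a) mean-removed vector
    sn_idx, sn_d = nearest(sn_cb, x, w)             # (b) direct (SN) search
    pred = [rho * p for p in prev_q]                # first-order prediction (assumed form)
    err = [xi - pi for xi, pi in zip(x, pred)]      # (c) prediction error vector
    pvq_idx, pvq_d = nearest(pvq_cb, err, w)        # (d) predictive (PVQ) search
    # (e) keep the mode with the smaller distortion
    return (1, pvq_idx) if pvq_d < sn_d else (0, sn_idx)


def sn_pvq_decode(flag, idx, mean, prev_q, sn_cb, pvq_cb, rho=0.5):
    """Return (reconstructed LSFs, quantised mean-removed vector for the next frame)."""
    if flag == 1:
        q = [rho * p + c for p, c in zip(prev_q, pvq_cb[idx])]
    else:
        q = list(sn_cb[idx])
    return [qi + mi for qi, mi in zip(q, mean)], q


if __name__ == "__main__":
    mean = [0.3, 0.6]
    sn_cb = [[-0.1, -0.1], [0.1, 0.1]]      # toy SN codebook
    pvq_cb = [[0.0, 0.0], [0.02, 0.0]]      # toy PVQ (error) codebook
    w = [1.0, 1.0]
    prev_q = [0.04, 0.04]                   # previous frame, quantised, mean-removed
    flag, idx = sn_pvq_encode([0.34, 0.62], mean, prev_q, sn_cb, pvq_cb, w)
    print(flag, idx, sn_pvq_decode(flag, idx, mean, prev_q, sn_cb, pvq_cb)[0])
```

Because the PVQ reconstruction is the prediction plus the chosen error codevector, the distortion of the reconstructed vector equals the distortion measured on the error vector, so the two search distortions can be compared directly in step (e).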
PCT/IN2004/000060 2003-03-12 2004-03-12 Method and apparatus for the encoding and decoding of speech WO2004090864A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
IN273MU2003 2003-03-12
IN273/MUM/2003 2003-03-12

Publications (3)

Publication Number Publication Date
WO2004090864A2 (en) 2004-10-21
WO2004090864A3 (en) 2005-03-24
WO2004090864B1 (en) 2005-05-19

Family

ID=33156203

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IN2004/000060 WO2004090864A2 (en) 2003-03-12 2004-03-12 Method and apparatus for the encoding and decoding of speech

Country Status (1)

Country Link
WO (1) WO2004090864A2 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7743016B2 (en) 2005-10-05 2010-06-22 Lg Electronics Inc. Method and apparatus for data processing and encoding and decoding method, and apparatus therefor
US7752053B2 (en) 2006-01-13 2010-07-06 Lg Electronics Inc. Audio signal processing using pilot based coding
US7774199B2 (en) 2005-10-05 2010-08-10 Lg Electronics Inc. Signal processing using pilot based coding

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7970072B2 (en) 2005-10-13 2011-06-28 Lg Electronics Inc. Method and apparatus for processing a signal
US8199828B2 (en) 2005-10-13 2012-06-12 Lg Electronics Inc. Method of processing a signal and apparatus for processing a signal
EP1946556A4 (en) * 2005-10-13 2009-12-30 Lg Electronics Inc Method and apparatus for signal processing
EP2301021B1 (en) 2008-07-10 2017-06-21 VoiceAge Corporation Device and method for quantizing lpc filters in a super-frame
US8762136B2 (en) 2011-05-03 2014-06-24 Lsi Corporation System and method of speech compression using an inter frame parameter correlation
PT3633675T (en) * 2014-07-28 2021-06-01 Ericsson Telefon Ab L M Pyramid vector quantizer shape search
CN112970063A (en) * 2018-10-29 2021-06-15 杜比国际公司 Method and apparatus for rate quality scalable coding with generative models
CN113808601B (en) * 2021-11-19 2022-02-22 信瑞递(北京)科技有限公司 Method, device and electronic equipment for generating RDSS short message channel voice code
CN115050378A (en) * 2022-05-19 2022-09-13 腾讯科技(深圳)有限公司 Audio coding and decoding method and related product

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6556966B1 (en) * 1998-08-24 2003-04-29 Conexant Systems, Inc. Codebook structure for changeable pulse multimode speech coding
US7315815B1 (en) * 1999-09-22 2008-01-01 Microsoft Corporation LPC-harmonic vocoder with superframe structure

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7743016B2 (en) 2005-10-05 2010-06-22 Lg Electronics Inc. Method and apparatus for data processing and encoding and decoding method, and apparatus therefor
US7756702B2 (en) 2005-10-05 2010-07-13 Lg Electronics Inc. Signal processing using pilot based coding
US7756701B2 (en) 2005-10-05 2010-07-13 Lg Electronics Inc. Audio signal processing using pilot based coding
US7774199B2 (en) 2005-10-05 2010-08-10 Lg Electronics Inc. Signal processing using pilot based coding
US8068569B2 (en) 2005-10-05 2011-11-29 Lg Electronics, Inc. Method and apparatus for signal processing and encoding and decoding
US7752053B2 (en) 2006-01-13 2010-07-06 Lg Electronics Inc. Audio signal processing using pilot based coding
US7865369B2 (en) 2006-01-13 2011-01-04 Lg Electronics Inc. Method and apparatus for signal processing and encoding and decoding method, and apparatus therefor

Also Published As

Publication number Publication date
WO2004090864A3 (en) 2005-03-24
WO2004090864A2 (en) 2004-10-21

Similar Documents

Publication Publication Date Title
US11282530B2 (en) Methods, encoder and decoder for linear predictive encoding and decoding of sound signals upon transition between frames having different sampling rates
JP6173288B2 (en) Multi-mode audio codec and CELP coding adapted thereto
CA2179228C (en) Method and apparatus for reproducing speech signals and method for transmitting same
US6470313B1 (en) Speech coding
US5873059A (en) Method and apparatus for decoding and changing the pitch of an encoded speech signal
US5018200A (en) Communication system capable of improving a speech quality by classifying speech signals
US20050027517A1 (en) Transcoding method and system between celp-based speech codes
JP2002541499A (en) CELP code conversion
US6847929B2 (en) Algebraic codebook system and method
RU2015147276A (en) SOUND ENCODING DEVICE AND DECODING DEVICE
WO2004090864B1 (en) Method and apparatus for the encoding and decoding of speech
US6687667B1 (en) Method for quantizing speech coder parameters
EP1597721B1 (en) 600 bps mixed excitation linear prediction transcoding
JPH0934499A (en) Sound encoding communication system
JPH08234795A (en) Voice encoding device
KR100341398B1 (en) Codebook searching method for CELP type vocoder
JPH08129400A (en) Voice coding system
CN1327410C (en) Method and apparatus for transcoding between different speech encoding/decoding systems and recording medium
JP3296411B2 (en) Voice encoding method and decoding method
US20130191134A1 (en) Method and apparatus for decoding an audio signal using a shaping function
JPH08202398A (en) Voice coding device
JPH0458299A (en) Sound encoding device
JPH09269798A (en) Voice coding method and voice decoding method
JPH01255900A (en) Sound encoding system
JPH0572780B2 (en)

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): BW GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 EP: the EPO has been informed by WIPO that EP was designated in this application
B Later publication of amended claims

Effective date: 20041216

122 EP: PCT application non-entry in European phase