CN102622353B - Fixed audio retrieval method - Google Patents

Publication number
CN102622353B
Authority
CN
China
Prior art keywords
audio data
section
detection section
audio
inquiry
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN 201110028979
Other languages
Chinese (zh)
Other versions
CN102622353A (en)
Inventor
刘赵杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
TVMining Beijing Media Technology Co Ltd
Original Assignee
TVMining Beijing Media Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by TVMining Beijing Media Technology Co Ltd
Priority to CN 201110028979
Publication of CN102622353A
Application granted
Publication of CN102622353B
Expired - Fee Related

Abstract

The invention discloses a fixed audio retrieval method comprising the following steps: when an audio data retrieval database is established, the features of the audio data are first extracted per detection segment to establish an index table; a secondary index is established for the audio sections with relatively high information content among the fingerprint segments of the audio data; in the retrieval stage, the target audio data to be retrieved is first segmented according to its category; a quick query is performed on the audio data section with relatively high information content to obtain possible candidate positions; and a fine query is performed near the candidate positions using the target audio data. In the technical scheme of the invention, the audio database is indexed with high quality and queries use a hierarchical mode combining coarse and fine search, so that computational complexity can be markedly reduced and query efficiency improved.

Description

A fixed audio retrieval method
Technical field
The present invention relates to the field of multimedia technology, and in particular to a fixed audio retrieval method.
Background
With the development of the information age, multimedia documents are growing to a massive scale. When people browse and try to understand this content, audio, as an important part of multimedia data, provides important clues for perception. To obtain content of interest from these data, information extraction and retrieval are required, and fixed audio retrieval is one such practical technique. Fixed audio retrieval means detecting and locating, in the audio to be searched, the audio fragments that share a source with a given query audio; it is one of the basic problems in multimedia retrieval. Fixed audio detection draws on pattern recognition, audio signal processing, speech processing and other techniques. It has broad application prospects, including the retrieval and location of programs, music and advertisements, copyright protection, the evaluation of audio compression quality, and the decoding and monitoring of certain audio signals of military use. As the technology matures and computer hardware develops, it can be expected that this technology will soon enter everyday life and change how people study, work and entertain themselves, producing considerable economic and social benefit.
In the audio retrieval field, retrieval systems based on audio fingerprints are commonly used. Such a system uses signal processing to convert the audio signal in each fixed time interval into an audio fingerprint of a fixed number of bytes, thereby converting the audio data into audio fingerprint data. The system then builds an index table over all the audio fingerprint data, which enables fast retrieval of the audio data.
When there is relatively little audio data, an audio fingerprint retrieval system can load all the fingerprint data into memory and, once it has been indexed, retrieve quickly and conveniently. In practice, however, the amount of audio data is very large and keeps growing. When a fixed audio retrieval system has many query templates, or the query templates are long, the computational complexity becomes high and efficiency drops sharply, which is even more pronounced with a massive query database. Because the characteristics of the data are not considered when the retrieval database is built, the database itself becomes very large; and because the characteristics of the retrieval target are not considered either, retrieval takes a long time when the target is long.
Summary of the invention
The object of the present invention is to propose a fixed audio retrieval method that can greatly reduce computational complexity and improve the efficiency of audio data queries.
To achieve this object, the present invention adopts the following technical scheme:
A fixed audio retrieval method comprises the following steps:
A. segmenting the audio data at silent segments to form non-silent audio data detection segments;
B. performing harmonic detection on the audio data detection segments and classifying them to form an audio data fingerprint segment category index;
C. dividing each audio data detection segment into fixed-length audio data fingerprint segments, and marking and classifying the audio data fingerprint segments according to their information content to form an audio data fingerprint segment index;
D. extracting audio data fingerprint features from each audio data fingerprint segment to build an audio data fingerprint index;
E. segmenting the audio data to be retrieved at silent segments to form non-silent to-be-retrieved audio data detection segments, and choosing from them the longest to-be-retrieved audio data detection segment that is no shorter than a given duration as the query audio data detection segment;
F. performing harmonic detection on the query audio data detection segment to determine its category, and using the audio data fingerprint segment category index to find the audio data detection segments corresponding to the query audio data detection segment;
G. dividing the query audio data detection segment into fixed-length query audio data fingerprint segments, evaluating the information content of the query audio data fingerprint segments one by one, and choosing the longest run of consecutive query audio data fingerprint segments whose information content exceeds a preset threshold as the query audio data section;
H. within the corresponding audio data detection segments, using the audio data fingerprint segment index to obtain the candidate positions of the query audio data section in the corresponding audio data detection segments;
I. using the audio data fingerprint index to match the query audio data section against the candidate positions in the corresponding audio data detection segments, obtaining the audio retrieval result.
In step B, audio data detection segments that contain harmonic structure are classified as speech segments or music segments, and audio data detection segments that do not contain harmonic structure are classified as noise segments or invalid segments.
In step F, a query audio data detection segment that contains harmonic structure is classified as a speech segment or a music segment, and one that does not contain harmonic structure is classified as a noise segment or an invalid segment.
In step A, whether the current segment is a silent segment or an effective sound segment is judged from the ratio of the energy of the current segment of the audio data to the total energy.
In step E, whether the current segment is a silent segment or an effective sound segment is judged from the ratio of the energy of the current segment of the audio data to be retrieved to the total energy.
With the technical scheme of the present invention, the audio database is indexed with high quality and queries use a hierarchical mode that combines coarse and fine search, which can greatly reduce computational complexity and improve query efficiency.
Brief description of the drawings
Fig. 1 is a flowchart of fixed audio retrieval in a specific embodiment of the present invention.
Detailed description
The technical scheme of the present invention is further described below with reference to the accompanying drawing and by way of an embodiment.
The main idea of the technical scheme is based on audio data fingerprint retrieval. The audio data is first preprocessed: it is classified by detection segment into, for example, music, speech, silence and other sounds. Each audio data detection segment is then given a simple classification by information content over fixed time intervals. When the audio retrieval database is built, an index table is first built from the features of the audio data extracted per detection segment, and a secondary index is then built for the audio sections with higher information content among the audio data fingerprint segments. In the retrieval stage, the target audio data to be retrieved is first segmented according to its category; a fast query on the audio data section with higher information content yields possible candidate positions; and a fine query is then run with the target audio data near those candidate positions.
As shown in Fig. 1, the fixed audio retrieval flow comprises the following steps:
The first stage builds the audio database, that is, it converts a large-capacity audio library into a multi-level indexed audio fingerprint database.
Step 101: judge from the ratio of the energy of the current segment of the audio data to the total energy whether the segment is a silent segment or an effective sound segment, then segment the audio data at the silent segments to form non-silent audio data detection segments.
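The energy-ratio test of Step 101 can be illustrated with a minimal sketch. The patent gives no frame length or threshold, so `frame_len` and `ratio_threshold` below are assumed, illustrative parameters, and the function name `split_on_silence` is hypothetical.

```python
from typing import List, Tuple


def split_on_silence(samples: List[float], frame_len: int = 4,
                     ratio_threshold: float = 0.01) -> List[Tuple[int, int]]:
    """Return (start, end) sample ranges of the non-silent detection segments.

    A frame counts as silent when its energy divided by the total energy
    falls below ratio_threshold (an assumed value, not from the patent).
    """
    total_energy = sum(s * s for s in samples)
    if total_energy == 0:
        return []  # the whole signal is silent
    segments = []
    start = None
    for pos in range(0, len(samples), frame_len):
        frame_energy = sum(s * s for s in samples[pos:pos + frame_len])
        is_silent = frame_energy / total_energy < ratio_threshold
        if not is_silent and start is None:
            start = pos                      # a detection segment begins
        elif is_silent and start is not None:
            segments.append((start, pos))    # silence closes the segment
            start = None
    if start is not None:
        segments.append((start, len(samples)))
    return segments


# Toy signal: silence, a loud burst, silence, a quieter burst, silence.
audio = [0.0] * 8 + [1.0, -1.0] * 4 + [0.0] * 4 + [0.5, -0.5] * 6 + [0.0] * 4
print(split_on_silence(audio))  # [(8, 16), (20, 32)]
```

Because the ratio is taken against the total energy, the threshold adapts to the overall loudness of the recording, which is one plausible reason for preferring a ratio over an absolute energy threshold.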
Step 102: perform harmonic detection on the audio data detection segments and classify them to form the audio data fingerprint segment category index. Audio data detection segments that contain harmonic structure are classified as speech segments or music segments; audio data detection segments that do not contain harmonic structure are classified as noise segments or invalid segments.
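The patent does not say how harmonic structure is detected in Step 102. As one possible stand-in, the sketch below uses the peak of the normalized autocorrelation, a common periodicity measure; the lag range, the 0.5 threshold and the function names are all assumptions made for illustration.

```python
import math
import random
from typing import List


def harmonicity(samples: List[float], min_lag: int, max_lag: int) -> float:
    """Largest autocorrelation value, normalized by energy, over the lag range."""
    energy = sum(s * s for s in samples)
    if energy == 0:
        return 0.0
    best = 0.0
    for lag in range(min_lag, max_lag + 1):
        acc = sum(samples[n] * samples[n + lag]
                  for n in range(len(samples) - lag))
        best = max(best, acc / energy)
    return best


def classify_segment(samples: List[float], min_lag: int = 20,
                     max_lag: int = 200, threshold: float = 0.5) -> str:
    """Label a detection segment by whether it looks periodic (harmonic)."""
    if harmonicity(samples, min_lag, max_lag) >= threshold:
        return "speech_or_music"
    return "noise_or_invalid"


rate = 8000  # assumed sample rate in Hz
tone = [math.sin(2 * math.pi * 200 * n / rate) for n in range(1024)]
rng = random.Random(0)
noise = [rng.uniform(-1.0, 1.0) for _ in range(1024)]

print(classify_segment(tone))   # periodic signal
print(classify_segment(noise))  # aperiodic signal
```

The resulting label can serve as the key of the category index, so that a query classified as speech or music is only compared against database segments with the same label.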
Step 103: divide each audio data detection segment into fixed-length audio data fingerprint segments, and mark and classify the audio data fingerprint segments according to their information content to form the audio data fingerprint segment index. That is, the information content of the fixed-length audio data fingerprint segments is evaluated one by one, and the segments with higher information content are marked.
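The marking in Step 103 can be sketched as follows. The patent does not define how information content is measured; this example assumes the Shannon entropy of quantized sample amplitudes, and the segment length, bin count and threshold are illustrative values only.

```python
import math
from collections import Counter
from typing import Dict, List


def entropy_bits(samples: List[float], levels: int = 8) -> float:
    """Shannon entropy of amplitudes in [-1, 1] quantized into `levels` bins."""
    bins = Counter(min(levels - 1, int((s + 1.0) / 2.0 * levels))
                   for s in samples)
    n = len(samples)
    return -sum(c / n * math.log2(c / n) for c in bins.values())


def build_segment_index(detection_segment: List[float], seg_len: int = 8,
                        threshold: float = 1.5) -> Dict[int, bool]:
    """Map fingerprint-segment number to True when marked high-information."""
    index = {}
    for i in range(len(detection_segment) // seg_len):
        chunk = detection_segment[i * seg_len:(i + 1) * seg_len]
        index[i] = entropy_bits(chunk) > threshold
    return index


flat = [0.1] * 8                                        # constant, little information
varied = [-0.9, -0.6, -0.3, -0.1, 0.1, 0.3, 0.6, 0.9]   # spread over all bins
print(build_segment_index(flat + varied))  # {0: False, 1: True}
```

Only the segments marked `True` would enter the secondary index, which keeps that index much smaller than the full fingerprint index.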
Step 104: extract audio data fingerprint features from each audio data fingerprint segment and build the audio data fingerprint index.
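Step 104 leaves the fingerprint feature unspecified. The sketch below assumes a toy fingerprint (one bit per pair of adjacent sub-blocks, set when the energy rises) and an inverted index from fingerprint value to segment positions; both choices and all names are hypothetical and only illustrate the idea of indexing fingerprints.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def fingerprint(segment: Sequence[float], n_blocks: int = 5) -> int:
    """Pack n_blocks - 1 bits; a bit is 1 when energy rises between sub-blocks."""
    size = len(segment) // n_blocks
    energies = [sum(s * s for s in segment[b * size:(b + 1) * size])
                for b in range(n_blocks)]
    bits = 0
    for prev, cur in zip(energies, energies[1:]):
        bits = (bits << 1) | (1 if cur > prev else 0)
    return bits


def build_fingerprint_index(
        segments: List[Sequence[float]]) -> Dict[int, List[int]]:
    """Inverted index: fingerprint value -> positions of segments having it."""
    index = defaultdict(list)
    for position, seg in enumerate(segments):
        index[fingerprint(seg)].append(position)
    return dict(index)


rising = [0.1, 0.2, 0.3, 0.4, 0.5]
falling = [0.5, 0.4, 0.3, 0.2, 0.1]
print(build_fingerprint_index([rising, falling, rising]))
# {15: [0, 2], 0: [1]}
```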
The second stage is the audio retrieval process: matching retrieval is performed on the input audio data to be retrieved, and the audio data the user needs is obtained from the audio database.
Step 105: judge from the ratio of the energy of the current segment of the audio data to be retrieved to the total energy whether the segment is a silent segment or an effective sound segment, then segment the audio data to be retrieved at the silent segments to form non-silent to-be-retrieved audio data detection segments, and choose from them the longest to-be-retrieved audio data detection segment that is no shorter than a given duration as the query audio data detection segment.
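The selection rule at the end of Step 105 is simple enough to show directly. In this sketch the segments are `(start, end)` sample ranges and `min_len` stands for the minimum duration, whose value the patent does not give; the function name is hypothetical.

```python
from typing import List, Optional, Tuple


def choose_query_segment(segments: List[Tuple[int, int]],
                         min_len: int) -> Optional[Tuple[int, int]]:
    """Longest (start, end) segment whose length is at least min_len, else None."""
    eligible = [seg for seg in segments if seg[1] - seg[0] >= min_len]
    if not eligible:
        return None  # no segment is long enough to query with
    return max(eligible, key=lambda seg: seg[1] - seg[0])


print(choose_query_segment([(8, 16), (20, 32), (40, 44)], min_len=6))  # (20, 32)
```

Returning `None` when nothing qualifies leaves the fallback policy (for example, rejecting the query) to the caller, since the patent does not describe that case.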
Step 106: perform harmonic detection on the query audio data detection segment to determine its category. A query audio data detection segment that contains harmonic structure is classified as a speech segment or a music segment; one that does not contain harmonic structure is classified as a noise segment or an invalid segment. The audio data fingerprint segment category index is then used to find the audio data detection segments corresponding to the query audio data detection segment.
Step 107: divide the query audio data detection segment into fixed-length query audio data fingerprint segments, evaluate the information content of the query audio data fingerprint segments one by one, and choose the longest run of consecutive query audio data fingerprint segments whose information content exceeds the preset threshold as the query audio data section.
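Choosing the longest run of consecutive fingerprint segments above the threshold, as Step 107 describes, can be sketched as a single pass over the per-segment scores. The scores themselves are assumed to come from whatever information measure is in use; the values below are made up for the example.

```python
from typing import List, Tuple


def longest_informative_run(info: List[float],
                            threshold: float) -> Tuple[int, int]:
    """(start, end) indices, end exclusive, of the longest run above threshold.

    Returns (0, 0) when no segment exceeds the threshold.
    """
    best = (0, 0)
    run_start = None
    # A trailing sentinel below any threshold closes a run that reaches the end.
    for i, value in enumerate(info + [float("-inf")]):
        if value > threshold:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            if i - run_start > best[1] - best[0]:
                best = (run_start, i)
            run_start = None
    return best


scores = [0.2, 2.1, 2.4, 0.3, 2.2, 2.8, 2.5, 0.1]
print(longest_informative_run(scores, threshold=1.0))  # (4, 7)
```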
Step 108: within the corresponding audio data detection segments, use the audio data fingerprint segment index to obtain the candidate positions of the query audio data section in the corresponding audio data detection segments. A relatively loose threshold is generally used here so that the true result is included among the candidates as far as possible.
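The coarse query of Step 108 might look like the following sketch. It assumes that fingerprints are integers compared by Hamming distance and that the secondary index maps positions of high-information segments to their fingerprints; neither detail is stated in the patent. The linear scan is for clarity only and is not how a large index would be searched.

```python
from typing import Dict, List


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


def candidate_positions(query_fp: int, secondary_index: Dict[int, int],
                        loose_max_bits: int) -> List[int]:
    """Positions whose indexed fingerprint is within loose_max_bits of query_fp."""
    return sorted(pos for pos, fp in secondary_index.items()
                  if hamming(query_fp, fp) <= loose_max_bits)


# Hypothetical secondary index: position -> fingerprint of a marked segment.
secondary_index = {3: 0b1111, 10: 0b1101, 17: 0b0000, 25: 0b0111}
print(candidate_positions(0b1111, secondary_index, loose_max_bits=1))
# [3, 10, 25]
```

A looser `loose_max_bits` admits more candidates, trading extra work in the fine-matching stage for a lower chance of missing the true position.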
Step 109: using the audio data fingerprint index, match the query audio data section against the candidate positions in the corresponding audio data detection segments to obtain the audio retrieval result.
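The fine matching of Step 109 can be sketched as a bit error rate comparison at each candidate position. The acceptance threshold `max_error_rate` and the integer fingerprint representation are assumptions for illustration; the patent does not specify the matching criterion.

```python
from typing import List, Optional, Tuple


def bit_errors(a: int, b: int) -> int:
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


def fine_match(query: List[int], database: List[int], candidates: List[int],
               bits_per_fp: int,
               max_error_rate: float) -> Optional[Tuple[int, float]]:
    """Best (position, bit error rate) among the candidates, or None."""
    best = None
    for pos in candidates:
        window = database[pos:pos + len(query)]
        if len(window) < len(query):
            continue  # candidate is too close to the end of the database
        errors = sum(bit_errors(q, d) for q, d in zip(query, window))
        rate = errors / (len(query) * bits_per_fp)
        if rate <= max_error_rate and (best is None or rate < best[1]):
            best = (pos, rate)
    return best


database = [0b0000, 0b1111, 0b1010, 0b0110, 0b1111, 0b1000, 0b0110]
query = [0b1111, 0b1010, 0b0110]
print(fine_match(query, database, candidates=[1, 4],
                 bits_per_fp=4, max_error_rate=0.2))  # (1, 0.0)
```

Only the candidate positions are examined, so the cost of this stage depends on the number of candidates rather than on the size of the database.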
The above is only a preferred embodiment of the present invention, but the scope of protection of the present invention is not limited to it. Any variation or replacement that a person familiar with this technology can readily conceive within the technical scope disclosed by the present invention shall fall within the scope of protection of the present invention. Therefore, the scope of protection of the present invention shall be determined by the scope of protection of the claims.

Claims (5)

1. A fixed audio retrieval method, characterized in that it comprises the following steps:
A. segmenting the audio data at silent segments to form non-silent audio data detection segments;
B. performing harmonic detection on the audio data detection segments and classifying them to form an audio data fingerprint segment category index;
C. converting each audio data detection segment and dividing it into fixed-length audio data fingerprint segments, and marking and classifying the audio data fingerprint segments according to their information content to form an audio data fingerprint segment index;
D. extracting audio data fingerprint features from each audio data fingerprint segment to build an audio data fingerprint index;
E. segmenting the audio data to be retrieved at silent segments to form non-silent to-be-retrieved audio data detection segments, and choosing from them the longest to-be-retrieved audio data detection segment that is no shorter than a given duration as the query audio data detection segment;
F. performing harmonic detection on the query audio data detection segment to determine its category, and using the audio data fingerprint segment category index to find the audio data detection segments corresponding to the query audio data detection segment;
G. dividing the query audio data detection segment into fixed-length query audio data fingerprint segments, evaluating the information content of the query audio data fingerprint segments one by one, and choosing the longest run of consecutive query audio data fingerprint segments whose information content exceeds a preset threshold as the query audio data section;
H. within the corresponding audio data detection segments, using the audio data fingerprint segment index to obtain the candidate positions of the query audio data section in the corresponding audio data detection segments;
I. using the audio data fingerprint index to match the query audio data section against the candidate positions in the corresponding audio data detection segments, obtaining the audio retrieval result.
2. The fixed audio retrieval method according to claim 1, characterized in that, in step B, audio data detection segments that contain harmonic structure are classified as speech segments or music segments, and audio data detection segments that do not contain harmonic structure are classified as noise segments or invalid segments.
3. The fixed audio retrieval method according to claim 1, characterized in that, in step F, a query audio data detection segment that contains harmonic structure is classified as a speech segment or a music segment, and a query audio data detection segment that does not contain harmonic structure is classified as a noise segment or an invalid segment.
4. The fixed audio retrieval method according to claim 1, characterized in that, in step A, whether the current segment is a silent segment or an effective sound segment is judged from the ratio of the energy of the current segment of the audio data to the total energy, and, where silent segments are found, the audio data is segmented at the silent segments to form non-silent audio data detection segments.
5. The fixed audio retrieval method according to claim 1, characterized in that, in step E, whether the current segment is a silent segment or an effective sound segment is judged from the ratio of the energy of the current segment of the audio data to be retrieved to the total energy, and, where silent segments are found, the audio data to be retrieved is segmented at the silent segments to form non-silent to-be-retrieved audio data detection segments, from which the longest to-be-retrieved audio data detection segment that is no shorter than a given duration is chosen as the query audio data detection segment.
CN 201110028979 2011-01-27 2011-01-27 Fixed audio retrieval method Expired - Fee Related CN102622353B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 201110028979 CN102622353B (en) 2011-01-27 2011-01-27 Fixed audio retrieval method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN 201110028979 CN102622353B (en) 2011-01-27 2011-01-27 Fixed audio retrieval method

Publications (2)

Publication Number Publication Date
CN102622353A CN102622353A (en) 2012-08-01
CN102622353B CN102622353B (en) 2013-10-16

Family

ID=46562276

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 201110028979 Expired - Fee Related CN102622353B (en) 2011-01-27 2011-01-27 Fixed audio retrieval method

Country Status (1)

Country Link
CN (1) CN102622353B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103021440B (en) * 2012-11-22 2015-04-22 腾讯科技(深圳)有限公司 Method and system for tracking audio streaming media
CN105828179A (en) * 2015-06-24 2016-08-03 维沃移动通信有限公司 Video positioning method and device
CN105142018A (en) * 2015-08-12 2015-12-09 深圳Tcl数字技术有限公司 Programme identification method and programme identification device based on audio fingerprints
CN106571150B (en) * 2015-10-12 2021-04-16 阿里巴巴集团控股有限公司 Method and system for recognizing human voice in music
CN110913242B (en) * 2018-09-18 2021-12-10 阿基米德(上海)传媒有限公司 Automatic generation method of broadcast audio label
CN112883283A (en) * 2019-11-22 2021-06-01 拉扎斯网络科技(上海)有限公司 Information processing method, information processing device, electronic equipment and computer readable storage medium

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1261181A (en) * 1999-01-19 2000-07-26 国际商业机器公司 Automatic system and method for analysing content of audio signals
US7013301B2 (en) * 2003-09-23 2006-03-14 Predixis Corporation Audio fingerprinting system and method
EP2168061A1 (en) * 2007-06-06 2010-03-31 Dolby Laboratories Licensing Corporation Improving audio/video fingerprint search accuracy using multiple search combining
CN101673266A (en) * 2008-09-12 2010-03-17 未序网络科技(上海)有限公司 Method for searching audio and video contents
CN101853262A (en) * 2009-12-07 2010-10-06 清华大学 Voice frequency fingerprint rapid searching method based on cross entropy
CN101777075A (en) * 2010-02-05 2010-07-14 上海全土豆网络科技有限公司 Method for searching parallel audio fingerprint

Also Published As

Publication number Publication date
CN102622353A (en) 2012-08-01

Similar Documents

Publication Publication Date Title
CN102622353B (en) Fixed audio retrieval method
CN103440313B (en) music retrieval system based on audio fingerprint feature
CN107293307B (en) Audio detection method and device
CN103021440B (en) Method and system for tracking audio streaming media
WO2005031600A3 (en) Computer aided document retrieval
CN108170650B (en) Text comparison method and text comparison device
CN1822000A (en) Method for automatic detecting news event
CN109145180B (en) Enterprise hot event mining method based on incremental clustering
CN101719167A (en) Interactive movie searching method
CN113051442A (en) Time series data processing method, device and computer readable storage medium
CN104182465A (en) Network-based big data processing method
CN101594527B (en) Two-stage method for detecting templates in audio and video streams with high accuracy
CN102937994A (en) Similar document query method based on stop words
CN104951553A (en) Content collecting and data mining platform accurate in data processing and implementation method thereof
CN102375863A (en) Method and device for keyword extraction in geographic information field
CN110767248B (en) Anti-modulation interference audio fingerprint extraction method
CN110674243A (en) Corpus index construction method based on dynamic K-means algorithm
Xiao et al. Fast Hamming Space Search for Audio Fingerprinting Systems.
CN103294696A (en) Audio and video content retrieval method and system
CN105244024A (en) Voice recognition method and device
CN102253993B (en) Vocabulary tree-based audio-clip retrieving algorithm
CN103870466A (en) Automatic extracting method for audio examples
CN110597982A (en) Short text topic clustering algorithm based on word co-occurrence network
CN111159996B (en) Short text set similarity comparison method and system based on text fingerprint algorithm
CN111723297B (en) Dual-semantic similarity judging method for grid society situation research and judgment

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: Fixed audio retrieval method

Effective date of registration: 20140402

Granted publication date: 20131016

Pledgee: Zhongguancun Beijing technology financing Company limited by guarantee

Pledgor: TVMining (Beijing) Media Technology Co., Ltd.

Registration number: 2014990000223

PLDC Enforcement, change and cancellation of contracts on pledge of patent right or utility model
PC01 Cancellation of the registration of the contract for pledge of patent right

Date of cancellation: 20151126

Granted publication date: 20131016

Pledgee: Zhongguancun Beijing technology financing Company limited by guarantee

Pledgor: TVMining (Beijing) Media Technology Co., Ltd.

Registration number: 2014990000223

PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: Fixed audio retrieval method

Effective date of registration: 20151130

Granted publication date: 20131016

Pledgee: Zhongguancun Beijing technology financing Company limited by guarantee

Pledgor: TVMining (Beijing) Media Technology Co., Ltd.

Registration number: 2015990001068

PLDC Enforcement, change and cancellation of contracts on pledge of patent right or utility model
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20131016

Termination date: 20210127