
CN103870476A - Retrieval method and device - Google Patents

Info

Publication number
CN103870476A
Authority
CN
China
Prior art keywords
retrieval
result
audio
cluster
characteristic information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201210535176.8A
Other languages
Chinese (zh)
Inventor
刘锋
朱中的
王宽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BEIJING YINZHIBANG CULTURE TECHNOLOGY Co.,Ltd.
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201210535176.8A
Publication of CN103870476A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 - Details of database functions independent of the retrieved data types
    • G06F16/95 - Retrieval from the web
    • G06F16/951 - Indexing; Web crawling techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the invention provide a retrieval method and a retrieval device. At least two retrieval results obtained according to a retrieval keyword are clustered according to their content feature information, and the clustered retrieval results are then sent to a client. Because the clustering uses the content feature information of the retrieval results, it avoids the prior-art problem that clustering retrieval results by their subject names is not accurate enough, and thus improves the reliability of retrieval.

Description

Retrieval method and device
[technical field]
The present invention relates to retrieval technology, and in particular to a retrieval method and device.
[background]
With the development of communication technology, terminals integrate more and more functions, so the function list of a terminal system includes more and more corresponding applications. Some applications, for example Baidu Music, involve object retrieval operations. During retrieval, a large number of similar retrieval results often appear. In the prior art, the subject name of a retrieval result, for example the song title and singer of an audio file, can be used to cluster the retrieval results so as to optimize the retrieval results displayed by the client.
However, clustering retrieval results by their subject names is not accurate enough, which reduces the reliability of retrieval.
[summary of the invention]
Several aspects of the present invention provide a retrieval method and device for improving the reliability of retrieval.
One aspect of the present invention provides a retrieval method, comprising:
receiving a retrieval command sent by a client, the retrieval command including a retrieval keyword;
obtaining, according to the retrieval keyword, at least two retrieval results that match the retrieval keyword;
clustering the at least two retrieval results according to content feature information of the at least two retrieval results, to obtain at least two clustered retrieval results; and
sending the at least two clustered retrieval results to the client.
In the aspect above and any possible implementation, an implementation is further provided in which the at least two retrieval results are audio files, and the content feature information of the at least two retrieval results comprises:
the audio fingerprint of the audio file; and/or
the index value of the audio fingerprint, the index value being generated from the audio fingerprint of the audio file.
In the aspect above and any possible implementation, an implementation is further provided in which the at least two retrieval results are video files, and the content feature information of the at least two retrieval results comprises:
the video fingerprint of the video file; and/or
the index value of the video fingerprint, the index value being generated from the video fingerprint of the video file.
In the aspect above and any possible implementation, an implementation is further provided in which clustering the at least two retrieval results according to the content feature information of the at least two retrieval results comprises:
using a hash algorithm, according to the content feature information of the at least two retrieval results, to move each retrieval result into the linked list of the corresponding slot in a hash table.
In the aspect above and any possible implementation, an implementation is further provided in which sending the at least two clustered retrieval results to the client comprises:
traversing each slot to obtain the retrieval results in the linked list of each slot; and
sending the retrieval results in the linked list of each slot to the client.
Another aspect of the present invention provides a retrieval device, comprising:
a receiving unit, configured to receive a retrieval command sent by a client, the retrieval command including a retrieval keyword;
a matching unit, configured to obtain, according to the retrieval keyword, at least two retrieval results that match the retrieval keyword;
a clustering unit, configured to cluster the at least two retrieval results according to content feature information of the at least two retrieval results, to obtain at least two clustered retrieval results; and
a sending unit, configured to send the at least two clustered retrieval results to the client.
In the aspect above and any possible implementation, an implementation is further provided in which the at least two retrieval results matched by the matching unit are audio files, and the content feature information of the at least two retrieval results comprises:
the audio fingerprint of the audio file; and/or
the index value of the audio fingerprint, the index value being generated from the audio fingerprint of the audio file.
In the aspect above and any possible implementation, an implementation is further provided in which the at least two retrieval results matched by the matching unit are video files, and the content feature information of the at least two retrieval results comprises:
the video fingerprint of the video file; and/or
the index value of the video fingerprint, the index value being generated from the video fingerprint of the video file.
In the aspect above and any possible implementation, an implementation is further provided in which the clustering unit is specifically configured to:
use a hash algorithm, according to the content feature information of the at least two retrieval results, to move each retrieval result into the linked list of the corresponding slot in a hash table.
In the aspect above and any possible implementation, an implementation is further provided in which the sending unit is specifically configured to:
traverse each slot to obtain the retrieval results in the linked list of each slot, and send the retrieval results in the linked list of each slot to the client.
As the technical solutions above show, the embodiments of the present invention cluster at least two retrieval results, obtained according to a retrieval keyword, according to the content feature information of those results, and send the clustered retrieval results to the client. Because the clustering uses the content feature information of the retrieval results, it avoids the prior-art problem that clustering by the subject names of the retrieval results is not accurate enough, and thus improves the reliability of retrieval.
[brief description of the drawings]
To describe the technical solutions in the embodiments of the present invention more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of the retrieval method provided by an embodiment of the present invention;
Fig. 2 is a schematic diagram of the slots of the hash table used in the embodiment corresponding to Fig. 1;
Fig. 3 is a schematic structural diagram of the retrieval device provided by another embodiment of the present invention.
[detailed description]
To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, rather than all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without creative effort fall within the protection scope of the present invention.
In addition, the term "and/or" herein only describes an association between associated objects and indicates that three relationships are possible. For example, "A and/or B" can mean: A alone, both A and B, or B alone. The character "/" herein generally indicates an "or" relationship between the objects before and after it.
Fig. 1 is a schematic flowchart of the retrieval method provided by an embodiment of the present invention. As shown in Fig. 1:
101. Receive a retrieval command sent by a client; the retrieval command includes a retrieval keyword.
102. According to the retrieval keyword, obtain at least two retrieval results that match the retrieval keyword.
103. According to the content feature information of the at least two retrieval results, cluster the at least two retrieval results to obtain at least two clustered retrieval results.
104. Send the at least two clustered retrieval results to the client.
It should be noted that steps 101 to 104 may be executed by a server.
In this way, the at least two retrieval results obtained according to the retrieval keyword are clustered according to their content feature information, and the clustered retrieval results can be sent to the client. Because the clustering uses the content feature information of the retrieval results, it avoids the prior-art problem that clustering by the subject names of the retrieval results is not accurate enough, and thus improves the reliability of retrieval.
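The difference between clustering by subject name and clustering by content feature information can be illustrated with a small sketch. The sample results, the placeholder feature strings such as "fp-studio", and the helper group_by are hypothetical and are not taken from the patent; they only show why the two grouping keys can disagree.

```python
from collections import defaultdict

# Hypothetical retrieval results: (song title, singer, content feature).
# The feature strings are placeholders standing in for real fingerprints.
results = [
    ("Song A", "Singer X", "fp-studio"),
    ("Song A", "Singer X", "fp-live"),         # same name, different recording
    ("song a (HQ)", "Singer X", "fp-studio"),  # different name, same recording
]


def group_by(items, key):
    """Group items by key(item), keeping groups in first-seen order."""
    groups = defaultdict(list)
    for item in items:
        groups[key(item)].append(item)
    return list(groups.values())


# Prior-art style: cluster by subject name (title and singer).
by_name = group_by(results, key=lambda r: (r[0], r[1]))

# Style described here: cluster by content feature information.
by_content = group_by(results, key=lambda r: r[2])
```

With these sample values, grouping by name merges the studio and live recordings and separates the renamed copy, whereas grouping by content feature puts the two copies of the same recording together.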
The at least two retrieval results may include, but are not limited to, at least one of audio files and video files.
Optionally, in one possible implementation of this embodiment, if the at least two retrieval results are audio files, the content feature information of the at least two retrieval results may correspondingly include, but is not limited to:
the audio fingerprint of the audio file; and/or
the index value of the audio fingerprint, the index value being generated from the audio fingerprint of the audio file.
Specifically, before 103, audio fingerprint recognition may further be performed on the audio file to obtain its audio fingerprint. The audio fingerprint of an audio file is a feature unique to that audio file: a content-based digital signature that represents the important acoustic features of a piece of music. Its main purpose is to provide an effective mechanism for comparing the perceived acoustic quality of two pieces of audio data. Note that the audio data itself, which is usually large, is not compared directly; instead, the corresponding audio fingerprints, which are usually much smaller, are compared. For example, the audio fingerprints of a large amount of audio data are stored in a database together with the corresponding metadata (such as song title, lyricist and composer, and lyrics), and the audio fingerprint serves as the index of the corresponding metadata. For details, see the related prior art; they are not repeated here. Further, if necessary, the index value of the audio fingerprint may be generated from the audio fingerprint of the audio file.
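The description does not specify how an index value is derived from a fingerprint. As a minimal sketch only, the following assumes a 128-byte fingerprint and an index made of two 32-bit unsigned integers (the sizes used in the worked example of this description), and assumes a BLAKE2b digest as the derivation; the function name fingerprint_index and the choice of hash are illustrative assumptions, not the patent's method.

```python
import hashlib
import struct
from typing import Tuple

FINGERPRINT_SIZE = 128  # bytes; size assumed from the worked example


def fingerprint_index(fingerprint: bytes) -> Tuple[int, int]:
    """Derive a compact index (two unsigned 32-bit integers) from a fingerprint.

    Assumption: an 8-byte BLAKE2b digest is split into two big-endian
    uint32 values. The source only says the index is generated from the
    fingerprint, not how.
    """
    if len(fingerprint) != FINGERPRINT_SIZE:
        raise ValueError("expected a 128-byte fingerprint")
    digest = hashlib.blake2b(fingerprint, digest_size=8).digest()
    return struct.unpack(">II", digest)
```

Identical fingerprints always map to the same index, so the index can serve as a cheap first-pass lookup key; different fingerprints may occasionally collide, which is why a full fingerprint comparison remains useful after a lookup hit.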
Optionally, in one possible implementation of this embodiment, if the at least two retrieval results are video files, the content feature information of the at least two retrieval results may correspondingly include, but is not limited to:
the video fingerprint of the video file; and/or
the index value of the video fingerprint, the index value being generated from the video fingerprint of the video file.
Specifically, before 103, video fingerprint recognition may further be performed on the video file to obtain its video fingerprint. The video fingerprint of a video file is a unique feature vector that distinguishes it from other video files: a content-based digital signature that characterizes the important video features of a piece of video. For details, see the related prior art; they are not repeated here. Further, if necessary, the index value of the video fingerprint may be generated from the video fingerprint of the video file.
Optionally, in one possible implementation of this embodiment, in 103, a hash algorithm may be used, according to the content feature information of the at least two retrieval results, to move each retrieval result into the linked list of the corresponding slot in a hash table.
For example, taking the retrieval of audio files, suppose that a 128-byte audio fingerprint is generated in advance for each audio file. An index value consisting of two 32-bit unsigned integers (unsign32) may then also be generated in advance from the audio fingerprint. First, the server receives an audio retrieval command sent by the client; the command includes a retrieval keyword. Then, according to the retrieval keyword, the server obtains at least two audio files that match it, namely audio file 1, audio file 2, audio file 3, ..., audio file n, and sorts these audio files using a prior-art scheme, for example a relevance algorithm. Next, following the sorted order, the server uses the index value of each audio file to look up the hash table and determine whether there is a hit. If there is a hit, then to further ensure clustering accuracy, the audio fingerprint of this audio file may additionally be compared with the audio fingerprint of the audio file in the hit slot; if they are consistent, the audio file is moved into the linked list of the hit slot. If there is no hit, a new slot is obtained in the hash table and the audio file is moved into it, as shown in Fig. 2. This continues until all audio files have been looked up.
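The lookup-and-insert loop of this example can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: AudioResult and cluster_results are hypothetical names, a Python list stands in for each slot's linked list, and the case of an index hit with a mismatched fingerprint (which the description leaves open) is assumed to be treated like a miss that opens a new slot.

```python
from typing import Dict, List, NamedTuple, Tuple


class AudioResult(NamedTuple):
    """Hypothetical record for one retrieved audio file."""
    name: str
    index: Tuple[int, int]   # index value: two unsigned 32-bit integers
    fingerprint: bytes       # full audio fingerprint (128 bytes in the example)


def cluster_results(ranked: List[AudioResult]) -> List[List[AudioResult]]:
    """Cluster already-ranked results by index value, confirmed by fingerprint.

    Returns the slots in order of creation; each slot is a list playing
    the role of the linked list in the description.
    """
    table: Dict[Tuple[int, int], List[List[AudioResult]]] = {}
    slots: List[List[AudioResult]] = []
    for result in ranked:
        hit = None
        # Look up the hash table with the index value of this audio file.
        for chain in table.get(result.index, []):
            # On an index hit, compare the full fingerprints once more.
            if chain[0].fingerprint == result.fingerprint:
                hit = chain
                break
        if hit is not None:
            hit.append(result)      # move into the hit slot's list
        else:
            chain = [result]        # no hit: obtain a new slot
            table.setdefault(result.index, []).append(chain)
            slots.append(chain)
    return slots
```

Because the input is processed in ranked order, the first element of each slot is the highest-ranked member of its cluster, and the slots themselves follow the order in which their first members were ranked.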
Correspondingly, in 104, each slot may be traversed to obtain the retrieval results in the linked list of each slot, and the retrieval results in the linked list of each slot are then sent to the client. Specifically, an aggregated data structure may be used to structure the retrieval results in the linked list of each slot, and the structured retrieval results are sent to the client, so that the retrieval results displayed by the client appear aggregated.
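The description does not define the aggregated data structure. One possible shape, given purely as an assumption, is a JSON document with one group per slot: the first result in the slot as the representative entry and the remaining results folded beneath it. The function name build_payload and the field names "groups", "representative", "similar", and "count" are hypothetical.

```python
import json
from typing import Dict, List


def build_payload(slots: List[List[Dict[str, str]]]) -> str:
    """Traverse each slot and structure its results into one group.

    Assumed format: the first result of a slot is shown as the
    representative, and the rest are listed as similar results.
    """
    groups = []
    for chain in slots:
        if not chain:
            continue  # skip empty slots
        groups.append({
            "representative": chain[0],
            "similar": chain[1:],
            "count": len(chain),
        })
    return json.dumps({"groups": groups}, ensure_ascii=False)
```

A client receiving such a payload could render one row per group and expand the similar results on demand, which is one way to obtain the aggregated display the paragraph describes.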
In this embodiment, the at least two retrieval results obtained according to the retrieval keyword are clustered according to their content feature information, and the clustered retrieval results can be sent to the client. Because the clustering uses the content feature information of the retrieval results, it avoids the prior-art problem that clustering by the subject names of the retrieval results is not accurate enough, and thus improves the reliability of retrieval.
It should be noted that, for brevity, each of the foregoing method embodiments is described as a series of combined actions. However, a person skilled in the art should understand that the present invention is not limited by the described order of actions, because according to the present invention some steps may be performed in other orders or simultaneously. A person skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and the actions and modules involved are not necessarily required by the present invention.
In the embodiments above, the description of each embodiment has its own emphasis. For a part not described in detail in one embodiment, see the related description of the other embodiments.
Fig. 3 is a schematic structural diagram of the retrieval device provided by another embodiment of the present invention. As shown in Fig. 3, the retrieval device of this embodiment may include a receiving unit 31, a matching unit 32, a clustering unit 33, and a sending unit 34. The receiving unit 31 is configured to receive a retrieval command sent by a client, the retrieval command including a retrieval keyword. The matching unit 32 is configured to obtain, according to the retrieval keyword, at least two retrieval results that match the retrieval keyword. The clustering unit 33 is configured to cluster the at least two retrieval results according to the content feature information of the at least two retrieval results, to obtain at least two clustered retrieval results. The sending unit 34 is configured to send the at least two clustered retrieval results to the client.
It should be noted that the retrieval device provided by this embodiment may be a server.
In this way, the clustering unit clusters the at least two retrieval results, obtained by the matching unit according to the retrieval keyword, according to their content feature information, and the sending unit can send the clustered retrieval results to the client. Because the clustering uses the content feature information of the retrieval results, it avoids the prior-art problem that clustering by the subject names of the retrieval results is not accurate enough, and thus improves the reliability of retrieval.
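The division into four units can be pictured as four pluggable functions wired together in sequence. The sketch below is only one way to express that logical division; the class name RetrievalDevice, the method handle, and the callable signatures are assumptions introduced for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class RetrievalDevice:
    """Hypothetical composition of the four units of Fig. 3."""
    receive: Callable[[Dict[str, Any]], str]         # receiving unit 31
    match: Callable[[str], List[Any]]                # matching unit 32
    cluster: Callable[[List[Any]], List[List[Any]]]  # clustering unit 33
    send: Callable[[List[List[Any]]], Any]           # sending unit 34

    def handle(self, command: Dict[str, Any]) -> Any:
        keyword = self.receive(command)    # extract the retrieval keyword
        results = self.match(keyword)      # obtain matching retrieval results
        clustered = self.cluster(results)  # cluster by content features
        return self.send(clustered)        # deliver the clustered results
```

Keeping each unit behind a plain callable mirrors the point that the unit division is logical: units can be merged, replaced, or distributed without changing the overall flow.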
The at least two retrieval results matched by the matching unit 32 may include, but are not limited to, at least one of audio files and video files.
Optionally, in one possible implementation of this embodiment, if the at least two retrieval results matched by the matching unit 32 are audio files, the content feature information of the at least two retrieval results may correspondingly include, but is not limited to:
the audio fingerprint of the audio file; and/or
the index value of the audio fingerprint, the index value being generated from the audio fingerprint of the audio file.
Specifically, before performing the clustering operation, the clustering unit 33 may further perform audio fingerprint recognition on the audio file to obtain its audio fingerprint. The audio fingerprint of an audio file is a feature unique to that audio file: a content-based digital signature that represents the important acoustic features of a piece of music. Its main purpose is to provide an effective mechanism for comparing the perceived acoustic quality of two pieces of audio data. Note that the audio data itself, which is usually large, is not compared directly; instead, the corresponding audio fingerprints, which are usually much smaller, are compared. For example, the audio fingerprints of a large amount of audio data are stored in a database together with the corresponding metadata (such as song title, lyricist and composer, and lyrics), and the audio fingerprint serves as the index of the corresponding metadata. For details, see the related prior art; they are not repeated here. Further, if necessary, the clustering unit 33 may also generate the index value of the audio fingerprint from the audio fingerprint of the audio file.
Optionally, in one possible implementation of this embodiment, if the at least two retrieval results matched by the matching unit 32 are video files, the content feature information of the at least two retrieval results may correspondingly include, but is not limited to:
the video fingerprint of the video file; and/or
the index value of the video fingerprint, the index value being generated from the video fingerprint of the video file.
Specifically, before performing the clustering operation, the clustering unit 33 may further perform video fingerprint recognition on the video file to obtain its video fingerprint. The video fingerprint of a video file is a unique feature vector that distinguishes it from other video files: a content-based digital signature that characterizes the important video features of a piece of video. For details, see the related prior art; they are not repeated here. Further, if necessary, the clustering unit 33 may also generate the index value of the video fingerprint from the video fingerprint of the video file.
Optionally, in one possible implementation of this embodiment, the clustering unit 33 may specifically be configured to use a hash algorithm, according to the content feature information of the at least two retrieval results, to move each retrieval result into the linked list of the corresponding slot in a hash table.
For example, taking the retrieval of audio files, suppose that the clustering unit 33 generates a 128-byte audio fingerprint in advance for each audio file. The clustering unit 33 may then also generate in advance, from the audio fingerprint, an index value consisting of two 32-bit unsigned integers (unsign32). First, the receiving unit 31 receives an audio retrieval command sent by the client; the command includes a retrieval keyword. Then, according to the retrieval keyword, the matching unit 32 obtains at least two audio files that match it, namely audio file 1, audio file 2, audio file 3, ..., audio file n, and sorts these audio files using a prior-art scheme, for example a relevance algorithm. Next, following the order in which the matching unit sorted the audio files, the clustering unit 33 uses the index value of each audio file to look up the hash table and determine whether there is a hit. If there is a hit, then to further ensure clustering accuracy, the audio fingerprint of this audio file may additionally be compared with the audio fingerprint of the audio file in the hit slot; if they are consistent, the audio file is moved into the linked list of the hit slot. If there is no hit, a new slot is obtained in the hash table and the audio file is moved into it, as shown in Fig. 2. This continues until all audio files have been looked up.
Correspondingly, the sending unit 34 may specifically be configured to traverse each slot to obtain the retrieval results in the linked list of each slot, and to send the retrieval results in the linked list of each slot to the client. Specifically, the sending unit 34 may use an aggregated data structure to structure the retrieval results in the linked list of each slot, and send the structured retrieval results to the client, so that the retrieval results displayed by the client appear aggregated.
In this embodiment, the clustering unit clusters the at least two retrieval results, obtained by the matching unit according to the retrieval keyword, according to their content feature information, and the sending unit can send the clustered retrieval results to the client. Because the clustering uses the content feature information of the retrieval results, it avoids the prior-art problem that clustering by the subject names of the retrieval results is not accurate enough, and thus improves the reliability of retrieval.
A person skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the system, device, and units described above may be understood by reference to the corresponding processes in the foregoing method embodiments, and are not repeated here.
In the several embodiments provided by the present invention, it should be understood that the disclosed system, device, and method may be implemented in other ways. For example, the device embodiments described above are merely illustrative. The division into units is only a division by logical function; in an actual implementation there may be other ways of dividing them. For example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be implemented through some interfaces, and the indirect coupling or communication connection between devices or units may be electrical, mechanical, or in another form.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the objective of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or in the form of hardware plus software functional units.
The integrated unit implemented in the form of a software functional unit may be stored in a computer-readable storage medium. The software functional unit is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to perform some of the steps of the methods described in the embodiments of the present invention. The storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
Finally, it should be noted that the embodiments above are only intended to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions recorded in the foregoing embodiments may still be modified, or some of their technical features may be replaced with equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. A retrieval method, characterized by comprising:
receiving a retrieval instruction sent by a client, the retrieval instruction containing a search keyword;
obtaining, according to the search keyword, at least two retrieval results that match the search keyword;
clustering the at least two retrieval results according to content feature information of the at least two retrieval results, to obtain at least two clustered retrieval results; and
sending the at least two clustered retrieval results to the client.
2. The method according to claim 1, characterized in that the at least two retrieval results are audio files, and the content feature information of the at least two retrieval results comprises:
the audio fingerprint of the audio file; and/or
an index value of the audio fingerprint, the index value of the audio fingerprint being generated from the audio fingerprint of the audio file.
3. The method according to claim 1, characterized in that the at least two retrieval results are video files, and the content feature information of the at least two retrieval results comprises:
the video fingerprint of the video file; and/or
an index value of the video fingerprint, the index value of the video fingerprint being generated from the video fingerprint of the video file.
4. The method according to any one of claims 1 to 3, characterized in that clustering the at least two retrieval results according to the content feature information of the at least two retrieval results comprises:
placing, according to the content feature information of the at least two retrieval results and using a hash algorithm, each retrieval result into the linked list of the corresponding slot in a hash table.
5. The method according to claim 4, characterized in that sending the at least two clustered retrieval results to the client comprises:
traversing each slot to obtain the retrieval results in the linked list of each slot; and
sending the retrieval results in the linked list of each slot to the client.
6. A retrieval device, characterized by comprising:
a receiving unit, configured to receive a retrieval instruction sent by a client, the retrieval instruction containing a search keyword;
a matching unit, configured to obtain, according to the search keyword, at least two retrieval results that match the search keyword;
a clustering unit, configured to cluster the at least two retrieval results according to content feature information of the at least two retrieval results, to obtain at least two clustered retrieval results; and
a sending unit, configured to send the at least two clustered retrieval results to the client.
7. The device according to claim 6, characterized in that the at least two retrieval results matched by the matching unit are audio files, and the content feature information of the at least two retrieval results comprises:
the audio fingerprint of the audio file; and/or
an index value of the audio fingerprint, the index value of the audio fingerprint being generated from the audio fingerprint of the audio file.
8. The device according to claim 6, characterized in that the at least two retrieval results matched by the matching unit are video files, and the content feature information of the at least two retrieval results comprises:
the video fingerprint of the video file; and/or
an index value of the video fingerprint, the index value of the video fingerprint being generated from the video fingerprint of the video file.
9. The device according to any one of claims 6 to 8, characterized in that the clustering unit is specifically configured to:
place, according to the content feature information of the at least two retrieval results and using a hash algorithm, each retrieval result into the linked list of the corresponding slot in a hash table.
10. The device according to claim 9, characterized in that the sending unit is specifically configured to:
traverse each slot to obtain the retrieval results in the linked list of each slot; and send the retrieval results in the linked list of each slot to the client.
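
The hash-based clustering and slot traversal recited in claims 4, 5, 9, and 10 can be illustrated with a minimal Python sketch. It is not the applicant's implementation. The names `RetrievalResult`, `Node`, `index_value`, `cluster`, and `traverse` are hypothetical, as are the use of a SHA-1 digest as the index value of a fingerprint and the default of 16 slots. Because unrelated fingerprints can collide in one slot, the sketch also keeps results with identical fingerprints adjacent within a slot's linked list; the claims do not state this detail, so it is an assumption.

```python
import hashlib
from dataclasses import dataclass
from typing import Iterator, List, Optional


@dataclass
class RetrievalResult:
    url: str            # hypothetical identifier of the matched file
    fingerprint: bytes  # content feature information, e.g. an audio fingerprint


@dataclass
class Node:
    """One node of a slot's singly linked list."""
    result: RetrievalResult
    next: Optional["Node"] = None


def index_value(fingerprint: bytes) -> int:
    """Derive an integer index value from a fingerprint (assumed: SHA-1)."""
    return int.from_bytes(hashlib.sha1(fingerprint).digest()[:8], "big")


def cluster(results: List[RetrievalResult], num_slots: int = 16) -> List[Optional[Node]]:
    """Place each result into the linked list of its hash-table slot."""
    slots: List[Optional[Node]] = [None] * num_slots
    for r in results:
        i = index_value(r.fingerprint) % num_slots
        node = Node(r)
        cur = slots[i]
        if cur is None:
            slots[i] = node
            continue
        # Find the last node with the same fingerprint, or else the tail,
        # so that identical content stays contiguous despite collisions.
        match = None
        while True:
            if cur.result.fingerprint == r.fingerprint:
                match = cur
            if cur.next is None:
                break
            cur = cur.next
        anchor = match if match is not None else cur
        node.next = anchor.next
        anchor.next = node
    return slots


def traverse(slots: List[Optional[Node]]) -> Iterator[RetrievalResult]:
    """Walk every slot in order and yield the results in its linked list."""
    for head in slots:
        node = head
        while node is not None:
            yield node.result
            node = node.next
```

A server following this sketch would call `list(traverse(cluster(results)))` on the keyword-matched results and return that list to the client, so that files with the same content appear next to one another.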
CN201210535176.8A 2012-12-12 2012-12-12 Retrieval method and device Pending CN103870476A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210535176.8A CN103870476A (en) 2012-12-12 2012-12-12 Retrieval method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210535176.8A CN103870476A (en) 2012-12-12 2012-12-12 Retrieval method and device

Publications (1)

Publication Number Publication Date
CN103870476A true CN103870476A (en) 2014-06-18

Family

ID=50909020

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210535176.8A Pending CN103870476A (en) 2012-12-12 2012-12-12 Retrieval method and device

Country Status (1)

Country Link
CN (1) CN103870476A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104951484A (en) * 2014-08-28 2015-09-30 腾讯科技(深圳)有限公司 Search result processing method and search result processing device
CN106202204A (en) * 2016-06-24 2016-12-07 维沃移动通信有限公司 The lookup method of a kind of voice document and mobile terminal
CN112232290A (en) * 2020-11-06 2021-01-15 四川云从天府人工智能科技有限公司 Data clustering method, server, system, and computer-readable storage medium
CN113536093A (en) * 2018-04-26 2021-10-22 华为技术有限公司 Information processing method and device

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101216826A (en) * 2007-01-05 2008-07-09 鸿富锦精密工业(深圳)有限公司 Information search system and method
CN101246504A (en) * 2008-03-31 2008-08-20 北京搜狗科技发展有限公司 Clustering method, device and system
CN101271476A (en) * 2008-04-25 2008-09-24 清华大学 Relevant feedback retrieval method based on clustering in network image search
CN101458708A (en) * 2008-12-05 2009-06-17 北京大学 Searching result clustering method and device
US20110035035A1 (en) * 2000-10-24 2011-02-10 Rovi Technologies Corporation Method and system for analyzing digital audio files
CN102332031A (en) * 2011-10-18 2012-01-25 中国科学院自动化研究所 Method for clustering retrieval results based on video collection hierarchical theme structure
CN102799605A (en) * 2012-05-02 2012-11-28 天脉聚源(北京)传媒科技有限公司 Method and system for monitoring advertisement broadcast

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110035035A1 (en) * 2000-10-24 2011-02-10 Rovi Technologies Corporation Method and system for analyzing digital audio files
CN101216826A (en) * 2007-01-05 2008-07-09 鸿富锦精密工业(深圳)有限公司 Information search system and method
CN101246504A (en) * 2008-03-31 2008-08-20 北京搜狗科技发展有限公司 Clustering method, device and system
CN101271476A (en) * 2008-04-25 2008-09-24 清华大学 Relevant feedback retrieval method based on clustering in network image search
CN101458708A (en) * 2008-12-05 2009-06-17 北京大学 Searching result clustering method and device
CN102332031A (en) * 2011-10-18 2012-01-25 中国科学院自动化研究所 Method for clustering retrieval results based on video collection hierarchical theme structure
CN102799605A (en) * 2012-05-02 2012-11-28 天脉聚源(北京)传媒科技有限公司 Method and system for monitoring advertisement broadcast

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104951484A (en) * 2014-08-28 2015-09-30 腾讯科技(深圳)有限公司 Search result processing method and search result processing device
CN106202204A (en) * 2016-06-24 2016-12-07 维沃移动通信有限公司 The lookup method of a kind of voice document and mobile terminal
CN113536093A (en) * 2018-04-26 2021-10-22 华为技术有限公司 Information processing method and device
CN112232290A (en) * 2020-11-06 2021-01-15 四川云从天府人工智能科技有限公司 Data clustering method, server, system, and computer-readable storage medium
CN112232290B (en) * 2020-11-06 2023-12-08 四川云从天府人工智能科技有限公司 Data clustering method, server, system and computer readable storage medium

Similar Documents

Publication Publication Date Title
US20200210468A1 (en) Document recommendation method and device based on semantic tag
US8275177B2 (en) System and method for media fingerprint indexing
TWI553494B (en) Multi-modal fusion based Intelligent fault-tolerant video content recognition system and recognition method
JP6006327B2 (en) SEARCH METHOD, SEARCH DEVICE, AND SEARCH ENGINE SYSTEM
US20070106405A1 (en) Method and system to provide reference data for identification of digital content
US8521759B2 (en) Text-based fuzzy search
JP6906641B2 (en) Voice search / recognition method and equipment
JP2013541793A (en) Multi-mode search query input method
Kiktova-Vozarikova et al. Feature selection for acoustic events detection
US11664015B2 (en) Method for searching for contents having same voice as voice of target speaker, and apparatus for executing same
US8725766B2 (en) Searching text and other types of content by using a frequency domain
Chen et al. Improving music genre classification using collaborative tagging data
CN114328996A (en) Method and device for publishing information
Li et al. Fast distributed video deduplication via locality-sensitive hashing with similarity ranking
CN103870476A (en) Retrieval method and device
CN117251879A (en) Secure storage and query method and system based on trust extension and computer storage medium
CN111078849B (en) Method and device for outputting information
US10504002B2 (en) Systems and methods for clustering of near-duplicate images in very large image collections
EP3042316B1 (en) Music identification
Yang et al. Music retagging using label propagation and robust principal component analysis
EP3477505B1 (en) Fingerprint clustering for content-based audio recogntion
WO2022134683A1 (en) Method and device for generating context information of written content in writing process
Aryafar et al. Multimodal music and lyrics fusion classifier for artist identification
US20210034704A1 (en) Identifying Ambiguity in Semantic Resources
US20200111017A1 (en) Intelligent searching of electronically stored information

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
EXSB Decision made by SIPO to initiate substantive examination
SE01 Entry into force of request for substantive examination
C41 Transfer of patent application or patent right or utility model
TA01 Transfer of patent application right

Effective date of registration: 20160322

Address after: 100027 Haidian District, Qinghe Qinghe East Road, No. 23, building two, floor 2108, No., No. 18

Applicant after: BEIJING YINZHIBANG CULTURE TECHNOLOGY Co.,Ltd.

Address before: 100085 Baidu Building, No. 10 Shangdi 10th Street, Haidian District, Beijing

Applicant before: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY Co.,Ltd.

RJ01 Rejection of invention patent application after publication

Application publication date: 20140618
