CN103187052B - A kind of method and device setting up the language model being used for speech recognition - Google Patents
A kind of method and device setting up the language model being used for speech recognition
- Publication number
- CN103187052B (application number CN201110451385.XA)
- Authority
- CN
- China
- Prior art keywords
- language model
- user
- speech
- text
- identifiable
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Machine Translation (AREA)
Abstract
The invention provides a method and device for building a language model for speech recognition. The method comprises: A. training a language model on the recognition results of users' voice search queries, used as a voice search corpus, to obtain a speech language model, and training a language model on users' text search queries, used as a text search corpus, to obtain a text language model; B. fusing the speech language model with the text language model to obtain an identifiable language model. The identifiable language model obtained in this way reflects users' word preferences in voice input well, and applying it to speech recognition can improve recognition accuracy.
Description
[technical field]
The present invention relates to speech recognition technology, and in particular to a method and device for building a language model for speech recognition.
[background technology]
Search engines have greatly changed the way people obtain information and have become an increasingly indispensable part of daily life. In recent years, with the development of speech recognition technology, voice search has become a more convenient way to search. By entering a voice search request on a mobile terminal, a user can obtain search results that meet his or her needs from a search engine server.
Voice search relies on speech recognition technology: only when the user's voice input is recognized correctly can the information the user is looking for be returned. Recognition quality depends on the acoustic model and the language model used. In speech recognition, the acoustic model is used to compute speech-to-syllable probabilities, and the language model is used to compute syllable-to-word probabilities. A language model describes the probability distribution of words, and a language model that reliably reflects the distribution of words in users' voice searches is the key to reliable results from a voice search system. Because the word probabilities in a language model depend on the corpus used to train it, obtaining a training corpus consistent with users' wording habits in voice search is very important.
In the prior art, two methods are usually used to obtain the training corpus. The first is to manually annotate users' voice search queries and use the annotated queries as the training corpus; this approach is very costly, and it is difficult to collect enough data. The second is to use the queries that users enter as text directly as the training corpus; the drawback is that the words users choose when searching by voice differ from those they choose when typing, so the resulting language model struggles to reflect users' word preferences in voice search, and applying it to speech recognition reduces recognition accuracy.
[summary of the invention]
The technical problem to be solved by the present invention is to provide a method and device for building a language model for speech recognition, so as to overcome the defect that prior-art language models fail to reflect users' wording habits in spoken expression and therefore degrade recognition accuracy.
To solve this technical problem, the present invention provides a method for building a language model for speech recognition, comprising: A. training a language model on the recognition results of users' voice search queries, used as a voice search corpus, to obtain a speech language model, and training a language model on users' text search queries, used as a text search corpus, to obtain a text language model; B. fusing the speech language model with the text language model to obtain an identifiable language model.
According to a preferred embodiment of the present invention, in step A, an initial identifiable language model is used to recognize the user's voice search queries.
According to a preferred embodiment of the present invention, in step B, when the speech language model and the text language model are fused, the parameters of the speech language model and the parameters of the text language model are interpolated to obtain the parameters of the identifiable language model.
According to a preferred embodiment of the present invention, when the parameters of the speech language model and the text language model are interpolated, the parameters of the speech language model or of the text language model are weighted.
According to a preferred embodiment of the present invention, the method further comprises: using the identifiable language model to recognize the user's voice search query and obtain a recognition result.
According to a preferred embodiment of the present invention, the method further comprises: using the recognition result as newly added voice search corpus for language model training to update the speech language model, and returning to step B.
According to a preferred embodiment of the present invention, the step of using the identifiable language model to recognize the user's voice search query comprises: building multiple candidate word sequences from the user's voice search query; using the identifiable language model to compute the probability of each candidate word sequence under that model, and selecting the candidate word sequence with the highest probability as the recognition result for the user's voice search query.
According to a preferred embodiment of the present invention, the method further comprises: returning to the user search results related to the recognition result.
The present invention also provides a device for building a language model for speech recognition, comprising: a first training unit for training a language model on the recognition results of users' voice search queries, used as a voice search corpus, to obtain a speech language model; a second training unit for training a language model on users' text search queries, used as a text search corpus, to obtain a text language model; and an integrated unit for fusing the speech language model with the text language model to obtain an identifiable language model.
According to a preferred embodiment of the present invention, the voice search corpus used by the first training unit for language model training is obtained by recognizing the user's voice search queries with an initial identifiable language model.
According to a preferred embodiment of the present invention, when the integrated unit fuses the speech language model with the text language model, it interpolates the parameters of the speech language model and the text language model to obtain the parameters of the identifiable language model.
According to a preferred embodiment of the present invention, when the integrated unit interpolates the parameters of the speech language model and the text language model, it weights the parameters of the speech language model or of the text language model.
According to a preferred embodiment of the present invention, the device further comprises: a recognition unit for using the identifiable language model to recognize the user's voice search query and obtain a recognition result.
According to a preferred embodiment of the present invention, the recognition unit supplies the recognition result to the first training unit, so that the first training unit uses the recognition result as newly added voice search corpus for language model training to update the speech language model.
According to a preferred embodiment of the present invention, the recognition unit comprises: a word sequence unit for building multiple candidate word sequences from the user's voice search query; and a computing unit for using the identifiable language model to compute the probability of each candidate word sequence under that model and selecting the candidate word sequence with the highest probability as the recognition result for the user's voice search query.
According to a preferred embodiment of the present invention, the device further comprises: a retrieval unit for returning to the user search results related to the recognition result.
As the above technical solutions show, speech recognition results are used as a corpus for language model training, and the language model trained on speech recognition results is fused with the language model trained on the text corpus. The identifiable language model obtained in this way reflects users' word preferences in voice input well, and applying it to speech recognition can improve recognition accuracy.
[accompanying drawing explanation]
Fig. 1 is a schematic flowchart of an embodiment of the method for building a language model for speech recognition in the present invention;
Fig. 2 is a schematic diagram of an embodiment of obtaining the voice search corpus and the text search corpus in the present invention;
Fig. 3 is a schematic diagram of an embodiment of the word graph in the present invention;
Fig. 4 is a structural block diagram of an embodiment of the device for building a language model for speech recognition in the present invention;
Fig. 5 is a structural block diagram of an embodiment of the recognition unit in the present invention.
[embodiment]
To make the objects, technical solutions and advantages of the present invention clearer, the present invention is described below with reference to the drawings and specific embodiments.
Please refer to Fig. 1, which is a schematic flowchart of an embodiment of the method for building a language model for speech recognition in the present invention. As shown in Fig. 1, the method comprises:
S101: training a language model on the recognition results of users' voice search queries, used as a voice search corpus, to obtain a speech language model; and training a language model on users' text search queries, used as a text search corpus, to obtain a text language model.
S102: fusing the speech language model with the text language model to obtain an identifiable language model.
These steps are described in detail below.
Please refer to Fig. 2, which is a schematic diagram of an embodiment of obtaining the voice search corpus and the text search corpus in step S101. As shown in Fig. 2, a user can search by either text input or voice input. When the user enters a search request with a keyboard, the text collection client sends the collected text search request to the search engine server over the network, and the logging device in the server records the request in the retrieval log; this retrieval log can serve as the text search corpus of the present invention. When the user issues a voice search request from a mobile terminal (such as a mobile phone), the voice collection client sends the collected voice signal to the search engine server over the network, and the speech recognition device in the server recognizes the request to obtain a recognition result; this recognition result can serve as the voice search corpus of the present invention.
In the embodiment of obtaining the voice search corpus shown in Fig. 2, the speech recognition device uses an initial identifiable language model to recognize the user's voice search queries. The initial identifiable language model in this embodiment may be an existing identifiable language model, or it may be an identifiable language model built with the method provided by the present invention. In the latter case, the recognition results produced by the speech recognition device in Fig. 2, that is, the voice search corpus used for language model training in step S101, serve to update the identifiable language model of step S102, which makes the identifiable language model of the present invention adaptive.
The language model here is an N-Gram language model. This model rests on the assumption that the occurrence of the N-th word depends only on the preceding N-1 words and on no other word, so that the probability of a whole sentence is the product of the occurrence probabilities of its words, each conditioned on the preceding N-1 words. Training the language model means counting, over the corpus, how often each group of N words occurs together, in order to obtain each N-Gram probability value. The binary Bi-Gram model and the ternary Tri-Gram model are the most commonly used; the present invention places no limitation on this.
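The counting procedure described above can be sketched for the Bi-Gram case as follows. This is a minimal illustration, not the patent's implementation: it uses plain relative frequencies without smoothing, and the function name `train_bigram` and the toy corpus are assumptions made for the example.

```python
from collections import Counter


def train_bigram(corpus):
    """Estimate word and transition probabilities by counting.

    `corpus` is a list of queries, each already split into words.
    Returns (unigram, bigram) dictionaries of relative frequencies.
    """
    unigram_counts = Counter()
    bigram_counts = Counter()
    for words in corpus:
        unigram_counts.update(words)
        bigram_counts.update(zip(words, words[1:]))

    total_words = sum(unigram_counts.values())
    # P(w): share of all word occurrences in the corpus.
    unigram = {w: c / total_words for w, c in unigram_counts.items()}

    # Number of times each word appears as the first word of a pair.
    history_counts = Counter()
    for (first, _), count in bigram_counts.items():
        history_counts[first] += count
    # P(w2 | w1): how often w2 follows w1 among all pairs starting with w1.
    bigram = {
        (first, second): count / history_counts[first]
        for (first, second), count in bigram_counts.items()
    }
    return unigram, bigram


# Hypothetical toy corpus of segmented queries.
corpus = [
    ["where", "roast duck", "tasty"],
    ["where", "roast duck", "cheap"],
    ["there", "roast duck", "tasty"],
]
unigram, bigram = train_bigram(corpus)
print(bigram[("roast duck", "tasty")])  # 2 of the 3 pairs starting with "roast duck"
```

A production system would normally add smoothing so that unseen word pairs do not receive zero probability; that is left out here to keep the counting step visible.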
The parameters of the speech language model and of the text language model are their N-Gram probability values. When the two models are fused in step S102, the parameters of the speech language model and those of the text language model are interpolated to obtain the parameters of the identifiable language model, that is, each N-Gram probability value of the identifiable language model.
For example, suppose P(you are good) is 0.5 in the speech language model, where P(X) denotes the probability value of X, and P(you are good) is 0.8 in the text language model. If the parameters (i.e. the probability values) of the two models are given equal weights, then after interpolation P(you are good) in the identifiable language model is 50%*0.5+50%*0.8=0.65.
In addition, when the parameters of the two models are interpolated, the parameters of the speech language model can be given a larger weight. In the example above, if the weight of the speech language model is set to 70% and the weight of the text language model to 30%, then P(you are good) is 70%*0.5+30%*0.8=0.59. Weighting the speech language model in this way lets the final identifiable language model better reflect users' preferences in voice input. If the final identifiable language model should instead emphasize users' preferences in text input, the text language model can be given the larger weight.
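The interpolation step can be sketched as a weighted sum over the probability tables of the two models. The function name `interpolate`, the dictionary representation, and the choice to treat an N-Gram missing from one model as having probability 0 there are assumptions for illustration; the patent does not specify them.

```python
def interpolate(speech_lm, text_lm, speech_weight=0.5):
    """Fuse two probability tables by linear interpolation.

    `speech_weight` is the weight of the speech language model; the text
    language model receives the remaining 1 - speech_weight.
    """
    if not 0.0 <= speech_weight <= 1.0:
        raise ValueError("speech_weight must lie between 0 and 1")
    text_weight = 1.0 - speech_weight
    fused = {}
    # Assumption: an N-Gram absent from one model counts as probability 0 there.
    for ngram in set(speech_lm) | set(text_lm):
        fused[ngram] = (speech_weight * speech_lm.get(ngram, 0.0)
                        + text_weight * text_lm.get(ngram, 0.0))
    return fused


# Values taken from the equal-weight example: 0.5 (speech) and 0.8 (text).
speech_lm = {"you are good": 0.5}
text_lm = {"you are good": 0.8}

equal = interpolate(speech_lm, text_lm)               # 50% / 50%
speech_heavy = interpolate(speech_lm, text_lm, 0.7)   # 70% / 30%
print(round(equal["you are good"], 2))  # 0.65
```

Because the two weights sum to 1, the fused value always lies between the two input values, and raising `speech_weight` moves it toward the speech language model.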
Once the identifiable language model has been obtained, it can further be used to recognize users' voice search queries and obtain recognition results. These recognition results can be used as newly added voice search corpus for language model training to update the speech language model; fusing the updated speech language model with the text language model in turn updates the identifiable language model, which makes the process adaptive.
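The adaptive loop can be sketched as a small class that keeps running counts for the speech corpus and re-fuses after every new recognition result. To stay short, the sketch uses single-word probabilities instead of full N-Grams and starts from the text model alone; both simplifications, and the names `AdaptiveLanguageModel` and `add_recognition_result`, are assumptions rather than details from the patent.

```python
from collections import Counter


def normalise(counts):
    """Turn raw counts into relative frequencies (empty counts give {})."""
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()} if total else {}


class AdaptiveLanguageModel:
    def __init__(self, text_corpus, speech_weight=0.5):
        self.speech_weight = speech_weight
        self.speech_counts = Counter()
        text_counts = Counter(word for query in text_corpus for word in query)
        self.text_lm = normalise(text_counts)
        # Assumption: before any speech data exists, use the text model as is.
        self.fused_lm = dict(self.text_lm)

    def add_recognition_result(self, words):
        """Add one recognition result to the speech corpus and re-fuse."""
        self.speech_counts.update(words)              # update speech model
        speech_lm = normalise(self.speech_counts)
        lam = self.speech_weight
        self.fused_lm = {                             # repeat the fusion step
            word: lam * speech_lm.get(word, 0.0)
            + (1.0 - lam) * self.text_lm.get(word, 0.0)
            for word in set(speech_lm) | set(self.text_lm)
        }
        return self.fused_lm


model = AdaptiveLanguageModel([["roast", "duck"], ["duck", "soup"]])
fused = model.add_recognition_result(["duck", "tasty"])
print(fused["tasty"])  # 0.25: the word is known only from the speech corpus
```

Each call corresponds to one pass through the cycle of recognition, speech model update and fusion.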
Recognizing a user's voice search query comprises:
building multiple candidate word sequences from the user's voice search query;
using the identifiable language model to compute the probability of each candidate word sequence under that model, and selecting the candidate word sequence with the highest probability as the recognition result for the user's voice search query.
For example, suppose the syllables of the user's voice search query are "na li de kao ya hao chi". This syllable sequence can be rendered as multiple candidate word sequences, such as "the roast tooth there is tasty", "the roast duck there is tasty" or "where is the roast duck tasty". For each candidate word sequence, the occurrence probability of each word and the transition probabilities between adjacent words can be looked up in the identifiable language model; multiplying these together gives the probability of the candidate word sequence under the language model, and the candidate word sequence with the highest probability is taken as the recognition result for the user's voice search query. Taking a Bi-Gram identifiable language model as an example, the probability of a candidate word sequence under the identifiable language model can be expressed as follows:
P(where is the roast duck tasty) =
P(where) * P(roast duck | where) * P(roast duck) * P(tasty | roast duck) * P(tasty)
where P(where), P(roast duck) and P(tasty) are the occurrence probabilities of the individual words in the candidate word sequence, and P(roast duck | where) and P(tasty | roast duck) are the transition probabilities between adjacent words.
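The scoring rule above, which multiplies every word probability and every transition probability, can be sketched as follows. The probability values are invented for illustration, the words are English glosses of the candidates in the example, and the function names `sequence_probability` and `recognise` are assumptions.

```python
def sequence_probability(words, unigram, bigram):
    """Multiply each word's probability and each adjacent-pair transition
    probability, following the formula given in the text."""
    probability = 1.0
    for index, word in enumerate(words):
        probability *= unigram.get(word, 0.0)
        if index > 0:
            probability *= bigram.get((words[index - 1], word), 0.0)
    return probability


def recognise(candidates, unigram, bigram):
    """Return the candidate word sequence with the highest probability."""
    return max(candidates,
               key=lambda words: sequence_probability(words, unigram, bigram))


# Hypothetical model parameters, not taken from the patent.
unigram = {"where": 0.3, "there": 0.2, "roast duck": 0.1,
           "roast tooth": 0.001, "tasty": 0.2}
bigram = {("where", "roast duck"): 0.4, ("there", "roast duck"): 0.3,
          ("there", "roast tooth"): 0.01, ("roast duck", "tasty"): 0.5,
          ("roast tooth", "tasty"): 0.1}

candidates = [
    ["there", "roast tooth", "tasty"],
    ["there", "roast duck", "tasty"],
    ["where", "roast duck", "tasty"],
]
best = recognise(candidates, unigram, bigram)
print(best)  # ['where', 'roast duck', 'tasty']
```

With these toy values the three candidates score about 4e-8, 0.0006 and 0.0012 respectively, so the third one is selected. Real systems usually sum log probabilities instead of multiplying raw values to avoid numerical underflow on long sequences.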
When building multiple candidate word sequences from the user's voice search query, the words with the highest frequency of occurrence in the language model can be chosen from among the words corresponding to each syllable to form a word graph as shown in Fig. 3; any path through the word graph connected from front to back can serve as a candidate word sequence. It should be understood that this way of building candidate word sequences is only illustrative; the present invention does not limit the strategy for building them, and any approach that those skilled in the art can implement may be chosen.
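One way to picture the word graph is as a sequence of slots, each holding the words that could match one stretch of syllables, with every word in a slot connected to every word in the next. That fully connected layout is an assumption made for this sketch (Fig. 3 may show a different topology), as are the function name `candidate_sequences`, the `top_k` parameter and the toy probabilities.

```python
from itertools import product


def candidate_sequences(slots, unigram, top_k=2):
    """Keep the `top_k` most frequent words per slot, then enumerate
    every front-to-back path through the resulting word graph."""
    pruned = [
        sorted(words, key=lambda w: unigram.get(w, 0.0), reverse=True)[:top_k]
        for words in slots
    ]
    return [list(path) for path in product(*pruned)]


# Hypothetical word frequencies and homophone sets.
unigram = {"where": 0.3, "there": 0.2, "that side": 0.05,
           "roast duck": 0.1, "roast tooth": 0.001, "tasty": 0.2}
slots = [
    ["that side", "there", "where"],   # words matching "na li"
    ["roast tooth", "roast duck"],     # words matching "kao ya"
    ["tasty"],                         # words matching "hao chi"
]

sequences = candidate_sequences(slots, unigram)
for sequence in sequences:
    print(sequence)
```

Pruning each slot before enumerating paths keeps the number of candidates manageable, since the path count is the product of the slot sizes.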
After the recognition result for the user's voice search query has been obtained, the present invention can further return to the user search results related to the recognition result. This process is similar to an existing search engine returning search results related to the query the user entered, and is not described in detail here. It will be understood that the search results related to the recognition result may be search results containing the recognition result, or search results obtained by expanding the query on the basis of the recognition result; any existing expansion strategy may be used for such query expansion, and the present invention places no limitation on this.
Please refer to Fig. 4, which is a structural block diagram of an embodiment of the device for building a language model for speech recognition in the present invention. As shown in Fig. 4, the device comprises a first training unit 201, a second training unit 202, an integrated unit 203 and a recognition unit 204.
The first training unit 201 trains a language model on the recognition results of users' voice search queries, used as a voice search corpus, to obtain a speech language model.
The second training unit 202 trains a language model on users' text search queries, used as a text search corpus, to obtain a text language model.
The integrated unit 203 fuses the speech language model with the text language model to obtain an identifiable language model.
The recognition unit 204 uses the identifiable language model to recognize the user's voice search query and obtain a recognition result.
In one embodiment, the voice search corpus used by the first training unit 201 for language model training is obtained by recognizing the user's voice search queries with an initial identifiable language model.
The initial identifiable language model may be an existing identifiable language model, or it may be an identifiable language model built with the device provided by the present invention.
The text search corpus used by the second training unit 202 is the retrieval log in which the search engine records users' text search queries.
The language model in the present invention is an N-Gram language model. This model rests on the assumption that the occurrence of the N-th word depends only on the preceding N-1 words and on no other word, so that the probability of a whole sentence is the product of the occurrence probabilities of its words, each conditioned on the preceding N-1 words. Training the language model means counting, over the corpus, how often each group of N words occurs together, in order to obtain each N-Gram probability value. The binary Bi-Gram model and the ternary Tri-Gram model are the most commonly used; the present invention places no limitation on this.
The parameters of the speech language model and of the text language model are their N-Gram probability values. When the integrated unit 203 fuses the two models, it interpolates the parameters of the speech language model and those of the text language model to obtain the parameters of the identifiable language model, that is, each N-Gram probability value of the identifiable language model.
For example, suppose P(you are good) is 0.5 in the speech language model (P denotes a probability value) and P(you are good) is 0.8 in the text language model. If the parameters (i.e. the probability values) of the two models are given equal weights, then after interpolation P(you are good) in the identifiable language model is 50%*0.5+50%*0.8=0.65.
When interpolating the parameters of the two models, the integrated unit 203 can give the parameters of the speech language model a larger weight. In the example above, if the weight of the speech language model is set to 70% and the weight of the text language model to 30%, then P(you are good) is 70%*0.5+30%*0.8=0.59. Weighting the speech language model in this way lets the final identifiable language model better reflect users' preferences in voice input. If the final identifiable language model should instead emphasize users' preferences in text input, the integrated unit 203 can give the text language model the larger weight.
Please refer to Fig. 5, which is a schematic diagram of an embodiment of the recognition unit in the present invention. As shown in Fig. 5, the recognition unit 204 comprises a word sequence unit 2041 and a computing unit 2042. The word sequence unit 2041 builds multiple candidate word sequences from the user's voice search query; the computing unit 2042 uses the identifiable language model to compute the probability of each candidate word sequence under that model and selects the candidate word sequence with the highest probability as the recognition result for the user's voice search query.
After obtaining the user's voice search query, the word sequence unit 2041 builds multiple candidate word sequences from the syllables of the query. For example, if the syllables of the user's voice search query are "na li de kao ya hao chi", the word sequence unit 2041 can build a word graph as shown in Fig. 3, in which any path connected from front to back forms a candidate word sequence, such as "the roast tooth there is tasty", "the roast duck there is tasty" or "where is the roast duck tasty". When building multiple candidate word sequences, the word sequence unit 2041 can choose, from among the words corresponding to each syllable, the words with the highest frequency of occurrence in the identifiable language model to form the word graph; alternatively, any other method that those skilled in the art can implement may be used to build the candidate word sequences.
For each candidate word sequence produced by the word sequence unit 2041, the computing unit 2042 looks up in the identifiable language model the occurrence probability of each word and the transition probabilities between adjacent words, and multiplies them together to obtain the probability of that candidate word sequence; the candidate word sequence with the highest probability is then taken as the recognition result for the user's voice search query.
Please continue to refer to Fig. 4. Further, the recognition unit 204 supplies the recognition result to the first training unit 201, so that the first training unit 201 uses the recognition result as newly added voice search corpus for language model training to update the speech language model. Passing the updated speech language model and the text language model through the integrated unit 203 then updates the identifiable language model, which makes the device adaptive. In addition, the device of the present invention may further comprise a retrieval unit (not shown in Fig. 4) which, after the recognition unit 204 has obtained the recognition result for the user's voice search query, returns to the user search results related to the recognition result. The retrieval unit works on the same principle as the retrieval unit of an existing search engine and is not described in detail here. It should be understood that the search results related to the recognition result may be search results containing the recognition result, or search results obtained by expanding the query on the basis of the recognition result; any existing expansion strategy may be used for such query expansion, and the present invention places no limitation on this.
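The data flow between units 201 to 204 can be sketched with one small class per unit. The class names, method names and toy queries are hypothetical, and single-word probabilities stand in for full N-Grams to keep the sketch short; it illustrates the wiring only, not the patent's implementation.

```python
from collections import Counter


class TrainingUnit:
    """Plays the role of the first (201) or second (202) training unit."""

    def __init__(self):
        self.counts = Counter()

    def train(self, queries):
        # Counts accumulate across calls, so new corpus updates the model.
        for words in queries:
            self.counts.update(words)
        total = sum(self.counts.values())
        return {w: c / total for w, c in self.counts.items()} if total else {}


class IntegratedUnit:
    """Plays the role of the integrated (fusion) unit 203."""

    def __init__(self, speech_weight=0.5):
        self.speech_weight = speech_weight

    def fuse(self, speech_lm, text_lm):
        lam = self.speech_weight
        return {w: lam * speech_lm.get(w, 0.0) + (1.0 - lam) * text_lm.get(w, 0.0)
                for w in set(speech_lm) | set(text_lm)}


class RecognitionUnit:
    """Plays the role of recognition unit 204 (candidates supplied by caller)."""

    def recognise(self, candidates, lm):
        def score(words):
            probability = 1.0
            for word in words:
                probability *= lm.get(word, 0.0)
            return probability
        return max(candidates, key=score)


first_unit, second_unit = TrainingUnit(), TrainingUnit()
integrated_unit, recognition_unit = IntegratedUnit(0.7), RecognitionUnit()

text_lm = second_unit.train([["roast duck", "recipe"], ["roast duck", "tasty"]])
speech_lm = first_unit.train([["where", "roast duck", "tasty"]])
lm = integrated_unit.fuse(speech_lm, text_lm)

result = recognition_unit.recognise(
    [["where", "roast duck", "tasty"], ["where", "roast tooth", "tasty"]], lm)

# Feedback path: the recognition result becomes new voice search corpus.
speech_lm = first_unit.train([result])
lm = integrated_unit.fuse(speech_lm, text_lm)
print(result)  # ['where', 'roast duck', 'tasty']
```

Keeping the counts inside the training unit means the feedback step only needs the new recognition result, not the whole corpus again.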
The foregoing describes only preferred embodiments of the present invention and is not intended to limit it; any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.
Claims (14)
1. A method for building a language model for speech recognition, characterized in that the method comprises:
A. training a language model on the recognition results of users' voice search queries, used as a voice search corpus, to obtain a speech language model; and training a language model on users' text search queries, used as a text search corpus, to obtain a text language model;
B. fusing the speech language model with the text language model to obtain an identifiable language model;
wherein, in step B, when the speech language model and the text language model are fused, the parameters of the speech language model and the parameters of the text language model are interpolated to obtain the parameters of the identifiable language model.
2. The method according to claim 1, characterized in that, in step A, an initial identifiable language model is used to recognize the user's voice search queries.
3. The method according to claim 1, characterized in that, when the parameters of the speech language model and the text language model are interpolated, the parameters of the speech language model or of the text language model are weighted.
4. The method according to claim 1, characterized in that the method further comprises:
using the identifiable language model to recognize the user's voice search query and obtain a recognition result.
5. The method according to claim 4, characterized in that the method further comprises: using the recognition result as newly added voice search corpus for language model training to update the speech language model, and returning to step B.
6. The method according to claim 4, characterized in that the step of using the identifiable language model to recognize the user's voice search query comprises:
building multiple candidate word sequences from the user's voice search query;
using the identifiable language model to compute the probability of each candidate word sequence under that model, and selecting the candidate word sequence with the highest probability as the recognition result for the user's voice search query.
7. The method according to claim 4, characterized in that the method further comprises:
returning to the user search results related to the recognition result.
8. A device for building a language model for speech recognition, characterized in that the device comprises:
a first training unit for training a language model on the recognition results of users' voice search queries, used as a voice search corpus, to obtain a speech language model;
a second training unit for training a language model on users' text search queries, used as a text search corpus, to obtain a text language model;
an integrated unit for fusing the speech language model with the text language model to obtain an identifiable language model;
wherein, when fusing the speech language model with the text language model, the integrated unit interpolates the parameters of the speech language model and the text language model to obtain the parameters of the identifiable language model.
9. device according to claim 8, is characterized in that, the phonetic search language material used when described first training unit carries out language model training obtains after using the initial voice search query of identifiable language model to user to identify.
10. device according to claim 8, is characterized in that, when described integrated unit carries out interpolation to the parameter in described speech language model and text language model, is weighted the parameter in described speech language model or described text language model.
11. devices according to claim 8, is characterized in that, described device comprises further:
Recognition unit, for using the voice search query of described identifiable language model to user to identify, obtains recognition result.
12. devices according to claim 11, it is characterized in that, the recognition result obtained is supplied to described first training unit by described recognition unit, described recognition result is carried out language model training as the phonetic search language material newly increased, to upgrade described speech language model for described first training unit.
13. The device according to claim 11, characterized in that the recognition unit comprises:
A word sequence unit, configured to build multiple candidate word sequences from the user's voice search query;
A computing unit, configured to use the identifiable language model to calculate the probability of each candidate word sequence under the identifiable language model, and to select the candidate word sequence with the highest probability as the recognition result for the user's voice search query.
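The selection step of claim 13 can be sketched as scoring each candidate word sequence and keeping the most probable one. The sketch assumes a unigram model stored as a dictionary, log-probabilities to avoid underflow, and a small floor probability for unseen words; the names `sequence_log_prob`, `recognize`, `floor`, and the toy probabilities are all hypothetical. A practical recognizer would usually combine this score with an acoustic score, which is outside what the claim describes.

```python
import math


def sequence_log_prob(words, lm, floor=1e-6):
    """Log-probability of a word sequence under a unigram model.

    Unseen words receive an assumed floor probability instead of zero.
    """
    return sum(math.log(lm.get(w, floor)) for w in words)


def recognize(candidates, lm):
    """Return the candidate word sequence with the highest model probability."""
    return max(candidates, key=lambda words: sequence_log_prob(words, lm))


# Toy identifiable language model and candidate sequences (assumed values).
fused_lm = {"weather": 0.4, "whether": 0.05, "today": 0.3, "to": 0.15, "day": 0.1}
candidates = [
    ["whether", "today"],
    ["weather", "today"],
    ["weather", "to", "day"],
]

best = recognize(candidates, fused_lm)
```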
14. The device according to claim 11, characterized in that the device further comprises: a retrieval unit, configured to return to the user the search results relevant to the recognition result.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201110451385.XA CN103187052B (en) | 2011-12-29 | 2011-12-29 | A kind of method and device setting up the language model being used for speech recognition |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201110451385.XA CN103187052B (en) | 2011-12-29 | 2011-12-29 | A kind of method and device setting up the language model being used for speech recognition |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103187052A (en) | 2013-07-03 |
CN103187052B (en) | 2015-09-02 |
Family
ID=48678187
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201110451385.XA (Active) CN103187052B (en) | 2011-12-29 | 2011-12-29 | A kind of method and device setting up the language model being used for speech recognition |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103187052B (en) |
Families Citing this family (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103871402B (en) * | 2012-12-11 | 2017-10-10 | 北京百度网讯科技有限公司 | Language model training system, speech recognition system and correlation method |
CN103578465B (en) * | 2013-10-18 | 2016-08-17 | 威盛电子股份有限公司 | Speech identifying method and electronic installation |
CN105574040B (en) * | 2014-10-16 | 2020-04-10 | 高德软件有限公司 | Synonymy transformation method and equipment for query string |
US9812126B2 (en) * | 2014-11-28 | 2017-11-07 | Microsoft Technology Licensing, Llc | Device arbitration for listening devices |
KR102167719B1 (en) * | 2014-12-08 | 2020-10-19 | 삼성전자주식회사 | Method and apparatus for training language model, method and apparatus for recognizing speech |
CN105654945B (en) * | 2015-10-29 | 2020-03-06 | 乐融致新电子科技(天津)有限公司 | Language model training method, device and equipment |
KR102450853B1 (en) * | 2015-11-30 | 2022-10-04 | 삼성전자주식회사 | Apparatus and method for speech recognition |
CN105843868B (en) * | 2016-03-17 | 2019-03-26 | 浙江大学 | A kind of case searching method based on language model |
CN105654955B (en) * | 2016-03-18 | 2019-11-12 | 华为技术有限公司 | Audio recognition method and device |
CN106328147B (en) * | 2016-08-31 | 2022-02-01 | 中国科学技术大学 | Speech recognition method and device |
CN110058986A (en) * | 2018-01-18 | 2019-07-26 | 普天信息技术有限公司 | A kind of network system data characterizing method and device |
CN108831441B (en) * | 2018-05-08 | 2019-08-13 | 上海依图网络科技有限公司 | A kind of training method and device of speech recognition modeling |
CN108682420B (en) * | 2018-05-14 | 2023-07-07 | 平安科技(深圳)有限公司 | Audio and video call dialect recognition method and terminal equipment |
CN108847222B (en) * | 2018-06-19 | 2020-09-08 | Oppo广东移动通信有限公司 | Speech recognition model generation method and device, storage medium and electronic equipment |
CN110120221A (en) * | 2019-06-06 | 2019-08-13 | 上海蔚来汽车有限公司 | The offline audio recognition method of user individual and its system for vehicle system |
CN112309377B (en) * | 2019-07-18 | 2024-09-13 | Tcl科技集团股份有限公司 | Intelligent bath control method, intelligent bath control equipment and storage medium |
CN110738987B (en) * | 2019-10-18 | 2022-02-15 | 清华大学 | Keyword retrieval method based on unified representation |
CN112466295A (en) * | 2020-11-24 | 2021-03-09 | 北京百度网讯科技有限公司 | Language model training method, application method, device, equipment and storage medium |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1573926A (en) * | 2003-06-03 | 2005-02-02 | 微软公司 | Discriminative training of language models for text and speech classification |
CN101042868A (en) * | 2006-03-20 | 2007-09-26 | 富士通株式会社 | Clustering system, clustering method, clustering program and attribute estimation system using clustering system |
CN101042867A (en) * | 2006-03-24 | 2007-09-26 | 株式会社东芝 | Apparatus, method and computer program product for recognizing speech |
CN101382937A (en) * | 2008-07-01 | 2009-03-11 | 深圳先进技术研究院 | Multimedia resource processing method based on speech recognition and on-line teaching system thereof |
CN101651788A (en) * | 2008-12-26 | 2010-02-17 | 中国科学院声学研究所 | Alignment system of on-line speech text and method thereof |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7395205B2 (en) * | 2001-02-13 | 2008-07-01 | International Business Machines Corporation | Dynamic language model mixtures with history-based buckets |
US7143035B2 (en) * | 2002-03-27 | 2006-11-28 | International Business Machines Corporation | Methods and apparatus for generating dialog state conditioned language models |
Also Published As
Publication number | Publication date |
---|---|
CN103187052A (en) | 2013-07-03 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN103187052B (en) | A kind of method and device setting up the language model being used for speech recognition | |
CN109697973B (en) | Rhythm level labeling method, model training method and device | |
CN107818781B (en) | Intelligent interaction method, equipment and storage medium | |
CN103635963B (en) | Language model across languages initialization | |
CN107204184B (en) | Audio recognition method and system | |
Regelson et al. | Predicting click-through rate using keyword clusters | |
CN107240398B (en) | Intelligent voice interaction method and device | |
US10332506B2 (en) | Computerized system and method for formatted transcription of multimedia content | |
CN109637537B (en) | Method for automatically acquiring annotated data to optimize user-defined awakening model | |
US9582608B2 (en) | Unified ranking with entropy-weighted information for phrase-based semantic auto-completion | |
CN102623010B (en) | A method of setting up a language model, and a method and device for speech recognition | |
US9043199B1 (en) | Manner of pronunciation-influenced search results | |
CN109522419B (en) | Session information completion method and device | |
US8473293B1 (en) | Dictionary filtering using market data | |
CN110162770A (en) | A kind of word extended method, device, equipment and medium | |
WO2007008798A3 (en) | System and method for searching for network-based content in a multi-modal system using spoken keywords | |
CN105389389B (en) | A kind of network public-opinion propagation situation medium control analysis method | |
CN102236677A (en) | Question answering system-based information matching method and system | |
CN104217717A (en) | Language model constructing method and device | |
CN110276010B (en) | Weight model training method and related device | |
CN104991943A (en) | Music searching method and apparatus | |
CN103186574A (en) | Method and device for generating searching result | |
CN106570180A (en) | Artificial intelligence based voice searching method and device | |
CN105677857B (en) | method and device for accurately matching keywords with marketing landing pages | |
CN105354199A (en) | Scene information based entity meaning identification method and system | |
Legal Events
Code | Title | Description |
---|---|---|
C06 | Publication | |
PB01 | Publication | |
C10 | Entry into substantive examination | |
SE01 | Entry into force of request for substantive examination | |
C14 | Grant of patent or utility model | |
GR01 | Patent grant | |