CN106448683A - Method and device for viewing recording in multimedia files - Google Patents
Method and device for viewing recording in multimedia files
- Publication number
- CN106448683A (application number CN201610877020.6A)
- Authority
- CN
- China
- Prior art keywords
- vocal print
- print feature
- medium data
- multimedia file
- recording
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/02—Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/40—Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
- G06F16/43—Querying
- G06F16/432—Query formulation
- G06F16/433—Query formulation using audio data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/40—Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
- G06F16/48—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/54—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for retrieval
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Audiology, Speech & Language Pathology (AREA)
- General Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Acoustics & Sound (AREA)
- Human Computer Interaction (AREA)
- General Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Library & Information Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Signal Processing (AREA)
- Computational Linguistics (AREA)
- Mathematical Physics (AREA)
- Signal Processing For Digital Recording And Reproducing (AREA)
Abstract
The invention relates to a method and a device for viewing recordings in multimedia files. The method comprises the following steps: scanning a multimedia file and acquiring the voiceprint data in it; identifying, according to the voiceprint data, the multimedia data in the file that has the same voiceprint feature; and marking the identified multimedia data that has the same voiceprint feature. The invention also provides a corresponding device. With the method and the device, the recordings of different speakers can be distinguished by their voiceprint features and the recordings of each speaker can be marked, so that a user can conveniently view the recordings of a designated speaker.
Description
Technical field
The present invention relates to the field of computer technology, and in particular to a method and device for viewing recordings in a multimedia file.
Background
With the rapid development of computer technology, the performance of terminals such as smartphones, tablet computers and computers keeps improving. As people's everyday needs grow, the types of applications these terminals support also keep increasing, covering functions such as communication, social networking, multimedia playback, web browsing and shopping. Playing multimedia files on a terminal device is one of the functions people use most often in daily life.
However, when a multimedia file is played in the conventional way, it is played sequentially from beginning to end. When the file contains recordings of several speakers and the user wants to view or note down the recording of a particular speaker, the user must first play the whole file through once, manually note the start time of each speaker's recording, and then, on later playback, adjust the playback progress according to those notes in order to play the recording of the desired speaker. Playing a designated speaker's recording with the conventional playback method therefore requires many manual operations, which is tedious, time-consuming and laborious.
Summary of the invention
In view of this, to address the problem that playing a designated speaker's recording with the conventional multimedia playback method is tedious, time-consuming and laborious, a method and device for viewing recordings in a multimedia file are provided.
A method for viewing recordings in a multimedia file, comprising:
scanning a multimedia file to obtain the voiceprint data in the multimedia file;
identifying, according to the voiceprint data, the multimedia data in the multimedia file that has the same voiceprint feature; and
marking the identified multimedia data that has the same voiceprint feature.
In one embodiment, marking the identified multimedia data that has the same voiceprint feature comprises: binding the identified multimedia data that has the same voiceprint feature to a preset label;
after the identified multimedia data that has the same voiceprint feature is bound to the preset label, the method further comprises:
receiving a selection operation on the preset label; and
according to the selection operation on the preset label, playing the multimedia data with the same voiceprint feature that is bound to the preset label.
In one embodiment, marking the identified multimedia data that has the same voiceprint feature comprises:
reading the start time of the recording corresponding to each voiceprint feature; and
marking the identified multimedia data that has the same voiceprint feature with its start time.
In one embodiment, after identifying, according to the voiceprint data, the multimedia data in the multimedia file that has the same voiceprint feature, the method further comprises:
reading local contact voiceprint data, identifying the voiceprint feature of the local contact voiceprint data, and matching the voiceprint feature of the local contact against the voiceprint features already obtained from the multimedia file; and
if a voiceprint feature in the multimedia file is the same as the voiceprint feature of a local contact, extracting the local contact's name and marking the multimedia data with that name.
In one embodiment, the method for viewing recordings in a multimedia file further comprises:
detecting whether the multimedia file contains subtitle information and, if it does, displaying a subtitle search window on the display interface.
A device for viewing recordings in a multimedia file, comprising:
a voiceprint data scanning module, configured to scan a multimedia file and obtain the voiceprint data in the multimedia file;
a voiceprint feature analysis module, configured to identify, according to the voiceprint data, the multimedia data in the multimedia file that has the same voiceprint feature; and
a recording marking module, configured to mark the identified multimedia data that has the same voiceprint feature.
In one embodiment, the device for viewing recordings in a multimedia file further comprises:
a preset label binding module, configured to bind the identified multimedia data that has the same voiceprint feature to a preset label;
a receiving module, configured to receive a selection operation on the preset label; and
a recording playing module, configured to play, according to the selection operation on the preset label, the multimedia data with the same voiceprint feature that is bound to the preset label.
In one embodiment, the device for viewing recordings in a multimedia file further comprises:
a start time acquisition module, configured to read the start time of the recording corresponding to each voiceprint feature;
the recording marking module is further configured to mark the identified multimedia data that has the same voiceprint feature with its start time.
In one embodiment, the device for viewing recordings in a multimedia file further comprises:
a local voiceprint analysis module, configured to read local contact voiceprint data, identify the voiceprint feature of the local contact voiceprint data, and match the voiceprint feature of the local contact against the voiceprint features obtained from the multimedia file;
if a voiceprint feature in the multimedia file is the same as the voiceprint feature of a local contact, the local voiceprint analysis module extracts the local contact's name and sends it to the recording marking module;
the recording marking module marks the multimedia data with the local contact's name.
In one embodiment, the device for viewing recordings in a multimedia file further comprises:
a subtitle information detection module, configured to detect whether the multimedia file contains subtitle information and, when it does, to display a subtitle search window on the display interface.
The above method and device for viewing recordings in a multimedia file identify the multimedia data in the multimedia file that has the same voiceprint feature and mark the identified multimedia data. They can therefore distinguish the recordings of different speakers by voiceprint feature and mark the recordings of each speaker, so that the user can conveniently view the recordings of a designated speaker. When the user needs to view the recordings of a particular speaker, the marks make it possible to find them quickly, without first listening through the whole recording to take notes. The operation is simple, saves the time the user spends looking for a designated speaker's recordings, and makes the recordings easier to view.
Brief description of the drawings
Fig. 1 is a flowchart of the method for viewing recordings in a multimedia file in one embodiment;
Fig. 2 is a schematic diagram of an interface showing speaker recording marks in a multimedia file, in a specific application scenario of one embodiment;
Fig. 3 is a schematic diagram of an interface showing speaker recording marks in a multimedia file, in a specific application scenario of another embodiment;
Fig. 4 is a schematic diagram of an interface showing speaker recording marks in a multimedia file, in a specific application scenario of yet another embodiment;
Fig. 5 is a structural diagram of the device for viewing recordings in a multimedia file in one embodiment.
Detailed description of the invention
To make the purpose, technical solution and advantages of the present invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain the invention and do not limit it.
Referring to Fig. 1, a method for viewing recordings in a multimedia file comprises:
Step 120: Scan the multimedia file and obtain the voiceprint data in the multimedia file.
Specifically, the multimedia file can be an audio file or a video file that contains recordings. When a video or audio file containing recordings is viewed, the file is scanned for voiceprint data, and the voiceprint data in the video or audio file is obtained.
Step 140: According to the voiceprint data, identify the multimedia data in the multimedia file that has the same voiceprint feature.
Specifically, voiceprint feature recognition and analysis are performed on the voiceprint data obtained in step 120 to extract voiceprint feature information. The multimedia data in the multimedia file is then identified according to this voiceprint feature information, yielding the multimedia data that has the same voiceprint feature.
Step 160: Mark the identified multimedia data that has the same voiceprint feature.
Specifically, multimedia data with the same voiceprint feature is given the same mark. By looking at the marks, the user can tell which multimedia data shares a voiceprint feature and is therefore the recorded speech of one speaker. This lets the user tell apart the recordings of the several speakers in the multimedia file and conveniently view the recordings of a designated speaker.
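Steps 120 to 160 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the text does not say how voiceprint features are extracted or compared, so the `Segment` type, the two-dimensional feature vectors, the cosine-similarity comparison and the `threshold` value are all hypothetical stand-ins. The segment boundaries reuse the 7 s, 13 s and 19 s start times from the example in this description; the 25 s end time is invented.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class Segment:
    start: float        # start time of the recording, in seconds
    end: float          # end time of the recording, in seconds
    voiceprint: tuple   # hypothetical feature vector extracted for this segment


def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def group_by_voiceprint(segments, threshold=0.9):
    """Step 140 (sketch): collect segments whose voiceprints count as the same."""
    groups = []  # each entry: (reference voiceprint, member segments)
    for seg in segments:
        for reference, members in groups:
            if cosine(reference, seg.voiceprint) >= threshold:
                members.append(seg)
                break
        else:
            # No existing group is similar enough: this is a new speaker.
            groups.append((seg.voiceprint, [seg]))
    return [members for _, members in groups]


def mark_groups(groups):
    """Step 160 (sketch): give every group one shared mark."""
    return {f"Speaker {i}": members for i, members in enumerate(groups, start=1)}


# Step 120 is assumed to have produced these segments already.
segments = [
    Segment(7, 13, (1.0, 0.1)),
    Segment(13, 19, (0.1, 1.0)),
    Segment(19, 25, (0.9, 0.2)),
]
marks = mark_groups(group_by_voiceprint(segments))
for name, members in marks.items():
    print(name, [seg.start for seg in members])
```

The greedy grouping compares each segment only with the first segment of every group; a real system would more likely use a trained speaker model and a proper clustering step.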
In one embodiment, the marks applied to the multimedia data can be text names. For example, the marks corresponding to the different voiceprint features are Speaker 1, Speaker 2, ..., Speaker n, and multimedia data with the same voiceprint feature carries the same mark. For instance, suppose that in one embodiment two voiceprint features are obtained, a first voiceprint feature and a second voiceprint feature, and that there are two segments of multimedia data with the first voiceprint feature and one segment with the second. With text naming, the multimedia data with the first voiceprint feature is marked Speaker 1 and the multimedia data with the second voiceprint feature is marked Speaker 2: both segments with the first voiceprint feature are marked Speaker 1, and the one segment with the second voiceprint feature is marked Speaker 2. Multimedia data with the same voiceprint feature always carries the same mark.
It should be noted that the mark applied to the multimedia data can take any form, as long as it distinguishes multimedia data with different voiceprint features and lets the user clearly tell which multimedia data is the recording of the same speaker. In other embodiments, for example, symbols or graphics can be used as marks. Continuing the example above, in one embodiment both segments of multimedia data with the first voiceprint feature can be marked with a triangle, and the one segment of multimedia data with the second voiceprint feature can be marked with a square. The above is only an example, and this embodiment imposes no specific limitation.
In one embodiment, step 160 includes: binding the identified multimedia data that has the same voiceprint feature to a preset label.
Specifically, a preset label is a play button for multimedia data. For a given speaker, the number of preset labels equals the number of pieces of multimedia data that have that speaker's voiceprint feature. The preset labels tell these pieces apart: each piece of multimedia data with the same voiceprint feature has a preset label with different display content, so that the several pieces of multimedia data sharing a voiceprint feature can be distinguished and the user can view them easily.
Continuing the example above, the two segments of multimedia data with the first voiceprint feature, marked Speaker 1, are bound to two preset labels with different display content; likewise, the one segment of multimedia data marked Speaker 2 is bound to one preset label. Specifically, as shown in Fig. 2, in one embodiment, of the two pieces of multimedia data marked Speaker 1, the preset label 202 bound to one displays "Recording 1" and the preset label 202 bound to the other displays "Recording 2"; for the one piece of multimedia data marked Speaker 2, the preset label 202 displays "Recording 1".
As shown in Fig. 2, in this embodiment the multimedia data of the two speakers' different voiceprint features is displayed as a list. The marks Speaker 1 and Speaker 2 are sorted by the number of corresponding pieces of multimedia data, and each preset label 202 is displayed next to its speaker mark. It should be noted that this embodiment does not limit the display form of the multimedia data marks and preset labels 202, or the sort order of the marks; the above is only an example and imposes no specific limitation.
Further, in this embodiment, after the identified multimedia data that has the same voiceprint feature is bound to a preset label, the method also includes: receiving a selection operation on the preset label and, according to that selection operation, playing the multimedia data with the same voiceprint feature that is bound to the preset label.
Specifically, the terminal detects whether a selection operation by the user on a preset label has been received. When such an operation is received, the terminal obtains the multimedia data corresponding to the preset label and plays it. For example, when the terminal detects that the user has selected the preset label 202 that belongs to the mark Speaker 1 and displays "Recording 1", it plays the multimedia data corresponding to that preset label 202, namely Recording 1 of Speaker 1. The selection operation can be a click or a press; this embodiment imposes no specific limitation.
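The selection handling can be sketched as a small callback. The `Player` class below is a hypothetical stand-in for a real media player and only remembers what it was asked to play; the `bindings` structure and the `(start, end)` values are likewise assumptions for illustration.

```python
class Player:
    """Stand-in for a real media player: records the ranges it is asked to play."""

    def __init__(self):
        self.played = []

    def play(self, start, end):
        self.played.append((start, end))


def on_label_selected(bindings, speaker, label, player):
    """Look up the multimedia data bound to the selected label and play it."""
    start, end = bindings[speaker][label]
    player.play(start, end)


bindings = {
    "Speaker 1": {"Recording 1": (7, 13), "Recording 2": (19, 25)},
    "Speaker 2": {"Recording 1": (13, 19)},
}
player = Player()
# Simulate the user clicking "Recording 1" under Speaker 1.
on_label_selected(bindings, "Speaker 1", "Recording 1", player)
print(player.played)
```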
In one embodiment, step 160 includes: reading the start time of the recording corresponding to each voiceprint feature, and marking the identified multimedia data that has the same voiceprint feature with its start time.
Specifically, for the different pieces of multimedia data that share a voiceprint feature, the recording start time of each piece is read, and each piece is marked with its own start time. A start time interface element is generated from the recording start time of each piece of multimedia data, and the mark of each piece is displayed together with its start time interface element. As shown in Fig. 3, the start times of the two recordings of Speaker 1 are 7 s and 19 s, and the start time of the one recording of Speaker 2 is 13 s. The two interface elements 302 corresponding to the mark Speaker 1 display 0:00:07 and 0:00:19, and the interface element 302 corresponding to the mark Speaker 2 displays 0:00:13. Marking the multimedia data with recording start times makes it easier for the user to find the recordings of a designated speaker.
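The text shown on a start time interface element can be derived from the start time in seconds. The sketch below assumes the `H:MM:SS` layout seen in the Fig. 3 example; the function name `format_start_time` is hypothetical.

```python
def format_start_time(seconds):
    """Format a start time in whole seconds as H:MM:SS, e.g. 7 -> '0:00:07'."""
    hours, remainder = divmod(int(seconds), 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


# Start times from the Fig. 3 example.
start_times = {"Speaker 1": [7, 19], "Speaker 2": [13]}
elements = {
    speaker: [format_start_time(t) for t in times]
    for speaker, times in start_times.items()
}
print(elements)
```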
Further, after the step of generating the start time interface elements 302 from the start time of each recording and displaying the mark of each recording with its corresponding start time interface elements 302, the method also includes: receiving a selection operation on an interface element 302 and, according to that selection operation, playing the multimedia file from the recording start time corresponding to the interface element 302.
Specifically, the terminal detects whether a selection operation by the user on an interface element 302 has been received. When such an operation is received, the terminal obtains the recording start time corresponding to the interface element 302 and starts playing the multimedia file from that time. For example, when the terminal detects that the user has selected the interface element 302 that displays 0:00:07, it starts playing the multimedia file at the 7 s position and stops at 13 s, thereby playing the recording of Speaker 1 that corresponds to the interface element 302 showing 0:00:07. The selection operation can be a click or a press; this embodiment imposes no specific limitation.
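In the example, playback that starts at 7 s stops at 13 s, which is the start time of the next recording. A minimal sketch of that behaviour follows, assuming that a recording always ends where the next one begins; the text only gives the single example, so this rule, the function name `playback_range` and the 30 s file duration are assumptions.

```python
from bisect import bisect_right


def playback_range(start_times, selected, duration):
    """Return (start, end) for the recording that begins at `selected`.

    The end is taken to be the next recording's start time, or the file
    duration when the selected recording is the last one.
    """
    ordered = sorted(start_times)
    i = bisect_right(ordered, selected)  # index of the first start time after `selected`
    end = ordered[i] if i < len(ordered) else duration
    return selected, end


all_starts = [7, 19, 13]  # start times of every recording in the file
print(playback_range(all_starts, 7, duration=30))
print(playback_range(all_starts, 19, duration=30))
```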
In this embodiment the multimedia data is marked with recording start times. When the user needs to view the recording of a particular speaker, selecting the corresponding interface element 302 plays the multimedia file from that element's start time, without the user first listening through the recording to note down times. The operation is simple, and because the start time of each recording is read accurately, the recorded times are accurate. This further reduces the time the user spends finding a designated speaker's recordings and keeps viewing efficient.
As shown in Fig. 3, in this embodiment the multimedia data of the two speakers' different voiceprint features is displayed as a list. The marks Speaker 1 and Speaker 2 are sorted by the start time of their recordings, and each interface element 302 is displayed next to its speaker mark. It should be noted that this embodiment does not limit the sort order of the multimedia data marks and interface elements 302. In other embodiments, the marks of multimedia data with different voiceprint features can also be sorted by number of recordings, by recording duration, or in other ways. Nor does this embodiment limit the display form of the multimedia data marks and interface elements 302. As shown in Fig. 4, in one embodiment the marks and start times of the multimedia data can be shown directly on a timeline: the three start times 0:00:07, 0:00:13 and 0:00:19 are marked on the timeline, and the marks Speaker 1 and Speaker 2 are placed at the corresponding start time positions. With this display form, the user can play the multimedia file either by dragging the playback progress bar to a start time position or by clicking or touching the start time position directly.
The embodiments above view the recordings of a designated speaker in a single multimedia file. In other embodiments, several multimedia files can be viewed at once. When the recordings of a designated speaker need to be viewed across several multimedia files, the files are scanned at the same time, the voiceprint data in each file is obtained, the multimedia data with the same voiceprint feature across the files is identified according to the voiceprint data, and that data is marked, so that each speaker's recordings in the different files are marked. The basic principle is the same as in the embodiments above: the files are scanned together, each file goes through the identification and marking process described above, and the marking results are finally aggregated and displayed.
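The final aggregation step can be sketched as merging per-file marking results into one view per speaker. The sketch assumes that identification has already assigned the same speaker mark to the same voiceprint in every file; the file names, start times and the function name `aggregate_marks` are hypothetical.

```python
from collections import defaultdict


def aggregate_marks(per_file_marks):
    """Merge per-file results into {speaker mark: [(file name, start time), ...]}.

    per_file_marks: {file name: {speaker mark: [start time, ...]}}
    """
    summary = defaultdict(list)
    for file_name, file_marks in per_file_marks.items():
        for speaker, starts in file_marks.items():
            for start in starts:
                summary[speaker].append((file_name, start))
    return dict(summary)


per_file_marks = {
    "meeting_a.mp3": {"Speaker 1": [7, 19], "Speaker 2": [13]},
    "meeting_b.mp3": {"Speaker 2": [0], "Speaker 1": [42]},
}
summary = aggregate_marks(per_file_marks)
for speaker, entries in summary.items():
    print(speaker, entries)
```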
In one embodiment, the following is also performed after step 140: reading local contact voiceprint data, identifying the voiceprint feature of the local contact voiceprint data, and matching the voiceprint feature of the local contact against the voiceprint features obtained from the multimedia file; if a voiceprint feature in the multimedia file is the same as the voiceprint feature of a local contact, extracting the local contact's name and marking the multimedia data with that name.
Specifically, local contacts include contacts in the address book and contacts in social networking applications that support voice functions. The local contact voiceprint data is first obtained and analysed to get the voiceprint features of the local contacts. The voiceprint features obtained from the multimedia file are then matched against the voiceprint features of the local contacts. If a match succeeds, the local contact's name is extracted and used as the mark for the multimedia data corresponding to that voiceprint feature. The local contact's name can be the contact's name in the address book or the contact's nickname in a social networking application that supports voice functions. For example, if matching finds that the voiceprint features of two contacts, one named Xiao Zhang in the address book and one with the nickname Onion Bulb in a social networking application, are the same as two voiceprint features obtained from the multimedia file, the two corresponding pieces of multimedia data are marked Xiao Zhang and Onion Bulb respectively.
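Contact matching can be sketched as replacing a default mark with a contact's name whenever the two voiceprints are close enough. The text speaks of voiceprint features being "the same"; treating that as cosine similarity above a `threshold` is an assumption, as are the feature vectors and the function name `label_with_contacts`.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def label_with_contacts(speaker_prints, contact_prints, threshold=0.9):
    """Map each default mark to the best-matching contact name, if any."""
    labels = {}
    for default_mark, voiceprint in speaker_prints.items():
        best_name, best_score = None, threshold
        for name, contact_print in contact_prints.items():
            score = cosine(voiceprint, contact_print)
            if score >= best_score:
                best_name, best_score = name, score
        # Keep the default mark when no contact is similar enough.
        labels[default_mark] = best_name or default_mark
    return labels


speaker_prints = {
    "Speaker 1": (1.0, 0.1),
    "Speaker 2": (0.1, 1.0),
    "Speaker 3": (0.7, 0.7),
}
contact_prints = {"Xiao Zhang": (0.9, 0.2), "Onion Bulb": (0.2, 0.9)}
labels = label_with_contacts(speaker_prints, contact_prints)
print(labels)
```

Speakers with no matching contact keep their default mark, so named and unnamed marks can appear side by side in the same list.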
In this embodiment, matching against the voiceprint features of local contacts allows recordings to be marked with the contacts' names, which makes it still easier for the user to view the recordings in a multimedia file. When marks based on contact names are displayed together with marks named in other ways, they can be sorted alphabetically by initial or in numerical order, or, as described in the embodiments above, by recording start time, by number of recordings, or by recording duration.
In one embodiment, the method for viewing recordings in a multimedia file further includes: detecting whether the multimedia file contains subtitle information and, if it does, displaying a subtitle search window on the display interface.
Specifically, when the multimedia file to be viewed contains subtitle information, a subtitle search window is displayed on the display interface. The user can type text into it to search, and can thus quickly find a designated speaker's recording from the content of what the speaker said. This makes searching still easier and improves the user's search efficiency.
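The subtitle check and search can be sketched as follows. The subtitle format is not specified in the text, so representing subtitles as `(start time, text)` cues is an assumption, as are the function names and the sample lines.

```python
def has_subtitles(cues):
    """The search window is only shown when the file carries subtitle cues."""
    return len(cues) > 0


def search_subtitles(cues, query):
    """Return the (start time, text) cues whose text contains the query."""
    needle = query.casefold()
    return [(start, text) for start, text in cues if needle in text.casefold()]


cues = [
    (7, "Welcome to the quarterly review"),
    (13, "Thank you, here is the budget"),
    (19, "Back to the review of last quarter"),
]
if has_subtitles(cues):
    hits = search_subtitles(cues, "Review")
    print(hits)
```

Each hit carries a start time, so a result can be used to jump to the matching position in the recording.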
As shown in Fig. 5, a device 500 for viewing recordings in a multimedia file comprises:
a voiceprint data scanning module 502, configured to scan a multimedia file and obtain the voiceprint data in the multimedia file;
a voiceprint feature analysis module 504, configured to identify, according to the voiceprint data, the multimedia data in the multimedia file that has the same voiceprint feature; and
a recording marking module 506, configured to mark the identified multimedia data that has the same voiceprint feature.
In one embodiment, the device for viewing recordings in a multimedia file further comprises a preset label binding module, a receiving module and a recording playing module. The preset label binding module is configured to bind the identified multimedia data that has the same voiceprint feature to a preset label. The receiving module is configured to receive a selection operation on the preset label. The recording playing module is configured to play, according to the selection operation on the preset label, the multimedia data with the same voiceprint feature that is bound to the preset label.
In one embodiment, the device for viewing recordings in a multimedia file further comprises a start time acquisition module, configured to read the start time of the recording corresponding to each voiceprint feature. The recording marking module 506 is further configured to mark the identified multimedia data that has the same voiceprint feature with its start time.
In one embodiment, the device for viewing recordings in a multimedia file further comprises a local voiceprint analysis module, configured to read local contact voiceprint data, identify the voiceprint feature of the local contact voiceprint data, and match the voiceprint feature of the local contact against the voiceprint features obtained from the multimedia file. If a voiceprint feature in the multimedia file is the same as the voiceprint feature of a local contact, the local voiceprint analysis module extracts the local contact's name and sends it to the recording marking module, and the recording marking module marks the multimedia data with the local contact's name.
In one embodiment, the device for viewing recordings in a multimedia file further comprises a subtitle information detection module, configured to detect whether the multimedia file contains subtitle information and, when it does, to display a subtitle search window on the display interface.
The above method and device for viewing recordings in a multimedia file can be applied to terminals including, but not limited to, at least one of the following: smartphones, tablet computers, notebook computers, desktop computers and wearable smart devices. The above is only an example, and this embodiment imposes no limitation in this respect.
The above method and device for viewing recordings in a multimedia file identify the multimedia data in the multimedia file that has the same voiceprint feature and mark the identified multimedia data. They can therefore distinguish the recordings of different speakers by voiceprint feature and mark each speaker's recordings, so that the user can conveniently view the recordings of a specified speaker. When the user needs to view the recordings of a particular speaker, the user can quickly find them by their marks, without having to listen through the recordings one by one and take notes. This is easy to operate, effectively saves the time the user spends locating a specified speaker's recordings, and makes viewing recordings more convenient.
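The overall flow summarized above, grouping multimedia data by voiceprint feature and marking each group, can be sketched as follows. The sketch assumes that each recording segment has already been reduced to a numeric voiceprint vector and uses a simple greedy grouping by cosine similarity; both the vector representation and the grouping rule are assumptions for illustration, as are the names `label_speakers` and the "Speaker N" marks.

```python
import math
from typing import List, Sequence, Tuple


def _cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def label_speakers(
    segments: List[Tuple[float, Sequence[float]]],
    threshold: float = 0.9,  # assumed cut-off for "the same" voiceprint
) -> List[Tuple[float, str]]:
    """Mark each (start_seconds, voiceprint_vector) segment with a speaker label."""
    representatives: List[Sequence[float]] = []  # first vector seen for each speaker
    labelled: List[Tuple[float, str]] = []
    for start_s, feature in segments:
        for index, representative in enumerate(representatives):
            if _cosine(feature, representative) >= threshold:
                break  # same voiceprint as a speaker already seen
        else:
            # No existing speaker matched: register a new one.
            representatives.append(feature)
            index = len(representatives) - 1
        labelled.append((start_s, f"Speaker {index + 1}"))
    return labelled
```

With such marks in place, finding every recording of one speaker reduces to filtering the returned list by label.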
The technical features of the embodiments described above can be combined arbitrarily. For brevity, not every possible combination of the technical features in the above embodiments has been described; however, as long as a combination of these technical features contains no contradiction, it should be considered within the scope of this specification.
The embodiments described above express only several implementations of the present invention. Their description is relatively specific and detailed, but it should not therefore be construed as limiting the scope of the patent. It should be noted that those of ordinary skill in the art can make a number of variations and improvements without departing from the inventive concept, all of which fall within the protection scope of the present invention. The protection scope of this patent shall therefore be determined by the appended claims.
Claims (10)
1. A method for viewing recordings in a multimedia file, characterized by comprising:
scanning a multimedia file to obtain the voiceprint data in the multimedia file;
identifying, according to the voiceprint data, the multimedia data in the multimedia file that has the same voiceprint feature; and
marking the identified multimedia data having the same voiceprint feature.
2. The method for viewing recordings in a multimedia file according to claim 1, characterized in that
marking the identified multimedia data having the same voiceprint feature comprises: binding the identified multimedia data having the same voiceprint feature to a preset label;
and in that, after binding the identified multimedia data having the same voiceprint feature to the preset label, the method further comprises:
receiving a selection operation on the preset label; and
playing, according to the selection operation on the preset label, the multimedia data having the same voiceprint feature that is bound to the preset label.
3. The method for viewing recordings in a multimedia file according to claim 1, characterized in that marking the identified multimedia data having the same voiceprint feature comprises:
reading the start time of the recording corresponding to each voiceprint feature; and
marking the identified multimedia data having the same voiceprint feature with its start time.
4. The method for viewing recordings in a multimedia file according to claim 1, characterized in that, after identifying, according to the voiceprint data, the multimedia data in the multimedia file that has the same voiceprint feature, the method further comprises:
reading local contact voiceprint data, identifying the voiceprint feature of the local contact voiceprint data, and matching the voiceprint feature of the local contact against the voiceprint features obtained from the multimedia file; and
if a voiceprint feature in the multimedia file is the same as the voiceprint feature of the local contact, extracting the local contact's name and marking the multimedia data with the local contact's name.
5. The method for viewing recordings in a multimedia file according to claim 1, characterized by further comprising:
detecting whether the multimedia file contains caption information and, if the multimedia file contains caption information, displaying a caption search window on the display interface.
6. A device for viewing recordings in a multimedia file, characterized by comprising:
a voiceprint data scanning module, configured to scan a multimedia file and obtain the voiceprint data in the multimedia file;
a voiceprint feature analysis module, configured to identify, according to the voiceprint data, the multimedia data in the multimedia file that has the same voiceprint feature; and
a recording marking module, configured to mark the identified multimedia data having the same voiceprint feature.
7. The device for viewing recordings in a multimedia file according to claim 6, characterized by further comprising:
a preset label binding module, configured to bind the identified multimedia data having the same voiceprint feature to a preset label;
a receiving module, configured to receive a selection operation on the preset label; and
a recording playing module, configured to play, according to the selection operation on the preset label, the multimedia data having the same voiceprint feature that is bound to the preset label.
8. The device for viewing recordings in a multimedia file according to claim 6, characterized by further comprising:
a start time acquisition module, configured to read the start time of the recording corresponding to each voiceprint feature;
wherein the recording marking module is further configured to mark the identified multimedia data having the same voiceprint feature with its start time.
9. The device for viewing recordings in a multimedia file according to claim 6, characterized by further comprising:
a local voiceprint analysis module, configured to read local contact voiceprint data, identify the voiceprint feature of the local contact voiceprint data, and match the voiceprint feature of the local contact against the voiceprint features obtained from the multimedia file;
wherein, if a voiceprint feature in the multimedia file is the same as the voiceprint feature of the local contact, the local voiceprint analysis module extracts the local contact's name and sends the local contact's name to the recording marking module; and
the recording marking module marks the multimedia data with the local contact's name.
10. The device for viewing recordings in a multimedia file according to claim 6, characterized by further comprising:
a caption information detection module, configured to detect whether the multimedia file contains caption information and, when the multimedia file contains caption information, to display a caption search window on the display interface.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610877020.6A CN106448683A (en) | 2016-09-30 | 2016-09-30 | Method and device for viewing recording in multimedia files |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610877020.6A CN106448683A (en) | 2016-09-30 | 2016-09-30 | Method and device for viewing recording in multimedia files |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106448683A (en) | 2017-02-22 |
Family
ID=58171629
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610877020.6A Pending CN106448683A (en) | 2016-09-30 | 2016-09-30 | Method and device for viewing recording in multimedia files |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106448683A (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6961703B1 (en) * | 2000-09-13 | 2005-11-01 | Itt Manufacturing Enterprises, Inc. | Method for speech processing involving whole-utterance modeling |
CN102063461A (en) * | 2009-11-06 | 2011-05-18 | 株式会社理光 | Comment recording appartus and method |
CN103035247A (en) * | 2012-12-05 | 2013-04-10 | 北京三星通信技术研究有限公司 | Method and device of operation on audio/video file based on voiceprint information |
CN104123115A (en) * | 2014-07-28 | 2014-10-29 | 联想(北京)有限公司 | Audio information processing method and electronic device |
CN105488227A (en) * | 2015-12-29 | 2016-04-13 | 惠州Tcl移动通信有限公司 | Electronic device and method for processing audio file based on voiceprint features through same |
CN105679357A (en) * | 2015-12-29 | 2016-06-15 | 惠州Tcl移动通信有限公司 | Mobile terminal and voiceprint identification-based recording method thereof |
CN105719659A (en) * | 2016-02-03 | 2016-06-29 | 努比亚技术有限公司 | Recording file separation method and device based on voiceprint identification |
Cited By (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107452408A (en) * | 2017-07-27 | 2017-12-08 | 上海与德科技有限公司 | A kind of audio frequency playing method and device |
CN107452408B (en) * | 2017-07-27 | 2020-09-25 | 成都声玩文化传播有限公司 | Audio playing method and device |
CN107403623A (en) * | 2017-07-31 | 2017-11-28 | 努比亚技术有限公司 | Store method, terminal, Cloud Server and the readable storage medium storing program for executing of recording substance |
CN107564531A (en) * | 2017-08-25 | 2018-01-09 | 百度在线网络技术(北京)有限公司 | Minutes method, apparatus and computer equipment based on vocal print feature |
CN107845386B (en) * | 2017-11-14 | 2020-04-21 | 维沃移动通信有限公司 | Sound signal processing method, mobile terminal and server |
CN107845386A (en) * | 2017-11-14 | 2018-03-27 | 维沃移动通信有限公司 | Audio signal processing method, mobile terminal and server |
CN108364654B (en) * | 2018-01-30 | 2020-10-13 | 网易乐得科技有限公司 | Voice processing method, medium, device and computing equipment |
WO2019183904A1 (en) * | 2018-03-29 | 2019-10-03 | 华为技术有限公司 | Method for automatically identifying different human voices in audio |
CN111328418A (en) * | 2018-03-29 | 2020-06-23 | 华为技术有限公司 | Method for automatically identifying different voices in audio |
CN109471840A (en) * | 2018-10-15 | 2019-03-15 | 北京海数宝科技有限公司 | Fileview method, apparatus, computer equipment and storage medium |
CN109410953A (en) * | 2018-12-21 | 2019-03-01 | 上海蒂茜科技有限公司 | A kind of vertical play system of multimedia |
CN111641754A (en) * | 2020-05-29 | 2020-09-08 | 北京小米松果电子有限公司 | Contact photo generation method and device and storage medium |
CN111641754B (en) * | 2020-05-29 | 2021-12-14 | 北京小米松果电子有限公司 | Contact photo generation method and device and storage medium |
CN111913627A (en) * | 2020-06-22 | 2020-11-10 | 维沃移动通信有限公司 | Recording file display method and device and electronic equipment |
CN114464198A (en) * | 2021-11-30 | 2022-05-10 | 中国人民解放军战略支援部队信息工程大学 | Visual human voice separation system, method and device |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106448683A (en) | Method and device for viewing recording in multimedia files | |
CN107274916B (en) | Method and device for operating audio/video file based on voiceprint information | |
CN106024009A (en) | Audio processing method and device | |
US20070265720A1 (en) | Content marking method, content playback apparatus, content playback method, and storage medium | |
CN102937959A (en) | Automatically creating a mapping between text data and audio data | |
WO2016197708A1 (en) | Recording method and terminal | |
CN110335625A (en) | The prompt and recognition methods of background music, device, equipment and medium | |
JP2007519047A (en) | Method and system for determining topic of conversation and acquiring and presenting related content | |
KR100676863B1 (en) | System and method for providing music search service | |
CN103491450A (en) | Setting method of playback fragment of media stream and terminal | |
CN104657074A (en) | Method, device and mobile terminal for realizing sound recording | |
CN106407358B (en) | Image searching method and device and mobile terminal | |
CN107885483A (en) | Method of calibration, device, storage medium and the electronic equipment of audio-frequency information | |
CN103530320A (en) | Multimedia file processing method and device and terminal | |
CN105139698A (en) | Information input method and device of point reading machine | |
US8996580B2 (en) | Apparatus and method for generating multimedia play list based on user experience in portable multimedia player | |
CN102664008B (en) | Method, terminal and system for transmitting data | |
JP2009519538A (en) | Method and apparatus for accessing a digital file from a collection of digital files | |
CN105138617A (en) | Music automatic positioning and annotation system and method | |
CN113901186A (en) | Telephone recording marking method, device, equipment and storage medium | |
CN108305622B (en) | Voice recognition-based audio abstract text creating method and device | |
CN110309324A (en) | A kind of searching method and relevant apparatus | |
CN106791442B (en) | A kind of image pickup method and mobile terminal | |
CN111723235B (en) | Music content identification method, device and equipment | |
KR20130110965A (en) | Sensibility evalution and contents recommendation method based on user feedback |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20170222 |
RJ01 | Rejection of invention patent application after publication | |