CN106448683A - Method and device for viewing recording in multimedia files - Google Patents

Info

Publication number
CN106448683A
Authority
CN
China
Prior art keywords
vocal print
print feature
medium data
multimedia file
recording
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610877020.6A
Other languages
Chinese (zh)
Inventor
韩旭
黄润杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Meizu Technology Co Ltd
Original Assignee
Meizu Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Meizu Technology Co Ltd
Priority to CN201610877020.6A
Publication of CN106448683A
Legal status: Pending

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/02 Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/40 Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
    • G06F16/43 Querying
    • G06F16/432 Query formulation
    • G06F16/433 Query formulation using audio data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/40 Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
    • G06F16/48 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G10L25/54 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for retrieval

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Acoustics & Sound (AREA)
  • Human Computer Interaction (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Library & Information Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Signal Processing (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Signal Processing For Digital Recording And Reproducing (AREA)

Abstract

The invention relates to a method and a device for viewing recordings in multimedia files. The method comprises the following steps: scanning a multimedia file and acquiring the voiceprint data in it; identifying, according to the voiceprint data, the multimedia data in the file that has the same voiceprint feature; and marking the identified multimedia data that has the same voiceprint feature. A corresponding device is also provided. With this method and device, the recordings of different speakers can be distinguished by their voiceprint features and each speaker's recordings can be marked, so that a user can conveniently view the recordings of a designated speaker.

Description

Method and device for viewing recordings in multimedia files
Technical field
The present invention relates to the field of computer technology, and in particular to a method and device for viewing recordings in multimedia files.
Background art
With the rapid development of computer technology, the performance of terminals such as smartphones, tablets and computers keeps improving. As people's everyday needs grow, the types of applications these terminals support also keep increasing, covering functions such as communication, social networking, multimedia playback, web browsing and shopping. Playing multimedia files on a terminal device is one of the most commonly used functions in daily life.
However, a conventional player plays a multimedia file sequentially from beginning to end. When the file contains recordings of several speakers and the user wants to view or note the recording of one particular speaker, the user must first play the whole file through once, manually note the start time of each speaker's recording, and then, on later playback, adjust the playback progress according to those notes in order to play the desired speaker's recording. Playing a designated speaker's recording with a conventional player therefore requires many manual operations, which is tedious, time-consuming and laborious.
Summary of the invention
In view of this, and to address the problem that playing a designated speaker's recording with a conventional multimedia player is tedious, time-consuming and laborious, a method and device for viewing recordings in multimedia files are provided.
A method for viewing recordings in multimedia files, comprising:
scanning a multimedia file to obtain the voiceprint data in the multimedia file;
identifying, according to the voiceprint data, the multimedia data in the multimedia file that has the same voiceprint feature;
marking the identified multimedia data that has the same voiceprint feature.
In one embodiment, marking the identified multimedia data that has the same voiceprint feature comprises: binding the identified multimedia data that has the same voiceprint feature to preset labels;
after the identified multimedia data that has the same voiceprint feature is bound to the preset labels, the method further comprises:
receiving a selection operation on a preset label;
playing, according to the selection operation on the preset label, the multimedia data with the same voiceprint feature that is bound to that preset label.
In one embodiment, marking the identified multimedia data that has the same voiceprint feature comprises:
reading the start time of the recording corresponding to each voiceprint feature;
marking the identified multimedia data that has the same voiceprint feature with its start time.
In one embodiment, after identifying, according to the voiceprint data, the multimedia data in the multimedia file that has the same voiceprint feature, the method further comprises:
reading the voiceprint data of local contacts, identifying the voiceprint features of that data, and matching the voiceprint features of the local contacts against the voiceprint features already obtained from the multimedia file;
if a voiceprint feature in the multimedia file is the same as the voiceprint feature of a local contact, extracting the name of that local contact and marking the multimedia data with it.
In one embodiment, the method for viewing recordings in multimedia files further comprises:
detecting whether the multimedia file contains subtitle information and, if it does, displaying a subtitle search window on the display interface.
A device for viewing recordings in multimedia files, comprising:
a voiceprint data scanning module, configured to scan a multimedia file and obtain the voiceprint data in the multimedia file;
a voiceprint feature analysis module, configured to identify, according to the voiceprint data, the multimedia data in the multimedia file that has the same voiceprint feature;
a recording marking module, configured to mark the identified multimedia data that has the same voiceprint feature.
In one embodiment, the device for viewing recordings in multimedia files further comprises:
a preset label binding module, configured to bind the identified multimedia data that has the same voiceprint feature to preset labels;
a receiving module, configured to receive a selection operation on a preset label;
a recording playback module, configured to play, according to the selection operation on the preset label, the multimedia data with the same voiceprint feature that is bound to that preset label.
In one embodiment, the device for viewing recordings in multimedia files further comprises:
a start time acquisition module, configured to read the start time of the recording corresponding to each voiceprint feature;
the recording marking module is further configured to mark the identified multimedia data that has the same voiceprint feature with its start time.
In one embodiment, the device for viewing recordings in multimedia files further comprises:
a local voiceprint analysis module, configured to read the voiceprint data of local contacts, identify the voiceprint features of that data, and match the voiceprint features of the local contacts against the voiceprint features already obtained from the multimedia file;
if a voiceprint feature in the multimedia file is the same as the voiceprint feature of a local contact, the name of that local contact is extracted and sent to the recording marking module;
the recording marking module marks the multimedia data with the local contact's name.
In one embodiment, the device for viewing recordings in multimedia files further comprises:
a subtitle information detection module, configured to detect whether the multimedia file contains subtitle information and, when it does, to display a subtitle search window on the display interface.
The above method and device for viewing recordings in multimedia files identify the multimedia data in a multimedia file that has the same voiceprint feature and mark it. They can therefore distinguish the recordings of different speakers by voiceprint feature and mark each speaker's recordings, so that the user can conveniently view the recordings of a designated speaker. When the user needs to view a particular speaker's recording, it can be found quickly from the marks, without first listening through the whole recording and taking notes. The operation is simple, it effectively reduces the time the user spends viewing a designated speaker's recordings, and it makes viewing recordings convenient.
Brief description of the drawings
Fig. 1 is a flowchart of the method for viewing recordings in multimedia files in one embodiment;
Fig. 2 is a schematic diagram of an interface displaying speaker recording marks for a multimedia file in a specific application scenario in one embodiment;
Fig. 3 is a schematic diagram of an interface displaying speaker recording marks for a multimedia file in a specific application scenario in another embodiment;
Fig. 4 is a schematic diagram of an interface displaying speaker recording marks for a multimedia file in a specific application scenario in yet another embodiment;
Fig. 5 is a structural diagram of the device for viewing recordings in multimedia files in one embodiment.
Detailed description of the embodiments
To make the purpose, technical solution and advantages of the present invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only serve to explain the invention and are not intended to limit it.
Referring to Fig. 1, a method for viewing recordings in multimedia files comprises:
Step 120: scan the multimedia file and obtain the voiceprint data in the multimedia file.
Specifically, the multimedia file can be an audio file or a video file that contains recordings. When a video or audio file containing recordings is to be viewed, the file is scanned for voiceprint data, and the voiceprint data in the video or audio file is obtained.
Step 140: according to the voiceprint data, identify the multimedia data in the multimedia file that has the same voiceprint feature.
Specifically, voiceprint feature recognition and analysis is performed on the voiceprint data obtained in step 120 to extract voiceprint feature information. The multimedia data in the multimedia file is then identified according to that feature information, yielding the multimedia data that has the same voiceprint feature.
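Step 140 can be illustrated with a minimal sketch. The description does not say how voiceprint features are represented or compared; the sketch assumes each segment already has a numeric feature vector and treats two vectors as "the same voiceprint feature" when their cosine similarity reaches a threshold. The names `cosine`, `group_by_voiceprint` and the `threshold` value are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def group_by_voiceprint(features, threshold=0.9):
    """Return one group index per segment.

    Segments whose feature vector is close enough to a group's first
    vector share that group's index (assumed comparison rule).
    """
    representatives = []  # first feature vector seen for each group
    assignment = []
    for vec in features:
        for idx, rep in enumerate(representatives):
            if cosine(vec, rep) >= threshold:
                assignment.append(idx)
                break
        else:
            # No existing group matched: start a new one.
            representatives.append(vec)
            assignment.append(len(representatives) - 1)
    return assignment


# Three segments; the first and third are assumed to come from one speaker.
groups = group_by_voiceprint([[1.0, 0.0], [0.0, 1.0], [0.98, 0.05]])
```

A greedy first-match rule keeps the sketch short; a real system would likely use a proper speaker-clustering method.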
Step 160: mark the identified multimedia data that has the same voiceprint feature.
Specifically, multimedia data with the same voiceprint feature is given the same mark. By looking at the marks, the user can tell which multimedia data has the same voiceprint feature, that is, which data is the speech of one speaker. This lets the user distinguish the recordings of multiple speakers in the multimedia file and conveniently view the recordings of a designated speaker.
In one embodiment, the marks applied to the multimedia data can be text names. For example, the marks corresponding to several different voiceprint features are named Speaker 1, Speaker 2, ... Speaker n, and multimedia data with the same voiceprint feature has the same mark. For example, in one embodiment two voiceprint features are obtained, a first voiceprint feature and a second voiceprint feature; there are two segments of multimedia data with the first voiceprint feature and one segment with the second. With the text naming described above, the multimedia data with the first voiceprint feature is marked Speaker 1 and the multimedia data with the second voiceprint feature is marked Speaker 2: both segments with the first voiceprint feature are marked Speaker 1, and the one segment with the second voiceprint feature is marked Speaker 2, so that multimedia data with the same voiceprint feature has the same mark.
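The text-naming scheme in this example can be sketched as follows. The function name `label_segments` and the voiceprint identifiers `vp_first` and `vp_second` are hypothetical; the input is assumed to be one voiceprint identifier per segment, in file order.

```python
def label_segments(voiceprint_ids):
    """Give every segment a 'Speaker n' mark; equal voiceprints get equal marks."""
    marks = {}    # voiceprint id -> mark, numbered in order of first appearance
    result = []
    for vp in voiceprint_ids:
        if vp not in marks:
            marks[vp] = f"Speaker {len(marks) + 1}"
        result.append(marks[vp])
    return result


# Two segments with the first voiceprint feature, one with the second.
labels = label_segments(["vp_first", "vp_second", "vp_first"])
```

Numbering speakers by order of first appearance is an assumption; the description only requires that identical voiceprint features share a mark.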
Note that the mark applied to the multimedia data can take any form, as long as it distinguishes multimedia data with different voiceprint features and lets the user clearly tell which multimedia data is the recording of the same speaker. In other embodiments, for example, symbols or graphics can be used as marks. Still taking the example above, in one embodiment the two segments of multimedia data with the first voiceprint feature are both marked with a triangle, and the one segment with the second voiceprint feature is marked with a square. The above is therefore only an example, and this embodiment imposes no specific limitation.
In one embodiment, step 160 comprises: binding the identified multimedia data that has the same voiceprint feature to preset labels.
Specifically, a preset label is a play button for multimedia data. For a given speaker, the number of preset labels equals the number of multimedia data segments that have the same voiceprint feature. The preset labels display each such segment distinctly: the preset label of each segment with the same voiceprint feature shows different content, so that multiple segments with the same voiceprint feature can be told apart, which is convenient for the user.
Continuing the example above, the two multimedia data segments with the first voiceprint feature, marked Speaker 1, are bound to two preset labels with different display content; likewise, the one segment with the second voiceprint feature, marked Speaker 2, is bound to one preset label. Specifically, as shown in Fig. 2, in one embodiment, of the two segments marked Speaker 1, the preset label 202 bound to one displays "Recording 1" and the preset label 202 bound to the other displays "Recording 2"; for the one segment marked Speaker 2, the preset label 202 displays "Recording 1".
As shown in Fig. 2, in this embodiment the multimedia data of the two speakers' different voiceprint features is displayed as a list; the marks Speaker 1 and Speaker 2 are sorted by the number of corresponding multimedia data segments, and each preset label 202 is displayed next to its speaker's mark. Note that this embodiment does not limit the display form of the marks and preset labels 202 or the sort order of the marks; the above is only an example and imposes no specific limitation.
Further, in this embodiment, after the identified multimedia data that has the same voiceprint feature is bound to the preset labels, the method also comprises: receiving a selection operation on a preset label and, according to that selection operation, playing the multimedia data with the same voiceprint feature that is bound to the preset label.
Specifically, the terminal detects whether a selection operation by the user on a preset label has been received. When one is received, the multimedia data corresponding to the preset label 202 is obtained and played. For example, when the user is detected selecting the preset label 202 that is displayed as "Recording 1" under the mark Speaker 1, the multimedia data corresponding to that label, Recording 1 of Speaker 1, is played. The selection operation can be a click or a press; this embodiment imposes no specific limitation.
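The binding and selection behaviour described above can be sketched as a lookup table from (speaker mark, preset label text) to a segment. All names here (`bind_preset_labels`, `on_label_selected`, the segment identifiers, and the `play` callback) are hypothetical; actual playback is left to whatever player the terminal provides.

```python
from collections import defaultdict


def bind_preset_labels(segments):
    """Bind each (speaker_mark, segment_id) pair to a 'Recording k' label.

    Labels are numbered per speaker, so each speaker's segments get
    distinct display content.
    """
    counters = defaultdict(int)
    bindings = {}
    for speaker, segment_id in segments:
        counters[speaker] += 1
        bindings[(speaker, f"Recording {counters[speaker]}")] = segment_id
    return bindings


def on_label_selected(bindings, speaker, label_text, play):
    """Handle a selection operation: play the segment bound to the label."""
    play(bindings[(speaker, label_text)])


bindings = bind_preset_labels(
    [("Speaker 1", "seg_a"), ("Speaker 2", "seg_b"), ("Speaker 1", "seg_c")]
)
played = []
on_label_selected(bindings, "Speaker 1", "Recording 1", played.append)
```

Passing `play` as a callback keeps the sketch independent of any particular media API.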
In one embodiment, step 160 comprises: reading the start time of the recording corresponding to each voiceprint feature, and marking the identified multimedia data that has the same voiceprint feature with its start time.
Specifically, for the different multimedia data segments that have the same voiceprint feature, the recording start time of each segment is read, and each segment is marked with its start time. A start-time interface element is generated from the recording start time of each segment, and the mark of each segment is displayed together with its start-time interface element. As shown in Fig. 3, the start times of the two recordings of Speaker 1 are 7 s and 19 s, and the start time of the one recording of Speaker 2 is 13 s; the two interface elements 302 corresponding to the mark Speaker 1 display 0:00:07 and 0:00:19, and the interface element 302 corresponding to the mark Speaker 2 displays 0:00:13. Marking the multimedia data with recording start times makes it easier for the user to find a designated speaker's recording.
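The conversion from a start time in seconds to the text shown on an interface element can be sketched as follows. The `H:MM:SS` layout is taken from the Fig. 3 example; the function name `format_start_time` is hypothetical.

```python
def format_start_time(seconds):
    """Format a start time in whole seconds as H:MM:SS, e.g. 7 -> '0:00:07'."""
    hours, remainder = divmod(int(seconds), 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


# Start times from the Fig. 3 example: Speaker 1 at 7 s and 19 s, Speaker 2 at 13 s.
elements = {
    "Speaker 1": [format_start_time(7), format_start_time(19)],
    "Speaker 2": [format_start_time(13)],
}
```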
Further, after the step of generating start-time interface elements 302 from the start time of each recording and displaying the mark of each recording together with its start-time interface element 302, the method also comprises: receiving a selection operation on an interface element 302 and, according to that selection operation, playing the multimedia file from the recording start time corresponding to the interface element 302.
Specifically, the terminal detects whether a selection operation by the user on an interface element 302 has been received. When one is received, the recording start time corresponding to the interface element 302 is obtained, and the multimedia file is played from that start time. For example, when the user is detected selecting the interface element 302 that displays 0:00:07, the multimedia file is played from the 7 s position and playback ends at 13 s, so that the recording of Speaker 1 corresponding to the 0:00:07 interface element 302 is played. The selection operation can be a click or a press; this embodiment imposes no specific limitation.
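The playback window in this example (start at 7 s, stop at 13 s) can be derived from the marked start times. The sketch below assumes a recording ends where the next marked recording begins, which matches the example but is not stated as a general rule; `playback_range` and its `duration` parameter are hypothetical.

```python
def playback_range(start_times, selected, duration=None):
    """Return (start, end) for the recording that begins at `selected`.

    Assumes the recording ends at the next marked start time; for the last
    recording the end is `duration` (the file length, None if unknown).
    """
    later = [t for t in start_times if t > selected]
    end = min(later) if later else duration
    return selected, end


# Start times from the example: 7 s, 13 s and 19 s.
window = playback_range([7, 13, 19], 7)
```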
In this embodiment, the multimedia data is marked with recording start times. When the user wants to view a particular speaker's recording, selecting the corresponding interface element 302 plays the multimedia file from that element's start time, without first listening through the recording to note the times. The operation is simple, and because the start time of each recording is read accurately, the noted times are accurate. This further reduces the time the user spends viewing a designated speaker's recordings and keeps viewing efficient.
As shown in Fig. 3, in this embodiment the multimedia data of the two speakers' different voiceprint features is displayed as a list; the marks Speaker 1 and Speaker 2 are sorted by the start time of their recordings, and each interface element 302 is displayed next to its speaker's mark. Note that this embodiment does not limit the sort order of the marks and interface elements 302; in other embodiments, the marks of multimedia data with different voiceprint features may instead be sorted by number of recordings, by recording duration, or in some other way. Nor does this embodiment limit the display form of the marks and interface elements 302. As shown in Fig. 4, in one embodiment the marks and start times of the multimedia data can be shown directly on a timeline: the three start times 0:00:07, 0:00:13 and 0:00:19 are marked on the timeline, and the marks Speaker 1 and Speaker 2 are placed at the corresponding start positions. With this display form, the user can play the multimedia file either by dragging the playback progress bar to a start position or by directly clicking or touching that position.
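The alternative sort orders mentioned above (by start time, by number of recordings, by recording duration) can be sketched with one function. The description does not fix a sort direction; the sketch assumes earliest start first, most recordings first, and longest total duration first. The function name, the `by` parameter and the sample lengths are hypothetical.

```python
def order_speakers(recordings, by="start"):
    """Order speaker marks for display.

    `recordings` maps a speaker mark to a list of
    (start_seconds, length_seconds) tuples.
    """
    keys = {
        "start": lambda segs: min(start for start, _ in segs),         # earliest first
        "count": lambda segs: -len(segs),                              # most recordings first
        "duration": lambda segs: -sum(length for _, length in segs),   # longest total first
    }
    return sorted(recordings, key=lambda mark: keys[by](recordings[mark]))


# Illustrative data only; the lengths are made up for the sketch.
sample = {"Speaker 1": [(5, 50)], "Speaker 2": [(10, 1), (20, 1)]}
by_start = order_speakers(sample, by="start")
by_count = order_speakers(sample, by="count")
```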
The embodiments above view a designated speaker's recordings within a single multimedia file. In other embodiments, multiple multimedia files can be viewed at once. When a designated speaker's recordings need to be viewed across multiple multimedia files, the files are scanned together, the voiceprint data in each file is obtained, the multimedia data with the same voiceprint feature across the files is identified according to the voiceprint data, and that data is marked, so that each speaker's recordings in the different files are marked. The basic principle is the same as in the embodiments above: the files are scanned together, each file is identified and marked by the process described above, and the marking results are finally aggregated for display.
In one embodiment, the method also comprises, after step 140: reading the voiceprint data of local contacts, identifying the voiceprint features of that data, and matching the voiceprint features of the local contacts against the voiceprint features already obtained from the multimedia file; if a voiceprint feature in the multimedia file is the same as the voiceprint feature of a local contact, extracting the name of that local contact and marking the multimedia data with it.
Specifically, local contacts include address-book contacts and contacts in social applications that support voice functions. The voiceprint data of the local contacts is first obtained and analyzed to get the contacts' voiceprint features, and the voiceprint features already obtained from the multimedia file are matched against them. If a match succeeds, the local contact's name is extracted and used as the mark of the multimedia data corresponding to that voiceprint feature. The local contact name may be an address-book contact name or a contact nickname in a social application that supports voice functions. For example, if matching finds that the voiceprint features of two contacts, one named Xiao Zhang in the address book and one nicknamed Onion Bulb in a social application, are the same as two voiceprint features in the multimedia file, the marks of the two corresponding multimedia data items are named Xiao Zhang and Onion Bulb respectively.
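Contact matching can be sketched in the same spirit. The description does not define how two voiceprint features are judged identical; the sketch assumes numeric feature vectors compared by cosine similarity against a threshold. The names `cosine`, `name_from_contacts`, the `threshold` value and the sample vectors are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def name_from_contacts(feature, contacts, threshold=0.9):
    """Return the best-matching contact name, or None if nothing matches.

    `contacts` maps a contact name to that contact's voiceprint feature.
    """
    best_name, best_score = None, threshold
    for name, contact_feature in contacts.items():
        score = cosine(feature, contact_feature)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


# Contact names from the example; the feature vectors are made up.
contacts = {"Xiao Zhang": [1.0, 0.0], "Onion Bulb": [0.0, 1.0]}
mark = name_from_contacts([0.99, 0.02], contacts)
```

When no contact matches, the caller could keep a generic mark such as "Speaker 1", as in the earlier embodiments.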
In this embodiment, matching the voiceprint features of local contacts allows recordings to be marked with the contacts' names, which makes it still easier for the user to view the recordings in the multimedia file. When displayed, marks named after contacts and marks named in other ways can be sorted alphabetically by initial or numerically, or, as described in the embodiments above, by recording start time, number of recordings, recording duration, and so on.
In one embodiment, the method for viewing recordings in multimedia files further comprises: detecting whether the multimedia file contains subtitle information and, if it does, displaying a subtitle search window on the display interface.
Specifically, when the multimedia file to be viewed contains subtitle information, a subtitle search window is displayed on the display interface and the user can enter text to search. The user can thus quickly find a designated speaker's recording from the content of that speaker's speech, which makes searching more convenient and more efficient.
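The description only states that a subtitle search window is displayed; how a query is resolved is not specified. A minimal sketch, assuming the subtitle information is available as (start time, text) entries and that the search is a case-insensitive substring match, might look like this. The function name and the sample subtitle text are hypothetical.

```python
def search_subtitles(subtitles, query):
    """Return the start times of subtitle entries that contain `query`.

    `subtitles` is a list of (start_seconds, text) tuples.
    """
    needle = query.casefold()
    return [start for start, text in subtitles if needle in text.casefold()]


# Made-up subtitle entries placed at the start times used in the figures.
subtitles = [
    (7, "Welcome, everyone"),
    (13, "Budget review"),
    (19, "Next steps for the budget"),
]
hits = search_subtitles(subtitles, "budget")
```

Each returned start time could then be handed to the player, in the same way as a selected start-time interface element.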
As shown in Fig. 5, a device 500 for viewing recordings in multimedia files comprises:
a voiceprint data scanning module 502, configured to scan a multimedia file and obtain the voiceprint data in the multimedia file;
a voiceprint feature analysis module 504, configured to identify, according to the voiceprint data, the multimedia data in the multimedia file that has the same voiceprint feature;
a recording marking module 506, configured to mark the identified multimedia data that has the same voiceprint feature.
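The three-module structure of device 500 can be sketched as a class that wires the modules in sequence. This is only an illustration of the data flow between modules 502, 504 and 506; the class name, method name and stub modules are hypothetical, and the stubs stand in for real audio processing.

```python
class RecordingViewerDevice:
    """Sketch of device 500: scan, analyze, then mark."""

    def __init__(self, scan_module, analysis_module, marking_module):
        self.scan_module = scan_module          # stands in for module 502
        self.analysis_module = analysis_module  # stands in for module 504
        self.marking_module = marking_module    # stands in for module 506

    def view(self, media_file):
        voiceprint_data = self.scan_module(media_file)
        groups = self.analysis_module(voiceprint_data)
        return self.marking_module(groups)


# Stub modules for illustration only.
def stub_scan(media_file):
    return media_file["voiceprints"]


def stub_analyze(voiceprint_data):
    groups = {}
    for segment_id, voiceprint in voiceprint_data:
        groups.setdefault(voiceprint, []).append(segment_id)
    return groups


def stub_mark(groups):
    return {f"Speaker {i}": segs for i, segs in enumerate(groups.values(), start=1)}


device = RecordingViewerDevice(stub_scan, stub_analyze, stub_mark)
marks = device.view(
    {"voiceprints": [("seg_a", "vp1"), ("seg_b", "vp2"), ("seg_c", "vp1")]}
)
```

The optional modules of the later embodiments (label binding, start-time acquisition, local voiceprint analysis, subtitle detection) could be added as further injected components in the same way.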
In one embodiment, in view multimedia documents, the device of recording also includes:Preset label binding module, reception Module and record playing module, preset label binding module for the multi-medium data with identical vocal print feature that will identify that Bind with default label;Default label is chosen operation for receiving by receiver module.Record playing module is used for basis Operation of choosing to default label, runs the multi-medium data with identical vocal print feature with the binding of default label phase.
In one embodiment, in view multimedia documents, the device of recording also includes:Initial time acquisition module, is used for Read the initial time of each corresponding recording of vocal print feature.Having that record labels module 506 is additionally operable to will identify that is identical The multi-medium data of vocal print feature carries out initial time mark.
Wherein in an embodiment, in view multimedia documents, the device of recording also includes:
Local voiceprint analysis module, is used for reading local linkages people's voice print database, identifies local linkages people's voice print database Vocal print feature, by the vocal print feature of local linkages people and the vocal print characteristic matching in the multimedia file obtaining.If multimedia Vocal print feature in file is identical with the vocal print feature of local linkages people, then extract local linkages people's title, by local linkages people Title is sent to record labels module;Multi-medium data is marked by record labels module with local linkages people's title.
Wherein in an embodiment, in view multimedia documents, the device of recording also includes:Caption information detection module, For detecting whether multimedia file comprises caption information, when multimedia file comprises caption information, aobvious on display interface Show captions search window.
In above-mentioned view multimedia documents, the method and device of recording can apply to including but not limited to following at least one Plant terminal:Smart mobile phone, panel computer, notebook computer, desktop PC, wearable intelligent equipment.Above-mentioned simply a kind of example, This is not limited in any way by the present embodiment.
The method and device of recording in above-mentioned view multimedia documents, identifies in multimedia file have identical vocal print feature Multi-medium data, and the multi-medium data with identical vocal print feature that will identify that is marked.Therefore, above-mentioned check many The method and device of recording in media file can pass through the recording of the different spokesman of vocal print feature differentiation, and to each spokesman Recording be marked, so that user can check the recording of specified speech person easily.When user needs to check a certain position During the recording of spokesman, can quickly find the recording of the spokesman wanting to check according to mark, and without listening a recording Carry out record, easy to operate, effectively save user and check the time that specified speech person records, facilitate user to check recording.
The technical features of the embodiments described above may be combined arbitrarily. For brevity, not every possible combination of the technical features in the above embodiments has been described; however, as long as a combination of these technical features involves no contradiction, it should be considered within the scope of this specification.
The embodiments described above express only several implementations of the present invention. Their description is relatively specific and detailed, but it should not therefore be construed as limiting the scope of the patent. It should be noted that those of ordinary skill in the art may make a number of variations and improvements without departing from the inventive concept, all of which fall within the protection scope of the present invention. The protection scope of this patent shall therefore be determined by the appended claims.

Claims (10)

1. A method for viewing recordings in a multimedia file, characterized by comprising:
scanning a multimedia file to obtain voiceprint data in the multimedia file;
identifying, according to the voiceprint data, multimedia data in the multimedia file that has the same voiceprint feature;
marking the identified multimedia data that has the same voiceprint feature.
2. The method for viewing recordings in a multimedia file according to claim 1, characterized in that
marking the identified multimedia data that has the same voiceprint feature comprises: binding the identified multimedia data that has the same voiceprint feature to a preset label;
and, after binding the identified multimedia data that has the same voiceprint feature to the preset label, the method further comprises:
receiving a selection operation on the preset label;
playing, according to the selection operation on the preset label, the multimedia data having the same voiceprint feature that is bound to the preset label.
3. The method for viewing recordings in a multimedia file according to claim 1, characterized in that marking the identified multimedia data that has the same voiceprint feature comprises:
reading the start time of each recording corresponding to the voiceprint feature;
marking the identified multimedia data that has the same voiceprint feature with its start time.
4. The method for viewing recordings in a multimedia file according to claim 1, characterized in that, after identifying, according to the voiceprint data, the multimedia data in the multimedia file that has the same voiceprint feature, the method further comprises:
reading voiceprint data of local contacts, identifying the voiceprint features of the local contacts' voiceprint data, and matching the voiceprint features of the local contacts against the voiceprint features obtained from the multimedia file;
if a voiceprint feature in the multimedia file is identical to the voiceprint feature of a local contact, extracting the name of the local contact and marking the multimedia data with the name of the local contact.
5. The method for viewing recordings in a multimedia file according to claim 1, characterized by further comprising:
detecting whether the multimedia file contains caption information and, if the multimedia file contains caption information, displaying a caption search window on a display interface.
6. A device for viewing recordings in a multimedia file, characterized by comprising:
a voiceprint data scanning module, configured to scan a multimedia file and obtain voiceprint data in the multimedia file;
a voiceprint feature analysis module, configured to identify, according to the voiceprint data, multimedia data in the multimedia file that has the same voiceprint feature;
a recording marking module, configured to mark the identified multimedia data that has the same voiceprint feature.
7. The device for viewing recordings in a multimedia file according to claim 6, characterized by further comprising:
a preset label binding module, configured to bind the identified multimedia data that has the same voiceprint feature to a preset label;
a receiving module, configured to receive a selection operation on the preset label;
a recording playing module, configured to play, according to the selection operation on the preset label, the multimedia data having the same voiceprint feature that is bound to the preset label.
8. The device for viewing recordings in a multimedia file according to claim 6, characterized by further comprising:
a start time acquisition module, configured to read the start time of each recording corresponding to the voiceprint feature;
wherein the recording marking module is further configured to mark the identified multimedia data that has the same voiceprint feature with its start time.
9. The device for viewing recordings in a multimedia file according to claim 6, characterized by further comprising:
a local voiceprint analysis module, configured to read voiceprint data of local contacts, identify the voiceprint features of the local contacts' voiceprint data, and match the voiceprint features of the local contacts against the voiceprint features obtained from the multimedia file;
wherein, if a voiceprint feature in the multimedia file is identical to the voiceprint feature of a local contact, the local voiceprint analysis module extracts the name of the local contact and sends the name of the local contact to the recording marking module;
and the recording marking module marks the multimedia data with the name of the local contact.
10. The device for viewing recordings in a multimedia file according to claim 6, characterized by further comprising:
a caption information detection module, configured to detect whether the multimedia file contains caption information and, when the multimedia file contains caption information, display a caption search window on a display interface.
CN201610877020.6A 2016-09-30 2016-09-30 Method and device for viewing recording in multimedia files Pending CN106448683A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610877020.6A CN106448683A (en) 2016-09-30 2016-09-30 Method and device for viewing recording in multimedia files

Publications (1)

Publication Number Publication Date
CN106448683A true CN106448683A (en) 2017-02-22

Family

ID=58171629

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610877020.6A Pending CN106448683A (en) 2016-09-30 2016-09-30 Method and device for viewing recording in multimedia files

Country Status (1)

Country Link
CN (1) CN106448683A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6961703B1 (en) * 2000-09-13 2005-11-01 Itt Manufacturing Enterprises, Inc. Method for speech processing involving whole-utterance modeling
CN102063461A (en) * 2009-11-06 2011-05-18 株式会社理光 Comment recording apparatus and method
CN103035247A (en) * 2012-12-05 2013-04-10 北京三星通信技术研究有限公司 Method and device of operation on audio/video file based on voiceprint information
CN104123115A (en) * 2014-07-28 2014-10-29 联想(北京)有限公司 Audio information processing method and electronic device
CN105488227A (en) * 2015-12-29 2016-04-13 惠州Tcl移动通信有限公司 Electronic device and method for processing audio file based on voiceprint features through same
CN105679357A (en) * 2015-12-29 2016-06-15 惠州Tcl移动通信有限公司 Mobile terminal and voiceprint identification-based recording method thereof
CN105719659A (en) * 2016-02-03 2016-06-29 努比亚技术有限公司 Recording file separation method and device based on voiceprint identification

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107452408A (en) * 2017-07-27 2017-12-08 上海与德科技有限公司 A kind of audio frequency playing method and device
CN107452408B (en) * 2017-07-27 2020-09-25 成都声玩文化传播有限公司 Audio playing method and device
CN107403623A (en) * 2017-07-31 2017-11-28 努比亚技术有限公司 Store method, terminal, Cloud Server and the readable storage medium storing program for executing of recording substance
CN107564531A (en) * 2017-08-25 2018-01-09 百度在线网络技术(北京)有限公司 Minutes method, apparatus and computer equipment based on vocal print feature
CN107845386B (en) * 2017-11-14 2020-04-21 维沃移动通信有限公司 Sound signal processing method, mobile terminal and server
CN107845386A (en) * 2017-11-14 2018-03-27 维沃移动通信有限公司 Audio signal processing method, mobile terminal and server
CN108364654B (en) * 2018-01-30 2020-10-13 网易乐得科技有限公司 Voice processing method, medium, device and computing equipment
WO2019183904A1 (en) * 2018-03-29 2019-10-03 华为技术有限公司 Method for automatically identifying different human voices in audio
CN111328418A (en) * 2018-03-29 2020-06-23 华为技术有限公司 Method for automatically identifying different voices in audio
CN109471840A (en) * 2018-10-15 2019-03-15 北京海数宝科技有限公司 Fileview method, apparatus, computer equipment and storage medium
CN109410953A (en) * 2018-12-21 2019-03-01 上海蒂茜科技有限公司 A kind of vertical play system of multimedia
CN111641754A (en) * 2020-05-29 2020-09-08 北京小米松果电子有限公司 Contact photo generation method and device and storage medium
CN111641754B (en) * 2020-05-29 2021-12-14 北京小米松果电子有限公司 Contact photo generation method and device and storage medium
CN111913627A (en) * 2020-06-22 2020-11-10 维沃移动通信有限公司 Recording file display method and device and electronic equipment
CN114464198A (en) * 2021-11-30 2022-05-10 中国人民解放军战略支援部队信息工程大学 Visual human voice separation system, method and device

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170222