CN106448683A

CN106448683A - Method and device for viewing recording in multimedia files

Info

Publication number: CN106448683A
Application number: CN201610877020.6A
Authority: CN
Inventors: 韩旭; 黄润杰
Original assignee: Meizu Technology Co Ltd
Current assignee: Meizu Technology Co Ltd
Priority date: 2016-09-30
Filing date: 2016-09-30
Publication date: 2017-02-22

Abstract

The invention relates to a method and a device for viewing recordings in a multimedia file. The method comprises the following steps: scanning the multimedia file and acquiring the voiceprint data in the multimedia file; recognizing, in accordance with the voiceprint data, the multimedia data in the multimedia file that has the same voiceprint feature; and marking the recognized multimedia data that has the same voiceprint feature. Correspondingly, the invention also provides a device for viewing recordings in a multimedia file. With the method and the device, the recordings of different speakers can be distinguished by their voiceprint features and the recording of each speaker can be marked, so that a user can conveniently view the recording of a designated speaker.

Description

Method and device for viewing recordings in a multimedia file

Technical field

The present invention relates to the field of computer technology, and in particular to a method and device for viewing recordings in a multimedia file.

Background

With the rapid development of computer technology, the performance of terminals such as smartphones, tablets, and computers keeps improving. As people's everyday needs grow, the types of applications these terminals support also keep increasing, covering functions such as communication, social networking, multimedia playback, web browsing, and shopping. Among these, playing multimedia files on a terminal device is one of the most commonly used functions in daily life.

However, a conventional multimedia player plays a file sequentially from beginning to end. When a multimedia file contains recordings of several speakers and the user wants to view or note down the recording of one particular speaker, the user must first play the whole file through once, manually record the start time of each speaker's recording, and then, during later playback, adjust the playback progress according to those manual notes in order to play the recording of the desired speaker. Playing a specified speaker's recording in the conventional way therefore requires many manual operations, which is tedious, time-consuming, and laborious.

Summary of the invention

In view of this, it is necessary to provide a method and device for viewing recordings in a multimedia file, to address the problem that playing a specified speaker's recording with a conventional multimedia player is tedious, time-consuming, and laborious.

A method for viewing recordings in a multimedia file, comprising:

scanning the multimedia file to obtain the voiceprint data in the multimedia file;

identifying, according to the voiceprint data, the multimedia data in the multimedia file that has the same voiceprint feature;

marking the identified multimedia data that has the same voiceprint feature.

In one embodiment, marking the identified multimedia data that has the same voiceprint feature comprises: binding the identified multimedia data that has the same voiceprint feature to a preset label;

after binding the identified multimedia data that has the same voiceprint feature to the preset label, the method further comprises:

receiving a selection operation on the preset label;

running, according to the selection operation on the preset label, the multimedia data having the same voiceprint feature that is bound to the preset label.

In one embodiment, marking the identified multimedia data that has the same voiceprint feature comprises:

reading the start time of the recording corresponding to each voiceprint feature;

marking the identified multimedia data that has the same voiceprint feature with the start time.

In one embodiment, after identifying, according to the voiceprint data, the multimedia data in the multimedia file that has the same voiceprint feature, the method further comprises:

reading local contact voiceprint data, identifying the voiceprint feature of the local contact voiceprint data, and matching the voiceprint feature of the local contact against the voiceprint features in the obtained multimedia file;

if a voiceprint feature in the multimedia file is the same as the voiceprint feature of the local contact, extracting the local contact's name and marking the multimedia data with the local contact's name.

In one embodiment, the method for viewing recordings in a multimedia file further comprises:

detecting whether the multimedia file contains subtitle information and, if the multimedia file contains subtitle information, displaying a subtitle search window on the display interface.

A device for viewing recordings in a multimedia file, comprising:

a voiceprint data scanning module, configured to scan the multimedia file to obtain the voiceprint data in the multimedia file;

a voiceprint feature analysis module, configured to identify, according to the voiceprint data, the multimedia data in the multimedia file that has the same voiceprint feature;

a recording marking module, configured to mark the identified multimedia data that has the same voiceprint feature.

In one embodiment, the device for viewing recordings in a multimedia file further comprises:

a preset label binding module, configured to bind the identified multimedia data that has the same voiceprint feature to a preset label;

a receiving module, configured to receive a selection operation on the preset label;

a recording playback module, configured to run, according to the selection operation on the preset label, the multimedia data having the same voiceprint feature that is bound to the preset label;

a start time acquisition module, configured to read the start time of the recording corresponding to each voiceprint feature;

the recording marking module being further configured to mark the identified multimedia data that has the same voiceprint feature with the start time;

a local voiceprint analysis module, configured to read local contact voiceprint data, identify the voiceprint feature of the local contact voiceprint data, and match the voiceprint feature of the local contact against the voiceprint features in the obtained multimedia file;

if a voiceprint feature in the multimedia file is the same as the voiceprint feature of the local contact, the local contact's name is extracted and sent to the recording marking module;

the recording marking module marks the multimedia data with the local contact's name;

a subtitle information detection module, configured to detect whether the multimedia file contains subtitle information and, when the multimedia file contains subtitle information, to display a subtitle search window on the display interface.

With the above method and device for viewing recordings in a multimedia file, the multimedia data in the multimedia file that has the same voiceprint feature is identified and marked. The method and device can therefore distinguish the recordings of different speakers by their voiceprint features and mark each speaker's recording, so that the user can conveniently view the recording of a specified speaker. When the user needs to view a particular speaker's recording, the recording can be found quickly from the marks, without listening through the recording once to take notes. This is easy to operate, effectively saves the time the user spends viewing a specified speaker's recording, and makes viewing recordings more convenient.

Brief description of the drawings

Fig. 1 is a flowchart of the method for viewing recordings in a multimedia file in one embodiment;

Fig. 2 is a schematic diagram of an interface displaying the speaker recording marks in a multimedia file, in a specific application scenario of one embodiment;

Fig. 3 is a schematic diagram of an interface displaying the speaker recording marks in a multimedia file, in a specific application scenario of another embodiment;

Fig. 4 is a schematic diagram of an interface displaying the speaker recording marks in a multimedia file, in a specific application scenario of yet another embodiment;

Fig. 5 is a structural diagram of the device for viewing recordings in a multimedia file in one embodiment.

Detailed description of the invention

To make the objectives, technical solutions, and advantages of the present invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the invention and are not intended to limit it.

Referring to Fig. 1, a method for viewing recordings in a multimedia file comprises:

Step 120: Scan the multimedia file to obtain the voiceprint data in the multimedia file.

Specifically, the multimedia file may be an audio file or a video file that contains recordings. When a video/audio file containing recordings is to be viewed, voiceprint data scanning is performed on the video/audio file to obtain the voiceprint data in it.

Step 140: According to the voiceprint data, identify the multimedia data in the multimedia file that has the same voiceprint feature.

Specifically, voiceprint feature recognition and analysis is performed on the voiceprint data obtained in step 120 to extract voiceprint feature information, and the multimedia data in the multimedia file is identified according to that information to obtain the multimedia data that has the same voiceprint feature.

Step 160: Mark the identified multimedia data that has the same voiceprint feature.

Specifically, multimedia data having the same voiceprint feature is given the same identifier. By looking at the identifiers, the user can tell which multimedia data shares a voiceprint feature and is therefore the speech of one speaker. This lets the user tell apart the recordings of multiple speakers in the multimedia file and conveniently view the recording of a specified speaker.

In one embodiment, the identifiers attached to the multimedia data may be text names. For instance, the identifiers corresponding to several different voiceprint features are Speaker 1, Speaker 2, ..., Speaker n, and multimedia data having the same voiceprint feature carries the same identifier. For example, in one embodiment two voiceprint features are obtained, a first voiceprint feature and a second voiceprint feature; there are two segments of multimedia data having the first voiceprint feature and one segment having the second. With the text naming scheme above, the multimedia data having the first voiceprint feature can be identified as Speaker 1 and the multimedia data having the second voiceprint feature as Speaker 2. Both segments having the first voiceprint feature are identified as Speaker 1, and the one segment having the second voiceprint feature is identified as Speaker 2, so that multimedia data having the same voiceprint feature carries the same identifier.
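The text naming scheme described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patent's implementation: it assumes an earlier voiceprint analysis step has already assigned each segment a voiceprint key, and the names `label_segments`, `seg_1`, and `vp_a` are hypothetical.

```python
def label_segments(segments):
    """Assign 'Speaker N' identifiers to segments that share a voiceprint.

    `segments` is a list of (segment_id, voiceprint_key) pairs. The
    voiceprint_key is assumed to come from an earlier analysis step
    (hypothetical); equal keys mean "same voiceprint feature".
    """
    speaker_of = {}  # voiceprint_key -> identifier, in order of first appearance
    labelled = []
    for segment_id, voiceprint_key in segments:
        if voiceprint_key not in speaker_of:
            speaker_of[voiceprint_key] = f"Speaker {len(speaker_of) + 1}"
        labelled.append((segment_id, speaker_of[voiceprint_key]))
    return labelled


# Two segments with the first voiceprint feature, one with the second.
example = [("seg_1", "vp_a"), ("seg_2", "vp_b"), ("seg_3", "vp_a")]
result = label_segments(example)
```

Numbering speakers in order of first appearance is a design choice of this sketch; the description only requires that segments with the same voiceprint feature receive the same identifier.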

It should be noted that the identifier attached to the multimedia data may take any form, as long as it distinguishes multimedia data with different voiceprint features and lets the user clearly tell which multimedia data is the recording of the same speaker. In other embodiments, for example, symbols or graphics may be used as identifiers. Continuing the example above, in one embodiment the two segments of multimedia data having the first voiceprint feature may both be marked with a triangle, and the multimedia data having the second voiceprint feature may be marked with a square. The above is merely an example, and this embodiment imposes no specific limitation.

In one embodiment, step 160 comprises: binding the identified multimedia data that has the same voiceprint feature to preset labels.

Specifically, a preset label is a play button for multimedia data. For a given speaker, the number of preset labels equals the number of pieces of multimedia data having the same voiceprint feature. The preset labels display each piece of multimedia data having the same voiceprint feature differently: each piece has a preset label with different display content, which distinguishes the multiple pieces of multimedia data having the same voiceprint feature and makes them easier for the user to view.

Continuing the example above, the two segments of multimedia data having the first voiceprint feature, identified as Speaker 1, are bound to two preset labels with different display content; likewise, the one segment of multimedia data having the second voiceprint feature, identified as Speaker 2, is bound to one preset label. Specifically, as shown in Fig. 2, in one embodiment, of the two pieces of multimedia data identified as Speaker 1, the preset label 202 bound to one displays Recording 1 and the preset label 202 bound to the other displays Recording 2; for the one piece of multimedia data identified as Speaker 2, the preset label 202 displays Recording 1.

As shown in Fig. 2, in this embodiment the multimedia data of the two speakers' different voiceprint features is displayed as a list, the identifiers Speaker 1 and Speaker 2 are sorted by the quantity of corresponding multimedia data, and each preset label 202 is displayed alongside its speaker identifier. Note that this embodiment does not limit the identifiers of the multimedia data, the display form of the preset labels 202, or the ordering of the identifiers; the above is merely an example.

Further, in this embodiment, after the identified multimedia data that has the same voiceprint feature is bound to the preset labels, the method also comprises: receiving a selection operation on a preset label and, according to the selection operation, running the multimedia data having the same voiceprint feature that is bound to that preset label.

Specifically, the terminal detects whether a user's selection operation on a preset label has been received. When one is received, the terminal obtains the multimedia data corresponding to the preset label 202 and plays it. For example, when a selection operation is detected on the preset label 202 that displays Recording 1 under the identifier Speaker 1, the multimedia data corresponding to the preset label 202 for Recording 1 of Speaker 1 is played. The selection operation may be a click or a press; this embodiment imposes no specific limitation.
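The binding and selection behaviour described above might look roughly like the following Python sketch. It is only an illustration: the function names `bind_preset_labels` and `on_label_selected`, the segment ids, and the `play` callback are all hypothetical, and a real terminal would hook the selection into its own UI and media player.

```python
def bind_preset_labels(labelled_segments):
    """Bind each segment to a preset label such as 'Recording 1'.

    `labelled_segments` is a list of (segment_id, speaker_identifier)
    pairs. Returns a dict mapping (speaker_identifier, label_text) to
    the bound segment_id. Label texts are numbered per speaker so that
    segments of the same speaker get different display content.
    """
    bindings = {}
    count_per_speaker = {}
    for segment_id, speaker in labelled_segments:
        count_per_speaker[speaker] = count_per_speaker.get(speaker, 0) + 1
        label_text = f"Recording {count_per_speaker[speaker]}"
        bindings[(speaker, label_text)] = segment_id
    return bindings


def on_label_selected(bindings, speaker, label_text, play):
    """Handle a selection operation: look up the bound segment and play it.

    `play` is a hypothetical callback standing in for the media player.
    """
    segment_id = bindings[(speaker, label_text)]
    play(segment_id)
    return segment_id


labelled = [("seg_1", "Speaker 1"), ("seg_2", "Speaker 2"), ("seg_3", "Speaker 1")]
bindings = bind_preset_labels(labelled)

played = []  # records what the stand-in player was asked to play
on_label_selected(bindings, "Speaker 1", "Recording 1", played.append)
```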

In one embodiment, step 160 comprises: reading the start time of the recording corresponding to each voiceprint feature, and marking the identified multimedia data that has the same voiceprint feature with the start time.

Specifically, for the different pieces of multimedia data having the same voiceprint feature, the recording start time of each piece is read, and each piece is marked with its start time. A start time interface element is generated from the recording start time of each piece of multimedia data, and the identifier of each piece is displayed together with its start time interface element. As shown in Fig. 3, the start times of Speaker 1's two recordings are 7 s and 19 s, and the start time of Speaker 2's one recording is 13 s. The two interface elements 302 corresponding to the identifier Speaker 1 display 0:00:07 and 0:00:19 respectively, and the interface element 302 corresponding to the identifier Speaker 2 displays 0:00:13. Marking the multimedia data with recording start times makes it easier for the user to find the recording of a specified speaker.
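As a rough illustration of how the start time interface elements in this example could be produced, the Python sketch below formats start times in seconds as h:mm:ss text and groups them by speaker identifier. The function names and the input layout are assumptions made for this sketch; the description does not prescribe a data structure.

```python
def format_start_time(seconds):
    """Format a start time in whole seconds as h:mm:ss, e.g. 7 -> '0:00:07'."""
    hours, remainder = divmod(int(seconds), 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


def build_start_time_elements(recordings):
    """Group formatted start times under each speaker identifier.

    `recordings` is a list of (speaker_identifier, start_seconds) pairs.
    Returns a dict: speaker_identifier -> list of display strings,
    ordered by start time within each speaker.
    """
    elements = {}
    for speaker, start in sorted(recordings, key=lambda item: item[1]):
        elements.setdefault(speaker, []).append(format_start_time(start))
    return elements


# The example from the description: Speaker 1 at 7 s and 19 s, Speaker 2 at 13 s.
elements = build_start_time_elements(
    [("Speaker 1", 7), ("Speaker 2", 13), ("Speaker 1", 19)]
)
```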

Further, after the step of generating the start time interface elements 302 from the start time of each recording and displaying the identifier of each recording with its corresponding start time interface element 302, the method also comprises: receiving a selection operation on an interface element 302 and, according to that selection operation, playing the multimedia file from the recording start time corresponding to the interface element 302.

Specifically, it is detected whether a user's selection operation on an interface element 302 has been received. When one is received, the recording start time corresponding to the interface element 302 is obtained and the multimedia file is played from that start time. For example, if a selection operation is detected on the interface element 302 displaying 0:00:07, the multimedia file is played from the 7 s position and playback ends at 13 s, so that the recording of Speaker 1 corresponding to the interface element 302 for 0:00:07 is played. The selection operation may be a click or a press; this embodiment imposes no specific limitation.
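The 7 s to 13 s playback window in this example can be derived from the list of start times alone. The Python sketch below does so under one explicit assumption that the description does not state: a recording is taken to end where the next recording in the file begins, and the last recording runs to the end of the file. The name `playback_range` and the `file_duration` parameter are hypothetical.

```python
import bisect


def playback_range(all_start_times, selected_start, file_duration=None):
    """Return (start, end) in seconds for the selected recording.

    Assumption of this sketch: a recording ends at the start time of the
    next recording in the file. For the last recording, `file_duration`
    is used as the end (None if it is not known).
    """
    starts = sorted(all_start_times)
    # Index of the first start time strictly greater than the selected one.
    next_index = bisect.bisect_right(starts, selected_start)
    end = starts[next_index] if next_index < len(starts) else file_duration
    return selected_start, end


# Selecting the element that shows 0:00:07 plays from 7 s up to 13 s.
window = playback_range([7, 13, 19], 7)
```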

In this embodiment, the multimedia data is marked with recording start times. When the user needs to view a particular speaker's recording, selecting the corresponding interface element 302 plays the multimedia file from the start time corresponding to that element, without the user having to listen through the recording once to note down the times, which is easy to operate. Because the start time of each recording is read accurately, the recorded times are accurate, which further saves the time the user spends viewing a specified speaker's recording and keeps viewing efficient.

As shown in Fig. 3, in this embodiment the multimedia data of the two speakers' different voiceprint features is displayed as a list, the identifiers Speaker 1 and Speaker 2 are sorted by the corresponding recording start time, and each interface element 302 is displayed alongside its speaker identifier. Note that this embodiment does not limit the ordering of the identifiers and interface elements 302; in other embodiments, the identifiers of multimedia data with different voiceprint features may instead be sorted by number of recordings, by recording duration, or in some other way. Nor does this embodiment limit the display form of the identifiers and interface elements 302. As shown in Fig. 4, in one embodiment the identifiers and start times of the multimedia data may be shown directly on a timeline: marks are placed at the three start times 0:00:07, 0:00:13, and 0:00:19, and the identifiers Speaker 1 and Speaker 2 are shown at the corresponding start time positions. With this display form, to play the multimedia file the user can drag the playback progress bar to a start time position, or directly click or touch a start time position.

The embodiments above view the recording of a specified speaker within one multimedia file. In other embodiments, multiple multimedia files may be viewed at the same time. When the recording of a specified speaker needs to be viewed across multiple multimedia files, the files are scanned together to obtain the voiceprint data in each, the multimedia data having the same voiceprint feature across the files is identified from the voiceprint data and marked, and each speaker's recordings in the different files are thereby marked. The basic principle is the same as in the embodiments above: the multiple multimedia files are scanned together, each file is identified and marked using the process described above, and the marking results are finally aggregated and displayed.

In one embodiment, the method further comprises, after step 140: reading local contact voiceprint data, identifying the voiceprint feature of the local contact voiceprint data, and matching the voiceprint feature of the local contact against the voiceprint features in the obtained multimedia file; if a voiceprint feature in the multimedia file is the same as the voiceprint feature of the local contact, extracting the local contact's name and marking the multimedia data with that name.

Specifically, local contacts include address book contacts and contacts in social applications that support voice functions. The local contact voiceprint data is first obtained and analyzed to obtain the local contact's voiceprint feature, and the voiceprint features in the obtained multimedia file are matched against it. If the match succeeds, the local contact's name is extracted and used as the identifier of the multimedia data corresponding to that voiceprint feature, and the multimedia data is marked accordingly. The local contact name may be an address book contact's name or a contact's nickname in a social application that supports voice functions. For example, if matching finds that the voiceprint features of two contacts, one named Xiao Zhang in the address book and one with the nickname Onion Bulb in a social application, are the same as two voiceprint features in the obtained multimedia file, the two pieces of multimedia data are identified as Xiao Zhang and Onion Bulb respectively.
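The contact matching step could be sketched as follows in Python. The description only says that the voiceprint features are "the same"; it does not say how sameness is decided. This sketch swaps in a common stand-in, cosine similarity between feature vectors with a threshold, which is an assumption and not the patent's method. The vector values, the threshold of 0.95, and the function names are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def name_speakers(file_features, contact_features, threshold=0.95):
    """Replace generic identifiers with contact names where voiceprints match.

    `file_features`: identifier -> feature vector found in the file.
    `contact_features`: contact name -> feature vector of that contact.
    A match is assumed when similarity >= threshold (assumption of this
    sketch). Unmatched identifiers are kept unchanged.
    """
    renamed = {}
    for identifier, feature in file_features.items():
        best_name, best_score = None, threshold
        for name, contact_feature in contact_features.items():
            score = cosine_similarity(feature, contact_feature)
            if score >= best_score:
                best_name, best_score = name, score
        renamed[identifier] = best_name if best_name is not None else identifier
    return renamed


file_features = {
    "Speaker 1": [1.0, 0.0],
    "Speaker 2": [0.0, 1.0],
    "Speaker 3": [1.0, 1.0],
}
contacts = {"Xiao Zhang": [0.9, 0.0], "Onion Bulb": [0.0, 2.0]}
names = name_speakers(file_features, contacts)
```

In this toy data, Speaker 3 matches neither contact closely enough and therefore keeps its generic identifier.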

In this embodiment, matching the voiceprint features of local contacts allows recordings to be marked with the names of local contacts, which makes it even easier for the user to view the recordings in a multimedia file. When identifiers based on contact names are displayed together with identifiers named in other ways, they may be sorted alphabetically by initial or in numerical order, or, as described in the embodiments above, by recording start time, number of recordings, recording duration, and so on.

In one embodiment, the method for viewing recordings in a multimedia file further comprises: detecting whether the multimedia file contains subtitle information and, if it does, displaying a subtitle search window on the display interface.

Specifically, when the multimedia file to be viewed contains subtitle information, a subtitle search window is displayed on the display interface, and the user can enter text to search. The user can thus quickly find the recording of a specified speaker from the content of that speaker's speech, which makes searching more convenient and improves search efficiency.
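A subtitle search of this kind might be sketched as below in Python. The description does not specify a subtitle format or search algorithm, so this sketch assumes the subtitles are already available as (start_seconds, text) cues and uses a simple case-insensitive substring match; the cue texts and function names are hypothetical.

```python
def should_show_search_window(cues):
    """Show the subtitle search window only if the file has subtitle cues."""
    return len(cues) > 0


def search_subtitles(cues, query):
    """Return the (start_seconds, text) cues whose text contains the query.

    Matching is a case-insensitive substring test (assumption of this
    sketch). The start times of the hits tell the player where to jump.
    """
    needle = query.casefold()
    return [(start, text) for start, text in cues if needle in text.casefold()]


cues = [
    (7, "Welcome to the quarterly review"),
    (13, "Thanks, here is the budget"),
    (19, "Back to the review items"),
]
hits = search_subtitles(cues, "Review")
```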

As shown in Fig. 5, a device 500 for viewing recordings in a multimedia file comprises:

a voiceprint data scanning module 502, configured to scan the multimedia file to obtain the voiceprint data in the multimedia file;

a voiceprint feature analysis module 504, configured to identify, according to the voiceprint data, the multimedia data in the multimedia file that has the same voiceprint feature;

a recording marking module 506, configured to mark the identified multimedia data that has the same voiceprint feature.

In one embodiment, the device for viewing recordings in a multimedia file further comprises a preset label binding module, a receiving module, and a recording playback module. The preset label binding module is configured to bind the identified multimedia data that has the same voiceprint feature to a preset label. The receiving module is configured to receive a selection operation on the preset label. The recording playback module is configured to run, according to the selection operation on the preset label, the multimedia data having the same voiceprint feature that is bound to the preset label.

In one embodiment, the device for viewing recordings in a multimedia file further comprises a start time acquisition module, configured to read the start time of the recording corresponding to each voiceprint feature. The recording marking module 506 is further configured to mark the identified multimedia data that has the same voiceprint feature with the start time.

A local voiceprint analysis module is configured to read local contact voiceprint data, identify the voiceprint feature of the local contact voiceprint data, and match the voiceprint feature of the local contact against the voiceprint features in the obtained multimedia file. If a voiceprint feature in the multimedia file is the same as the voiceprint feature of the local contact, the local contact's name is extracted and sent to the recording marking module, and the recording marking module marks the multimedia data with the local contact's name.

In one embodiment, the device for viewing recordings in a multimedia file further comprises a subtitle information detection module, configured to detect whether the multimedia file contains subtitle information and, when the multimedia file contains subtitle information, to display a subtitle search window on the display interface.

The above method and device for viewing recordings in a multimedia file can be applied to terminals including, but not limited to, at least one of the following: smartphones, tablet computers, notebook computers, desktop computers, and wearable smart devices. The above is merely an example, and this embodiment imposes no limitation in this respect.

The technical features of the embodiments described above may be combined arbitrarily. For brevity, not every possible combination of the technical features in the above embodiments has been described; however, as long as a combination of these technical features involves no contradiction, it should be considered within the scope of this specification.

The embodiments described above express only several implementations of the present invention. Their description is relatively specific and detailed, but they should not therefore be construed as limiting the scope of the patent. It should be noted that those of ordinary skill in the art may make a number of variations and improvements without departing from the inventive concept, all of which fall within the scope of protection of the present invention. The scope of protection of this patent shall therefore be determined by the appended claims.

Claims

1. A method for viewing recordings in a multimedia file, characterized by comprising:

scanning a multimedia file to obtain voiceprint data in the multimedia file;

identifying, according to the voiceprint data, multimedia data in the multimedia file that has the same voiceprint feature;

marking the identified multimedia data that has the same voiceprint feature.

2. The method for viewing recordings in a multimedia file according to claim 1, characterized in that

marking the identified multimedia data that has the same voiceprint feature comprises: binding the identified multimedia data that has the same voiceprint feature to a preset label;

after binding the identified multimedia data that has the same voiceprint feature to the preset label, the method further comprises:

receiving a selection operation on the preset label;

running, according to the selection operation on the preset label, the multimedia data having the same voiceprint feature that is bound to the preset label.

3. The method for viewing recordings in a multimedia file according to claim 1, characterized in that marking the identified multimedia data that has the same voiceprint feature comprises:

reading the start time of the recording corresponding to each voiceprint feature;

marking the identified multimedia data that has the same voiceprint feature with the start time.

4. The method for viewing recordings in a multimedia file according to claim 1, characterized in that, after identifying, according to the voiceprint data, the multimedia data in the multimedia file that has the same voiceprint feature, the method further comprises:

reading local contact voiceprint data, identifying the voiceprint feature of the local contact voiceprint data, and matching the voiceprint feature of the local contact against the voiceprint features in the obtained multimedia file;

if a voiceprint feature in the multimedia file is the same as the voiceprint feature of the local contact, extracting the local contact's name and marking the multimedia data with the local contact's name.

5. The method for viewing recordings in a multimedia file according to claim 1, characterized by further comprising:

detecting whether the multimedia file contains subtitle information and, if the multimedia file contains subtitle information, displaying a subtitle search window on the display interface.

6. A device for viewing recordings in a multimedia file, characterized by comprising:

a voiceprint data scanning module, configured to scan a multimedia file to obtain voiceprint data in the multimedia file;

a voiceprint feature analysis module, configured to identify, according to the voiceprint data, multimedia data in the multimedia file that has the same voiceprint feature;

a recording marking module, configured to mark the identified multimedia data that has the same voiceprint feature.

7. The device for viewing recordings in a multimedia file according to claim 6, characterized by further comprising:

a preset label binding module, configured to bind the identified multimedia data that has the same voiceprint feature to a preset label;

a receiving module, configured to receive a selection operation on the preset label;

a recording playback module, configured to run, according to the selection operation on the preset label, the multimedia data having the same voiceprint feature that is bound to the preset label.

8. The device for viewing recordings in a multimedia file according to claim 6, characterized by further comprising:

a start time acquisition module, configured to read the start time of the recording corresponding to each voiceprint feature;

the recording marking module being further configured to mark the identified multimedia data that has the same voiceprint feature with the start time.

9. The device for viewing recordings in a multimedia file according to claim 6, characterized by further comprising:

a local voiceprint analysis module, configured to read local contact voiceprint data, identify the voiceprint feature of the local contact voiceprint data, and match the voiceprint feature of the local contact against the voiceprint features in the obtained multimedia file;

if a voiceprint feature in the multimedia file is the same as the voiceprint feature of the local contact, the local contact's name is extracted and sent to the recording marking module;

the recording marking module marks the multimedia data with the local contact's name.

10. The device for viewing recordings in a multimedia file according to claim 6, characterized by further comprising:

a subtitle information detection module, configured to detect whether the multimedia file contains subtitle information and, when the multimedia file contains subtitle information, to display a subtitle search window on the display interface.