WO2019015613A1

WO2019015613A1 - Electronic-book voice playback method, apparatus, and terminal device

Info

Publication number: WO2019015613A1
Application number: PCT/CN2018/096162
Authority: WO
Inventors: 董明舒
Original assignee: 广州阿里巴巴文学信息技术有限公司
Priority date: 2017-07-21
Filing date: 2018-07-18
Publication date: 2019-01-24
Also published as: CN107369462A; CN107369462B

Abstract

Provided are an electronic-book voice playback method, apparatus, and terminal device; according to a voice playback instruction used for instructing an electronic book to perform voice playback, content of a to-be-played electronic book is determined (S102); real human voice audio of the content corresponding to the to-be-played electronic book is obtained, and the real human voice audio is played (S104). Thus the user is provided with a better "book listening" experience.

Description

E-book voice playing method, device and terminal device

Cross reference

The present application claims priority to Chinese Patent Application No. 201710601433.6, entitled "E-book Voice Play Method, Apparatus, and Terminal Equipment", filed on July 21, 2017, the entire contents of which are hereby incorporated by reference.

Technical field

Embodiments of the present invention relate to the field of electronic book data processing technologies, and in particular, to an electronic book voice playing method, apparatus, and terminal device.

Background art

An e-book is a publication in which information such as text, pictures, sounds, and images is digitized using computer technology. With the increasingly widespread use of Internet technology, traditional paper reading has gradually been replaced by e-books, and people increasingly use the Internet and computers to download e-books through e-book reading applications and read them.

However, with the development of smart terminal technology, people place ever higher demands on e-book reading applications, for example, how to read an e-book of interest when the eyes are tired or the light is poor. Meeting these user needs has therefore become an urgent problem to be solved.

Summary of the invention

In view of this, the embodiments of the present invention provide an e-book voice playing method, apparatus, and terminal device, so as to solve the problem of how a user can read an e-book when the eyes are tired or the light is poor.

According to an aspect of the embodiments of the present invention, an e-book voice playing method is provided, including: determining e-book content to be played by voice according to a voice play instruction for instructing voice playback of an e-book; and obtaining real vocal audio corresponding to the e-book content and playing the real vocal audio.

According to another aspect of the embodiments of the present invention, an e-book voice playing apparatus is provided, including: a content determining module, configured to determine e-book content to be played by voice according to a voice play instruction for instructing voice playback of an e-book; and an audio playing module, configured to obtain real vocal audio corresponding to the e-book content and play the real vocal audio.

According to still another aspect of the embodiments of the present invention, a terminal device is provided, including: a processor, a memory, a communication interface, and a communication bus, wherein the processor, the memory, and the communication interface communicate with each other through the communication bus; and the memory is configured to store at least one executable instruction that causes the processor to perform the operations corresponding to the e-book voice playing method described above.

With the e-book voice playing solution provided by the embodiments of the present invention, when the user's eyes are tired or the light is poor, the corresponding e-book content can be played by voice in response to a voice play instruction, thereby implementing a "listening to the book" function in the e-book reading application. Moreover, the embodiments of the present invention use real vocal audio. Because real vocal audio is recorded from a real human voice, it is far superior to machine-synthesized audio in intonation and fluency, so that the user can get a better "listening to the book" experience.

DRAWINGS

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly described below. Obviously, the drawings described below show only some of the embodiments of the present invention, and those skilled in the art can obtain other drawings from these drawings.

FIG. 1 is a flow chart showing the steps of an e-book voice playing method according to Embodiment 1 of the present invention;

FIG. 2 is a flow chart showing the steps of an e-book voice playing method according to Embodiment 2 of the present invention;

FIG. 3 is a block diagram showing the structure of an e-book voice playing apparatus according to Embodiment 3 of the present invention;

FIG. 4 is a block diagram showing the structure of an e-book voice playing apparatus according to Embodiment 4 of the present invention;

FIG. 5 is a schematic structural diagram of a terminal device according to Embodiment 5 of the present invention.

Detailed description

Of course, implementing any technical solution of the embodiments of the present invention does not necessarily require achieving all of the above advantages at the same time.

For a better understanding of the technical solutions in the embodiments of the present invention, these technical solutions are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, and not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention fall within the scope of protection of the embodiments of the present invention.

Embodiment 1

Referring to FIG. 1, a flow chart of steps of an e-book voice playing method according to a first embodiment of the present invention is shown.

The e-book voice playing method of this embodiment includes the following steps:

Step S102: Determine an e-book content to be played by the voice according to a voice play instruction for instructing the e-book to perform voice play.

The voice play instruction may be generated in any suitable manner, including but not limited to: upon receiving the user's operation on a voice play button or option displayed in the e-book interface; upon receiving a set operation (such as a double-click, click, or long press) performed by the user on a displayed e-book page; or after the user configures voice playing through a corresponding settings menu. The embodiments of the present invention do not limit this.

The e-book content to be played by voice may be content set by the e-book reading application, such as the entire content of the currently displayed e-book, or content selected by the user, such as one or more paragraphs, one or more lines, or one or more sentences.

Step S104: Obtain real vocal audio corresponding to the content of the e-book to be played by the voice, and play the real vocal audio.

After the content of the e-book to be played by the voice is determined, the real vocal audio corresponding to the content of the e-book can be obtained, and then played.

Here, real vocal audio is audio produced from a real person's voice, such as audio of a real person reading aloud, audio of dialogue between real people, or audio generated by processing a real human voice (for example, audio produced by splitting and recombining sentences read aloud by a real person).

With the e-book voice playing solution provided by this embodiment, when the user's eyes are tired or the light is poor, the corresponding e-book content can be played by voice in response to a voice play instruction, thereby implementing a "listening to the book" function in the e-book reading application. Moreover, the embodiments of the present invention use real vocal audio. Because real vocal audio is recorded from a real human voice, it is far superior to machine-synthesized audio in intonation and fluency, so that the user can get a better "listening to the book" experience.

The e-book voice playing method of this embodiment may be executed by any suitable device having data processing capability, including but not limited to: various terminal devices (including PCs, tablets, mobile terminals, etc.) and servers.
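
Steps S102 and S104 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the names `PlayInstruction`, `determine_content`, `obtain_audio`, and `play_ebook_voice` are hypothetical, the audio library is assumed to be a plain dictionary keyed by book and content, and the default content is assumed to be the currently displayed page.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PlayInstruction:
    """Hypothetical voice play instruction. `selection` is None when the
    reading application should fall back to its default content."""
    book_id: str
    selection: Optional[str] = None


def determine_content(instruction, current_page_text):
    # Step S102: use the user's selection if there is one, otherwise the
    # default content (assumed here to be the currently displayed page).
    return instruction.selection or current_page_text


def obtain_audio(audio_library, book_id, content):
    # Step S104, first half: look up recorded real vocal audio.
    # `audio_library` is an assumed dict: (book_id, content) -> audio id.
    return audio_library.get((book_id, content))


def play_ebook_voice(instruction, current_page_text, audio_library, player):
    content = determine_content(instruction, current_page_text)
    audio = obtain_audio(audio_library, instruction.book_id, content)
    if audio is not None:
        player(audio)  # Step S104, second half: play the audio.
    return audio
```

Here `player` is any callable that accepts an audio identifier, so the same flow could run on a terminal device or a server.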

Embodiment 2

Referring to FIG. 2, a flow chart of steps of an e-book voice playing method according to a second embodiment of the present invention is shown.

Step S202: Determine the e-book content to be played by the voice according to the voice play instruction for performing voice play on the electronic book and the selection operation of the display content of the electronic book.

When reading an e-book, the user will in some cases need to "listen to the book", for example when the eyes are tired or the light is poor. In this case, after the device on which the e-book reading application runs receives the corresponding user operation, it generates a voice play instruction indicating that the corresponding e-book content is to be played by voice. As described in Embodiment 1, the e-book content to be played by voice may be content set by default by the e-book reading application, or may be content selected by the user. This embodiment describes the e-book voice playing solution by taking user selection as an example.

When selecting the e-book content to be played by voice, the user can select one or more paragraphs, one or more lines, or one or more sentences of the e-book. This improves the flexibility with which the user chooses the content to listen to and enhances the user's "listening to the book" experience. However, those skilled in the art should understand that the default e-book content to be played by voice described in Embodiment 1 can also be applied to the solution of this embodiment.

It should be noted that, in practical applications, the operation by which the user instructs voice playback and the operation by which the user selects the e-book content may be performed in any suitable order. For example, voice playback may be instructed first and the e-book content selected afterwards; or the e-book content may be selected first and voice playback of the selected content instructed afterwards. This embodiment describes the solution using only the latter approach as an example, but those skilled in the art can implement the e-book voice playing solution based on the former approach by referring to this embodiment.

In the approach of first selecting the e-book content and then instructing voice playback of the selected content, a selection operation on the displayed content of the e-book is received first, and the e-book content to be played by voice is determined according to the selection operation.

In an optional manner, a first operation performed by the user on the displayed content of the e-book is received, and a first action point of the first operation in the displayed content is determined; a second operation performed by the user on the displayed content is received, and a second action point of the second operation in the displayed content is determined; and the displayed content between the first action point and the second action point is determined as the e-book content to be played by voice. The first operation and the second operation include, but are not limited to, click operations.
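
The two-point selection above can be sketched as follows. The sketch assumes that each action point has already been converted to a character offset in the displayed text; the function name `content_between` is hypothetical.

```python
def content_between(text, first_point, second_point):
    """Return the displayed text between two action points.

    Both points are assumed to be character offsets into `text`.
    The order of the two operations does not matter.
    """
    start, end = sorted((first_point, second_point))
    # Clamp to the displayed text so stray offsets cannot cause errors.
    start = max(0, start)
    end = min(len(text), end)
    return text[start:end]
```

Sorting the two offsets lets the user click the end of the passage first and the beginning second without changing the result.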

In another optional manner, a third operation performed by the user on the displayed content of the e-book is received, and a third action point of the third operation in the displayed content is determined. Taking the third action point as a reference point, one of the following is then determined as the e-book content to be played by voice: the displayed content within a first set range that contains the third action point; the displayed content within a second set range that starts from the third action point; or the displayed content within a third set range that ends at the third action point. The first set range, the second set range, and the third set range may be the same or different, and may be set by a person skilled in the art according to actual needs. Where the displayed content within the first set range containing the third action point is used, the third action point may, for example, be the end point of that range, but this is not required. The third operation includes, but is not limited to, a click operation. In this way, user operations are simplified and the operating burden on the system is reduced.
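
The three range options around a single action point can be sketched as follows. The sketch assumes that the action point is a character offset and that the set range is measured in characters; the embodiment leaves the range unit open, and the name `content_around` and the mode labels are hypothetical.

```python
def content_around(text, point, size, mode="containing"):
    """Select text relative to a single action point.

    `point` is an assumed character offset, `size` an assumed range
    length in characters. `mode` picks one of the three options.
    """
    if mode == "starting":        # range starts from the action point
        start, end = point, point + size
    elif mode == "ending":        # range ends at the action point
        start, end = point - size, point
    elif mode == "containing":    # range contains the action point
        half = size // 2
        start, end = point - half, point + (size - half)
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    # Clamp the range to the displayed text.
    return text[max(0, start):min(len(text), end)]
```

In practice the range would more likely be a sentence, line, or paragraph than a fixed character count; the fixed count only keeps the sketch short.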

In another optional manner, a selection operation performed by the user on the displayed content of the e-book is received, a content tag corresponding to the displayed content selected by the selection operation is determined, and the content marked by the content tag is determined as the e-book content to be played by voice. In this manner, corresponding content tags are preset in the e-book content. The content tags can be set by a person skilled in the art according to actual needs, for example one content tag per chapter or section, one per page, one per paragraph, or, based on an analysis of the e-book content, one per complete episode (such as a dialogue between a teacher and students in a classroom) or per complete scene (such as a scene at sea). In this case, the user may perform the selection operation by selecting a portion of the e-book content through the first operation and the second operation described above; by clicking at any position in the currently displayed e-book content, as in the third operation described above; or, when the content tags are shown to the user in the e-book through appropriate prompts, by operating the corresponding prompt. The e-book reading application then first determines the corresponding content tag, and determines the entire portion of the e-book content marked by that content tag as the e-book content to be played by voice.

However, the manners are not limited to those above. In practical applications, other suitable ways of determining the e-book content to be played by voice are also applicable to the solutions of the embodiments of the present invention, such as determining the entire page currently displayed by the e-book as the e-book content to be played by voice.
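
Resolving a click to the content marked by a content tag can be sketched as follows. The sketch assumes that each content tag is stored as a non-overlapping `(tag, start, end)` character span; this storage format and the names `build_tag_index` and `content_for_point` are hypothetical.

```python
from bisect import bisect_right


def build_tag_index(tagged_spans):
    """Sort (tag, start, end) spans by start offset for fast lookup.

    Spans are assumed not to overlap; `end` is exclusive.
    """
    spans = sorted(tagged_spans, key=lambda span: span[1])
    starts = [span[1] for span in spans]
    return spans, starts


def content_for_point(text, index, point):
    """Return (tag, tagged text) for the span containing `point`, or None."""
    spans, starts = index
    i = bisect_right(starts, point) - 1   # last span starting at or before point
    if i < 0:
        return None
    tag, start, end = spans[i]
    if point >= end:                       # point falls in a gap between spans
        return None
    return tag, text[start:end]
```

With this lookup, a click anywhere inside a tagged chapter, paragraph, or scene selects the whole tagged portion for voice playback.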

Step S204: Obtain real vocal audio corresponding to the content of the e-book to be played by the voice, and play the real vocal audio.

The real vocal audio includes at least one of the following: film and television dialogue audio obtained from a film or television drama corresponding to the e-book; read-aloud audio corresponding to the e-book content of the e-book; and user audio recorded by a user of the e-book reading application in which the e-book is located.

For example, the e-book "Sansheng Sanshi Shili Peach Blossom" contains the passage "I am only a short two months, but it is a very long life for you. Have you read the life that you wrote to you?" If the user selects this passage in the e-book, or instructs voice playback at this position, the audio of the actor speaking this line in the TV series "Sansheng Sanshi Shili Peach Blossom" can be played. This is not limiting: after a book is adapted into a film or television work, the lines in the work may not be exactly the same as the original text, so an exact match may not be possible. In this case, it is sufficient for the matching degree to satisfy a certain threshold or standard, which may be set appropriately by a person skilled in the art and is not limited by the embodiments of the present invention.
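
Threshold-based matching between selected e-book text and the lines of an adaptation can be sketched as follows. The embodiment does not specify how the matching degree is computed; this sketch substitutes the similarity ratio of `difflib.SequenceMatcher` from the Python standard library, and the function name `best_matching_line` and the default threshold of 0.8 are assumptions.

```python
from difflib import SequenceMatcher


def best_matching_line(selected_text, dialogue_lines, threshold=0.8):
    """Return the dialogue line most similar to the selected text,
    or None if no line reaches the (assumed) similarity threshold."""
    best_line, best_score = None, 0.0
    for line in dialogue_lines:
        # ratio() is 1.0 for identical strings and 0.0 for no overlap.
        score = SequenceMatcher(None, selected_text, line).ratio()
        if score > best_score:
            best_line, best_score = line, score
    return best_line if best_score >= threshold else None
```

A lower threshold tolerates heavier rewording in the adaptation at the cost of more false matches, which is why the threshold is left to the implementer.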

For another example, the e-book "Three Kingdoms" has corresponding read-aloud audio. In this case, the start position in that audio corresponding to the e-book content to be played by voice can be determined, and playback is performed from that start position.

For another example, a user of the e-book reading application may read all or part of the content of the e-book aloud and record it as audio, or may make a voice comment on the e-book content and save it as audio. Where this audio is available for use, for example where the user has set it to be shared, has sent it to others, or has published it in the e-book reading application in an appropriate way (such as through an e-book comment, through sharing, or by other appropriate means), the audio can be used to implement "listening to the book" when that user plays the content of the e-book by voice, or when another person who can obtain the audio plays the content of the e-book by voice. In this manner, before the e-book content to be played by voice is determined according to the voice play instruction, read-aloud audio recorded by the user through the e-book reading application for content of the e-book may be received and stored in association with the corresponding e-book content; and/or comment audio recorded by the user through the e-book reading application for content of the e-book may be received and stored in association with the corresponding e-book content. Implementing the "listening to the book" function based on audio that users have recorded and stored in association further enhances the user's experience of the e-book reading application.

It should be noted that, in an optional solution, the above real vocal audio can be further processed, for example split and recombined, to meet real vocal audio playing needs in certain situations. Examples include splitting and recombining film and television dialogue, read-aloud audio, or user audio to form new real vocal audio.

In addition, the real vocal audio can also be synthesized with background audio and/or service audio to generate synthesized audio. In this case, the synthesized audio corresponding to the e-book content to be played by voice is obtained, where the synthesized audio includes background audio and/or service audio in addition to the real vocal audio, and the synthesized audio is played. The background audio can be background music, which further enhances the atmosphere so that the user can feel the mood of that part of the e-book content. The service audio can be service audio recorded by the person whose voice is in the current real vocal audio, or service audio related to the e-book content to be played by voice, such as story-related service audio. The service audio can be inserted at the beginning of the current real vocal audio, at its end, or at any appropriate position in between. Optionally, the service audio may be implemented as advertising audio.
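
Inserting service audio into a sequence of real vocal audio segments can be sketched as follows. The sketch only arranges audio identifiers into a play order and attaches an optional background track; it does not mix audio signals. The name `compose_playlist` and the returned dictionary layout are hypothetical.

```python
def compose_playlist(voice_segments, service_audio=None, position="end",
                     background=None):
    """Arrange vocal segments and optional service audio into a play order.

    `position` is "start", "end", or an integer index between segments.
    `background` is an optional background track identifier.
    """
    tracks = list(voice_segments)       # copy so the input is not modified
    if service_audio is not None:
        if position == "start":
            tracks.insert(0, service_audio)
        elif position == "end":
            tracks.append(service_audio)
        else:
            tracks.insert(int(position), service_audio)
    return {"tracks": tracks, "background": background}
```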

In one implementation of obtaining the real vocal audio corresponding to the e-book content to be played by voice, content tags may be preset for the e-book and audio tags may be preset for the real vocal audio. That is, at least one content tag for marking e-book content is preset in the e-book, and at least one audio tag for marking audio content is preset in the real vocal audio. On this basis, the real vocal audio corresponding to the e-book content can be obtained through the correspondence between content tags and audio tags.

Specifically, the content tag corresponding to the e-book content to be played by voice is determined; the audio tag corresponding to that content tag is determined according to a pre-stored correspondence between content tags and audio tags; and the audio content corresponding to the determined audio tag is obtained. Through the content tags and audio tags, the real vocal audio corresponding to the e-book content can be obtained quickly and accurately, which improves the response speed of the "listening to the book" function to user operations.

In another implementation of obtaining the real vocal audio corresponding to the e-book content to be played by voice, the following may be performed in advance (for example, before the step of determining the e-book content to be played by voice according to the voice play instruction): performing speech recognition on existing or acquired real vocal audio to obtain the corresponding text content; determining the e-book content in the e-book that matches the text content; and establishing and storing a correspondence between the real vocal audio corresponding to the text content and the determined e-book content. For example, speech recognition is performed on a 30-minute piece of film and television audio to obtain multiple segments of text content; the segments of text content are then each matched against the e-book content, and the correspondence between the segments of text content and multiple portions of the e-book content is determined from the matching results; finally, based on that correspondence, the correspondence between the portions of real vocal audio from which the text segments were recognized and the portions of e-book content is established and stored. On this basis, when the real vocal audio corresponding to the e-book content to be played by voice is to be obtained, it can be obtained according to the stored correspondence.
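
The tag-based lookup above can be sketched as follows. Both mappings are assumed to be plain dictionaries prepared in advance; the names `audio_for_content`, `tag_map`, and `audio_store` are hypothetical.

```python
def audio_for_content(content_tag, tag_map, audio_store):
    """Resolve a content tag to audio via the pre-stored correspondence.

    `tag_map` maps content tags to audio tags; `audio_store` maps audio
    tags to audio content. Returns None when either lookup fails.
    """
    audio_tag = tag_map.get(content_tag)
    if audio_tag is None:
        return None
    return audio_store.get(audio_tag)
```

Because both steps are constant-time dictionary lookups, the cost of finding the audio does not grow with the length of the e-book, which is consistent with the fast response described above.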
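
Building the stored correspondence from recognized text can be sketched as follows. The sketch does not perform speech recognition; it assumes that a recognizer has already produced `(start_seconds, end_seconds, text)` segments. The matching again uses `difflib.SequenceMatcher` as a stand-in similarity measure, and the name `build_correspondence` and the threshold of 0.6 are assumptions.

```python
from difflib import SequenceMatcher


def build_correspondence(recognized_segments, paragraphs, threshold=0.6):
    """Map e-book paragraph indices to (start, end) audio time ranges.

    `recognized_segments` is an assumed list of (start, end, text) tuples
    produced beforehand by some speech recognizer.
    """
    mapping = {}
    for start, end, text in recognized_segments:
        scores = [
            SequenceMatcher(None, text.lower(), p.lower()).ratio()
            for p in paragraphs
        ]
        if not scores:
            continue
        best = max(range(len(paragraphs)), key=scores.__getitem__)
        if scores[best] >= threshold:
            # Keep the first audio range found for each paragraph.
            mapping.setdefault(best, (start, end))
    return mapping
```

At playback time, the paragraph index of the selected content is looked up in the returned mapping to find which part of the audio to play.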

In addition, in an optional manner, if the real vocal audio includes multiple types, for example at least two of the film and television dialogue audio, the read-aloud audio, and the user audio, then the real vocal audio corresponding to the e-book content may be obtained in one of the following ways. In a first way, it is obtained from the at least two types according to a preset priority. In a second way, the user's selection operation on options corresponding to the at least two types is received, and the real vocal audio of the type selected by the selection operation is obtained. In a third way, the user's audio type preference is determined according to historical data of the user playing real vocal audio, and the real vocal audio corresponding to the e-book content to be played by voice is obtained from the at least two types according to that preference. For example, if the user's historical data shows ten voice playback records, of which eight used film and television dialogue audio, then when the user plays by voice again, film and television dialogue audio can be used directly for voice playback of the corresponding e-book content.
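
Determining the audio type preference from playback history can be sketched as follows. The sketch assumes that the history is a list of audio type labels, one per past playback, and picks the most frequently used type among those available; the name `preferred_audio_type` and the labels are hypothetical.

```python
from collections import Counter


def preferred_audio_type(history, available):
    """Return the most frequently played audio type that is available
    for the current content, or None if no past type is available."""
    counts = Counter(t for t in history if t in available)
    if not counts:
        return None
    return counts.most_common(1)[0][0]
```

With ten records of which eight are film and television dialogue audio, the function returns that type, matching the example above.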

For another example, in the first way, assume that the priority of the three types of audio is set, from high to low, as: user audio, film and television dialogue audio, and read-aloud audio. When a part of the e-book text has all three types of audio, the user audio is played. If a part of the e-book text has only some of the types, for example film and television dialogue audio and read-aloud audio, the film and television dialogue audio is played according to the priority; and if the part of the text has only read-aloud audio, the read-aloud audio is played. It should be noted that this priority setting is only an example and may be set appropriately by a person skilled in the art according to actual needs, which is not limited by the embodiments of the present invention. Setting a priority helps ensure, as far as possible, that the e-book text has corresponding audio, and it diversifies the forms of audio.

The second way gives the user greater flexibility in choosing the real vocal audio corresponding to the e-book content: the user selects the audio and it is played. The options corresponding to the film and television dialogue audio, the read-aloud audio, and the user audio may be set appropriately by a person skilled in the art according to actual needs. In an optional implementation, these options may be displayed through a pop-up window or a transparent overlay. For example, after receiving a voice play instruction for voice playback of a part of the e-book content, the e-book reading application presents the corresponding audio options to the user through a pop-up window or transparent overlay. After receiving the user's selection, it plays the real vocal audio corresponding to that selection; for example, if the user selects film and television dialogue audio, the film and television dialogue audio corresponding to that part of the e-book content is played. Displaying the audio options through a pop-up window or transparent overlay on top of the interface that displays the e-book content facilitates user operation and improves the user experience.
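
The priority-based selection can be sketched as follows, using the example order of user audio, then film and television dialogue audio, then read-aloud audio. The labels in `PRIORITY` and the name `pick_by_priority` are hypothetical, and the available audio for a passage is assumed to be a dictionary from type label to audio identifier.

```python
# Example priority from high to low; an assumed, configurable default.
PRIORITY = ("user", "film", "narration")


def pick_by_priority(available_audio, priority=PRIORITY):
    """Return (type, audio) for the highest-priority available type,
    or None when the passage has no audio at all."""
    for audio_type in priority:
        if audio_type in available_audio:
            return audio_type, available_audio[audio_type]
    return None
```

Falling through the list in order means a passage is left without audio only when none of the types exists for it.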

Through the above process, the "listening" function of the e-book content is realized. On the basis of this, optionally, the following operations of step S206 or step S208 can be further performed.

Step S206: If a page turning operation on the e-book is received while the real vocal audio is being played, suspend playback of the real vocal audio; re-determine the e-book content to be played by voice according to the page turning operation; and obtain and play the real vocal audio corresponding to the re-determined e-book content.

While real vocal audio is being played, the user may perform an operation such as turning the page before the audio has finished. When the e-book reading application detects a page turning operation during audio playback, it automatically suspends playback of the audio. It then re-determines the e-book content to be played by voice according to the page turning operation, for example by determining the final target page of the page turning operation and re-determining the e-book content to be played by voice from the content of that target page.

In one possible case, suppose the real vocal audio currently being played corresponds to the first sentence of the third paragraph on page 5 of the e-book. The user then performs continuous page turning operations and finally stops at page 10 of the e-book. In this case, the previous audio can be stopped and the real vocal audio for the e-book content on page 10 can be played (such as the audio corresponding to the first content tag on page 10, the audio corresponding to the starting text on page 10, or the audio corresponding to an episode or scene on page 10). Alternatively, the previous audio can be stopped, the user's selection operation on the e-book content on page 10 can be received, and the real vocal audio corresponding to the selected e-book content can be played. Other page navigation operations are handled in a similar way and are not described again here.

In another possible case, suppose the real vocal audio currently being played corresponds to the first sentence of the third paragraph on page 5 of the e-book. The user then performs continuous page turning operations, reaches page 10 of the e-book, and then turns back to page 5. In this case, playback of the previously interrupted real vocal audio can be resumed. This is not limiting: the previous audio may instead be stopped and the real vocal audio corresponding to the e-book content on page 5 re-determined, for example the audio corresponding to the first content tag on page 5, the audio corresponding to the starting text on page 5, or the audio corresponding to an episode or scene on page 5. However, resuming the interrupted real vocal audio is closer to the user's actual "listening to the book" needs than the other approaches and improves the user's "listening to the book" experience.

Of course, in the actual application, if the page turning operation of the electronic book is received during the process of playing the real vocal audio, the playing of the real vocal audio can be stopped, and the user's next voice playing instruction is awaited.
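
The page turning behavior of step S206 can be sketched as a small state holder. The sketch assumes one audio identifier per page and assumes that the application reports when the user has settled on a target page; the class name `ListeningSession` and its methods are hypothetical. It resumes the interrupted audio when the user returns to the original page and otherwise starts the audio of the new page.

```python
class ListeningSession:
    """Hypothetical playback state for page-turn handling."""

    def __init__(self, page_audio):
        self.page_audio = page_audio   # assumed dict: page number -> audio id
        self.page = None
        self.audio = None
        self.position = 0.0            # playback position in seconds
        self.playing = False
        self._interrupted = None       # (page, audio, position) when suspended

    def play_page(self, page):
        # Start the audio for a page from the beginning, if it has any.
        self.page, self.audio, self.position = page, self.page_audio.get(page), 0.0
        self.playing = self.audio is not None

    def on_page_turn(self, position):
        # Suspend playback and remember where it was interrupted.
        if self.playing:
            self._interrupted = (self.page, self.audio, position)
            self.playing = False

    def on_page_settled(self, page):
        # Resume if the user came back to the interrupted page,
        # otherwise re-determine the content from the target page.
        if self._interrupted is not None and self._interrupted[0] == page:
            self.page, self.audio, self.position = self._interrupted
            self.playing = True
        else:
            self.play_page(page)
        self._interrupted = None
```

An application that prefers to stop and wait for the next voice play instruction could simply skip the call to `on_page_settled`.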

Step S208: In the process of playing the real vocal audio, receiving an audio processing instruction for the played real vocal audio, and performing an operation indicated by the audio processing instruction on the real vocal audio.

The audio processing instruction includes, but is not limited to, at least one of: a pause instruction for instructing suspension of real human voice audio playback, a first adjustment instruction for indicating a playback speed of adjusting real human voice audio, and an instruction for adjusting the real person. A second adjustment instruction of the playback progress of the audio and audio, an exit instruction for instructing the exit of the real human voice audio, and a switching instruction for indicating the type of switching the real human voice audio.

For example, when the user needs to leave the terminal device through the real vocal audio “listening to the book”, the user may send a pause instruction to the e-book reading application by operating the “pause” or the similar operation option to pause the playing of the current audio; or, when When it is detected that the user interrupts the e-book reading application and uses other applications, the e-book reading application can automatically generate a corresponding pause instruction to suspend the playback of the current audio.

For another example, when the user needs to terminate the audio playback, the user may send an exit instruction indicating that the real human voice audio is exited to the e-book reading application by operating a “stop” or the like operation option to stop the playing of the current real human voice audio.

For another example, as described above, when the real vocal audio includes at least two of the film and television dialogue audio, the e-book content reading audio, and the user audio, the user may perform a selection operation on the other displayed audio types, or may use a “switch vocal” or similar operation option, to send the e-book reading application a switching instruction for switching the type of the real vocal audio. For example, if the current real vocal audio is user audio, the user may select one of the displayed audio types through the “switch vocal” operation option, thereby switching from the user audio to the film and television dialogue audio or to the e-book content reading audio.

For another example, if the user wants to adjust the playback speed of the audio, the user may send the e-book reading application a first adjustment instruction for adjusting the playback speed of the real vocal audio through the corresponding playback speed adjustment option, so as to adjust the playback speed of the current audio. For example, if the user selects “2x speed” playback, the playback speed of the current real vocal audio is adjusted to twice the original playback speed.

For another example, if the user wishes to fast forward or rewind the audio, the second adjustment instruction indicating that the playback progress of the real vocal audio is adjusted may be sent to the e-book reading application through the corresponding play progress adjustment operation option. For example, the user can adjust the playing progress of the current real vocal audio by clicking the “fast forward” or similar operation option, or by dragging the audio playback progress bar.

It should be noted that the foregoing audio processing instructions may be presented in any suitable manner by those skilled in the art. Optionally, the audio processing instructions may be displayed by means of a floating icon, a floating window, or a transparent overlay. Displaying them in this way, on the one hand, minimizes the impact of the displayed audio processing instructions on the user's reading of the e-book and, on the other hand, makes it more convenient for the user to control and process the audio, improving the user's "listening" experience.
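The five kinds of audio processing instruction in step S208 can be sketched as a small dispatcher. This is only an illustration under stated assumptions: the `Instruction` and `Player` names, the audio type identifiers, and the clamping of the playback progress to the audio duration are hypothetical and are not taken from the embodiment.

```python
from dataclasses import dataclass
from enum import Enum, auto

# Hypothetical identifiers for the three kinds of real vocal audio.
AUDIO_TYPES = ("film_tv_dialogue", "reading", "user")


class Instruction(Enum):
    PAUSE = auto()        # pause instruction
    SET_SPEED = auto()    # first adjustment instruction (playback speed)
    SEEK = auto()         # second adjustment instruction (playback progress)
    EXIT = auto()         # exit instruction
    SWITCH_TYPE = auto()  # switching instruction (audio type)


@dataclass
class Player:
    duration_s: float
    audio_type: str = "user"
    speed: float = 1.0
    position_s: float = 0.0
    playing: bool = True
    active: bool = True

    def handle(self, instruction: Instruction, value=None) -> None:
        """Perform the operation indicated by the instruction."""
        if instruction is Instruction.PAUSE:
            self.playing = False
        elif instruction is Instruction.SET_SPEED:
            if not isinstance(value, (int, float)) or value <= 0:
                raise ValueError("speed must be a positive number")
            self.speed = float(value)
        elif instruction is Instruction.SEEK:
            # Keep the requested progress inside the audio's duration.
            self.position_s = min(max(float(value), 0.0), self.duration_s)
        elif instruction is Instruction.EXIT:
            self.playing = False
            self.active = False
        elif instruction is Instruction.SWITCH_TYPE:
            if value not in AUDIO_TYPES:
                raise ValueError(f"unknown audio type: {value}")
            self.audio_type = value


player = Player(duration_s=600.0)
player.handle(Instruction.SET_SPEED, 2.0)         # "2x speed"
player.handle(Instruction.SEEK, 90.0)             # drag the progress bar
player.handle(Instruction.SWITCH_TYPE, "reading") # "switch vocal"
```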

Embodiment 3

Referring to FIG. 3, a block diagram of a structure of an electronic book voice playback apparatus according to a third embodiment of the present invention is shown.

The e-book voice playing device of this embodiment includes: a content determining module 302, configured to determine the e-book content to be played by voice according to a voice playing instruction for instructing voice playback of the e-book; and an audio playing module 304, configured to obtain real vocal audio corresponding to the e-book content and to play the real vocal audio.

With the e-book voice playing device provided by this embodiment, when the user's eyes are tired or the light is poor, the user can have the corresponding e-book content played by voice through a voice playing instruction, thereby realizing the "listening to the book" function of the e-book reading application. Moreover, the embodiment of the present invention uses real vocal audio. Because it is recorded from real human voices, real vocal audio is far superior to machine-synthesized audio in terms of intonation and fluency, so that the user can get a better "listening" experience.

Embodiment 4

Referring to FIG. 4, a block diagram of a structure of an electronic book voice playback apparatus according to a fourth embodiment of the present invention is shown.

The e-book voice playing device of this embodiment includes: a content determining module 402, configured to determine the e-book content to be played by voice according to a voice playing instruction for instructing voice playback of the e-book; and an audio playing module 404, configured to obtain real vocal audio corresponding to the e-book content and to play the real vocal audio.

Optionally, the real vocal audio includes at least one of: film and television dialogue audio obtained from a film or television drama corresponding to the e-book; reading audio corresponding to the e-book content of the e-book; and user audio recorded by a user of the e-book reading application in which the e-book is located.

Optionally, the audio playing module 404 is configured to obtain synthesized audio corresponding to the e-book content to be played, wherein the synthesized audio includes background audio and/or service audio in addition to the real vocal audio, and to play the synthesized audio.

Optionally, at least one content mark for marking the content of the e-book is preset in the e-book, and at least one audio mark for marking the audio content is preset in the real vocal audio; the audio playing module 404 is configured to obtain, according to the correspondence between the content marks and the audio marks, the real vocal audio corresponding to the e-book content, and to play the real vocal audio.

Optionally, the audio playing module 404 is configured to determine the content mark corresponding to the e-book content to be played by voice; determine the audio mark corresponding to that content mark according to a pre-stored correspondence between content marks and audio marks; and acquire the audio content corresponding to the determined audio mark and play it.
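The two-step lookup performed by the audio playing module 404 (content mark to audio mark, then audio mark to audio content) can be sketched as follows. The mark formats, the file name, and the representation of audio content as a (file, start, end) triple are hypothetical; the embodiment only requires that a pre-stored correspondence exists.

```python
from typing import Optional

# Hypothetical pre-stored correspondence: content mark -> audio mark.
MARK_MAP = {
    "ch1-p1": "a-0001",
    "ch1-p2": "a-0002",
}

# Hypothetical index of audio content: audio mark -> (file, start_s, end_s).
AUDIO_INDEX = {
    "a-0001": ("chapter1.mp3", 0.0, 14.2),
    "a-0002": ("chapter1.mp3", 14.2, 31.7),
}


def audio_for_content(content_mark: str) -> Optional[tuple]:
    """Return the audio content for a content mark, or None if unmapped."""
    audio_mark = MARK_MAP.get(content_mark)   # step 1: content mark -> audio mark
    if audio_mark is None:
        return None
    return AUDIO_INDEX.get(audio_mark)        # step 2: audio mark -> audio content


print(audio_for_content("ch1-p2"))
```

Returning `None` for an unmapped mark is a design choice of this sketch; a real application might instead fall back to another audio type.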

Optionally, the e-book voice playing device of this embodiment further includes a relationship establishing module 406, configured to, before the content determining module 402 determines the e-book content to be played by voice according to the voice playing instruction for instructing voice playback of the e-book: perform speech recognition on the real vocal audio to obtain corresponding text content; determine the e-book content in the e-book that matches the text content; and establish and store a correspondence between the real vocal audio corresponding to the text content and the determined e-book content. The audio playing module 404 is configured to obtain, according to this correspondence, the real vocal audio corresponding to the e-book content to be played by voice, and to play the real vocal audio.
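The matching step of the relationship establishing module 406 can be illustrated with a rough sketch. It assumes the speech recognition has already produced a transcript per audio clip (the recognizer itself is out of scope here), and it uses approximate string similarity from Python's standard `difflib` module as a stand-in for whatever matching technique an implementation would actually use. The clip identifiers, sample sentences, and the 0.8 threshold are hypothetical.

```python
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lower-case and collapse whitespace so minor differences matter less."""
    return " ".join(text.lower().split())


def build_correspondence(transcripts: dict[str, str],
                         paragraphs: list[str],
                         threshold: float = 0.8) -> dict[int, str]:
    """Map paragraph index -> clip id for clips whose transcript matches."""
    mapping: dict[int, str] = {}
    if not paragraphs:
        return mapping
    for clip_id, text in transcripts.items():
        scores = [
            SequenceMatcher(None, normalize(text), normalize(p)).ratio()
            for p in paragraphs
        ]
        best = max(range(len(paragraphs)), key=scores.__getitem__)
        # Only store a correspondence when the best match is close enough.
        if scores[best] >= threshold:
            mapping[best] = clip_id
    return mapping


paragraphs = [
    "The rain stopped just before dawn.",
    "She opened the old wooden door slowly.",
]
transcripts = {  # assumed output of a speech recognizer
    "clip_a": "she opened the old wooden door slowly",
    "clip_b": "the rain stopped just before dawn",
    "clip_c": "zzz qqq xxx",
}
print(build_correspondence(transcripts, paragraphs))
```

The stored mapping plays the role of the correspondence that the audio playing module 404 later consults; unmatched clips such as `clip_c` are simply left out.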

Optionally, when the real vocal audio includes at least two of the film and television dialogue audio, the e-book content reading audio, and the user audio, the audio playing module 404 is configured to obtain, according to a preset priority, the real vocal audio corresponding to the e-book content from among the at least two of the film and television dialogue audio, the e-book content reading audio, and the user audio, and to play the real vocal audio; or, the audio playing module 404 is configured to receive the user's selection operation on the options corresponding to the at least two of the film and television dialogue audio, the e-book content reading audio, and the user audio, obtain the real vocal audio selected by the selection operation and corresponding to the e-book content, and play the real vocal audio; or, the audio playing module 404 is configured to determine the user's audio type preference according to historical data of the user's playback of real vocal audio, obtain, according to that preference, the real vocal audio corresponding to the e-book content to be played by voice from among the at least two of the film and television dialogue audio, the e-book content reading audio, and the user audio, and play the real vocal audio.
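The three ways of choosing among the available audio types (preset priority, explicit user selection, preference inferred from playback history) can be sketched in one function. The embodiment presents them as alternatives; chaining them as a fallback order (selection, then history, then priority) is an assumption of this sketch, as are the type identifiers and the default priority.

```python
from collections import Counter
from typing import Iterable, Optional

# Hypothetical identifiers and default priority order.
DEFAULT_PRIORITY = ("film_tv_dialogue", "reading", "user")


def choose_audio_type(available: set,
                      user_choice: Optional[str] = None,
                      history: Optional[Iterable[str]] = None,
                      priority: tuple = DEFAULT_PRIORITY) -> Optional[str]:
    """Pick one audio type from those available for the e-book content."""
    # 1. An explicit selection by the user wins if that type is available.
    if user_choice in available:
        return user_choice
    # 2. Otherwise prefer the type the user has played most often.
    if history:
        counts = Counter(t for t in history if t in available)
        if counts:
            return counts.most_common(1)[0][0]
    # 3. Otherwise fall back to the preset priority.
    for audio_type in priority:
        if audio_type in available:
            return audio_type
    return None


available = {"reading", "user"}
print(choose_audio_type(available))
print(choose_audio_type(available, history=["user", "user", "reading"]))
print(choose_audio_type(available, user_choice="reading",
                        history=["user", "user"]))
```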

Optionally, the e-book voice playing device of this embodiment further includes a display module 408, configured to, before the audio playing module 404 receives the user's selection operation on the options corresponding to the at least two of the film and television dialogue audio, the e-book content reading audio, and the user audio, display those options through a pop-up window or a transparent overlay.

Optionally, the content determining module 402 is configured to determine, according to the voice play instruction for performing voice play on the electronic book and the selection operation of the display content of the electronic book, the electronic book content to be played by the voice.

Optionally, the e-book voice playing device of this embodiment further includes a content selection module 410, configured to, before the content determining module 402 determines the e-book content to be played by voice according to the voice playing instruction for instructing voice playback of the e-book and the selection operation on the display content of the e-book, receive a selection operation on the display content of the e-book and determine the e-book content to be played by voice according to the selection operation.

Optionally, the content selection module 410 includes: a first selection module 4102, configured to receive a first operation of the display content of the electronic book by the user, determine a first action point of the first operation in the display content, and receive the display content of the user The second operation determines a second action point of the second operation in the display content; determining display content between the first action point and the second action point as the e-book content to be played by the voice.

Optionally, the content selection module 410 includes a second selection module 4104, configured to receive a third operation by the user on the display content of the e-book and determine a third action point of the third operation in the display content; and, taking the third action point as a reference point, to determine the display content within a first set range containing the third action point as the e-book content to be played by voice; or to determine the display content within a second set range starting from the third action point as the e-book content to be played by voice; or to determine the display content within a third set range ending at the third action point as the e-book content to be played by voice.
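The range selections performed by the first selection module 4102 and the second selection module 4104 can be sketched as follows. The sketch assumes the display content has been split into a list of sentences and that an action point resolves to a sentence index; the function names, the `mode` values, and the `span` parameter (the size of the set range, in sentences) are hypothetical.

```python
def between_points(sentences: list, first: int, second: int) -> list:
    """Content between the first and second action points (inclusive)."""
    lo, hi = sorted((first, second))
    return sentences[lo:hi + 1]


def around_point(sentences: list, point: int, span: int,
                 mode: str = "containing") -> list:
    """Content in a set range defined relative to a single action point."""
    if mode == "containing":      # first set range: contains the point
        start, end = point - span, point + span
    elif mode == "starting":      # second set range: starts at the point
        start, end = point, point + span
    elif mode == "ending":        # third set range: ends at the point
        start, end = point - span, point
    else:
        raise ValueError(f"unknown mode: {mode}")
    # Clamp the range to the content actually displayed.
    start = max(start, 0)
    end = min(end, len(sentences) - 1)
    return sentences[start:end + 1]


page = ["s0", "s1", "s2", "s3", "s4", "s5"]
print(between_points(page, 4, 1))
print(around_point(page, 2, 1))
print(around_point(page, 2, 2, mode="starting"))
print(around_point(page, 2, 5, mode="ending"))
```

Sorting the two action points lets the user select in either direction, which is a convenience added by this sketch rather than something the embodiment states.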

Optionally, the content selection module 410 includes: a third selection module 4106, configured to receive a user's selection operation on the display content of the electronic book, determine a content tag corresponding to the display content selected by the selection operation, and mark the content The marked content is determined as the e-book content to be played by the voice.

Optionally, the e-book voice playing device of this embodiment further includes a recording storage module 412, configured to, before the content determining module 402 determines the e-book content to be played by voice according to the voice playing instruction for instructing voice playback of the e-book: receive reading audio recorded by the user through the e-book reading application for content of the e-book, and store the reading audio in association with the corresponding e-book content; and/or receive comment audio recorded by the user through the e-book reading application for content of the e-book, and store the comment audio in association with the corresponding e-book content.

Optionally, the e-book voice playback device of this embodiment further includes: an audio processing module 414, configured to receive an audio processing instruction for the played real vocal audio, and perform the audio processing instruction on the real vocal audio Indicated action.

Optionally, the audio processing instruction includes at least one of: a pause instruction for instructing that playback of the real vocal audio be paused, a first adjustment instruction for instructing adjustment of the playback speed of the real vocal audio, a second adjustment instruction for instructing adjustment of the playback progress of the real vocal audio, an exit instruction for instructing exit from playback of the real vocal audio, and a switching instruction for instructing a switch of the type of the real vocal audio.

Optionally, the display module 408 is further configured to display the audio processing instruction by using a floating icon or a floating window or a transparent overlay.

Optionally, the e-book voice playback device of the embodiment further includes: a re-determination module 416, configured to receive a page turning operation on the e-book during the process of playing the real vocal audio, and suspend the real vocal audio Playback; re-determine the e-book content to be played by the voice according to the page turning operation; obtain real vocal audio corresponding to the re-determined e-book content and play.

The e-book voice playback device of the present embodiment is used to implement the corresponding e-book voice playback method in the foregoing multiple method embodiments, and has the beneficial effects of the corresponding method embodiments, and details are not described herein again.

Embodiment 5

Referring to FIG. 5, a schematic structural diagram of a terminal device according to Embodiment 5 of the present invention is shown. The specific embodiments of the present invention do not limit the specific implementation of the terminal device.

As shown in FIG. 5, the terminal device may include a processor 502, a communications interface 504, a memory 506, and a communication bus 508.

Wherein:

Processor 502, communication interface 504, and memory 506 complete communication with one another via communication bus 508.

The communication interface 504 is configured to communicate with network elements of other devices, such as other terminal devices or servers.

The processor 502 is configured to execute the program 510, and specifically to perform the relevant steps in the foregoing embodiments of the e-book voice playing method.

In particular, program 510 can include program code, the program code including computer operating instructions.

The processor 502 may be a central processing unit (CPU), an Application Specific Integrated Circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the present invention. The one or more processors included in the terminal device may be of the same type, such as one or more CPUs, or of different types, such as one or more CPUs and one or more ASICs.

The memory 506 is configured to store the program 510. Memory 506 may include high speed RAM memory and may also include non-volatile memory, such as at least one disk memory.

The program 510 may specifically be configured to cause the processor 502 to: determine the e-book content to be played by voice according to a voice playing instruction for instructing voice playback of the e-book; and obtain real vocal audio corresponding to the e-book content and play the real vocal audio.

In an optional implementation, the real vocal audio includes at least one of: film and television dialogue audio obtained from a film or television drama corresponding to the e-book; reading audio corresponding to the e-book content of the e-book; and user audio recorded by a user of the e-book reading application in which the e-book is located.

In an optional implementation, the program 510 is further configured to cause the processor 502, when obtaining the real vocal audio corresponding to the e-book content and playing the real vocal audio, to obtain synthesized audio corresponding to the e-book content, wherein the synthesized audio includes background audio and/or service audio in addition to the real vocal audio, and to play the synthesized audio.

In an optional implementation, at least one content mark for marking the content of the e-book is preset in the e-book, and at least one audio mark for marking the audio content is preset in the real vocal audio; the program 510 is further configured to cause the processor 502, when obtaining the real vocal audio corresponding to the e-book content, to obtain the real vocal audio corresponding to the e-book content according to the correspondence between the content marks and the audio marks.

In an optional implementation, the program 510 is further configured to cause the processor 502, when obtaining the real vocal audio corresponding to the e-book content according to the correspondence between the content marks and the audio marks, to: determine the content mark corresponding to the e-book content to be played by voice; determine the audio mark corresponding to that content mark according to the pre-stored correspondence between content marks and audio marks; and acquire the audio content corresponding to the determined audio mark.

In an optional implementation, the program 510 is further configured to cause the processor 502, before determining the e-book content to be played by voice according to the voice playing instruction for instructing voice playback of the e-book, to: perform speech recognition on the real vocal audio to obtain corresponding text content; determine the e-book content in the e-book that matches the text content; and establish and store a correspondence between the real vocal audio corresponding to the text content and the determined e-book content. The program 510 is further configured to cause the processor 502, when obtaining the real vocal audio corresponding to the e-book content to be played, to obtain that real vocal audio according to the correspondence.

In an optional implementation, when the real vocal audio includes at least two of the film and television dialogue audio, the e-book content reading audio, and the user audio, the program 510 is further configured to cause the processor 502, when obtaining the real vocal audio corresponding to the e-book content, to: obtain, according to a preset priority, the real vocal audio corresponding to the e-book content from among the at least two of the film and television dialogue audio, the e-book content reading audio, and the user audio; or receive the user's selection operation on the options corresponding to the at least two of the film and television dialogue audio, the e-book content reading audio, and the user audio, and obtain the real vocal audio selected by the selection operation and corresponding to the e-book content; or determine the user's audio type preference according to historical data of the user's playback of real vocal audio, and obtain, according to that preference, the real vocal audio corresponding to the e-book content to be played from among the at least two of the film and television dialogue audio, the e-book content reading audio, and the user audio.

In an optional implementation, the program 510 is further configured to cause the processor 502, before receiving the user's selection operation on the options corresponding to the at least two of the film and television dialogue audio, the e-book content reading audio, and the user audio, to display those options through a pop-up window or a transparent overlay.

In an optional implementation, the program 510 is further configured to cause the processor 502, when determining the e-book content to be played by voice according to the voice playing instruction for instructing voice playback of the e-book, to determine the e-book content to be played by voice according to the voice playing instruction and a selection operation on the display content of the e-book.

In an optional implementation, the program 510 is further configured to cause the processor 502, before determining the e-book content to be played by voice according to the voice playing instruction for instructing voice playback of the e-book and the selection operation on the display content of the e-book, to receive a selection operation on the display content of the e-book and determine the e-book content to be played by voice according to the selection operation.

In an optional implementation manner, the program 510 is further configured to: when the processor 502 receives the selection operation of the display content of the electronic book, and determines the electronic book content to be played by the voice according to the selecting operation, receiving the user to the electronic a first operation of displaying content of the book, determining a first action point of the first operation in the display content; receiving a second operation of the display content by the user, determining a second operation of the second operation in the display content The action point; the display content between the first action point and the second action point is determined as the content of the e-book to be played by the voice.

In an optional implementation, the program 510 is further configured to cause the processor 502, when receiving the selection operation on the display content of the e-book and determining the e-book content to be played by voice according to the selection operation, to: receive a third operation by the user on the display content of the e-book and determine a third action point of the third operation in the display content; and, taking the third action point as a reference point, determine the display content within a first set range containing the third action point as the e-book content to be played by voice; or determine the display content within a second set range starting from the third action point as the e-book content to be played by voice; or determine the display content within a third set range ending at the third action point as the e-book content to be played by voice.

In an optional implementation manner, the program 510 is further configured to: when the processor 502 receives the selection operation of the display content of the electronic book, and determines the electronic book content to be played by the voice according to the selecting operation, receiving the user to the electronic a selection operation of the display content of the book, determining a content tag corresponding to the display content selected by the selection operation; and determining the content marked by the content tag as the e-book content to be played by the voice.

In an optional implementation, the program 510 is further configured to cause the processor 502, before determining the e-book content to be played by voice according to the voice playing instruction for instructing voice playback of the e-book, to: receive reading audio recorded by the user through the e-book reading application for content of the e-book, and store the reading audio in association with the corresponding e-book content; and/or receive comment audio recorded by the user through the e-book reading application for content of the e-book, and store the comment audio in association with the corresponding e-book content.

In an alternative embodiment, the program 510 is further configured to cause the processor 502 to receive an audio processing instruction for the played real vocal audio, the real vocal audio being subjected to the operation indicated by the audio processing instruction.

In an optional implementation manner, the audio processing instruction includes at least one of: a pause instruction for instructing suspension of real human voice audio playback, a first adjustment instruction for indicating a playback speed of adjusting real human voice audio, a second adjustment instruction for instructing adjustment of the playback progress of the real vocal audio, an exit instruction for indicating the exit of the real vocal audio playback, and a switching instruction for indicating the type of switching the real vocal audio.

In an alternative embodiment, the program 510 is further configured to cause the processor 502 to display the audio processing instructions via a floating icon or a floating window or a transparent overlay.

In an optional implementation manner, the program 510 is further configured to enable the processor 502 to receive a page turning operation on the e-book during the playing of the real human voice audio, and suspend the playing of the real human voice audio; The page turning operation redetermines the content of the e-book to be played by the voice; the real vocal audio corresponding to the content of the re-determined e-book is obtained and played.

For the specific implementation of the steps in the program 510, reference may be made to the descriptions of the corresponding steps and units in the foregoing embodiments of the e-book voice playing method, and details are not repeated here. A person skilled in the art can clearly understand that, for convenience and brevity of description, the specific working processes of the devices and modules described above may be found in the corresponding process descriptions in the foregoing method embodiments, and details are not repeated here.

With this embodiment, when the user's eyes are tired or the light is poor, the corresponding e-book content can be played by voice through a voice playing instruction, realizing the “listening to the book” function of the e-book reading application. Moreover, the embodiment of the present invention uses real vocal audio. Because it is recorded from real human voices, real vocal audio is far superior to machine-synthesized audio in terms of intonation and fluency, so that the user can get a better "listening" experience.

It should be noted that, according to the needs of the implementation, each component/step described in the embodiments of the present invention may be split into more components/steps, and two or more components/steps or partial operations of components/steps may be combined into new components/steps, to achieve the objectives of the embodiments of the present invention.

The above methods according to embodiments of the present invention may be implemented in hardware or firmware, or implemented as software or computer code that can be stored in a recording medium (such as a CD-ROM, RAM, floppy disk, hard disk, or magneto-optical disk), or implemented as computer code that is downloaded over a network, originally stored in a remote recording medium or non-transitory machine-readable medium, and then stored in a local recording medium, so that the methods described herein can be processed by such software stored on a recording medium using a general-purpose computer, a dedicated processor, or programmable or dedicated hardware (such as an ASIC or FPGA). It will be understood that a computer, processor, microprocessor controller, or programmable hardware includes storage components (e.g., RAM, ROM, flash memory) that can store or receive software or computer code; when the software or computer code is accessed and executed by the computer, processor, or hardware, the e-book voice playing method described herein is implemented. Moreover, when a general-purpose computer accesses code for implementing the e-book voice playing method shown herein, execution of the code converts the general-purpose computer into a special-purpose computer for executing that method.

Those of ordinary skill in the art will appreciate that the elements and method steps of the various examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware or a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the specific application and design constraints of the solution. A person skilled in the art can use different methods to implement the described functions for each particular application, but such implementation should not be considered to be beyond the scope of the embodiments of the invention.

The above embodiments are only used to illustrate the embodiments of the present invention and are not intended to limit them. Those of ordinary skill in the relevant art can make various changes and modifications without departing from the spirit and scope of the embodiments of the present invention; therefore, all equivalent technical solutions also fall within the scope of the embodiments of the present invention, and the scope of patent protection of the embodiments of the present invention should be defined by the claims.

Claims

An e-book voice playing method, comprising:

determining the e-book content to be played by voice according to a voice playing instruction for instructing voice playback of the e-book;

obtaining real vocal audio corresponding to the e-book content, and playing the real vocal audio.
The method of claim 1 wherein said real vocal audio comprises at least one of:

a film and television audio obtained from a film and television drama corresponding to the e-book;

Reading audio corresponding to the e-book content of the e-book;

The user audio recorded by the user of the e-book reading application in which the e-book is located.
The method of claim 1, wherein obtaining real vocal audio corresponding to the e-book content and playing the real vocal audio comprises:

Obtaining synthesized audio corresponding to the e-book content, wherein the synthesized audio includes the real vocal audio, and further comprising background audio and/or service audio;

Playing the synthesized audio.
The method according to any one of claims 1 to 3, wherein at least one content mark for marking the content of the e-book is preset in the e-book, and at least one audio mark for marking the audio content is preset in the real vocal audio;

The obtaining real vocal audio corresponding to the content of the e-book includes:

Real vocal audio corresponding to the e-book content is obtained according to a correspondence between the content tag and the audio tag.
The method according to claim 4, wherein the real vocal audio corresponding to the e-book content is obtained according to the correspondence between the content tag and the audio tag, comprising:

Determining a content tag corresponding to the e-book content to be played by the voice;

Determining an audio tag corresponding to the content tag according to a correspondence between the pre-stored content tag and the audio tag;

Acquiring audio content corresponding to the determined audio tag.
A method according to any one of claims 1 to 3, wherein

Before the determining the e-book content to be played by the voice according to the voice play instruction for instructing the e-book to perform the voice play, the method further includes:

Perform speech recognition on real vocal audio to obtain corresponding text content;

Determining an e-book content in the e-book that matches the text content;

Establishing and storing a correspondence between the real vocal audio corresponding to the text content and the determined content of the e-book;

The obtaining real vocal audio corresponding to the e-book content includes: obtaining the real vocal audio corresponding to the e-book content according to the correspondence between the real vocal audio corresponding to the text content and the determined e-book content.
The method according to claim 2, wherein when the real vocal audio includes at least two of the film and television dialogue audio, the e-book content reading audio, and the user audio,

The obtaining real vocal audio corresponding to the content of the e-book includes:

Obtaining, according to a preset priority, real vocal audio corresponding to the e-book content from at least two of the film and television dialogue audio, the e-book content reading audio, and the user audio;

or,

Receiving a user's selection operation on options corresponding to at least two of the film and television dialogue audio, the e-book content reading audio, and the user audio, and obtaining the real vocal audio corresponding to the e-book content that is selected by the selection operation;

or,

Determining the user's audio type preference according to historical data of the user playing real vocal audio; and obtaining, according to the user's audio type preference, real vocal audio corresponding to the e-book content from at least two of the film and television dialogue audio, the e-book content reading audio, and the user audio.
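Two of the alternatives recited above, selection by preset priority and selection by a preference inferred from playback history, may be sketched as follows. The type identifiers, the default priority order, and the rule that the most frequently played type is the preferred one are assumptions for this example.

```python
from collections import Counter

# Assumed default priority order; the application does not fix a particular order.
DEFAULT_PRIORITY = ("film_tv_dialogue", "reading", "user")


def pick_by_priority(available, priority=DEFAULT_PRIORITY):
    """Return the first audio type in `priority` that is available, else None.

    `available` maps an audio type to the clip of that type for the content.
    """
    for audio_type in priority:
        if audio_type in available:
            return audio_type
    return None


def pick_by_history(available, history):
    """Return the available audio type played most often in `history`.

    Falls back to the preset priority when the history gives no signal.
    """
    counts = Counter(t for t in history if t in available)
    if not counts:
        return pick_by_priority(available)
    return counts.most_common(1)[0][0]
```

The explicit user selection alternative needs no inference: the type chosen in the displayed options is used directly as the key into `available`.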
The method of claim 7, wherein before the receiving a user's selection operation on options corresponding to at least two of the film and television dialogue audio, the e-book content reading audio, and the user audio, the method further includes:

Displaying, through a pop-up window or a transparent overlay, the options corresponding to at least two of the film and television dialogue audio, the e-book content reading audio, and the user audio.
The method of claim 1, wherein the determining the e-book content to be played by the voice according to the voice play instruction for instructing the e-book to perform the voice play comprises:

The e-book content to be played by the voice is determined according to a voice play instruction for instructing the e-book to perform voice play and a selection operation of the display content of the e-book.
The method according to claim 9, wherein before the determining the e-book content to be played by the voice according to the voice play instruction for instructing the e-book to perform voice play and the selection operation on the display content of the e-book, the method further includes:

Receiving a selection operation of the display content of the electronic book, and determining an electronic book content to be played by the voice according to the selection operation.
The method according to claim 10, wherein the receiving a selection operation of the display content of the electronic book, and determining the content of the electronic book to be played by the voice according to the selecting operation comprises:

Receiving a first operation by the user on the display content of the e-book, and determining a first action point of the first operation in the display content;

Receiving a second operation by the user on the display content, and determining a second action point of the second operation in the display content;

Determining the display content between the first action point and the second action point as the e-book content to be played by the voice.
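For illustration only, if each action point is assumed to resolve to a character offset within the displayed text (an assumption; the application does not define how an action point is represented), the selection between two action points may be sketched as:

```python
def select_between(text, first_point, second_point):
    """Return the displayed text lying between two action points.

    The points are treated as character offsets and may be given in either
    order; both are clamped to the bounds of the displayed text.
    """
    start, end = sorted((first_point, second_point))
    start = max(0, start)
    end = min(len(text), end)
    return text[start:end]
```

Sorting the two offsets lets the user mark the end of the passage before the beginning without changing the result.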
The method according to claim 10, wherein the receiving a selection operation of the display content of the electronic book, and determining the content of the electronic book to be played by the voice according to the selecting operation comprises:

Receiving a third operation by the user on the display content of the e-book, and determining a third action point of the third operation in the display content;

Taking the third action point as a reference point, determining the display content in a first setting range that includes the third action point as the e-book content to be played by the voice; or determining the display content in a second setting range that starts at the third action point as the e-book content to be played by the voice; or determining the display content in a third setting range that ends at the third action point as the e-book content to be played by the voice.
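The three alternatives above may be sketched as follows, again assuming that the action point is a character offset and that each setting range is a fixed number of characters. Both are assumptions for this example; a setting range could equally be a sentence, a paragraph, or a page.

```python
def select_around(text, point, size, mode="containing"):
    """Select `size` characters of `text` relative to a single action point.

    mode="containing": range roughly centered on the point (first setting range)
    mode="starting":   range beginning at the point (second setting range)
    mode="ending":     range finishing at the point (third setting range)
    """
    point = max(0, min(len(text), point))
    if mode == "starting":
        start, end = point, point + size
    elif mode == "ending":
        start, end = point - size, point
    elif mode == "containing":
        start = point - size // 2
        end = start + size
    else:
        raise ValueError(f"unknown mode: {mode}")
    # Clamp to the displayed text so that ranges near the edges stay valid.
    return text[max(0, start):min(len(text), end)]
```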
The method according to claim 10, wherein the receiving a selection operation of the display content of the electronic book, and determining the content of the electronic book to be played by the voice according to the selecting operation comprises:

Receiving a selection operation of the display content of the e-book by the user, and determining a content tag corresponding to the display content selected by the selection operation;

The content marked by the content tag is determined as the e-book content to be played by the voice.
The method according to claim 1, wherein before the determining the e-book content to be played by the voice according to the voice play instruction for instructing the e-book to perform voice playback, the method further comprises:

Receiving read-aloud audio recorded by a user through the e-book reading application for the e-book content, and storing the read-aloud audio in association with the corresponding e-book content;

and/or,

Receiving comment audio recorded by a user through the e-book reading application for the e-book content, and storing the comment audio in association with the corresponding e-book content.
The method of claim 1 wherein the method further comprises:

Receiving an audio processing instruction for the real vocal audio being played, and performing an operation indicated by the audio processing instruction on the real vocal audio.
The method of claim 15, wherein the audio processing instruction comprises at least one of: a pause instruction for instructing suspension of playback of the real vocal audio, a first adjustment instruction for instructing adjustment of a playback speed of the real vocal audio, a second adjustment instruction for instructing adjustment of a playback progress of the real vocal audio, an exit instruction for instructing exit from playback of the real vocal audio, and a switching instruction for instructing switching of the type of the real vocal audio.
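A minimal sketch of how the five instruction kinds listed above might be applied to a playback state is given below. The `PlaybackState` fields, the enum member names, and the representation of progress in seconds are assumptions for illustration and are not taken from the application.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Instruction(Enum):
    PAUSE = auto()
    ADJUST_SPEED = auto()     # the "first adjustment instruction"
    ADJUST_PROGRESS = auto()  # the "second adjustment instruction"
    EXIT = auto()
    SWITCH_TYPE = auto()


@dataclass
class PlaybackState:
    audio_type: str = "reading"
    position_s: float = 0.0
    speed: float = 1.0
    paused: bool = False
    active: bool = True


def apply_instruction(state, instruction, value=None):
    """Perform the operation indicated by `instruction` on `state` and return it."""
    if instruction is Instruction.PAUSE:
        state.paused = True
    elif instruction is Instruction.ADJUST_SPEED:
        state.speed = float(value)
    elif instruction is Instruction.ADJUST_PROGRESS:
        state.position_s = max(0.0, float(value))  # never seek before the start
    elif instruction is Instruction.EXIT:
        state.active = False
    elif instruction is Instruction.SWITCH_TYPE:
        state.audio_type = str(value)
    return state
```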
The method of claim 15 or 16, wherein the method further comprises:

Displaying the audio processing instruction through a floating icon, a floating window, or a transparent overlay.
The method of claim 1 wherein the method further comprises:

During playback of the real vocal audio, receiving a page turning operation on the e-book, and suspending playback of the real vocal audio;

Re-determining the e-book content to be played by the voice according to the page turning operation;

Obtaining and playing real vocal audio corresponding to the re-determined e-book content.
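The order of operations on a page turn (suspend, re-determine, then obtain and play) may be sketched with the toy model below. It assumes that the re-determined content is simply the content of the newly displayed page and that audio is looked up by page index; the class and attribute names are hypothetical.

```python
class ReadingSession:
    """Toy model of playback that follows page turns.

    `audio_by_page` maps a page index to an audio clip name (hypothetical data).
    """

    def __init__(self, audio_by_page):
        self.audio_by_page = audio_by_page
        self.current_page = 0
        self.now_playing = audio_by_page.get(0)
        self.log = []

    def on_page_turn(self, new_page):
        # Step 1: suspend whatever is currently playing.
        self.log.append(("suspend", self.now_playing))
        self.now_playing = None
        # Step 2: re-determine the content (here, the new page).
        self.current_page = new_page
        # Step 3: obtain and play the matching audio, if any exists.
        clip = self.audio_by_page.get(new_page)
        if clip is not None:
            self.now_playing = clip
            self.log.append(("play", clip))
        return self.now_playing
```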
An electronic book voice playing device, comprising:

a content determining module, configured to determine an e-book content to be played by the voice according to a voice playing instruction for instructing the e-book to perform voice playing;

an audio playing module, configured to obtain real vocal audio corresponding to the e-book content, and play the real vocal audio.
The apparatus of claim 19, wherein the real vocal audio comprises at least one of:

film and television dialogue audio obtained from a film and television drama corresponding to the e-book;

reading audio corresponding to the e-book content of the e-book;

user audio recorded by a user of the e-book reading application in which the e-book is located.
The apparatus according to claim 19, wherein the audio playing module is configured to obtain synthesized audio corresponding to the e-book content, wherein the synthesized audio includes the real vocal audio and further includes background audio and/or service audio; and to play the synthesized audio.
The apparatus according to any one of claims 19 to 21, wherein at least one content tag for marking the e-book content is preset in the e-book, and at least one audio tag for marking audio content is preset in the real vocal audio;

The audio playing module is configured to obtain real vocal audio corresponding to the e-book content according to a correspondence between the content tag and the audio tag, and play the real vocal audio.
The device according to claim 22, wherein the audio playing module is configured to determine a content tag corresponding to the e-book content to be played by the voice; determine an audio tag corresponding to the content tag according to a pre-stored correspondence between the content tag and the audio tag; and acquire audio content corresponding to the determined audio tag and play the audio content.
A device according to any one of claims 19-21, wherein

The device further includes: a relationship establishing module, configured to, before the content determining module determines the e-book content to be played by the voice according to the voice play instruction for instructing the e-book to perform voice play: perform speech recognition on real vocal audio to obtain corresponding text content; determine e-book content in the e-book that matches the text content; and establish and store a correspondence between the real vocal audio corresponding to the text content and the determined e-book content;

The audio playing module is configured to obtain real vocal audio corresponding to the e-book content according to the correspondence between the real vocal audio corresponding to the text content and the determined e-book content, and play the real vocal audio.
The apparatus according to claim 20, wherein when the real vocal audio includes at least two of the film and television dialogue audio, the e-book content reading audio, and the user audio,

The audio playing module is configured to obtain, according to a preset priority, real vocal audio corresponding to the e-book content from at least two of the film and television dialogue audio, the e-book content reading audio, and the user audio, and play the real vocal audio;

or,

The audio playing module is configured to receive a user's selection operation on options corresponding to at least two of the film and television dialogue audio, the e-book content reading audio, and the user audio, obtain the real vocal audio corresponding to the e-book content that is selected by the selection operation, and play the real vocal audio;

or,

The audio playing module is configured to determine the user's audio type preference according to historical data of the user playing real vocal audio; obtain, according to the user's audio type preference, real vocal audio corresponding to the e-book content from at least two of the film and television dialogue audio, the e-book content reading audio, and the user audio; and play the real vocal audio.
The device of claim 25, wherein the device further comprises:

a display module, configured to, before the audio playing module receives the user's selection operation on the options corresponding to at least two of the film and television dialogue audio, the e-book content reading audio, and the user audio, display the options through a pop-up window or a transparent overlay.
The device according to claim 19, wherein the content determining module is configured to determine the e-book content to be played by the voice according to a voice play instruction for instructing the e-book to perform voice play and a selection operation on the display content of the e-book.
The device of claim 27, wherein the device further comprises:

a content selection module, configured to, before the content determining module determines the e-book content to be played by the voice according to the voice play instruction for instructing the e-book to perform voice play and the selection operation on the display content of the e-book, receive the selection operation on the display content of the e-book and determine the e-book content to be played by the voice according to the selection operation.
The apparatus of claim 28, wherein the content selection module comprises:

a first selection module, configured to receive a first operation by the user on the display content of the e-book, and determine a first action point of the first operation in the display content; receive a second operation by the user on the display content, and determine a second action point of the second operation in the display content; and determine the display content between the first action point and the second action point as the e-book content to be played by the voice.
The apparatus of claim 28, wherein the content selection module comprises:

a second selection module, configured to receive a third operation by the user on the display content of the e-book, and determine a third action point of the third operation in the display content; and, taking the third action point as a reference point, determine the display content in a first setting range that includes the third action point as the e-book content to be played by the voice; or determine the display content in a second setting range that starts at the third action point as the e-book content to be played by the voice; or determine the display content in a third setting range that ends at the third action point as the e-book content to be played by the voice.
The apparatus of claim 28, wherein the content selection module comprises:

a third selection module, configured to receive a user's selection operation on the display content of the e-book, determine a content tag corresponding to the display content selected by the selection operation, and determine the content marked by the content tag as the e-book content to be played by the voice.
The device of claim 19, wherein the device further comprises:

a recording storage module, configured to, before the content determining module determines the e-book content to be played by the voice according to the voice play instruction for instructing the e-book to perform voice play: receive read-aloud audio recorded by the user through the e-book reading application for the e-book content, and store the read-aloud audio in association with the corresponding e-book content; and/or receive comment audio recorded by the user through the e-book reading application for the e-book content, and store the comment audio in association with the corresponding e-book content.
The device of claim 19, wherein the device further comprises:

an audio processing module, configured to receive an audio processing instruction for the real vocal audio being played, and perform an operation indicated by the audio processing instruction on the real vocal audio.
The apparatus according to claim 33, wherein the audio processing instruction comprises at least one of: a pause instruction for instructing suspension of playback of the real vocal audio, a first adjustment instruction for instructing adjustment of a playback speed of the real vocal audio, a second adjustment instruction for instructing adjustment of a playback progress of the real vocal audio, an exit instruction for instructing exit from playback of the real vocal audio, and a switching instruction for instructing switching of the type of the real vocal audio.
The device according to claim 33 or 34, wherein the display module is further configured to display the audio processing instruction by a floating icon or a floating window or a transparent overlay.
The device of claim 19, wherein the device further comprises:

a re-determination module, configured to, during playback of the real vocal audio, receive a page turning operation on the e-book and suspend playback of the real vocal audio; re-determine the e-book content to be played by the voice according to the page turning operation; and obtain and play real vocal audio corresponding to the re-determined e-book content.
A terminal device, comprising: a processor, a memory, a communication interface, and a communication bus, wherein the processor, the memory, and the communication interface communicate with one another through the communication bus;

The memory is configured to store at least one executable instruction that causes the processor to perform an operation corresponding to the e-book voice playback method of any of claims 1-18.