CN105469656A - A spoken language learning system and its operation method

Info

Publication number: CN105469656A
Application number: CN201510821973.6A
Authority: CN
Inventors: 于拾全; 卫亚东; 田学红
Original assignee: Dongguan Fandou Information Technology Co ltd
Current assignee: Dongguan Fandou Information Technology Co ltd
Priority date: 2015-11-23
Filing date: 2015-11-23
Publication date: 2016-04-06

Abstract

The invention relates to a spoken language learning system in which an audio decoding module is connected to a voice interruption point search module, and the voice interruption point search module is connected to an audio playback module. The spoken language learning system effectively solves the problem of interactively training listening and speaking abilities at the same time in English learning.

Description

A spoken language learning system and its operation method

Technical field

The invention relates to a spoken language learning system and to a method of operating that system.

Background art

Learning spoken English requires repeated listening and speaking practice to be efficient. At present, however, people generally play purchased or downloaded English audio files in a one-way audio player, so users can train only their listening ability and cannot practice speaking promptly.

In view of this, there is a real need for a system that lets users carry out combined listening, speaking, and confirmation training with ordinary audio files or network audio, so as to improve learning efficiency.

Summary of the invention

To solve the above problem, the present invention provides a spoken language learning system comprising: an audio decoding module for decoding audio files; a voice interruption point search module for automatically computing and locating voice interruption points in the audio; an audio playback module for playing and replaying audio data; an adaptive recording module for adaptively recording the user's voice; and a recording playback module for playing back the recording. The audio decoding module is connected to the voice interruption point search module, and the voice interruption point search module is connected to the audio playback module.

Preferably, the audio decoding module supports decoding of audio files such as MP3 or MVA, or of online audio streams.

Preferably, the audio decoding module supports reading decoded data of any length on each read.

Preferably, the adaptive recording module has a noise reduction module that supports speech noise reduction processing.

Preferably, the adaptive recording module saves the voice to a recording file MicFile, and the recording playback module can automatically trigger playback of the recording file MicFile.

The present invention also provides a method of operating the above spoken language learning system, the method comprising:

Step 1: the audio decoding module decodes the audio file;

Step 2: the voice interruption point search module automatically computes and locates the voice interruption points in the audio;

Step 3: the audio playback module plays and replays audio data;

Step 4: the adaptive recording module adaptively records the user's voice;

Step 5: the recording playback module plays back the user's recording.

Preferably, in step 2, the voice interruption point search module automatically computes and locates the voice interruption points in the decoded data stream, working either on the entire buffered audio data or on part of the data stream.

Preferably, in step 2, the voice interruption point search module uses an energy-threshold voice breakpoint detection algorithm.

Preferably, in step 4, if no valid speech appears within a first duration T1, recording ends automatically; if valid speech appears within the first duration T1, silent-segment detection begins, and if a silent segment lasts for a second duration T2, recording ends automatically.

Preferably, the following step is further included after step 5:

Step 6: the audio decoding module and the voice interruption point search module carry out subsequent data decoding and breakpoint detection.

The beneficial effect of the present invention is that the spoken language learning system effectively solves the problem of interactively training listening and speaking abilities at the same time in English learning. Using only ordinary audio files or network audio streams, it provides sentence-by-sentence cyclic training of listening, repeating, and confirming. It also supports repeated playback of a single sentence, which can significantly improve the efficiency of spoken language learning.

Description of the drawings

FIG. 1 is a schematic diagram of the learning system framework provided by an embodiment of the present invention.

Detailed description

The present invention is further described below with reference to the accompanying drawing:

The invention provides a spoken language learning system. The input to the system is an audio file that consists mainly of speech and does not contain continuous background music.

As shown in FIG. 1, the spoken language learning system comprises an audio decoding module for decoding audio files; a voice interruption point search module for automatically computing and locating voice interruption points in the audio; an audio playback module for playing and replaying audio data; an adaptive recording module for adaptively recording the user's voice; and a recording playback module for playing back the user's recording.

The audio decoding module is connected to the voice interruption point search module and passes the decoded data stream to it. The voice interruption point search module is connected to the audio playback module and passes the data of each voice segment to it.

The present invention also provides a method of operating the above spoken language learning system, comprising the following steps:

The audio decoding module supports decoding of audio files such as MP3 or MVA as well as online audio streams, and supports reading decoded data of any length on each read. A suitable buffer size can be chosen for each platform, so that a suitable length of decoded data PcmData is read each time.
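The idea of reading decoded data in blocks of a caller-chosen length can be sketched as follows. This is only an illustration under stated assumptions: the function name `read_pcm_chunks`, the in-memory stand-in for the decoder output, and the 16-bit mono 16 kHz format are all hypothetical and do not come from the patent.

```python
import io

def read_pcm_chunks(stream, chunk_bytes):
    """Yield successive blocks of decoded PCM bytes from a file-like object.

    chunk_bytes is the read size chosen for the platform; the final block
    may be shorter than chunk_bytes.
    """
    if chunk_bytes <= 0:
        raise ValueError("chunk_bytes must be positive")
    while True:
        pcm_data = stream.read(chunk_bytes)
        if not pcm_data:
            break
        yield pcm_data

# Assumed format: 16-bit mono at 16 kHz, 40 ms per read
# -> 16000 samples/s * 0.040 s * 2 bytes = 1280 bytes per block.
decoded = io.BytesIO(bytes(3000))  # stand-in for a real decoder's output
sizes = [len(block) for block in read_pcm_chunks(decoded, 1280)]
print(sizes)  # [1280, 1280, 440]
```

A smaller block lowers latency before the first breakpoint can be found, while a larger block reduces per-read overhead; the patent leaves that choice to the platform.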

The voice interruption point search module can work either on the entire buffered audio data or on part of the data stream to automatically compute and locate the voice interruption points in the decoded data stream. The algorithms it may use include, but are not limited to, common ones such as energy-threshold voice breakpoint detection. For example, starting from the decoded data PcmData obtained above, speech energy and zero-crossing rate are computed per frame of 20 ms or 40 ms, and a sliding window with a threshold decision then determines whether a voice interruption point exists. If one exists, the breakpoint information is recorded, and the recording module is started after the audio playback module has played the voice segment. If none exists, the data is passed directly to the audio playback module for playback.
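A minimal sketch of frame-based breakpoint detection along these lines is given below. The patent does not specify thresholds, window length, or how energy and zero-crossing rate are combined, so everything concrete here is an assumption: the function names, the threshold values, the rule that a frame counts as silent when both energy and zero-crossing rate are low, and the use of a minimum run of silent frames in place of the sliding window.

```python
import math

def frame_features(samples, frame_len):
    """Return (mean energy, zero-crossing rate) for each non-overlapping frame."""
    feats = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(s * s for s in frame) / frame_len
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
        feats.append((energy, crossings / max(frame_len - 1, 1)))
    return feats

def find_breakpoints(samples, sample_rate, frame_ms=20, energy_thresh=1e-4,
                     zcr_thresh=0.25, min_silence_frames=10):
    """Return sample indices at the midpoint of each long silent run between speech.

    All thresholds are illustrative assumptions, not values from the patent.
    """
    frame_len = sample_rate * frame_ms // 1000
    feats = frame_features(samples, frame_len)
    # Assumed rule: low energy AND low zero-crossing rate means silence.
    silent = [e < energy_thresh and z < zcr_thresh for e, z in feats]
    breakpoints = []
    run_start = None
    seen_speech = False
    for i, is_silent in enumerate(silent):
        if is_silent:
            if run_start is None:
                run_start = i
        else:
            long_enough = run_start is not None and i - run_start >= min_silence_frames
            if long_enough and seen_speech:
                breakpoints.append(((run_start + i) // 2) * frame_len)
            run_start = None
            seen_speech = True
    return breakpoints

def tone(n, freq=200.0, rate=8000, amp=0.5):
    """Synthetic stand-in for speech: a plain sine wave."""
    return [amp * math.sin(2 * math.pi * freq * k / rate) for k in range(n)]

# 0.5 s of tone, 0.4 s of silence, 0.5 s of tone at 8 kHz
signal = tone(4000) + [0.0] * 3200 + tone(4000)
breaks = find_breakpoints(signal, 8000)
print(breaks)  # [5600]
```

The reported index falls in the middle of the silent gap (samples 4000 to 7200), which is where playback could pause before the recording module starts. Real recordings contain background noise, so the thresholds would need tuning or adaptive estimation.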

The audio playback module plays the data as soon as it receives it, and stops automatically when there is no data. It can play the voice segment data output by the voice interruption point search module, and it can also replay a specified voice segment repeatedly.

The adaptive recording module adaptively controls the recording duration and saves the user's voice input as an audio file. It also has a noise reduction module that supports speech noise reduction processing. The algorithms for adaptive duration control include, but are not limited to, voice endpoint detection and adaptive control of silent-segment length. On receiving the start command, the adaptive recording module begins recording: it buffers the data MicData output by the microphone device, saves it to a recording file MicFile, and at the same time runs breakpoint detection on MicData. If no valid speech appears within a first duration T1, recording ends automatically. If valid speech appears within the first duration T1, silent-segment detection begins, and if a silent segment lasts for a second duration T2, recording ends automatically. When recording has ended, the recording playback module is started automatically.
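The T1/T2 stop rule can be sketched as a small state machine over per-frame speech decisions. This is an illustration only: the function name, the frame length, and the example values of T1 and T2 are assumptions (the patent gives no values), and the per-frame speech flags are assumed to come from some separate detector such as an energy threshold.

```python
def recording_stop_frame(frame_is_speech, frame_ms, t1_ms, t2_ms):
    """Return the index of the frame at which recording ends, or None if the
    input runs out before either stop condition is met.

    frame_is_speech: iterable of booleans, one per frame of microphone data.
    t1_ms: maximum wait for the first valid speech (T1).
    t2_ms: length of trailing silence that ends the recording (T2).
    """
    waited_ms = 0        # time spent waiting for the first speech
    silence_ms = 0       # length of the current silent run after speech
    heard_speech = False
    for i, is_speech in enumerate(frame_is_speech):
        if is_speech:
            heard_speech = True
            silence_ms = 0            # any speech resets the silence timer
        elif heard_speech:
            silence_ms += frame_ms
            if silence_ms >= t2_ms:   # silent segment lasted T2
                return i
        else:
            waited_ms += frame_ms
            if waited_ms >= t1_ms:    # no valid speech within T1
                return i
    return None

# Assumed values: 20 ms frames, T1 = 3000 ms, T2 = 1000 ms
never_spoke = [False] * 200
spoke_once = [False] * 50 + [True] * 100 + [False] * 100
with_pause = [True] * 10 + [False] * 20 + [True] * 10 + [False] * 60

stop_a = recording_stop_frame(never_spoke, 20, 3000, 1000)
stop_b = recording_stop_frame(spoke_once, 20, 3000, 1000)
stop_c = recording_stop_frame(with_pause, 20, 3000, 1000)
print(stop_a, stop_b, stop_c)  # 149 199 89
```

In the third case the 400 ms pause is shorter than T2, so recording continues through it and stops only after the final full second of silence.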

The recording playback module can automatically trigger playback of the user's recording file MicFile, so that users can check their own repetition of the sentence. On receiving the instruction, the recording playback module starts playing the recording file MicFile. When playback finishes, it notifies the audio decoding module and the voice interruption point search module to carry out subsequent data decoding and breakpoint detection.

If the user enters an instruction during this period, the audio decoding module is notified to start decoding data from the previously saved interruption point position.
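The overall listen, repeat, confirm cycle can be sketched as a simple control loop. This is a sketch under assumptions rather than the patented implementation: the callables `play`, `record`, `play_back`, and `poll_command` are hypothetical stand-ins for the modules, segments are assumed to be already split at the interruption points, and the user instruction is assumed to be a "repeat" command that restarts the current sentence from its saved breakpoint.

```python
def run_session(segments, play, record, play_back, poll_command):
    """Drive one training session over pre-split voice segments."""
    index = 0
    while index < len(segments):
        play(segments[index])        # listen: play the current sentence
        mic_file = record()          # repeat: adaptively record the user
        play_back(mic_file)          # confirm: replay the user's recording
        if poll_command() == "repeat":
            continue                 # assumed command: redo the same sentence
        index += 1                   # otherwise move on to the next sentence

# Usage with stand-in callables that only log what happened
events = []
commands = iter(["repeat", None, None])
run_session(
    ["sentence-1", "sentence-2"],
    play=lambda seg: events.append(("play", seg)),
    record=lambda: "MicFile",
    play_back=lambda f: events.append(("playback", f)),
    poll_command=lambda: next(commands, None),
)
played = [name for kind, name in events if kind == "play"]
print(played)  # ['sentence-1', 'sentence-1', 'sentence-2']
```

Passing the modules in as callables keeps the loop independent of any particular decoder or audio back end.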

The spoken language learning system effectively solves the problem of interactively training listening and speaking abilities at the same time in English learning. Using only ordinary audio files or network audio streams, it provides sentence-by-sentence cyclic training of listening, repeating, and confirming. It also supports repeated playback of a single sentence, which can significantly improve the efficiency of spoken language learning.

The embodiments described above are only preferred examples of the present invention and do not limit its scope of implementation. All equivalent changes or modifications made according to the structures, features, and principles described in the claims of the present invention shall therefore fall within the scope of this patent application.

Claims

1. A spoken language learning system, characterized in that said spoken language learning system comprises: an audio decoding module for decoding audio files, a voice interruption point search module for automatically computing and locating voice interruption points in the audio, an audio playback module for playing and replaying audio data, an adaptive recording module for adaptively recording the user's voice, and a recording playback module for playing back the recording,

The audio decoding module is connected with the voice interruption point search module, and the voice interruption point search module is connected with the audio playback module.

2. The spoken language learning system as claimed in claim 1, wherein the audio decoding module supports the decoding of audio files such as MP3 or MVA or online audio streams.

3. The spoken language learning system according to claim 1, wherein the audio decoding module supports reading decoded data of any length on each read.

4. The spoken language learning system as claimed in claim 1, 2 or 3, wherein the adaptive recording module has a noise reduction module supporting speech noise reduction processing.

5. The spoken language learning system as claimed in claim 4, wherein the adaptive recording module saves the voice to a recording file MicFile, and the recording playback module can automatically trigger playback of the recording file MicFile.

6. A method of operating the spoken language learning system as claimed in any one of claims 1-5, wherein said method of operation comprises:

Step 1, the audio decoding module decodes the audio file;

Step 2, the voice interruption point search module automatically calculates and finds the voice interruption point in the audio;

Step 3, the audio playback module plays and replays audio data;

Step 4, the adaptive recording module adaptively records the voice of the user;

Step 5. The recording playback module plays back the user's recording.

7. The operation method according to claim 6, wherein in step 2, the voice interruption point search module automatically computes and locates the voice interruption points in the decoded data stream, working either on the entire buffered audio data or on part of the data stream.

8. The operation method according to claim 6, characterized in that, in step 2, the voice interruption point search module uses an energy-threshold voice breakpoint detection algorithm.

9. The operation method according to claim 6, wherein in step 4, if no valid speech appears within a first duration T1, recording ends automatically; if valid speech appears within the first duration T1, silent-segment detection begins, and if a silent segment lasts for a second duration T2, recording ends automatically.

10. The operation method according to claim 6, characterized in that, after step 5, it further comprises the following step:

Step 6: the audio decoding module and the voice interruption point search module carry out subsequent data decoding and breakpoint detection.