JP2006308993A

JP2006308993A - Device for recording and reproducing speech signal

Info

Publication number: JP2006308993A
Application number: JP2005133383A
Authority: JP
Inventors: Mitsuru Hasegawa; 長谷川　　満
Original assignee: Teac Corp
Current assignee: Teac Corp
Priority date: 2005-04-28
Filing date: 2005-04-28
Publication date: 2006-11-09
Anticipated expiration: 2025-04-28
Also published as: JP4934990B2

Abstract

PROBLEM TO BE SOLVED: To automatically divide news programs and language teaching programs into files at silent portions and store them in a storage medium, in a radio receiver or similar device with a built-in AM/FM tuner.

SOLUTION: An audio signal level comparison unit 12 compares the input audio signal level with a prescribed level to detect a silent portion. A control unit 16 divides the audio signal into a file at the silent portion and stores the file in the storage medium 22 when the silent portion lasts for a certain time and the voice portion immediately before it has also lasted for a certain time. Silence data or index data may be forcibly added to the head of the file, and the head of the file may be faded in.

COPYRIGHT: (C)2007, JPO&INPIT

Description

The present invention relates to an audio signal recording/reproducing apparatus, and more particularly to an apparatus that detects silent parts of an input audio signal and divides the signal at those parts.

Techniques are known for detecting silent parts of an input audio signal, dividing the signal into a plurality of files, and storing them in a storage medium. For example, Patent Document 1 listed below discloses determining the silent parts of a voice signal by comparing its level with a predetermined level, dividing the voice signal into a plurality of call files at the silent parts, and recording them in a data recording unit.

Patent Document 2 discloses a technique for automatically detecting the start and end of audio data and applying fade-in or fade-out processing.

Patent Document 1: JP 2001-285461 A
Patent Document 2: JP 2001-228887 A

However, a configuration that simply detects silent parts of the input audio signal and divides it into files at every detected silent part has problems. In a news program, for example, a single news item that belongs together may be recorded in pieces because a silent part occurs in the middle of it. Whether a part is silent can be judged by whether the silent state continues for a certain time, but even a silent part of that length may, given the context, not be a place where the signal should be divided into different files. Dividing there causes the audio to break off partway during file playback, or causes playback to start in the middle of a sentence when a file is selected, and the user then has to edit or process the files separately.

Furthermore, when the head of each file is faded in on playback, there is no problem if the file holds music data, but for news items and the like the fade-in makes the opening speech hard to hear.

An object of the present invention is to provide an apparatus that automatically divides an input audio signal into a plurality of files for the user's convenience during playback, that is, to enable random access, and that can reproduce the audio signal without interruption even when fade-in processing is performed.

The present invention comprises: silence detection means for detecting a silent part of an input audio signal; first determination means for determining whether the silent part has continued for a first predetermined time or longer; second determination means for determining, when the silent part is determined to have continued for the first predetermined time or longer, whether the voice part immediately before the silent part has continued for a second predetermined time or longer; dividing means for dividing the input audio signal into files at the silent part and storing them in a storage medium when the voice part is determined to have continued for the second predetermined time or longer; user-operable operation means; and reproduction means for reproducing the input audio signal stored in the storage medium file by file in response to an operation signal from the operation means.

In the present invention, the input audio signal is not divided into files merely because a silent part is detected. It is divided and stored in the storage medium depending on whether the silent part continues for a certain time or longer and whether the voice part immediately before the silent part continues for a certain time or longer. That is, by dividing into files only when the silent part continues for a certain time or longer and the voice part immediately preceding it also continues for a certain time or longer, file division in the middle of a sentence can be suppressed with a simple configuration.

In the present invention, adding silence data or audio index data of a predetermined duration to the head of each divided file suppresses the effect on the original audio data even when fade-in processing is performed during file playback.

According to the present invention, an input audio signal can be automatically divided into a plurality of files for the user's convenience during playback, that is, random access becomes possible. In addition, the audio signal can be reproduced without interruption even when fade-in processing is performed.

An embodiment of the present invention is described below with reference to the drawings, taking a radio receiver as an example.

FIG. 1 is a block diagram showing the configuration of the audio signal recording/reproducing apparatus according to this embodiment. In this embodiment, the apparatus functions as a radio receiver with a built-in AM/FM tuner.

The AM/FM tuner 10 receives an AM or FM radio broadcast and outputs the received audio signal to the audio signal level comparison unit 12 and the audio processing unit 14.

The audio signal level comparison unit 12 compares the input audio signal from the AM/FM tuner 10 with a predetermined level and outputs the comparison result to the control unit 16. In this embodiment, the audio signal level comparison unit 12 judges the signal to be a voice part when its level is equal to or higher than the predetermined level, and a silent part when its level is lower than the predetermined level. The unit may output a signal or flag to the control unit 16 only when the input level is lower than the predetermined level. Alternatively, it may output a specific signal or flag when the input level is equal to or higher than the predetermined level.
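The comparison performed by the audio signal level comparison unit 12 can be sketched as follows. This is only an illustration: the text does not say how the level is measured, so the root-mean-square measure, the frame representation, and the names `rms_level` and `is_voice` are assumptions.

```python
import math

def rms_level(frame):
    """Root-mean-square level of one frame of samples in [-1.0, 1.0] (assumed measure)."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))

def is_voice(frame, threshold):
    """True for a voice part (level >= threshold), False for a silent part."""
    return rms_level(frame) >= threshold

quiet = [0.001, -0.002, 0.001, 0.0]   # hypothetical near-silent frame
loud = [0.3, -0.4, 0.35, -0.25]       # hypothetical speech frame
print(is_voice(quiet, 0.01), is_voice(loud, 0.01))  # prints: False True
```

The returned boolean plays the role of the flag that the unit passes to the control unit 16.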

The audio processing unit 14 converts the input audio signal into a digital signal, compresses it in an audio data format such as WAV or MP3, and outputs it to the control unit 16. The audio processing unit 14 also decompresses the audio signal supplied from the control unit 16, specifically the audio signal read from the storage medium 22 described later, and outputs it to the audio output unit 24 for reproduction.

The control unit 16 receives the comparison result from the audio signal level comparison unit 12 and the compressed digital audio data from the audio processing unit 14, and stores the compressed audio data in the storage medium 22 via the recording/reproduction control unit 20 in accordance with the comparison result. The control unit 16 detects a silent part of the input audio signal and determines whether it continues for a certain time T0 or longer. If so, the control unit judges that a significant silent part at which the signal can be divided into files exists, and divides the input audio signal into files and stores them in the storage medium 22. However, even when the silent part continues for T0 or longer, the control unit 16 does not perform file division if the voice part immediately before that silent part has not continued for a certain time T1 or longer. In that case the silent part, the voice part immediately before it, and the voice part immediately after it are stored together as one file. In other words, the control unit 16 makes two determinations, whether the silent part continues for a certain time or longer and whether a voice part that continues for a certain time or longer exists immediately before the silent part, and decides whether to divide the file according to both results.
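The two determinations reduce to a single predicate. The sketch below is a minimal illustration; the function name `should_split` is hypothetical, and the defaults of 2 seconds for T0 and 30 seconds for T1 are the example values given in this description.

```python
def should_split(silence_seconds, voice_seconds, t0=2.0, t1=30.0):
    """Divide the file only if both determinations succeed."""
    if silence_seconds < t0:       # first determination: silence shorter than T0
        return False
    return voice_seconds >= t1     # second determination: preceding voice at least T1

print(should_split(3.0, 45.0))  # long pause after a long item: True
print(should_split(3.0, 10.0))  # long pause after a short fragment: False
```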

The storage medium 22 is a known semiconductor memory such as a USB memory or flash memory, and stores, file by file, the input audio signal divided into a plurality of files by the control unit 16. The storage format is also known: for example, it has a file management unit and a data storage unit, where the file management unit holds the identification data (ID), file attributes, and the like of each divided file, and the data storage unit holds the input audio data divided file by file.

The operation unit 18 is operable by the user and may take any form, such as touch switches, operation buttons, or operation levers. The user operates the operation unit 18 to enter a playback instruction for a desired file. In response to the operation signal from the operation unit 18, the control unit 16 reads the corresponding file from the storage medium 22 and supplies it to the audio processing unit 14. The audio processing unit 14 decompresses the file and outputs it to the audio output unit 24.

Audio players are known that contain a hard disk or flash memory, store audio data input from a network or from portable media such as CDs, and play back music data in response to user operations. On these players the user plays music with the play button and cues the next or previous song with the fast-forward or back button. In this embodiment, news programs and language programs on AM/FM radio are automatically divided at their silent parts and stored in the storage medium 22 as a plurality of files. In a news program, for example, the announcer leaves a certain pause between items when reading out several items. Each item of the news program is therefore automatically split into its own file and stored in the storage medium 22, and the user can play back desired news items one after another with the same operation used to cue a song on an audio player, that is, with the fast-forward and back buttons.

FIG. 2 shows the concept of file division of the input audio signal in this embodiment. As shown in FIG. 2(a), the input audio signal consists of voice parts (sounded parts) and silent parts. When a silent part 102 follows a voice part 100 and continues for the certain time T0 or longer, the voice part 100 is split off at the silent part 102 as one file and stored in the storage medium 22, as shown in FIG. 2(b). That is, the voice part 100 and T0 worth of silence data from the silent part 102 are split off and stored as file 1. Part of the silent part 102 at the end of file 1 may be removed before storage in the storage medium 22. When a voice part 104 and a silent part 106 follow the silent part 102, the voice part 104 is likewise split off as a separate file and stored in the storage medium 22 if the silent part 106 continues for T0 or longer. In FIG. 2(b), the file containing the voice part 100 is shown as file 1 and the file containing the voice part 104 as file 2. The index number that serves as the identification data (ID) of each file is obtained by incrementing a counter each time a new voice part is detected.

A voice part 108 and a silent part 110 follow the silent part 106. The silent part 110 also continues for T0 or longer, but if the voice part 108 immediately before it has not continued for the certain time T1 or longer, the silent part 110 is judged not to be a significant silent part and no file division is performed. A voice part 112 and a silent part 114 follow the silent part 110. When the silent part 114 continues for T0 or longer and the total voice part immediately before it (here, the interval combining the voice parts 108 and 112) is T1 or longer, the file is divided and becomes file 3. Dividing at the silent part 110 would produce a file containing only the voice part 108, which in a news item may be too short to form a meaningful phrase or sentence. Securing a minimum voice duration, in other words a minimum unit for the divided files, makes it possible to divide the signal into meaningful files. The minimum duration T1 of the voice part may be entered by the user from the operation unit 18 and stored in the memory of the control unit 16, or may be stored there in advance as a default value. T1 is, for example, 30 seconds. The certain time T0 used to judge whether a silent part is significant is set to, for example, 2 seconds.
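The grouping of FIG. 2 can be reproduced on a list of already-detected segments. In the sketch below the reference numerals 100 to 114 are used as labels, while the segment durations and the function name `split_into_files` are invented for illustration. For simplicity each file keeps the whole trailing silent part rather than only T0 of it.

```python
T0 = 2.0   # minimum significant silence in seconds (example value from the text)
T1 = 30.0  # minimum voice duration per file in seconds (example value from the text)

# (label, kind, duration in seconds); durations are hypothetical.
segments = [
    ("100", "voice", 50.0), ("102", "silence", 3.0),
    ("104", "voice", 40.0), ("106", "silence", 2.5),
    ("108", "voice", 10.0), ("110", "silence", 3.0),
    ("112", "voice", 35.0), ("114", "silence", 4.0),
]

def split_into_files(segments, t0=T0, t1=T1):
    """Group segments into files, closing a file only at a significant silence."""
    files, current, voice_total = [], [], 0.0
    for label, kind, duration in segments:
        current.append(label)
        if kind == "voice":
            voice_total += duration          # total undivided voice so far
        elif duration >= t0 and voice_total >= t1:
            files.append(current)            # both conditions met: close the file
            current, voice_total = [], 0.0
    if current:
        files.append(current)                # remainder when recording stops
    return files

for number, parts in enumerate(split_into_files(segments), start=1):
    print(f"file {number}: {parts}")
```

With these durations the voice part 108 (10 seconds) is too short to be split off at the silent part 110, so file 3 contains the parts 108, 110, 112, and 114, matching the figure.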

FIG. 3 is a flowchart of the file division process in this embodiment. First, it is determined whether the input audio signal level is equal to or higher than a predetermined level (S101). The predetermined level may be stored in advance as a default value in the memory of the audio signal recording/reproducing apparatus, or may be settable by the user via the operation unit 18. In FM news programs and the like, background music (BGM) may be present behind the spoken news items. If the predetermined level is set low, the BGM may prevent the news items from being divided into separate files. In such a case the user sets a predetermined level that matches the BGM level, so that silent parts are detected and the news items are divided into files despite the BGM. A predetermined level for each radio station or each radio program may also be stored in the memory of the control unit 16, and the level corresponding to the station or program to be recorded in the storage medium 22 may be read from the memory and set.
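One way to hold a predetermined level per station or per program is a small lookup table with a fallback to the default. The sketch below is hypothetical throughout: the station and program names, the threshold values, and the function `threshold_for` are illustrative and do not come from the description.

```python
DEFAULT_LEVEL = 0.01  # hypothetical default threshold (linear amplitude)

# Hypothetical stored levels; a program with background music gets a higher value.
LEVELS = {
    ("station-a", None): 0.02,
    ("station-a", "evening-news"): 0.05,
}

def threshold_for(station, program=None):
    """Most specific entry wins: program, then station, then the default."""
    for key in ((station, program), (station, None)):
        if key in LEVELS:
            return LEVELS[key]
    return DEFAULT_LEVEL

print(threshold_for("station-a", "evening-news"))  # program-specific level
print(threshold_for("station-b"))                  # falls back to the default
```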

When the audio signal level is lower than the predetermined level and a silent part is thus detected, it is determined whether the detected silent part continues for the certain time T0 or longer (S102). If it does not, the silent part is judged unsuitable for file division and no division is performed. If it does, it is further determined whether the voice part immediately before this silent part (the total undivided voice part) has continued for the certain time T1 or longer (S103). This determination can be made, for example, by starting a timer at the beginning of the voice part, stopping it when S101 detects that the audio signal level has fallen below the predetermined level, and comparing the measured time with T1. Even if the silent part continues for T0 or longer, no file division is performed when the immediately preceding voice part has not continued for T1 or longer. Only when the immediately preceding voice part has continued for T1 or longer is the file divided at the silent part detected in S101 (S104). The processing of S104 can also be described as securing a minimum duration or minimum data length for each divided file. At file division, the file counter is incremented by 1 (S105) and the value is registered in the file management unit of the storage medium 22.
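Steps S101 to S105 can be sketched as a loop over per-frame levels with two timers. This is an illustration under assumptions, not the apparatus's implementation: the frame-based input, the function name `find_split_points`, and the choice to place the split T0 into the silent part are all hypothetical.

```python
def find_split_points(levels, threshold, frame_seconds, t0=2.0, t1=30.0):
    """Return the frame indices at which a new file would begin."""
    split_points = []
    voice_time = 0.0     # voice accumulated since the last split (timer for S103)
    silence_time = 0.0   # length of the current silent run (timer for S102)
    decided = False      # the current silent run has already been judged
    for i, level in enumerate(levels):
        if level >= threshold:             # S101: voice part
            voice_time += frame_seconds
            silence_time = 0.0
            decided = False
            continue
        silence_time += frame_seconds      # S101: silent part
        if decided or silence_time < t0:   # S102: silence not yet T0 long
            continue
        decided = True
        if voice_time >= t1:               # S103: preceding voice long enough
            split_points.append(i + 1)     # S104: next file starts after this frame
            voice_time = 0.0
    return split_points

# 40 s voice, 3 s silence, 10 s voice, 3 s silence, 25 s voice, 3 s silence (1 s frames)
levels = [0.5] * 40 + [0.0] * 3 + [0.5] * 10 + [0.0] * 3 + [0.5] * 25 + [0.0] * 3
points = find_split_points(levels, threshold=0.1, frame_seconds=1.0)
print(points)           # [42, 83]
print(len(points))      # number of file-counter increments (S105): 2
```

The second pause produces no split because only 10 seconds of voice precede it; that voice is carried over and combined with the following 25 seconds, so the third pause does produce a split.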

In this way the input audio signal is automatically divided into a plurality of files and stored in the storage medium 22. For example, when 10 news items are read out in a news program, they are stored as 10 separate files. Each file is given an index number, and by operating the operation unit 18 the user can play back any file by random access.

When the audio signal recording/reproducing apparatus has a function for playing music data and, as in the prior art, fades in the head of each song on playback, the head of a news item file may be cut off by the fade-in, making the speech hard to hear.

Therefore, as shown in FIG. 4, the control unit 16 may automatically divide the input audio signal into a plurality of files at the silent parts and, in addition, forcibly add silence data of a predetermined duration to the head of each divided file before storing it in the storage medium 22. In FIG. 4(b), the input audio signal is divided into files 1, 2, and 3 as in FIG. 2, after which silence data 200 is added to file 1, silence data 202 to file 2, and silence data 204 to file 3. With silence data of a predetermined duration forcibly added to the head of each of files 1 to 3, even when fade-in processing that gradually raises the output from 0% to 100% is applied to the head of each file as shown in FIG. 5, the fade-in completes within the added silence data. As a result the original voice part is not faded in and is reproduced at 100% output from its start. The original speech is therefore reproduced without interruption even when fade-in processing is applied. The duration or data length of the silence data 200, 202, and 204 is set to approximately the time required for the fade-in processing.
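The effect of the added silence data can be checked numerically. The sketch below assumes a linear fade-in and uses a deliberately tiny sample rate so that the values are easy to read; the function names `prepend_silence` and `apply_fade_in` and all numeric values are hypothetical.

```python
def prepend_silence(samples, sample_rate, fade_seconds):
    """Prepend silence lasting about as long as the player's fade-in."""
    pad = [0.0] * round(sample_rate * fade_seconds)
    return pad + list(samples)

def apply_fade_in(samples, sample_rate, fade_seconds):
    """Linear fade-in from 0% to 100% gain (assumed fade shape)."""
    n = round(sample_rate * fade_seconds)
    return [s * min(1.0, i / n) if n else s for i, s in enumerate(samples)]

rate, fade = 8, 0.5      # tiny hypothetical values: the fade spans 4 samples
speech = [1.0] * 6       # stand-in for the start of a news item

plain = apply_fade_in(speech, rate, fade)
padded = apply_fade_in(prepend_silence(speech, rate, fade), rate, fade)

print(plain)        # [0.0, 0.25, 0.5, 0.75, 1.0, 1.0]  opening speech attenuated
print(padded[4:])   # [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]    speech starts at full level
```

Without padding, the first four speech samples are attenuated; with padding of the same length as the fade, the fade is used up on the silence and the speech is untouched.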

In FIG. 4, silence data 200, 202, and 204 is added to the head of each of the automatically divided files 1, 2, and 3, but other data (including dummy data) may be added instead of silence data to suppress the loss of sound caused by the fade-in processing.

For example, as shown in FIG. 6, audio index data may be added to the head of each automatically divided file. This audio index data is the identification data (ID) of each file, set in the processing of S105 in FIG. 3, added in the form of audio data. In FIG. 6(b), the audio data "index 1" is added. When a file with audio index data is played, the spoken "index 1" is followed by the original audio data. Because the fade-in is applied to the audio index portion, "index 1" is played with the fade-in and the original voice part is output after it. With this processing, the problem of the opening speech being hard to hear is reliably eliminated even in an apparatus that fades in the head of each file.

In this embodiment, a USB flash audio player can also be used as the storage medium 22. In that case, the control unit 16 in FIG. 1 and the USB flash audio player are connected via a USB terminal. Various USB flash audio players are on the market, but many lack fast-forward and rewind playback and have only buttons for skipping to the previous or next song. Even with such a player, this embodiment automatically divides the content of a radio program into files and downloads them to the player, which has the advantage of allowing pseudo fast-forward and rewind playback.

FIG. 1 is a configuration block diagram of the audio signal recording/reproducing apparatus according to the embodiment.
FIG. 2 is an explanatory diagram of file division of the input audio signal in the embodiment.
FIG. 3 is a flowchart of the file division process of the embodiment.
FIG. 4 is an explanatory diagram of file division in another embodiment.
FIG. 5 is an explanatory diagram of fade-in processing in a silent portion.
FIG. 6 is an explanatory diagram of fade-in in an audio index portion.

Explanation of symbols

10 AM/FM tuner, 12 audio signal level comparison unit, 14 audio processing unit, 16 control unit, 18 operation unit, 20 recording/reproduction control unit, 22 storage medium, 24 audio output unit.

Claims

An audio signal recording/reproducing apparatus comprising:
silence detection means for detecting a silent part of an input audio signal;
first determination means for determining whether the silent part continues for a first predetermined time or longer;
second determination means for determining, when the silent part is determined to continue for the first predetermined time or longer, whether the voice part immediately before the silent part continues for a second predetermined time or longer;
dividing means for dividing the input audio signal into files at the silent part and storing the files in a storage medium when the voice part is determined to continue for the second predetermined time or longer;
user-operable operation means; and
reproduction means for reproducing the input audio signal stored in the storage medium file by file in response to an operation signal from the operation means.

The apparatus of claim 1,
wherein the dividing means adds silence data that continues for a predetermined time to the head of the file and stores the file in the storage medium.

The apparatus of claim 1,
wherein the dividing means adds audio index data that continues for a predetermined time to the head of the file and stores the file in the storage medium.

The apparatus according to claim 2 or 3,
wherein the predetermined time is set to be substantially the same as the fade-in time applied when the file is reproduced by the reproduction means.