JP2022054315A

JP2022054315A - Sound processor, control method, and program

Info

Publication number: JP2022054315A
Application number: JP2020161435A
Authority: JP
Inventors: 吉信飯田; Yoshinobu Iida
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2020-09-25
Filing date: 2020-09-25
Publication date: 2022-04-06
Anticipated expiration: 2040-09-25
Also published as: JP7604154B2

Abstract

To effectively reduce noise. SOLUTION: A sound processor comprises: imaging means; a first microphone; a second microphone; means for performing Fourier transform on sound signals from the first microphone to generate first sound signals; means for performing Fourier transform on sound signals from the second microphone to generate second sound signals; means for generating noise data using the second sound signals and a first parameter related to noise of a noise source; subtraction means for subtracting the noise data from the first sound signals; transform means for performing inverse Fourier transform on sound signals from the subtraction means; record means for recording, on a recording medium, moving image data generated by the imaging means and sound signals from the transform means, as moving image data with sounds; and update means for updating the first parameter using a parameter generated with the first sound signals and the second sound signals in a state in which recording of the moving image data with sounds is not being executed by the record means. SELECTED DRAWING: Figure 13

Description

The present invention relates to an audio processing device capable of reducing noise contained in audio data.

When recording moving image data, a digital camera, which is an example of an audio processing device, can also record the ambient sound. The digital camera also has an autofocus function that drives an optical lens to focus on a subject while moving image data is being recorded, and a function that drives the optical lens to zoom while a moving image is being recorded.

When the optical lens is driven during moving image recording in this way, the driving sound of the optical lens may be included as noise in the audio recorded together with the moving image. Conventionally, therefore, when a digital camera picks up a sliding sound or the like generated by driving the optical lens as noise, it can reduce that noise and record the ambient sound. Patent Document 1 discloses a digital camera that reduces noise by the spectral subtraction method.

Japanese Unexamined Patent Publication No. 2011-205527

However, in Patent Document 1, the digital camera creates a noise pattern from noise collected by the microphone that records the ambient sound, so it may be unable to obtain an accurate noise pattern for the sliding sound generated inside the housing of the optical lens. In that case, the digital camera may be unable to effectively reduce the noise contained in the collected audio.

Therefore, an object of the present invention is to effectively reduce noise.

The audio processing device comprises: imaging means; a first microphone for acquiring environmental sound; a second microphone for acquiring sound from a noise source; first transform means for Fourier-transforming an audio signal from the first microphone to generate a first audio signal; second transform means for Fourier-transforming an audio signal from the second microphone to generate a second audio signal; generation means for generating noise data using the second audio signal and a first parameter related to noise of the noise source; subtraction means for subtracting the noise data from the first audio signal; third transform means for inverse-Fourier-transforming the audio signal from the subtraction means; recording means for recording, on a recording medium, the moving image data generated by the imaging means and the audio signal from the third transform means as moving image data with audio; and update means for, in a state in which the recording means is not recording moving image data with audio, generating a parameter related to the noise of the noise source using the first audio signal and the second audio signal, and updating the first parameter using the generated parameter.
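The signal path described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the claimed implementation: the function names, the use of one scaling value per frequency bin as the "first parameter", the amplitude ratio used to estimate a new parameter, and the blending factor `alpha` are all hypothetical choices that the text above does not specify. A naive DFT is used so that the example has no dependencies.

```python
import cmath
import math


def dft(frame):
    """Naive discrete Fourier transform of a real-valued frame."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Naive inverse DFT; returns the real part of each output sample."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
            for t in range(n)]


def reduce_noise(first_frame, second_frame, param):
    """Subtract noise data (second-mic amplitude times parameter) from the first mic."""
    first = dft(first_frame)      # first audio signal (environmental-sound microphone)
    second = dft(second_frame)    # second audio signal (noise-source microphone)
    cleaned = []
    for x1, x2, p in zip(first, second, param):
        noise = abs(x2) * p                       # noise data for this bin (assumed form)
        magnitude = max(abs(x1) - noise, 0.0)     # amplitude subtraction, floored at zero
        cleaned.append(cmath.rect(magnitude, cmath.phase(x1)))  # keep the original phase
    return idft(cleaned)


def update_param(first_frame, second_frame, param, recording, alpha=0.5, eps=1e-9):
    """Update the parameter only while no recording is in progress."""
    if recording:
        return list(param)
    first = dft(first_frame)
    second = dft(second_frame)
    updated = []
    for x1, x2, p in zip(first, second, param):
        if abs(x2) > eps:
            estimate = abs(x1) / abs(x2)          # hypothetical estimate: amplitude ratio
            updated.append((1 - alpha) * p + alpha * estimate)
        else:
            updated.append(p)                     # no noise energy in this bin: keep old value
    return updated
```

In this sketch the parameter is learned from frames captured while nothing is being recorded and is then applied during recording, which mirrors the ordering of the update means and the subtraction means in the paragraph above.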

The audio processing device of the present invention can effectively reduce noise.

A perspective view of the image pickup apparatus in the first embodiment.
A block diagram showing the configuration of the image pickup apparatus in the first embodiment.
A block diagram showing the configuration of the audio input unit of the image pickup apparatus in the first embodiment.
A diagram showing the arrangement of the microphones in the audio input unit of the image pickup apparatus in the first embodiment.
A diagram showing the noise parameters in the first embodiment.
A diagram showing the frequency spectrum of the audio and the frequency spectrum of the noise parameter when a driving sound occurs in a situation where there can be considered to be no environmental sound, in the first embodiment.
A diagram showing the frequency spectrum of the audio when a driving sound occurs in a situation where there is environmental sound, in the first embodiment.
A block diagram showing the configuration of the noise parameter selection unit in the first embodiment.
A timing chart of the audio noise reduction processing in the first embodiment.
A diagram showing the frequency spectrum of the audio when a driving sound occurs in a situation where there is environmental sound, in the first embodiment.
A block diagram showing the configuration of the noise parameter selection unit in the first embodiment.
A timing chart of the noise parameter update method in the first embodiment.
A flowchart of the noise parameter update processing in the first embodiment.
A timing chart of the noise parameter update processing in the first embodiment.
A flowchart of the noise parameter update processing in the second embodiment.
A timing chart of the noise parameter update processing in the second embodiment.

Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings.

[First Embodiment]
<External view of the image pickup apparatus 100>
FIGS. 1A and 1B show an example of an external view of an image pickup apparatus 100 as an example of an audio processing device to which the present invention can be applied. FIG. 1A is an example of a front perspective view of the image pickup apparatus 100. FIG. 1B is an example of a rear perspective view of the image pickup apparatus 100. In FIG. 1, an optical lens (not shown) is attached to the lens mount 301.

The display unit 107 displays image data, character information, and the like. The display unit 107 is provided on the back of the image pickup apparatus 100. The out-of-finder display unit 43 is a display unit provided on the top surface of the image pickup apparatus 100 and displays setting values of the image pickup apparatus 100 such as the shutter speed and aperture value. The eyepiece finder 16 is a look-through finder. By observing the focusing screen in the eyepiece finder 16, the user can check the focus and composition of the optical image of the subject.

The release switch 61 is an operation member with which the user gives a shooting instruction. The mode changeover switch 60 is an operation member with which the user switches between various modes. The main electronic dial 71 is a rotary operation member. By turning the main electronic dial 71, the user can change setting values of the image pickup apparatus 100 such as the shutter speed and aperture value. The release switch 61, the mode changeover switch 60, and the main electronic dial 71 are included in the operation unit 112.

The power switch 72 is an operation member that switches the power of the image pickup apparatus 100 on and off. The sub electronic dial 73 is a rotary operation member. With the sub electronic dial 73, the user can move the selection frame displayed on the display unit 107, advance images in the playback mode, and so on. The cross key 74 is a four-way key whose up, down, left, and right portions can each be pressed. The image pickup apparatus 100 executes processing according to the pressed portion (direction) of the cross key 74. The power switch 72, the sub electronic dial 73, and the cross key 74 are included in the operation unit 112.

The SET button 75 is a push button used mainly by the user to confirm a selection item displayed on the display unit 107. The LV button 76 is a button used to switch live view (hereinafter, LV) on and off. In the moving image recording mode, the LV button 76 is used to instruct the start and stop of moving image shooting (recording). The enlargement button 77 is a push button for turning the enlargement mode on and off in the live view display of the shooting mode and for changing the enlargement ratio during the enlargement mode. The SET button 75, the LV button 76, and the enlargement button 77 are included in the operation unit 112.

In the playback mode, the enlargement button 77 functions as a button for increasing the enlargement ratio of the image data displayed on the display unit 107. The reduction button 78 is a button for decreasing the enlargement ratio of image data displayed enlarged on the display unit 107. The play button 79 is an operation button for switching between the shooting mode and the playback mode. When the user presses the play button 79 during the shooting mode, the image pickup apparatus 100 shifts to the playback mode and displays the image data recorded on the recording medium 110 on the display unit 107. The reduction button 78 and the play button 79 are included in the operation unit 112.

The quick return mirror 12 (hereinafter, mirror 12) is a mirror for switching the light flux entering from the optical lens mounted on the image pickup apparatus 100 so that it is directed to either the eyepiece finder 16 side or the image pickup unit 101 side. During exposure, live view shooting, and moving image shooting, the mirror 12 is moved up and down by an actuator (not shown) controlled by the control unit 111. The mirror 12 is normally positioned so that the light flux enters the eyepiece finder 16. When shooting is performed and during live view display, the mirror 12 flips upward so that the light flux enters the image pickup unit 101 (mirror up). The central portion of the mirror 12 is a half mirror. Part of the light flux transmitted through the central portion of the mirror 12 enters a focus detection unit (not shown) for performing focus detection.

The communication terminal 10 is a terminal through which the optical lens 300 mounted on the image pickup apparatus 100 and the image pickup apparatus 100 communicate. The terminal cover 40 is a cover that protects a connector (not shown) for a connection cable or the like that connects an external device to the image pickup apparatus 100. The lid 41 is the lid of the slot that houses the recording medium 110. The lens mount 301 is a mounting portion to which the optical lens 300 (not shown) can be attached.

The L microphone 201a and the R microphone 201b are microphones for picking up environmental sound such as the user's voice. Viewed from the back of the image pickup apparatus 100, the L microphone 201a is arranged on the left side and the R microphone 201b on the right side.

<Configuration of the image pickup apparatus 100>
FIG. 2 is a block diagram showing an example of the configuration of the image pickup apparatus 100 in this embodiment.

The optical lens 300 is a lens unit that can be attached to and detached from the image pickup apparatus 100. For example, the optical lens 300 is a zoom lens or a varifocal lens. The optical lens 300 has an optical lens, a motor for driving the optical lens, and a communication unit that communicates with the lens control unit 102 of the image pickup apparatus 100 described later. By moving the optical lens with the motor based on a control signal received by the communication unit, the optical lens 300 can focus on and zoom in on a subject and correct camera shake.

The image pickup unit 101 has an image pickup element for converting the optical image of a subject formed on the imaging surface through the optical lens 300 into an electric signal, and an image processing unit that generates and outputs image data or moving image data from the electric signal generated by the image pickup element. The image pickup element is, for example, a CCD (Charge Coupled Device) or a CMOS (Complementary Metal Oxide Semiconductor). In this embodiment, the series of processes in which the image pickup unit 101 generates image data, including still image data and moving image data, and outputs it is referred to as "shooting". In the image pickup apparatus 100, image data is recorded on the recording medium 110 described later in accordance with the DCF (Design rule for Camera File system) standard.

Based on the data output from the image pickup unit 101 and a control signal output from the control unit 111 described later, the lens control unit 102 transmits a control signal to the optical lens 300 via the communication terminal 10 to control the optical lens 300. The lens control unit 102 also receives lens information from the optical lens 300 mounted on the image pickup apparatus 100. The lens information is, for example, the lens type, the lens model number, the zoom magnification, and the type of noise source.

The information acquisition unit 103 detects the tilt of the image pickup apparatus 100, the temperature inside the housing of the image pickup apparatus 100, and the like. For example, the information acquisition unit 103 detects the tilt of the image pickup apparatus 100 with an acceleration sensor or a gyro sensor, and detects the temperature inside the housing of the image pickup apparatus 100 with a temperature sensor.

The audio input unit 104 generates audio data from sound acquired by microphones. The audio input unit 104 acquires sound around the image pickup apparatus 100 with the microphones, performs analog-to-digital conversion (A/D conversion) and various kinds of audio processing on the acquired sound, and generates audio data. In this embodiment, the audio input unit 104 has the microphones. A detailed configuration example of the audio input unit 104 will be described later.

The volatile memory 105 temporarily records the image data generated by the image pickup unit 101 and the audio data generated by the audio input unit 104. The volatile memory 105 is also used as a temporary recording area for image data displayed on the display unit 107, a work area for the control unit 111, and the like.

The display control unit 106 performs control so that image data output from the image pickup unit 101, characters for interactive operation, menu screens, and the like are displayed on the display unit 107. During still image shooting and moving image shooting, the display control unit 106 can also make the display unit 107 function as an electronic viewfinder by performing control so that the digital data output from the image pickup unit 101 is sequentially displayed on the display unit 107. The display unit 107 is, for example, a liquid crystal display or an organic EL display. The display control unit 106 can also perform control so that image data and moving image data output from the image pickup unit 101, characters for interactive operation, menu screens, and the like are displayed on an external display via the external output unit 115 described later.

The coding processing unit 108 can encode the image data and the audio data temporarily recorded in the volatile memory 105. For example, the coding processing unit 108 can generate still image data by encoding and compressing image data in accordance with the JPEG standard or a RAW image format. It can generate moving image data by encoding and compressing moving image data in accordance with the MPEG2 standard or the H.264/MPEG4-AVC standard. It can also generate audio data by encoding and compressing audio data in accordance with the AC3, AAC, or ATRAC standard or the ADPCM method. The coding processing unit 108 may also encode audio data without compressing it, for example in accordance with the linear PCM method.

The recording control unit 109 can record data on the recording medium 110 and read data from the recording medium 110. For example, the recording control unit 109 can record the still image data, moving image data, and audio data generated by the coding processing unit 108 on the recording medium 110 and read them from the recording medium 110. The recording medium 110 is, for example, an SD card, a CF card, an XQD memory card, an HDD (magnetic disk), an optical disc, or a semiconductor memory. The recording medium 110 may be configured to be attachable to and detachable from the image pickup apparatus 100, or may be built into the image pickup apparatus 100. In other words, the recording control unit 109 need only have at least a means of accessing the recording medium 110.

The control unit 111 controls each component of the image pickup apparatus 100 via the data bus 116 in accordance with input signals and a program described later. The control unit 111 has a CPU, a ROM, and a RAM for executing various kinds of control. Instead of the control unit 111 controlling the entire image pickup apparatus 100, multiple pieces of hardware may share the control of the entire apparatus. The ROM of the control unit 111 stores a program for controlling each component. The RAM of the control unit 111 is a volatile memory used for arithmetic processing and the like.

The operation unit 112 is a user interface for accepting instructions to the image pickup apparatus 100 from the user. The operation unit 112 has, for example, the power switch 72 for turning the power of the image pickup apparatus 100 on or off, the release switch 61 for instructing shooting, a play button for instructing playback of image data or moving image data, and the mode changeover switch 60.

The operation unit 112 outputs a control signal to the control unit 111 in response to a user operation. A touch panel formed on the display unit 107 can also be included in the operation unit 112. The release switch 61 has SW1 and SW2. SW1 turns on when the release switch 61 is in the so-called half-pressed state. With this, a preparation instruction is accepted for performing shooting preparation operations such as AF (autofocus) processing, AE (automatic exposure) processing, AWB (auto white balance) processing, and EF (flash pre-emission) processing. SW2 turns on when the release switch 61 is in the so-called fully pressed state. With this user operation, an imaging instruction for performing the imaging operation is accepted. The operation unit 112 also includes an operation member (for example, a button) for adjusting the volume of audio data played back from the speaker 114 described later.
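The two-stage behavior of the release switch 61 can be pictured with a small Python sketch. It is an assumption-laden illustration: the function name, the returned labels, and the rule that SW2 takes priority when both contacts are on are hypothetical and are not stated in the text.

```python
# Shooting preparation operations named in the text for the half-pressed state.
PREPARATION_STEPS = ("AF", "AE", "AWB", "EF")


def release_instruction(sw1_on, sw2_on):
    """Map the two release-switch contacts to an instruction (hypothetical labels)."""
    if sw2_on:
        # Fully pressed: an imaging instruction is accepted.
        return ("capture", ())
    if sw1_on:
        # Half pressed: a preparation instruction is accepted.
        return ("prepare", PREPARATION_STEPS)
    # Neither contact is on: nothing to do.
    return ("idle", ())
```

Checking SW2 first keeps the sketch well defined even if a real switch reports both contacts as on during a full press.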

The audio output unit 113 can output audio data to the speaker 114 and the external output unit 115. The audio data input to the audio output unit 113 is audio data read from the recording medium 110 by the recording control unit 109, audio data output from the non-volatile memory 117, and audio data output from the coding processing unit 108. The speaker 114 is an electroacoustic transducer capable of playing back audio data.

The external output unit 115 can output image data, moving image data, audio data, and the like to an external device. The external output unit 115 is composed of, for example, a video terminal, a microphone terminal, and a headphone terminal.

The data bus 116 is a data bus for transmitting various data, such as audio data, moving image data, and image data, and various control signals to each block of the image pickup apparatus 100.

The non-volatile memory 117 stores programs and the like, described later, that are executed by the control unit 111. Audio data is also recorded in the non-volatile memory 117. This audio data is, for example, audio data for electronic sounds such as a focusing sound output when the subject is in focus, an electronic shutter sound output when shooting is instructed, and an operation sound output when the image pickup apparatus 100 is operated.

<Operation of the image pickup apparatus 100>
The operation of the image pickup apparatus 100 of this embodiment will now be described.

When the user operates the power switch 72 to turn the power on, the image pickup apparatus 100 of this embodiment supplies power from a power source (not shown) to each of its components. The power source is, for example, a battery such as a lithium-ion battery or an alkaline manganese dry battery.

When power is supplied, the control unit 111 determines, based on the state of the mode changeover switch 60, which mode to operate in, for example the shooting mode or the playback mode. In the moving image recording mode, the control unit 111 records the moving image data output from the image pickup unit 101 and the audio data output from the audio input unit 104 as a single set of moving image data with audio. In the playback mode, the control unit 111 performs control so that image data or moving image data recorded on the recording medium 110 is read by the recording control unit 109 and displayed on the display unit 107.

First, the moving image recording mode will be described. In the moving image recording mode, the control unit 111 first transmits a control signal to each component of the image pickup apparatus 100 so as to shift the image pickup apparatus 100 to the shooting standby state. For example, the control unit 111 controls the image pickup unit 101 and the audio input unit 104 to operate as follows.

The image pickup unit 101 converts the optical image of the subject formed on the imaging surface through the optical lens 300 into an electric signal, and generates moving image data from the electric signal generated by the image pickup element. The image pickup unit 101 then transmits the moving image data to the display control unit 106, and the data is displayed on the display unit 107. The user can prepare for shooting while viewing the moving image data displayed on the display unit 107.

The audio input unit 104 A/D-converts each of the analog audio signals input from the plurality of microphones to generate a plurality of digital audio signals. The audio input unit 104 then generates audio data of a plurality of channels from the plurality of digital audio signals. The audio input unit 104 transmits the generated audio data to the audio output unit 113, and the audio data is played back from the speaker 114. While listening to the audio data played back from the speaker 114, the user can use the operation unit 112 to adjust the volume of the audio data to be recorded in the moving image data with audio.

Next, in response to the user pressing the LV button 76, the control unit 111 transmits an instruction signal for starting shooting to each component of the image pickup apparatus 100. For example, the control unit 111 controls the image pickup unit 101, the audio input unit 104, the coding processing unit 108, and the recording control unit 109 to perform the following operations.

The image pickup unit 101 converts the optical image of the subject, formed on the image pickup surface through the optical lens 300, into an electric signal, and generates moving image data from the electric signal generated by the image pickup element. The image pickup unit 101 then transmits the moving image data to the display control unit 106, and the display unit 107 displays it. The image pickup unit 101 also transmits the generated moving image data to the volatile memory 105.

The audio input unit 104 performs A/D conversion on each of the analog audio signals input from the plurality of microphones to generate a plurality of digital audio signals. The audio input unit 104 then generates multi-channel audio data from the plurality of digital audio signals and transmits the generated audio data to the volatile memory 105.

The coding processing unit 108 reads out the moving image data and the audio data temporarily recorded in the volatile memory 105 and encodes each of them. The control unit 111 generates a data stream from the moving image data and the audio data encoded by the coding processing unit 108, and outputs the data stream to the recording control unit 109. The recording control unit 109 records the input data stream on the recording medium 110 as moving image data with audio, in accordance with a file system such as UDF or FAT.

Each component of the image pickup apparatus 100 continues the above operations throughout moving image shooting.

Then, in response to the user pressing the LV button 76, the control unit 111 transmits an instruction signal for ending shooting to each component of the image pickup apparatus 100. For example, the control unit 111 controls the image pickup unit 101, the audio input unit 104, the coding processing unit 108, and the recording control unit 109 to perform the following operations.

The image pickup unit 101 stops generating moving image data. The audio input unit 104 stops generating audio data.

The coding processing unit 108 reads out and encodes the remaining moving image data and audio data recorded in the volatile memory 105. The control unit 111 generates a data stream from the moving image data and the audio data encoded by the coding processing unit 108, and outputs the data stream to the recording control unit 109.

The recording control unit 109 records the data stream on the recording medium 110 as a file of moving image data with audio, in accordance with a file system such as UDF or FAT. When input of the data stream stops, the recording control unit 109 completes the moving image data with audio. Once the moving image data with audio is complete, the recording operation of the image pickup apparatus 100 stops.

In response to the recording operation stopping, the control unit 111 transmits a control signal to each component of the image pickup apparatus 100 to shift to the shooting standby state. In this way, the control unit 111 controls the image pickup apparatus 100 to return to the shooting standby state.

Next, the reproduction mode will be described. In the reproduction mode, the control unit 111 transmits a control signal to each component of the image pickup apparatus 100 to shift to the reproduction state. For example, the control unit 111 controls the coding processing unit 108, the recording control unit 109, the display control unit 106, and the audio output unit 113 to perform the following operations.

The recording control unit 109 reads out the moving image data with audio recorded on the recording medium 110 and transmits it to the coding processing unit 108.

The coding processing unit 108 decodes the image data and the audio data from the moving image data with audio. The coding processing unit 108 transmits the decoded moving image data to the display control unit 106 and the decoded audio data to the audio output unit 113.

The display control unit 106 displays the decoded image data on the display unit 107. The audio output unit 113 reproduces the decoded audio data through the speaker 114.

As described above, the image pickup apparatus 100 of this embodiment can record and reproduce image data and audio data.

In this embodiment, the audio input unit 104 executes audio processing, such as adjusting the level of the audio signals input from the microphones. In this embodiment, the audio input unit 104 executes this audio processing in response to the start of moving image recording. Note that this audio processing may instead be executed once the power of the image pickup apparatus 100 is turned on. It may also be executed in response to selection of a shooting mode, or in response to selection of a mode related to audio recording, such as the moving image recording mode or a voice memo function. It may also be executed in response to the start of recording of the audio signal.

<Configuration of audio input unit 104>
FIG. 3 is a block diagram showing an example of the detailed configuration of the audio input unit 104 in this embodiment.

In this embodiment, the audio input unit 104 has three microphones: an L microphone 201a, an R microphone 201b, and a noise microphone 201c. The L microphone 201a and the R microphone 201b are each an example of a first microphone. In this embodiment, the image pickup apparatus 100 picks up environmental sound with the L microphone 201a and the R microphone 201b, and records the audio signals input from them in stereo. Environmental sound is sound generated outside the housing of the image pickup apparatus 100 and outside the housing of the optical lens 300, such as the user's voice, animal calls, the sound of rain, and music.

The noise microphone 201c is an example of a second microphone. The noise microphone 201c is a microphone for picking up noise, such as drive sound, from a predetermined noise source inside the housing of the image pickup apparatus 100 and inside the housing of the optical lens 300. The noise source is, for example, a motor such as an ultrasonic motor (hereinafter, USM) or a stepping motor (hereinafter, STM). The noise is, for example, vibration sound generated by driving a motor such as a USM or an STM. The motor is driven, for example, in AF processing for focusing on the subject. The image pickup apparatus 100 uses the noise microphone 201c to acquire noise such as drive sound generated inside the housing of the image pickup apparatus 100 and inside the housing of the optical lens 300, and uses the audio data of the acquired noise to generate the noise parameters described later. In this embodiment, the L microphone 201a, the R microphone 201b, and the noise microphone 201c are omnidirectional microphones. An arrangement example of the L microphone 201a, the R microphone 201b, and the noise microphone 201c in this embodiment will be described later with reference to FIG. 4.

The L microphone 201a, the R microphone 201b, and the noise microphone 201c each generate an analog audio signal from the sound they pick up and input it to the A/D conversion unit 202. Here, the audio signal input from the L microphone 201a is referred to as Lch, the audio signal input from the R microphone 201b as Rch, and the audio signal input from the noise microphone 201c as Nch.

The A/D conversion unit 202 converts the analog audio signals input from the L microphone 201a, the R microphone 201b, and the noise microphone 201c into digital audio signals, and outputs the converted digital audio signals to the FFT unit 203. In this embodiment, the A/D conversion unit 202 converts the analog audio signals into digital audio signals by sampling at a sampling frequency of 48 kHz and a bit depth of 16 bits.

The FFT unit 203 applies a fast Fourier transform to the time-domain digital audio signal input from the A/D conversion unit 202, converting it into a frequency-domain digital audio signal. In this embodiment, the frequency-domain digital audio signal has a frequency spectrum of 1024 points in the frequency band from 0 Hz to 48 kHz. Of these, the frequency band from 0 Hz to 24 kHz, the Nyquist frequency, contains 513 points. In this embodiment, the image pickup apparatus 100 performs noise reduction processing using the 513-point frequency spectrum from 0 Hz to 24 kHz out of the audio data output from the FFT unit 203.
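The relationship between the 1024-point transform, the 48 kHz sampling frequency, and the 513 usable points can be checked with a few lines of arithmetic. The following is only an illustrative sketch; the function names are hypothetical and do not appear in this embodiment.

```python
SAMPLE_RATE_HZ = 48_000  # sampling frequency stated in this embodiment
FFT_SIZE = 1024          # number of FFT points stated in this embodiment


def usable_bin_count(fft_size: int = FFT_SIZE) -> int:
    """Number of points from 0 Hz up to and including the Nyquist frequency."""
    return fft_size // 2 + 1


def bin_frequency_hz(index: int,
                     sample_rate_hz: int = SAMPLE_RATE_HZ,
                     fft_size: int = FFT_SIZE) -> float:
    """Centre frequency of the spectrum point at the given array index."""
    return index * sample_rate_hz / fft_size


if __name__ == "__main__":
    print(usable_bin_count())     # 513
    print(bin_frequency_hz(0))    # 0.0
    print(bin_frequency_hz(512))  # 24000.0
```

Under these values, adjacent points are 46.875 Hz apart, index 0 corresponds to 0 Hz, and index 512 corresponds to 24 kHz.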

Here, the frequency spectrum of the fast-Fourier-transformed Lch is represented by 513-point array data, Lch_Before[0] to Lch_Before[512]. This array data is collectively referred to as Lch_Before. Likewise, the frequency spectrum of the fast-Fourier-transformed Rch is represented by 513-point array data, Rch_Before[0] to Rch_Before[512]. This array data is collectively referred to as Rch_Before. In this embodiment, Lch_Before[0] is the frequency spectrum of sound at 0 Hz, and Lch_Before[512] is the frequency spectrum of sound at 24 kHz. Lch_Before and Rch_Before are each an example of first frequency spectrum data.

The frequency spectrum of the fast-Fourier-transformed Nch is represented by 513-point array data, Nch_Before[0] to Nch_Before[512]. This array data is collectively referred to as Nch_Before. Nch_Before is an example of second frequency spectrum data.

The noise data generation unit 204 generates, based on Nch_Before, data for reducing the noise contained in Lch_Before and Rch_Before. In this embodiment, the noise data generation unit 204 uses noise parameters to generate array data NL[0] to NL[512] for reducing the noise contained in Lch_Before[0] to Lch_Before[512], respectively. The noise data generation unit 204 also generates array data NR[0] to NR[512] for reducing the noise contained in Rch_Before[0] to Rch_Before[512], respectively. The frequency points of the array data NL[0] to NL[512] are the same as those of the array data Lch_Before[0] to Lch_Before[512]. Likewise, the frequency points of the array data NR[0] to NR[512] are the same as those of the array data Rch_Before[0] to Rch_Before[512].

The array data NL[0] to NL[512] is collectively referred to as NL, and the array data NR[0] to NR[512] is collectively referred to as NR. NL and NR are each an example of third frequency spectrum data.

The noise parameter recording unit 205 records the noise parameters that the noise data generation unit 204 uses to generate NL and NR from Nch_Before. For each lens, the noise parameter recording unit 205 records a plurality of types of noise parameters corresponding to the types of noise. The noise parameters for generating NL from Nch_Before are collectively referred to as PLx, and the noise parameters for generating NR from Nch_Before are collectively referred to as PRx.

PLx and PRx have the same number of array elements as NL and NR, respectively. For example, PL1 is array data PL1[0] to PL1[512], and its frequency points are the same as those of Lch_Before. Similarly, PR1 is array data PR1[0] to PR1[512], and its frequency points are the same as those of Rch_Before. The noise parameters will be described later with reference to FIG. 5.
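As a minimal sketch of how noise data could be derived from Nch_Before with a noise parameter of the same length, the example below assumes that each element of NL is the product of the corresponding elements of PLx and Nch_Before, and that the arrays hold amplitude values. Both points are assumptions made for illustration, and `generate_noise_data` is a hypothetical name.

```python
from typing import List, Sequence


def generate_noise_data(nch_before: Sequence[float],
                        noise_param: Sequence[float]) -> List[float]:
    """Scale each Nch_Before amplitude by the coefficient at the same
    frequency point (assumed rule: NL[n] = PLx[n] * Nch_Before[n])."""
    if len(nch_before) != len(noise_param):
        raise ValueError("arrays must share the same frequency points")
    return [p * n for p, n in zip(noise_param, nch_before)]


if __name__ == "__main__":
    nch_before = [2.0, 4.0, 0.0]  # toy 3-point spectrum instead of 513 points
    pl1 = [0.5, 0.25, 0.9]        # toy coefficients
    print(generate_noise_data(nch_before, pl1))  # [1.0, 1.0, 0.0]
```

The same function would apply to the R channel by passing a PRx array instead of a PLx array.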

The noise parameter selection unit 206 determines, from the noise parameters recorded in the noise parameter recording unit 205, the noise parameter to be used in the noise data generation unit 204. The noise parameter selection unit 206 makes this determination based on Lch_Before, Rch_Before, Nch_Before, and data such as lens information received from the lens control unit 102. The operation of the noise parameter selection unit 206 will be described in detail later with reference to FIG. 8.

In this embodiment, the noise parameter recording unit 205 records, as a noise parameter, coefficients for all 513 points of the frequency spectrum. However, it is not necessary to record coefficients for all 513 frequency points; it suffices to record at least the coefficients of the frequency points needed to reduce noise. For example, the noise parameter recording unit 205 may record coefficients only for the frequency spectrum from 20 Hz to 20 kHz, which is regarded as the typical audible range, and omit the coefficients for the remaining frequency spectrum. As another example, coefficients whose value is zero need not be recorded in the noise parameter recording unit 205.
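One way to picture the reduced storage described above is a mapping from frequency-point index to coefficient that keeps only non-zero coefficients inside the 20 Hz to 20 kHz range. The sketch below is an assumption-laden illustration: the dictionary layout, the function names `compact` and `expand`, and the choice to treat every omitted coefficient as zero when expanding are all hypothetical.

```python
from typing import Dict, List, Sequence


def compact(coeffs: Sequence[float],
            low_hz: float = 20.0,
            high_hz: float = 20_000.0,
            sample_rate_hz: int = 48_000,
            fft_size: int = 1024) -> Dict[int, float]:
    """Keep only non-zero coefficients whose frequency point lies in
    the audible range [low_hz, high_hz]."""
    bin_hz = sample_rate_hz / fft_size
    return {n: c for n, c in enumerate(coeffs)
            if c != 0.0 and low_hz <= n * bin_hz <= high_hz}


def expand(stored: Dict[int, float], length: int = 513) -> List[float]:
    """Rebuild a full-length array, treating omitted points as zero
    (an assumption: no noise is subtracted at those points)."""
    full = [0.0] * length
    for n, c in stored.items():
        full[n] = c
    return full


if __name__ == "__main__":
    stored = compact([0.5] * 513)
    print(len(stored), min(stored), max(stored))  # 426 1 426
```

With 46.875 Hz spacing, only indices 1 through 426 fall inside the 20 Hz to 20 kHz range, so 426 of the 513 coefficients are kept in this example.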

The subtraction processing unit 207 subtracts NL and NR from Lch_Before and Rch_Before, respectively. For example, the subtraction processing unit 207 has an L subtractor 207a that subtracts NL from Lch_Before and an R subtractor 207b that subtracts NR from Rch_Before. The L subtractor 207a subtracts NL from Lch_Before and outputs 513-point array data, Lch_After[0] to Lch_After[512]. The R subtractor 207b subtracts NR from Rch_Before and outputs 513-point array data, Rch_After[0] to Rch_After[512]. In this embodiment, the subtraction processing unit 207 executes the subtraction by the spectral subtraction method.
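The per-point subtraction performed by the L subtractor 207a can be sketched as follows. This is a simplified illustration of amplitude-domain spectral subtraction: clamping negative results to a floor of zero is a common convention assumed here rather than something stated in this embodiment, and `spectral_subtract` is a hypothetical name.

```python
from typing import List, Sequence


def spectral_subtract(before: Sequence[float],
                      noise: Sequence[float],
                      floor: float = 0.0) -> List[float]:
    """Subtract the estimated noise amplitude from the input amplitude
    at each frequency point, clamping the result at `floor`."""
    if len(before) != len(noise):
        raise ValueError("arrays must share the same frequency points")
    return [max(b - n, floor) for b, n in zip(before, noise)]


if __name__ == "__main__":
    lch_before = [1.0, 0.5, 0.2]  # toy 3-point spectrum instead of 513 points
    nl = [0.25, 0.5, 0.5]         # toy noise data
    print(spectral_subtract(lch_before, nl))  # [0.75, 0.0, 0.0]
```

The R subtractor 207b would perform the same operation with Rch_Before and NR.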

The iFFT unit 208 applies an inverse fast Fourier transform (inverse Fourier transform) to the frequency-domain digital audio signal input from the subtraction processing unit 207, converting it into a time-domain digital audio signal.

The audio processing unit 209 executes audio processing on the time-domain digital audio signal, such as equalization, automatic level control, and stereo enhancement. The audio processing unit 209 outputs the processed audio data to the volatile memory 105.

The noise parameter updating unit 216 updates the noise parameters recorded in the noise parameter recording unit 205. The noise parameter updating unit 216 generates a noise parameter based on Lch_Before, Rch_Before, Nch_Before, and data such as lens information received from the lens control unit 102. The noise parameter updating unit 216 then uses the generated noise parameter to update the noise parameter recorded in the noise parameter recording unit 205. The operation of the noise parameter updating unit 216 will be described later with reference to FIG. 11.

In this embodiment, the image pickup apparatus 100 has two microphones as the first microphone, but it may instead have one microphone or three or more microphones as the first microphone. For example, when the audio input unit 104 has one microphone as the first microphone, the image pickup apparatus 100 records the audio data acquired by that microphone in monaural. When the audio input unit 104 has three or more microphones as the first microphone, the image pickup apparatus 100 records the audio data acquired by those microphones in surround.

In this embodiment, the L microphone 201a, the R microphone 201b, and the noise microphone 201c are omnidirectional microphones, but they may be directional microphones.

<Arrangement of microphones in audio input unit 104>
Here, an arrangement example of the microphones of the audio input unit 104 in this embodiment will be described. FIG. 4 shows an arrangement example of the L microphone 201a, the R microphone 201b, and the noise microphone 201c.

FIG. 4 is an example of a cross-sectional view of the portion of the image pickup apparatus 100 to which the L microphone 201a, the R microphone 201b, and the noise microphone 201c are attached. This portion of the image pickup apparatus 100 is composed of an exterior portion 302, a microphone bush 303, and a fixing portion 304.

The exterior portion 302 has holes (hereinafter referred to as microphone holes) for letting environmental sound reach the microphones. In this embodiment, microphone holes are formed above the L microphone 201a and the R microphone 201b. The noise microphone 201c, on the other hand, is provided to pick up drive sound generated inside the housing of the image pickup apparatus 100 and inside the housing of the optical lens 300, and does not need to pick up environmental sound. Therefore, in this embodiment, no microphone hole is formed in the exterior portion 302 above the noise microphone 201c.

Drive sound generated inside the housing of the image pickup apparatus 100 and inside the housing of the optical lens 300 is picked up by the L microphone 201a and the R microphone 201b through the microphone holes. When drive sound or the like is generated inside the housings of the image pickup apparatus 100 and the optical lens 300 while the environmental sound is quiet, the sound picked up by each microphone is mainly this drive sound. Consequently, the audio level from the noise microphone 201c is higher than the audio levels from the L microphone 201a and the R microphone 201b. In this case, the levels of the audio signals output from the microphones are related as follows.
Lch ≈ Rch < Nch

When the environmental sound becomes louder, the audio level of the environmental sound from the L microphone 201a and the R microphone 201b becomes higher than the audio level, from the noise microphone 201c, of the drive sound generated in the image pickup apparatus 100 or the optical lens 300. In this case, the levels of the audio signals output from the microphones are related as follows.
Lch ≈ Rch > Nch

In this embodiment, the microphone holes formed in the exterior portion 302 are elliptical, but they may have another shape, such as circular or rectangular. The microphone hole above the L microphone 201a and the microphone hole above the R microphone 201b may also differ in shape from each other.

In this embodiment, the noise microphone 201c is arranged close to the L microphone 201a and the R microphone 201b, and between them. As a result, the audio signal that the noise microphone 201c generates from drive sound or the like arising inside the housing of the image pickup apparatus 100 and inside the housing of the optical lens 300 resembles the audio signals that the L microphone 201a and the R microphone 201b generate from the same drive sound.

The microphone bush 303 is a member for fixing the L microphone 201a, the R microphone 201b, and the noise microphone 201c in place. The fixing portion 304 is a member that fixes the microphone bush 303 to the exterior portion 302.

In this embodiment, the exterior portion 302 and the fixing portion 304 are molded members made of a PC material or the like. The exterior portion 302 and the fixing portion 304 may instead be made of a metal member such as aluminum or stainless steel. In this embodiment, the microphone bush 303 is made of a rubber material such as ethylene propylene diene rubber.

<Noise parameters>
FIG. 5 shows an example of the noise parameters recorded in the noise parameter recording unit 205. The noise parameters shown in FIG. 5 are noise parameters for the noise generated by one particular lens. A noise parameter is a parameter for correcting the audio signal that the noise microphone 201c generates by picking up drive sound generated inside the housing of the image pickup apparatus 100 and inside the housing of the optical lens 300. As shown in FIG. 5, in this embodiment, PLx and PRx are recorded in the noise parameter recording unit 205. This embodiment is described on the assumption that the drive sound originates inside the housing of the optical lens 300. Drive sound generated inside the housing of the optical lens 300 is transmitted into the housing of the image pickup apparatus 100 via the lens mount 301 and is picked up by the L microphone 201a, the R microphone 201b, and the noise microphone 201c.

The frequency of the drive sound differs depending on the type of drive sound. Therefore, in this embodiment, the image pickup apparatus 100 records a plurality of noise parameters corresponding to the types of drive sound (noise), and generates noise data using one of them. In this embodiment, the image pickup apparatus 100 records a noise parameter for white noise as constant noise. The image pickup apparatus 100 also records a noise parameter for short-term noise generated, for example, by the meshing of gears in the optical lens 300. In addition, the image pickup apparatus 100 records a noise parameter for long-term noise, such as sliding sound inside the housing of the optical lens 300. The image pickup apparatus 100 may also record noise parameters for each type of optical lens 300, and for each temperature inside the housing of the image pickup apparatus 100 and each tilt of the image pickup apparatus 100 detected by the information acquisition unit 103.

<Method of generating noise data>
The noise data generation processing in the noise data generation unit 204 will be described with reference to FIGS. 6 and 7. The processing is described here for the Lch data, but noise data for the Rch data is generated in the same way.

First, the process of generating a noise parameter in a situation where there can be regarded as being no environmental sound will be described. FIG. 6(a) is an example of the frequency spectrum of Lch_Before when drive sound is generated inside the housing of the optical lens 300 in such a situation. FIG. 6(b) is an example of the frequency spectrum of Nch_Before under the same conditions. The horizontal axis shows the frequency from the 0th point to the 512th point, and the vertical axis shows the amplitude of the frequency spectrum.

Because the situation can be regarded as having no environmental sound, the amplitude of the frequency spectrum becomes large in the same frequency bands in both Lch_Before and Nch_Before. Further, since the driving sound is generated inside the housing of the optical lens 300, the amplitude of each frequency spectrum for the same driving sound tends to be larger in Nch_Before than in Lch_Before.

図６（ｃ）は本実施例におけるＰＬｘの一例である。本実施例では、ＰＬｘは、環境音の小さい状況において、Ｌｃｈ＿Ｂｅｆｏｒｅの各周波数スペクトルの振幅をＮｃｈ＿Ｂｅｆｏｒｅの各周波数スペクトルの振幅で除算したことによって算出された各周波数スペクトルの係数である。この除算の結果を、Ｌｃｈ＿Ｂｅｆｏｒｅ／Ｎｃｈ＿Ｂｅｆｏｒｅと記載する。すなわち、ＰＬｘはＬｃｈ＿ＢｅｆｏｒｅおよびＮｃｈ＿Ｂｅｆｏｒｅの振幅の比である。ノイズパラメータ記録部２０５は、Ｌｃｈ＿Ｂｅｆｏｒｅ／Ｎｃｈ＿Ｂｅｆｏｒｅの値をノイズパラメータＰＬｘとして記録している。前述のように、同じ駆動音に対する周波数スペクトルの振幅はＬｃｈ＿ＢｅｆｏｒｅよりもＮｃｈ＿Ｂｅｆｏｒｅのほうが大きい傾向にあるため、ノイズパラメータＰＬｘの各係数の値は１よりも小さい値になる傾向になる。ただし、Ｎｃｈ＿Ｂｅｆｏｒｅ［ｎ］の値が所定の閾値より小さい場合、ノイズパラメータ記録部２０５はＰＬｘ［ｎ］＝０としてノイズパラメータＰＬｘを記録する。 FIG. 6C is an example of PLx in this embodiment. In this embodiment, PLx is a coefficient of each frequency spectrum calculated by dividing the amplitude of each frequency spectrum of Lch_Before by the amplitude of each frequency spectrum of Nch_Before in a situation where the environmental sound is small. The result of this division is described as Lch_Before / Nch_Before. That is, PLx is the ratio of the amplitudes of Lch_Before and Nch_Before. The noise parameter recording unit 205 records the value of Lch_Before / Nch_Before as the noise parameter PLx. As described above, since the amplitude of the frequency spectrum for the same drive sound tends to be larger in Nch_Before than in Lch_Before, the value of each coefficient of the noise parameter PLx tends to be smaller than 1. However, when the value of Nch_Before [n] is smaller than a predetermined threshold value, the noise parameter recording unit 205 records the noise parameter PLx with PLx [n] = 0.
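As a minimal sketch, the calculation of PLx described above could be written as follows. The function name, the use of plain lists of amplitudes, and the value of `nch_floor` (standing in for the "predetermined threshold value") are assumptions made for illustration and do not come from this description.

```python
def compute_noise_parameter(lch_before, nch_before, nch_floor=0.01):
    """Return PLx[n] = Lch_Before[n] / Nch_Before[n] for each frequency point.

    Points where Nch_Before[n] is smaller than nch_floor get PLx[n] = 0.
    nch_floor is a hypothetical stand-in for the predetermined threshold.
    """
    if len(lch_before) != len(nch_before):
        raise ValueError("both spectra must have the same number of points")
    plx = []
    for l_amp, n_amp in zip(lch_before, nch_before):
        if n_amp < nch_floor:
            plx.append(0.0)  # noise channel too weak: ratio would be unreliable
        else:
            plx.append(l_amp / n_amp)
    return plx


# Three frequency points measured while the environmental sound is small.
print(compute_noise_parameter([1.0, 0.5, 0.001], [2.0, 2.0, 0.002]))
```

In this example the first two coefficients are smaller than 1, matching the tendency noted above, and the third is forced to 0 because the noise channel amplitude is below the assumed floor.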

次に、生成されたノイズパラメータをＮｃｈ＿Ｂｅｆｏｒｅに適用する処理について説明する。図７（ａ）は環境音が存在している状況において光学レンズ３００の筐体内で駆動音が発生した場合におけるＬｃｈ＿Ｂｅｆｏｒｅの周波数スペクトルの一例である。図７（ｂ）は環境音が存在している状況において光学レンズ３００の筐体内で駆動音が発生した場合におけるＮｃｈ＿Ｂｅｆｏｒｅの周波数スペクトルの一例である。横軸は０ポイント目から５１２ポイント目までの周波数を示す軸、縦軸は周波数スペクトルの振幅を示す軸である。 Next, the process of applying the generated noise parameter to Nch_Before will be described. FIG. 7A is an example of the frequency spectrum of Lch_Before when the driving sound is generated in the housing of the optical lens 300 in the presence of the environmental sound. FIG. 7B is an example of the frequency spectrum of Nch_Before when the driving sound is generated in the housing of the optical lens 300 in the presence of the environmental sound. The horizontal axis is the axis showing the frequency from the 0th point to the 512th point, and the vertical axis is the axis showing the amplitude of the frequency spectrum.

図７（ｃ）は環境音が存在している状況において光学レンズ３００の筐体内で駆動音が発生した場合におけるＮＬの一例である。ノイズデータ生成部２０４は、Ｎｃｈ＿Ｂｅｆｏｒｅの各周波数スペクトルと、ＰＬｘの各係数とを乗算し、ＮＬを生成する。ＮＬは、このように生成された周波数スペクトルである。 FIG. 7C is an example of NL in the case where the driving sound is generated in the housing of the optical lens 300 in the presence of the environmental sound. The noise data generation unit 204 multiplies each frequency spectrum of Nch_Before by each coefficient of PLx to generate NL. NL is the frequency spectrum thus generated.

図７（ｄ）は環境音が存在している状況において光学レンズ３００の筐体内で駆動音が発生した場合におけるＬｃｈ＿Ａｆｔｅｒの一例である。減算処理部２０７は、Ｌｃｈ＿ＢｅｆｏｒｅからＮＬを減算し、Ｌｃｈ＿Ａｆｔｅｒを生成する。Ｌｃｈ＿Ａｆｔｅｒは、このように生成された周波数スペクトルである。 FIG. 7D is an example of Lch_After when a driving sound is generated in the housing of the optical lens 300 in a situation where an environmental sound is present. The subtraction processing unit 207 subtracts NL from Lch_Before to generate Lch_After. Lch_After is a frequency spectrum thus generated.
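The multiplication in the noise data generation unit 204 and the subtraction in the subtraction processing unit 207 can be sketched as below. This is an illustration only: the function names are hypothetical, the spectra are treated as plain lists of amplitudes, and no clamping of negative results is applied because this description does not mention any.

```python
def generate_noise_data(nch_before, plx):
    """NL[n] = Nch_Before[n] * PLx[n] for each frequency point."""
    return [n_amp * coeff for n_amp, coeff in zip(nch_before, plx)]


def subtract_noise(lch_before, nl):
    """Lch_After[n] = Lch_Before[n] - NL[n] for each frequency point."""
    return [l_amp - noise for l_amp, noise in zip(lch_before, nl)]


nch_before = [4.0, 2.0, 0.0]
plx = [0.5, 0.25, 0.0]
lch_before = [3.0, 1.0, 0.5]

nl = generate_noise_data(nch_before, plx)
lch_after = subtract_noise(lch_before, nl)
print(nl)
print(lch_after)
```

The third point has no noise component in Nch_Before, so Lch_Before passes through unchanged at that frequency.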

これにより、撮像装置１００は、光学レンズ３００の筐体内の駆動音が原因であるノイズを低減し、ノイズの少ない環境音を記録することができる。 As a result, the image pickup apparatus 100 can reduce the noise caused by the driving sound in the housing of the optical lens 300 and record the environmental sound with less noise.

<Explanation of noise parameter selection unit 206>
FIG. 8 is a block diagram showing an example of a detailed configuration of the noise parameter selection unit 206.

ノイズパラメータ選択部２０６には、Ｌｃｈ＿Ｂｅｆｏｒｅ、Ｒｃｈ＿Ｂｅｆｏｒｅ、Ｎｃｈ＿Ｂｅｆｏｒｅ、レンズ情報、およびレンズ制御信号が入力される。 Lch_Before, Rch_Before, Nch_Before, lens information, and a lens control signal are input to the noise parameter selection unit 206.

Ｎｃｈノイズ検出部２０６１は、光学レンズ３００の筐体内で発生した駆動音よるノイズをＮｃｈ＿Ｂｅｆｏｒｅから検出する。なお、本実施例では、Ｎｃｈノイズ検出部２０６１はＮｃｈ＿Ｂｅｆｏｒｅの５１３ポイントのデータを利用してノイズを検出する。 The Nch noise detection unit 2061 detects noise due to the drive sound generated in the housing of the optical lens 300 from Nch_Before. In this embodiment, the Nch noise detection unit 2061 detects noise by using the data of 513 points of Nch_Before.

環境音検出部２０６２は、環境音のレベルをＬｃｈ＿ＢｅｆｏｒｅおよびＲｃｈ＿Ｂｅｆｏｒｅから検出する。 The environmental sound detection unit 2062 detects the level of the environmental sound from Lch_Before and Rch_Before.

ノイズ判定部２０６３は、レンズ情報およびレンズ制御信号、Ｎｃｈノイズ検出部２０６１から入力されるデータ、ならびに環境音検出部から入力されるデータに基づいて、ノイズデータ生成部２０４が用いるノイズパラメータを決める。ノイズ判定部２０６３は、決定したノイズパラメータの種類を示すデータをノイズデータ生成部２０４およびノイズパラメータ更新部２１６に出力する。 The noise determination unit 2063 determines the noise parameters used by the noise data generation unit 204 based on the lens information and the lens control signal, the data input from the Nch noise detection unit 2061, and the data input from the environmental sound detection unit. The noise determination unit 2063 outputs data indicating the determined type of noise parameter to the noise data generation unit 204 and the noise parameter update unit 216.

The Nch differentiation unit 2064 executes differentiation processing on Nch_Before and outputs data indicating the result to the short-term noise detection unit 2065. The short-term noise detection unit 2065 detects whether or not short-term noise is occurring based on the data input from the Nch differentiation unit 2064, and outputs data indicating the detection result to the noise determination unit 2063. The Nch differentiation unit 2064 and the short-term noise detection unit 2065 are included in the Nch noise detection unit 2061.

The Nch integration unit 2066 executes integration processing on Nch_Before and outputs data indicating the result of the integration processing to the long-term noise detection unit 2067. The long-term noise detection unit 2067 detects whether or not long-term noise is occurring based on the data input from the Nch integration unit 2066, and outputs data indicating the detection result to the noise determination unit 2063. The Nch integration unit 2066 and the long-term noise detection unit 2067 are included in the Nch noise detection unit 2061.
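The description does not specify how the differentiation and integration are computed. One possible reading, sketched below under that caveat, treats the value of a single frequency point over successive frames as a time series, takes the absolute frame-to-frame difference as the differentiated signal, and takes a moving sum over the last few frames as the integrated signal. The function names, the window length, and the threshold comparison are all assumptions for illustration.

```python
def differentiate(series):
    """Absolute change between consecutive frames (first frame has no predecessor)."""
    return [0.0] + [abs(cur - prev) for prev, cur in zip(series, series[1:])]


def integrate(series, window=4):
    """Moving sum over the most recent `window` frames (window is hypothetical)."""
    out = []
    running = 0.0
    for i, value in enumerate(series):
        running += value
        if i >= window:
            running -= series[i - window]  # drop the frame that left the window
        out.append(running)
    return out


def exceeds(values, threshold):
    """Per-frame flag: True where the value is above the detection threshold."""
    return [v > threshold for v in values]


# One frequency point of Nch_Before over eight frames:
# a brief spike followed by a sustained, smaller level.
nch_n = [0.0, 0.0, 8.0, 0.0, 2.0, 2.0, 2.0, 2.0]
print(differentiate(nch_n))
print(integrate(nch_n))
```

The differentiated signal reacts only at the frames where the level changes abruptly, while the moving sum stays elevated as long as the level remains high within the window.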

The environmental sound extraction unit 2068 extracts the environmental sound. In this embodiment, the environmental sound extraction unit 2068 extracts, based on the noise parameter, data at frequencies that are less affected by noise. For example, the environmental sound extraction unit 2068 extracts data at frequencies for which the noise parameter is equal to or less than a predetermined value. The environmental sound extraction unit 2068 then outputs data indicating the loudness of the environmental sound based on the extracted frequency data. The environmental sound extraction unit 2068 is included in the environmental sound detection unit 2062.

環境音判定部２０６９は、環境音の大きさを判定する。環境音判定部２０６９は、判定した環境音の大きさを示すデータをＮｃｈノイズ検出部２０６１およびノイズ判定部２０６３に入力する。Ｎｃｈノイズ検出部２０６１は、環境音判定部２０６９から入力された環境音の大きさを示すデータに基づいて、後述する第一の閾値および第二の閾値を変更する。なお、環境音判定部２０６９は環境音検出部２０６２に含まれる。 The environmental sound determination unit 2069 determines the loudness of the environmental sound. The environmental sound determination unit 2069 inputs data indicating the loudness of the determined environmental sound to the Nch noise detection unit 2061 and the noise determination unit 2063. The Nch noise detection unit 2061 changes the first threshold value and the second threshold value, which will be described later, based on the data indicating the loudness of the environmental sound input from the environmental sound determination unit 2069. The environmental sound determination unit 2069 is included in the environmental sound detection unit 2062.
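A rough sketch of the extraction and determination steps follows. How the loudness is derived from the extracted frequency points is not stated, so the averaging of the Lch and Rch amplitudes, the names `param_limit`, `th_low`, and `th_high`, and the three-way classification are assumptions for illustration only.

```python
def extract_ambient_level(lch_before, rch_before, plx, param_limit=0.1):
    """Average amplitude over frequency points whose noise parameter is small.

    param_limit is a hypothetical stand-in for the "predetermined value".
    """
    picked = [
        (l_amp + r_amp) / 2.0
        for l_amp, r_amp, coeff in zip(lch_before, rch_before, plx)
        if coeff <= param_limit
    ]
    return sum(picked) / len(picked) if picked else 0.0


def judge_ambient(level, th_low, th_high):
    """Classify loudness: 0 = quiet, 1 = above th_low, 2 = above th_high."""
    if level > th_high:
        return 2
    if level > th_low:
        return 1
    return 0


level = extract_ambient_level([1.0, 4.0, 2.0], [3.0, 4.0, 2.0], [0.0, 0.5, 0.05])
print(level, judge_ambient(level, th_low=1.0, th_high=3.0))
```

The second frequency point is skipped because its noise parameter exceeds the assumed limit, so only points expected to carry little drive noise contribute to the estimate.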

<Timing chart of noise reduction processing>
The noise reduction processing in this embodiment will be described with reference to FIG. 9.

FIGS. 9(a) to 9(i) are examples of timing charts of the audio processing in the noise data generation unit 204, the noise parameter selection unit 206, and the subtraction processing unit 207. To simplify the description, this embodiment describes the Lch audio processing, but the same applies to the Rch audio processing. The horizontal axes of the graphs in FIGS. 9(a) to 9(i) are all time axes.

FIG. 9(a) shows an example of the lens control signal. The lens control signal is the signal by which the lens control unit 102 instructs the optical lens 300 to drive. In this embodiment, the level of the lens control signal takes one of two values, High and Low. When the level is High, the lens control unit 102 is instructing the optical lens 300 to drive. When the level is Low, the lens control unit 102 is not instructing the optical lens 300 to drive.

FIG. 9(b) is a graph showing an example of the value of Lch_Before[n]. The vertical axis indicates the value of Lch_Before[n]. In this embodiment, Lch_Before[n] is the signal at the nth frequency point of the Lch_Before output from the FFT unit 203, at which the driving sound of the optical lens 300 appears characteristically. Although this embodiment describes the signal at the nth frequency point, the same audio processing is executed for the other frequencies. The signals labeled signal X and signal Y contain noise. In this embodiment, signal X is a signal containing short-term noise, and signal Y is a signal containing long-term noise.

図９（ｃ）は環境音抽出部２０６８において抽出された環境音の大きさの一例を示すグラフである。縦軸は取得された環境音から生成された音声信号のレベルを示す。閾値Ｔｈ１および閾値Ｔｈ２は、環境音判定部２０６９において用いられる２つの閾値である。 FIG. 9C is a graph showing an example of the magnitude of the environmental sound extracted by the environmental sound extraction unit 2068. The vertical axis shows the level of the audio signal generated from the acquired environmental sound. The threshold value Th1 and the threshold value Th2 are two threshold values used in the environmental sound determination unit 2069.

FIG. 9(d) is a graph showing an example of the value of Nch_Before[n]. Nch_Before[n] is the signal at the nth frequency point of the Nch_Before output from the FFT unit 203, at which the driving sound of the optical lens 300 appears characteristically. The vertical axis indicates the value of Nch_Before[n]. The noise shown as signal X and signal Y in FIG. 9(b) appears more distinctly in Nch_Before[n] than in Lch_Before[n].

図９（ｅ）はＮｄｉｆｆ［ｎ］の値の一例を示すグラフである。Ｎｄｉｆｆ［ｎ］は、Ｎｃｈ微分部２０６４から出力されるＮｄｉｆｆのうち、ｎ番目の周波数ポイントの信号の値を示したものである。縦軸は、Ｎｄｉｆｆ［ｎ］の値を示す軸である。Ｎｃｈ＿Ｂｅｆｏｒｅ［ｎ］の所定時間あたりの値の変化量が大きい場合、Ｎｄｉｆｆ［ｎ］の値が大きくなる。短期雑音検出部２０６５は、短期的なノイズを検出するために、第一の閾値である閾値Ｔｈ＿Ｎｄｉｆｆ［ｎ］を持つ。閾値Ｔｈ＿Ｎｄｉｆｆ［ｎ］は、環境音判定部２０６９から入力された環境音の大きさを示すデータおよびレンズ制御信号に基づいて、レベル１～３の間で変化する。閾値Ｔｈ＿Ｎｄｉｆｆ［ｎ］の初期値はレベル２とする。また閾値Ｔｈ＿Ｎｄｉｆｆ［ｎ］のレベルは横の破線で表される。 FIG. 9 (e) is a graph showing an example of the value of Ndiff [n]. Ndiff [n] indicates the value of the signal at the nth frequency point in the Ndiff output from the Nch differential unit 2064. The vertical axis is an axis indicating the value of Ndiff [n]. When the amount of change in the value of Nch_Before [n] per predetermined time is large, the value of Ndiff [n] becomes large. The short-term noise detection unit 2065 has a threshold value Th_Ndiff [n], which is a first threshold value, in order to detect short-term noise. The threshold value Th_Ndiff [n] changes between levels 1 to 3 based on the data indicating the magnitude of the environmental sound input from the environmental sound determination unit 2069 and the lens control signal. The initial value of the threshold value Th_Ndiff [n] is level 2. The level of the threshold value Th_Ndiff [n] is represented by a horizontal broken line.

図９（ｆ）はＮｉｎｔ［ｎ］の値の一例を示すグラフである。本実施例では、Ｎｉｎｔ［ｎ］は、Ｎｃｈ積分部２０６６から出力されるＮｉｎｔのうち、ｎ番目の周波数ポイントの信号の値を示したものである。縦軸は、Ｎｉｎｔ［ｎ］の値を示す軸である。Ｎｃｈ＿Ｂｅｆｏｒｅ［ｎ］が継続的に大きい場合、Ｎｉｎｔ［ｎ］の値が大きくなる。長期雑音検出部２０６７は、長期的なノイズを検出するために、第二の閾値である閾値Ｔｈ＿Ｎｉｎｔ［ｎ］を持つ。閾値Ｔｈ＿Ｎｉｎｔ［ｎ］は、環境音判定部２０６９から入力された環境音の大きさを示すデータおよびレンズ制御信号に基づいてレベル１～３の間で変化する。閾値Ｔｈ＿Ｎｉｎｔ［ｎ］の初期値はレベル２とする。また閾値Ｔｈ＿Ｎｉｎｔ［ｎ］のレベルは横の破線で表される。 FIG. 9 (f) is a graph showing an example of the value of Nint [n]. In this embodiment, Nint [n] indicates the value of the signal at the nth frequency point of the Nint output from the Nch integrating unit 2066. The vertical axis is an axis indicating the value of Nint [n]. When Nch_Before [n] is continuously large, the value of Nint [n] becomes large. The long-term noise detection unit 2067 has a threshold value Th_Nint [n], which is a second threshold value, in order to detect long-term noise. The threshold value Th_Nint [n] changes between levels 1 to 3 based on the data indicating the loudness of the environmental sound input from the environmental sound determination unit 2069 and the lens control signal. The initial value of the threshold value Th_Nint [n] is level 2. The level of the threshold value Th_Nint [n] is represented by a horizontal broken line.

図９（ｇ）はノイズパラメータ選択部２０６によって選択されたノイズパラメータの一例を表す。本実施例では、無地部はＰＬ１のノイズパラメータのみが選択されていることを示す。斜線部はＰＬ１およびＰＬ２のノイズパラメータが選択されていることを示す。格子縞部はＰＬ１およびＰＬ３のノイズパラメータが選択されていることを示す。 FIG. 9 (g) shows an example of the noise parameter selected by the noise parameter selection unit 206. In this embodiment, the plain portion indicates that only the noise parameter of PL1 is selected. The shaded area indicates that the noise parameters of PL1 and PL2 are selected. The plaid portion indicates that the noise parameters of PL1 and PL3 are selected.

図９（ｈ）はＮＬ［ｎ］の値の一例を示すグラフである。本実施例では、ＮＬ［ｎ］は、ノイズデータ生成部２０４で生成されるＮＬのうち、ｎ番目の周波数ポイントの信号の値を示したものである。縦軸は、ＮＬ［ｎ］の値を示す軸である。 FIG. 9 (h) is a graph showing an example of the value of NL [n]. In this embodiment, NL [n] indicates the value of the signal at the nth frequency point among the NLs generated by the noise data generation unit 204. The vertical axis is an axis indicating the value of NL [n].

図９（ｉ）はＬｃｈ＿Ａｆｔｅｒ［ｎ］の値の一例を示すグラフである。本実施例では、Ｌｃｈ＿Ａｆｔｅｒ［ｎ］は、減算処理部２０７から出力されるＬｃｈ＿Ａｆｔｅｒのうち、ｎ番目の周波数ポイントの信号の値を示したものである。縦軸は、Ｌｃｈ＿Ａｆｔｅｒ［ｎ］の値を示す軸である。 FIG. 9 (i) is a graph showing an example of the value of Lch_After [n]. In this embodiment, Lch_After [n] indicates the value of the signal at the nth frequency point of the Lch_After output from the subtraction processing unit 207. The vertical axis is an axis indicating the value of Lch_After [n].

次にそれぞれの動作に関してタイミングを時刻ｔ７０１～ｔ７０９を用いて説明する。 Next, the timing of each operation will be described using the times t701 to t709.

時刻ｔ７０１において、レンズ制御部１０２は光学レンズ３００およびノイズパラメータ選択部２０６に、レンズ制御信号としてＨｉｇｈの信号を出力する（図９（ａ））。時刻ｔ７０１において、光学レンズ３００の筐体内で駆動音が発生する可能性が高いため、短期雑音検出部２０６５は閾値Ｔｈ＿Ｎｄｉｆｆ［ｎ］をレベル１に下げる（図９（ｅ））。また時刻ｔ７０１において、光学レンズ３００の筐体内で駆動音が発生する可能性が高いため、長期雑音検出部２０６７は閾値Ｔｈ＿Ｎｉｎｔ［ｎ］をレベル１に下げる（図９（ｆ））。 At time t701, the lens control unit 102 outputs a High signal as a lens control signal to the optical lens 300 and the noise parameter selection unit 206 (FIG. 9A). Since there is a high possibility that a driving sound is generated in the housing of the optical lens 300 at time t701, the short-term noise detection unit 2065 lowers the threshold value Th_Ndiff [n] to level 1 (FIG. 9 (e)). Further, at time t701, since there is a high possibility that a driving sound is generated in the housing of the optical lens 300, the long-term noise detection unit 2067 lowers the threshold value Th_Nint [n] to level 1 (FIG. 9 (f)).

時刻ｔ７０２において、光学レンズ３００が駆動し、ギアのかみ合う音などの短期的な駆動音が発生する。ノイズマイク２０１ｃがその短期的な駆動音を収音したことにより、Ｎｄｉｆｆ［ｎ］の値が閾値Ｔｈ＿Ｎｄｉｆｆ［ｎ］を超える（図９（ｅ））。これに応じて、ノイズパラメータ選択部２０６はノイズパラメータＰＬ１およびＰＬ２を選択する（図９（ｇ））。ノイズデータ生成部２０４はＮｃｈ＿Ｂｅｆｏｒｅ［ｎ］、およびノイズパラメータＰＬ１およびＰＬ２に基づいてＮＬ［ｎ］を生成する（図９（ｈ））。減算処理部２０７は、Ｌｃｈ＿Ｂｅｆｏｒｅ［ｎ］からＮＬ［ｎ］を減算し、Ｌｃｈ＿Ａｆｔｅｒ［ｎ］を出力する（図９（ｉ））。この場合、Ｌｃｈ＿Ａｆｔｅｒ［ｎ］は恒常的なノイズおよび短期的なノイズが低減された音声信号になる。 At time t702, the optical lens 300 is driven, and a short-term driving sound such as a gear meshing sound is generated. The value of Ndiff [n] exceeds the threshold value Th_Ndiff [n] due to the noise microphone 201c picking up the short-term drive sound (FIG. 9 (e)). In response to this, the noise parameter selection unit 206 selects the noise parameters PL1 and PL2 (FIG. 9 (g)). The noise data generation unit 204 generates NL [n] based on Nch_Before [n] and the noise parameters PL1 and PL2 (FIG. 9 (h)). The subtraction processing unit 207 subtracts NL [n] from Lch_Before [n] and outputs Lch_After [n] (FIG. 9 (i)). In this case, Lch_After [n] becomes an audio signal in which constant noise and short-term noise are reduced.

時刻ｔ７０３において、光学レンズ３００が連続的な駆動を開始し、光学レンズ３００の筐体内において摺動音などの長期的な駆動音が発生する。ノイズマイク２０１ｃがその長期的な駆動音を収音したことにより、Ｎｉｎｔ［ｎ］の値が閾値Ｔｈ＿Ｎｉｎｔ［ｎ］を超える（図９（ｆ））。これに応じて、ノイズパラメータ選択部２０６はノイズパラメータＰＬ１およびＰＬ３を選択する（図９（ｇ））。ノイズデータ生成部２０４はＮｃｈ＿Ｂｅｆｏｒｅ［ｎ］、および、ノイズパラメータＰＬ１およびＰＬ３に基づいてＮＬ［ｎ］を生成する（図９（ｈ））。減算処理部２０７は、Ｌｃｈ＿Ｂｅｆｏｒｅ［ｎ］からＮＬ［ｎ］を減算し、Ｌｃｈ＿Ａｆｔｅｒ［ｎ］を出力する（図９（ｉ））。この場合、Ｌｃｈ＿Ａｆｔｅｒ［ｎ］は恒常的なノイズおよび長期的なノイズが低減された音声信号になる。 At time t703, the optical lens 300 starts continuous driving, and a long-term driving sound such as a sliding sound is generated in the housing of the optical lens 300. The value of Nint [n] exceeds the threshold value Th_Nint [n] due to the noise microphone 201c picking up the long-term driving sound (FIG. 9 (f)). In response to this, the noise parameter selection unit 206 selects the noise parameters PL1 and PL3 (FIG. 9 (g)). The noise data generation unit 204 generates NL [n] based on Nch_Before [n] and the noise parameters PL1 and PL3 (FIG. 9 (h)). The subtraction processing unit 207 subtracts NL [n] from Lch_Before [n] and outputs Lch_After [n] (FIG. 9 (i)). In this case, Lch_After [n] becomes an audio signal in which constant noise and long-term noise are reduced.

時刻ｔ７０４において、光学レンズ３００が連続的な駆動を停止する。ノイズマイク２０１ｃがその長期的な駆動音を収音しなくなるため、Ｎｉｎｔ［ｎ］の値が閾値Ｔｈ＿Ｎｉｎｔ［ｎ］以下になる（図９（ｆ））。これに応じて、ノイズパラメータ選択部２０６はノイズパラメータＰＬ１を選択する（図９（ｇ））。ノイズデータ生成部２０４は、Ｎｃｈ＿Ｂｅｆｏｒｅ［ｎ］、および、ノイズパラメータＰＬ１に基づいてＮＬ［ｎ］を生成する（図９（ｈ））。減算処理部２０７は、Ｌｃｈ＿Ｂｅｆｏｒｅ［ｎ］からＮＬ［ｎ］を減算し、Ｌｃｈ＿Ａｆｔｅｒ［ｎ］を出力する（図９（ｉ））。この場合、Ｌｃｈ＿Ａｆｔｅｒ［ｎ］は恒常的なノイズが低減された音声信号になる。 At time t704, the optical lens 300 stops continuous driving. Since the noise microphone 201c does not collect the long-term driving sound, the value of Nint [n] becomes equal to or less than the threshold value Th_Nint [n] (FIG. 9 (f)). In response to this, the noise parameter selection unit 206 selects the noise parameter PL1 (FIG. 9 (g)). The noise data generation unit 204 generates NL [n] based on Nch_Before [n] and the noise parameter PL1 (FIG. 9 (h)). The subtraction processing unit 207 subtracts NL [n] from Lch_Before [n] and outputs Lch_After [n] (FIG. 9 (i)). In this case, Lch_After [n] becomes an audio signal with constant noise reduced.

時刻ｔ７０５においてレンズ制御部１０２は、光学レンズ３００およびノイズパラメータ選択部２０６にレンズ制御信号としてＬｏｗの信号を出力する（図９（ａ））。この場合、光学レンズ３００の筐体内において駆動音が発生する可能性が低くなるため、短期雑音検出部２０６５は閾値Ｔｈ＿Ｎｄｉｆｆ［ｎ］をレベル２に上げる（図９（ｅ））。また、この場合、光学レンズ３００の筐体内において駆動音が発生する可能性が低くなるため、長期雑音検出部２０６７は閾値Ｔｈ＿Ｎｉｎｔ［ｎ］をレベル２に上げる（図９（ｆ））。 At time t705, the lens control unit 102 outputs a Low signal as a lens control signal to the optical lens 300 and the noise parameter selection unit 206 (FIG. 9A). In this case, since the possibility that a driving sound is generated in the housing of the optical lens 300 is low, the short-term noise detection unit 2065 raises the threshold value Th_Ndiff [n] to level 2 (FIG. 9 (e)). Further, in this case, since the possibility that the driving sound is generated in the housing of the optical lens 300 is low, the long-term noise detection unit 2067 raises the threshold value Th_Nint [n] to level 2 (FIG. 9 (f)).

時刻ｔ７０６において、環境音抽出部２０６８において抽出された環境音の大きさが閾値Ｔｈ１を超える。環境音が大きい場合、ユーザには音声信号に含まれるノイズが感じられにくくなるため、短期雑音検出部２０６５は閾値Ｔｈ＿Ｎｄｉｆｆ［ｎ］をレベル３に上げる（図９（ｅ））。また、環境音が大きい場合、ユーザには音声信号に含まれるノイズが感じられにくくなるため、長期雑音検出部２０６７は閾値Ｔｈ＿Ｎｉｎｔ［ｎ］をレベル３に上げる（図９（ｆ））。 At time t706, the loudness of the environmental sound extracted by the environmental sound extraction unit 2068 exceeds the threshold value Th1. When the ambient sound is loud, the noise contained in the audio signal is less likely to be perceived by the user, so the short-term noise detection unit 2065 raises the threshold value Th_Ndiff [n] to level 3 (FIG. 9 (e)). Further, when the environmental sound is loud, the noise contained in the audio signal is less likely to be perceived by the user, so that the long-term noise detection unit 2067 raises the threshold value Th_Nint [n] to level 3 (FIG. 9 (f)).

時刻ｔ７０７において、レンズ制御部１０２は光学レンズ３００およびノイズパラメータ選択部２０６に、レンズ制御信号としてＨｉｇｈの信号を出力する（図９（ａ））。この場合、光学レンズ３００の筐体内において駆動音が発生する可能性が高いため、短期雑音検出部２０６５は閾値Ｔｈ＿Ｎｄｉｆｆ［ｎ］をレベル２に下げる（図９（ｅ））。また、この場合、光学レンズ３００の筐体内において駆動音が発生する可能性が高いため、長期雑音検出部２０６７は閾値Ｔｈ＿Ｎｉｎｔ［ｎ］をレベル２に下げる（図９（ｆ））。 At time t707, the lens control unit 102 outputs a High signal as a lens control signal to the optical lens 300 and the noise parameter selection unit 206 (FIG. 9A). In this case, since there is a high possibility that a driving sound is generated in the housing of the optical lens 300, the short-term noise detection unit 2065 lowers the threshold value Th_Ndiff [n] to level 2 (FIG. 9 (e)). Further, in this case, since there is a high possibility that a driving sound is generated in the housing of the optical lens 300, the long-term noise detection unit 2067 lowers the threshold value Th_Nint [n] to level 2 (FIG. 9 (f)).

時刻ｔ７０８において、環境音抽出部２０６８において抽出された環境音の大きさが閾値Ｔｈ２を超える。環境音がさらに大きい場合、ユーザには音声信号に含まれるノイズはほとんど感じられないため、ノイズパラメータ選択部２０６はＮｃｈノイズ検出部２０６１から入力されるデータにかかわらずノイズパラメータＰＬ１のみを選択する。 At time t708, the loudness of the environmental sound extracted by the environmental sound extraction unit 2068 exceeds the threshold value Th2. When the ambient sound is louder, the user hardly feels the noise contained in the voice signal. Therefore, the noise parameter selection unit 206 selects only the noise parameter PL1 regardless of the data input from the Nch noise detection unit 2061.
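The threshold levels and parameter selections at times t701 to t708 are consistent with a simple rule, sketched below. This is one reading of the timing chart, not a statement of the actual control logic: the function names are hypothetical, and the case in which short-term and long-term noise are detected at the same time is not covered by the timing chart, so selecting both PL2 and PL3 in that case is an assumption.

```python
def threshold_level(lens_signal_high, ambient_level, th1):
    """Level (1 to 3) for Th_Ndiff[n] and Th_Nint[n], starting from level 2."""
    level = 2
    if lens_signal_high:
        level -= 1  # drive sound is likely: detect more readily
    if ambient_level > th1:
        level += 1  # loud environmental sound masks the noise
    return level


def select_parameters(short_detected, long_detected, ambient_level, th2):
    """Noise parameters to use; PL1 (constant noise) is always selected."""
    if ambient_level > th2:
        return ["PL1"]  # detection results are ignored above Th2
    selected = ["PL1"]
    if short_detected:
        selected.append("PL2")
    if long_detected:
        selected.append("PL3")  # selecting both PL2 and PL3 is an assumption
    return selected


TH1, TH2 = 1.0, 3.0
print(threshold_level(True, 0.5, TH1))          # t701: lens High, quiet
print(threshold_level(False, 2.0, TH1))         # t706: lens Low, above Th1
print(threshold_level(True, 2.0, TH1))          # t707: lens High, above Th1
print(select_parameters(True, False, 0.5, TH2))  # t702: short-term noise
print(select_parameters(True, True, 3.5, TH2))   # t708: above Th2
```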

以上のように、撮像装置１００は第２のマイクであるノイズマイク２０１ｃを利用してノイズ低減処理を実行することで、ノイズが低減された環境音を記録することができる。 As described above, the image pickup apparatus 100 can record the environmental sound with reduced noise by executing the noise reduction processing using the noise microphone 201c which is the second microphone.

In this embodiment, the image pickup apparatus 100 reduces the driving sound generated in the housing of the optical lens 300, but it may also reduce driving sound generated inside the image pickup apparatus 100 itself. Examples of driving sound generated inside the image pickup apparatus 100 are board squeal and radio wave noise. Board squeal is, for example, sound produced by the board creaking when a voltage is applied to a capacitor on the board.

The threshold values Th1 and Th2 of the environmental sound determination unit 2069, the threshold value Th_Ndiff[n] of the short-term noise detection unit 2065, and the threshold value Th_Nint[n] of the long-term noise detection unit 2067 are determined based on the driving sound that is generated and on the environmental sound. Therefore, the image pickup apparatus 100 may change each of these threshold values depending on the type of the optical lens 300, the inclination of the image pickup apparatus 100, and the like.

<Noise parameter for each lens>
First, the case where the user replaces the optical lens 300 and the same noise parameter is applied to each lens will be described with reference to FIGS. 7 and 10. It is assumed that different lenses are mounted on the image pickup apparatus 100 in FIG. 7 and in FIG. 10, and that the same environmental sound is present in both figures.

FIGS. 7(a) and 10(a) show the amplitude of each frequency spectrum of Lch_Before. FIGS. 7(b) and 10(b) show the amplitude of each frequency spectrum of Nch_Before. Comparing the respective frequency spectra shows that they differ from each other at least in part. This is because the noise picked up by each of the L microphone 201a, the R microphone 201b, and the noise microphone 201c differs depending on the shape and structure of the lens and on the number and positions of the noise sources, such as drive bodies, included in the lens.

FIGS. 7(c) and 10(c) show the amplitude of each frequency spectrum of NL. Here, NL is generated using the same noise parameter regardless of which lens is mounted on the image pickup apparatus 100.

FIGS. 7(d) and 10(d) show the amplitude of each frequency spectrum of Lch_After. Since Lch_After is the audio signal recorded as the environmental sound, the frequency spectra shown in FIGS. 7(d) and 10(d) would ideally be the same. However, comparing the two shows that they differ from each other at least in part. This is because the frequency spectrum of the NL subtracted from Lch_Before differs from the frequency spectrum of the noise actually generated by each lens. The NL spectrum differs from the noise spectrum because the noise parameter is not tailored to the noise generated by the particular lens but is a general-purpose parameter for a certain type of noise. That is, when the same noise parameter is applied to the noise generated by various lenses, the audio data recorded on the recording medium 110 as the environmental sound differs depending on the lens mounted on the image pickup apparatus 100.

そこで本実施例では、撮像装置１００は、レンズが交換されたことに応じてノイズパラメータを更新することで、ノイズを効果的に低減し、環境音をより正確に記録する。 Therefore, in the present embodiment, the image pickup apparatus 100 effectively reduces the noise and records the environmental sound more accurately by updating the noise parameter according to the replacement of the lens.

<Update of noise parameters>
Here, the process by which the noise parameter update unit 216 updates the noise parameter will be described. This embodiment describes the process by which the image pickup apparatus 100 updates the noise parameter PLx. The process of updating the noise parameter PRx is the same as the process of updating PLx.

図１１は、ノイズパラメータ更新部２１６のブロック図の一例である。 FIG. 11 is an example of a block diagram of the noise parameter update unit 216.

ノイズパラメータ演算部２１６１は、入力されたＬｃｈ＿ＢｅｆｏｒｅおよびＮｃｈ＿ＢｅｆｏｒｅからＰＬｘを生成する。 The noise parameter calculation unit 2161 generates PLx from the input Lch_Before and Nch_Before.

The comparator 2162 compares the PLx generated by the noise parameter calculation unit 2161 with the PLx recorded in the noise parameter recording unit 205. Here, the comparator 2162 reads the recorded PLx from the noise parameter recording unit 205 based on the data, input from the noise parameter selection unit 206, that indicates the type of noise parameter. For example, when the type of the noise parameter is long-term noise, the comparator 2162 reads the noise parameter for long-term noise from the noise parameter recording unit 205. Among the frequency spectrum coefficients of the PLx generated by the noise parameter calculation unit 2161, the comparator 2162 overwrites the recorded values with those coefficients that are smaller than the corresponding coefficients of the PLx recorded in the noise parameter recording unit 205. The reason for updating in this way is as follows.

For example, at a certain frequency, let S be the amplitude of the frequency spectrum of the environmental sound picked up by the L microphone 201a, and let Nfl be the amplitude of the frequency spectrum of the noise from the lens picked up by the L microphone 201a. At the same frequency, let Nfn be the amplitude of the frequency spectrum of the noise from the lens picked up by the noise microphone 201c. In this embodiment, the amplitude of the environmental sound picked up by the noise microphone 201c is assumed to be sufficiently small compared with Nfn. In this case, PLx can be expressed by the following equation.
PLx = (S + Nfl) / Nfn

To reduce noise effectively, the noise parameter should be the PLx obtained when S is close to 0 (PLx ≈ Nfl / Nfn). In other words, the desirable value of the noise parameter PLx is a small value, because any environmental sound S only increases PLx. S is close to 0 when the environmental sound is quiet. Therefore, in this embodiment, among the frequency spectrum coefficients of the PLx generated by the noise parameter calculation unit 2161, the comparator 2162 records as the new noise parameter those coefficients that are smaller than the corresponding coefficients of the PLx recorded in the noise parameter recording unit 205. In that case, the amplitude of the environmental sound when the noise parameter calculation unit 2161 generated the PLx is smaller than the amplitude of the environmental sound when the recorded PLx was generated.
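
As a rough numerical illustration of the relation PLx = (S + Nfl) / Nfn, the following Python sketch evaluates PLx at one frequency with and without environmental sound. The function name and the amplitude values are hypothetical and are not taken from the embodiment.

```python
def noise_parameter(s, nfl, nfn):
    """Return PLx = (S + Nfl) / Nfn for a single frequency bin."""
    if nfn <= 0:
        raise ValueError("Nfn must be positive")
    return (s + nfl) / nfn


# Hypothetical amplitudes: the lens noise is the same in both cases,
# only the environmental sound S differs.
quiet = noise_parameter(s=0.0, nfl=1.1, nfn=1.0)  # S close to 0
loud = noise_parameter(s=0.5, nfl=1.1, nfn=1.0)   # S clearly present

# The quieter measurement yields the smaller PLx, closer to Nfl / Nfn.
print(quiet < loud)
```

With these assumed values, the measurement taken in the quiet environment gives the smaller PLx, which is why the smaller value is preferred.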

The comparator 2162 compares PLx separately for each frequency spectrum recorded in the noise parameter. Among the coefficients of the PLx generated by the noise parameter calculation unit 2161, the comparator 2162 does not update any frequency spectrum coefficient whose value is equal to or greater than the corresponding value of the PLx recorded in the noise parameter recording unit 205.
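
The per-frequency comparison described above can be sketched in Python as follows. This is a minimal illustration that assumes each noise parameter is held as a list of coefficients, one per frequency bin; the function name and the sample values are hypothetical.

```python
def update_stored_parameter(stored, measured):
    """Keep, for each frequency bin, the smaller of the stored and measured coefficients.

    A measured coefficient that is equal to or greater than the stored one
    leaves that bin unchanged.
    """
    if len(stored) != len(measured):
        raise ValueError("stored and measured must have the same number of bins")
    return [new if new < old else old for old, new in zip(stored, measured)]


# Hypothetical three-bin example.
stored = [1.4, 1.0, 2.0]
measured = [1.3, 1.2, 2.0]
updated = update_stored_parameter(stored, measured)
print(updated)
```

Only the first bin changes here, because it is the only bin in which the measured coefficient is smaller than the stored one.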

The counter 2163 counts the number of times the comparator 2162 has compared noise parameters. In this embodiment, the initial value of this count is 0. The counter 2163 resets the count when the lens is replaced. In this embodiment, the counter 2163 resets the count in response to receiving lens information from the lens control unit 102.

Next, the noise parameter update process performed by the noise parameter update unit 216 will be described with reference to the timing chart of FIG. 12. In this timing chart, the time at which the user turns on the power of the image pickup apparatus 100 is time t0. This embodiment describes the noise parameter PLx for one frequency spectrum. In this embodiment, the initial value of this noise parameter PLx is 1.4.

本実施例では、時刻ｔ０の後、撮像装置１００は、ライブビュー動作を開始し、被写体に対してＡＦ処理を行う。このＡＦ処理に伴って、光学レンズ３００のモータの駆動等によって発生したノイズが検出された場合、ノイズパラメータの更新処理が行われる。なお、ノイズパラメータの更新処理中において、ノイズパラメータ選択部２０６はノイズの種類の判別処理を行う。 In this embodiment, after the time t0, the image pickup apparatus 100 starts the live view operation and performs AF processing on the subject. When noise generated by driving the motor of the optical lens 300 or the like is detected along with this AF processing, the noise parameter is updated. During the noise parameter update process, the noise parameter selection unit 206 performs noise type discrimination process.

時刻ｔ１において、ノイズパラメータ演算部２１６１は、レンズの駆動が開始したことに応じてノイズパラメータＰＬｘを計算する。例えば、被写体に合焦させるための操作を撮像装置１００が受け付けたことに応じて、このレンズの駆動が行われる。ここで、例えば、時刻ｔ１から時刻ｔ２までレンズが駆動した場合、レンズ制御信号がその間Ｈｉｇｈになる。ノイズパラメータ演算部２１６１は、レンズ制御信号がＨｉｇｈである間、ノイズパラメータ選択部２０６からのノイズパラメータの種類を示すデータに基づいて、ＰＬｘを計算する。また、ノイズパラメータ演算部２１６１は、時刻ｔ２でレンズ制御信号がＬｏｗになったことに応じて、ＰＬｘの計算を終了する。ここでは、ノイズパラメータ演算部２１６１によって生成されたＰＬｘの値は１．３とする。なお、例えば、時刻ｔ１から時刻ｔ２までの間に、光学レンズ３００において複数の種類のノイズが発生した場合ノイズパラメータ演算部２１６１は、それぞれのノイズに対してＰＬｘを生成する。ノイズパラメータ演算部２１６１は、ノイズパラメータの種類を示すデータ、および生成されたＰＬｘを比較器２１６２に出力する。 At time t1, the noise parameter calculation unit 2161 calculates the noise parameter PLx according to the start of driving the lens. For example, the lens is driven in response to the acceptance of the operation for focusing on the subject by the image pickup apparatus 100. Here, for example, when the lens is driven from the time t1 to the time t2, the lens control signal becomes High during that time. The noise parameter calculation unit 2161 calculates PLx based on the data indicating the type of noise parameter from the noise parameter selection unit 206 while the lens control signal is High. Further, the noise parameter calculation unit 2161 ends the calculation of PLx according to the fact that the lens control signal becomes Low at time t2. Here, the value of PLx generated by the noise parameter calculation unit 2161 is 1.3. For example, when a plurality of types of noise are generated in the optical lens 300 between the time t1 and the time t2, the noise parameter calculation unit 2161 generates PLx for each noise. The noise parameter calculation unit 2161 outputs data indicating the type of noise parameter and the generated PLx to the comparator 2162.

時刻ｔ２において、比較器２１６２は、ノイズパラメータ演算部２１６１によって生成されたＰＬｘと、ノイズパラメータ記録部２０５に記録されているＰＬｘとを比較し、より小さい値をノイズパラメータＰＬｘの値として決定する。時刻ｔ２では、ノイズパラメータ演算部２１６１によって新たに生成されたＰＬｘ（＝１．３）のほうが、ノイズパラメータ記録部２０５に記録されているＰＬｘよりも小さい。そのため、比較器２１６２はノイズパラメータＰＬｘの値として、１．３をノイズパラメータ記録部２０５に記録することで、ＰＬｘを更新する。 At time t2, the comparator 2162 compares the PLx generated by the noise parameter calculation unit 2161 with the PLx recorded in the noise parameter recording unit 205, and determines a smaller value as the value of the noise parameter PLx. At time t2, the PLx (= 1.3) newly generated by the noise parameter calculation unit 2161 is smaller than the PLx recorded in the noise parameter recording unit 205. Therefore, the comparator 2162 updates the PLx by recording 1.3 as the value of the noise parameter PLx in the noise parameter recording unit 205.

The comparator 2162 determines the type of noise parameter to be updated based on the data indicating the type of noise parameter and the PLx output from the noise parameter calculation unit 2161. For example, when a PLx for short-term noise is output from the noise parameter calculation unit 2161, the comparator 2162 compares that PLx with the PL2 recorded in the noise parameter recording unit 205.

When the noise parameter calculation unit 2161 outputs a plurality of PLx for the same type of noise, the comparator 2162 uses a noise parameter generated by combining, for each frequency spectrum, the coefficient with the lowest value among the plurality of PLx. This is because a coefficient with a lower value is considered to have been generated when the amplitude of the environmental sound was smaller. In this case, the comparator 2162 compares the combined noise parameter with the PLx recorded in the noise parameter recording unit 205.
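
Combining several PLx for the same type of noise can be sketched in Python as follows. This is a minimal illustration, assuming each PLx is a list of per-bin coefficients of equal length; the function name and the values are hypothetical.

```python
def combine_measurements(measurements):
    """Build one parameter by taking the lowest coefficient in each frequency bin."""
    if not measurements:
        raise ValueError("at least one measurement is required")
    length = len(measurements[0])
    if any(len(m) != length for m in measurements):
        raise ValueError("all measurements must have the same number of bins")
    # zip(*measurements) groups the coefficients of the same bin together.
    return [min(bin_values) for bin_values in zip(*measurements)]


# Two hypothetical measurements of the same noise type, two bins each.
combined = combine_measurements([[1.3, 1.5], [1.4, 1.2]])
print(combined)
```

The combined parameter takes its first bin from the first measurement and its second bin from the second, and would then be compared with the recorded PLx bin by bin.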

Similarly, when the lens is driven from time t3 to time t4, assume that the PLx calculated by the noise parameter calculation unit 2161 at time t4 is 1.1. At time t4, the comparator 2162 compares the PLx calculated by the noise parameter calculation unit 2161 with the PLx recorded in the noise parameter recording unit 205, and determines the smaller value as the noise parameter PLx. At time t4, the PLx calculated by the noise parameter calculation unit 2161 (= 1.1) is smaller, so the comparator 2162 updates PLx by recording 1.1 in the noise parameter recording unit 205 as the value of the noise parameter PLx.

他方、時刻ｔ５から時刻ｔ６までレンズが駆動した場合において、時刻ｔ６においてノイズパラメータ演算部２１６１によって算出されたＰＬｘは１．６とする。時刻ｔ６において、比較器２１６２は、ノイズパラメータ演算部２１６１によって算出されたＰＬｘと、ノイズパラメータ記録部２０５に記録されているＰＬｘとを比較し、より小さい値をノイズパラメータＰＬｘとして決定する。時刻ｔ６では、ノイズパラメータ記録部２０５に記録されているＰＬｘ（＝１．１）のほうが小さいため、比較器２１６２はＰＬｘを更新しない。 On the other hand, when the lens is driven from time t5 to time t6, PLx calculated by the noise parameter calculation unit 2161 at time t6 is 1.6. At time t6, the comparator 2162 compares the PLx calculated by the noise parameter calculation unit 2161 with the PLx recorded in the noise parameter recording unit 205, and determines a smaller value as the noise parameter PLx. At time t6, since the PLx (= 1.1) recorded in the noise parameter recording unit 205 is smaller, the comparator 2162 does not update the PLx.
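
The sequence in this timing chart can be replayed with a short Python sketch for a single frequency. The initial value 1.4 and the measured values 1.3, 1.1, and 1.6 are the ones given in the text; the function name is hypothetical.

```python
def trace_updates(initial, measurements):
    """Return the recorded PLx after each lens drive period (single frequency)."""
    stored = initial
    history = []
    for measured in measurements:
        if measured < stored:  # overwrite only when the new value is smaller
            stored = measured
        history.append(stored)
    return history


# Initial value 1.4, then measurements at times t2, t4, and t6.
history = trace_updates(1.4, [1.3, 1.1, 1.6])
print(history)  # [1.3, 1.1, 1.1]
```

The last entry stays at 1.1 because the measurement of 1.6 at time t6 is larger than the recorded value.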

以降、同様に光学レンズ３００が駆動したことに応じて、ノイズパラメータ更新部２１６はノイズパラメータＰＬｘの更新処理を実行する。 After that, the noise parameter updating unit 216 executes the noise parameter PLx updating process in response to the driving of the optical lens 300 in the same manner.

The image pickup apparatus 100 also performs the noise parameter update process when, before starting moving image shooting, the user performs an operation that causes the image pickup apparatus 100 to perform an operation accompanied by noise. For example, in response to the user pressing the release switch 61 halfway, the image pickup apparatus 100 drives the motor of the optical lens 300 to focus on the subject. The image pickup apparatus 100 uses the noise generated at this time to perform the noise parameter update process.

なお、ノイズパラメータ更新部２１６がノイズパラメータを更新した場合でも、ノイズパラメータ記録部２０５はノイズパラメータの初期値を保持する。これは、撮像装置１００に装着される光学レンズ３００が交換された場合に、ノイズパラメータ更新部２１６がノイズパラメータを初期値に変更するためである。上述したように、撮像装置１００に装着される光学レンズ３００が交換された場合、Ｌマイク２０１ａ、Ｒマイク２０１ｂ、およびノイズマイク２０１ｃによって取得されるノイズが変化する。本実施例では、その変化に対応するため、ノイズパラメータ更新部２１６は、ノイズパラメータを汎用的な初期値から更新する。撮像装置１００はレンズが交換されるたびにノイズパラメータを汎用的な初期値から更新することで、装着されるレンズに適応したノイズパラメータを使用してノイズ低減処理を行うことができる。 Even when the noise parameter updating unit 216 updates the noise parameter, the noise parameter recording unit 205 holds the initial value of the noise parameter. This is because the noise parameter updating unit 216 changes the noise parameter to the initial value when the optical lens 300 mounted on the image pickup apparatus 100 is replaced. As described above, when the optical lens 300 mounted on the image pickup apparatus 100 is replaced, the noise acquired by the L microphone 201a, the R microphone 201b, and the noise microphone 201c changes. In this embodiment, the noise parameter update unit 216 updates the noise parameter from a general-purpose initial value in order to cope with the change. By updating the noise parameter from a general-purpose initial value each time the lens is replaced, the image pickup apparatus 100 can perform noise reduction processing using the noise parameter adapted to the mounted lens.
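
One way to picture a recording unit that retains the initial value alongside the updated one is the following Python sketch. The class and method names are hypothetical, and the in-memory lists stand in for whatever storage the noise parameter recording unit 205 actually uses.

```python
class NoiseParameterStore:
    """Holds the current noise parameter and a retained copy of its initial value."""

    def __init__(self, initial):
        self._initial = tuple(initial)  # retained even after updates
        self.current = list(initial)

    def update(self, measured):
        """Overwrite each bin only when the measured coefficient is smaller."""
        if len(measured) != len(self.current):
            raise ValueError("measured must have the same number of bins")
        self.current = [min(old, new) for old, new in zip(self.current, measured)]

    def reset(self):
        """Return to the general-purpose initial value, e.g. after a lens exchange."""
        self.current = list(self._initial)


store = NoiseParameterStore([1.4, 1.4])
store.update([1.3, 1.6])
after_update = list(store.current)
store.reset()
after_reset = list(store.current)
print(after_update, after_reset)
```

Because the initial value is kept separately, the reset after a lens exchange does not depend on any of the values learned for the previous lens.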

Next, the noise parameter update process of the image pickup apparatus 100 will be described with reference to the flowchart of FIG. 13 and the timing chart of FIG. 14. This process of the image pickup apparatus 100 is started, for example, when the user operates the power switch 72 and the power is turned on. As another example, this process is started when the user operates the mode dial and the image pickup apparatus 100 is set to the shooting mode.

ここで、本実施例では、撮像装置１００は、電源をオンされた後、撮像部１０１によって撮像された映像をライブビュー表示している。 Here, in this embodiment, the image pickup apparatus 100 displays the image captured by the image pickup unit 101 in a live view after the power is turned on.

ステップＳ１３００では、レンズ制御部１０２は、光学レンズ３００からレンズ情報を受信する。レンズ情報は、例えば、レンズの種類、レンズの型番、およびノイズ源の種類等である。レンズ制御部１０２は、受信したレンズ情報を制御部１１１およびカウンター２１６３に出力する。制御部１１１は受信したレンズ情報を不揮発性メモリ１１７に記録する。 In step S1300, the lens control unit 102 receives lens information from the optical lens 300. The lens information is, for example, the type of lens, the model number of the lens, the type of noise source, and the like. The lens control unit 102 outputs the received lens information to the control unit 111 and the counter 2163. The control unit 111 records the received lens information in the non-volatile memory 117.

ステップＳ１３０１では、レンズ制御部１０２は、レンズが交換されたか否かを判断する。例えば、レンズ制御部１０２は、ステップＳ１３０１の処理より前に不揮発性メモリ１１７に記録されていたレンズ情報と、ステップＳ１３００において受信したレンズ情報とを比較する。２つのレンズ情報が一致すると判断した場合、レンズ制御部１０２はレンズが交換されていないと判断する。２つのレンズ情報が不一致であると判断した場合、レンズ制御部１０２はレンズが交換されたと判断する。また、例えば、ステップＳ１３０１の処理より前に不揮発性メモリ１１７にレンズ情報が記録されていない場合、レンズ制御部１０２はレンズが交換されたと判断する。レンズ制御部１０２がレンズが交換されたと判断した場合、ステップＳ１３０２の処理が実行される。レンズ制御部１０２がレンズが交換されていないと判断した場合、ステップＳ１３０５の処理が実行される。 In step S1301, the lens control unit 102 determines whether or not the lens has been replaced. For example, the lens control unit 102 compares the lens information recorded in the non-volatile memory 117 before the process of step S1301 with the lens information received in step S1300. When it is determined that the two lens information match, the lens control unit 102 determines that the lenses have not been exchanged. If it is determined that the two lens information do not match, the lens control unit 102 determines that the lenses have been exchanged. Further, for example, when the lens information is not recorded in the non-volatile memory 117 before the process of step S1301, the lens control unit 102 determines that the lens has been replaced. When the lens control unit 102 determines that the lens has been replaced, the process of step S1302 is executed. If the lens control unit 102 determines that the lens has not been replaced, the process of step S1305 is executed.
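
The lens exchange decision of step S1301 can be sketched in Python as follows. This is an illustration only: the function name and the dictionary layout of the lens information are assumptions, and `None` stands for the case in which no lens information has been recorded yet.

```python
def lens_was_exchanged(stored_info, received_info):
    """Return True when the lens is judged to have been exchanged.

    stored_info is the previously recorded lens information, or None if nothing
    was recorded; received_info is the lens information just received.
    """
    if stored_info is None:
        return True  # no earlier record: treated as an exchange
    return stored_info != received_info


# Hypothetical lens information records.
lens_a = {"type": "zoom", "model": "A-100"}
lens_b = {"type": "prime", "model": "B-50"}

print(lens_was_exchanged(None, lens_a))          # True
print(lens_was_exchanged(lens_a, dict(lens_a)))  # False
print(lens_was_exchanged(lens_a, lens_b))        # True
```

In the flowchart, a True result leads to step S1302 and a False result leads to step S1305.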

ステップＳ１３０２では、ノイズパラメータ更新部２１６は、ノイズパラメータ記録部２０５に記録されているノイズパラメータの値を初期値に変更する。 In step S1302, the noise parameter updating unit 216 changes the value of the noise parameter recorded in the noise parameter recording unit 205 to the initial value.

ステップＳ１３０３では、カウンター２１６３は、ノイズパラメータが比較された回数をリセットする。例えば、カウンター２１６３は、ノイズパラメータが比較された回数を０回にする。 In step S1303, the counter 2163 resets the number of times the noise parameters have been compared. For example, the counter 2163 sets the number of times the noise parameters are compared to 0.

ステップＳ１３０４では、制御部１１１は、ノイズパラメータ更新部２１６への録音パスをオンにする。例えば、制御部１１１は、音声処理を開始するよう音声入力部１０４を制御する。なお、本ステップにおいて、ノイズパラメータ選択部２０６も処理を開始する。 In step S1304, the control unit 111 turns on the recording path to the noise parameter update unit 216. For example, the control unit 111 controls the voice input unit 104 so as to start voice processing. In this step, the noise parameter selection unit 206 also starts processing.

ステップＳ１３０５では、制御部１１１は、ノイズパラメータが比較された回数が所定回数以下か否かを判断する。例えば、制御部１１１はノイズパラメータが比較された回数が４回以下か否かを判断する。ノイズパラメータが比較された回数が所定回数以下であると判断された場合、ステップＳ１３０６の処理が実行される。ノイズパラメータが比較された回数が所定回数より多いと判断された場合、ステップＳ１３１２の処理が実行される。 In step S1305, the control unit 111 determines whether or not the number of times the noise parameters are compared is equal to or less than a predetermined number of times. For example, the control unit 111 determines whether or not the number of times the noise parameters are compared is 4 times or less. If it is determined that the number of times the noise parameters are compared is less than or equal to the predetermined number of times, the process of step S1306 is executed. If it is determined that the number of times the noise parameters are compared is greater than the predetermined number of times, the process of step S1312 is executed.

ステップＳ１３０６では、制御部１１１は、光学レンズ３００が駆動を開始したか否かを判断する。例えば、制御部１１１は、レンズ制御部１０２から出力されるレンズ制御信号のレベルがＨｉｇｈかＬｏｗかを判断する。レンズ制御信号のレベルがＨｉｇｈである場合、制御部１１１は、光学レンズが駆動を開始したと判断する。レンズ制御信号のレベルがＬｏｗである場合、制御部１１１は、光学レンズが駆動していないと判断する。光学レンズ３００が駆動を開始したと判断された場合、ステップＳ１３０７の処理が実行される。光学レンズ３００が駆動していないと判断された場合、ステップＳ１３１０の処理が実行される。 In step S1306, the control unit 111 determines whether or not the optical lens 300 has started driving. For example, the control unit 111 determines whether the level of the lens control signal output from the lens control unit 102 is High or Low. When the level of the lens control signal is High, the control unit 111 determines that the optical lens has started driving. When the level of the lens control signal is Low, the control unit 111 determines that the optical lens is not driven. If it is determined that the optical lens 300 has started driving, the process of step S1307 is executed. If it is determined that the optical lens 300 is not driven, the process of step S1310 is executed.

ステップＳ１３０７では、ノイズパラメータ更新部２１６は、ノイズパラメータＰＬｘを生成する。なお、本ステップにおいて、ノイズパラメータ更新部２１６は、ノイズが発生するたびに、そのノイズに対するノイズパラメータを生成する。 In step S1307, the noise parameter update unit 216 generates the noise parameter PLx. In this step, the noise parameter update unit 216 generates a noise parameter for the noise each time the noise is generated.

ステップＳ１３０８では、制御部１１１は、光学レンズ３００が駆動を終了したか否かを判断する。例えば、制御部１１１は、レンズ制御部１０２から出力されるレンズ制御信号のレベルがＨｉｇｈかＬｏｗかを判断する。レンズ制御信号のレベルがＨｉｇｈである場合、制御部１１１は、光学レンズが駆動していると判断する。レンズ制御信号のレベルがＬｏｗである場合、制御部１１１は、光学レンズが駆動を終了したと判断する。光学レンズ３００が駆動していると判断された場合、ステップＳ１３０７の処理が実行される。光学レンズ３００が駆動を終了したと判断された場合、ステップＳ１３０９の処理が実行される。 In step S1308, the control unit 111 determines whether or not the optical lens 300 has finished driving. For example, the control unit 111 determines whether the level of the lens control signal output from the lens control unit 102 is High or Low. When the level of the lens control signal is High, the control unit 111 determines that the optical lens is being driven. When the level of the lens control signal is Low, the control unit 111 determines that the optical lens has finished driving. If it is determined that the optical lens 300 is being driven, the process of step S1307 is executed. If it is determined that the optical lens 300 has finished driving, the process of step S1309 is executed.

ステップＳ１３０９では、ノイズパラメータ更新部２１６は、ノイズパラメータが比較された回数を１回増加させる。 In step S1309, the noise parameter update unit 216 increases the number of times the noise parameters are compared by one time.

ステップＳ１３１０では、ノイズパラメータ更新部２１６は、ステップＳ１３０７において生成されたノイズパラメータＰＬｘと、ノイズパラメータ記録部２０５に記録されているＰＬｘと、を比較する。ステップＳ１３０７において生成されたノイズパラメータＰＬｘのほうが小さいと判断された場合、ステップＳ１３１１の処理が実行される。ノイズパラメータ記録部２０５に記録されているＰＬｘのほうが小さいと判断された場合、ステップＳ１３１３の処理が実行される。 In step S1310, the noise parameter updating unit 216 compares the noise parameter PLx generated in step S1307 with the PLx recorded in the noise parameter recording unit 205. If it is determined that the noise parameter PLx generated in step S1307 is smaller, the process of step S1311 is executed. If it is determined that the PLx recorded in the noise parameter recording unit 205 is smaller, the process of step S1313 is executed.

ステップＳ１３１１では、ノイズパラメータ更新部２１６はＰＬｘを更新する。例えば、ノイズパラメータ更新部２１６は、ステップＳ１３０７において生成されたノイズパラメータＰＬｘを、ノイズパラメータ記録部２０５に記録されているＰＬｘに上書き記録する。これにより、撮像装置１００は望ましいノイズパラメータを後述の動画記録において利用することができる。また、撮像装置１００は望ましいノイズパラメータを利用することで効果的にノイズを低減することができる。なお、ノイズパラメータ記録部２０５は、本ステップの処理が実行された場合でもノイズパラメータの初期値を保持している。 In step S1311, the noise parameter update unit 216 updates PLx. For example, the noise parameter updating unit 216 overwrites and records the noise parameter PLx generated in step S1307 on the PLx recorded in the noise parameter recording unit 205. As a result, the image pickup apparatus 100 can utilize the desirable noise parameters in the moving image recording described later. Further, the image pickup apparatus 100 can effectively reduce noise by using desirable noise parameters. The noise parameter recording unit 205 holds the initial value of the noise parameter even when the process of this step is executed.
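
The loop formed by steps S1305 to S1311 can be condensed into the following Python sketch for a single frequency. It is a simplification under stated assumptions: each element of `drive_measurements` stands for the PLx generated during one lens drive period, the function and parameter names are hypothetical, and the lens drive detection of steps S1306 and S1308 is not modeled.

```python
def run_pre_recording_updates(stored, drive_measurements, threshold=4):
    """Process lens drive periods until the comparison count exceeds the threshold.

    Returns the recorded PLx and the comparison count.
    """
    count = 0
    for measured in drive_measurements:
        if count > threshold:  # S1305: proceed only while count <= threshold
            break
        count += 1             # S1309: increment the comparison count
        if measured < stored:  # S1310: compare generated and recorded PLx
            stored = measured  # S1311: overwrite the recorded PLx
    return stored, count


result = run_pre_recording_updates(1.4, [1.3, 1.1, 1.6])
print(result)  # (1.1, 3)
```

Because this sketch starts the count at 0 and tests "equal to or less than", it processes up to threshold + 1 drive periods before stopping.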

ここまで、ノイズパラメータの比較回数が所定回数以下の場合について説明した。図１４のタイミングチャートでは、時刻ｔ１～ｔ８の期間に相当する。次にステップＳ１３０５の処理において、ノイズパラメータの比較回数が所定回数より多いと判断された場合の処理について説明する。図１４のタイミングチャートでは、時刻ｔ８以降の期間に相当する。 Up to this point, the case where the number of comparisons of noise parameters is less than or equal to a predetermined number has been described. In the timing chart of FIG. 14, it corresponds to the period from time t1 to t8. Next, in the process of step S1305, the process when it is determined that the number of times of comparison of the noise parameters is more than the predetermined number of times will be described. In the timing chart of FIG. 14, it corresponds to the period after the time t8.

ステップＳ１３１２では、制御部１１１は、ノイズパラメータ更新部２１６への録音パスをオフにする。例えば、制御部１１１は、音声処理を停止するよう音声入力部１０４を制御する。なお、本ステップにおいて、ノイズパラメータ選択部２０６も処理を停止する。 In step S1312, the control unit 111 turns off the recording path to the noise parameter update unit 216. For example, the control unit 111 controls the voice input unit 104 so as to stop the voice processing. In this step, the noise parameter selection unit 206 also stops processing.

ステップＳ１３１３では、制御部１１１は、動画記録を開始するか否かを判断する。例えば、制御部１１１は、レリーズスイッチ６１を押下されたことに応じて、動画記録を開始すると判断する。逆に、制御部１１１は、レリーズスイッチ６１を押下されていない場合、動画記録を開始しないと判断する。制御部１１１が動画記録を開始すると判断した場合、ステップＳ１３１４の処理が実行される。制御部１１１が動画記録を開始しないと判断した場合、ステップＳ１３００の処理が実行される。 In step S1313, the control unit 111 determines whether or not to start video recording. For example, the control unit 111 determines that the moving image recording is started in response to the pressing of the release switch 61. On the contrary, the control unit 111 determines that the moving image recording is not started when the release switch 61 is not pressed. If the control unit 111 determines to start video recording, the process of step S1314 is executed. If the control unit 111 determines that the moving image recording is not started, the process of step S1300 is executed.

ステップＳ１３１４では、制御部１１１は、音声入力部１０４への録音パスをオンにする。例えば、制御部１１１は、音声処理を開始するよう音声入力部１０４を制御する。 In step S1314, the control unit 111 turns on the recording path to the voice input unit 104. For example, the control unit 111 controls the voice input unit 104 so as to start voice processing.

ステップＳ１３１５では、制御部１１１は動画記録する。例えば、前述したように、制御部１１１は記録媒体１１０に音声付き動画データを記録する。この動画記録中の音声処理では、ノイズパラメータ記録部２０５に記録されたノイズパラメータが利用される。 In step S1315, the control unit 111 records a moving image. For example, as described above, the control unit 111 records the moving image data with audio on the recording medium 110. In the voice processing during the moving image recording, the noise parameter recorded in the noise parameter recording unit 205 is used.

ステップＳ１３１６では、制御部１１１は動画記録を終了するか否かを判断する。例えば、制御部１１１は、レリーズスイッチ６１を押下されたことに応じて、動画記録を終了すると判断する。逆に、制御部１１１は、レリーズスイッチ６１を押下されていない場合、動画記録を終了しないと判断する。制御部１１１が動画記録を終了すると判断した場合、ステップＳ１３１７の処理が実行される。制御部１１１が動画記録を終了しないと判断した場合、ステップＳ１３１５の処理が実行される。 In step S1316, the control unit 111 determines whether or not to end the moving image recording. For example, the control unit 111 determines that the moving image recording is finished in response to the pressing of the release switch 61. On the contrary, the control unit 111 determines that the moving image recording is not finished when the release switch 61 is not pressed. When the control unit 111 determines that the moving image recording is finished, the process of step S1317 is executed. If the control unit 111 determines that the moving image recording is not completed, the process of step S1315 is executed.

ステップＳ１３１７では、制御部１１１は、ノイズパラメータ更新部２１６への録音パスをオフにする。例えば、制御部１１１は、音声処理を停止するよう音声入力部１０４を制御する。 In step S1317, the control unit 111 turns off the recording path to the noise parameter update unit 216. For example, the control unit 111 controls the voice input unit 104 so as to stop the voice processing.

以上、撮像装置１００のノイズパラメータの更新処理について説明した。このように動画記録処理の開始前にレンズの駆動に応じて新たにノイズパラメータを生成する。そして、既に生成済のノイズパラメータよりも、新たに生成されたノイズパラメータの方が適切にノイズを低減できる場合には、新たに生成したノイズパラメータを用いて、記録済のノイズパラメータを更新する。更新されたノイズパラメータを用いて、動画記録中の音声に含まれるノイズを低減することで、撮像装置１００は音質の良い音声付き動画データを記録することができる。 The process of updating the noise parameter of the image pickup apparatus 100 has been described above. In this way, a new noise parameter is generated according to the drive of the lens before the start of the moving image recording process. Then, when the newly generated noise parameter can reduce the noise more appropriately than the already generated noise parameter, the newly generated noise parameter is used to update the recorded noise parameter. By using the updated noise parameters to reduce the noise contained in the sound during the moving image recording, the image pickup apparatus 100 can record the moving image data with sound having good sound quality.

The noise parameter recording unit 205 retains the updated noise parameters even after the power of the image pickup apparatus 100 is turned off. As a result, the next time the power of the image pickup apparatus 100 is turned on, the image pickup apparatus 100 can record moving image data with sound using the updated noise parameters, provided that the lens has not been replaced.

なお、本実施例では、ノイズパラメータ更新部２１６は、動画記録中はノイズパラメータを更新する処理を実行しない。撮像装置１００は、動画記録中においてノイズパラメータを一定にすることで、音質を保ちながら音声を記録することができる。 In this embodiment, the noise parameter updating unit 216 does not execute the process of updating the noise parameter during the moving image recording. The image pickup apparatus 100 can record sound while maintaining sound quality by keeping the noise parameter constant during video recording.

なお、本実施例では、ノイズパラメータ更新部２１６はノイズパラメータが比較される前のステップにおいて、ノイズパラメータが比較された回数を増加させているが、ノイズパラメータを比較した後にノイズパラメータが比較された回数を増加させてもよい。 In this embodiment, the noise parameter update unit 216 increases the number of times the noise parameters are compared in the step before the noise parameters are compared, but the noise parameters are compared after the noise parameters are compared. The number of times may be increased.

The noise parameter update unit 216 may determine whether or not to update the noise parameter based on the tilt of the image pickup apparatus 100 acquired by the information acquisition unit 103. For example, if the noise parameter is intended to be applied when the image pickup apparatus 100 is horizontal, the noise parameter update unit 216 may not be able to calculate the noise parameter correctly while the image pickup apparatus 100 is pointing upward. In this case, the noise parameter update unit 216 executes the update process when it determines that the image pickup apparatus 100 is horizontal, and does not execute the update process when it determines that the image pickup apparatus 100 is not horizontal. For example, the image pickup apparatus 100 is regarded as horizontal when its tilt is within ±30 degrees of the horizontal direction.
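
The tilt condition can be expressed as a small Python predicate. The ±30 degree range is the example given in the text; the function name, the parameter names, and the convention that 0 degrees means horizontal are assumptions for illustration.

```python
def is_horizontal(tilt_degrees, tolerance_degrees=30.0):
    """Return True when the tilt lies within the tolerance around horizontal (0 degrees)."""
    return abs(tilt_degrees) <= tolerance_degrees


# The update process would run only for tilts regarded as horizontal.
print(is_horizontal(10.0))   # True
print(is_horizontal(-30.0))  # True
print(is_horizontal(45.0))   # False
```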

When the optical lens 300 is a zoom lens, the noise parameter update unit 216 may determine whether or not to update the noise parameter based on the zoom magnification of the optical lens 300 acquired by the lens control unit 102. For example, if the noise parameter is intended to be applied when the optical lens 300 is at the wide end, the noise parameter update unit 216 may not be able to calculate the noise parameter correctly while the optical lens 300 is at the telephoto end. In this case, the noise parameter update unit 216 executes the update process when it determines that the optical lens 300 is at the wide end, and does not execute the update process when it determines that the optical lens 300 is not at the wide end. For example, the optical lens 300 is regarded as being at the wide end when its zoom magnification is in the range of 1x to 2x.
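
The zoom condition can likewise be written as a Python predicate. The 1x to 2x range is the example given in the text; the function and parameter names are hypothetical, and treating both ends of the range as inclusive is an assumption.

```python
def is_wide_end(zoom_magnification, low=1.0, high=2.0):
    """Return True when the zoom magnification falls in the range regarded as the wide end."""
    return low <= zoom_magnification <= high


# The update process would run only while the lens is regarded as at the wide end.
print(is_wide_end(1.5))  # True
print(is_wide_end(4.0))  # False
```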

なお、ノイズパラメータ記録部２０５は、レンズごとにノイズパラメータを記録してもよい。これはレンズごとに音声入力部１０４のマイクに収音されるノイズが異なることに対応してノイズ低減するためである。この場合、撮像装置１００は、レンズごとのノイズパラメータに対してノイズパラメータを更新する更新処理を実行する。 The noise parameter recording unit 205 may record noise parameters for each lens. This is because the noise is reduced in response to the fact that the noise picked up by the microphone of the voice input unit 104 is different for each lens. In this case, the image pickup apparatus 100 executes an update process for updating the noise parameter with respect to the noise parameter for each lens.
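
Recording a noise parameter for each lens could look like the following Python sketch. It assumes that each lens can be identified by a key such as a model number; the class name, the method names, and the identifiers "lens-A" and "lens-B" are hypothetical.

```python
class PerLensParameters:
    """Keeps one noise parameter per lens, each starting from the same initial value."""

    def __init__(self, initial):
        self._initial = list(initial)
        self._by_lens = {}

    def get(self, lens_id):
        """Return the parameter for a lens, creating it from the initial value if absent."""
        return self._by_lens.setdefault(lens_id, list(self._initial))

    def update(self, lens_id, measured):
        """Apply the keep-the-smaller update to the parameter of the given lens only."""
        current = self.get(lens_id)
        if len(measured) != len(current):
            raise ValueError("measured must have the same number of bins")
        self._by_lens[lens_id] = [min(old, new) for old, new in zip(current, measured)]


params = PerLensParameters([1.4, 1.4])
params.update("lens-A", [1.3, 1.6])
print(params.get("lens-A"))
print(params.get("lens-B"))
```

In this arrangement, the values learned for one lens remain available when that lens is mounted again, and updating one lens leaves the others untouched.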

[Second Embodiment]
The second embodiment describes a noise parameter update process that differs from that of the first embodiment. The configuration of the image pickup apparatus 100 is the same as in the first embodiment. This embodiment also describes the update process for the PLx noise parameter; the update process for the PRx noise parameter is the same as that for the PLx noise parameter.

Next, the noise parameter update process of the image pickup apparatus 100 will be described with reference to the flowchart of FIG. 15 and the timing chart of FIG. 16. This process of the image pickup apparatus 100 is started, for example, when the user operates the power switch 72 and the power is turned on.

ステップＳ１５００～Ｓ１５１０までの処理は、それぞれ図１３のステップＳ１３００～Ｓ１３１０の処理と同様であるため説明を省略する。 Since the processes of steps S1500 to S1510 are the same as the processes of steps S1300 to S1310 of FIG. 13, the description thereof will be omitted.

ステップＳ１５１１では、ノイズパラメータ更新部２１６は、ステップＳ１３０７において生成されたノイズパラメータＰＬｘを、ノイズパラメータ記録部２０５に記録されているＰＬｘに上書き記録する。ステップＳ１５１０の処理後、ステップＳ１５１１の処理が実行される。 In step S1511, the noise parameter updating unit 216 overwrites and records the noise parameter PLx generated in step S1307 on the PLx recorded in the noise parameter recording unit 205. After the process of step S1510, the process of step S1511 is executed.

ステップＳ１５１２では、制御部１１１は、ノイズパラメータ更新部２１６への録音パスをオフにする。例えば、制御部１１１は、音声処理を停止するよう音声入力部１０４を制御する。 In step S1512, the control unit 111 turns off the recording path to the noise parameter update unit 216. For example, the control unit 111 controls the voice input unit 104 so as to stop the voice processing.

ステップＳ１５１３～Ｓ１５１５までの処理は、それぞれ図１３のステップＳ１３１３～Ｓ１３１５の処理と同様であるため説明を省略する。 Since the processes of steps S1513 to S1515 are the same as the processes of steps S1313 to S1315 of FIG. 13, the description thereof will be omitted.

In step S1516, the control unit 111 determines whether the optical lens 300 has been driven. For example, the control unit 111 checks whether the level of the lens control signal output from the lens control unit 102 is High or Low. If the level is High, the control unit 111 determines that the optical lens 300 has been driven; if the level is Low, it determines that the optical lens 300 has not been driven. If the optical lens 300 is determined to have been driven, step S1517 is executed. If it is determined not to have been driven, step S1519 is executed.

In step S1517, the noise parameter update unit 216 generates the noise parameter PLx.

In step S1518, the noise parameter update unit 216 increments by one the count of noise parameter comparisons.

In step S1519, the noise parameter update unit 216 compares the noise parameter PLx generated in step S1517 with the PLx recorded in the noise parameter recording unit 205. If the PLx generated in step S1517 is determined to be smaller, step S1520 is executed. If the PLx recorded in the noise parameter recording unit 205 is determined to be smaller, step S1521 is executed.

In step S1520, the noise parameter update unit 216 updates PLx. For example, the noise parameter update unit 216 overwrites the PLx recorded in the noise parameter recording unit 205 with the noise parameter PLx generated in step S1517. Note that the noise parameter recording unit 205 retains the initial value of the noise parameter even after this step is executed.
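The comparison and overwrite of steps S1518 to S1520 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation described in the specification: the class name `NoiseParameterStore`, its methods, and the example values are hypothetical, and the per-bin comparison is one possible reading of "smaller" that follows the per-frequency update mentioned in the claims. The `reset` method reflects the claims' return to the initial value when the lens is replaced.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class NoiseParameterStore:
    """Hypothetical stand-in for the noise parameter recording unit 205."""
    initial: List[float]                       # initial PLx, never overwritten
    current: List[float] = field(default_factory=list)
    comparisons: int = 0                       # counter incremented in S1518

    def __post_init__(self) -> None:
        # Start from the initial value if no current value was supplied.
        if not self.current:
            self.current = list(self.initial)

    def update_if_smaller(self, generated: List[float]) -> bool:
        """S1518-S1520: count the comparison, keep the smaller value per bin."""
        if len(generated) != len(self.current):
            raise ValueError("parameter length mismatch")
        self.comparisons += 1
        changed = False
        for i, value in enumerate(generated):
            if value < self.current[i]:
                self.current[i] = value        # overwrite the recorded PLx
                changed = True
        return changed

    def reset(self) -> None:
        """Restore the retained initial value and clear the counter."""
        self.current = list(self.initial)
        self.comparisons = 0


store = NoiseParameterStore(initial=[0.8, 0.6, 0.4])
store.update_if_smaller([0.5, 0.7, 0.4])
print(store.current)   # [0.5, 0.6, 0.4]
```

Keeping `initial` separate from `current` mirrors the statement that the recording unit still holds the initial value after an overwrite.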

In step S1521, the control unit 111 determines whether to end moving image recording. For example, the control unit 111 determines that recording is to end when the release switch 61 is pressed, and that recording is not to end when the release switch 61 is not pressed. If the control unit 111 determines that recording is to end, step S1522 is executed. Otherwise, step S1515 is executed.

In step S1522, the control unit 111 turns off the recording path to the noise parameter update unit 216. For example, the control unit 111 controls the audio input unit 104 so that it stops audio processing.

As described above, in this embodiment the image pickup apparatus 100 turns off the recording path once the noise parameter has been updated before moving image recording. The recording path can therefore be turned off after as little as one lens drive, which reduces power consumption in the audio input unit 104. In addition, the image pickup apparatus 100 continues to update the noise parameter during moving image recording, so audio can be recorded with efficient noise reduction even while recording is in progress.
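The two-phase behavior summarized above can be illustrated with a small simulation. This is a sketch under assumptions, not the flow of FIG. 15 itself: the function `simulate`, the event names, and the use of a single scalar for PLx are hypothetical, and the sketch assumes that the recording path is on after power-on and is re-enabled when recording starts, which the text above does not state explicitly.

```python
from typing import Iterable, List, Tuple

# (kind, generated PLx); the value is ignored for non-lens events
Event = Tuple[str, float]


def simulate(events: Iterable[Event], stored: float) -> Tuple[float, List[bool]]:
    """Return the final stored PLx and the recording-path state after each event."""
    path_on = True        # assumed: the path is on after power-on
    recording = False
    trace: List[bool] = []
    for kind, generated in events:
        if kind == "lens_drive" and path_on:
            if not recording:
                stored = generated            # S1511: overwrite unconditionally
                path_on = False               # S1512: stop audio processing
            elif generated < stored:
                stored = generated            # S1519-S1520: keep the smaller value
        elif kind == "record_start":
            recording, path_on = True, True   # assumed: recording re-enables the path
        elif kind == "record_stop":
            recording, path_on = False, False # path turned off when recording ends
        trace.append(path_on)
    return stored, trace


events = [
    ("lens_drive", 0.9),    # before recording: one drive updates PLx, path goes off
    ("lens_drive", 0.2),    # ignored, the path is already off
    ("record_start", 0.0),
    ("lens_drive", 0.7),    # smaller than 0.9, so PLx is updated
    ("lens_drive", 0.95),   # larger, so PLx is kept
    ("record_stop", 0.0),
]
final, trace = simulate(events, stored=1.0)
print(final)   # 0.7
print(trace)   # [False, False, True, True, True, False]
```

The trace shows the power-saving point of the embodiment: before recording, the path is off after a single lens drive, while during recording it stays on and PLx only ever moves toward smaller values.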

[Other Embodiments]
The present invention can also be realized by supplying a program that implements one or more functions of the above-described embodiments to a system or device via a network or a storage medium, and having one or more processors in a computer of that system or device read and execute the program. It can also be realized by a circuit (for example, an ASIC) that implements one or more functions.

The present invention is not limited to the above embodiments as they stand; at the implementation stage, the components can be modified and embodied without departing from the gist of the invention. Various inventions can also be formed by appropriately combining the components disclosed in the above embodiments. For example, some components may be removed from the full set of components shown in an embodiment, and components from different embodiments may be combined as appropriate.

Claims

A voice processing device comprising:
imaging means;
a first microphone for acquiring ambient sound;
a second microphone for acquiring sound from a noise source;
first conversion means for generating a first audio signal by Fourier transforming an audio signal from the first microphone;
second conversion means for generating a second audio signal by Fourier transforming an audio signal from the second microphone;
generation means for generating noise data using the second audio signal and a first parameter related to noise of the noise source;
subtraction means for subtracting the noise data from the first audio signal;
third conversion means for inverse Fourier transforming the audio signal from the subtraction means;
recording means for recording, on a recording medium, the moving image data generated by the imaging means and the audio signal from the third conversion means as moving image data with audio; and
update means for generating, in a state where the moving image data with audio is not being recorded by the recording means, a parameter related to noise of the noise source using the first audio signal and the second audio signal, and for updating the first parameter using the generated parameter.

The voice processing device according to claim 1, wherein the update means ends the process of generating the parameter related to noise of the noise source when, in a state where the moving image data with audio is not being recorded by the recording means, the number of times the parameter related to noise of the noise source has been generated exceeds a predetermined number.

The voice processing device according to claim 1 or 2, wherein the update means resets the number of times the parameter related to noise of the noise source has been generated when it is determined that the lens attached to the imaging means has been replaced.

The voice processing device according to any one of claims 1 to 3, wherein the update means ends the process of generating the parameter related to noise of the noise source when the first parameter has been updated in a state where the moving image data with audio is not being recorded by the recording means.

The voice processing device according to any one of claims 1 to 4, wherein the update means performs the process of updating the first parameter even while the recording means is recording the moving image data with audio.

The voice processing device according to any one of claims 1 to 5, wherein the update means changes the first parameter updated by the update means to an initial value when it is determined that the lens attached to the imaging means has been replaced.

The voice processing device according to any one of claims 1 to 6, wherein, when the first noise parameter has been updated by the update means, the generation means generates the noise data using the first noise parameter updated by the update means and the second audio signal.

The voice processing device according to any one of claims 1 to 7, wherein the update means updates the first parameter using the parameter generated from the first audio signal and the second audio signal when it determines that the value of that generated parameter is smaller than the value of the first parameter.

The voice processing device according to any one of claims 1 to 8, wherein the update means updates the first parameter using the parameter generated from the first audio signal and the second audio signal when the amplitude of the ambient sound at the time that parameter was generated is smaller than the amplitude of the ambient sound at the time the first parameter was generated.

The voice processing device according to any one of claims 1 to 9, wherein the update means updates the first parameter for each frequency spectrum based on the parameter generated using the first audio signal and the second audio signal.

The voice processing device according to any one of claims 1 to 10, wherein the lens has driving means as the noise source, and
the update means generates the parameter related to noise of the noise source using the first audio signal and the second audio signal while the driving means is being driven.

The voice processing device according to claim 11, wherein, when a plurality of parameters are generated while the driving means is being driven, the update means executes the process of updating the first parameter using, among the plurality of parameters, the parameter for which the amplitude of the ambient sound at the time of generation was smaller.

The voice processing device according to any one of claims 1 to 12, wherein the generation means generates the noise data using the second audio signal and at least one of a plurality of the first parameters, the plurality of first parameters including a parameter corresponding to a first type of noise and a parameter corresponding to a second type of noise.

The voice processing device according to claim 13, wherein the generation means generates the noise data using the second audio signal and, among the plurality of first parameters, the parameter corresponding to the type of noise contained in the second audio signal.

The voice processing device according to claim 13 or 14, further comprising determination means for determining the type of noise contained in the second audio signal,
wherein the update means updates, among the plurality of first parameters, the parameter corresponding to the type of noise contained in the second audio signal, based on the type of noise determined by the determination means.

The voice processing device according to any one of claims 1 to 15, further comprising means for holding the first parameter updated by the update means when the power of the voice processing device is turned off.

The voice processing device according to any one of claims 1 to 16, wherein the noise includes at least one of constant noise, short-term noise, and long-term noise.

The voice processing device according to any one of claims 1 to 17, wherein the first parameter is a ratio of the amplitudes of the first audio signal and the second audio signal.

A control method for a voice processing device having imaging means, a first microphone for acquiring ambient sound, and a second microphone for acquiring sound from a noise source, the method comprising:
a step of generating a first audio signal by Fourier transforming an audio signal from the first microphone;
a step of generating a second audio signal by Fourier transforming an audio signal from the second microphone;
a generation step of generating noise data using the second audio signal and a first parameter related to noise of the noise source;
a subtraction step of subtracting the noise data from the first audio signal;
a step of inverse Fourier transforming the audio signal generated by the subtraction step;
a recording step of recording, on a recording medium, the moving image data generated by the imaging means and the inverse Fourier transformed audio signal as moving image data with audio; and
an update step of generating, in a state where the moving image data with audio is not being recorded, a parameter related to noise of the noise source using the first audio signal and the second audio signal, and updating the first parameter using the generated parameter.

A computer-readable program for causing a computer to function as each means of the voice processing device according to any one of claims 1 to 18.