WO2016203994A1 - Encoding device and method, decoding device and method, and program - Google Patents
Encoding device and method, decoding device and method, and program
- Publication number
- WO2016203994A1 (PCT/JP2016/066574)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- metadata
- frame
- sample
- audio signal
- decoding
- Prior art date
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/008—Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/16—Vocoder architecture
- G10L19/167—Audio streaming, i.e. formatting and decoding of an encoded audio signal representation into a data stream for transmission or storage purposes
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S3/00—Systems employing more than two channels, e.g. quadraphonic
- H04S3/008—Systems employing more than two channels, e.g. quadraphonic in which the audio signals are in digital form, i.e. employing more than two discrete digital channels
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S5/00—Pseudo-stereo systems, e.g. in which additional channel signals are derived from monophonic signals by means of phase shifting, time delay or reverberation
- H04S5/02—Pseudo-stereo systems, e.g. in which additional channel signals are derived from monophonic signals by means of phase shifting, time delay or reverberation of the pseudo four-channel type, e.g. in which rear channel signals are derived from two-channel stereo signals
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S7/00—Indicating arrangements; Control arrangements, e.g. balance control
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S2400/00—Details of stereophonic systems covered by H04S but not provided for in its groups
- H04S2400/11—Positioning of individual sound objects, e.g. moving airplane, within a sound field
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S2400/00—Details of stereophonic systems covered by H04S but not provided for in its groups
- H04S2400/15—Aspects of sound capture and related signal processing for recording or reproduction
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S2420/00—Techniques used in stereophonic systems covered by H04S but not provided for in its groups
- H04S2420/03—Application of parametric coding in stereophonic audio systems
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S7/00—Indicating arrangements; Control arrangements, e.g. balance control
- H04S7/30—Control circuits for electronic adaptation of the sound field
Definitions
- the present technology relates to an encoding device and method, a decoding device and method, and a program, and more particularly, to an encoding device and method, a decoding device and method, and a program that can obtain higher-quality sound.
- an MPEG (Moving Picture Experts Group)-H 3D Audio standard that compresses (encodes) the audio signal of an audio object together with metadata such as position information of the audio object is known (see, for example, Non-Patent Literature 1).
- audio signals and metadata of audio objects are encoded and transmitted for each frame.
- at most one piece of metadata is encoded and transmitted for each frame of the audio signal of the audio object. That is, depending on the frame, there may be no metadata at all.
- the encoded audio signal and metadata are decoded by a decoding device, and rendering is performed based on the audio signal and metadata obtained by decoding.
- the audio signal and metadata are first decoded.
- a PCM (Pulse Code Modulation) sample value for each sample in the frame is obtained for the audio signal. That is, PCM data is obtained as an audio signal.
- metadata of the representative sample in the frame (specifically, the last sample in the frame) is obtained.
- based on the position information carried as the metadata of the representative sample in the frame, the renderer in the decoding device localizes the sound image of the audio object at the position indicated by that position information.
- the gain of each speaker (the VBAP gain) is calculated by VBAP (Vector Base Amplitude Panning).
- the metadata of the audio object is the metadata of the representative sample in the frame, that is, of the last sample in the frame as described above. The VBAP gain calculated by the renderer is therefore the gain of the last sample in the frame, and the VBAP gain of the other samples in the frame is not obtained. Accordingly, to reproduce the sound of the audio object, the VBAP gain of samples other than the representative sample must also be calculated.
- the renderer calculates the VBAP gain of each sample by interpolation processing. Specifically, for each speaker, the VBAP gain of each sample of the current frame is calculated by linear interpolation between the VBAP gain of the last sample of the current frame and the VBAP gain of the last sample of the immediately preceding frame.
- the VBAP gain calculated for each speaker is multiplied by the audio signal of the audio object and supplied to each speaker to reproduce the sound.
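The per-sample gain interpolation and gain multiplication described above can be sketched as follows. This is a minimal illustration, not the standard's actual implementation; all function and variable names are made up for this sketch.

```python
def interpolate_vbap_gains(prev_frame_gain, cur_frame_gain, frame_len):
    """Linearly interpolate one speaker's VBAP gain across one frame.

    prev_frame_gain: VBAP gain of the last sample of the previous frame.
    cur_frame_gain:  VBAP gain of the last sample of the current frame.
    Returns one gain value per sample of the current frame.
    """
    return [prev_frame_gain + (cur_frame_gain - prev_frame_gain) * (n + 1) / frame_len
            for n in range(frame_len)]

def render_channel(audio, prev_gain, cur_gain):
    """Multiply each audio sample by its interpolated gain (one speaker channel)."""
    gains = interpolate_vbap_gains(prev_gain, cur_gain, len(audio))
    return [s * g for s, g in zip(audio, gains)]
```

For example, with a previous-frame gain of 0.0 and a current-frame gain of 1.0 over a 4-sample frame, the interpolated gains rise linearly to 1.0 at the last sample.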
- the localization position of the sound image lies on the surface of a sphere of radius 1 centered on a predetermined reference point in the reproduction space, for example the head position of a virtual user who views content such as video or music with sound.
- when the VBAP gain of samples other than the representative sample in the frame is calculated by interpolation processing, the sum of squares of the per-speaker VBAP gains of such a sample is no longer 1. For those samples, the position of the sound image therefore shifts along the normal direction of the sphere described above, or vertically and horizontally on the surface of the sphere as seen from the virtual user. During playback, the sound image position of the audio object then fluctuates within a single frame, the sense of localization degrades, and the sound quality deteriorates.
- the longer the frame, the greater the distance between the last sample position of the current frame and the last sample position of the immediately preceding frame. The difference between 1 and the sum of squares of the per-speaker VBAP gains calculated by interpolation then grows, and the sound-quality degradation becomes larger.
- the faster the audio object moves, the larger the difference between the VBAP gain of the last sample of the current frame and that of the last sample of the preceding frame. The movement of the audio object then cannot be rendered accurately, and the sound quality deteriorates.
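The normalization problem described above can be checked numerically. In the sketch below, the two per-speaker gain vectors each have a sum of squares of 1, but their sample-wise linear interpolation does not; the gain values are illustrative, not taken from the document.

```python
def sum_of_squares(gains):
    """Sum of squared per-speaker gains; 1.0 keeps the image on the unit sphere."""
    return sum(g * g for g in gains)

# Per-speaker VBAP gains at the last sample of the previous and current frames,
# each normalized so the sum of squares is 1 (illustrative two-speaker case).
prev = [1.0, 0.0]
cur = [0.0, 1.0]

# Halfway through the frame, linear interpolation gives [0.5, 0.5], whose sum
# of squares is 0.5, not 1: the sound image leaves the unit sphere.
mid = [(p + c) / 2 for p, c in zip(prev, cur)]
print(sum_of_squares(mid))  # 0.5
```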
- This technology has been made in view of such a situation, and is intended to obtain higher-quality sound.
- the decoding device includes an acquisition unit that acquires encoded audio data obtained by encoding audio signals of frames of a predetermined time interval of an audio object, and a plurality of metadata of the frames.
- a decoding unit that decodes the encoded audio data, and a rendering unit that performs rendering based on the audio signal obtained by the decoding and the plurality of metadata.
- the metadata can include position information indicating the position of the audio object.
- Each of the plurality of metadata may be metadata of a plurality of samples in the frame of the audio signal.
- Each of the plurality of metadata may be metadata of a plurality of samples arranged at intervals of the number of samples obtained by dividing the number of samples constituting the frame by the number of the plurality of metadata.
- Each of the plurality of metadata can be metadata of a plurality of samples indicated by each of a plurality of sample indexes.
- Each of the plurality of metadata may be metadata of a plurality of samples arranged at a predetermined number of samples in the frame.
- the plurality of metadata may include metadata used for interpolation processing of the sample gains of the audio signal that are calculated based on the metadata.
- the decoding method or program acquires encoded audio data obtained by encoding an audio signal of a frame at a predetermined time interval of an audio object, and a plurality of metadata of the frame, Decoding the encoded audio data, and rendering based on the audio signal obtained by the decoding and the plurality of metadata.
- encoded audio data obtained by encoding an audio signal of a frame at a predetermined time interval of an audio object and a plurality of metadata of the frame are acquired, the encoded audio data is decoded, and rendering is performed based on the audio signal obtained by the decoding and the plurality of metadata.
- An encoding device includes an encoding unit that encodes an audio signal of a frame of a predetermined time interval of an audio object, and a generation unit that generates a bitstream including the encoded audio data obtained by the encoding and a plurality of metadata of the frame.
- the metadata can include position information indicating the position of the audio object.
- Each of the plurality of metadata may be metadata of a plurality of samples in the frame of the audio signal.
- Each of the plurality of metadata may be metadata of a plurality of samples arranged at intervals of the number of samples obtained by dividing the number of samples constituting the frame by the number of the plurality of metadata.
- Each of the plurality of metadata can be metadata of a plurality of samples indicated by each of a plurality of sample indexes.
- Each of the plurality of metadata may be metadata of a plurality of samples arranged at a predetermined number of samples in the frame.
- the plurality of metadata may include metadata used for interpolation processing of the sample gains of the audio signal that are calculated based on the metadata.
- the encoding device may further include an interpolation processing unit that performs an interpolation process on the metadata.
- An encoding method or program includes encoding an audio signal of a frame at a predetermined time interval of an audio object, and generating a bitstream including the encoded audio data obtained by the encoding and a plurality of metadata of the frame.
- an audio signal of a frame at a predetermined time interval of an audio object is encoded, and a bitstream including the encoded audio data obtained by the encoding and a plurality of metadata of the frame is generated.
- The present technology applies to cases where an audio signal of an audio object and metadata such as position information of the audio object are encoded and transmitted, and audio is reproduced by decoding the audio signal and metadata on the decoding side; in such cases it makes it possible to obtain higher-quality sound.
- the audio object is also simply referred to as an object.
- a plurality of (that is, two or more) pieces of metadata are encoded and transmitted for one frame of the audio signal.
- the metadata is metadata of the sample in the frame of the audio signal, that is, metadata given to the sample.
- the position of the audio object in space indicated by the position information carried as metadata is the position at the reproduction timing of the sample to which that metadata is given.
- the metadata can be transmitted by any of the following three methods: the number designation method, the sample designation method, and the automatic switching method. Furthermore, when transmitting metadata, these three methods can be switched for each frame (a section of a predetermined time interval) or for each object.
- the number designation method is a method in which metadata number information indicating the number of metadata transmitted for one frame is included in the bitstream syntax and a specified number of metadata is transmitted. Information indicating the number of samples constituting one frame is stored in the header of the bit stream.
- the sample positions to which the transmitted metadata correspond may be determined in advance, for example the positions obtained by equally dividing one frame.
- For example, suppose that the number of samples constituting one frame is 2048 and that four pieces of metadata are transmitted per frame.
- the section of one frame is equally divided by the number of metadata to be transmitted, and the metadata of the sample position of the divided section boundary is transmitted. That is, it is assumed that the metadata of samples in frames arranged at intervals of the number of samples obtained by dividing the number of samples in one frame by the number of metadata is transmitted.
- metadata is transmitted for the 512th sample, the 1024th sample, the 1536th sample, and the 2048th sample from the beginning of the frame.
- Metadata may be transmitted for each sample arranged at a predetermined interval, that is, for each predetermined number of samples.
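The equally divided sample positions of the number designation method can be computed as in the following sketch, which reproduces the worked example above (2048 samples per frame, four metadata). The function name is illustrative, not from the document.

```python
def number_designation_positions(samples_per_frame, num_metadata):
    """Sample positions (counted from the start of the frame) at the section
    boundaries obtained by equally dividing one frame by the metadata count."""
    interval = samples_per_frame // num_metadata
    return [interval * (i + 1) for i in range(num_metadata)]

# 2048 samples per frame, four metadata: the 512th, 1024th, 1536th,
# and 2048th samples carry metadata.
print(number_designation_positions(2048, 4))  # [512, 1024, 1536, 2048]
```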
- Next, the sample designation method will be described.
- in the sample designation method, in addition to the metadata number information, a sample index indicating the sample position of each metadata is also stored in the bitstream and transmitted.
- Suppose that the number of samples constituting one frame is 2048 and that four pieces of metadata are transmitted per frame. Also, suppose that metadata is transmitted for the 128th, 512th, 1536th, and 2048th samples from the beginning of the frame.
- In this case, the bitstream stores metadata number information indicating that the number of metadata transmitted per frame is 4, and sample indexes indicating the positions of the 128th, 512th, 1536th, and 2048th samples from the beginning of the frame.
- For example, the value of the sample index indicating the position of the 128th sample from the beginning of the frame is 128.
- With the sample designation method, it is possible to transmit metadata for arbitrary samples in each frame. For example, metadata for the samples immediately before and after a scene-switching position can be transmitted. In this case, discontinuous movement of the object can be expressed in rendering, and high-quality sound can be obtained.
- the number of metadata transmitted for each frame is automatically switched according to the number of samples constituting one frame, that is, the number of samples in one frame.
- For example, when the number of samples constituting one frame is 1024, metadata for each sample arranged at 256-sample intervals in the frame is transmitted.
- a total of four pieces of metadata are transmitted for the 256th sample, the 512th sample, the 768th sample, and the 1024th sample from the beginning of the frame.
- when the number of samples in one frame is 2048, metadata for each sample arranged at 256-sample intervals in the frame is transmitted, for a total of eight pieces of metadata.
- the shorter the distance between samples having metadata, the smaller the difference in VBAP gain between those samples, and the more accurately the movement of the object can be rendered. Furthermore, shortening that distance shortens the period during which a discontinuous movement, such as at a scene-switching part, is rendered as if the object were moving continuously. In particular, with the sample designation method, discontinuous movement of an object can be expressed by transmitting metadata at appropriate sample positions.
- Metadata may be transmitted using only one of the three methods described above (the number designation method, the sample designation method, and the automatic switching method), or two or more of these methods may be switched for each frame or each object.
- In a case where two or more of the methods are switched, a switching index indicating which method was used to transmit the metadata may be stored in the bitstream.
- For example, when the value of the switching index is 0, it indicates that the number designation method has been selected, that is, that the metadata has been transmitted by the number designation method; when the value is 1, it indicates that the sample designation method has been selected.
- when the value of the switching index is 2, it indicates that the automatic switching method has been selected.
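Putting the three transmission methods together, the selection by switching index could be sketched as follows. The index values 0, 1, and 2 follow the description above; all names are illustrative, and the 256-sample interval for the automatic switching method is taken from the worked example above.

```python
NUMBER_DESIGNATION = 0   # switching index value 0
SAMPLE_DESIGNATION = 1   # switching index value 1
AUTO_SWITCHING = 2       # switching index value 2

def metadata_positions(switch_index, samples_per_frame,
                       num_metadata=None, sample_indexes=None):
    """Return the sample positions (from the frame start) that carry metadata."""
    if switch_index == NUMBER_DESIGNATION:
        # Equally divide the frame by the transmitted metadata count.
        interval = samples_per_frame // num_metadata
        return [interval * (i + 1) for i in range(num_metadata)]
    if switch_index == SAMPLE_DESIGNATION:
        # Arbitrary positions, e.g. around a scene-switching point.
        return list(sample_indexes)
    if switch_index == AUTO_SWITCHING:
        # Metadata count follows the frame length: one per 256 samples
        # (4 for a 1024-sample frame, 8 for a 2048-sample frame).
        return [256 * (i + 1) for i in range(samples_per_frame // 256)]
    raise ValueError("unknown switching index")
```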
- On the playback side (decoding side), when random access is performed, the VBAP gain of frames before the randomly accessed frame has not been calculated, so gain interpolation cannot be performed. For this reason, the MPEG-H 3D Audio standard cannot perform random access.
- In the present technology, the metadata necessary for performing the interpolation process is also transmitted together with the metadata of each frame, so that the VBAP gain of the samples of the frame before the current frame, or of the first sample of the current frame, can be calculated. This makes random access possible.
- the metadata for performing the interpolation process that is transmitted together with the normal metadata is also referred to as additional metadata.
- the additional metadata transmitted together with the metadata of the current frame is, for example, the metadata of the last sample of the frame immediately before the current frame or the metadata of the first sample of the current frame.
- an additional metadata flag indicating whether or not additional metadata exists for each frame is stored in the bitstream for each object, so that it can easily be determined whether additional metadata exists for a given frame. For example, when the value of the additional metadata flag of a frame is 1, additional metadata exists in that frame; when the value is 0, no additional metadata exists in the frame.
- a frame with additional metadata that is closest in time to the designated frame may be used as the actual random access destination. Therefore, by transmitting additional metadata at appropriate frame intervals, random access can be realized without making the user feel unnatural.
- VBAP gain interpolation processing may be performed without using additional metadata in a frame designated as an access destination for random access. In this case, random access is possible while suppressing an increase in the data amount (bit rate) of the bit stream due to storing additional metadata.
- the VBAP gain value of the frame before the current frame is set to 0, and interpolation processing with the VBAP gain value calculated in the current frame is performed.
- the present invention is not limited to this method, and the interpolation processing may be performed so that the VBAP gain values of the samples in the current frame are all the same as the VBAP gain calculated in the current frame.
- an interpolation process using the VBAP gain of a frame before the current frame is performed as usual.
- random access can be performed without using additional metadata.
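The two interpolation fallbacks for a random-access frame described above can be sketched like this. All names are illustrative: `from_zero` corresponds to treating the previous frame's VBAP gain as 0, and `hold` to giving every sample of the current frame the gain calculated for the current frame.

```python
def interp_gains_after_random_access(cur_gain, frame_len, mode="from_zero"):
    """Per-sample gains for the first frame after a random access, where no
    previous-frame VBAP gain is available for interpolation."""
    if mode == "from_zero":
        # Treat the previous frame's gain as 0 and ramp up to cur_gain.
        return [cur_gain * (n + 1) / frame_len for n in range(frame_len)]
    if mode == "hold":
        # Use the gain calculated in the current frame for every sample.
        return [cur_gain] * frame_len
    raise ValueError("unknown mode")
```

The `from_zero` variant fades the object in over one frame, while `hold` keeps the gain constant; either avoids referencing data from before the access point.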
- For each frame, an independent flag (also referred to as indepFlag) indicating whether or not the current frame is a frame that can be decoded and rendered using only the data of the current frame in the bitstream (referred to as an independent frame) may be stored in the bitstream. When the value of the independent flag is 1, the decoding side can perform decoding and rendering without using any data in the bitstream before the current frame, or any information obtained by decoding such data.
- For example, according to the value of the independent flag, the above-described additional metadata may be stored in the bitstream, or the above-described interpolation processing may be switched, so that when the value of the independent flag is 1, decoding and rendering can be performed without using the VBAP gain of frames before the current frame.
- the metadata obtained by decoding is only the metadata of the representative sample in the frame, that is, of the last sample.
- metadata input to the encoding device before compression (encoding) is rarely defined for every sample in the frame. That is, many samples of the audio signal frame have no metadata even before encoding.
- for example, only samples arranged at equal intervals, such as the 0th, 1024th, and 2048th samples, may have metadata; in most cases, however, only samples arranged at unequal intervals, such as the 0th, 138th, and 2044th samples, have metadata.
- the metadata of those samples is obtained by interpolation processing (sample interpolation), so that decoding and rendering can be performed in real time on the decoding side.
- the metadata interpolation process may be any process such as linear interpolation or nonlinear interpolation using a higher-order function.
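The sample interpolation just described can be sketched as plain linear interpolation between the samples that do have metadata, which may sit at unequal intervals as in the 0th/138th/2044th example above. Here only a single positional coordinate is interpolated; names and values are illustrative.

```python
def interpolate_position(known, num_samples):
    """Linearly interpolate one positional coordinate for every sample.

    known: dict mapping sample index -> coordinate value, possibly at
           unequal intervals (e.g. samples 0, 138, and 2044 only).
    """
    idxs = sorted(known)
    out = []
    for n in range(num_samples):
        if n <= idxs[0]:
            out.append(known[idxs[0]])       # before the first known sample
        elif n >= idxs[-1]:
            out.append(known[idxs[-1]])      # after the last known sample
        else:
            # Bracket n with the nearest samples that have metadata.
            hi = next(i for i in idxs if i >= n)
            lo = max(i for i in idxs if i <= n)
            if hi == lo:
                out.append(known[lo])
            else:
                t = (n - lo) / (hi - lo)
                out.append(known[lo] + t * (known[hi] - known[lo]))
    return out
```

Nonlinear interpolation with a higher-order function, as the text allows, would replace only the `t`-weighted line.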
- The bitstream shown in FIG. 1 is output from the encoding device that encodes the audio signal and metadata of each object.
- A header is arranged at the head of the bitstream, and the header stores information indicating the number of samples constituting one frame of the audio signal of each object, that is, the number of samples of one frame (hereinafter also referred to as sample number information).
- data for each frame is arranged behind the header.
- an independent flag indicating whether or not the current frame is an independent frame is arranged in the region R10.
- In the region R11, encoded audio data obtained by encoding the audio signal of each object in the same frame is arranged.
- In the region R12, encoded metadata obtained by encoding the metadata of each object in the same frame is arranged.
- encoded metadata for one frame of one object is arranged in a region R21 in the region R12.
- an additional metadata flag is arranged at the head of the encoded metadata, and a switching index is arranged following the additional metadata flag.
- when the method indicated by the switching index is the number designation method, the metadata number information is arranged following the switching index, but the sample index is not arranged.
- when the method indicated by the switching index is the sample designation method, the metadata number information and the sample index are arranged following the switching index.
- when the method indicated by the switching index is the automatic switching method, neither the metadata number information nor the sample index is arranged following the switching index.
- Additional metadata is arranged following the metadata number information and sample index, which are arranged as necessary, and following the additional metadata, the defined number of pieces of metadata for individual samples are arranged.
- the additional metadata is arranged only when the value of the additional metadata flag is 1, and is not arranged when the value of the additional metadata flag is 0.
- In the region R12, encoded metadata similar to the encoded metadata arranged in the region R21 is arranged for each object.
- One frame of bitstream data is composed of the independent flag arranged in the region R10, the encoded audio data of each object arranged in the region R11, and the encoded metadata of each object arranged in the region R12.
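The per-object encoded-metadata layout described above can be summarized as a reading-order sketch. The field names are illustrative stand-ins, not the actual bitstream syntax element names, and a plain dict stands in for the decoded bit fields.

```python
def read_encoded_metadata(fields):
    """Walk the per-object encoded-metadata fields in the order described:
    additional metadata flag, switching index, then the optional fields."""
    out = {"additional_metadata_flag": fields["additional_metadata_flag"],
           "switching_index": fields["switching_index"]}
    if fields["switching_index"] == 0:    # number designation method
        out["metadata_count"] = fields["metadata_count"]
    elif fields["switching_index"] == 1:  # sample designation method
        out["metadata_count"] = fields["metadata_count"]
        out["sample_indexes"] = fields["sample_indexes"]
    # switching_index == 2 (automatic switching): neither field is present.
    if fields["additional_metadata_flag"] == 1:
        out["additional_metadata"] = fields["additional_metadata"]
    out["metadata"] = fields["metadata"]  # the per-sample metadata entries
    return out
```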
- FIG. 2 is a diagram illustrating a configuration example of an encoding device to which the present technology is applied.
- The encoding device 11 includes an audio signal acquisition unit 21, an audio signal encoding unit 22, a metadata acquisition unit 23, an interpolation processing unit 24, a related information acquisition unit 25, a metadata encoding unit 26, a multiplexing unit 27, and an output unit 28.
- the audio signal acquisition unit 21 acquires the audio signal of each object and supplies it to the audio signal encoding unit 22.
- the audio signal encoding unit 22 encodes the audio signal supplied from the audio signal acquisition unit 21 in units of frames, and supplies the encoded audio data for each frame of each object obtained as a result to the multiplexing unit 27.
- the metadata acquisition unit 23 acquires metadata for each frame of each object, more specifically, metadata of each sample in the frame, and supplies it to the interpolation processing unit 24.
- the metadata includes, for example, position information indicating the position of the object in the space, importance information indicating the importance of the object, information indicating the extent of the sound image of the object, and the like.
- the metadata acquisition unit 23 acquires metadata of a predetermined sample (PCM sample) of the audio signal of each object.
- The interpolation processing unit 24 performs interpolation processing on the metadata supplied from the metadata acquisition unit 23, and generates metadata for all samples, or for some specific samples, among the samples of the audio signal that have no metadata.
- The interpolation processing unit 24 performs the interpolation processing so that one frame of the audio signal of one object has a plurality of metadata, that is, so that metadata is generated for a plurality of samples in one frame.
- the interpolation processing unit 24 supplies metadata for each frame of each object obtained by the interpolation processing to the metadata encoding unit 26.
- The related information acquisition unit 25 acquires, as related information, information related to the metadata, such as information indicating whether the current frame is to be an independent frame (referred to as independent frame information), sample number information for each frame of the audio signal of each object, information indicating by which method the metadata is transmitted, information indicating whether additional metadata is transmitted, and information indicating for which samples metadata is transmitted. Further, the related information acquisition unit 25 generates the necessary ones of the additional metadata flag, the switching index, the metadata number information, and the sample index for each frame based on the acquired related information, and supplies them to the metadata encoding unit 26.
- The metadata encoding unit 26 encodes the metadata supplied from the interpolation processing unit 24 based on the information supplied from the related information acquisition unit 25, and supplies the resulting encoded metadata for each frame of each object, together with the independent frame information included in the information supplied from the related information acquisition unit 25, to the multiplexing unit 27.
- The multiplexing unit 27 generates a bitstream by multiplexing the encoded audio data supplied from the audio signal encoding unit 22, the encoded metadata supplied from the metadata encoding unit 26, and the independent flag obtained based on the independent frame information supplied from the metadata encoding unit 26, and supplies the bitstream to the output unit 28.
- the output unit 28 outputs the bit stream supplied from the multiplexing unit 27. That is, a bit stream is transmitted.
- When the audio signal of an object is supplied from the outside, the encoding device 11 performs an encoding process and outputs a bitstream.
- the encoding process performed by the encoding device 11 will be described with reference to the flowchart of FIG. 3. This encoding process is performed for each frame of the audio signal.
- In step S11, the audio signal acquisition unit 21 acquires the audio signal of each object for one frame and supplies it to the audio signal encoding unit 22.
- In step S12, the audio signal encoding unit 22 encodes the audio signal supplied from the audio signal acquisition unit 21, and supplies the resulting encoded audio data for one frame of each object to the multiplexing unit 27.
- Specifically, the audio signal encoding unit 22 converts the audio signal from a time signal to a frequency signal by performing MDCT (Modified Discrete Cosine Transform) or the like on the audio signal. The audio signal encoding unit 22 then encodes the MDCT coefficients obtained by the MDCT, and the resulting scale factor, side information, and quantized spectrum are taken as the encoded audio data obtained by encoding the audio signal.
- In step S13, the metadata acquisition unit 23 acquires metadata for each frame of the audio signal of each object and supplies the acquired metadata to the interpolation processing unit 24.
- In step S14, the interpolation processing unit 24 performs interpolation processing on the metadata supplied from the metadata acquisition unit 23 and supplies the resulting metadata to the metadata encoding unit 26.
- For example, the interpolation processing unit 24 calculates, by linear interpolation, the position information of each sample located between a predetermined sample and another sample positioned temporally before it, based on the position information carried as metadata of those two samples.
- Similarly, interpolation processing such as linear interpolation is performed on the importance information and the information indicating the degree of sound image spread carried as metadata, so that the metadata of each sample is calculated. The interpolation processing is not limited to linear interpolation and may be nonlinear interpolation.
- In step S15, the related information acquisition unit 25 acquires related information about the metadata for the frame of the audio signal of each object.
- the related information acquisition unit 25 generates necessary information among the additional metadata flag, the switching index, the metadata number information, and the sample index for each object based on the acquired related information, and performs metadata encoding. To the unit 26.
- the related information acquisition unit 25 may acquire the additional metadata flag, the switching index, and the like from the outside instead of generating the additional metadata flag, the switching index, and the like.
- In step S16, the metadata encoding unit 26 encodes the metadata supplied from the interpolation processing unit 24 based on the additional metadata flag, the switching index, the metadata number information, the sample indexes, and so on supplied from the related information acquisition unit 25.
- As a result, the encoded metadata is generated so that, for each frame, only the metadata of the required samples is transmitted. Further, the metadata of the first sample of the frame, or the held metadata of the last sample of the immediately preceding frame, is used as additional metadata as necessary.
- The encoded metadata includes, in addition to the metadata itself, the additional metadata flag and the switching index, and further includes the metadata number information, the sample indexes, the additional metadata, and the like as necessary.
- In this way, the encoded metadata of each object stored in the region R12 of the bitstream shown in FIG. 1 is obtained.
- For example, the encoded metadata stored in the region R21 is the encoded metadata for one frame of one object.
- For example, when the number designation method is selected for the frame to be processed of an object and additional metadata is transmitted, encoded metadata consisting of an additional metadata flag, a switching index, metadata number information, additional metadata, and metadata is generated for that frame.
- When the sample designation method is selected for the frame to be processed of an object and no additional metadata is transmitted, encoded metadata consisting of an additional metadata flag, a switching index, metadata number information, sample indexes, and metadata is generated for that frame.
- When the automatic switching method is selected and additional metadata is transmitted, encoded metadata consisting of an additional metadata flag, a switching index, additional metadata, and metadata is generated.
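As a non-normative illustration, the fields carried by the encoded metadata described above might be modeled as follows. All field names are hypothetical and do not reflect the actual bitstream syntax:

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class EncodedMetadata:
    """Illustrative container for one object's per-frame encoded metadata."""
    additional_metadata_flag: int                 # 1 if additional metadata is present
    switching_index: int                          # e.g. 0: number, 1: sample, 2: automatic
    metadata_count: Optional[int] = None          # metadata number information, if present
    sample_indexes: Optional[List[int]] = None    # only for the sample designation method
    additional_metadata: Optional[list] = None    # first-sample / previous-last-sample metadata
    metadata: List[list] = field(default_factory=list)  # metadata of the designated samples
```

Which optional fields are populated depends on the method indicated by the switching index, as described above.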
- The metadata encoding unit 26 supplies the encoded metadata of each object obtained by encoding the metadata, together with the independent frame information included in the information supplied from the related information acquisition unit 25, to the multiplexing unit 27.
- In step S17, the multiplexing unit 27 generates a bitstream by multiplexing the encoded audio data supplied from the audio signal encoding unit 22, the encoded metadata supplied from the metadata encoding unit 26, and the independent flag obtained based on the independent frame information supplied from the metadata encoding unit 26, and supplies the bitstream to the output unit 28.
- As a result, a bitstream for one frame, for example a bitstream composed of the regions R10 to R12 shown in FIG. 1, is generated.
- In step S18, the output unit 28 outputs the bitstream supplied from the multiplexing unit 27, and the encoding process ends.
- Note that, as shown in FIG. 1, a header including the sample number information and the like is also output.
- the encoding device 11 encodes the audio signal, encodes the metadata, and outputs a bit stream including the encoded audio data and the encoded metadata obtained as a result.
- In this way, one or more metadata can always be transmitted for each frame, so that decoding and rendering can be performed in real time on the decoding side. Furthermore, random access can be realized by transmitting additional metadata as necessary.
- a decoding device that receives (acquires) the bit stream output from the encoding device 11 and performs decoding will be described.
- For example, a decoding device to which the present technology is applied is configured as shown in FIG. 4.
- A speaker system 52 including a plurality of speakers arranged in the reproduction space is connected to the decoding device 51.
- the decoding device 51 supplies the audio signal of each channel obtained by decoding and rendering to the speaker of each channel constituting the speaker system 52, and reproduces the sound.
- the decoding device 51 includes an acquisition unit 61, a separation unit 62, an audio signal decoding unit 63, a metadata decoding unit 64, a gain calculation unit 65, and an audio signal generation unit 66.
- the acquisition unit 61 acquires the bit stream output from the encoding device 11 and supplies the bit stream to the separation unit 62.
- the separation unit 62 separates the bit stream supplied from the acquisition unit 61 into an independent flag, encoded audio data, and encoded metadata, and supplies the encoded audio data to the audio signal decoding unit 63.
- The independent flag and the encoded metadata are supplied to the metadata decoding unit 64.
- The separation unit 62 also reads various types of information such as the sample number information from the header of the bitstream as necessary, and supplies the information to the audio signal decoding unit 63 and the metadata decoding unit 64.
- the audio signal decoding unit 63 decodes the encoded audio data supplied from the separation unit 62, and supplies the audio signal of each object obtained as a result to the audio signal generation unit 66.
- The metadata decoding unit 64 decodes the encoded metadata supplied from the separation unit 62, and supplies the resulting metadata of each frame of the audio signal of each object, together with the independent flag supplied from the separation unit 62, to the gain calculation unit 65.
- the metadata decoding unit 64 includes an additional metadata flag reading unit 71 that reads an additional metadata flag from the encoded metadata, and a switching index reading unit 72 that reads a switching index from the encoded metadata.
- Based on arrangement position information, held in advance, indicating the arrangement position of each speaker constituting the speaker system 52, the metadata of each frame of each object supplied from the metadata decoding unit 64, and the independent flag, the gain calculation unit 65 calculates, for each object, the VBAP gains of the samples in the frame of the audio signal.
- The gain calculation unit 65 includes an interpolation processing unit 73 that calculates the VBAP gains of other samples by interpolation processing based on the VBAP gains of predetermined samples.
- The gain calculation unit 65 supplies the VBAP gain calculated for each sample in the frame of the audio signal of each object to the audio signal generation unit 66.
- Based on the audio signal of each object supplied from the audio signal decoding unit 63 and the VBAP gain of each sample of each object supplied from the gain calculation unit 65, the audio signal generation unit 66 generates the audio signal of each channel, that is, the audio signal to be supplied to the speaker of each channel.
- the audio signal generation unit 66 supplies the generated audio signal to each speaker constituting the speaker system 52, and outputs sound based on the audio signal.
- the block including the gain calculation unit 65 and the audio signal generation unit 66 functions as a renderer (rendering unit) that performs rendering based on the audio signal and metadata obtained by decoding.
- When a bitstream is transmitted from the encoding device 11, the decoding device 51 performs a decoding process of receiving (acquiring) and decoding the bitstream. Hereinafter, the decoding process by the decoding device 51 will be described with reference to the flowchart of FIG. 5. This decoding process is performed for each frame of the audio signal.
- In step S41, the acquisition unit 61 acquires the bitstream output from the encoding device 11 for one frame and supplies it to the separation unit 62.
- In step S42, the separation unit 62 separates the bitstream supplied from the acquisition unit 61 into the independent flag, the encoded audio data, and the encoded metadata, and supplies the encoded audio data to the audio signal decoding unit 63.
- the independent flag and the encoded metadata are supplied to the metadata decoding unit 64.
- The separation unit 62 also supplies the sample number information read from the header of the bitstream to the metadata decoding unit 64.
- Note that the sample number information may be supplied at the timing when the header of the bitstream is acquired.
- In step S43, the audio signal decoding unit 63 decodes the encoded audio data supplied from the separation unit 62, and supplies the resulting audio signal for one frame of each object to the audio signal generation unit 66.
- That is, the audio signal decoding unit 63 decodes the encoded audio data to obtain MDCT coefficients. Specifically, the MDCT coefficients are calculated based on the scale factor, side information, and quantized spectrum supplied as the encoded audio data.
- Then, the audio signal decoding unit 63 performs an IMDCT (Inverse Modified Discrete Cosine Transform) based on the MDCT coefficients, and supplies the resulting PCM data to the audio signal generation unit 66 as the audio signal.
- In step S44, the additional metadata flag reading unit 71 of the metadata decoding unit 64 reads the additional metadata flag from the encoded metadata supplied from the separation unit 62.
- Specifically, the metadata decoding unit 64 sequentially sets the objects corresponding to the encoded metadata sequentially supplied from the separation unit 62 as the object to be processed.
- The additional metadata flag reading unit 71 reads the additional metadata flag from the encoded metadata of the object to be processed.
- In step S45, the switching index reading unit 72 of the metadata decoding unit 64 reads the switching index from the encoded metadata of the object to be processed supplied from the separation unit 62.
- In step S46, the switching index reading unit 72 determines whether or not the method indicated by the switching index read in step S45 is the number designation method.
- If it is determined in step S46 that the method is the number designation method, in step S47 the metadata decoding unit 64 reads the metadata number information from the encoded metadata of the object to be processed supplied from the separation unit 62.
- In step S48, based on the metadata number information read in step S47 and the sample number information supplied from the separation unit 62, the metadata decoding unit 64 specifies the positions of the samples whose metadata are transmitted in the frame of the audio signal of the object to be processed.
- That is, the section of one frame composed of the number of samples indicated by the sample number information is equally divided into as many sections as the number of metadata indicated by the metadata number information, and the last sample position of each equally divided section is taken as a metadata sample position, that is, the position of a sample having metadata.
- The sample positions obtained in this way are set as the sample positions of the respective metadata included in the encoded metadata, that is, the samples having those metadata.
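The equal division used by the number designation method can be sketched as follows, assuming 0-based sample positions (an illustrative reading, not the normative computation):

```python
def metadata_sample_positions(samples_per_frame, metadata_count):
    """Equally divide one frame into metadata_count sections and return the
    last sample position (0-based) of each section, i.e. the samples that
    carry metadata under the number designation method."""
    interval = samples_per_frame // metadata_count
    return [interval * (k + 1) - 1 for k in range(metadata_count)]
```

For example, with 2048 samples per frame and four metadata, the metadata samples would fall at positions 511, 1023, 1535, and 2047.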
- When the number of metadata included in the encoded metadata of the object to be processed and the sample position of each metadata have been specified in this way, the process proceeds to step S53.
- If it is determined in step S46 that the method is not the number designation method, in step S49 the switching index reading unit 72 determines whether or not the method indicated by the switching index read in step S45 is the sample designation method.
- If it is determined in step S49 that the method is the sample designation method, in step S50 the metadata decoding unit 64 reads the metadata number information from the encoded metadata of the object to be processed supplied from the separation unit 62.
- In step S51, the metadata decoding unit 64 reads the sample indexes from the encoded metadata of the object to be processed supplied from the separation unit 62. At this time, as many sample indexes as the number indicated by the metadata number information are read.
- From the metadata number information and the sample indexes read in this way, the number of metadata stored in the encoded metadata of the object to be processed and the sample positions of those metadata can be specified.
- When the number of metadata included in the encoded metadata of the object to be processed and the sample position of each metadata have been specified, the process proceeds to step S53.
- If it is determined in step S49 that the method is not the sample designation method, that is, if the method indicated by the switching index is the automatic switching method, the process proceeds to step S52.
- In step S52, based on the sample number information supplied from the separation unit 62, the metadata decoding unit 64 specifies the number of metadata included in the encoded metadata of the object to be processed and the sample position of each metadata, and the process proceeds to step S53.
- That is, in the automatic switching method, the number of metadata to be transmitted and the sample position of each metadata, that is, the samples whose metadata are to be transmitted, are predetermined for each number of samples constituting one frame.
- Therefore, the metadata decoding unit 64 can specify, from the sample number information, the number of metadata stored in the encoded metadata of the object to be processed and the sample positions of those metadata.
- In step S53, the metadata decoding unit 64 determines, based on the value of the additional metadata flag read in step S44, whether or not there is additional metadata.
- If it is determined in step S53 that there is additional metadata, in step S54 the metadata decoding unit 64 reads the additional metadata from the encoded metadata of the object to be processed. When the additional metadata has been read, the process proceeds to step S55.
- On the other hand, if it is determined in step S53 that there is no additional metadata, the process of step S54 is skipped, and the process proceeds to step S55.
- In step S55, the metadata decoding unit 64 reads the metadata from the encoded metadata of the object to be processed.
- At this time, as many metadata as the number specified by the above-described processing are read from the encoded metadata.
- In this way, the metadata and the additional metadata are read out for the audio signal of one frame of the object to be processed.
- The metadata decoding unit 64 supplies each read metadata to the gain calculation unit 65. At that time, the metadata is supplied in such a manner that the gain calculation unit 65 can identify which metadata belongs to which sample of which object. When additional metadata has been read, the metadata decoding unit 64 also supplies the read additional metadata to the gain calculation unit 65.
- In step S56, the metadata decoding unit 64 determines whether or not the metadata has been read for all objects.
- If it is determined in step S56 that the metadata has not been read for all objects, the process returns to step S44, and the above-described processing is repeated. In this case, an object that has not yet been processed is set as the new object to be processed, and the metadata and the like are read from the encoded metadata of that object.
- On the other hand, if it is determined in step S56 that the metadata has been read for all objects, the metadata decoding unit 64 supplies the independent flag supplied from the separation unit 62 to the gain calculation unit 65, the process proceeds to step S57, and rendering is started.
- In step S57, the gain calculation unit 65 calculates VBAP gains based on the metadata, the additional metadata, and the independent flag supplied from the metadata decoding unit 64.
- For example, the gain calculation unit 65 sequentially selects each object as the object to be processed, and further sequentially selects the samples having metadata in the frame of the audio signal of the object to be processed as the sample to be processed.
- For the sample to be processed, based on the position of the object in space indicated by the position information serving as the metadata of the sample and the position in space of each speaker of the speaker system 52 indicated by the arrangement position information, the gain calculation unit 65 calculates, by VBAP, the VBAP gain of each channel, that is, of the speaker of each channel.
- In VBAP, a sound image can be localized at the position of an object by outputting sound with predetermined gains from three or two speakers around the object.
- VBAP is described in detail in, for example, Ville Pulkki, "Virtual Sound Source Positioning Using Vector Base Amplitude Panning", Journal of the AES, vol. 45, no. 6, pp. 456-466, 1997.
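For three speakers, VBAP amounts to expressing the object's direction vector as a non-negative linear combination of the three speaker direction vectors and normalizing the resulting gains. A minimal sketch following Pulkki's formulation, assuming Cartesian unit direction vectors (function name illustrative):

```python
import numpy as np

def vbap_gains(object_dir, speaker_dirs):
    """Solve g such that g @ L = p (Pulkki, 1997), where the rows of L are
    the unit direction vectors of the three speakers and p is the unit
    direction vector of the object."""
    L = np.asarray(speaker_dirs, dtype=float)   # 3x3 matrix of speaker directions
    p = np.asarray(object_dir, dtype=float)
    g = p @ np.linalg.inv(L)
    g = np.maximum(g, 0.0)                      # valid triangles give non-negative gains
    return g / np.linalg.norm(g)                # normalize to constant power
```

An object lying exactly on one speaker direction yields a gain of 1 for that speaker and 0 for the others.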
- In step S58, the interpolation processing unit 73 performs interpolation processing to calculate, for each speaker, the VBAP gains of the samples without metadata.
- Specifically, for each speaker (channel) constituting the speaker system 52, the VBAP gain of the sample to be processed and the VBAP gain of another sample for which a VBAP gain has already been obtained (hereinafter also referred to as a reference sample) are used, and the VBAP gain of each sample located between the sample to be processed and the reference sample is calculated by linear interpolation or the like.
- In addition, when additional metadata is available, the gain calculation unit 65 calculates a VBAP gain using the additional metadata.
- That is, using the additional metadata, the gain calculation unit 65 takes the first sample of the frame or the last sample of the frame immediately before the frame as a reference sample, and calculates the VBAP gain of that reference sample.
- Then, the interpolation processing unit 73 calculates, by interpolation processing, the VBAP gain of each sample between the sample to be processed and the reference sample from the VBAP gain of the sample to be processed and the VBAP gain of the reference sample.
- On the other hand, when no additional metadata is transmitted in the frame, the VBAP gain is not calculated using additional metadata, and the interpolation processing is switched instead.
- That is, the gain calculation unit 65 takes the first sample of the frame or the last sample of the frame immediately before the frame as a reference sample, and sets the VBAP gain of that reference sample to zero.
- Then, the interpolation processing unit 73 calculates, by interpolation processing, the VBAP gain of each sample between the sample to be processed and the reference sample from the VBAP gain of the sample to be processed and the VBAP gain of the reference sample.
- Alternatively, the interpolation processing may be performed so that the VBAP gain of each sample to be interpolated is set to the same value as the VBAP gain of the sample to be processed.
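The two interpolation behaviors described above, linear interpolation toward a reference gain (which may be zero) and holding the gain of the sample to be processed, can be sketched as follows; function and parameter names are illustrative:

```python
import numpy as np

def interpolate_gains(cur_gain, ref_gain, num_steps, mode="linear"):
    """Sketch of the switched VBAP gain interpolation.

    mode="linear": linear interpolation from ref_gain to cur_gain; passing a
                   zero ref_gain models the no-additional-metadata case.
    mode="hold":   every interpolated sample takes cur_gain, the alternative
                   behavior for frames without additional metadata.
    Returns the gains of the num_steps - 1 samples strictly between the
    reference sample and the sample to be processed.
    """
    cur = np.asarray(cur_gain, dtype=float)
    ref = np.asarray(ref_gain, dtype=float)
    if mode == "hold":
        return np.tile(cur, (num_steps - 1, 1))
    fractions = np.arange(1, num_steps) / num_steps
    return ref + fractions[:, None] * (cur - ref)
```

With a zero reference gain the sound fades in linearly toward the sample to be processed, which avoids a discontinuity at a random-access point.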
- Note that the metadata decoding unit 64 may be configured to obtain, by interpolation processing, the metadata of samples without metadata. In this case, since the metadata of all samples of the audio signal are obtained, the interpolation processing unit 73 does not perform the VBAP gain interpolation processing.
- In step S59, the gain calculation unit 65 determines whether or not the VBAP gains of all samples in the frame of the audio signal of the object to be processed have been calculated.
- If it is determined in step S59 that the VBAP gains of all samples have not yet been calculated, the process returns to step S57, and the above-described processing is repeated. That is, the next sample having metadata is selected as the sample to be processed, and its VBAP gain is calculated.
- On the other hand, if it is determined in step S59 that the VBAP gains of all samples have been calculated, in step S60 the gain calculation unit 65 determines whether or not the VBAP gains of all objects have been calculated.
- If it is determined in step S60 that the VBAP gains of all objects have not yet been calculated, the process returns to step S57, and the above-described processing is repeated.
- On the other hand, if it is determined in step S60 that the VBAP gains of all objects have been calculated, the gain calculation unit 65 supplies the calculated VBAP gains to the audio signal generation unit 66, and the process proceeds to step S61.
- That is, the VBAP gain of each sample in the frame of the audio signal of each object, calculated for each speaker, is supplied to the audio signal generation unit 66.
- In step S61, based on the audio signal of each object supplied from the audio signal decoding unit 63 and the VBAP gain of each sample of each object supplied from the gain calculation unit 65, the audio signal generation unit 66 generates the audio signal of each speaker.
- Specifically, for each speaker, the audio signal generation unit 66 adds, for each sample, the signals obtained by multiplying the audio signal of each object by the VBAP gain obtained for that object for that speaker, thereby generating the audio signal of the speaker.
- For example, suppose that there are three objects OB1 to OB3, and that VBAP gains G1 to G3 are obtained for these objects for a predetermined speaker SP1 constituting the speaker system 52. In this case, the audio signal of the object OB1 multiplied by the VBAP gain G1, the audio signal of the object OB2 multiplied by the VBAP gain G2, and the audio signal of the object OB3 multiplied by the VBAP gain G3 are added, and the resulting audio signal is the audio signal supplied to the speaker SP1.
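The summation for one speaker can be sketched as follows, assuming per-sample gain arrays (names illustrative):

```python
import numpy as np

def render_speaker_signal(object_signals, vbap_gains):
    """Sum of each object's audio signal multiplied by that object's
    per-sample VBAP gain for one speaker.

    object_signals: one array of samples per object
    vbap_gains:     one array of per-sample gains per object (same lengths)
    """
    out = np.zeros_like(np.asarray(object_signals[0], dtype=float))
    for sig, gain in zip(object_signals, vbap_gains):
        out += np.asarray(sig, dtype=float) * np.asarray(gain, dtype=float)
    return out
```

Repeating this for every speaker of the speaker system yields the full set of channel signals.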
- In step S62, the audio signal generation unit 66 supplies the audio signal of each speaker obtained in the processing of step S61 to the corresponding speaker of the speaker system 52 and causes sound to be reproduced based on those audio signals, and the decoding process ends. Thereby, the sound of each object is reproduced by the speaker system 52.
- the decoding device 51 decodes the encoded audio data and the encoded metadata, performs rendering based on the audio signal and metadata obtained by the decoding, and generates an audio signal of each speaker.
- Since a plurality of metadata are obtained for the frame of the audio signal of each object when rendering is performed, the decoding device 51 can shorten the sections of samples whose VBAP gains are calculated by interpolation processing. As a result, not only can higher-quality sound be obtained, but decoding and rendering can also be performed in real time. Further, since additional metadata is included in the encoded metadata of some frames, random access, as well as decoding and rendering of an independent frame, can also be realized. Even for a frame that does not include additional metadata, random access and the decoding and rendering of an independent frame can be realized by switching the VBAP gain interpolation processing.
- the series of processes described above can be executed by hardware or can be executed by software.
- When the series of processes is executed by software, a program constituting the software is installed in a computer.
- Here, the computer includes a computer incorporated in dedicated hardware and, for example, a general-purpose personal computer capable of executing various functions when various programs are installed therein.
- FIG. 6 is a block diagram showing an example of the hardware configuration of a computer that executes the above-described series of processing by a program.
- In the computer, a CPU (Central Processing Unit) 501, a ROM (Read Only Memory) 502, and a RAM (Random Access Memory) 503 are connected to one another by a bus 504.
- An input / output interface 505 is further connected to the bus 504.
- An input unit 506, an output unit 507, a recording unit 508, a communication unit 509, and a drive 510 are connected to the input / output interface 505.
- the input unit 506 includes a keyboard, a mouse, a microphone, an image sensor, and the like.
- the output unit 507 includes a display, a speaker, and the like.
- the recording unit 508 includes a hard disk, a nonvolatile memory, and the like.
- the communication unit 509 includes a network interface or the like.
- the drive 510 drives a removable recording medium 511 such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory.
- In the computer configured as described above, for example, the CPU 501 loads the program recorded in the recording unit 508 into the RAM 503 via the input/output interface 505 and the bus 504 and executes it, whereby the above-described series of processes is performed.
- The program executed by the computer (CPU 501) can be provided by, for example, being recorded on the removable recording medium 511 as a package medium.
- the program can be provided via a wired or wireless transmission medium such as a local area network, the Internet, or digital satellite broadcasting.
- the program can be installed in the recording unit 508 via the input / output interface 505 by attaching the removable recording medium 511 to the drive 510. Further, the program can be received by the communication unit 509 via a wired or wireless transmission medium and installed in the recording unit 508. In addition, the program can be installed in the ROM 502 or the recording unit 508 in advance.
- The program executed by the computer may be a program whose processes are performed in time series in the order described in this specification, or a program whose processes are performed in parallel or at necessary timing, such as when a call is made.
- the present technology can take a cloud computing configuration in which one function is shared by a plurality of devices via a network and is jointly processed.
- Each step described in the above flowcharts can be executed by one device or shared and executed by a plurality of devices.
- Furthermore, when a plurality of processes are included in one step, the plurality of processes included in that one step can be executed by one device or shared and executed by a plurality of devices.
- the present technology can be configured as follows.
- (1) A decoding device including: an acquisition unit that acquires encoded audio data obtained by encoding an audio signal of a frame of a predetermined time interval of an audio object, and a plurality of metadata of the frame; a decoding unit that decodes the encoded audio data; and a rendering unit that performs rendering based on the audio signal obtained by the decoding and the plurality of metadata.
- (2) The decoding device according to (1), in which the metadata includes position information indicating a position of the audio object.
- (3) The decoding device according to (1) or (2), in which each of the plurality of metadata is metadata of each of a plurality of samples in the frame of the audio signal.
- (4) The decoding device according to (3), in which each of the plurality of metadata is metadata of each of a plurality of samples arranged at intervals of the number of samples obtained by dividing the number of samples constituting the frame by the number of the plurality of metadata.
- (5) The decoding device according to (3), in which each of the plurality of metadata is metadata of each of a plurality of samples indicated by each of a plurality of sample indexes.
- (6) The decoding device according to (3), in which each of the plurality of metadata is metadata of each of a plurality of samples arranged at predetermined sample-number intervals in the frame.
- (7) The decoding device according to any one of (1) to (6), in which the plurality of metadata includes metadata for performing interpolation processing of a gain of a sample of the audio signal calculated based on the metadata.
- (8) A decoding method including steps of: acquiring encoded audio data obtained by encoding an audio signal of a frame of a predetermined time interval of an audio object, and a plurality of metadata of the frame; decoding the encoded audio data; and performing rendering based on the audio signal obtained by the decoding and the plurality of metadata.
- (9) A program for causing a computer to execute processing including steps of: acquiring encoded audio data obtained by encoding an audio signal of a frame of a predetermined time interval of an audio object, and a plurality of metadata of the frame; decoding the encoded audio data; and performing rendering based on the audio signal obtained by the decoding and the plurality of metadata.
- (10) An encoding device including: an encoding unit that encodes an audio signal of a frame of a predetermined time interval of an audio object; and a generation unit that generates a bitstream including encoded audio data obtained by the encoding and a plurality of metadata of the frame.
- (11) The encoding device according to (10), in which the metadata includes position information indicating a position of the audio object.
- (12) The encoding device according to (10) or (11), in which each of the plurality of metadata is metadata of each of a plurality of samples in the frame of the audio signal.
- (13) The encoding device according to (12), in which each of the plurality of metadata is metadata of each of a plurality of samples arranged at intervals of the number of samples obtained by dividing the number of samples constituting the frame by the number of the plurality of metadata.
- (14) The encoding device according to (12), in which each of the plurality of metadata is metadata of each of a plurality of samples indicated by each of a plurality of sample indexes.
- (15) The encoding device according to (12), in which each of the plurality of metadata is metadata of each of a plurality of samples arranged at predetermined sample-number intervals in the frame.
- (16) The encoding device according to any one of (10) to (15), in which the plurality of metadata includes metadata for performing interpolation processing of a gain of a sample of the audio signal calculated based on the metadata.
- (17) The encoding device according to any one of (10) to (16), further including an interpolation processing unit that performs interpolation processing on the metadata.
- (18) An encoding method including steps of: encoding an audio signal of a frame of a predetermined time interval of an audio object; and generating a bitstream including encoded audio data obtained by the encoding and a plurality of metadata of the frame.
- (19) A program for causing a computer to execute processing including steps of: encoding an audio signal of a frame of a predetermined time interval of an audio object; and generating a bitstream including encoded audio data obtained by the encoding and a plurality of metadata of the frame.
Abstract
Description
<Overview of the Present Technology>
The present technology makes it possible to obtain higher-quality sound in cases where the audio signal of an audio object and metadata such as position information of the audio object are encoded and transmitted, and where those audio signals and metadata are decoded and the sound is reproduced on the decoding side. In the following, an audio object is also referred to simply as an object.
First, the number designation method will be described.
Next, the sample designation method will be described.
Furthermore, the automatic switching method will be described.
Next, more specific embodiments to which the present technology described above is applied will be described.
Next, the configuration of an encoding device that outputs the bitstream shown in FIG. 1 will be described. FIG. 2 is a diagram showing a configuration example of an encoding device to which the present technology is applied.
When the audio signal of an object is supplied from the outside, the encoding device 11 performs encoding processing and outputs a bitstream. Hereinafter, the encoding processing by the encoding device 11 will be described with reference to the flowchart of FIG. 3. This encoding processing is performed for each frame of the audio signal.
Next, a decoding device that receives (acquires) the bitstream output from the encoding device 11 and performs decoding will be described. For example, a decoding device to which the present technology is applied is configured as shown in FIG. 4.
When a bitstream is transmitted from the encoding device 11, the decoding device 51 performs decoding processing of receiving (acquiring) and decoding the bitstream. Hereinafter, the decoding processing by the decoding device 51 will be described with reference to the flowchart of FIG. 5. This decoding processing is performed for each frame of the audio signal.
(1)
A decoding device including:
an acquisition unit that acquires encoded audio data obtained by encoding an audio signal of a frame of a predetermined time interval of an audio object, and a plurality of metadata of the frame;
a decoding unit that decodes the encoded audio data; and
a rendering unit that performs rendering based on the audio signal obtained by the decoding and the plurality of metadata.
(2)
The decoding device according to (1), in which the metadata includes position information indicating a position of the audio object.
(3)
The decoding device according to (1) or (2), in which each of the plurality of metadata is metadata of each of a plurality of samples in the frame of the audio signal.
(4)
The decoding device according to (3), in which each of the plurality of metadata is metadata of each of a plurality of samples arranged at intervals of the number of samples obtained by dividing the number of samples constituting the frame by the number of the plurality of metadata.
(5)
The decoding device according to (3), in which each of the plurality of metadata is metadata of each of a plurality of samples indicated by each of a plurality of sample indexes.
(6)
The decoding device according to (3), in which each of the plurality of metadata is metadata of each of a plurality of samples arranged at predetermined sample-number intervals in the frame.
(7)
The decoding device according to any one of (1) to (6), in which the plurality of metadata includes metadata for performing interpolation processing of a gain of a sample of the audio signal calculated based on the metadata.
(8)
A decoding method including steps of:
acquiring encoded audio data obtained by encoding an audio signal of a frame of a predetermined time interval of an audio object, and a plurality of metadata of the frame;
decoding the encoded audio data; and
performing rendering based on the audio signal obtained by the decoding and the plurality of metadata.
(9)
A program for causing a computer to execute processing including steps of:
acquiring encoded audio data obtained by encoding an audio signal of a frame of a predetermined time interval of an audio object, and a plurality of metadata of the frame;
decoding the encoded audio data; and
performing rendering based on the audio signal obtained by the decoding and the plurality of metadata.
(10)
An encoding device including:
an encoding unit that encodes an audio signal of a frame of a predetermined time interval of an audio object; and
a generation unit that generates a bitstream including encoded audio data obtained by the encoding and a plurality of metadata of the frame.
(11)
The encoding device according to (10), in which the metadata includes position information indicating a position of the audio object.
(12)
The encoding device according to (10) or (11), in which each of the plurality of metadata is metadata of each of a plurality of samples in the frame of the audio signal.
(13)
The encoding device according to (12), in which each of the plurality of metadata is metadata of each of a plurality of samples arranged at intervals of the number of samples obtained by dividing the number of samples constituting the frame by the number of the plurality of metadata.
(14)
The encoding device according to (12), in which each of the plurality of metadata is metadata of each of a plurality of samples indicated by each of a plurality of sample indexes.
(15)
The encoding device according to (12), in which each of the plurality of metadata is metadata of each of a plurality of samples arranged at predetermined sample-number intervals in the frame.
(16)
The encoding device according to any one of (10) to (15), in which the plurality of metadata includes metadata for performing interpolation processing of a gain of a sample of the audio signal calculated based on the metadata.
(17)
The encoding device according to any one of (10) to (16), further including an interpolation processing unit that performs interpolation processing on the metadata.
(18)
An encoding method including steps of:
encoding an audio signal of a frame of a predetermined time interval of an audio object; and
generating a bitstream including encoded audio data obtained by the encoding and a plurality of metadata of the frame.
(19)
A program for causing a computer to execute processing including steps of:
encoding an audio signal of a frame of a predetermined time interval of an audio object; and
generating a bitstream including encoded audio data obtained by the encoding and a plurality of metadata of the frame.
Claims (19)
- A decoding apparatus including: an acquisition unit configured to acquire encoded audio data obtained by encoding an audio signal of a frame of a predetermined time interval of an audio object, and a plurality of metadata of the frame; a decoding unit configured to decode the encoded audio data; and a rendering unit configured to perform rendering on the basis of the audio signal obtained by the decoding and the plurality of metadata.
- The decoding apparatus according to claim 1, wherein the metadata includes position information indicating a position of the audio object.
- The decoding apparatus according to claim 1, wherein each of the plurality of metadata is metadata of a respective one of a plurality of samples in the frame of the audio signal.
- The decoding apparatus according to claim 3, wherein each of the plurality of metadata is metadata of a respective one of a plurality of samples arranged at intervals of a number of samples obtained by dividing the number of samples constituting the frame by the number of the plurality of metadata.
- The decoding apparatus according to claim 3, wherein each of the plurality of metadata is metadata of a respective one of a plurality of samples each indicated by a respective one of a plurality of sample indices.
- The decoding apparatus according to claim 3, wherein each of the plurality of metadata is metadata of a respective one of a plurality of samples arranged at predetermined sample-number intervals in the frame.
- The decoding apparatus according to claim 1, wherein the plurality of metadata includes metadata for performing interpolation processing on gains of samples of the audio signal, the gains being calculated on the basis of the metadata.
- A decoding method including the steps of: acquiring encoded audio data obtained by encoding an audio signal of a frame of a predetermined time interval of an audio object, and a plurality of metadata of the frame; decoding the encoded audio data; and performing rendering on the basis of the audio signal obtained by the decoding and the plurality of metadata.
- A program for causing a computer to execute processing including the steps of: acquiring encoded audio data obtained by encoding an audio signal of a frame of a predetermined time interval of an audio object, and a plurality of metadata of the frame; decoding the encoded audio data; and performing rendering on the basis of the audio signal obtained by the decoding and the plurality of metadata.
- An encoding apparatus including: an encoding unit configured to encode an audio signal of a frame of a predetermined time interval of an audio object; and a generation unit configured to generate a bit stream including encoded audio data obtained by the encoding and a plurality of metadata of the frame.
- The encoding apparatus according to claim 10, wherein the metadata includes position information indicating a position of the audio object.
- The encoding apparatus according to claim 10, wherein each of the plurality of metadata is metadata of a respective one of a plurality of samples in the frame of the audio signal.
- The encoding apparatus according to claim 12, wherein each of the plurality of metadata is metadata of a respective one of a plurality of samples arranged at intervals of a number of samples obtained by dividing the number of samples constituting the frame by the number of the plurality of metadata.
- The encoding apparatus according to claim 12, wherein each of the plurality of metadata is metadata of a respective one of a plurality of samples each indicated by a respective one of a plurality of sample indices.
- The encoding apparatus according to claim 12, wherein each of the plurality of metadata is metadata of a respective one of a plurality of samples arranged at predetermined sample-number intervals in the frame.
- The encoding apparatus according to claim 10, wherein the plurality of metadata includes metadata for performing interpolation processing on gains of samples of the audio signal, the gains being calculated on the basis of the metadata.
- The encoding apparatus according to claim 10, further including an interpolation processing unit configured to perform interpolation processing on the metadata.
- An encoding method including the steps of: encoding an audio signal of a frame of a predetermined time interval of an audio object; and generating a bit stream including encoded audio data obtained by the encoding and a plurality of metadata of the frame.
- A program for causing a computer to execute processing including the steps of: encoding an audio signal of a frame of a predetermined time interval of an audio object; and generating a bit stream including encoded audio data obtained by the encoding and a plurality of metadata of the frame.
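Several of the claims above refer to interpolation processing on per-sample gains calculated from the metadata. For illustration only (this sketch is not part of the claimed implementation, and the names are hypothetical), one simple form of such processing is linear interpolation between the gain at the previous metadata-bearing sample and the gain at the current one:

```python
def interpolate_gains(prev_gain: float, cur_gain: float, num_samples: int) -> list[float]:
    """Linearly interpolate a gain for every sample between two
    metadata-bearing samples (exclusive of the previous metadata sample,
    inclusive of the current one)."""
    step = (cur_gain - prev_gain) / num_samples
    return [prev_gain + step * (n + 1) for n in range(num_samples)]

# Gains ramp smoothly from 0.0 toward 1.0 over four samples,
# avoiding an audible step at the metadata boundary.
gains = interpolate_gains(0.0, 1.0, 4)
```

Applying a smoothly interpolated gain to each sample, rather than switching gain once per frame, is what prevents discontinuities when an object's rendering parameters change between metadata points.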
Priority Applications (13)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US15/735,630 US20180315436A1 (en) | 2015-06-19 | 2016-06-03 | Encoding apparatus, encoding method, decoding apparatus, decoding method, and program |
JP2017524823A JP6915536B2 (ja) | 2015-06-19 | 2016-06-03 | Encoding apparatus and method, decoding apparatus and method, and program |
KR1020187027071A KR102140388B1 (ko) | 2015-06-19 | 2016-06-03 | Decoding device, decoding method, and recording medium |
BR112017026743-8A BR112017026743B1 (pt) | 2015-06-19 | 2016-06-03 | Decoding apparatus and encoding apparatus |
EP16811469.2A EP3316599B1 (en) | 2015-06-19 | 2016-06-03 | Audio decoding apparatus and audio encoding apparatus |
CN201680034330.XA CN107637097B (zh) | 2015-06-19 | 2016-06-03 | Encoding apparatus and method, decoding apparatus and method, and program |
CN202110632109.7A CN113470665B (zh) | 2015-06-19 | 2016-06-03 | Encoding apparatus and method, decoding apparatus and method, and computer-readable recording medium |
MX2017016228A MX2017016228A (es) | 2015-06-19 | 2016-06-03 | Encoding apparatus, encoding method, decoding apparatus, decoding method, and program |
RU2017143404A RU2720439C2 (ru) | 2015-06-19 | 2016-06-03 | Encoding device, encoding method, decoding device, decoding method, and program |
CA2989099A CA2989099C (en) | 2015-06-19 | 2016-06-03 | Encoding apparatus, encoding method, decoding apparatus, decoding method, and program |
KR1020177035762A KR20170141276A (ko) | 2015-06-19 | 2016-06-03 | Encoding apparatus and method, decoding apparatus and method, and recording medium |
HK18103780.8A HK1244384A1 (zh) | 2015-06-19 | 2018-03-19 | Encoding apparatus and method, decoding apparatus and method, and program |
US16/447,693 US11170796B2 (en) | 2015-06-19 | 2019-06-20 | Multiple metadata part-based encoding apparatus, encoding method, decoding apparatus, decoding method, and program |
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2015-123589 | 2015-06-19 | ||
JP2015123589 | 2015-06-19 | ||
JP2015196494 | 2015-10-02 | ||
JP2015-196494 | 2015-10-02 |
Related Child Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US15/735,630 A-371-Of-International US20180315436A1 (en) | 2015-06-19 | 2016-06-03 | Encoding apparatus, encoding method, decoding apparatus, decoding method, and program |
US16/447,693 Continuation US11170796B2 (en) | 2015-06-19 | 2019-06-20 | Multiple metadata part-based encoding apparatus, encoding method, decoding apparatus, decoding method, and program |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2016203994A1 true WO2016203994A1 (ja) | 2016-12-22 |
Family
ID=57545216
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2016/066574 WO2016203994A1 (ja) | Encoding apparatus and method, decoding apparatus and method, and program |
Country Status (12)
Country | Link |
---|---|
US (2) | US20180315436A1 (ja) |
EP (1) | EP3316599B1 (ja) |
JP (4) | JP6915536B2 (ja) |
KR (2) | KR20170141276A (ja) |
CN (2) | CN107637097B (ja) |
BR (1) | BR112017026743B1 (ja) |
CA (2) | CA3232321A1 (ja) |
HK (1) | HK1244384A1 (ja) |
MX (1) | MX2017016228A (ja) |
RU (1) | RU2720439C2 (ja) |
TW (1) | TWI607655B (ja) |
WO (1) | WO2016203994A1 (ja) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2019069710A1 (ja) * | 2017-10-05 | 2019-04-11 | Sony Corporation | Encoding apparatus and method, decoding apparatus and method, and program |
JP2020120377A (ja) * | 2019-01-25 | 2020-08-06 | Japan Broadcasting Corporation | Audio authoring apparatus, audio rendering apparatus, transmitting apparatus, receiving apparatus, and method |
JP2021530723A (ja) * | 2018-07-02 | 2021-11-11 | Dolby Laboratories Licensing Corporation | Method and apparatus for generating or decoding a bitstream comprising immersive audio signals |
WO2022009694A1 (ja) * | 2020-07-09 | 2022-01-13 | Sony Group Corporation | Signal processing apparatus and method, and program |
JP2023526136A (ja) * | 2020-05-26 | 2023-06-20 | Dolby International AB | Improved main-associated audio experience with efficient ducking gain application |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
TWI607655B (zh) * | 2015-06-19 | 2017-12-01 | Sony Corp | Coding apparatus and method, decoding apparatus and method, and program |
RU2632473C1 (ru) * | 2016-09-30 | 2017-10-05 | ООО "Ай Ти Ви групп" | Способ обмена данными между ip видеокамерой и сервером (варианты) |
CN114898761A (zh) * | 2017-08-10 | 2022-08-12 | Huawei Technologies Co., Ltd. | Stereo signal encoding and decoding method and apparatus |
US10650834B2 (en) | 2018-01-10 | 2020-05-12 | Savitech Corp. | Audio processing method and non-transitory computer readable medium |
CN114128309B (zh) * | 2019-07-19 | 2024-05-07 | Sony Group Corporation | Signal processing apparatus and method, and program |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2014036121A1 (en) * | 2012-08-31 | 2014-03-06 | Dolby Laboratories Licensing Corporation | System for rendering and playback of object based audio in various listening environments |
JP2014522155A (ja) * | 2011-07-01 | 2014-08-28 | Dolby Laboratories Licensing Corporation | System and method for adaptive audio signal generation, coding and rendering |
Family Cites Families (20)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP3352406B2 (ja) * | 1998-09-17 | 2002-12-03 | Matsushita Electric Industrial Co., Ltd. | Method and apparatus for encoding and decoding audio signals |
US7624021B2 (en) * | 2004-07-02 | 2009-11-24 | Apple Inc. | Universal container for audio data |
EP2629292B1 (en) | 2006-02-03 | 2016-06-29 | Electronics and Telecommunications Research Institute | Method and apparatus for control of randering multiobject or multichannel audio signal using spatial cue |
CN101290774B (zh) * | 2007-01-31 | 2011-09-07 | Guangzhou Guangsheng Digital Technology Co., Ltd. | Audio encoding and decoding system |
EP2158791A1 (en) * | 2007-06-26 | 2010-03-03 | Koninklijke Philips Electronics N.V. | A binaural object-oriented audio decoder |
CN102714035B (zh) * | 2009-10-16 | 2015-12-16 | Fraunhofer-Gesellschaft | Apparatus and method for providing one or more adjusted parameters |
EP2450880A1 (en) * | 2010-11-05 | 2012-05-09 | Thomson Licensing | Data structure for Higher Order Ambisonics audio data |
WO2013001325A1 (en) * | 2011-06-29 | 2013-01-03 | Thomson Licensing | Managing common content on a distributed storage system |
US9473870B2 (en) * | 2012-07-16 | 2016-10-18 | Qualcomm Incorporated | Loudspeaker position compensation with 3D-audio hierarchical coding |
US9516446B2 (en) * | 2012-07-20 | 2016-12-06 | Qualcomm Incorporated | Scalable downmix design for object-based surround codec with cluster analysis by synthesis |
WO2014087277A1 (en) * | 2012-12-06 | 2014-06-12 | Koninklijke Philips N.V. | Generating drive signals for audio transducers |
WO2014091375A1 (en) * | 2012-12-14 | 2014-06-19 | Koninklijke Philips N.V. | Reverberation processing in an audio signal |
MX347551B (es) * | 2013-01-15 | 2017-05-02 | Koninklijke Philips Nv | Binaural audio processing |
RU2602332C1 (ru) * | 2013-01-21 | 2016-11-20 | Dolby Laboratories Licensing Corporation | Metadata transcoding |
US9607624B2 (en) * | 2013-03-29 | 2017-03-28 | Apple Inc. | Metadata driven dynamic range control |
TWI530941B (zh) * | 2013-04-03 | 2016-04-21 | Dolby Laboratories Licensing Corporation | Methods and systems for interactive rendering of object-based audio |
US8804971B1 (en) * | 2013-04-30 | 2014-08-12 | Dolby International Ab | Hybrid encoding of higher frequency and downmixed low frequency content of multichannel audio |
CN109712630B (zh) * | 2013-05-24 | 2023-05-30 | Dolby International AB | Efficient coding of audio scenes comprising audio objects |
TWM487509U (zh) * | 2013-06-19 | 2014-10-01 | Dolby Laboratories Licensing Corporation | Audio processing apparatus and electronic device |
TWI607655B (zh) * | 2015-06-19 | 2017-12-01 | Sony Corp | Coding apparatus and method, decoding apparatus and method, and program |
2016
- 2016-06-02 TW TW105117389A patent/TWI607655B/zh active
- 2016-06-03 CA CA3232321A patent/CA3232321A1/en active Pending
- 2016-06-03 KR KR1020177035762A patent/KR20170141276A/ko active Search and Examination
- 2016-06-03 CA CA2989099A patent/CA2989099C/en active Active
- 2016-06-03 RU RU2017143404A patent/RU2720439C2/ru active
- 2016-06-03 CN CN201680034330.XA patent/CN107637097B/zh active Active
- 2016-06-03 US US15/735,630 patent/US20180315436A1/en not_active Abandoned
- 2016-06-03 CN CN202110632109.7A patent/CN113470665B/zh active Active
- 2016-06-03 WO PCT/JP2016/066574 patent/WO2016203994A1/ja active Application Filing
- 2016-06-03 BR BR112017026743-8A patent/BR112017026743B1/pt active IP Right Grant
- 2016-06-03 EP EP16811469.2A patent/EP3316599B1/en active Active
- 2016-06-03 MX MX2017016228A patent/MX2017016228A/es unknown
- 2016-06-03 KR KR1020187027071A patent/KR102140388B1/ko active IP Right Grant
- 2016-06-03 JP JP2017524823A patent/JP6915536B2/ja active Active
2018
- 2018-03-19 HK HK18103780.8A patent/HK1244384A1/zh unknown
2019
- 2019-06-20 US US16/447,693 patent/US11170796B2/en active Active
2021
- 2021-05-10 JP JP2021079510A patent/JP7205566B2/ja active Active
2022
- 2022-12-12 JP JP2022198009A patent/JP7509190B2/ja active Active
2024
- 2024-06-20 JP JP2024099700A patent/JP2024111209A/ja active Pending
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2014522155A (ja) * | 2011-07-01 | 2014-08-28 | Dolby Laboratories Licensing Corporation | System and method for adaptive audio signal generation, coding and rendering |
WO2014036121A1 (en) * | 2012-08-31 | 2014-03-06 | Dolby Laboratories Licensing Corporation | System for rendering and playback of object based audio in various listening environments |
Cited By (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11595056B2 (en) | 2017-10-05 | 2023-02-28 | Sony Corporation | Encoding device and method, decoding device and method, and program |
JP7358986B2 (ja) | 2017-10-05 | 2023-10-11 | Sony Group Corporation | Decoding apparatus and method, and program |
CN111164679B (zh) * | 2017-10-05 | 2024-04-09 | Sony Corporation | Encoding apparatus and method, decoding apparatus and method, and program |
JPWO2019069710A1 (ja) * | 2017-10-05 | 2020-11-05 | Sony Corporation | Encoding apparatus and method, decoding apparatus and method, and program |
WO2019069710A1 (ja) * | 2017-10-05 | 2019-04-11 | ソニー株式会社 | 符号化装置および方法、復号装置および方法、並びにプログラム |
CN111164679A (zh) | 2017-10-05 | 2020-05-15 | Sony Corporation | Encoding apparatus and method, decoding apparatus and method, and program |
JP2021530723A (ja) * | 2018-07-02 | 2021-11-11 | Dolby Laboratories Licensing Corporation | Method and apparatus for generating or decoding a bitstream comprising immersive audio signals |
US12020718B2 (en) | 2018-07-02 | 2024-06-25 | Dolby International Ab | Methods and devices for generating or decoding a bitstream comprising immersive audio signals |
JP7575947B2 (ja) | 2018-07-02 | 2024-10-30 | Dolby Laboratories Licensing Corporation | Method and apparatus for generating a bitstream comprising an immersive audio signal |
JP7441057B2 (ja) | 2019-01-25 | 2024-02-29 | Japan Broadcasting Corporation | Audio authoring apparatus, audio rendering apparatus, transmitting apparatus, receiving apparatus, and method |
JP2020120377A (ja) * | 2019-01-25 | 2020-08-06 | Japan Broadcasting Corporation | Audio authoring apparatus, audio rendering apparatus, transmitting apparatus, receiving apparatus, and method |
JP7434610B2 (ja) | 2020-05-26 | 2024-02-20 | Dolby International AB | Improved main-associated audio experience with efficient ducking gain application |
JP2023526136A (ja) * | 2020-05-26 | 2023-06-20 | Dolby International AB | Improved main-associated audio experience with efficient ducking gain application |
WO2022009694A1 (ja) * | 2020-07-09 | 2022-01-13 | Sony Group Corporation | Signal processing apparatus and method, and program |
Also Published As
Publication number | Publication date |
---|---|
US20180315436A1 (en) | 2018-11-01 |
EP3316599A1 (en) | 2018-05-02 |
TWI607655B (zh) | 2017-12-01 |
KR102140388B1 (ko) | 2020-07-31 |
JP2023025251A (ja) | 2023-02-21 |
EP3316599A4 (en) | 2019-02-20 |
JP6915536B2 (ja) | 2021-08-04 |
JP7205566B2 (ja) | 2023-01-17 |
US11170796B2 (en) | 2021-11-09 |
MX2017016228A (es) | 2018-04-20 |
JPWO2016203994A1 (ja) | 2018-04-05 |
JP2021114001A (ja) | 2021-08-05 |
RU2017143404A (ru) | 2019-06-13 |
CA2989099C (en) | 2024-04-16 |
CA2989099A1 (en) | 2016-12-22 |
RU2720439C2 (ru) | 2020-04-29 |
CA3232321A1 (en) | 2016-12-22 |
KR20180107307A (ko) | 2018-10-01 |
EP3316599B1 (en) | 2020-10-28 |
BR112017026743A2 (ja) | 2018-08-28 |
KR20170141276A (ko) | 2017-12-22 |
US20190304479A1 (en) | 2019-10-03 |
CN113470665B (zh) | 2024-08-16 |
CN107637097B (zh) | 2021-06-29 |
RU2017143404A3 (ja) | 2019-11-13 |
CN113470665A (zh) | 2021-10-01 |
JP2024111209A (ja) | 2024-08-16 |
HK1244384A1 (zh) | 2018-08-03 |
JP7509190B2 (ja) | 2024-07-02 |
CN107637097A (zh) | 2018-01-26 |
TW201717663A (zh) | 2017-05-16 |
BR112017026743B1 (pt) | 2022-12-27 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP7509190B2 (ja) | Decoding apparatus and method, and program | |
JP6510541B2 (ja) | Transitioning of ambient higher-order ambisonic coefficients | |
US9875746B2 (en) | Encoding device and method, decoding device and method, and program | |
US9847088B2 (en) | Intermediate compression for higher order ambisonic audio data | |
RU2689438C2 (ru) | Encoding device and encoding method, decoding device and decoding method, and program | |
US9875745B2 (en) | Normalization of ambient higher order ambisonic audio data | |
JP7459913B2 (ja) | Signal processing apparatus and method, and program | |
US20170092280A1 (en) | Information processing apparatus and information processing method | |
JP6619091B2 (ja) | Screen-related adaptation of higher-order ambisonic (HOA) content | |
KR102677399B1 (ko) | Signal processing apparatus and method, and program | |
JP4743228B2 (ja) | Digital audio signal analysis method, apparatus therefor, and video/audio recording apparatus |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 16811469 Country of ref document: EP Kind code of ref document: A1 |
|
ENP | Entry into the national phase |
Ref document number: 2017524823 Country of ref document: JP Kind code of ref document: A |
|
ENP | Entry into the national phase |
Ref document number: 2989099 Country of ref document: CA |
|
ENP | Entry into the national phase |
Ref document number: 20177035762 Country of ref document: KR Kind code of ref document: A |
|
WWE | Wipo information: entry into national phase |
Ref document number: 15735630 Country of ref document: US Ref document number: 2017143404 Country of ref document: RU |
|
WWE | Wipo information: entry into national phase |
Ref document number: MX/A/2017/016228 Country of ref document: MX |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2016811469 Country of ref document: EP |
|
REG | Reference to national code |
Ref country code: BR Ref legal event code: B01A Ref document number: 112017026743 Country of ref document: BR |
|
ENP | Entry into the national phase |
Ref document number: 112017026743 Country of ref document: BR Kind code of ref document: A2 Effective date: 20171212 |