WO2017082050A1 - Decoding device, decoding method, and program - Google Patents
Decoding device, decoding method, and program
- Publication number
- WO2017082050A1 (PCT/JP2016/081699)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- decoding
- boundary position
- switching
- processing
- processing unit
- Prior art date
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/02—using spectral analysis, e.g. transform vocoders or subband vocoders
- G10L19/022—Blocking, i.e. grouping of samples in time; Choice of analysis windows; Overlap factoring
- G10L19/0212—using orthogonal transformation
- G10L19/032—Quantisation or dequantisation of spectral components
- G10L19/04—using predictive techniques
- G10L19/08—Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
- G10L19/12—the excitation function being a code excitation, e.g. in code excited linear prediction [CELP] vocoders
- G10L19/16—Vocoder architecture
- G10L19/167—Audio streaming, i.e. formatting and decoding of an encoded audio signal representation into a data stream for transmission or storage purposes
Definitions
- the present disclosure relates to a decoding device, a decoding method, and a program, and more particularly, to a decoding device, a decoding method, and a program suitable for use when switching output between audio encoded bitstreams whose reproduction timings are synchronized.
- some contents such as movies, news, and sports broadcasts are provided with audio in multiple languages (for example, Japanese and English) for the video.
- the playback timings of these multiple audio tracks are synchronized with one another.
- the audio with synchronized playback timing is prepared as audio encoded bit streams, and each audio encoded bit stream is assumed to have been variable-length encoded by an encoding process that includes at least MDCT (Modified Discrete Cosine Transform) processing, such as AAC (Advanced Audio Coding). Note that MPEG-2 AAC, an audio encoding method that includes MDCT processing, is adopted in terrestrial digital television broadcasting (see, for example, Non-Patent Document 1).
- FIG. 1 shows a simplified example of a conventional configuration of an encoding apparatus that performs encoding processing on audio source data, and of a decoding apparatus that performs decoding processing on the audio encoded bitstream output from the encoding apparatus.
- the encoding apparatus 10 includes an MDCT unit 11, a quantization unit 12, and a variable length coding unit 13.
- the MDCT unit 11 divides the audio source data input from the previous stage into frames of a predetermined time width and performs MDCT processing with each frame overlapping the preceding and following frames, thereby converting the source data from time-domain values into frequency-domain values, which are output to the quantization unit 12.
- the quantization unit 12 quantizes the input from the MDCT unit 11 and outputs the result to the variable length coding unit 13.
- the variable length encoding unit 13 generates and outputs an audio encoded bitstream by performing variable length encoding on the quantized value.
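To make the roles of the quantization unit 12 and the decoder-side inverse quantization concrete, here is a minimal sketch using a uniform quantizer round trip. This is only an illustration: actual AAC quantization is non-uniform and driven by scale factors, and the step size and sample values here are assumptions.

```python
import numpy as np

def quantize(mdct_coeffs: np.ndarray, step: float) -> np.ndarray:
    """Model of the quantization unit: map MDCT values to integer indices."""
    return np.round(mdct_coeffs / step).astype(np.int64)

def dequantize(indices: np.ndarray, step: float) -> np.ndarray:
    """Model of the inverse quantization unit: recover approximate MDCT values."""
    return indices.astype(np.float64) * step

coeffs = np.array([0.93, -2.41, 0.08, 4.50])
q = quantize(coeffs, step=0.1)
recovered = dequantize(q, step=0.1)
# for a uniform quantizer, the error is bounded by half a step
assert np.max(np.abs(recovered - coeffs)) <= 0.05 + 1e-12
```

The recovered values, not the original coefficients, are what reaches the IMDCT stage; the variable-length coding step losslessly packs the integer indices and is omitted here.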
- the decoding device 20 is mounted on, for example, a receiving device that receives broadcast or distributed content, or a playback device that plays back content recorded on a recording medium, and includes a decoding unit 21, an inverse quantization unit 22, and an IMDCT (Inverse MDCT) unit 23.
- the decoding unit 21 corresponding to the variable length encoding unit 13 performs a decoding process on the audio encoded bit stream in units of frames and outputs the decoding result to the inverse quantization unit 22.
- the inverse quantization unit 22 corresponding to the quantization unit 12 performs inverse quantization on the decoding result and outputs the processing result to the IMDCT unit 23.
- the IMDCT unit 23 corresponding to the MDCT unit 11 performs IMDCT processing on the inverse quantization result, thereby reconstructing PCM data corresponding to the source data before encoding.
- the IMDCT process performed by the IMDCT unit 23 will be described in detail.
- FIG. 2 shows IMDCT processing by the IMDCT unit 23.
- the IMDCT unit 23 performs IMDCT processing on the audio encoded bitstreams (inverse quantization results) BS1-1 and BS1-2 for two consecutive frames (Frame #1 and Frame #2), obtaining IMDCT-OUT #1-1 as the inverse transformation result.
- similarly, IMDCT processing is performed on the audio encoded bitstreams (inverse quantization results) BS1-2 and BS1-3 for the two frames (Frame #2 and Frame #3) that overlap with the above, obtaining IMDCT-OUT #1-2 as the inverse transformation result. Then, by overlap-adding IMDCT-OUT #1-1 and IMDCT-OUT #1-2, PCM1-2, the PCM data corresponding to Frame #2, is completely reconstructed.
- the term “completely” used here means that the PCM data is reconstructed including the processing up to the overlap addition; it does not mean that the source data is reproduced 100%.
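The overlap-add reconstruction described above can be sketched numerically. The following is a minimal MDCT/IMDCT round trip with a sine window (a common window satisfying the Princen-Bradley condition), not the exact AAC filterbank; the frame size and test signal are assumptions. It shows that each block of PCM samples is "completely" reconstructed only after the IMDCT outputs of two consecutive overlapping frames are overlap-added:

```python
import numpy as np

def mdct_basis(N: int) -> np.ndarray:
    """Cosine basis for an N-band MDCT over 2N input samples."""
    n = np.arange(2 * N)
    k = np.arange(N)
    return np.cos(np.pi / N * (n[None, :] + 0.5 + N / 2) * (k[:, None] + 0.5))

def sine_window(N: int) -> np.ndarray:
    n = np.arange(2 * N)
    return np.sin(np.pi / (2 * N) * (n + 0.5))

N = 64
B = mdct_basis(N)
w = sine_window(N)

rng = np.random.default_rng(0)
x = rng.standard_normal(4 * N)                    # source data (time domain)
xp = np.concatenate([np.zeros(N), x, np.zeros(N)])

# Analysis: 50%-overlapped frames of 2N samples, windowed, then MDCT
frames = [xp[i * N : i * N + 2 * N] for i in range(len(xp) // N - 1)]
spectra = [B @ (w * f) for f in frames]           # N coefficients per frame

# Synthesis: IMDCT (scaled 2/N), window again, overlap-add adjacent outputs
out = np.zeros_like(xp)
for i, X in enumerate(spectra):
    out[i * N : i * N + 2 * N] += w * ((2.0 / N) * (B.T @ X))

# Every interior sample is recovered only after the overlap addition:
# each half-frame alone is windowed and time-aliased.
assert np.allclose(out[N:-N], x, atol=1e-9)
```

Dropping the contribution of one of the two overlapping frames leaves a windowed, time-aliased half-frame; this is exactly the kind of "incomplete" reconstruction that occurs when the overlap addition is omitted at a switching boundary.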
- FIG. 3 shows the state when switching, by the conventional method, from a first audio encoded bit stream to a second audio encoded bit stream whose reproduction timings are synchronized.
- for the first audio encoded bit stream, decoding and output are performed up to PCM1-2, corresponding to Frame #2.
- for the second audio encoded bit stream, decoding and output are performed from PCM2-3, corresponding to Frame #3, onward.
- the present disclosure has been made in view of such a situation, and makes it possible to switch between, decode, and output a plurality of audio encoded bitstreams whose reproduction timings are synchronized as quickly as possible, without causing an increase in circuit scale or cost.
- the decoding device according to one aspect of the present disclosure includes: an acquisition unit that acquires a plurality of audio encoded bitstreams in which a plurality of source data whose reproduction timings are synchronized have each been encoded, frame by frame, after MDCT processing; a selection unit that determines a boundary position for switching the output of the plurality of audio encoded bitstreams and selectively supplies one of the acquired audio encoded bitstreams to a decoding processing unit according to the boundary position; and the decoding processing unit, which performs decoding processing including IMDCT processing corresponding to the MDCT processing on the one of the plurality of audio encoded bitstreams input via the selection unit. The decoding processing unit omits the overlap addition in the IMDCT processing corresponding to the frames immediately before and after the boundary position.
- the decoding device can further include a fade processing unit that performs fade processing on the decoding results of the frames before and after the boundary position for which the overlap addition by the decoding processing unit is omitted.
- the fade processing unit can perform fade-out processing on the decoding result of the frame before the boundary position for which the overlap addition by the decoding processing unit is omitted, and fade-in processing on the decoding result of the frame after the boundary position.
- the fade processing unit can perform fade-out processing on the decoding result of the frame before the boundary position for which the overlap addition by the decoding processing unit is omitted, and mute processing on the decoding result of the frame after the boundary position.
- the fade processing unit can perform mute processing on the decoding result of the frame before the boundary position for which the overlap addition by the decoding processing unit is omitted, and fade-in processing on the decoding result of the frame after the boundary position.
- the selection unit can determine the boundary position based on a switching optimum position flag that is set on the supply side of the plurality of audio encoded bit streams and added to each frame.
- the switching optimum position flag may be set on the supply side of the audio encoded bitstream based on the energy or context of the source data.
- the selection unit can determine the boundary position based on information on gains of the plurality of audio encoded bit streams.
- the decoding method according to one aspect of the present disclosure is a decoding method for a decoding device, including: an acquisition step of acquiring a plurality of audio encoded bitstreams in which a plurality of source data whose reproduction timings are synchronized have each been encoded, frame by frame, after MDCT processing; a selection step of determining a boundary position for switching the output of the plurality of audio encoded bitstreams and selectively supplying one of them to a decoding processing step according to the boundary position; and the decoding processing step, which performs decoding processing including IMDCT processing corresponding to the MDCT processing on the selectively supplied one of the plurality of audio encoded bitstreams. The decoding processing step omits the overlap addition in the IMDCT processing corresponding to the frames immediately before and after the boundary position.
- the program according to one aspect of the present disclosure causes a computer to function as: an acquisition unit that acquires a plurality of audio encoded bitstreams in which a plurality of source data whose reproduction timings are synchronized have each been encoded, frame by frame, after MDCT processing; a selection unit that determines a boundary position for switching the output of the plurality of audio encoded bitstreams and selectively supplies one of the acquired audio encoded bitstreams to a decoding processing unit according to the boundary position; and the decoding processing unit, which performs decoding processing including IMDCT processing corresponding to the MDCT processing on the one of the plurality of audio encoded bitstreams input via the selection unit. The decoding processing unit omits the overlap addition in the IMDCT processing corresponding to the frames immediately before and after the boundary position.
- in one aspect of the present disclosure, a plurality of audio encoded bitstreams are acquired, a boundary position for switching their output is determined, and decoding processing including IMDCT processing corresponding to the MDCT processing is performed on the one of the plurality of audio encoded bitstreams selectively supplied according to the boundary position.
- the overlap addition in the IMDCT processing corresponding to the frames immediately before and after the boundary position is omitted.
- FIG. 4 is a block diagram illustrating a configuration example of a decoding device to which the present disclosure is applied.
- FIG. 5 is a diagram illustrating a first switching method of an audio encoded bitstream by the decoding device of FIG. 4. FIG. 6 is a flowchart explaining an audio switching process.
- FIG. 11 is a block diagram illustrating a configuration example of a general-purpose computer.
- FIG. 4 illustrates a configuration example of the decoding apparatus according to the embodiment of the present disclosure.
- the decoding device 30 is mounted on, for example, a receiving device that receives broadcast or distributed content, or a playback device that plays back content recorded on a recording medium.
- the decoding device 30 can quickly switch between first and second audio encoded bit streams whose reproduction timings are synchronized, decoding and outputting the selected stream.
- the first and second audio encoded bit streams are each variable-length encoded by an encoding process, applied to audio source data, that includes at least MDCT processing.
- first and second audio encoded bit streams are also simply referred to as first and second encoded bit streams.
- the decoding device 30 includes a demultiplexing unit 31, decoding units 32-1 and 32-2, a selection unit 33, a decoding processing unit 34, and a fade processing unit 37.
- the demultiplexing unit 31 separates the first encoded bit stream and the second encoded bit stream, whose reproduction timings are synchronized, from the multiplexed stream input from the previous stage. Furthermore, the demultiplexing unit 31 outputs the first encoded bit stream to the decoding unit 32-1 and the second encoded bit stream to the decoding unit 32-2.
- the decoding unit 32-1 performs a decoding process for decoding the variable length code for the first encoded bit stream, and outputs the processing result (hereinafter referred to as quantized data) to the selection unit 33.
- the decoding unit 32-2 performs a decoding process for decoding the variable length code for the second encoded bit stream, and outputs the quantized data as the processing result to the selection unit 33.
- the selection unit 33 determines the switching boundary position based on an audio switching instruction from the user, and outputs the quantized data from either the decoding unit 32-1 or the decoding unit 32-2 to the decoding processing unit 34 according to the determined switching boundary position.
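The routing performed by the selection unit 33 can be sketched as follows. The names and per-frame data shapes are illustrative assumptions, not taken from the patent: frames before the boundary come from the first decoded stream, and frames from the boundary onward come from the second.

```python
from typing import Iterator, Sequence

def select_frames(stream1: Sequence[str], stream2: Sequence[str],
                  boundary: int) -> Iterator[str]:
    """Route per-frame quantized data to the decode processing unit:
    frames with index < boundary come from stream 1, frames with
    index >= boundary come from stream 2."""
    for idx in range(max(len(stream1), len(stream2))):
        if idx < boundary:
            yield stream1[idx]
        else:
            yield stream2[idx]

# Frames 0..1 (up to the frame just before the boundary) from stream 1,
# the rest from stream 2
s1 = ["q1-1", "q1-2", "q1-3", "q1-4"]
s2 = ["q2-1", "q2-2", "q2-3", "q2-4"]
assert list(select_frames(s1, s2, boundary=2)) == ["q1-1", "q1-2", "q2-3", "q2-4"]
```

Because both streams are already decoded to quantized data, only this selection step — not a second full decoding pipeline — is needed at the switch, which is what keeps the circuit scale down.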
- the selection unit 33 can determine the switching boundary position based on the switching optimum position flag added to the first and second encoded bit streams for each frame. This will be described later with reference to FIGS.
- the decoding processing unit 34 includes an inverse quantization unit 35 and an IMDCT unit 36.
- the inverse quantization unit 35 performs inverse quantization on the quantized data input via the selection unit 33, and outputs the inverse quantization result (hereinafter referred to as MDCT data) to the IMDCT unit 36.
- the IMDCT unit 36 reconstructs PCM data corresponding to the source data before encoding by performing IMDCT processing on the MDCT data.
- however, the IMDCT unit 36 does not completely reconstruct the PCM data for all frames; for the frames near the switching boundary position, it outputs PCM data reconstructed in an incomplete state.
- the fade processing unit 37 performs a fade-out process, a fade-in process, or a mute process on the PCM data in the vicinity of the switching boundary position input from the decode processing unit 34, and outputs it to the subsequent stage.
- the configuration example shown in FIG. 4 covers the case where a multiplexed stream in which the first and second encoded bit streams are multiplexed is input to the decoding device 30, but more encoded bit streams may be multiplexed in the multiplexed stream. In that case, the number of decoding units 32 may be increased in accordance with the number of multiplexed encoded bit streams.
- a plurality of encoded bit streams may be individually input instead of the multiplexed stream being input to the decoding device 30.
- the demultiplexing unit 31 can be omitted.
- FIG. 5 shows a first switching method of the encoded bit stream by the decoding device 30.
- the first encoded bit stream is subject to IMDCT processing up to Frame #2, immediately before the switching boundary position.
- as a result, PCM1-1 corresponding to Frame #1 can be completely reconstructed, but PCM1-2 corresponding to Frame #2 is incompletely reconstructed.
- for the second encoded bit stream, frames from Frame #3, immediately after the switching boundary position, are subject to IMDCT processing.
- the reconstruction of PCM2-3 corresponding to Frame #3 is incomplete, while PCM2-4 corresponding to Frame #4 and all subsequent frames are completely reconstructed.
- the second half of IMDCT-OUT #1-1 may be used as-is as PCM1-2, corresponding to Frame #2 of the first encoded bitstream.
- likewise, the first half of IMDCT-OUT #2-3 may be used as-is as PCM2-3, corresponding to Frame #3 of the second encoded bit stream.
- the incompletely reconstructed PCM1-2 and PCM2-3 have deteriorated sound quality as compared with the case where they are completely reconstructed.
- the method of switching the encoded bit stream by the decoding device 30 is not limited to the first switching method described above, and a second or third switching method described later can also be employed.
- FIG. 6 is a flowchart explaining the audio switching process corresponding to the first switching method shown in FIG. 5.
- as a precondition, the demultiplexing unit 31 has separated the first and second encoded bit streams from the multiplexed stream, and each has been decoded by the decoding unit 32-1 or 32-2. In addition, it is assumed that one of the quantized data outputs from the decoding units 32-1 and 32-2 has been selected by the selection unit 33 and input to the decoding processing unit 34.
- the selection unit 33 selects the quantized data from the decoding unit 32-1 and inputs it to the decoding processing unit 34. Accordingly, the PCM data based on the first encoded bit stream is currently being output from the decoding device 30 at a normal volume.
- in step S1, the selection unit 33 determines whether or not there is an audio switching instruction from the user, and waits until there is one. During this standby, the selective output by the selection unit 33 is maintained; that is, PCM data based on the first encoded bit stream continues to be output from the decoding device 30 at normal volume.
- in step S2, the selection unit 33 determines the audio switching boundary position.
- for example, the audio switching boundary position is set a predetermined number of frames after the audio switching instruction is issued. Alternatively, it may be determined based on a switching optimum position flag included in the encoded bitstream (details will be described later).
- in step S3, the selection unit 33 maintains the current selection until the quantized data corresponding to the frame immediately before the determined switching boundary position has been output to the decoding processing unit 34; that is, the quantized data from the decoding unit 32-1 continues to be output to the subsequent stage.
- in step S4, the inverse quantization unit 35 of the decoding processing unit 34 performs inverse quantization on the quantized data based on the first encoded bit stream and outputs the resulting MDCT data to the IMDCT unit 36.
- the IMDCT unit 36 performs IMDCT processing on the MDCT data up to the frame immediately before the switching boundary position, thereby reconstructing PCM data corresponding to the source data before encoding, and outputs it to the fade processing unit 37.
- here, PCM1-1 corresponding to Frame #1 can be completely reconstructed, but PCM1-2 corresponding to Frame #2 is incompletely reconstructed.
- in step S5, the fade processing unit 37 performs fade-out processing on the incomplete PCM data corresponding to the frame immediately before the switching boundary position input from the decoding processing unit 34 (in this case, PCM1-2 corresponding to Frame #2), and outputs the result to the subsequent stage.
- in step S6, the selection unit 33 switches its output to the decoding processing unit 34; that is, the quantized data from the decoding unit 32-2 is now output to the subsequent stage.
- in step S7, the inverse quantization unit 35 of the decoding processing unit 34 performs inverse quantization on the quantized data based on the second encoded bit stream and outputs the resulting MDCT data to the IMDCT unit 36.
- the IMDCT unit 36 performs IMDCT processing on the MDCT data from the frame immediately after the switching boundary position, thereby reconstructing PCM data corresponding to the source data before encoding, and outputs it to the fade processing unit 37.
- in step S8, the fade processing unit 37 performs fade-in processing on the incomplete PCM data corresponding to the frame immediately after the switching boundary position input from the decoding processing unit 34 (in this case, PCM2-3 corresponding to Frame #3), and outputs the result to the subsequent stage. Thereafter, the process returns to step S1, and the subsequent steps are repeated.
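The fade processing of steps S5 and S8 can be sketched with linear ramps over the incomplete boundary frames. The ramp shape, frame length, and the mute variant shown are illustrative assumptions; actual implementations may choose other curves and durations.

```python
import numpy as np

def fade_out(pcm: np.ndarray) -> np.ndarray:
    """Ramp the frame before the boundary from full level down to silence."""
    return pcm * np.linspace(1.0, 0.0, len(pcm))

def fade_in(pcm: np.ndarray) -> np.ndarray:
    """Ramp the frame after the boundary from silence up to full level."""
    return pcm * np.linspace(0.0, 1.0, len(pcm))

def mute(pcm: np.ndarray) -> np.ndarray:
    """Alternative to fading: silence the incomplete frame entirely."""
    return np.zeros_like(pcm)

pcm1_2 = np.ones(8)   # incomplete frame before the boundary
pcm2_3 = np.ones(8)   # incomplete frame after the boundary
out = np.concatenate([fade_out(pcm1_2), fade_in(pcm2_3)])
# level reaches zero exactly at the boundary, then ramps back up
assert out[0] == 1.0 and out[7] == 0.0 and out[8] == 0.0 and out[15] == 1.0
```

Because the aliased half-frames are driven to silence at the boundary, the audible artifact of the omitted overlap addition is masked.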
- in the audio switching process described above, the audio switching boundary position is determined after a predetermined number of frames have passed since the switching instruction.
- however, the switching boundary position is desirably a position where the sound is as close to silence as possible, or, depending on the context, a position where the meaning of a series of words or a conversation still holds even if the volume is temporarily lowered.
- therefore, the content supply side detects states where the sound is as close to silence as possible (that is, states where the gain or energy of the source data is low) and sets a switching optimum position flag there.
- FIG. 7 is a flowchart for explaining the switching optimum position flag setting process executed on the content supply side.
- FIG. 8 shows the state of the switching optimum position flag setting process.
- in step S21, the first and second source data (corresponding to the first and second encoded bit streams whose reproduction timings are synchronized) input from the previous stage are divided into frames, and in step S22 the energy in each frame is measured.
- in step S23, it is determined for each frame whether or not the energy of the first and second source data is equal to or less than a predetermined threshold. If the energy of both the first and second source data is equal to or less than the threshold, the process proceeds to step S24, and the switching optimum position flag for that frame is set to “1”, meaning that it is an optimum switching position.
- otherwise, in step S25 the switching optimum position flag for that frame is set to “0”, meaning that it is not an optimum switching position.
- in step S26, it is determined whether or not the input of the first and second source data has finished. If the input continues, the process returns to step S21 and the subsequent steps are repeated; when the input has finished, the switching optimum position flag setting process ends.
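Steps S21 through S26 can be sketched as follows. The frame length, threshold value, and squared-sum energy measure are illustrative assumptions:

```python
import numpy as np

def set_switch_flags(src1: np.ndarray, src2: np.ndarray,
                     frame_len: int, threshold: float) -> list[int]:
    """Per frame, set the switching optimum position flag to 1 only when
    BOTH source signals have energy at or below the threshold (S23/S24),
    and to 0 otherwise (S25)."""
    n_frames = min(len(src1), len(src2)) // frame_len
    flags = []
    for i in range(n_frames):
        seg = slice(i * frame_len, (i + 1) * frame_len)
        e1 = float(np.sum(src1[seg] ** 2))   # S22: energy of source 1
        e2 = float(np.sum(src2[seg] ** 2))   # S22: energy of source 2
        flags.append(1 if e1 <= threshold and e2 <= threshold else 0)
    return flags

# Loud-quiet-loud test signals: only the middle frame is quiet in both
src1 = np.concatenate([np.ones(4), np.zeros(4), np.ones(4)])
src2 = np.concatenate([np.ones(4) * 0.5, np.full(4, 0.01), np.ones(4)])
assert set_switch_flags(src1, src2, frame_len=4, threshold=0.1) == [0, 1, 0]
```

Requiring both streams to be quiet is what makes the flagged frame a safe boundary regardless of which direction the listener switches.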
- FIG. 9 is a flowchart explaining the audio switching boundary position determination process in the decoding device 30, corresponding to the case where the switching optimum position flag has been set for each frame of the first and second encoded bitstreams by the switching optimum position flag setting process described above.
- FIG. 10 is a diagram illustrating a state of the switching boundary position determination process.
- this switching boundary position determination process can be executed in place of steps S1 and S2 of the audio switching process described with reference to FIG. 6.
- in step S31, the selection unit 33 of the decoding device 30 determines whether or not there is an audio switching instruction from the user, and waits until there is one. During this standby, the selective output by the selection unit 33 is maintained; that is, PCM data based on the first encoded bit stream continues to be output from the decoding device 30 at normal volume.
- in step S32, the selection unit 33 waits until the switching optimum position flag added to each frame of the first and second encoded bit streams (that is, to the quantized data resulting from decoding them), which are sequentially input from the previous stage, becomes 1. The selective output by the selection unit 33 is maintained during this standby as well. When the switching optimum position flag becomes 1, the process proceeds to step S33, and the position between the frame whose switching optimum position flag is 1 and the next frame is determined as the audio switching boundary position. This completes the switching boundary position determination process.
- in this way, a position where the sound is as close to silence as possible can be determined as the switching boundary position, so the audible effect of executing the fade-out and fade-in processes can be suppressed.
- alternatively, the selection unit 33 in the decoding device 30 may determine the switching boundary position by referring to information related to the gain of the encoded bitstreams and detecting a position where the volume is equal to or lower than a specified threshold.
- as the information related to the gain, information such as the scale factor can be used in encoding schemes such as AAC or MP3.
- FIG. 11 shows a second switching method of the encoded bit stream by the decoding device 30.
- the first encoded bit stream is subject to IMDCT processing up to Frame #2, immediately before the switching boundary position.
- as a result, PCM1-1 corresponding to Frame #1 can be completely reconstructed, but PCM1-2 corresponding to Frame #2 is incompletely reconstructed.
- for the second encoded bit stream, frames from Frame #3, immediately after the switching boundary position, are subject to IMDCT processing.
- the reconstruction of PCM2-3 corresponding to Frame #3 is incomplete, while PCM2-4 corresponding to Frame #4 and all subsequent frames are completely reconstructed.
- FIG. 12 shows a third switching method of the encoded bit stream by the decoding device 30.
- the first encoded bit stream is subject to IMDCT processing up to Frame #2, immediately before the switching boundary position.
- as a result, PCM1-1 corresponding to Frame #1 can be completely reconstructed, but PCM1-2 corresponding to Frame #2 is incompletely reconstructed.
- for the second encoded bit stream, frames from Frame #3, immediately after the switching boundary position, are subject to IMDCT processing.
- the reconstruction of PCM2-3 corresponding to Frame #3 is incomplete, while PCM2-4 corresponding to Frame #4 and all subsequent frames are completely reconstructed.
- the present disclosure can be applied not only to switching between first and second encoded bitstreams whose playback timings are synchronized, but also, for example, to switching between objects in 3D Audio encoding. More specifically, it can be applied to cases where grouped object data is switched to another group (Switch Group), so that multiple objects are switched simultaneously, for reasons such as changing the playback position or the viewpoint position in free-viewpoint content.
- the present disclosure can also be applied to operations such as switching the channel environment from 2ch stereo audio to 5.1ch surround sound, or, in free-viewpoint video where each seat has its own surround stream, switching streams in accordance with movement between seats.
- The series of processes performed by the decoding device 30 described above can be executed by hardware or by software.
- When the series of processes is executed by software, a program constituting the software is installed on a computer.
- Here, the computer includes a computer incorporated in dedicated hardware and, for example, a general-purpose personal computer capable of executing various functions when various programs are installed.
- FIG. 13 is a block diagram showing an example of the hardware configuration of a computer that executes the above-described series of processing by a program.
- In the computer 100, a CPU (Central Processing Unit) 101, a ROM (Read Only Memory), and a RAM (Random Access Memory) 103 are interconnected by a bus 104.
- An input/output interface 105 is further connected to the bus 104.
- An input unit 106, an output unit 107, a storage unit 108, a communication unit 109, and a drive 110 are connected to the input/output interface 105.
- the input unit 106 includes a keyboard, a mouse, a microphone, and the like.
- the output unit 107 includes a display, a speaker, and the like.
- the storage unit 108 includes a hard disk, a nonvolatile memory, and the like.
- the communication unit 109 includes a network interface or the like.
- the drive 110 drives a removable medium 111 such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory.
- The CPU 101 loads a program stored in the storage unit 108 into the RAM 103 via the input/output interface 105 and the bus 104 and executes it, whereby the above-described series of processing is performed.
- The program executed by the computer 100 may be a program whose processing is performed in time series in the order described in this specification, or a program whose processing is performed in parallel or at necessary timings, such as when a call is made.
- (1) A decoding device including: an acquisition unit that acquires a plurality of audio encoded bitstreams in which a plurality of source data whose playback timings are synchronized are each encoded frame by frame after MDCT processing; a selection unit that determines a boundary position at which output of the plurality of audio encoded bitstreams is switched, and selectively supplies one of the acquired plurality of audio encoded bitstreams to a decoding processing unit according to the boundary position; and the decoding processing unit, which performs decoding processing including IMDCT processing corresponding to the MDCT processing on the one of the plurality of audio encoded bitstreams input via the selection unit, in which the decoding processing unit omits the overlap addition in the IMDCT processing corresponding to each of the frames before and after the boundary position.
- (2) The decoding device according to (1), further including a fade processing unit that performs fade processing on the decoding results of the frames before and after the boundary position for which the overlap addition by the decoding processing unit has been omitted.
- (3) The decoding device according to (2), in which the fade processing unit performs fade-out processing on the decoding result of the frame before the boundary position for which the overlap addition by the decoding processing unit has been omitted, and performs fade-in processing on the decoding result of the frame after the boundary position.
- (4) The decoding device according to (2), in which the fade processing unit performs fade-out processing on the decoding result of the frame before the boundary position for which the overlap addition by the decoding processing unit has been omitted, and performs mute processing on the decoding result of the frame after the boundary position.
- (5) The decoding device according to (2), in which the fade processing unit performs mute processing on the decoding result of the frame before the boundary position for which the overlap addition by the decoding processing unit has been omitted, and performs fade-in processing on the decoding result of the frame after the boundary position.
- (6) The decoding device according to any one of (1) to (5), in which the selection unit determines the boundary position based on a switching optimum position flag that is set on the supply side of the plurality of audio encoded bitstreams and added to each frame.
- (7) The decoding device according to (6), in which the switching optimum position flag is set on the supply side of the audio encoded bitstreams based on the energy or context of the source data.
- (8) The decoding device according to any one of (1) to (5), in which the selection unit determines the boundary position based on information regarding the gains of the plurality of audio encoded bitstreams.
- (9) A decoding method of a decoding device, the method including, performed by the decoding device: an acquisition step of acquiring a plurality of audio encoded bitstreams in which a plurality of source data whose playback timings are synchronized are each encoded frame by frame after MDCT processing; a determination step of determining a boundary position at which output of the plurality of audio encoded bitstreams is switched; a selection step of selectively supplying one of the acquired plurality of audio encoded bitstreams to a decoding processing step according to the boundary position; and the decoding processing step of performing decoding processing including IMDCT processing corresponding to the MDCT processing on the selectively supplied one of the plurality of audio encoded bitstreams, in which the decoding processing step omits the overlap addition in the IMDCT processing corresponding to each of the frames before and after the boundary position.
- (10) A program for causing a computer to function as: an acquisition unit that acquires a plurality of audio encoded bitstreams in which a plurality of source data whose playback timings are synchronized are each encoded frame by frame after MDCT processing; a selection unit that determines a boundary position at which output of the plurality of audio encoded bitstreams is switched, and selectively supplies one of the acquired plurality of audio encoded bitstreams to a decoding processing unit according to the boundary position; and the decoding processing unit, which performs decoding processing including IMDCT processing corresponding to the MDCT processing on the one of the plurality of audio encoded bitstreams input via the selection unit, in which the decoding processing unit omits the overlap addition in the IMDCT processing corresponding to each of the frames before and after the boundary position.
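As one way to picture the fade variants of items (3) through (5), the sketch below applies each variant to the two boundary-frame decode results whose overlap addition was omitted. The function name, the linear ramp, and the frame length are illustrative assumptions, not the patent's implementation.

```python
import numpy as np

def fade_boundary(pcm_before, pcm_after, mode="crossfade"):
    """Fade processing for the two boundary frames whose IMDCT overlap
    addition was omitted (hypothetical helper; linear ramps assumed)."""
    n = len(pcm_before)
    ramp = np.linspace(1.0, 0.0, n)          # gain 1 -> 0 across the frame
    if mode == "crossfade":                  # item (3): fade out, then fade in
        return pcm_before * ramp, pcm_after * ramp[::-1]
    if mode == "fade_out_mute":              # item (4): fade out, then mute
        return pcm_before * ramp, np.zeros(n)
    if mode == "mute_fade_in":               # item (5): mute, then fade in
        return np.zeros(n), pcm_after * ramp[::-1]
    raise ValueError(f"unknown mode: {mode}")

# Example: item (3) ramps the pre-boundary frame down and the post-boundary frame up.
before, after = fade_boundary(np.ones(4), np.ones(4), "crossfade")
```

All three variants keep the incompletely reconstructed samples out of earshot; the choice among them is a trade-off between continuity of sound and guaranteed silence at the boundary.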
Abstract
Description
FIG. 4 shows a configuration example of a decoding device according to an embodiment of the present disclosure.
Next, FIG. 5 shows a first method of switching encoded bitstreams by the decoding device 30.
Next, FIG. 6 is a flowchart explaining the audio switching process corresponding to the first switching method shown in FIG. 5.
In the audio switching process described above, the switching boundary position was set a predetermined number of frames after the user's audio switching instruction. However, considering that fade-out and fade-in processing is performed near the switching boundary position, the switching boundary position is desirably a position where the audio is as close to silence as possible, or a position where, depending on the context, the meaning of a sequence of words or a conversation is preserved even if the volume is temporarily lowered.
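For instance, the supply side could mark low-energy frames as good switching points. The sketch below is an assumption for illustration only (the document also allows context-based selection, which is harder to automate): it picks the quietest frame within a short horizon after the user's switch request.

```python
import numpy as np

def pick_boundary(frames, requested, horizon=8):
    """Pick a switching boundary at the lowest-energy frame within `horizon`
    frames after the user's switch request (illustrative heuristic only)."""
    cand = frames[requested : requested + horizon]
    rms = [float(np.sqrt(np.mean(f ** 2))) for f in cand]   # per-frame RMS energy
    return requested + int(np.argmin(rms))

# Example: among four candidate frames, the near-silent third one is chosen.
frames = [np.full(4, a) for a in (1.0, 0.3, 0.01, 0.5)]
boundary = pick_boundary(frames, 0, horizon=4)   # index of the quietest frame
```

A supply-side encoder could run a check like this and set the switching optimum position flag on the winning frame, so the decoder need not compute energies itself.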
Next, FIG. 11 shows a second method of switching encoded bitstreams by the decoding device 30.
Next, FIG. 12 shows a third method of switching encoded bitstreams by the decoding device 30.
The present disclosure is applicable not only to switching between first and second encoded bitstreams whose playback timings are synchronized, but also, for example, to switching between objects in 3D Audio encoding. More specifically, when grouped object data is switched collectively to another group (Switch Group), it can be applied to switching multiple objects simultaneously, for example when the playback scene or the viewpoint position in a free-viewpoint environment changes.
(1)
A decoding device including:
an acquisition unit that acquires a plurality of audio encoded bitstreams in which a plurality of source data whose playback timings are synchronized are each encoded frame by frame after MDCT processing;
a selection unit that determines a boundary position at which output of the plurality of audio encoded bitstreams is switched, and selectively supplies one of the acquired plurality of audio encoded bitstreams to a decoding processing unit according to the boundary position; and
the decoding processing unit, which performs decoding processing including IMDCT processing corresponding to the MDCT processing on the one of the plurality of audio encoded bitstreams input via the selection unit,
in which the decoding processing unit omits the overlap addition in the IMDCT processing corresponding to each of the frames before and after the boundary position.
(2)
The decoding device according to (1), further including:
a fade processing unit that performs fade processing on the decoding results of the frames before and after the boundary position for which the overlap addition by the decoding processing unit has been omitted.
(3)
The decoding device according to (2), in which the fade processing unit performs fade-out processing on the decoding result of the frame before the boundary position for which the overlap addition by the decoding processing unit has been omitted, and performs fade-in processing on the decoding result of the frame after the boundary position.
(4)
The decoding device according to (2), in which the fade processing unit performs fade-out processing on the decoding result of the frame before the boundary position for which the overlap addition by the decoding processing unit has been omitted, and performs mute processing on the decoding result of the frame after the boundary position.
(5)
The decoding device according to (2), in which the fade processing unit performs mute processing on the decoding result of the frame before the boundary position for which the overlap addition by the decoding processing unit has been omitted, and performs fade-in processing on the decoding result of the frame after the boundary position.
(6)
The decoding device according to any one of (1) to (5), in which the selection unit determines the boundary position based on a switching optimum position flag that is set on the supply side of the plurality of audio encoded bitstreams and added to each frame.
(7)
The decoding device according to (6), in which the switching optimum position flag is set on the supply side of the audio encoded bitstreams based on the energy or context of the source data.
(8)
The decoding device according to any one of (1) to (5), in which the selection unit determines the boundary position based on information regarding the gains of the plurality of audio encoded bitstreams.
(9)
A decoding method of a decoding device, the method including, performed by the decoding device:
an acquisition step of acquiring a plurality of audio encoded bitstreams in which a plurality of source data whose playback timings are synchronized are each encoded frame by frame after MDCT processing;
a determination step of determining a boundary position at which output of the plurality of audio encoded bitstreams is switched;
a selection step of selectively supplying one of the acquired plurality of audio encoded bitstreams to a decoding processing step according to the boundary position; and
the decoding processing step of performing decoding processing including IMDCT processing corresponding to the MDCT processing on the selectively supplied one of the plurality of audio encoded bitstreams,
in which the decoding processing step omits the overlap addition in the IMDCT processing corresponding to each of the frames before and after the boundary position.
(10)
A program for causing a computer to function as:
an acquisition unit that acquires a plurality of audio encoded bitstreams in which a plurality of source data whose playback timings are synchronized are each encoded frame by frame after MDCT processing;
a selection unit that determines a boundary position at which output of the plurality of audio encoded bitstreams is switched, and selectively supplies one of the acquired plurality of audio encoded bitstreams to a decoding processing unit according to the boundary position; and
the decoding processing unit, which performs decoding processing including IMDCT processing corresponding to the MDCT processing on the one of the plurality of audio encoded bitstreams input via the selection unit,
in which the decoding processing unit omits the overlap addition in the IMDCT processing corresponding to each of the frames before and after the boundary position.
Claims (10)
- A decoding device including: an acquisition unit that acquires a plurality of audio encoded bitstreams in which a plurality of source data whose playback timings are synchronized are each encoded frame by frame after MDCT processing; a selection unit that determines a boundary position at which output of the plurality of audio encoded bitstreams is switched, and selectively supplies one of the acquired plurality of audio encoded bitstreams to a decoding processing unit according to the boundary position; and the decoding processing unit, which performs decoding processing including IMDCT processing corresponding to the MDCT processing on the one of the plurality of audio encoded bitstreams input via the selection unit, in which the decoding processing unit omits the overlap addition in the IMDCT processing corresponding to each of the frames before and after the boundary position.
- The decoding device according to claim 1, further including a fade processing unit that performs fade processing on the decoding results of the frames before and after the boundary position for which the overlap addition by the decoding processing unit has been omitted.
- The decoding device according to claim 2, in which the fade processing unit performs fade-out processing on the decoding result of the frame before the boundary position for which the overlap addition by the decoding processing unit has been omitted, and performs fade-in processing on the decoding result of the frame after the boundary position.
- The decoding device according to claim 2, in which the fade processing unit performs fade-out processing on the decoding result of the frame before the boundary position for which the overlap addition by the decoding processing unit has been omitted, and performs mute processing on the decoding result of the frame after the boundary position.
- The decoding device according to claim 2, in which the fade processing unit performs mute processing on the decoding result of the frame before the boundary position for which the overlap addition by the decoding processing unit has been omitted, and performs fade-in processing on the decoding result of the frame after the boundary position.
- The decoding device according to claim 2, in which the selection unit determines the boundary position based on a switching optimum position flag that is set on the supply side of the plurality of audio encoded bitstreams and added to each frame.
- The decoding device according to claim 6, in which the switching optimum position flag is set on the supply side of the audio encoded bitstreams based on the energy or context of the source data.
- The decoding device according to claim 2, in which the selection unit determines the boundary position based on information regarding the gains of the plurality of audio encoded bitstreams.
- A decoding method of a decoding device, the method including, performed by the decoding device: an acquisition step of acquiring a plurality of audio encoded bitstreams in which a plurality of source data whose playback timings are synchronized are each encoded frame by frame after MDCT processing; a determination step of determining a boundary position at which output of the plurality of audio encoded bitstreams is switched; a selection step of selectively supplying one of the acquired plurality of audio encoded bitstreams to a decoding processing step according to the boundary position; and the decoding processing step of performing decoding processing including IMDCT processing corresponding to the MDCT processing on the selectively supplied one of the plurality of audio encoded bitstreams, in which the decoding processing step omits the overlap addition in the IMDCT processing corresponding to each of the frames before and after the boundary position.
- A program for causing a computer to function as: an acquisition unit that acquires a plurality of audio encoded bitstreams in which a plurality of source data whose playback timings are synchronized are each encoded frame by frame after MDCT processing; a selection unit that determines a boundary position at which output of the plurality of audio encoded bitstreams is switched, and selectively supplies one of the acquired plurality of audio encoded bitstreams to a decoding processing unit according to the boundary position; and the decoding processing unit, which performs decoding processing including IMDCT processing corresponding to the MDCT processing on the one of the plurality of audio encoded bitstreams input via the selection unit, in which the decoding processing unit omits the overlap addition in the IMDCT processing corresponding to each of the frames before and after the boundary position.
Priority Applications (7)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201680064042.9A CN108352165B (zh) | 2015-11-09 | 2016-10-26 | 解码装置、解码方法以及计算机可读存储介质 |
EP16864014.2A EP3376500B1 (en) | 2015-11-09 | 2016-10-26 | Decoding device, decoding method, and program |
US15/772,310 US10553230B2 (en) | 2015-11-09 | 2016-10-26 | Decoding apparatus, decoding method, and program |
BR112018008874A BR112018008874A8 (pt) | 2015-11-09 | 2016-10-26 | aparelho e método de decodificação, e, programa. |
RU2018115550A RU2718418C2 (ru) | 2015-11-09 | 2016-10-26 | Устройство декодирования, способ декодирования и программа |
KR1020187011895A KR20180081504A (ko) | 2015-11-09 | 2016-10-26 | 디코드 장치, 디코드 방법, 및 프로그램 |
JP2017550052A JP6807033B2 (ja) | 2015-11-09 | 2016-10-26 | デコード装置、デコード方法、およびプログラム |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2015219415 | 2015-11-09 | ||
JP2015-219415 | 2015-11-09 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2017082050A1 true WO2017082050A1 (ja) | 2017-05-18 |
Family
ID=58695167
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2016/081699 WO2017082050A1 (ja) | 2015-11-09 | 2016-10-26 | デコード装置、デコード方法、およびプログラム |
Country Status (8)
Country | Link |
---|---|
US (1) | US10553230B2 (ja) |
EP (1) | EP3376500B1 (ja) |
JP (1) | JP6807033B2 (ja) |
KR (1) | KR20180081504A (ja) |
CN (1) | CN108352165B (ja) |
BR (1) | BR112018008874A8 (ja) |
RU (1) | RU2718418C2 (ja) |
WO (1) | WO2017082050A1 (ja) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB2559223A (en) * | 2017-01-30 | 2018-08-01 | Cirrus Logic Int Semiconductor Ltd | Auto-mute audio processing |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110730408A (zh) * | 2019-11-11 | 2020-01-24 | 北京达佳互联信息技术有限公司 | 一种音频参数切换方法、装置、电子设备及存储介质 |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH09252254A (ja) * | 1995-09-29 | 1997-09-22 | Nippon Steel Corp | オーディオ復号装置 |
JP2002026738A (ja) * | 2000-07-11 | 2002-01-25 | Mitsubishi Electric Corp | オーディオデータ復号処理装置および方法、ならびにオーディオデータ復号処理プログラムを記録したコンピュータ読取可能な記録媒体 |
Family Cites Families (32)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6151441A (en) * | 1993-12-18 | 2000-11-21 | Sony Corporation | System for storing and reproducing multiplexed data |
JPH08287610A (ja) * | 1995-04-18 | 1996-11-01 | Sony Corp | オーディオデータの再生装置 |
US5867819A (en) | 1995-09-29 | 1999-02-02 | Nippon Steel Corporation | Audio decoder |
DE19861167A1 (de) * | 1998-08-19 | 2000-06-15 | Christoph Buskies | Verfahren und Vorrichtung zur koartikulationsgerechten Konkatenation von Audiosegmenten sowie Vorrichtungen zur Bereitstellung koartikulationsgerecht konkatenierter Audiodaten |
GB9911737D0 (en) * | 1999-05-21 | 1999-07-21 | Philips Electronics Nv | Audio signal time scale modification |
US7792681B2 (en) * | 1999-12-17 | 2010-09-07 | Interval Licensing Llc | Time-scale modification of data-compressed audio information |
US7113538B1 (en) * | 2000-11-01 | 2006-09-26 | Nortel Networks Limited | Time diversity searcher and scheduling method |
US7069208B2 (en) * | 2001-01-24 | 2006-06-27 | Nokia, Corp. | System and method for concealment of data loss in digital audio transmission |
US7189913B2 (en) * | 2003-04-04 | 2007-03-13 | Apple Computer, Inc. | Method and apparatus for time compression and expansion of audio data with dynamic tempo change during playback |
US7260035B2 (en) * | 2003-06-20 | 2007-08-21 | Matsushita Electric Industrial Co., Ltd. | Recording/playback device |
US20050149973A1 (en) * | 2004-01-06 | 2005-07-07 | Fang Henry Y. | Television with application/stream-specifiable language selection |
CA2457988A1 (en) * | 2004-02-18 | 2005-08-18 | Voiceage Corporation | Methods and devices for audio compression based on acelp/tcx coding and multi-rate lattice vector quantization |
ATE537536T1 (de) * | 2004-10-26 | 2011-12-15 | Panasonic Corp | Sprachkodierungsvorrichtung und sprachkodierungsverfahren |
SG124307A1 (en) * | 2005-01-20 | 2006-08-30 | St Microelectronics Asia | Method and system for lost packet concealment in high quality audio streaming applications |
DE102005014477A1 (de) * | 2005-03-30 | 2006-10-12 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Vorrichtung und Verfahren zum Erzeugen eines Datenstroms und zum Erzeugen einer Multikanal-Darstellung |
WO2006137425A1 (ja) * | 2005-06-23 | 2006-12-28 | Matsushita Electric Industrial Co., Ltd. | オーディオ符号化装置、オーディオ復号化装置およびオーディオ符号化情報伝送装置 |
CN101026725B (zh) * | 2005-07-15 | 2010-09-29 | 索尼株式会社 | 再现设备及再现方法 |
US8015000B2 (en) * | 2006-08-03 | 2011-09-06 | Broadcom Corporation | Classification-based frame loss concealment for audio signals |
US8010350B2 (en) * | 2006-08-03 | 2011-08-30 | Broadcom Corporation | Decimated bisectional pitch refinement |
DE102007028175A1 (de) * | 2007-06-20 | 2009-01-02 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Automatisiertes Verfahren zur zeitlichen Segmentierung eines Videos in Szenen unter Berücksichtigung verschiedener Typen von Übergängen zwischen Bildfolgen |
WO2009025142A1 (ja) * | 2007-08-22 | 2009-02-26 | Nec Corporation | 話者速度変換システムおよびその方法ならびに速度変換装置 |
MY154452A (en) * | 2008-07-11 | 2015-06-15 | Fraunhofer Ges Forschung | An apparatus and a method for decoding an encoded audio signal |
WO2010031049A1 (en) * | 2008-09-15 | 2010-03-18 | GH Innovation, Inc. | Improving celp post-processing for music signals |
US8185384B2 (en) * | 2009-04-21 | 2012-05-22 | Cambridge Silicon Radio Limited | Signal pitch period estimation |
US9992456B2 (en) * | 2010-02-24 | 2018-06-05 | Thomson Licensing Dtv | Method and apparatus for hypothetical reference decoder conformance error detection |
TWI476761B (zh) * | 2011-04-08 | 2015-03-11 | Dolby Lab Licensing Corp | 用以產生可由實施不同解碼協定之解碼器所解碼的統一位元流之音頻編碼方法及系統 |
US20150309844A1 (en) * | 2012-03-06 | 2015-10-29 | Sirius Xm Radio Inc. | Systems and Methods for Audio Attribute Mapping |
JP6126006B2 (ja) * | 2012-05-11 | 2017-05-10 | パナソニック株式会社 | 音信号ハイブリッドエンコーダ、音信号ハイブリッドデコーダ、音信号符号化方法、及び音信号復号方法 |
TWI557727B (zh) * | 2013-04-05 | 2016-11-11 | 杜比國際公司 | 音訊處理系統、多媒體處理系統、處理音訊位元流的方法以及電腦程式產品 |
US9685164B2 (en) * | 2014-03-31 | 2017-06-20 | Qualcomm Incorporated | Systems and methods of switching coding technologies at a device |
US20160071524A1 (en) * | 2014-09-09 | 2016-03-10 | Nokia Corporation | Audio Modification for Multimedia Reversal |
US10614609B2 (en) * | 2017-07-19 | 2020-04-07 | Mediatek Inc. | Method and apparatus for reduction of artifacts at discontinuous boundaries in coded virtual-reality images |
-
2016
- 2016-10-26 CN CN201680064042.9A patent/CN108352165B/zh active Active
- 2016-10-26 BR BR112018008874A patent/BR112018008874A8/pt active Search and Examination
- 2016-10-26 JP JP2017550052A patent/JP6807033B2/ja active Active
- 2016-10-26 EP EP16864014.2A patent/EP3376500B1/en active Active
- 2016-10-26 RU RU2018115550A patent/RU2718418C2/ru active
- 2016-10-26 US US15/772,310 patent/US10553230B2/en active Active
- 2016-10-26 KR KR1020187011895A patent/KR20180081504A/ko not_active Application Discontinuation
- 2016-10-26 WO PCT/JP2016/081699 patent/WO2017082050A1/ja active Application Filing
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH09252254A (ja) * | 1995-09-29 | 1997-09-22 | Nippon Steel Corp | オーディオ復号装置 |
JP2002026738A (ja) * | 2000-07-11 | 2002-01-25 | Mitsubishi Electric Corp | オーディオデータ復号処理装置および方法、ならびにオーディオデータ復号処理プログラムを記録したコンピュータ読取可能な記録媒体 |
Non-Patent Citations (1)
Title |
---|
See also references of EP3376500A4 * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB2559223A (en) * | 2017-01-30 | 2018-08-01 | Cirrus Logic Int Semiconductor Ltd | Auto-mute audio processing |
US10424311B2 (en) | 2017-01-30 | 2019-09-24 | Cirrus Logic, Inc. | Auto-mute audio processing |
Also Published As
Publication number | Publication date |
---|---|
EP3376500A4 (en) | 2018-09-19 |
EP3376500B1 (en) | 2019-08-21 |
CN108352165B (zh) | 2023-02-03 |
US20180286419A1 (en) | 2018-10-04 |
KR20180081504A (ko) | 2018-07-16 |
RU2018115550A3 (ja) | 2020-01-31 |
CN108352165A (zh) | 2018-07-31 |
RU2018115550A (ru) | 2019-10-28 |
RU2718418C2 (ru) | 2020-04-02 |
BR112018008874A8 (pt) | 2019-02-26 |
US10553230B2 (en) | 2020-02-04 |
JPWO2017082050A1 (ja) | 2018-08-30 |
BR112018008874A2 (ja) | 2018-11-06 |
JP6807033B2 (ja) | 2021-01-06 |
EP3376500A1 (en) | 2018-09-19 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20240055007A1 (en) | Encoding device and encoding method, decoding device and decoding method, and program | |
KR101849612B1 (ko) | 새로운 미디어 장치 상에 내장된 라우드니스 메타데이터를 갖거나 또는 갖지 않고 미디어의 정규화된 오디오 재생을 위한 방법 및 장치 | |
US9875745B2 (en) | Normalization of ambient higher order ambisonic audio data | |
US9922656B2 (en) | Transitioning of ambient higher-order ambisonic coefficients | |
KR102122137B1 (ko) | 인코딩된 오디오 확장 메타데이터-기반 동적 범위 제어 | |
US9875746B2 (en) | Encoding device and method, decoding device and method, and program | |
KR101283783B1 (ko) | 고품질 다채널 오디오 부호화 및 복호화 장치 | |
KR101759005B1 (ko) | 3d 오디오 계층적 코딩을 이용한 라우드스피커 포지션 보상 | |
KR20050097989A (ko) | 연속 백업 오디오 | |
JP2021513108A (ja) | ハイブリッドエンコーダ/デコーダ空間解析を使用する音響シーンエンコーダ、音響シーンデコーダおよびその方法 | |
JP2017519417A (ja) | 高次アンビソニック信号の間のクロスフェージング | |
WO2017082050A1 (ja) | デコード装置、デコード方法、およびプログラム | |
KR20230153402A (ko) | 다운믹스 신호들의 적응형 이득 제어를 갖는 오디오 코덱 | |
GB2614482A (en) | Seamless scalable decoding of channels, objects, and hoa audio content | |
JP2009008843A (ja) | 音響信号再生装置及び音響信号再生方法 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 16864014 Country of ref document: EP Kind code of ref document: A1 |
|
ENP | Entry into the national phase |
Ref document number: 2017550052 Country of ref document: JP Kind code of ref document: A |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2018115550 Country of ref document: RU |
|
ENP | Entry into the national phase |
Ref document number: 20187011895 Country of ref document: KR Kind code of ref document: A |
|
WWE | Wipo information: entry into national phase |
Ref document number: 15772310 Country of ref document: US |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
REG | Reference to national code |
Ref country code: BR Ref legal event code: B01A Ref document number: 112018008874 Country of ref document: BR |
|
ENP | Entry into the national phase |
Ref document number: 112018008874 Country of ref document: BR Kind code of ref document: A2 Effective date: 20180502 |