JP6614275B2 - Receiving device, receiving method, transmitting device, and transmitting method - Google Patents
- Publication number
- JP6614275B2 (JP2018091095A)
- Authority
- JP
- Japan
- Prior art keywords
- stream
- picture
- image data
- decoding
- encoded
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Landscapes
- Compression Or Coding Systems Of Tv Signals (AREA)
Description
The present technology relates to a receiving device, a receiving method, a transmitting device, and a transmitting method.
When compressed moving images are provided as a service by broadcasting, over a network, or the like, the upper limit of the frame frequency that can be reproduced is limited by the decoding capability of the receiver. The service side therefore has to take the reproduction capability of receivers in widespread use into account and either restrict the service to a low frame frequency or provide services at a plurality of frame frequencies, high and low, simultaneously.
Supporting a high-frame-frequency service makes a receiver expensive, which hinders early adoption. If only inexpensive receivers dedicated to low-frame-frequency services spread at first and the service side later starts a high-frame-frequency service, that service cannot be viewed at all without a new receiver, which hinders adoption of the new service.
For example, in H.265/HEVC (High Efficiency Video Coding), temporal scalability has been proposed in which the image data of each picture constituting moving image data is hierarchically encoded (see Non-Patent Document 1). On the receiving side, the layer of each picture can be identified based on the temporal ID (temporal_id) inserted in the header of each NAL (Network Abstraction Layer) unit, and selective decoding up to the layer corresponding to the decoding capability becomes possible.
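The layer identification described above can be illustrated with a minimal Python sketch. It assumes each NAL unit is given as raw bytes starting at its two-byte HEVC header (no start code), where the low three bits of the second byte carry nuh_temporal_id_plus1. The function names are hypothetical and are not taken from the patent.

```python
def hevc_temporal_id(nal_unit: bytes) -> int:
    """Return temporal_id read from the 2-byte HEVC NAL unit header."""
    if len(nal_unit) < 2:
        raise ValueError("an HEVC NAL unit header is 2 bytes long")
    # The low 3 bits of the second header byte hold nuh_temporal_id_plus1.
    temporal_id_plus1 = nal_unit[1] & 0x07
    if temporal_id_plus1 == 0:
        raise ValueError("nuh_temporal_id_plus1 must not be 0")
    return temporal_id_plus1 - 1


def select_decodable(nal_units, max_temporal_id):
    """Keep only the NAL units whose layer is within the decoder's capability."""
    return [n for n in nal_units if hevc_temporal_id(n) <= max_temporal_id]
```

A receiver that can only handle the lower layers would call select_decodable with the highest temporal_id it supports and pass the result to its decoder.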
An object of the present technology is to enable good decode processing on the receiving side.
The concept of the present technology resides in a transmitting device including:
an image encoding unit that classifies the image data of each picture constituting moving image data into a plurality of layers, encodes the image data of the pictures of each classified layer, divides the plurality of layers into a predetermined number of layer sets, and generates the predetermined number of video streams each having the encoded image data of the pictures of one of the divided layer sets; and
a transmission unit that transmits a container of a predetermined format including the generated predetermined number of video streams,
wherein the image encoding unit performs encoding so that at least the decoding intervals of the encoded image data of the pictures of the lowest layer set are equal.
In the present technology, the image encoding unit encodes the image data of each picture constituting moving image data to generate a predetermined number of video streams. In doing so, the image data of each picture is classified into a plurality of layers and encoded. The plurality of layers are then divided into a predetermined number of layer sets, and a predetermined number of video streams, each having the encoded image data of the pictures of one of the divided layer sets, are generated.
The image encoding unit performs encoding so that at least the decoding intervals of the encoded image data of the pictures of the lowest layer set are equal. For example, the image encoding unit may perform encoding so that the decoding timing of the encoded image data of the pictures of a layer set positioned higher than the lowest layer set falls at an intermediate timing between the decoding timings of the encoded image data of the pictures of all layer sets positioned lower than that layer set. Thus, for example, when the receiving side is capable of decoding the encoded image data of the pictures not only of the lowest layer set but also of a layer set positioned higher than it, the decoding of each picture can proceed smoothly in sequence.
Further, for example, the image encoding unit may divide the plurality of layers into the predetermined number of layer sets so that the lowest layer set includes a plurality of layers and each layer set positioned higher than the lowest layer set includes one layer. Thus, for example, when the receiving side has a decoding capability sufficient to process the encoded image data of the pictures of the plurality of layers included in the lowest layer set, it only has to select the video stream having the encoded image data of the pictures of the lowest layer set, take it into a buffer, and decode it. A complicated configuration, such as one that combines a plurality of video streams, is unnecessary.
The transmission unit transmits a container of a predetermined format including the predetermined number of video streams. For example, the container may be a transport stream (MPEG-2 TS) adopted in digital broadcasting standards. Alternatively, the container may be MP4, which is used for distribution over the Internet and the like, or a container of another format.
As described above, in the present technology, encoding is performed so that at least the decoding intervals of the encoded image data of the pictures of the lowest layer set are equal. Therefore, for example, when the receiving side has a decoding capability sufficient to process the encoded image data of the pictures of the plurality of layers included in the lowest layer set, the encoded image data of each picture can be decoded continuously without strain.
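The layer-set division and decoding timing described above can be sketched as follows. This is a minimal illustration in Python under stated assumptions: layers are numbered from 0 (lowest), the lowest layer set absorbs all layers not assigned to a higher set, and each higher set places its pictures midway between the decoding times of the combined lower sets. The function names split_layers and decode_times are hypothetical.

```python
from fractions import Fraction


def split_layers(num_layers, num_sets):
    """Lowest set gets several layers; every higher set gets exactly one layer."""
    if not 1 <= num_sets <= num_layers:
        raise ValueError("need 1 <= num_sets <= num_layers")
    first_single = num_layers - num_sets + 1
    lowest = list(range(first_single))
    higher = [[layer] for layer in range(first_single, num_layers)]
    return [lowest] + higher


def decode_times(num_sets, base_interval, num_base_pictures):
    """Return one list of decoding times per layer set.

    The lowest set is decoded at equal intervals; each higher set is decoded
    at the midpoints between consecutive decoding times of all lower sets.
    """
    times = [[Fraction(i) * base_interval for i in range(num_base_pictures)]]
    merged = list(times[0])
    for _ in range(1, num_sets):
        mids = [(a + b) / 2 for a, b in zip(merged, merged[1:])]
        times.append(mids)
        merged = sorted(merged + mids)
    return times
```

With two layer sets and a base interval of 1/60 s, the base stream is decoded every 1/60 s and the enhanced-stream pictures fall halfway between, so a decoder that handles both sees one picture every 1/120 s.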
Another concept of the present technology resides in a transmitting device including:
an image encoding unit that classifies the image data of each picture constituting moving image data into a plurality of layers, encodes the image data of the pictures of each classified layer, divides the plurality of layers into a predetermined number of layer sets, and generates the predetermined number of video streams each having the encoded image data of the pictures of one of the divided layer sets;
a transmission unit that transmits a container of a predetermined format including the generated predetermined number of video streams; and
an identification information insertion unit that inserts, into the layer of the container, identification information for identifying whether each of the predetermined number of video streams is a base stream having the encoded image data of the pictures of the lowest layer set or an enhanced stream having the encoded image data of the pictures of a layer set positioned higher than the lowest layer set.
In the present technology, the image encoding unit encodes the image data of each picture constituting moving image data to generate a predetermined number of video streams. In doing so, the image data of each picture is classified into a plurality of layers and encoded. The plurality of layers are then divided into a predetermined number of layer sets, and a predetermined number of video streams, each having the encoded image data of the pictures of one of the divided layer sets, are generated.
For example, the image encoding unit may perform encoding so that at least the decoding intervals of the encoded image data of the pictures of the lowest layer set are equal. In this case, for example, the image encoding unit may perform encoding so that the decoding timing of the encoded image data of the pictures of a layer set positioned higher than the lowest layer set falls at an intermediate timing between the decoding timings of the encoded image data of the pictures of all layer sets positioned lower than that layer set.
The transmission unit transmits a container of a predetermined format including the predetermined number of video streams. For example, the container may be a transport stream (MPEG-2 TS) adopted in digital broadcasting standards. Alternatively, the container may be MP4, which is used for distribution over the Internet and the like, or a container of another format.
The identification information insertion unit inserts identification information into the layer of the container. This identification information identifies whether each of the predetermined number of video streams is a base stream having the encoded image data of the pictures of the lowest layer set or an enhanced stream having the encoded image data of the pictures of a layer set positioned higher than the lowest layer set.
For example, when there are a plurality of enhanced streams, the identification information may further allow each enhanced stream to be distinguished from the others. Further, for example, the container may be a transport stream, and the identification information insertion unit may insert the identification information as a stream type in the video elementary stream loops arranged under the program map table in correspondence with the respective predetermined number of video streams.
As described above, in the present technology, identification information for identifying whether each of the predetermined number of video streams is a base stream or an enhanced stream is inserted into the layer of the container. Therefore, by using this identification information, the receiving side can easily, for example, select only the base stream and selectively decode the encoded image data of the pictures of the lower layer set.
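How a receiver might use such identification information can be sketched in Python. The stream type constants below are placeholder values assumed for this sketch only; this passage does not specify them. EsLoopEntry and select_pids are likewise hypothetical names standing in for one video elementary stream loop entry of the program map table and for the receiver's selection logic.

```python
from dataclasses import dataclass

# Placeholder stream_type values, assumed for illustration only.
BASE_STREAM_TYPE = 0x24
ENHANCED_STREAM_TYPES = (0x25, 0x26)  # one distinct value per enhanced stream


@dataclass(frozen=True)
class EsLoopEntry:
    """One video elementary stream loop entry (hypothetical, simplified)."""
    pid: int
    stream_type: int


def select_pids(entries, max_enhanced):
    """Return the PID of the base stream plus up to max_enhanced enhanced streams."""
    base = [e.pid for e in entries if e.stream_type == BASE_STREAM_TYPE]
    if len(base) != 1:
        raise ValueError("expected exactly one base stream")
    wanted = ENHANCED_STREAM_TYPES[:max_enhanced]
    # Enhanced streams are taken in ascending order, lowest layer set first.
    enhanced = [e.pid for t in wanted for e in entries if e.stream_type == t]
    return base + enhanced
```

A receiver with low decoding capability would call select_pids(entries, 0) and buffer only the base stream, which is the selective decoding the text describes.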
Another concept of the present technology resides in a transmitting device including:
an image encoding unit that classifies the image data of each picture constituting moving image data into a plurality of layers, encodes the image data of the pictures of each classified layer, divides the plurality of layers into a predetermined number of layer sets, and generates the predetermined number of video streams each having the encoded image data of the pictures of one of the divided layer sets;
a transmission unit that transmits a container of a predetermined format including the generated predetermined number of video streams; and
a configuration information insertion unit that inserts, into the layer of the container, configuration information of each video stream in correspondence with each of the predetermined number of video streams included in the container.
In the present technology, the image encoding unit encodes the image data of each picture constituting moving image data to generate a predetermined number of video streams. In doing so, the image data of each picture is classified into a plurality of layers and encoded. The plurality of layers are then divided into a predetermined number of layer sets, and a predetermined number of video streams, each having the encoded image data of the pictures of one of the divided layer sets, are generated. The transmission unit then transmits a container of a predetermined format including the predetermined number of video streams.
The configuration information insertion unit inserts, into the layer of the container, configuration information of each video stream in correspondence with each of the predetermined number of video streams included in the container. For example, the container may be a transport stream, and the configuration information insertion unit may insert the configuration information as a descriptor in the video elementary stream loops arranged under the program map table in correspondence with the respective predetermined number of video streams.
For example, the configuration information may include information indicating the service group to which the video stream belongs. Further, for example, the configuration information may include information indicating the dependency relationship between streams, starting from the base stream having the encoded image data of the pictures of the lowest layer set. Further, for example, the configuration information may include information indicating the number of layers into which the image encoding unit classified the pictures.
As described above, in the present technology, configuration information of each video stream is inserted into the layer of the container in correspondence with each of the predetermined number of video streams included in the container. Therefore, the receiving side can easily determine, for each video stream included in the container, which group it belongs to, what stream dependency relationship it has, how many layers the hierarchical encoding uses, and so on.
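The three kinds of configuration information listed above (service group, stream dependency, number of layers) can be modeled with a small Python sketch. The field names and the dependency_chain helper are assumptions made for illustration and do not reproduce the descriptor syntax of the patent.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StreamConfig:
    """Per-stream configuration information (hypothetical field names)."""
    pid: int
    service_group_id: int          # service group the stream belongs to
    depends_on_pid: Optional[int]  # None for the base stream
    num_layers: int                # number of layers used in the encoding


def dependency_chain(configs, target_pid):
    """Return the PIDs needed to decode target_pid, base stream first."""
    by_pid = {c.pid: c for c in configs}
    chain, seen = [], set()
    pid = target_pid
    while pid is not None:
        if pid in seen:
            raise ValueError("cyclic stream dependency")
        seen.add(pid)
        chain.append(pid)
        pid = by_pid[pid].depends_on_pid
    return chain[::-1]
```

Walking the dependency information back to the base stream tells the receiver which streams it must buffer together before decoding a given enhanced stream.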
Another concept of the present technology resides in a receiving device including:
a receiving unit that receives a predetermined number of video streams, each having the encoded image data of the pictures of one of the layer sets obtained by classifying the image data of each picture constituting moving image data into a plurality of layers, encoding it, and dividing the plurality of layers into the predetermined number of layer sets; and
a processing unit that processes the received predetermined number of video streams,
wherein, among the predetermined number of video streams, at least the video stream having the encoded image data of the pictures of the lowest layer set is encoded so that the decoding intervals of its pictures are equal.
In the present technology, the receiving unit receives a predetermined number of video streams, each having the encoded image data of the pictures of one of the layer sets obtained by classifying the image data of each picture constituting moving image data into a plurality of layers, encoding it, and dividing the plurality of layers into the predetermined number of layer sets. The processing unit then processes the received predetermined number of video streams.
In this case, among the predetermined number of video streams, at least the video stream having the encoded image data of the pictures of the lowest layer set is encoded so that the decoding intervals of its pictures are equal. Therefore, for example, when the receiver has a decoding capability sufficient to process the encoded image data of the pictures of the plurality of layers included in the lowest layer set, the encoded image data of each picture can be decoded continuously without strain.
In the present technology, for example, the predetermined number of video streams may be encoded so that the decoding timing of the encoded image data of the pictures of a layer set positioned higher than the lowest layer set falls at an intermediate timing between the decoding timings of the encoded image data of the pictures of all layer sets positioned lower than that layer set. Thus, for example, when the receiver is capable of decoding the encoded image data of the pictures not only of the lowest layer set but also of a layer set positioned higher than it, the decoding of each picture can proceed smoothly in sequence.
Another concept of the present technology resides in a receiving device including:
a receiving unit that receives a container of a predetermined format including a predetermined number of video streams, each having the encoded image data of the pictures of one of the layer sets obtained by classifying the image data of each picture constituting moving image data into a plurality of layers, encoding it, and dividing the plurality of layers into the predetermined number of layer sets; and
an image decoding unit that selectively takes into a buffer, from the predetermined number of video streams included in the received container, the encoded image data of the pictures of layers at or below a predetermined layer corresponding to the decoding capability, decodes the encoded image data of each picture taken into the buffer, and obtains the image data of the pictures of the layers at or below the predetermined layer,
wherein, among the predetermined number of video streams, at least the video stream having the encoded image data of the pictures of the lowest layer set is encoded so that the decoding intervals of its pictures are equal.
In the present technology, the receiving unit receives a container of a predetermined format. This container includes a predetermined number of video streams, each having the encoded image data of the pictures of one or more layers, obtained by classifying the image data of each picture constituting moving image data into a plurality of layers, encoding it, and dividing the plurality of layers into the predetermined number of layer sets.
The image decoding unit selectively takes into a buffer, from the predetermined number of video streams included in the received container, the encoded image data of the pictures of layers at or below a predetermined layer corresponding to the decoding capability, decodes the encoded image data of each picture taken into the buffer, and obtains the image data of the pictures of the layers at or below the predetermined layer. For example, when the encoded image data of the pictures of a predetermined layer set is contained in a plurality of video streams, the image decoding unit may combine the encoded image data of the pictures into one stream based on decoding timing information and then decode it.
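The selective buffering and the combination into one stream by decoding timing can be sketched in Python as follows. This sketch assumes each video stream is already available as a list of access units sorted by decoding time; AccessUnit and merge_for_decoding are hypothetical names, and dts stands in for whatever decoding timing information the container carries.

```python
import heapq
from typing import NamedTuple


class AccessUnit(NamedTuple):
    dts: int          # decoding time, in arbitrary clock ticks
    temporal_id: int  # layer the picture belongs to
    payload: bytes    # encoded image data of the picture


def merge_for_decoding(streams, max_temporal_id):
    """Drop pictures above the decodable layer, then merge streams by decoding time.

    Each input stream must already be sorted by dts.
    """
    filtered = (
        [au for au in stream if au.temporal_id <= max_temporal_id]
        for stream in streams
    )
    return list(heapq.merge(*filtered, key=lambda au: au.dts))
```

Because the enhanced-stream pictures are timed to fall between the base-stream pictures, the merged list alternates between streams and can be fed to a single decoder in order.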
ãã®å Žåãæå®æ°ã®ãããªã¹ããªãŒã ã®ãã¡ãå°ãªããšããæäžäœã®éå±€çµã®ãã¯ãã£ã®ç¬Šå·åç»åããŒã¿ãæã€ãããªã¹ããªãŒã ã¯ãåãã¯ãã£ã®ãã³ãŒãééãçééãšãªãããã«ç¬Šå·åãããŠããããã®ãããäŸãã°ãæäžäœã®éå±€çµã«å«ãŸãè€æ°ã®éå±€ã®ãã¯ãã£ã®ç¬Šå·åç»åããŒã¿ãåŠçå¯èœãªãã³ãŒãèœåãããå Žåãåãã¯ãã£ã®ç¬Šå·åç»åããŒã¿ã®ãã³ãŒãåŠçãç¡çãªãé£ç¶ããŠè¡ãããšãå¯èœãšãªãã   In this case, among the predetermined number of video streams, at least the video stream having the encoded image data of the pictures in the lowest hierarchical set is encoded so that the decoding intervals of each picture are equal. Therefore, for example, when there is a decoding capability capable of processing the encoded image data of a plurality of hierarchies included in the lowest hierarchy set, the decoding process of the encoded image data of each picture should be performed continuously without difficulty. Is possible.
ãªããæ¬æè¡ã«ãããŠãäŸãã°ãã³ã³ããã®ã¬ã€ã€ã«ãæå®æ°ã®ãããªã¹ããªãŒã ã®ããããããæäžäœã®éå±€çµã®ãã¯ãã£ã®ç¬Šå·åç»åããŒã¿ãå«ãããŒã¹ã¹ããªãŒã ã§ãããããã®æäžäœã®éå±€çµããäžäœã«äœçœ®ããéå±€çµã®ãã¯ãã£ã®ç¬Šå·åç»åããŒã¿ãå«ããšã³ãã³ã¹ã¹ããªãŒã ã§ããããèå¥ããããã®èå¥æ å ±ãæ¿å ¥ãããŠãããç»å埩å·åéšã¯ããã®èå¥æ å ±ã«åºã¥ããŠãããŒã¹ã¹ããªãŒã ãå«ãæå®æ°ã®ãããªã¹ããªãŒã ãããã³ãŒãèœåã«å¿ããæå®éå±€çµã®ãã¯ãã£ã®ç¬Šå·åç»åããŒã¿ããããã¡ã«åã蟌ãã§ãã³ãŒããããããã«ãããŠãããããã®å Žåãèå¥æ å ±ãå©çšããããšã§ãäŸãã°ãããŒã¹ã¹ããªãŒã ã ããéžæããäœéå±€çµã®ãã¯ãã£ã®ç¬Šå·åç»åããŒã¿ãéžæçã«ãã³ãŒãããããšã容æã«å¯èœãšãªãã   In the present technology, for example, each of a predetermined number of video streams in the container layer is a base stream including encoded image data of pictures in the lowest layer set or higher than this lowest layer set. The identification information for identifying whether or not the enhanced stream includes the encoded image data of the picture of the layer set positioned in is inserted, and based on the identification information, the image decoding unit includes a predetermined stream including the base stream The encoded image data of a predetermined layer set of pictures corresponding to the decoding capability from a number of video streams may be taken into a buffer and decoded. In this case, by using the identification information, for example, it is possible to easily select only the base stream and selectively decode the encoded image data of the low-layer set picture.
In the present technology, for example, a post processing unit that matches the frame rate of the image data of each picture obtained by the image decoding unit to the display capability may further be provided. In this case, even when the decoding capability is low, image data at a frame rate suited to a high display capability can be obtained.
According to the present technology, good decoding processing can be performed on the receiving side. Note that the effects described here are not necessarily limiting, and any of the effects described in the present disclosure may be obtained.
Hereinafter, modes for carrying out the invention (hereinafter referred to as "embodiments") will be described. The description will be given in the following order.
1. Embodiment
2. Modification
<1. Embodiment>
[Transmission / reception system]
FIG. 1 shows a configuration example of a transmission / reception system as an embodiment. The transmission / reception system includes a transmitting device and a receiving device.
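The transmitting device transmits a transport stream TS as a container on a broadcast wave. The transport stream TS contains a predetermined number of video streams, each having the encoded image data of the pictures of one layer set. These streams are obtained by classifying the image data of each picture constituting the moving image data into a plurality of layers, encoding it, and dividing the plurality of layers into a predetermined number of layer sets. In this case, encoding such as H.264/AVC or H.265/HEVC is applied, and the pictures are encoded so that each referenced picture belongs to the referring picture's own layer and/or a layer lower than its own layer.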
In this embodiment, when the plurality of layers is divided into a predetermined number of layer sets, the lowest layer set includes a plurality of layers, and each layer set positioned higher than the lowest layer set includes one layer. With this division, on the receiving side, for example, when there is a decoding capability sufficient to process the encoded image data of the pictures of the plurality of layers included in the lowest layer set, only the video stream having the encoded image data of the pictures of the lowest layer set can be selected, taken into the buffer, and decoded.
Layer identification information for identifying the layer to which each picture belongs is added, picture by picture, to the encoded image data of the pictures of each layer. In this embodiment, the layer identification information ("nuh_temporal_id_plus1", which represents temporal_id) is arranged in the header portion of the NAL unit (nal_unit) of each picture. Because the layer identification information is added in this way, the receiving side can identify the layer of each picture at the NAL unit level, and can selectively extract and decode the encoded image data of the layers at or below a predetermined layer.
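The selective extraction described above can be sketched as follows. This is only an illustrative sketch, not the receiver defined in this disclosure: the tuple representation of a NAL unit and the function name `extract_layers` are assumptions introduced here, and the header field is assumed to have been read already.

```python
from typing import List, Tuple

# Hypothetical representation of a NAL unit for this sketch:
# (nuh_temporal_id_plus1, payload bytes).
NalUnit = Tuple[int, bytes]


def extract_layers(nal_units: List[NalUnit], max_temporal_id: int) -> List[NalUnit]:
    """Keep only the NAL units whose temporal_id is at or below max_temporal_id."""
    selected = []
    for temporal_id_plus1, payload in nal_units:
        # The header field stores temporal_id + 1, so subtract 1 to get the layer.
        temporal_id = temporal_id_plus1 - 1
        if temporal_id <= max_temporal_id:
            selected.append((temporal_id_plus1, payload))
    return selected
```

For instance, passing `max_temporal_id=2` keeps the pictures of layers 0 to 2 and drops those of layers 3 and 4, preserving the original order.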
In this embodiment, among the predetermined number of video streams, at least the video stream having the encoded image data of the pictures in the lowest layer set is encoded so that the decoding intervals of its pictures are equal. With this encoding, when the receiving side has a decoding capability sufficient to process the encoded image data of the pictures of the plurality of layers included in the lowest layer set, it can decode the encoded image data of each picture continuously and without strain.
In this embodiment, encoding is performed so that the decoding timing of the encoded image data of the pictures of a layer set positioned higher than the lowest layer set falls at an intermediate timing between the decoding timings of the encoded image data of the pictures of all layer sets positioned lower than that layer set. With this encoding, when the receiving side is capable of decoding not only the lowest layer set but also the encoded image data of the pictures of the layer sets positioned above it, the decoding of each picture can proceed smoothly, one picture after another.
In this embodiment, identification information is inserted into the layer of the transport stream TS to identify whether each of the predetermined number of video streams is a base stream having the encoded image data of the pictures in the lowest layer set, or an enhancement stream including the encoded image data of the pictures in a layer set positioned higher than the lowest layer set. This identification information is inserted as a stream type in the video elementary stream loops arranged under the program map table in correspondence with the respective video streams. With this identification information, the receiving side can easily select only the base stream and selectively decode the encoded image data of the pictures in the lower layer set.
In this embodiment, configuration information of each video stream is inserted into the layer of the transport stream TS, in correspondence with each of the predetermined number of video streams included in it. This configuration information is inserted as a descriptor in the video elementary stream loops arranged under the program map table in correspondence with the respective video streams. With this configuration information, the receiving side can easily grasp, for each video stream included in the container, which group it belongs to, what stream dependencies it has, and how many layers the hierarchical encoding involves.
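The receiving device receives the above-described transport stream TS sent from the transmitting device on a broadcast wave. From the predetermined number of video streams included in the transport stream TS, the receiving device selectively takes into a buffer the encoded image data of the pictures of the layers at or below a predetermined layer selected according to its decoding capability, decodes that data, acquires the image data of each picture, and reproduces the images.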
As described above, the layer of the transport stream TS includes identification information for identifying whether each of the predetermined number of video streams is a base stream or an enhancement stream. Based on this identification information, the encoded image data of a predetermined layer set corresponding to the decoding capability is taken into the buffer from the predetermined number of video streams including the base stream, and is processed.
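In addition, the receiving device performs post processing that matches the frame rate of the image data of each picture obtained by the decoding described above to the display capability. With this post processing, for example, image data at a frame rate suited to a high display capability can be obtained even when the decoding capability is low.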
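As a rough illustration of frame-rate matching, the sketch below repeats or drops frames so that the output rate equals the display rate. The text here does not specify how the post processing is carried out, so plain frame repetition and subsampling are assumptions made for this example, as are the function name `match_frame_rate` and the restriction to integer rate ratios.

```python
from typing import List, TypeVar

Frame = TypeVar("Frame")


def match_frame_rate(frames: List[Frame], decoded_fps: int, display_fps: int) -> List[Frame]:
    """Return frames resampled from decoded_fps to display_fps (integer ratios only)."""
    if decoded_fps <= 0 or display_fps <= 0:
        raise ValueError("frame rates must be positive")
    if display_fps >= decoded_fps:
        if display_fps % decoded_fps != 0:
            raise ValueError("this sketch only handles integer rate ratios")
        factor = display_fps // decoded_fps
        # Repeat every decoded frame 'factor' times (assumed stand-in for interpolation).
        return [frame for frame in frames for _ in range(factor)]
    if decoded_fps % display_fps != 0:
        raise ValueError("this sketch only handles integer rate ratios")
    step = decoded_fps // display_fps
    # Keep every 'step'-th frame when the display is slower than the decoder output.
    return frames[::step]
```

With a 60 fps decoder and a 120 fps display, each decoded frame is emitted twice; a real post processing unit might instead interpolate between neighboring frames.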
[Configuration of transmitting device]
FIG. 2 shows a configuration example of the transmitting device. The transmitting device includes a CPU (Central Processing Unit), an encoder, a compressed data buffer (cpb: coded picture buffer), a multiplexer, and a transmitting unit. The CPU is a control unit and controls the operation of each unit of the transmitting device.
The encoder receives uncompressed moving image data as input and performs hierarchical encoding. The encoder classifies the image data of each picture constituting the moving image data into a plurality of layers. The encoder then encodes the image data of the pictures of each classified layer and generates a video stream having the encoded image data of the pictures of each layer. The encoder performs encoding such as H.264/AVC or H.265/HEVC. In doing so, the encoder encodes each picture so that the picture it refers to (the referenced picture) belongs to its own layer and/or a layer lower than its own layer.
FIG. 3 shows an example of the hierarchical encoding performed by the encoder. In this example, the pictures are classified into five layers, 0 to 4, and the image data of the pictures of each layer is encoded.
瞊軞ã¯éå±€ã瀺ããŠãããéå±€ïŒããïŒã®ãã¯ãã£ã®ç¬Šå·åç»åããŒã¿ãæ§æãããŠãããïŒnal_unitïŒã®ãããéšåã«é
眮ãããtemporal_idïŒéå±€èå¥æ
å ±ïŒãšããŠããããããïŒããïŒãèšå®ããããäžæ¹ã暪軞ã¯è¡šç€ºé ïŒïŒ°ïŒ¯ïŒ£ïŒpicture order of compositionïŒã瀺ããå·ŠåŽã¯è¡šç€ºæå»ãåã§ãå³åŽã¯è¡šç€ºæå»ãåŸã«ãªãã
  The vertical axis represents the hierarchy. 0 to 4 are set as temporal_id (hierarchy identification information) arranged in the header portion of the NAL unit (nal_unit) constituting the encoded image data of the pictures of
FIG. 4(a) shows a structure example (Syntax) of the NAL unit header, and FIG. 4(b) shows the contents (Semantics) of the main parameters in that structure example. The 1-bit field "Forbidden_zero_bit" must be 0. The 6-bit field "Nal_unit_type" indicates the NAL unit type. The 6-bit field "Nuh_layer_id" is assumed to be 0. The 3-bit field "Nuh_temporal_id_plus1" indicates temporal_id and takes the value of temporal_id plus 1 (1 to 7).
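The field layout above (1 + 6 + 6 + 3 bits, 16 bits in total) can be read with a few shifts and masks. The following is a minimal sketch, assuming the two header bytes are already available in big-endian order; the function name `parse_nal_unit_header` and the dictionary it returns are illustrative choices, not part of the disclosure.

```python
def parse_nal_unit_header(header: bytes) -> dict:
    """Split a 2-byte NAL unit header into the four fields described in the text."""
    if len(header) < 2:
        raise ValueError("a NAL unit header is 2 bytes long")
    value = int.from_bytes(header[:2], "big")

    forbidden_zero_bit = (value >> 15) & 0x01     # 1 bit, must be 0
    nal_unit_type = (value >> 9) & 0x3F           # 6 bits
    nuh_layer_id = (value >> 3) & 0x3F            # 6 bits, assumed 0 in the text
    nuh_temporal_id_plus1 = value & 0x07          # 3 bits, temporal_id + 1

    if forbidden_zero_bit != 0:
        raise ValueError("forbidden_zero_bit must be 0")
    if nuh_temporal_id_plus1 == 0:
        raise ValueError("nuh_temporal_id_plus1 must be in the range 1 to 7")

    return {
        "nal_unit_type": nal_unit_type,
        "nuh_layer_id": nuh_layer_id,
        "nuh_temporal_id_plus1": nuh_temporal_id_plus1,
        "temporal_id": nuh_temporal_id_plus1 - 1,
    }
```

For example, the header bytes `0x02 0x05` decode to NAL unit type 1, layer id 0, and temporal_id 4, that is, a picture of the highest layer in the five-layer example.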
Returning to FIG. 3, each rectangular frame represents a picture, and the number in it indicates the order in which the picture is encoded, that is, the encoding order (the decoding order on the receiving side). For example, the 16 pictures "2" to "17" constitute a sub group of pictures, and "2" is the first picture of that sub group of pictures. "1" is a picture of the preceding sub group of pictures. Several such sub groups of pictures together form a GOP (Group Of Pictures).
As shown in FIG. 5, the encoded image data of the first picture of a GOP is composed of the NAL units AUD, VPS, SPS, PPS, PSEI, SLICE, SSEI, and EOS. Pictures other than the first picture of the GOP are composed of the NAL units AUD, PPS, PSEI, SLICE, SSEI, and EOS. The VPS, together with the SPS, can be transmitted once per sequence (GOP), and the PPS can be transmitted with every picture.
Returning to FIG. 3, the solid arrows indicate the reference relationships between pictures in encoding. For example, picture "2" is a P picture and is encoded with reference to picture "1". Picture "3" is a B picture and is encoded with reference to pictures "1" and "2". Similarly, the other pictures are encoded with reference to pictures that are near them in display order. Note that the pictures of layer 4 are not referenced by any other picture.
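The encoder divides the plurality of layers into a predetermined number, two or more, of layer sets, and generates a predetermined number of video streams, each having the encoded image data of the pictures of one layer set. For example, the encoder performs the division so that the lowest layer set includes a plurality of layers and each layer set positioned higher than the lowest layer set includes one layer.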
For example, in the hierarchical encoding example of FIG. 3, the encoder divides the layers into two layer sets, as delimited by the dash-dot line: layers 0 to 3 form the lowest layer set, and layer 4 forms the layer set positioned above it. In this case, the encoder generates two video streams (encoded streams), each having the encoded image data of the pictures of one layer set.
Also, for example, in the hierarchical encoding example of FIG. 3, the encoder divides the layers into three layer sets, as delimited by the dash-dot line and the dash-double-dot line: layers 0 to 2 form the lowest layer set, layer 3 forms the layer set positioned above it, and layer 4 forms the layer set positioned above that. In this case, the encoder generates three video streams (encoded streams), each having the encoded image data of the pictures of one layer set.
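The division rule used in both examples (several layers in the lowest layer set, exactly one layer in each higher set) can be written down compactly. The sketch below is illustrative only; the function name `split_into_layer_sets` and its parameters are assumptions introduced here.

```python
from typing import List


def split_into_layer_sets(num_layers: int, num_sets: int) -> List[List[int]]:
    """Divide layers 0..num_layers-1 into num_sets layer sets.

    The lowest set receives several layers and every higher set receives one,
    which is the division rule described in the text.
    """
    if num_sets < 2 or num_sets >= num_layers:
        # num_sets >= num_layers would leave the lowest set with a single layer.
        raise ValueError("need 2 <= num_sets < num_layers")
    top_of_lowest_set = num_layers - num_sets
    lowest_set = list(range(top_of_lowest_set + 1))
    upper_sets = [[layer] for layer in range(top_of_lowest_set + 1, num_layers)]
    return [lowest_set] + upper_sets
```

Called with five layers, the function reproduces the two divisions above: two sets give layers 0 to 3 plus layer 4, and three sets give layers 0 to 2, layer 3, and layer 4.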
In this case, the video stream having the encoded image data of the pictures in the lowest layer set is the base stream, and its stream type is "0x24". A video stream including the encoded image data of the pictures in a layer set positioned higher than the lowest layer set is an enhancement stream, and its stream type is the newly defined "0x25".
When there are a plurality of enhancement streams, instead of setting the stream type of every enhancement stream to "0x25", stream types may be newly defined so that the individual enhancement streams can be distinguished. For example, when there are two enhancement streams, the stream type of the first enhancement stream is "0x25" and the stream type of the second enhancement stream is "0x26".
This stream type constitutes the identification information for identifying whether each of the predetermined number of video streams is a base stream or an enhancement stream. The stream type is inserted into the layer of the transport stream TS. Specifically, it is inserted into the video elementary stream loops (Video ES loop) arranged under the program map table (PMT: Program Map Table) in correspondence with the respective video streams.
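To make the role of the stream type concrete, the sketch below builds a simplified list of video ES loop entries and picks out the base stream by its stream type. Only the values 0x24, 0x25, and 0x26 come from the text; the dictionary layout, the PID values used in the usage note, and the function names are hypothetical, and a real PMT carries more fields than shown.

```python
from typing import Dict, List

BASE_STREAM_TYPE = 0x24               # base stream, per the text
FIRST_ENHANCEMENT_STREAM_TYPE = 0x25  # newly defined enhancement stream type


def build_video_es_loops(pids: List[int], distinct_types: bool = False) -> List[Dict[str, int]]:
    """Create one simplified ES loop entry per video stream, lowest layer set first."""
    if distinct_types and len(pids) > 3:
        # The text only names 0x25 and 0x26 as distinct enhancement stream types.
        raise ValueError("only two distinct enhancement stream types are described")
    entries = []
    for index, pid in enumerate(pids):
        if index == 0:
            stream_type = BASE_STREAM_TYPE
        elif distinct_types:
            stream_type = FIRST_ENHANCEMENT_STREAM_TYPE + (index - 1)
        else:
            stream_type = FIRST_ENHANCEMENT_STREAM_TYPE
        entries.append({"elementary_PID": pid, "stream_type": stream_type})
    return entries


def select_base_stream_pids(es_loops: List[Dict[str, int]]) -> List[int]:
    """Return the PIDs of the entries whose stream type marks a base stream."""
    return [e["elementary_PID"] for e in es_loops if e["stream_type"] == BASE_STREAM_TYPE]
```

A receiver limited to the lowest layer set would call `select_base_stream_pids` on the parsed loops and ignore every other video PID; with the hypothetical PIDs 0x100, 0x101, and 0x102, only 0x100 is kept.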
FIG. 6 shows an example of the encoding, decoding, and display orders and the associated delays in hierarchical encoding. This example corresponds to the hierarchical encoding example of FIG. 3 described above, and shows the case where all layers are hierarchically encoded at the full temporal resolution. FIG. 6(a) shows the encoder input. As shown in FIG. 6(b), the pictures are encoded in encoding order with a delay of 16 pictures, yielding the encoded stream. FIG. 6(b) also shows the decoder input, and the pictures are decoded in decoding order. Then, as shown in FIG. 6(c), the image data of the pictures is obtained in display order with a delay of 4 pictures.
FIG. 7(a) shows an encoded stream similar to the one shown in FIG. 6(b) described above, divided into three stages: layers 0 to 2, layer 3, and layer 4. Here, "Tid" denotes temporal_id. FIG. 7(b) shows the expected display (in display order) when the pictures of the partial layers 0 to 2, that is, Tid = 0 to 2, are selectively decoded. FIG. 7(c) shows the expected display (in display order) when the pictures of the partial layers 0 to 3, that is, Tid = 0 to 3, are selectively decoded. FIG. 7(d) shows the expected display (in display order) when the pictures of all layers 0 to 4, that is, Tid = 0 to 4, are decoded.
As encoded in FIG. 7(a), the stream requires a decoder capable of the full-rate temporal resolution, whatever subset of layers is to be decoded. However, when only Tid = 0 to 2 is decoded, a decoder with 1/4 of the decoding capability needed for the full encoded temporal resolution should be able to process the stream. Likewise, when Tid = 0 to 3 is decoded, a decoder with 1/2 of that capability should be able to process it.
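The fractions 1/4 and 1/2 follow from the number of pictures in each layer. The sketch below assumes a dyadic hierarchy, in which each additional layer doubles the picture rate, as the five-layer example of FIG. 3 suggests; that assumption and the function name `required_decoding_capability` are introduced here for illustration.

```python
from fractions import Fraction


def required_decoding_capability(max_decoded_tid: int, highest_tid: int = 4) -> Fraction:
    """Fraction of the full-rate decoding capability needed to decode Tid 0..max_decoded_tid.

    Assumes a dyadic hierarchy: dropping the top layer halves the picture rate.
    """
    if not 0 <= max_decoded_tid <= highest_tid:
        raise ValueError("max_decoded_tid must lie between 0 and highest_tid")
    return Fraction(1, 2 ** (highest_tid - max_decoded_tid))
```

Under this assumption, Tid = 0 to 2 needs 1/4 of the full capability and Tid = 0 to 3 needs 1/2, matching the figures in the text; at a 120 Hz full rate these correspond to 30 fps and 60 fps.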
However, when pictures belonging to the lower layers that are referenced in hierarchical encoding occur consecutively and are encoded at the timing of the full temporal resolution, a decoder that performs partial decoding cannot keep up. Period A in FIG. 7(a) corresponds to this situation. A decoder that decodes the partial layers Tid = 0 to 2 or Tid = 0 to 3 decodes and displays with 1/4 or 1/2 of the full capability on the time axis, as shown in the display examples, and therefore cannot decode the consecutive pictures encoded at the full temporal resolution in period A.
Ta denotes the time required to decode each picture in a decoder that decodes Tid = 0 to 2. Tb denotes the time required to decode each picture in a decoder that decodes Tid = 0 to 3. Tc denotes the time required to decode each picture in a decoder that decodes Tid = 0 to 4 (all layers). These times satisfy Ta > Tb > Tc.
In this embodiment, the encoder performs encoding so that at least the decoding intervals of the encoded image data of the pictures in the lowest layer set are equal. FIG. 8(a) shows the case where, in the hierarchical encoding example of FIG. 3, each picture is encoded at the 120 Hz timing of the full temporal resolution and the layers are divided into two layer sets: layers 0 to 3 form the lowest layer set, which constitutes the base stream (B stream), and layer 4 forms the layer set above it, which constitutes the enhancement stream (E stream).
In this case, although the temporal resolution of the pictures in the lowest layer set is 60 fps, some of those pictures are encoded consecutively at the 120 Hz timing, so a decoder with a 60 fps decoding capability cannot decode them continuously and stably. For this reason, as shown in FIG. 8(b), the encoding timing of the pictures in the lowest layer set constituting the base stream is adjusted to 60 Hz, and encoding is performed so that the decoding intervals of the encoded image data of those pictures are equal. As a result, a decoder with a 60 fps decoding capability can continuously and stably decode the encoded image data of the pictures in the lowest layer set constituting the base stream.
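The difference between the two situations can be checked numerically. The following sketch compares a set of bursty decode times, where two base-stream pictures are only 1/120 s apart, with decode times re-spaced at an equal 1/60 s interval. The sample time values and the function names are assumptions chosen for illustration and are not taken from the figures.

```python
from fractions import Fraction
from typing import List


def equal_interval_decode_times(num_pictures: int, rate_hz: int) -> List[Fraction]:
    """Decode times in seconds for pictures spaced exactly 1/rate_hz apart."""
    return [Fraction(i, rate_hz) for i in range(num_pictures)]


def has_equal_intervals(times: List[Fraction]) -> bool:
    """True when every gap between consecutive decode times is the same."""
    gaps = {later - earlier for earlier, later in zip(times, times[1:])}
    return len(gaps) <= 1


# Assumed bursty timing: the second picture follows the first after only 1/120 s.
bursty_times = [Fraction(0, 120), Fraction(1, 120), Fraction(4, 120)]
adjusted_times = equal_interval_decode_times(3, 60)
```

Here `has_equal_intervals(bursty_times)` is false, while `has_equal_intervals(adjusted_times)` is true, which is the property a 60 fps decoder relies on.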
Further, as shown in FIG. 8(b), encoding is performed so that the encoding timing of the pictures of the layer set constituting the enhancement stream (E stream), and hence the decoding timing of their encoded image data, falls at an intermediate timing between the decoding timings of the encoded image data of the pictures of the lowest layer set constituting the base stream (B stream). As a result, when the receiving side is capable of decoding not only the base stream but also the encoded image data of the pictures of the layer set constituting the enhancement stream, the decoding of each picture can proceed smoothly, one picture after another.
FIG. 9 shows an example of the encoding timing (decoding timing) of each picture when two video streams, a base stream (B stream) and an enhancement stream (E stream), are generated in the hierarchical encoding example of FIG. 3. In this example, the decoding delay of the enhancement stream relative to the base stream is minimized. The decoding delay in this case is 8 pictures, counted at the encoding interval of the full temporal resolution (1/2 of the encoding interval of the base stream).
In this example, the pictures of the base stream (B stream) are encoded at even-numbered timings, and the pictures of the enhancement stream (E stream) are encoded at odd-numbered timings. The enhancement stream (E stream) is encoded immediately after, in encoding order, the highest layer of the base stream (B stream). That is, picture "9" of the enhancement stream (E stream) is encoded immediately after picture "8" of the base stream (B stream).
FIG. 10 shows another example of the encoding timing (decoding timing) of each picture when two video streams, a base stream (B stream) and an enhancement stream (E stream), are generated in the hierarchical encoding example of FIG. 3. In this example, the decoding delay of the enhancement stream relative to the base stream is larger. The decoding delay in this case is 16 pictures, counted at the encoding interval of the full temporal resolution (1/2 of the encoding interval of the base stream). When the decoding delay is large in this way, more reference memory is required inside the uncompressed data buffer (dpb: decoded picture buffer).
In this example, the pictures of the base stream (B stream) are encoded at even-numbered timings, and the pictures of the enhancement stream (E stream) are encoded at odd-numbered timings. The enhancement stream (E stream) is encoded after the encoding of the highest layer of the base stream (B stream) is completed. That is, picture "17" of the enhancement stream (E stream) is encoded immediately after picture "16" of the base stream (B stream).
FIG. 11(a) shows the case where, in the hierarchical encoding example of FIG. 3, each picture is encoded at the 120 Hz timing of the full temporal resolution and the layers are divided into three layer sets: layers 0 to 2 form the lowest layer set, which constitutes the base stream (B stream); layer 3 forms the layer set above it, which constitutes an enhancement stream (E stream1); and layer 4 forms the layer set above that, which constitutes another enhancement stream (E stream2).
In this case, although the temporal resolution of the pictures in the lowest layer set is 30 fps, some of those pictures are encoded consecutively at the 120 Hz timing, so a decoder with a 30 fps decoding capability cannot decode them continuously and stably. For this reason, as shown in FIG. 11(b), the encoding timing of the pictures in the lowest layer set constituting the base stream is adjusted to 30 Hz, and encoding is performed so that the decoding intervals of the encoded image data of those pictures are equal. As a result, a decoder with a 30 fps decoding capability can continuously and stably decode the encoded image data of the pictures in the lowest layer set constituting the base stream.
Further, as shown in FIG. 11(b), encoding is performed so that the encoding timing of the pictures of the layer set constituting the enhancement stream (E stream1), and hence the decoding timing of their encoded image data, falls at an intermediate timing between the decoding timings of the encoded image data of the pictures of the lowest layer set constituting the base stream (B stream). In addition, as shown in FIG. 11(b), encoding is performed so that the encoding timing of the pictures of the layer set constituting the enhancement stream (E stream2), and hence the decoding timing of their encoded image data, falls at an intermediate timing between the decoding timings of the encoded image data of the pictures of the layer sets constituting the base stream (B stream) and the enhancement stream (E stream1). As a result, when the receiving side is capable of decoding not only the base stream but also the encoded image data of the pictures of the layer sets constituting the two enhancement streams, the decoding of each picture can proceed smoothly, one picture after another.
FIG. 12 shows an example of the encoding timing (decoding timing) of each picture when three video streams, a base stream (B stream), an enhancement stream (E stream1), and an enhancement stream (E stream2), are generated in the hierarchical encoding example of FIG. 3. In this example, the decoding delay of the enhancement streams relative to the base stream is minimized. The decoding delay in this case is 12 pictures, counted at the encoding interval of the full temporal resolution (1/4 of the encoding interval of the base stream).
In this example, the pictures of the base stream (B stream) are encoded at timings that are multiples of 4. The pictures of the enhanced stream (E stream 1) are likewise encoded at intervals of 4, at timings midway between the encoding timings of the base stream (B stream) pictures. The pictures of the enhanced stream (E stream 2) are encoded at odd timings.
In this example, the enhanced stream (E stream 1) is encoded immediately after the highest layer of the base stream (B stream) in encoding order. That is, picture "10" of the enhanced stream (E stream 1) is encoded immediately after picture "8" of the base stream (B stream). Similarly, the enhanced stream (E stream 2) is encoded immediately after the enhanced stream (E stream 1) in encoding order. That is, picture "11" of the enhanced stream (E stream 2) is encoded immediately after picture "10" of the enhanced stream (E stream 1).
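The timing rule described for this example can be illustrated with a minimal Python sketch. It only models which stream owns each full-temporal-resolution timing slot (multiples of 4 for the base stream, the midpoints for E stream 1, odd slots for E stream 2); it does not model the encoding order or the decoding delay shown in the figure. The function name `timing_slots` is hypothetical.

```python
def timing_slots(num_slots):
    """Map each full-rate timing slot to the stream whose picture occupies it."""
    slots = {}
    for t in range(num_slots):
        if t % 4 == 0:
            slots[t] = "B stream"      # base stream: multiples of 4
        elif t % 4 == 2:
            slots[t] = "E stream 1"    # midway between base-stream pictures
        else:
            slots[t] = "E stream 2"    # the remaining odd timings
    return slots


schedule = timing_slots(16)
for t in (8, 10, 11, 12):
    print(t, schedule[t])
```

Under these assumptions, slots 8 and 12 belong to the base stream, slot 10 to E stream 1, and slot 11 to E stream 2, which matches the picture labels quoted in the text.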
FIG. 13 shows another example of the encoding timing (decoding timing) of each picture when three video streams, a base stream (B stream), an enhanced stream (E stream 1), and an enhanced stream (E stream 2), are generated in the hierarchical encoding example of FIG. 3. In this example, the decoding delay of the enhanced streams relative to the base stream is larger. The decoding delay in this case is 27 pictures, counted at the encoding interval of full temporal resolution (1/4 of the encoding interval of the base stream). When the decoding delay becomes this large, a large amount of reference memory is required inside the dpb (decoded picture buffer: uncompressed data buffer).
In this example, the pictures of the base stream (B stream) are encoded at timings that are multiples of 4. The pictures of the enhanced stream (E stream 1) are likewise encoded at intervals of 4, at timings midway between the encoding timings of the base stream (B stream) pictures. The pictures of the enhanced stream (E stream 2) are encoded at odd timings.
In this example, the enhanced stream (E stream 1) is encoded after the encoding of the highest layer of the base stream (B stream) has finished. That is, picture "14" of the enhanced stream (E stream 1) is encoded immediately after picture "12" of the base stream (B stream). Similarly, the enhanced stream (E stream 2) is encoded after the encoding of the enhanced stream (E stream 1) has finished. That is, picture "27" of the enhanced stream (E stream 2) is encoded immediately after picture "26" of the enhanced stream (E stream 1).
FIG. 14 shows an example of HRD (Hypothetical Reference Decoder) control of the encoder. In this example, two video streams are generated: a base stream (B stream) and an enhanced stream (E stream). Here, the base stream is described as substream 1 (Substream1) and the enhanced stream as substream 2 (Substream2).
A stair-like solid line a1 indicates the transition of the amount of data of substream 1 generated by encoding. Each step corresponds to one picture, and the step height indicates the amount of data generated by encoding.
Timing P01 indicates the timing at which the first byte of the encoded image data of the first picture enters cpb1 (coded picture buffer 1: compressed data buffer). R1 indicates the input bit rate to cpb1 of the encoded image data of the first picture. When the amount of encoded data input to cpb1 over time T1 is Q1, R1 = Q1 / T1. In the illustrated example, the input bit rate to cpb1 of the encoded image data of the other pictures is also R1.
A stair-like solid line b1 indicates the transition of the amount of data in cpb1 consumed by decoding. Each step corresponds to one picture, and the step height indicates the amount of data consumed by decoding. Qcpb1 indicates the occupancy of cpb1. Encoding is performed so that this occupancy stays within the size (memory capacity) of cpb1 at every timing.
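The buffer constraint described above can be sketched in Python as follows. This is a simplified model, not the HRD of any standard: it assumes delivery into the cpb starts at time 0 at a constant rate R = Q / T, that each picture is removed instantaneously at its decoding time, and that occupancy only needs checking at removal instants (where it peaks). The function names and the sample numbers are illustrative assumptions.

```python
def input_bit_rate(q_bits, t_ms):
    """R = Q / T: bits delivered to the cpb per millisecond."""
    return q_bits / t_ms


def cpb_trace(picture_bits, rate, first_removal_ms, interval_ms, cpb_size):
    """Return (time, occupancy before removal, occupancy after removal) per picture."""
    total = sum(picture_bits)
    removed = 0
    trace = []
    for n, size in enumerate(picture_bits):
        t = first_removal_ms + n * interval_ms      # decoding time of picture n
        arrived = min(rate * t, total)              # bits delivered so far
        before = arrived - removed                  # occupancy just before removal
        if before > cpb_size:
            raise ValueError(f"cpb overflow before picture {n}")
        if before < size:
            raise ValueError(f"cpb underflow at picture {n}")
        removed += size
        trace.append((t, before, before - size))
    return trace


r1 = input_bit_rate(1000, 1000)                     # Q1 = 1000 bits over T1 = 1000 ms
trace = cpb_trace([400, 100, 100, 100], r1, 500, 100, cpb_size=600)
for row in trace:
    print(row)
```

An encoder following the text would choose picture sizes and removal timings so that neither exception is ever raised for the cpb of each substream.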
Similarly, a stair-like solid line a2 indicates the transition of the amount of data of substream 2 generated by encoding. Each step corresponds to one picture, and the step height indicates the amount of data generated by encoding.
Timing P02 indicates the timing at which the first byte of the encoded image data of the first picture enters cpb2 (coded picture buffer 2: compressed data buffer). R2 indicates the input bit rate to cpb2 of the encoded image data of the first picture. When the amount of encoded data input to cpb2 over time T2 is Q2, R2 = Q2 / T2. In the illustrated example, the input bit rate to cpb2 of the encoded image data of the other pictures is also R2.
A stair-like solid line b2 indicates the transition of the amount of data in cpb2 consumed by decoding. Each step corresponds to one picture, and the step height indicates the amount of data consumed by decoding. Qcpb2 indicates the occupancy of cpb2. Encoding is performed so that this occupancy stays within the size (memory capacity) of cpb2 at every timing.
In the illustrated example, the pictures of substream 1 are decoded in picture order, and the pictures of substream 2 are likewise decoded in picture order, so that pictures of substream 1 and pictures of substream 2 are decoded alternately. The image data of each decoded picture is input to the dpb (decoded picture buffer: uncompressed data buffer). In this example, a fixed number of delay pictures is set between decoding and the start of display.
In the above description, both R1 and R2 are examples of a constant bit rate (constant_bit_rate). However, the scheme is not limited to this; the same concept applies to a variable bit rate (variable_bit_rate).
FIG. 15 shows a configuration example of the encoder. The encoder includes a temporal ID generation unit, a buffer delay control unit, an HRD (Hypothetical Reference Decoder) setting unit, a parameter set/SEI encoding unit, a slice encoding unit, and a NAL packetizing unit.
The temporal ID generation unit is supplied with information on the number of layers (Number of layers) from the CPU. Based on this information, the temporal ID generation unit generates temporal_id values corresponding to the number of layers. For example, in the hierarchical encoding example of FIG. 3, one temporal_id value is generated for each layer, starting from 0.
The buffer delay control unit is supplied with information on the minimum decoder capability (minimum_target_decoder_level_idc) from the CPU, together with the temporal_id generated by the temporal ID generation unit. For each video stream, the buffer delay control unit calculates "initial_cpb_removal_delay", which is the initial cpb buffering value, and "cpb_removal_delay" and "dpb_output_delay" for each picture.
The buffer delay control unit controls "cpb_removal_delay" in the cpb buffer of each substream (Sub-stream). It performs control so that the dpb buffer does not fail between the decoding timing and the display timing of the decoder. In this case, "cpb_removal_delay" is controlled so that the decoding timings of the pictures of the lowest layer set are at equal intervals. "cpb_removal_delay" is also controlled so that the encoding timing of the encoded image data of the pictures of a layer set positioned above the lowest layer set falls midway between the encoding timings of the encoded image data of the pictures of all layer sets positioned below that layer set. In addition, "dpb_output_delay" is controlled so that the cpb buffer does not fail. Note that the encoding timing has the same meaning as the decoding timing at which data is read from the compressed data buffer (cpb: coded picture buffer) on the receiving side.
The HRD (Hypothetical Reference Decoder) setting unit is supplied with the "cpb_removal_delay" and "dpb_output_delay" of the pictures of each video stream calculated by the buffer delay control unit, together with information on the number of streams (Number of streams) from the CPU. The HRD setting unit performs HRD setting based on this information.
The parameter set/SEI encoding unit is supplied with the HRD setting information together with the temporal_id. According to the number of streams to be encoded, the parameter set/SEI encoding unit generates parameter sets such as the VPS, SPS, and PPS of the pictures of each layer, as well as SEI.
For example, a picture timing SEI (Picture timing SEI) including "cpb_removal_delay" and "dpb_output_delay" is generated. A buffering period SEI (Buffering Period SEI) including "initial_cpb_removal_time" is also generated. The buffering period SEI is generated for the first picture (access unit) of each GOP.
"initial_cpb_removal_time" indicates the time (initial time) at which the encoded image data of the first picture of a GOP (Group Of Picture) is taken out of the compressed data buffer (cpb) for decoding. "cpb_removal_delay" is the time at which the encoded image data of each picture is taken out of the compressed data buffer (cpb); combined with "initial_cpb_removal_time", it determines the removal time. "dpb_output_delay" indicates the time from when a decoded picture enters the uncompressed data buffer (dpb) until it is taken out.
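How these three values combine can be sketched in a few lines of Python. This is a deliberately simplified additive model, assuming all three quantities are already expressed in the same clock-tick unit and that decoding is instantaneous; the function name `picture_times` is hypothetical and the actual HRD arithmetic of the coding standard is more detailed.

```python
def picture_times(initial_cpb_removal_time, cpb_removal_delay, dpb_output_delay):
    """Return (cpb removal time, dpb output time) for one picture, in clock ticks."""
    # The picture leaves the cpb (is decoded) this long after the initial time.
    removal_time = initial_cpb_removal_time + cpb_removal_delay
    # The decoded picture then waits in the dpb before being output.
    output_time = removal_time + dpb_output_delay
    return removal_time, output_time


print(picture_times(9000, 3000, 6000))
```

With the sample values above, the picture is removed from the cpb at tick 12000 and output from the dpb at tick 18000.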
The slice encoding unit encodes the image data of the pictures of each layer to obtain slice data (slice segment header, slice segment data). Using a frame buffer, the slice encoding unit inserts "ref_idx_l0_active (ref_idx_l1_active)", which indicates the index of the picture that a "Prediction Unit" uses for prediction, into the "slice segment header" as information representing the state of prediction in the temporal direction. As a result, at decoding time the referenced picture is determined together with the layer level indicated by temporal_id. The slice encoding unit also inserts the index of the current slice into the "slice segment header" as "short_term_ref_pic_set_idx" or "it_idx_sps".
The NAL packetizing unit generates the encoded image data of the pictures of each layer based on the parameter sets and SEI generated by the parameter set/SEI encoding unit and the slice data generated by the slice encoding unit, and outputs as many video streams (encoded streams) as the number of streams.
At that time, a temporal_id indicating the layer is attached to the NAL unit header of each picture (see FIG. 4). Pictures belonging to the layer indicated by temporal_id are bundled as a sublayer (sub_layer), and the bit rate level designation value "Level_idc" of each sublayer is inserted into the VPS or SPS as "sublayer_level_idc".
FIG. 16 shows the processing flow of the encoder. The encoder starts processing in step ST1 and then proceeds to step ST2. In step ST2, the encoder sets the number of layers N for hierarchical encoding. Next, in step ST3, the encoder sets the temporal_id of the pictures of each layer to 0 to (N-1).
Next, in step ST4, the encoder sets the layer level K that a decoder of minimum capability among the target decoders can decode, within the range of 0 to N-1. Then, in step ST5, the encoder uses the buffer delay control unit to set the picture encoding interval and encoding timing for each layer set.
Next, in step ST6, the encoder reflects the picture encoding interval and encoding timing of each layer set obtained in step ST5 in "cpb_removal_delay" and "dpb_output_delay", performs HRD setting, parameter set/SEI encoding, and slice encoding, and transfers the result to the multiplexing block as NAL units. Thereafter, the encoder ends processing in step ST7.
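Steps ST2 to ST4 can be sketched as follows in Python. The sketch assigns temporal_id values 0 to N-1, takes K as the top layer of the base layer set, and, purely as an illustrative assumption, places each remaining layer in its own enhanced layer set. The function name `split_layer_sets` and that splitting policy are hypothetical; the text does not prescribe how layers above K are grouped.

```python
def split_layer_sets(num_layers, base_top_layer):
    """Return layer sets as lists of temporal_id values; the first set is the base."""
    temporal_ids = list(range(num_layers))              # ST3: 0 .. N-1
    if not 0 <= base_top_layer <= num_layers - 1:       # ST4: K within 0 .. N-1
        raise ValueError("K must lie in the range 0 .. N-1")
    base_set = temporal_ids[: base_top_layer + 1]       # decodable by the minimum decoder
    # Assumption: every layer above K forms its own enhanced layer set.
    enhanced_sets = [[tid] for tid in temporal_ids[base_top_layer + 1:]]
    return [base_set] + enhanced_sets


print(split_layer_sets(5, 2))
```

For five layers with K = 2 this yields one base layer set and two single-layer enhanced sets, i.e. three video streams.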
Returning to FIG. 2, the compressed data buffer (cpb) 103 temporarily stores the video streams generated by the encoder, which contain the encoded data of the pictures of each layer. The multiplexer reads the video streams stored in the compressed data buffer 103, packetizes them into PES packets, further packetizes them into transport packets and multiplexes them, and thereby obtains a transport stream TS as a multiplexed stream.
As described above, the transport stream TS includes a predetermined number of video streams, each having the encoded image data of the pictures of one of the layer sets obtained by dividing the plurality of layers. The multiplexer inserts, into the transport stream TS, identification information for identifying whether each of the predetermined number of video streams is a base stream or an enhanced stream. In this case, the identification information is inserted as the stream type in the video elementary stream loop (Video ES loop) arranged under the program map table for each of the predetermined number of video streams.
In this case, the stream type of the base stream is "0x24". The stream type of an enhanced stream is newly defined, for example as "0x25". When there are multiple enhanced streams, they need not all share the same stream type; multiple stream types may be newly defined for enhanced streams so that each enhanced stream can be identified. For example, when there are two enhanced streams, the stream type of the first enhanced stream is "0x25" and the stream type of the second enhanced stream is "0x26".
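On the receiving side, this identification reduces to a lookup on the stream type. The following Python sketch uses the example values quoted in the text (0x24 for the base stream, 0x25 and 0x26 for enhanced streams); the constant and function names are hypothetical, and the enhanced values are only the examples the text gives, not fixed assignments.

```python
BASE_STREAM_TYPE = 0x24
# Example values from the text for newly defined enhanced stream types.
ENHANCED_STREAM_TYPES = {0x25, 0x26}


def classify_stream(stream_type):
    """Classify a video elementary stream by its stream type."""
    if stream_type == BASE_STREAM_TYPE:
        return "base"
    if stream_type in ENHANCED_STREAM_TYPES:
        return "enhanced"
    return "other"


for value in (0x24, 0x25, 0x26, 0x1B):
    print(hex(value), classify_stream(value))
```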
The multiplexer inserts, into the layer of the transport stream TS, configuration information of the video stream for each of the predetermined number of video streams. The multiplexer inserts this configuration information as a descriptor in the video elementary stream loop arranged under the program map table for each of the predetermined number of video streams.
Along with the HEVC descriptor (HEVC_descriptor), the multiplexer inserts a newly defined multistream descriptor (multistream_descriptor). FIG. 17 shows a structural example (Syntax) of the HEVC descriptor (HEVC_descriptor). The 8-bit field "descriptor_tag" indicates the descriptor type; here it indicates an HEVC descriptor. The 8-bit field "descriptor_length" indicates the length (size) of the descriptor, given as the number of subsequent bytes.
The 8-bit field "level_idc" indicates the bit rate level designation value. When "temporal_layer_subset_flag = 1", a 5-bit field "temporal_id_min" and a 5-bit field "temporal_id_max" are present. "temporal_id_min" indicates the temporal_id value of the lowest layer of the hierarchically encoded data included in the corresponding video stream. "temporal_id_max" indicates the temporal_id value of the highest layer of the hierarchically encoded data included in the corresponding video stream.
FIG. 18 shows a structural example (Syntax) of the multistream descriptor (multistream_descriptor). FIG. 19 shows the contents (Semantics) of the main information in that structural example.
The 8-bit field "multistream_descriptor_tag" indicates the descriptor type; here it indicates a multistream descriptor. The 8-bit field "multistream_descriptor_length" indicates the length (size) of the descriptor, given as the number of subsequent bytes; here it indicates 2 bytes. The 4-bit field "group_id" indicates the ID of a group associated with a series of services. The base stream (base stream) and all non-base streams (non-base stream = enhanced stream) that build on it have the same ID.
The 4-bit field "stream_dependency_ordering" defines, in ascending order, the dependency relationship between streams starting from the base stream (base stream). "0001" indicates the base stream. "0010" indicates the second stream counted from the base stream (an enhanced stream). "0011" indicates the third stream counted from the base stream. "max_layer_in_group" indicates the maximum value of the layers encoded in the group.
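The field layout above can be exercised with a small pack/parse sketch in Python. Several details are assumptions made only for illustration: the tag value 0xF0 is a placeholder, "group_id" is placed in the high nibble of the first payload byte, and "max_layer_in_group" is stored in a whole second byte because the text does not state its width. The function names are hypothetical.

```python
HYPOTHETICAL_TAG = 0xF0  # placeholder; the real tag value is not given in the text


def pack_multistream_descriptor(group_id, dependency_ordering, max_layer_in_group):
    """Build tag (8 bits), length (8 bits), then a 2-byte payload."""
    if not (0 <= group_id <= 0xF and 0 <= dependency_ordering <= 0xF):
        raise ValueError("group_id and stream_dependency_ordering are 4-bit fields")
    if not 0 <= max_layer_in_group <= 0xFF:
        raise ValueError("max_layer_in_group must fit the assumed 8-bit slot")
    # Assumed bit order: group_id in the high nibble, ordering in the low nibble.
    payload = bytes([(group_id << 4) | dependency_ordering, max_layer_in_group])
    return bytes([HYPOTHETICAL_TAG, len(payload)]) + payload


def parse_multistream_descriptor(data):
    """Inverse of pack_multistream_descriptor under the same assumptions."""
    if len(data) < 4 or data[1] != 2:
        raise ValueError("expected a descriptor with a 2-byte payload")
    return {
        "tag": data[0],
        "group_id": data[2] >> 4,
        "stream_dependency_ordering": data[2] & 0x0F,
        "max_layer_in_group": data[3],
    }


descriptor = pack_multistream_descriptor(1, 0b0010, 4)
print(descriptor.hex(), parse_multistream_descriptor(descriptor))
```

Here the ordering value 0b0010 marks the second stream counted from the base stream, as in the semantics described above.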
FIG. 20 shows an example of "Stream_type", "Group_id", "max/min layer", "max_layer_in_group", and "Stream_dependency_ordering" when the transport stream TS includes, for example, the video stream groups of service 1 (SERVICE 1) and service 2 (SERVICE 2).
In this example, service 1 comprises three video streams: a base stream (Base stream), an enhanced stream (Enhanced stream 1), and an enhanced stream (Enhanced stream 2). The streams of service 1 share a single "Group_id" value. The number of layers of service 1 is, for example, the same as in the hierarchical encoding example shown in FIG. 3, and its "max/min layer" value is set accordingly.
Service 1 is divided into three layer sets. The "Stream_type" of the base stream is set to "0x24", and the "max/min layer" of its HEVC descriptor indicates the range of layers whose pictures the stream contains. The "Stream_type" of the enhanced stream (Enhanced stream 1) is set to a value indicating an enhanced stream, and the "max/min layer" of its HEVC descriptor indicates the single layer whose pictures the stream contains. The same applies to the enhanced stream (Enhanced stream 2).
Also in this example, service 2 comprises three video streams: a base stream (Base stream), an enhanced stream (Enhanced stream 1), and an enhanced stream (Enhanced stream 2). The streams of service 2 share a single "Group_id" value, and the "max/min layer" value of service 2 is set according to its own number of layers.
Service 2 is likewise divided into three layer sets. The "Stream_type" of the base stream is set to "0x24", and the "max/min layer" of its HEVC descriptor indicates the range of layers whose pictures the stream contains. The "Stream_type" of the enhanced stream (Enhanced stream 1) is set to a value indicating an enhanced stream, and the "max/min layer" of its HEVC descriptor indicates the single layer whose pictures the stream contains. The same applies to the enhanced stream (Enhanced stream 2).
FIG. 21 shows a configuration example of the multiplexer. The multiplexer includes a section coding unit, PES packetizing units 143-1 to 143-N, a switch unit, and a transport packetizing unit.
The PES packetizing units 143-1 to 143-N respectively read the video streams 1 to N stored in the compressed data buffer 103 and generate PES packets. Here, the video streams 1 to N include a base stream and one or more enhanced streams.
At this time, the PES packetizing units 143-1 to 143-N add DTS (Decoding Time Stamp) and PTS (Presentation Time Stamp) time stamps to the PES header based on the HRD information of the video streams 1 to N. In this case, the "cpb_removal_delay" and "dpb_output_delay" of each picture are referred to, and the DTS and PTS are generated with a precision synchronized with the STC (System Time Clock) time and placed at predetermined positions in the PES header.
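As a rough illustration of how the time stamps could be derived, the Python sketch below treats DTS as the cpb removal time and PTS as the dpb output time. It assumes both delays have already been converted to 90 kHz ticks (the unit of MPEG-2 PES time stamps) and wraps the results to 33 bits. The function name and the starting value `first_dts` are hypothetical.

```python
TIMESTAMP_WRAP = 1 << 33  # PES time stamps are 33-bit counters in 90 kHz units


def make_timestamps(first_dts, cpb_removal_delay_ticks, dpb_output_delay_ticks):
    """Return (DTS, PTS) for one picture, both in 90 kHz ticks."""
    # DTS: when the picture is removed from the cpb and decoded.
    dts = (first_dts + cpb_removal_delay_ticks) % TIMESTAMP_WRAP
    # PTS: when the decoded picture is output from the dpb.
    pts = (dts + dpb_output_delay_ticks) % TIMESTAMP_WRAP
    return dts, pts


print(make_timestamps(90000, 3000, 6000))
```

A picture with zero "dpb_output_delay" ends up with PTS equal to DTS in this model.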
The switch unit selectively takes out the PES packets generated by the PES packetizing units 143-1 to 143-N based on the packet identifier (PID) and sends them to the transport packetizing unit. The transport packetizing unit generates TS packets containing the PES packets in their payloads and obtains the transport stream TS.
The section coding unit generates the various kinds of section data to be inserted into the transport stream TS. The section coding unit is supplied with information such as the number of layers (Number of layers) and the number of streams (Number of streams) from the CPU. Based on this information, the section coding unit generates the HEVC descriptor (HEVC_descriptor) and the multistream descriptor (multistream_descriptor) described above.
The section coding unit sends the various section data to the transport packetizing unit. The transport packetizing unit generates TS packets containing this section data and inserts them into the transport stream TS. At this time, the stream type is also inserted into the video elementary stream loop (Video ES loop) arranged for each video stream. In this case, the stream type of the base stream is "0x24", and the stream type of an enhanced stream is, for example, a newly defined "0x25".
FIG. 22 shows the processing flow of the multiplexer. The multiplexer starts processing in step ST11 and then proceeds to step ST12. In step ST12, the multiplexer refers to the HRD information of the video streams (cpb_removal_delay, dpb_output_delay), determines the DTS and PTS, and inserts them at predetermined positions in the PES header.
Next, in step ST13, the multiplexer determines whether the output is multistream, that is, whether N is greater than one. If it is multistream, the multiplexer proceeds in step ST14 with multiplexing using multiple PIDs. Then, in step ST15, the multiplexer determines whether the stream being handled is a base stream.
If it is a base stream, the multiplexer sets the stream type to "0x24" in step ST16 and then proceeds to step ST18. If it is an enhanced stream, the multiplexer sets the stream type in step ST17 to a value indicating an enhanced stream, for example a newly defined "0x25", and then proceeds to step ST18.
If it is determined in step ST13 that the output is not multistream, the multiplexer proceeds with multiplexing using a single PID and then proceeds to step ST18.
In step ST18, the multiplexer section-codes the HEVC descriptor, the multistream descriptor, and so on, inserts the encoded streams (video elementary streams) into PES payloads, and PES-packetizes them. The multiplexer then performs transport packetization to obtain the transport stream TS, after which it ends processing.
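The branching in this flow (multistream or not, base or enhanced) can be sketched in Python as below. The sketch assumes the first stream in the list is the base stream and assigns consecutive PIDs starting from a placeholder value; the PID numbering, the function name, and the dictionary layout are illustrative assumptions, while the stream type values are the examples given in the text.

```python
def assign_stream_entries(num_streams, first_pid=0x100):
    """Return one {'pid', 'stream_type'} entry per video stream."""
    if num_streams < 1:
        raise ValueError("at least one video stream is required")
    multistream = num_streams > 1                    # multistream decision
    entries = []
    for index in range(num_streams):
        # Multistream: one PID per stream. Otherwise a single PID is used.
        pid = first_pid + index if multistream else first_pid
        # Base stream gets 0x24; enhanced streams get the example value 0x25.
        stream_type = 0x24 if index == 0 else 0x25
        entries.append({"pid": pid, "stream_type": stream_type})
    return entries


for entry in assign_stream_entries(3):
    print(hex(entry["pid"]), hex(entry["stream_type"]))
```

The resulting entries correspond to what would be written into the video elementary stream loops before section coding and transport packetization.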
FIG. 23 shows a configuration example of the transport stream TS when a certain service is distributed in two streams. This transport stream TS includes two video streams, a base stream and an enhanced stream. That is, in this configuration example, there is a PES packet "video PES1" of the base stream and a PES packet "video PES2" of the enhanced stream.
The transport stream TS also includes a PMT (Program Map Table) as one type of PSI (Program Specific Information). The PSI is information describing which program each elementary stream included in the transport stream belongs to.
The PMT contains a program loop (Program loop) that describes information related to the entire program. The PMT also contains elementary stream loops having information related to each video stream. In this configuration example, there is a video elementary stream loop "video ES1 loop" corresponding to the base stream and a video elementary stream loop "video ES2 loop" corresponding to the enhanced stream.
In "video ES1 loop", information such as the stream type and packet identifier (PID) is arranged for the base stream (video PES1), together with descriptors describing information related to that video stream. The stream type is "0x24", indicating the base stream. The HEVC descriptor and the multistream descriptor described above are inserted among the descriptors.
In "video ES2 loop", information such as the stream type and packet identifier (PID) is arranged for the enhanced stream (video PES2), together with descriptors describing information related to that video stream. The stream type indicates an enhanced stream and is, for example, a newly defined "0x25". The HEVC descriptor and the multistream descriptor described above are inserted among the descriptors.
  FIG. 24 shows a configuration example of the transport stream TS when a service is delivered in three streams. This transport stream TS contains three video streams: a base stream and two enhanced streams. That is, in this configuration example, there is a PES packet "video PES1" for the base stream and PES packets "video PES2" and "video PES3" for the enhanced streams.
  The PMT contains an elementary stream loop holding information related to each video stream. In this configuration example, there is a video elementary stream loop "video ES1 loop" corresponding to the base stream, and video elementary stream loops "video ES2 loop" and "video ES3 loop" corresponding to the two enhanced streams.
  In the "video ES1 loop", information such as the stream type and the packet identifier (PID) is placed for the base stream (video PES1), together with descriptors that describe information related to that video stream. The stream type is set to "0x24", which indicates the base stream. The HEVC descriptor and the multi-stream descriptor described above are inserted among the descriptors.
  In the "video ES2 loop", information such as the stream type and the packet identifier (PID) is placed for the enhanced stream (video PES2), together with descriptors that describe information related to that video stream. The stream type indicates an enhanced stream and is set to, for example, the newly defined "0x25". The HEVC descriptor and the multi-stream descriptor described above are inserted among the descriptors.
  In the "video ES3 loop", information such as the stream type and the packet identifier (PID) is placed for the enhanced stream (video PES3), together with descriptors that describe information related to that video stream. The stream type indicates an enhanced stream and is set to, for example, the newly defined "0x25" or "0x26". The HEVC descriptor and the multi-stream descriptor described above are inserted among the descriptors.
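  As an illustration of how a receiver might use the stream type values above to tell the base stream from the enhanced streams, the following is a minimal sketch in Python. The `EsLoopEntry` class, the `classify` function, and the PID values are hypothetical names and numbers chosen for this example; only the stream type values 0x24, 0x25, and 0x26 come from the text, and no real PMT parsing is performed.

```python
from dataclasses import dataclass

# Stream type values as given in the text; 0x25 and 0x26 are described there
# as newly defined example values for enhanced streams.
BASE_STREAM_TYPE = 0x24
ENHANCED_STREAM_TYPES = {0x25, 0x26}


@dataclass
class EsLoopEntry:
    """Hypothetical, simplified view of one video ES loop in the PMT."""
    name: str
    stream_type: int
    pid: int


def classify(entry: EsLoopEntry) -> str:
    """Return 'base', 'enhanced', or 'other' from the stream type alone."""
    if entry.stream_type == BASE_STREAM_TYPE:
        return "base"
    if entry.stream_type in ENHANCED_STREAM_TYPES:
        return "enhanced"
    return "other"


# Illustrative PID values; the text does not give concrete PIDs.
pmt = [
    EsLoopEntry("video ES1 loop", 0x24, 0x100),
    EsLoopEntry("video ES2 loop", 0x25, 0x101),
    EsLoopEntry("video ES3 loop", 0x26, 0x102),
]
roles = {entry.name: classify(entry) for entry in pmt}
```

  In this sketch `roles` maps "video ES1 loop" to the base role and the other two loops to the enhanced role, so a receiver that can decode only the base stream could keep just the PID of the first entry.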
å³ïŒã«æ»ã£ãŠãéä¿¡éšïŒïŒïŒã¯ããã©ã³ã¹ããŒãã¹ããªãŒã ããäŸãã°ãïŒïŒ¯ïŒŠïŒ€ïŒçã®æŸéã«é©ããå€èª¿æ¹åŒã§å€èª¿ããå€èª¿ä¿¡å·ãéä¿¡ã¢ã³ããããéä¿¡ããã
  Returning to FIG. 2, the
å³ïŒã«ç€ºãéä¿¡è£
眮ïŒïŒïŒã®åäœãç°¡åã«èª¬æããããšã³ã³ãŒãïŒïŒïŒã«ã¯ãéå§çž®ã®åç»åããŒã¿ãå
¥åãããããšã³ã³ãŒãïŒïŒïŒã§ã¯ããã®åç»åããŒã¿ã«å¯ŸããŠãé局笊å·åãè¡ããããããªãã¡ããšã³ã³ãŒãïŒïŒïŒã§ã¯ããã®åç»åããŒã¿ãæ§æããåãã¯ãã£ã®ç»åããŒã¿ãè€æ°ã®éå±€ã«åé¡ãããŠç¬Šå·åãããåéå±€ã®ãã¯ãã£ã®ç¬Šå·åç»åããŒã¿ãæã€ãããªã¹ããªãŒã ãçæãããããã®éãåç
§ãããã¯ãã£ããèªå·±éå±€ããã³ïŒãŸãã¯èªå·±éå±€ãããäžäœã®éå±€ã«æå±ããããã«ã笊å·åãããã
  The operation of the
ãšã³ã³ãŒãïŒïŒïŒã§ã¯ãè€æ°ã®éå±€ãæå®æ°ã®éå±€çµã«åå²ãããåéå±€çµã®ãã¯ãã£ã®ç¬Šå·åç»åããŒã¿ãããããæã€æå®æ°ã®ãããªã¹ããªãŒã ãçæãããããã®å Žåãæäžäœã®éå±€çµã®ãã¯ãã£ã®ç¬Šå·åç»åããŒã¿ãæã€ããŒã¹ã¹ããªãŒã ãçæããããšå
±ã«ããã®æäžäœã®éå±€çµããäžäœã«äœçœ®ããéå±€çµã®ãã¯ãã£ã®ç¬Šå·åç»åããŒã¿ãæã€æå®æ°ã®ãšã³ãã³ã¹ã¹ããªãŒã ãçæãããã
  In the
ãšã³ã³ãŒãïŒïŒïŒã§çæãããæå®æ°ã®ãããªã¹ããªãŒã ã¯ãå§çž®ããŒã¿ãããã¡ïŒïœïœïœïŒïŒïŒïŒã«äŸçµŠãããäžæçã«èç©ãããããã«ããã¬ã¯ãµïŒïŒïŒã§ã¯ãå§çž®ããŒã¿ãããã¡ïŒïŒïŒã«èç©ãããŠããåãããªã¹ããªãŒã ãèªã¿åºããããã±ããåãããããã«ãã©ã³ã¹ããŒããã±ããåãããŠå€éãããå€éåã¹ããªãŒã ãšããŠã®ãã©ã³ã¹ããŒãã¹ããªãŒã ãåŸãããã
  The predetermined number of video streams generated by the
ãŸãããã«ããã¬ã¯ãµïŒïŒïŒã§ã¯ããã©ã³ã¹ããŒãã¹ããªãŒã ã®ã¬ã€ã€ã«ãæå®æ°ã®ãããªã¹ããªãŒã ã®ããããããããŒã¹ã¹ããªãŒã ã§ããããšã³ãã³ã¹ã¹ããªãŒã ã§ããããèå¥ããããã®èå¥æ
å ±ãæ¿å
¥ãããããã®èå¥æ
å ±ã¯ãäŸãã°ãåãããªã¹ããªãŒã ã«ãããã察å¿ããŠé
眮ããããããªãšã¬ã¡ã³ã¿ãªã¹ããªãŒã ã«ãŒãïŒVideo ES loopïŒã®äžã«æ¿å
¥ãããã¹ããªãŒã ã¿ã€ãã§ããããã®å ŽåãããŒã¹ã¹ããªãŒã ã®ã¹ããªãŒã ã¿ã€ãã¯âïŒïœïŒïŒâãšããããšã³ãã³ã¹ã¹ããªãŒã ã®ã¹ããªãŒã ã¿ã€ãã¯ãäŸãã°æ°èŠå®çŸ©ããâïŒïœïŒïŒâãšãããã
  Also, in the
ãŸãããã«ããã¬ã¯ãµïŒïŒïŒã§ã¯ããã©ã³ã¹ããŒãã¹ããªãŒã ã®ã¬ã€ã€ã«ãæå®æ°ã®ãããªã¹ããªãŒã ã®ããããã«å¯Ÿå¿ããŠããããªã¹ããªãŒã ã®æ§ææ
å ±ãæ¿å
¥ããããããªãã¡ããã«ããã¬ã¯ãµïŒïŒïŒã§ã¯ãåãããªã¹ããªãŒã ã«å¯Ÿå¿ãããããªãšã¬ã¡ã³ã¿ãªã¹ããªãŒã ã«ãŒãã«ããã¹ã¯ãªãã¿ããã«ãã¹ããªãŒã ã»ãã¹ã¯ãªãã¿ãæ¿å
¥ãããã
  Further, in the
ãã«ããã¬ã¯ãµïŒïŒïŒã§çæããããã©ã³ã¹ããŒãã¹ããªãŒã ã¯ãéä¿¡éšïŒïŒïŒã«éããããéä¿¡éšïŒïŒïŒã§ã¯ããã®ãã©ã³ã¹ããŒãã¹ããªãŒã ããäŸãã°ãïŒïŒ¯ïŒŠïŒ€ïŒçã®æŸéã«é©ããå€èª¿æ¹åŒã§å€èª¿ãããå€èª¿ä¿¡å·ãéä¿¡ã¢ã³ããããéä¿¡ãããã
  The transport stream TS generated by the
ãåä¿¡è£
眮ã®æ§æã
å³ïŒïŒã¯ãåä¿¡è£
眮ïŒïŒïŒã®æ§æäŸã瀺ããŠããããã®åä¿¡è£
眮ïŒïŒïŒã¯ãïŒCentral Processing UnitïŒïŒïŒïŒãšãåä¿¡éšïŒïŒïŒãšãããã«ããã¬ã¯ãµïŒïŒïŒãšãå§çž®ããŒã¿ãããã¡ïŒïœïœïœïŒcoded picture bufferïŒïŒïŒïŒãæããŠããããŸãããã®åä¿¡è£
眮ïŒïŒïŒã¯ããã³ãŒãïŒïŒïŒãšãéå§çž®ããŒã¿ãããã¡ïŒïœïœïœïŒdecoded picture bufferïŒïŒïŒïŒãšããã¹ãåŠçéšïŒïŒïŒãæããŠãããïŒïŒïŒã¯ãå¶åŸ¡éšãæ§æããåä¿¡è£
眮ïŒïŒïŒã®åéšã®åäœãå¶åŸ¡ããã
"Receiver configuration"
FIG. 25 illustrates a configuration example of the receiving
åä¿¡éšïŒïŒïŒã¯ãåä¿¡ã¢ã³ããã§åä¿¡ãããå€èª¿ä¿¡å·ã埩調ãããã©ã³ã¹ããŒãã¹ããªãŒã ãååŸãããããã«ããã¬ã¯ãµïŒïŒïŒã¯ããã©ã³ã¹ããŒãã¹ããªãŒã ããããã³ãŒãèœåïŒDecoder temporal layer capabilityïŒã«å¿ããéå±€çµã®ãã¯ãã£ã®ç¬Šå·åç»åããŒã¿ãéžæçã«åãåºããå§çž®ããŒã¿ãããã¡ïŒïœïœïœïŒcoded picture bufferïŒïŒïŒïŒã«éãã
  The receiving
å³ïŒïŒã¯ãããã«ããã¬ã¯ãµïŒïŒïŒã®æ§æäŸã瀺ããŠãããããã«ããã¬ã¯ãµïŒïŒïŒã¯ãã¢ããããŒã·ã§ã³ãã£ãŒã«ãæœåºéšïŒïŒïŒãšãã¯ããã¯æ
å ±æœåºéšïŒïŒïŒãšããã€ããŒãæœåºéšïŒïŒïŒãšãã»ã¯ã·ã§ã³æœåºéšïŒïŒïŒãšãããŒãã«/ãã¹ã¯ãªãã¿æœåºéšïŒïŒïŒãšããã±ããæœåºéšïŒïŒïŒãæããŠããããŸããããã«ããã¬ã¯ãµïŒïŒïŒã¯ããããæœåºéšïŒïŒïŒãšãã¿ã€ã ã¹ã¿ã³ãæœåºéšïŒïŒïŒãšããã€ããŒãæœåºéšïŒïŒïŒãšãã¹ããªãŒã æ§æéšïŒã¹ããªãŒã ã³ã³ããŒã¶ïŒïŒïŒïŒãæããŠããã
  FIG. 26 shows a configuration example of the
ã¢ããããŒã·ã§ã³ãã£ãŒã«ãæœåºéšïŒïŒïŒã¯ããã©ã³ã¹ããŒãã¹ããªãŒã ã®ã¢ããããŒã·ã§ã³ãã£ãŒã«ããæã€ïŒŽïŒ³ãã±ããããåœè©²ã¢ããããŒã·ã§ã³ãã£ãŒã«ããæœåºãããã¯ããã¯æ
å ±æœåºéšïŒïŒïŒã¯ãïŒProgram Clock ReferenceïŒãå«ãŸããã¢ããããŒã·ã§ã³ãã£ãŒã«ãããåœè©²ïŒ°ïŒ£ïŒ²ãæœåºããïŒïŒïŒã«éãã
  The TS adaptation
ãã€ããŒãæœåºéšïŒïŒïŒã¯ããã©ã³ã¹ããŒãã¹ããªãŒã ã®ïŒŽïŒ³ãã€ããŒããæã€ïŒŽïŒ³ãã±ããããåœè©²ïŒŽïŒ³ãã€ããŒããæœåºãããã»ã¯ã·ã§ã³æœåºéšïŒïŒïŒã¯ãã»ã¯ã·ã§ã³ããŒã¿ãå«ãŸãããã€ããŒãããåœè©²ã»ã¯ã·ã§ã³ããŒã¿ãæœåºãããããŒãã«/ãã¹ã¯ãªãã¿æœåºéšïŒïŒïŒã¯ãã»ã¯ã·ã§ã³æœåºéšïŒïŒïŒã§æœåºãããã»ã¯ã·ã§ã³ããŒã¿ã解æããããŒãã«ããã¹ã¯ãªãã¿ãæœåºããããããŠãããŒãã«/ãã¹ã¯ãªãã¿æœåºéšïŒïŒïŒã¯ãtemporal_idã®æå°å€ïŒminïŒãšæ倧å€ïŒmaxïŒãæ倧éå±€æ°ãã¹ããªãŒã äŸåé¢ä¿ãã°ã«ãŒããªã©ããïŒïŒïŒã«éããšå
±ã«ãã¹ããªãŒã æ§æéšïŒïŒïŒã«éãã
  The TS payload extraction unit 233 extracts the TS payload from the TS packet having the TS payload of the transport stream TS. The
ãã±ããæœåºéšïŒïŒïŒã¯ããã±ãããå«ãŸãããã€ããŒãããåœè©²ïŒ°ïŒ¥ïŒ³ãã±ãããæœåºããããããæœåºéšïŒïŒïŒã¯ããã±ããæœåºéšïŒïŒïŒã§æœåºããããã±ããããããããæœåºãããã¿ã€ã ã¹ã¿ã³ãæœåºéšïŒïŒïŒã¯ããã¯ãã£æ¯ã«ïŒ°ïŒ¥ïŒ³ãããã«æ¿å
¥ãããŠããã¿ã€ã ã¹ã¿ã³ãïŒïŒ€ïŒŽïŒ³ãïŒãæœåºããïŒïŒïŒã«éããšå
±ã«ãã¹ããªãŒã æ§æéšïŒïŒïŒã«éãã
  The PES
ãã€ããŒãæœåºéšïŒïŒïŒã¯ããã±ããæœåºéšïŒïŒïŒã§æœåºããããã±ãããããã€ããŒããã€ãŸããåéå±€ã®ãã¯ãã£ã®ç¬Šå·åç»åããŒã¿ãæœåºãããã¹ããªãŒã æ§æéšïŒïŒïŒã¯ããã€ããŒãæœåºéšïŒïŒïŒã§åãåºãããåéå±€ã®ãã¯ãã£ã®ç¬Šå·åç»åããŒã¿ããããã³ãŒãèœåïŒDecoder temporal layer capabilityïŒã«å¿ããŠããŒã¹ã¹ããªãŒã ã®ã¿ããããã¯ããŒã¹ã¹ããªãŒã ãšæå®æ°ã®ãšã³ãã³ã¹ã¹ããªãŒã ã®ãã¯ãã£ã®ç¬Šå·åç»åããŒã¿ãéžæçã«åãåºããå§çž®ããŒã¿ãããã¡ïŒïœïœïœïŒcoded picture bufferïŒïŒïŒïŒã«éãããã®å Žåãã¹ããªãŒã æ§æéšïŒïŒïŒã¯ãããŒãã«/ãã¹ã¯ãªãã¿æœåºéšïŒïŒïŒã§åŸãããéå±€æ
å ±ãªã©ãåç
§ããã
  The PES
äŸãã°ããã©ã³ã¹ããŒãã¹ããªãŒã ã«å«ãŸããæå®æ°ã®ãããªã¹ããªãŒã ïŒç¬Šå·åã¹ããªãŒã ïŒã®ãã¬ãŒã ã¬ãŒããïŒïŒïŒïœïœïœã§ããå ŽåãèãããäŸãã°ãè€æ°ã®éå±€ãäœéå±€åŽã®éå±€çµãšé«éå±€åŽã®éå±€çµãšã«ïŒåãããåéå±€çµã®ãã¯ãã£ã®ãã¬ãŒã ã¬ãŒããããããïŒïŒïœïœïœã§ãããšãããäŸãã°ãäžè¿°ã®å³ïŒã«ç€ºãé局笊å·åäŸã§ã¯ãéå±€ïŒããïŒã¯äœéå±€åŽã®éå±€çµãšãããïŒïŒïœïœïœã®level_idc察å¿ã®ãã³ãŒãããã³ãŒãå¯èœãšãªãããŸããéå±€ïŒã¯é«éå±€åŽã®éå±€çµãšãããïŒïŒïŒïœïœïœã®level_idc察å¿ã®ãã³ãŒãããã³ãŒãå¯èœãšãªãã
  For example, consider the case where the predetermined number of video streams (encoded streams) in the transport stream TS together have a frame rate of 120 fps. Suppose the layers are split into two layer sets, one on the lower-layer side and one on the higher-layer side, and that the pictures of each layer set have a frame rate of 60 fps. In the hierarchical coding example shown in FIG. 3 above, layers 0 to 3 form the lower layer set, which can be decoded by a decoder that supports the level_idc for 60 fps. Further, the
ã¹ããªãŒã æ§æéšïŒïŒïŒã¯ããã³ãŒãèœåããïŒïŒïŒïœïœïœã«å¯Ÿå¿ããŠããå Žåããã±ããïŒïŒ°ïŒ©ïŒ€ïŒã«åºã¥ããŠãããŒã¹ã¹ããªãŒã ããã³ãšã³ãã³ã¹ã¹ããªãŒã ã®åæ¹ã®ãã¯ãã£ã®ç¬Šå·åç»åããŒã¿ãåãåºããå§çž®ããŒã¿ãããã¡ïŒïœïœïœïŒïŒïŒïŒã«éããäžæ¹ãã¹ããªãŒã æ§æéšïŒïŒïŒã¯ããã³ãŒãèœåããïŒïŒïŒïœïœïœã«å¯Ÿå¿ããŠããªããïŒïŒïœïœïœã«å¯Ÿå¿ããŠããå Žåããã±ããïŒïŒ°ïŒ©ïŒ€ïŒã«åºã¥ããŠãããŒã¹ã¹ããªãŒã ã®ãã¯ãã£ã®ç¬Šå·åç»åããŒã¿ã®ã¿ãåãåºããå§çž®ããŒã¿ãããã¡ïŒïœïœïœïŒïŒïŒïŒã«éãã
  When the decoding capability corresponds to 120 fps, the
å³ïŒïŒã¯ããã©ã³ã¹ããŒãã¹ããªãŒã ã«ããŒã¹ã¹ããªãŒã ãšãšã³ãã³ã¹ã¹ããªãŒã ã®ïŒã€ã®ãããªã¹ããªãŒã ïŒç¬Šå·åã¹ããªãŒã ïŒãå«ãŸããŠããå Žåã«ãããã¹ããªãŒã æ§æéšïŒïŒïŒã®ãã¯ãã£ïŒã¹ã©ã€ã¹ïŒéžæã®äžäŸã瀺ããŠãããããŒã¹ã¹ããªãŒã ã®ãã±ããèå¥åïŒïŒ°ïŒ©ïŒ€ïŒã¯ïŒ°ïŒ©ïŒ€ ã§ããããšã³ãã³ã¹ã¹ããªãŒã ã®ãã±ããèå¥åïŒïŒ°ïŒ©ïŒ€ïŒã¯ïŒ°ïŒ©ïŒ€ ã§ãããšãããå³ç€ºã®äŸã¯ãäžè¿°ã®å³ïŒã«ç€ºãäŸã«å¯Ÿå¿ããŠããã第ïœã®ãµãã»ãã¯ãã£ã°ã«ãŒãïŒSub group of picturesïŒã®éšåã®ã¿ã瀺ããŠãããç©åœ¢æ ã§ç€ºãããŠããåãã¯ãã£ã«ä»ãããŠããæ°åã¯ãã³ãŒãé ïŒéä¿¡åŽã§ã¯ãšã³ã³ãŒãé ïŒã瀺ããŠããã
  FIG. 27 illustrates an example of picture (slice) selection by the
ãã³ãŒãèœåããïŒïŒïŒïœïœïœã«å¯Ÿå¿ããŠããå Žåãã¹ããªãŒã æ§æéšïŒïŒïŒã¯ããã±ããèå¥åïŒïŒ°ïŒ©ïŒ€ïŒã«åºã¥ããã£ã«ã¿ãªã³ã°ãè¡ã£ãŠãã§ããããŒã¹ã¹ããªãŒã ããã³ïŒ°ïŒ©ïŒ€ïŒ¢ã§ãããšã³ãã³ã¹ã¹ããªãŒã ã®åæ¹ã®ãã¯ãã£ã®ç¬Šå·åç»åããŒã¿ãåãåºããå§çž®ããŒã¿ãããã¡ïŒïœïœïœïŒïŒïŒïŒã«éãããã®å ŽåãããŒã¹ã¹ããªãŒã ã®ãã¯ãã£ã®ç¬Šå·åç»åããŒã¿ã¯é åïŒïŒcpb_1ïŒã«èç©ãããšã³ãã³ã¹ã¹ããªãŒã ã®ãã¯ãã£ã®ç¬Šå·åç»åããŒã¿ã¯é åïŒïŒcpb_2ïŒã«èç©ããã
  When the decoding capability corresponds to 120 fps, the
äžæ¹ããã³ãŒãèœåããïŒïŒïŒïœïœïœã«å¯Ÿå¿ããŠããªããïŒïŒïœïœïœã«å¯Ÿå¿ããŠããå Žåãã¹ããªãŒã æ§æéšïŒïŒïŒã¯ããã±ããèå¥åïŒïŒ°ïŒ©ïŒ€ïŒã«åºã¥ããã£ã«ã¿ãªã³ã°ãè¡ã£ãŠãã§ããããŒã¹ã¹ããªãŒã ã®ãã¯ãã£ã®ç¬Šå·åç»åããŒã¿ã ããåãåºããå§çž®ããŒã¿ãããã¡ïŒïœïœïœïŒïŒïŒïŒã«éããé åïŒïŒcpb_1ïŒã«èç©ããã
  On the other hand, when the decoding capability does not correspond to 120 fps but corresponds to 60 fps, the
å³ïŒïŒã¯ãããã«ããã¬ã¯ãµïŒïŒïŒã®åŠçãããŒã®äžäŸã瀺ããŠããããã®åŠçãããŒã¯ããã©ã³ã¹ããŒãã¹ããªãŒã ã«ãæäžäœã®éå±€çµã®ãã¯ãã£ã®ç¬Šå·åç»åããŒã¿ãæã€ããŒã¹ã¹ããªãŒã ãšããã®æäžäœã®éå±€çµã®äžäœã«äœçœ®ããæå®æ°ã®éå±€çµã®ãã¯ãã£ã®ç¬Šå·åç»åããŒã¿ãããããæã€æå®æ°ã®ãšã³ãã³ã¹ã¹ããªãŒã ãå«ãŸããŠããå Žåã瀺ããŠããã
  FIG. 28 shows an example of the processing flow of the
ããã«ããã¬ã¯ãµïŒïŒïŒã¯ãã¹ãããïŒïŒã«ãããŠãåŠçãéå§ãããã®åŸã«ãã¹ãããïŒïŒã®åŠçã«ç§»ãããã®ã¹ãããïŒïŒãããŠãïŒïŒïŒããããã³ãŒãèœåïŒDecoder temporal layer capabilityïŒãèšå®ãããã次ã«ãããã«ããã¬ã¯ãµïŒïŒïŒã¯ãã¹ãããïŒïŒãããŠãå
šéå±€ïŒã¬ã€ã€ïŒããã³ãŒãããèœåããããåŠããå€æããã
  In step ST41, the
å
šéå±€ããã³ãŒãããèœåããããšããããã«ããã¬ã¯ãµïŒïŒïŒã¯ãã¹ãããïŒïŒã«ãããŠããã£ã«ã¿ã«ããå
šéå±€ãæ§æããå
šãŠã®ã¹ããªãŒã ãéžæããã»ã¯ã·ã§ã³ããŒã·ã³ã°ïŒSection parsingïŒãè¡ãããã®åŸãããã«ããã¬ã¯ãµïŒïŒïŒã¯ãã¹ãããïŒïŒã®åŠçã«ç§»ãã
  When there is an ability to decode the entire hierarchy, the
ã¹ãããïŒïŒã§å
šéå±€ããã³ãŒãããèœåããªããšããããã«ããã¬ã¯ãµïŒïŒïŒã¯ãã¹ãããïŒïŒã«ãããŠããã³ãŒãå¯èœãªäœéå±€ãæ§æããããŒã¹ã¹ããªãŒã ãå«ãæå®æ°ã®ã¹ããªãŒã ãéžæãããŸããé¢é£ããã»ã¯ã·ã§ã³ããŒã·ã³ã°ïŒSection parsingïŒãè¡ãããã®åŸãããã«ããã¬ã¯ãµïŒïŒïŒã¯ãã¹ãããïŒïŒã®åŠçã«ç§»ãã
  When there is no capability to decode the entire hierarchy in step ST43, the
ã¹ãããïŒïŒã«ãããŠãããã«ããã¬ã¯ãµïŒïŒïŒã¯ã察象ãšãªãã®ã»ã¯ã·ã§ã³ã®äžã§ããã¹ã¯ãªãã¿ããã«ãã¹ããªãŒã ã»ãã¹ã¯ãªãã¿ãèªã¿ãã°ã«ãŒãå
ã®ã¹ããªãŒã ã®äŸåé¢ä¿ãæ倧éå±€æ°ãtemporal_idã®æ倧ãæå°å€ãåŸãã
  In step ST45, the
次ã«ãããã«ããã¬ã¯ãµïŒïŒïŒã¯ãã¹ãããïŒïŒã§ãã¹ãããïŒïŒãããã¯ã¹ãããïŒïŒã§éžæãããã¹ããªãŒã ã®ãã¯ãã£ã®ç¬Šå·åç»åããŒã¿ãå§çž®ããŒã¿ãããã¡ïŒïœïœïœïŒïŒïŒïŒãžè»¢éãããããã«ããã¬ã¯ãµïŒïŒïŒã¯ãã¹ãããïŒïŒã®åŠçã®åŸãã¹ãããïŒïŒã«ãããŠãåŠçãçµäºããã
  Next, in step ST47, the
å³ïŒïŒã«æ»ã£ãŠãå§çž®ããŒã¿ãããã¡(ïœïœïœ)ïŒïŒïŒã¯ãããã«ããã¬ã¯ãµïŒïŒïŒã§åãåºããããããªã¹ããªãŒã ïŒç¬Šå·åã¹ããªãŒã ïŒããäžæçã«èç©ããããã³ãŒãïŒïŒïŒã¯ãå§çž®ããŒã¿ãããã¡ïŒïŒïŒã«èç©ãããŠãããããªã¹ããªãŒã ããããã³ãŒããã¹ãéå±€ãšããŠæå®ãããéå±€ã®ãã¯ãã£ã®ç¬Šå·åç»åããŒã¿ãåãåºãããããŠããã³ãŒãïŒïŒïŒã¯ãåãåºãããåãã¯ãã£ã®ç¬Šå·åç»åããŒã¿ããããããããã®ãã¯ãã£ã®ãã³ãŒãã¿ã€ãã³ã°ã§ãã³ãŒãããéå§çž®ããŒã¿ãããã¡ïŒïœïœïœïŒïŒïŒïŒã«éãã
  Returning to FIG. 25, the compressed data buffer (cpb) 204 temporarily accumulates the video stream (encoded stream) extracted by the
ããã§ããã³ãŒãïŒïŒïŒã«ã¯ãïŒïŒïŒãããã³ãŒããã¹ãéå±€ãtemporal_idã§æå®ãããããã®æå®éå±€ã¯ãããã«ããã¬ã¯ãµïŒïŒïŒã§åãåºããããããªã¹ããªãŒã ïŒç¬Šå·åã¹ããªãŒã ïŒã«å«ãŸããå
šéå±€ããããã¯äœéå±€åŽã®äžéšã®éå±€ãšãããïŒïŒïŒã«ããèªåçã«ããããã¯ãŠãŒã¶æäœã«å¿ããŠèšå®ãããããŸãããã³ãŒãïŒïŒïŒã«ã¯ãïŒïŒïŒãããïŒDecoding Time stampïŒã«åºã¥ããŠããã³ãŒãã¿ã€ãã³ã°ãäžããããããªãããã³ãŒãïŒïŒïŒã¯ãåãã¯ãã£ã®ç¬Šå·åç»åããŒã¿ããã³ãŒãããéã«ãå¿
èŠã«å¿ããŠãéå§çž®ããŒã¿ãããã¡ïŒïŒïŒãã被åç
§ãã¯ãã£ã®ç»åããŒã¿ãèªã¿åºããŠå©çšããã
  Here, in the
å³ïŒïŒã¯ããã³ãŒãïŒïŒïŒã®æ§æäŸã瀺ããŠããããã®ãã³ãŒãïŒïŒïŒã¯ããã³ãã©ã«ïŒ©ïŒ€è§£æéšïŒïŒïŒãšã察象éå±€éžæéšïŒïŒïŒãšãã¹ããªãŒã çµåéšïŒïŒïŒãšããã³ãŒãéšïŒïŒïŒãæããŠããããã³ãã©ã«ïŒ©ïŒ€è§£æéšïŒïŒïŒã¯ãå§çž®ããŒã¿ãããã¡ïŒïŒïŒã«èç©ãããŠãããããªã¹ããªãŒã ïŒç¬Šå·åã¹ããªãŒã ïŒãèªã¿åºããåãã¯ãã£ã®ç¬Šå·åç»åããŒã¿ã®ïŒ®ïŒ¡ïŒ¬ãŠããããããã«æ¿å
¥ãããŠããtemporal_idã解æããã
  FIG. 29 shows a configuration example of the
察象éå±€éžæéšïŒïŒïŒã¯ãå§çž®ããŒã¿ãããã¡ïŒïŒïŒããèªã¿åºãããåãããªã¹ããªãŒã ããããã³ãã©ã«ïŒ©ïŒ€è§£æéšïŒïŒïŒã®è§£æçµæã«åºã¥ããŠããã³ãŒããã¹ãéå±€ãšããŠæå®ãããéå±€ã®ãã¯ãã£ã®ç¬Šå·åç»åããŒã¿ãåãåºãããã®å Žåã察象éå±€éžæéšïŒïŒïŒããã¯ãå§çž®ããŒã¿ãããã¡ïŒïŒïŒããèªã¿åºããããããªã¹ããªãŒã ã®æ°ããã³æå®éå±€ã«å¿ããŠãåäžãŸãã¯è€æ°ã®ãããªã¹ããªãŒã ïŒç¬Šå·åã¹ããªãŒã ïŒãåºåãããã
  The target
ã¹ããªãŒã çµåéšïŒïŒïŒã¯ã察象éå±€éžæéšïŒïŒïŒããåºåãããåãããªã¹ããªãŒã ïŒç¬Šå·åã¹ããªãŒã ïŒãäžã€ã«çµåããããªããå³ç€ºãšã¯ç°ãªãããã¹ããªãŒã çµåéšïŒïŒïŒã¯ãïœïœïœãããã¡ïŒïŒïŒããåºåããããåãããªã¹ããªãŒã ïŒç¬Šå·åã¹ããªãŒã ïŒãäžã€ã«çµåããŠãããããã®å Žåãã¹ããªãŒã çµåéšïŒïŒïŒã¯ã察象éå±€éžæããã³ãã©ã«ïŒ©ïŒ€è§£æãšå
±ã«å®è¡ããããšãšãªããã¹ããªãŒã çµåéšïŒïŒïŒã¯ãåãã¯ãã£ã®ç¬Šå·åç»åããŒã¿ããã³ãŒãã¿ã€ãã³ã°æ
å ±ã«åºã¥ããŠïŒã€ã®ã¹ããªãŒã ã«ãããå³ïŒïŒã¯ãã¹ããªãŒã çµåã®äžäŸã瀺ããŠããã
  The
  This example corresponds to the example shown in FIG. 9 described above, and combines the pictures of the base stream, spaced at 60 Hz intervals, with the pictures of the enhanced stream, also spaced at 60 Hz intervals. In this case, the pictures form a single stream with time stamps at 120 Hz.
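  The combination described above can be sketched as a merge of two time-ordered picture lists. This is only an illustration under stated assumptions: pictures are represented as hypothetical `(dts, label)` tuples, the merge key is the decode time stamp (DTS), and the enhanced-stream pictures are assumed to fall midway between the base-stream pictures. The function name `combine` is not taken from the source.

```python
import heapq
from fractions import Fraction


def combine(base, enhanced):
    """Merge two lists of (dts, label) tuples, each already sorted by dts,
    into one list ordered by decode time stamp."""
    return list(heapq.merge(base, enhanced, key=lambda pic: pic[0]))


interval = Fraction(1, 60)   # each stream has pictures at 60 Hz intervals
offset = Fraction(1, 120)    # assumption: enhanced pictures sit midway

base = [(i * interval, f"B{i}") for i in range(3)]
enhanced = [(i * interval + offset, f"E{i}") for i in range(3)]

combined = combine(base, enhanced)
labels = [label for _, label in combined]
gaps = [b[0] - a[0] for a, b in zip(combined, combined[1:])]
```

  With these inputs the pictures alternate between the two streams and every gap in `gaps` is 1/120 of a second, which matches the 120 Hz time stamps mentioned in the text.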
ãªãããã®ïŒã€ã®ã¹ããªãŒã ã¯ãã³ãŒãéšïŒïŒïŒã«éãããåŸè¿°ããããã«ãåãã¯ãã£ã®ç¬Šå·åç»åããŒã¿ã¯ãããããã³ãŒãã¿ã€ãã³ã°ã§ãã³ãŒããããïœïœïœïŒéå§çž®ããŒã¿ãããã¡ïŒïŒïŒïŒã«èç©ãããããã®åŸãéå§çž®ããŒã¿ãããã¡ïŒïŒïŒããåãã¯ãã£ã®ç»åããŒã¿ãããã¯ãã£ã®ãªãªãŒããè¡ãããŠé 次ïŒïŒïŒïŒšïœã§èªã¿åºããããå³ç€ºã®äŸã§ã¯ããŸãããããµãã»ãã¯ãã£ã°ã«ãŒãã®ãã¯ãã£ïŒå³äžããã®ãããã³ã°ã§ç€ºãïŒãèªã¿åºããããã®æ¬¡ã«ã次ã®ãµãã»ãã¯ãã£ã°ã«ãŒãã®ãã¯ãã£ïŒå·Šäžããã®ãããã³ã°ã§ç€ºãïŒãèªã¿åºãããããã³ãŒãåŸããããµãã»ãã¯ãã£ã°ã«ãŒãã®ãã¯ãã£ã®è¡šç€ºããªãããŠããéã次ã®ãµãã»ãã¯ãã£ã°ã«ãŒãã®ãã¯ãã£ã¯éå§çž®ããŒã¿ãããã¡ïŒïŒïŒã«èç©ãããŠããŠããã®åŸã®åç
§ãã¯ãã£ãšãªãã
  This one stream is sent to the
ãªããè€æ°ã®ã¹ããªãŒã ã®ãã¯ãã£ã®ãŸãšãåŠçèªäœã¯ãäžè¿°ã®ããã«ããã¬ã¯ãµïŒïŒïŒã«ãããŠãéžæãããè€æ°ã®ã¹ããªãŒã ã«å¯ŸããŠè¡ã£ãŠãå§çž®ããŒã¿ãããã¡ïŒïœïœïœïŒïŒïŒïŒã«ïŒã€ã®ã¹ããªãŒã ãšããŠè»¢éããããã«ããŠãããããã®éã®çµååŠçããåæ§ã«ããã³ãŒãã¿ã€ãã³ã°æ
å ±ã«åºã¥ããŠè¡ãããããã®å Žåããã³ãŒãã«ãããçµååŠçã¯äžèŠãšãªãã
  Note that the picture summarization process itself of the plurality of streams is performed on the plurality of selected streams in the above-described
ãã³ãŒãéšïŒïŒïŒã¯ãã¹ããªãŒã çµåéšïŒïŒïŒã§çµåããããããªã¹ããªãŒã ïŒç¬Šå·åã¹ããªãŒã ïŒãæã€åãã¯ãã£ã®ç¬Šå·åç»åããŒã¿ããé 次ãã³ãŒãã¿ã€ãã³ã°ã§ãã³ãŒãããéå§çž®ããŒã¿ãããã¡ïŒïœïœïœïŒïŒïŒïŒã«éãã
  The
ãã®å Žåããã³ãŒãéšïŒïŒïŒã¯ããã®è§£æãè¡ã£ãŠãäŸãã°ããµãã¬ã€ã€ããšã®ãããã¬ãŒãã®ã¬ãã«æå®å€ãsublayer_level_idcããææ¡ãããã³ãŒãèœåå
ã§ãã³ãŒããåŸããã®ãã©ããã確èªããããŸãããã®å Žåããã³ãŒãéšïŒïŒïŒã¯ãã®è§£æãè¡ã£ãŠãäŸãã°ããinitial_cpb_removal_timeãããcpb_removal_delayããææ¡ããïŒïŒïŒããã®ãã³ãŒãã¿ã€ãã³ã°ãé©åã確èªããã
  In this case, the
ãã³ãŒãéšïŒïŒïŒã¯ãã¹ã©ã€ã¹ïŒSliceïŒã®ãã³ãŒããè¡ãéã«ãã¹ã©ã€ã¹ãããïŒSlice headerïŒãããæéæ¹åã®äºæž¬å
ãè¡šãæ
å ±ãšããŠããref_idx_l0_active(ref_idx_l1_active)ãååŸããæéæ¹åã®äºæž¬ãè¡ãããªãããã³ãŒãåŸã®ãã¯ãã£ã¯ãã¹ã©ã€ã¹ãããïŒslice headerïŒããåŸããããshort_term_ref_pic_set_idxãããããã¯ãit_idx_spsããææšãšãããŠãä»ã®ãã¯ãã£ã«ãã被åç
§ãšããŠåŠçãããã
  When decoding the slice (Slice), the
å³ïŒïŒã«æ»ã£ãŠãéå§çž®ããŒã¿ãããã¡ïŒïœïœïœïŒïŒïŒïŒã¯ããã³ãŒãïŒïŒïŒã§ãã³ãŒããããåãã¯ãã£ã®ç»åããŒã¿ããäžæçã«èç©ããããã¹ãåŠçéšïŒïŒïŒã¯ãéå§çž®ããŒã¿ãããã¡ïŒïœïœïœïŒïŒïŒïŒãã衚瀺ã¿ã€ãã³ã°ã§é 次èªã¿åºãããåãã¯ãã£ã®ç»åããŒã¿ã«å¯ŸããŠããã®ãã¬ãŒã ã¬ãŒããã衚瀺èœåã«åãããåŠçãè¡ãããã®å ŽåãïŒïŒïŒãããïŒPresentation Time stampïŒã«åºã¥ããŠã衚瀺ã¿ã€ãã³ã°ãäžããããã
  Returning to FIG. 25, the uncompressed data buffer (dpb) 206 temporarily stores the image data of each picture decoded by the
äŸãã°ããã³ãŒãåŸã®åãã¯ãã£ã®ç»åããŒã¿ã®ãã¬ãŒã ã¬ãŒããïŒïŒïŒïœïœïœã§ãã£ãŠã衚瀺èœåãïŒïŒïŒïœïœïœã§ãããšãããã¹ãåŠçéšïŒïŒïŒã¯ããã³ãŒãåŸã®åãã¯ãã£ã®ç»åããŒã¿ããã®ãŸãŸãã£ã¹ãã¬ã€ã«éãããŸããäŸãã°ããã³ãŒãåŸã®åãã¯ãã£ã®ç»åããŒã¿ã®ãã¬ãŒã ã¬ãŒããïŒïŒïŒïœïœïœã§ãã£ãŠã衚瀺èœåãïŒïŒïœïœïœã§ãããšãããã¹ãåŠçéšïŒïŒïŒã¯ããã³ãŒãåŸã®åãã¯ãã£ã®ç»åããŒã¿ã«å¯ŸããŠæéæ¹å解å床ãïŒ/ïŒåãšãªãããã«ãµããµã³ãã«åŠçãæœããïŒïŒïœïœïœã®ç»åããŒã¿ãšããŠãã£ã¹ãã¬ã€ã«éãã
  For example, when the frame rate of the image data of each picture after decoding is 120 fps and the display capability is 120 fps, the
ãŸããäŸãã°ããã³ãŒãåŸã®åãã¯ãã£ã®ç»åããŒã¿ã®ãã¬ãŒã ã¬ãŒããïŒïŒïœïœïœã§ãã£ãŠã衚瀺èœåãïŒïŒïŒïœïœïœã§ãããšãããã¹ãåŠçéšïŒïŒïŒã¯ããã³ãŒãåŸã®åãã¯ãã£ã®ç»åããŒã¿ã«å¯ŸããŠæéæ¹å解å床ãïŒåãšãªãããã«è£éåŠçãæœããïŒïŒïŒïœïœïœã®ç»åããŒã¿ãšããŠãã£ã¹ãã¬ã€ã«éãããŸããäŸãã°ããã³ãŒãåŸã®åãã¯ãã£ã®ç»åããŒã¿ã®ãã¬ãŒã ã¬ãŒããïŒïŒïœïœïœã§ãã£ãŠã衚瀺èœåãïŒïŒïœïœïœã§ãããšãããã¹ãåŠçéšïŒïŒïŒã¯ããã³ãŒãåŸã®åãã¯ãã£ã®ç»åããŒã¿ããã®ãŸãŸãã£ã¹ãã¬ã€ã«éãã
  For example, when the frame rate of the image data of each picture after decoding is 60 fps and the display capability is 120 fps, the
å³ïŒïŒã¯ããã¹ãåŠçéšïŒïŒïŒã®æ§æäŸã瀺ããŠããããã®äŸã¯ãäžè¿°ããããã«ãã³ãŒãåŸã®åãã¯ãã£ã®ç»åããŒã¿ã®ãã¬ãŒã ã¬ãŒããïŒïŒïŒïœïœïœãããã¯ïŒïŒïœïœïœã§ãã£ãŠã衚瀺èœåãïŒïŒïŒïœïœïœãããã¯ïŒïŒïœïœïœã§ããå Žåã«å¯ŸåŠå¯èœãšããäŸã§ããã
  FIG. 31 shows a configuration example of the
ãã¹ãåŠçéšïŒïŒïŒã¯ãè£ééšïŒïŒïŒãšããµããµã³ãã«éšïŒïŒïŒãšãã¹ã€ããéšïŒïŒïŒãæããŠãããéå§çž®ããŒã¿ãããã¡ïŒïŒïŒããã®ãã³ãŒãåŸã®åãã¯ãã£ã®ç»åããŒã¿ã¯ãçŽæ¥ã¹ã€ããéšïŒïŒïŒã«å
¥åããããããã¯è£ééšïŒïŒïŒã§ïŒåã®ãã¬ãŒã ã¬ãŒããšãããåŸã«ã¹ã€ããéšïŒïŒïŒã«å
¥åããããããã¯ãµããµã³ãã«éšïŒïŒïŒã§ïŒ/ïŒåã®ãã¬ãŒã ã¬ãŒããšãããåŸã«ã¹ã€ããéšïŒïŒïŒã«å
¥åãããã
  The
ã¹ã€ããéšïŒïŒïŒã«ã¯ãïŒïŒïŒãããéžææ
å ±ãäŸçµŠãããããã®éžææ
å ±ã¯ãïŒïŒïŒãã衚瀺èœåãåç
§ããŠèªåçã«ããããã¯ããŠãŒã¶æäœã«å¿ããŠçºçãããã¹ã€ããéšïŒïŒïŒã¯ãéžææ
å ±ã«åºã¥ããŠãå
¥åã®ãããããéžæçã«åºåãšãããããã«ãããéå§çž®ããŒã¿ãããã¡ïŒïœïœïœïŒïŒïŒïŒãã衚瀺ã¿ã€ãã³ã°ã§é 次èªã¿åºãããåãã¯ãã£ã®ç»åããŒã¿ã®ãã¬ãŒã ã¬ãŒãã¯ã衚瀺èœåã«åã£ããã®ãšãããã
  Selection information is supplied from the
å³ïŒïŒã¯ããã³ãŒãïŒïŒïŒããã¹ãåŠçéšïŒïŒïŒã®åŠçãããŒã®äžäŸã瀺ããŠããããã³ãŒãïŒïŒïŒããã¹ãåŠçéšïŒïŒïŒã¯ãã¹ãããïŒïŒã«ãããŠãåŠçãéå§ãããã®åŸã«ãã¹ãããïŒïŒã®åŠçã«ç§»ãããã®ã¹ãããïŒïŒã«ãããŠããã³ãŒãïŒïŒïŒã¯ãå§çž®ããŒã¿ãããã¡ïŒïœïœïœïŒïŒïŒïŒã«èç©ãããŠãããã³ãŒã察象ã®ãããªã¹ããªãŒã ãèªã¿åºããtemporal_idã«åºã¥ããŠãïŒïŒïŒãããã³ãŒã察象ãšããŠæå®ãããéå±€ã®ãã¯ãã£ãéžæããã
  FIG. 32 shows an example of the processing flow of the
次ã«ããã³ãŒãïŒïŒïŒã¯ãã¹ãããïŒïŒã«ãããŠãéžæãããåãã¯ãã£ã®ç¬Šå·åç»åããŒã¿ããã³ãŒãã¿ã€ãã³ã°ã§é 次ãã³ãŒããããã³ãŒãåŸã®åãã¯ãã£ã®ç»åããŒã¿ãéå§çž®ããŒã¿ãããã¡ïŒïœïœïœïŒïŒïŒïŒã«è»¢éããŠãäžæçã«èç©ããã次ã«ããã¹ãåŠçéšïŒïŒïŒã¯ãã¹ãããïŒïŒã«ãããŠãéå§çž®ããŒã¿ãããã¡ïŒïœïœïœïŒïŒïŒïŒããã衚瀺ã¿ã€ãã³ã°ã§åãã¯ãã£ã®ç»åããŒã¿ãèªã¿åºãã
  Next, in step ST53, the
次ã«ããã¹ãåŠçéšïŒïŒïŒã¯ãèªã¿åºãããåãã¯ãã£ã®ç»åããŒã¿ã®ãã¬ãŒã ã¬ãŒãã衚瀺èœåã«ãã£ãŠãããåŠããå€æããããã¬ãŒã ã¬ãŒãã衚瀺èœåã«åã£ãŠããªããšãããã¹ãåŠçéšïŒïŒïŒã¯ãã¹ãããïŒïŒã«ãããŠããã¬ãŒã ã¬ãŒãã衚瀺èœåã«åãããŠããã£ã¹ãã¬ã€ã«éãããã®åŸãã¹ãããïŒïŒã«ãããŠãåŠçãçµäºãããäžæ¹ããã¬ãŒã ã¬ãŒãã衚瀺èœåã«åã£ãŠãããšãããã¹ãåŠçéšïŒïŒïŒã¯ãã¹ãããïŒïŒã«ãããŠããã¬ãŒã ã¬ãŒããã®ãŸãŸã§ãã£ã¹ãã¬ã€ã«éãããã®åŸãã¹ãããïŒïŒã«ãããŠãåŠçãçµäºããã
  Next, the
å³ïŒïŒã«ç€ºãåä¿¡è£
眮ïŒïŒïŒã®åäœãç°¡åã«èª¬æãããåä¿¡éšïŒïŒïŒã§ã¯ãåä¿¡ã¢ã³ããã§åä¿¡ãããå€èª¿ä¿¡å·ã埩調ããããã©ã³ã¹ããŒãã¹ããªãŒã ãååŸãããããã®ãã©ã³ã¹ããŒãã¹ããªãŒã ã¯ãããã«ããã¬ã¯ãµïŒïŒïŒã«éããããããã«ããã¬ã¯ãµïŒïŒïŒã§ã¯ããã©ã³ã¹ããŒãã¹ããªãŒã ããããã³ãŒãèœåïŒDecoder temporal layer capabilityïŒã«å¿ããŠå
šéšãããã¯äžéšã®ãããªã¹ããªãŒã ãããã£ã«ã¿ãªã³ã°ãããã
  The operation of receiving
äŸãã°ããã³ãŒãèœåãé«ãå Žåã«ã¯ãããŒã¹ã¹ããªãŒã ããã³ãšã³ãã³ã¹ã¹ããªãŒã ã®å
šãŠã®ãããªã¹ããªãŒã ãéžæãããããŸããäŸãã°ããã³ãŒãèœåãäœãå Žåã«ã¯ããã³ãŒãå¯èœãªéå±€ãå«ããããŒã¹ã¹ããªãŒã ãå«ãæå®æ°ã®ãããªã¹ããªãŒã ãéžæãããããããŠãããã«ããã¬ã¯ãµïŒïŒïŒããã¯ãéžæããããããªã¹ããªãŒã ã®ãã¯ãã£ã®ç¬Šå·åç»åããŒã¿ãå§çž®ããŒã¿ãããã¡ïŒïœïœïœïŒïŒïŒïŒã«éãããäžæçã«èç©ãããã
  For example, when the decoding capability is high, all video streams, that is, the base stream and the enhanced streams, are selected. When the decoding capability is low, a predetermined number of video streams that contain the decodable layers, including the base stream, are selected. Then, from the
ãã³ãŒãïŒïŒïŒã§ã¯ãå§çž®ããŒã¿ãããã¡ïŒïŒïŒã«èç©ãããŠãããããªã¹ããªãŒã ããããã³ãŒããã¹ãéå±€ãšããŠæå®ãããéå±€ã®ãã¯ãã£ã®ç¬Šå·åç»åããŒã¿ãåãåºãããããããŠããã³ãŒãïŒïŒïŒã§ã¯ãåãåºãããåãã¯ãã£ã®ç¬Šå·åç»åããŒã¿ããããããããã®ãã¯ãã£ã®ãã³ãŒãã¿ã€ãã³ã°ã§ãã³ãŒããããéå§çž®ããŒã¿ãããã¡ïŒïœïœïœïŒïŒïŒïŒã«éãããäžæçã«èç©ãããããã®å Žåãåãã¯ãã£ã®ç¬Šå·åç»åããŒã¿ããã³ãŒããããéã«ãå¿
èŠã«å¿ããŠãéå§çž®ããŒã¿ãããã¡ïŒïŒïŒãã被åç
§ãã¯ãã£ã®ç»åããŒã¿ãèªã¿åºãããŠå©çšãããã
  In the
éå§çž®ããŒã¿ãããã¡ïŒïœïœïœïŒïŒïŒïŒãã衚瀺ã¿ã€ãã³ã°ã§é 次èªã¿åºãããåãã¯ãã£ã®ç»åããŒã¿ã¯ããã¹ãåŠçéšïŒïŒïŒã«éãããããã¹ãåŠçéšïŒïŒïŒã§ã¯ãåãã¯ãã£ã®ç»åããŒã¿ã«å¯ŸããŠããã®ãã¬ãŒã ã¬ãŒããã衚瀺èœåã«åãããããã®è£éãããã¯ãµããµã³ãã«ãè¡ãããããã®ãã¹ãåŠçéšïŒïŒïŒã§åŠçãããåãã¯ãã£ã®ç»åããŒã¿ã¯ããã£ã¹ãã¬ã€ã«äŸçµŠããããã®åãã¯ãã£ã®ç»åããŒã¿ã«ããåç»åã®è¡šç€ºãè¡ãããã
  The image data of each picture sequentially read from the uncompressed data buffer (dpb) 206 at the display timing is sent to the
以äžèª¬æããããã«ãå³ïŒã«ç€ºãéåä¿¡ã·ã¹ãã ïŒïŒã«ãããŠã¯ãéä¿¡åŽã«ãããŠãå°ãªããšããæäžäœã®éå±€çµã®ãã¯ãã£ã®ç¬Šå·åç»åããŒã¿ã®ãã³ãŒãééãçééãšãªãããã«ç¬Šå·åããããã®ã§ããããã®ãããäŸãã°ãåä¿¡åŽããæäžäœã®éå±€çµã«å«ãŸãè€æ°ã®éå±€ã®ãã¯ãã£ã®ç¬Šå·åç»åããŒã¿ãåŠçå¯èœãªãã³ãŒãèœåãããå Žåãåãã¯ãã£ã®ç¬Šå·åç»åããŒã¿ã®ãã³ãŒãåŠçãç¡çãªãé£ç¶ããŠè¡ãããšãå¯èœãšãªãã
  As described above, in the transmission /
ãŸããå³ïŒã«ç€ºãéåä¿¡ã·ã¹ãã ïŒïŒã«ãããŠã¯ãéä¿¡åŽã«ãããŠãæäžäœã®éå±€çµããäžäœã«äœçœ®ããéå±€çµã®ãã¯ãã£ã®ç¬Šå·åç»åããŒã¿ã®ãã³ãŒãã¿ã€ãã³ã°ãããã®éå±€çµããäžäœåŽã«äœçœ®ãããã¹ãŠã®éå±€çµã®ãã¯ãã£ã®ç¬Šå·åç»åããŒã¿ã®ãã³ãŒãã¿ã€ãã³ã°ã®äžéã¿ã€ãã³ã°ãšãªãããã«ç¬Šå·åããããã®ã§ããããã®ãããäŸãã°ãåä¿¡åŽã§ã¯ãæäžäœã®éå±€çµã ãã§ãªãããããããäžäœã«äœçœ®ããéå±€çµã®ãã¯ãã£ã®ç¬Šå·åç»åããŒã¿ãŸã§ããã³ãŒãããèœåãããå Žåã«ãåãã¯ãã£ã®ãã³ãŒãåŠçãé 次ã¹ã ãŒãºã«é²ããããšãå¯èœãšãªãã
  Further, in the transmission /
ãŸããå³ïŒã«ç€ºãéåä¿¡ã·ã¹ãã ïŒïŒã«ãããŠã¯ãéä¿¡åŽã«ãããŠãè€æ°ã®éå±€ãæå®æ°ã®éå±€çµã«åå²ããéãæäžäœã®éå±€çµã«è€æ°ã®éå±€ãå«ã¿ããã®æäžäœã®éå±€çµããäžäœã«äœçœ®ããéå±€çµã«ã¯ïŒã€ã®éå±€ãå«ãããã«ããããã®ã§ããããã®ãããäŸãã°ãåä¿¡åŽã§ã¯ãæäžäœã®éå±€çµã«å«ãŸãè€æ°ã®éå±€ã®ãã¯ãã£ã®ç¬Šå·åç»åããŒã¿ãåŠçå¯èœãªãã³ãŒãèœåãããå Žåããã®æäžäœã®éå±€çµã®ãã¯ãã£ã®ç¬Šå·åç»åããŒã¿ãæã€ãããªã¹ããªãŒã ã ããéžæããŠãããã¡ã«åã蟌ãã§ãã³ãŒãåŠçãè¡ãæ§æã§æžã¿ãè€æ°ã®ãããªã¹ããªãŒã ã®çµååŠçãªã©ãè¡ããªã©ã®è€éãªæ§æãäžèŠãšãªãã
  Further, in the transmission /
ãŸããå³ïŒã«ç€ºãéåä¿¡ã·ã¹ãã ïŒïŒã«ãããŠã¯ãéä¿¡åŽã«ãããŠããã©ã³ã¹ããŒãã¹ããªãŒã ã®ã¬ã€ã€ã«ãæå®æ°ã®ãããªã¹ããªãŒã ã®ããããããããŒã¹ã¹ããªãŒã ã§ããããšã³ãã³ã¹ã¹ããªãŒã ã§ããããèå¥ããããã®èå¥æ
å ±ãæ¿å
¥ããããã®ã§ããããã®ãããåä¿¡åŽã§ã¯ããã®èå¥æ
å ±ãå©çšããããšã§ãäŸãã°ãããŒã¹ã¹ããªãŒã ã ããéžæããäœéå±€çµã®ãã¯ãã£ã®ç¬Šå·åç»åããŒã¿ãéžæçã«ãã³ãŒãããããšã容æã«å¯èœãšãªãã
  In the transmission /
ãŸããå³ïŒã«ç€ºãéåä¿¡ã·ã¹ãã ïŒïŒã«ãããŠã¯ãéä¿¡åŽã«ãããŠããã©ã³ã¹ããŒãã¹ããªãŒã ã®ã¬ã€ã€ã«ããã®ãã©ã³ã¹ããŒãã¹ããªãŒã ã«å«ãŸããæå®æ°ã®ãããªã¹ããªãŒã ã®ããããã«å¯Ÿå¿ããŠããã®ãããªã¹ããªãŒã ã®æ§ææ
å ±ãæ¿å
¥ããããã®ã§ããããã®ãããäŸãã°ãåä¿¡åŽã§ã¯ããã©ã³ã¹ããŒãã¹ããªãŒã ã«å«ãŸããåãããªã¹ããªãŒã ã«ã€ããã©ã®ã°ã«ãŒãã«å±ããã®ããã©ã®ãããªã¹ããªãŒã äŸåé¢ä¿ã«ããã®ããéå±€æ°ããããã®é局笊å·åã«ä¿ããã®ã§ãããããªã©ã容æã«ææ¡å¯èœãšãªãã
  Also, in the transmission /
ãŸããå³ïŒã«ç€ºãéåä¿¡ã·ã¹ãã ïŒïŒã«ãããŠã¯ãåä¿¡åŽã«ãããŠãåä¿¡ããããããªã¹ããªãŒã ãããã³ãŒãèœåïŒDecoder temporal layer capabilityïŒã«å¿ããæå®é局以äžã®éå±€ã®ãã¯ãã£ã®ç¬Šå·åç»åããŒã¿ãéžæçã«å§çž®ããŒã¿ãããã¡ïŒïŒïŒã«åã蟌ãŸããŠãã³ãŒãããããã®ã§ããããã®ãããäŸãã°ããã³ãŒãèœåã«å¿ããé©åãªãã³ãŒãåŠçãå¯èœãšãªãã
  Further, in the transmission /
ãŸããå³ïŒã«ç€ºãéåä¿¡ã·ã¹ãã ïŒïŒã«ãããŠã¯ãåä¿¡åŽã«ãããŠã埩å·ååŸã®åãã¯ãã£ã®ç»åããŒã¿ã®ãã¬ãŒã ã¬ãŒãããã¹ãåŠçéšïŒïŒïŒã§è¡šç€ºèœåã«åããããã®ã§ããããã®ãããäŸãã°ããã³ãŒãèœåãäœãå Žåã§ãã£ãŠããé«è¡šç€ºèœåã«ãã£ããã¬ãŒã ã¬ãŒãã®ç»åããŒã¿ãåŸãããšãå¯èœãšãªãã
  In the transmission /
<2. Modification>
In the embodiment described above, "Max_layer_in_group", which is information on the maximum layer value, is described in the multi-stream descriptor (see FIG. 18) and sent to the receiving side. However, instead of describing the maximum layer value in the descriptor and supplying it to the receiving side, the text of a standard document, whether an electronic file or on paper, may state that "the maximum layer is specified", and the maximum layer value may be specified or set in the receiving device in advance. In this case as well, the receiving side refers to the maximum layer value, just as when it is supplied from the transmitting side in the descriptor, filters the streams containing the layers that match its own decoding capability, and performs the decoding process.
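The filtering step in this modification can be illustrated with a short Python sketch. Everything here is an assumption made for illustration: the `StreamInfo` class, its fields, the PID values, and the rule that a stream is kept when it carries at least one layer the decoder can handle. The layer split (layers 0 to 3 in the base stream, layer 4 in the enhanced stream) follows the 120 fps example given earlier, and the maximum layer value is treated as a preset constant rather than a value read from a descriptor.

```python
from dataclasses import dataclass


@dataclass
class StreamInfo:
    """Hypothetical summary of one video stream: its PID and layer range."""
    pid: int
    min_layer: int
    max_layer: int


def select_streams(streams, decoder_max_layer, max_layer_in_group):
    """Keep streams that carry at least one layer the decoder can decode.

    The effective limit is the lower of the decoder's highest decodable
    layer and the maximum layer value known to the receiver.
    """
    limit = min(decoder_max_layer, max_layer_in_group)
    return [s for s in streams if s.min_layer <= limit]


PRESET_MAX_LAYER = 4  # assumed to be set in the receiver in advance

streams = [
    StreamInfo(pid=0x100, min_layer=0, max_layer=3),  # base stream
    StreamInfo(pid=0x101, min_layer=4, max_layer=4),  # enhanced stream
]

full = [s.pid for s in select_streams(streams, 4, PRESET_MAX_LAYER)]
base_only = [s.pid for s in select_streams(streams, 3, PRESET_MAX_LAYER)]
```

In this sketch a decoder that handles every layer receives both PIDs, while a decoder limited to layer 3 receives only the base stream. Any finer selection inside a stream, for example dropping individual layers, would be a separate step that this example does not cover.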
  Moreover, the embodiment described above shows an example of temporal scalability, in which the configuration information of each stream is transmitted to the receiving side using the multi-stream descriptor. Although a detailed description is omitted, the multi-stream descriptor can also be applied to other kinds of scalability, such as spatial scalability or bit rate scalability, in a service where a basic stream (base stream) and an extension stream (enhanced stream) coexist. That is, the multi-stream descriptor described above is a useful signaling method when performing multi-stream encoding.
ãŸããäžè¿°å®æœã®åœ¢æ
ã«ãããŠã¯ãéä¿¡è£
眮ïŒïŒïŒãšåä¿¡è£
眮ïŒïŒïŒãããªãéåä¿¡ã·ã¹ãã ïŒïŒã瀺ããããæ¬æè¡ãé©çšãåŸãéåä¿¡ã·ã¹ãã ã®æ§æã¯ãããã«éå®ããããã®ã§ã¯ãªããäŸãã°ãåä¿¡è£
眮ïŒïŒïŒã®éšåããäŸãã°ãïŒïŒšïŒ€ïŒïŒ©ïŒHigh-Definition Multimedia InterfaceïŒãªã©ã®ããžã¿ã«ã€ã³ã¿ãã§ãŒã¹ã§æ¥ç¶ãããã»ãããããããã¯ã¹ããã³ã¢ãã¿ã®æ§æãªã©ã§ãã£ãŠãããããªãããïŒïŒ©ãã¯ãç»é²åæšã§ããã
  In the above-described embodiment, the transmission /
  Further, the embodiment described above shows an example in which the container is a transport stream (MPEG-2 TS). However, the present technology can be applied in the same way to systems that deliver content to receiving terminals over a network such as the Internet. Internet delivery often uses containers in MP4 or other formats. In other words, the container may be any of various formats, such as the transport stream (MPEG-2 TS) adopted in digital broadcasting standards or MP4 used in Internet delivery.
The present technology may also take the following configurations.
(1) An encoding device including an image encoding unit that classifies the image data of each picture constituting moving image data into a plurality of layers, encodes the classified image data of the pictures of each layer, divides the plurality of layers into a predetermined number of layer sets, and generates the predetermined number of video streams, each having the encoded image data of the pictures of one of the divided layer sets,
wherein the image encoding unit performs encoding so that at least the decoding intervals of the encoded image data of the pictures in the lowest layer set are equal.
(2) The encoding device according to (1), wherein the image encoding unit performs encoding so that the decoding timing of the encoded image data of the pictures in a layer set positioned higher than the lowest layer set is an intermediate timing between the decoding timings of the encoded image data of the pictures in all layer sets positioned lower than that layer set.
(3) The encoding device according to (1) or (2), wherein the image encoding unit divides the plurality of layers into the predetermined number of layer sets so that the lowest layer set includes a plurality of layers and each layer set positioned higher than the lowest layer set includes one layer.
(4) An encoding method including a step in which an image encoding unit classifies the image data of each picture constituting moving image data into a plurality of layers, encodes the classified image data of the pictures of each layer, divides the plurality of layers into a predetermined number of layer sets, and generates the predetermined number of video streams, each having the encoded image data of the pictures of one of the divided layer sets,
wherein, in the step, encoding is performed so that at least the decoding intervals of the encoded image data of the pictures in the lowest layer set are equal.
(5) A transmission device including:
an image encoding unit that classifies the image data of each picture constituting moving image data into a plurality of layers, encodes the classified image data of the pictures of each layer, divides the plurality of layers into a predetermined number of layer sets, and generates the predetermined number of video streams, each having the encoded image data of the pictures of one of the divided layer sets; and
a transmission unit that transmits a container of a predetermined format including the generated predetermined number of video streams,
wherein the image encoding unit performs encoding so that at least the decoding intervals of the encoded image data of the pictures in the lowest layer set are equal.
(6) A transmission device including:
an image encoding unit that classifies the image data of each picture constituting moving image data into a plurality of layers, encodes the classified image data of the pictures of each layer, divides the plurality of layers into a predetermined number of layer sets, and generates the predetermined number of video streams, each having the encoded image data of the pictures of one of the divided layer sets;
a transmission unit that transmits a container of a predetermined format including the generated predetermined number of video streams; and
an identification information insertion unit that inserts, into the layer of the container, identification information for identifying whether each of the predetermined number of video streams is a base stream having the encoded image data of the pictures in the lowest layer set or an enhanced stream having the encoded image data of the pictures in a layer set positioned higher than the lowest layer set.
(7) The transmission device according to (6), wherein the container is a transport stream, and the identification information insertion unit inserts the identification information as a stream type into the video elementary stream loops arranged under a program map table in correspondence with the predetermined number of video streams.
(8) The transmission device according to (6) or (7), wherein the image encoding unit performs encoding so that at least the decoding intervals of the encoded image data of the pictures in the lowest layer set are equal.
(9) The transmission device according to (8), wherein the image encoding unit performs encoding so that the decoding timing of the encoded image data of the pictures in a layer set positioned higher than the lowest layer set is an intermediate timing between the decoding timings of the encoded image data of the pictures in all layer sets positioned lower than that layer set.
(10) A transmission device including:
an image encoding unit that classifies the image data of each picture constituting moving image data into a plurality of layers, encodes the classified image data of the pictures of each layer, divides the plurality of layers into a predetermined number of layer sets, and generates the predetermined number of video streams, each having the encoded image data of the pictures of one of the divided layer sets;
a transmission unit that transmits a container of a predetermined format including the generated predetermined number of video streams; and
a configuration information insertion unit that inserts, into the layer of the container, configuration information of each of the predetermined number of video streams included in the container, in correspondence with that video stream.
(11) The transmission device according to (10), wherein the configuration information includes information indicating the service group to which the video stream belongs.
(12) The transmission device according to (10) or (11), wherein the configuration information includes information indicating the dependency relationship between streams, starting from the base stream having the encoded image data of the pictures in the lowest layer set.
(13) The transmission device according to any one of (10) to (12), wherein the configuration information includes information indicating the number of layers of the plurality of layers classified by the image encoding unit.
(14) The transmission device according to any one of (10) to (13), wherein the container is a transport stream, and the configuration information insertion unit inserts the configuration information as a descriptor into the video elementary stream loops arranged under a program map table in correspondence with each of the predetermined number of video streams.
(15) A receiving device including:
a receiving unit that receives a predetermined number of video streams, each having the encoded image data of the pictures of one of the layer sets obtained by classifying the image data of each picture constituting moving image data into a plurality of layers, encoding it, and dividing the plurality of layers into the predetermined number of layer sets; and
a processing unit that processes the received predetermined number of video streams,
wherein, among the predetermined number of video streams, at least the video stream having the encoded image data of the pictures in the lowest layer set is encoded so that the decoding intervals of its pictures are equal.
(16) The receiving device according to (15), wherein the predetermined number of video streams are encoded so that the decoding timing of the encoded image data of the pictures in a layer set positioned higher than the lowest layer set is an intermediate timing between the decoding timings of the encoded image data of the pictures in all layer sets positioned lower than that layer set.
(17) A receiving device including:
a receiving unit that receives a container of a predetermined format including a predetermined number of video streams, each having the encoded image data of the pictures of one of the layer sets obtained by classifying the image data of each picture constituting moving image data into a plurality of layers, encoding it, and dividing the plurality of layers into the predetermined number of layer sets; and
an image decoding unit that selectively takes into a buffer, from the predetermined number of video streams included in the received container, the encoded image data of the pictures in layers at or below a predetermined layer according to decoding capability, decodes the encoded image data of each picture taken into the buffer, and obtains the image data of the pictures in the layers at or below the predetermined layer,
wherein, among the predetermined number of video streams, at least the video stream having the encoded image data of the pictures in the lowest layer set is encoded so that the decoding intervals of its pictures are equal.
(18) The receiving device according to (17), wherein identification information is inserted in the layer of the container for identifying whether each of the predetermined number of video streams is a base stream having the encoded image data of the pictures in the lowest layer set or an enhanced stream having the encoded image data of the pictures in a layer set positioned higher than the lowest layer set, and
the image decoding unit, based on the identification information, takes into the buffer and decodes the encoded image data of the pictures in a predetermined layer set according to the decoding capability from the predetermined number of video streams including the base stream.
(19) The receiving device according to (17) or (18), wherein, when the encoded image data of the pictures in the predetermined layer set is contained in a plurality of video streams, the image decoding unit combines the encoded image data of the pictures into one stream based on decoding timing information and decodes it.
(20) The receiving device according to any one of (17) to (19), further including a post-processing unit that matches the frame rate of the image data of each picture obtained by the image decoding unit to display capability.
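The combining step in configuration (19) can be sketched as follows. This is a minimal illustration and not the patent's implementation: the `CodedPicture` record, its fields, and the integer timestamp units are assumptions made for the example, and each input stream is assumed to be already ordered by decode timing.

```python
import heapq
from typing import NamedTuple


class CodedPicture(NamedTuple):
    """Hypothetical record for one encoded picture (names are illustrative)."""
    dts: int      # decoding timing information, in arbitrary ticks
    stream: str   # which video stream carried the picture
    layer: int    # layer the picture was classified into


def combine_streams(*streams):
    """Merge several streams, each sorted by dts, into one stream in decode order."""
    return list(heapq.merge(*streams, key=lambda picture: picture.dts))


base = [CodedPicture(0, "base", 0), CodedPicture(2, "base", 1), CodedPicture(4, "base", 0)]
enhanced = [CodedPicture(1, "enhanced", 2), CodedPicture(3, "enhanced", 2)]
combined = combine_streams(base, enhanced)
```

Because the enhanced pictures sit between the base pictures in decode time, the combined list alternates between the two streams and can be fed to a single decoder.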
The main feature of the present technology is that encoding is performed so that at least the decoding intervals of the encoded image data of the pictures in the lowest layer set are equal. As a result, when the receiving side has a decoding capability sufficient to process the encoded image data of the pictures in the plurality of layers included in the lowest layer set, it can decode the encoded image data of each picture continuously and without difficulty (see FIGS. 8 and 11).
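The timing relationship described here can be illustrated with a short sketch. It assumes, purely for the example, a base interval of 1/60 s and exact rational arithmetic; the function names are hypothetical and do not come from the patent.

```python
from fractions import Fraction


def base_decode_times(count, interval):
    """Equally spaced decode times for the pictures of the lowest layer set."""
    return [i * interval for i in range(count)]


def higher_decode_times(lower_times):
    """Place each higher-layer-set picture midway between neighbouring
    decode times of all lower layer sets (passed in as one sorted list)."""
    return [(a + b) / 2 for a, b in zip(lower_times, lower_times[1:])]


base = base_decode_times(4, Fraction(1, 60))   # assumed 1/60 s spacing
higher = higher_decode_times(base)
all_times = sorted(base + higher)
```

A receiver that decodes only the lowest layer set sees a constant 1/60 s spacing, while a receiver that decodes both sets sees a constant 1/120 s spacing, so neither has to absorb bursts of pictures.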
Another main feature of the present technology is that identification information for identifying whether each of the predetermined number of video streams is a base stream or an enhanced stream is inserted into the layer of the transport stream TS. By using this identification information, the receiving side can, for example, easily and selectively decode only the base stream (see FIGS. 20 and 23).
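A receiver-side selection based on this identification information might look like the sketch below. The `VideoStreamEntry` record, the `is_base` flag, and the PID values are hypothetical stand-ins; the actual stream type values signalled in the transport stream are not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VideoStreamEntry:
    """Hypothetical per-stream entry read from the container layer."""
    pid: int        # illustrative packet identifier
    is_base: bool   # identification information: base (True) or enhanced (False)


def select_pids(entries, can_decode_enhanced):
    """Return the PIDs to decode: always the base stream, plus the
    enhanced streams only when the decoder capability allows it."""
    return [e.pid for e in entries if e.is_base or can_decode_enhanced]


entries = [VideoStreamEntry(0x100, True), VideoStreamEntry(0x101, False)]
base_only = select_pids(entries, can_decode_enhanced=False)
everything = select_pids(entries, can_decode_enhanced=True)
```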
A further main feature of the present technology is that configuration information of each video stream is inserted into the layer of the transport stream TS in correspondence with each of the predetermined number of video streams included in the transport stream TS. For each video stream included in the transport stream TS, the receiving side can therefore easily determine which group it belongs to, what stream dependency relationship it has, and how many layers the hierarchical encoding uses (see FIGS. 20 and 23).
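As a rough sketch of how a receiver could use such configuration information, the example below models the three kinds of information named above (service group, stream dependency, number of layers) and walks the dependency chain from the base stream. The `StreamConfig` fields and stream names are assumptions for illustration, not the descriptor syntax, and a single linear chain per group is assumed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StreamConfig:
    """Hypothetical configuration information for one video stream."""
    stream_id: str
    service_group: int          # group the stream belongs to
    depends_on: Optional[str]   # stream it depends on; None for the base stream
    layer_count: int            # number of layers in the hierarchical encoding


def dependency_chain(configs, group):
    """List the stream ids of one service group, starting from the base stream."""
    child_of = {c.depends_on: c for c in configs if c.service_group == group}
    chain = []
    current = child_of.get(None)   # the base stream depends on nothing
    while current is not None:
        chain.append(current.stream_id)
        current = child_of.get(current.stream_id)
    return chain


configs = [
    StreamConfig("enhanced2", 1, "enhanced1", 5),
    StreamConfig("base", 1, None, 5),
    StreamConfig("enhanced1", 1, "base", 5),
    StreamConfig("other", 2, None, 3),
]
```

Calling `dependency_chain(configs, 1)` returns the streams in the order they must be available for decoding, regardless of the order in which their entries were listed.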
DESCRIPTION OF REFERENCE SIGNS
102: Encoder
103: Compressed data buffer (cpb)
104: Multiplexer
105: Transmission unit
202: Receiving unit
205: Decoder
206: Uncompressed data buffer (dpb)
207: Post-processing unit
231: TS adaptation field extraction unit
232: Clock information extraction unit
233: TS payload extraction unit
234: Section extraction unit
235: PSI table/descriptor extraction unit
Claims (7)
A receiving apparatus comprising:
a receiving unit that receives a first stream and a second stream generated by hierarchically encoding the image data of each picture constituting moving image data so that the decoding order and the display order are different, the first stream having encoded image data of pictures on the lower hierarchy side and the second stream having encoded image data of pictures on the higher hierarchy side; and
a processing unit that performs decoding processing on the first stream or on both the first stream and the second stream,
wherein the encoded image data has a NAL unit structure, and a level designation value of the first stream is inserted into the SPS NAL unit of the first stream,
the encoded image data of the pictures of the first stream is encoded so that the decoding intervals are equal, and
the encoded image data of the pictures of the second stream is encoded so that its decoding timing is an intermediate timing between the decoding timings of the encoded image data of the pictures of the first stream.
The receiving apparatus according to claim 1, wherein the receiving unit further receives identification information identifying the first stream as a base stream having encoded image data of pictures on the lower hierarchy side, and identification information identifying the second stream as an enhanced stream having encoded image data of pictures on the higher hierarchy side.
The receiving apparatus according to claim 2, wherein the processing unit performs, based on the identification information and according to decoding capability, decoding processing on the first stream or on both the first stream and the second stream included in the received container, and performs processing to match the frame rate of the image data of each picture obtained by the decoding processing to display capability.
A receiving method comprising:
a step of receiving a first stream and a second stream generated by hierarchically encoding the image data of each picture constituting moving image data so that the decoding order and the display order are different, the first stream having encoded image data of pictures on the lower hierarchy side and the second stream having encoded image data of pictures on the higher hierarchy side; and
a step of performing decoding processing on the first stream or on both the first stream and the second stream,
wherein the encoded image data has a NAL unit structure, and a level designation value of the first stream is inserted into the SPS NAL unit of the first stream,
the encoded image data of the pictures of the first stream is encoded so that the decoding intervals are equal, and
the encoded image data of the pictures of the second stream is encoded so that its decoding timing is an intermediate timing between the decoding timings of the encoded image data of the pictures of the first stream.
A transmitting apparatus comprising:
a transmission unit that transmits a first stream and a second stream generated by hierarchically encoding the image data of each picture constituting moving image data so that the decoding order and the display order are different, the first stream having encoded image data of pictures on the lower hierarchy side and the second stream having encoded image data of pictures on the higher hierarchy side,
wherein the encoded image data has a NAL unit structure, and a level designation value of the first stream is inserted into the SPS NAL unit of the first stream,
the encoded image data of the pictures of the first stream is encoded so that the decoding intervals are equal, and
the encoded image data of the pictures of the second stream is encoded so that its decoding timing is an intermediate timing between the decoding timings of the encoded image data of the pictures of the first stream.
The transmitting apparatus according to claim 5, wherein the transmission unit further transmits identification information identifying the first stream as a base stream having encoded image data of pictures on the lower hierarchy side, and identification information identifying the second stream as an enhanced stream having encoded image data of pictures on the higher hierarchy side.
A transmitting method comprising:
a step of transmitting a first stream and a second stream generated by hierarchically encoding the image data of each picture constituting moving image data so that the decoding order and the display order are different, the first stream having encoded image data of pictures on the lower hierarchy side and the second stream having encoded image data of pictures on the higher hierarchy side,
wherein the encoded image data has a NAL unit structure, and a level designation value of the first stream is inserted into the SPS NAL unit of the first stream,
the encoded image data of the pictures of the first stream is encoded so that the decoding intervals are equal, and
the encoded image data of the pictures of the second stream is encoded so that its decoding timing is an intermediate timing between the decoding timings of the encoded image data of the pictures of the first stream.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2018091095A JP6614275B2 (en) | 2018-05-10 | 2018-05-10 | Receiving device, receiving method, transmitting device, and transmitting method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2018091095A JP6614275B2 (en) | 2018-05-10 | 2018-05-10 | Receiving device, receiving method, transmitting device, and transmitting method |
Related Parent Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2016103835A Division JP6341228B2 (en) | 2016-05-25 | 2016-05-25 | Encoding device, encoding method, transmission device, transmission method, reception device, and reception method |
Publications (2)
Publication Number | Publication Date |
---|---|
JP2018139443A (en) | 2018-09-06 |
JP6614275B2 (en) | 2019-12-04 |
Family ID: 63451074
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2018091095A Expired - Fee Related JP6614275B2 (en) | 2018-05-10 | 2018-05-10 | Receiving device, receiving method, transmitting device, and transmitting method |
Country Status (1)
Country | Link |
---|---|
JP (1) | JP6614275B2 (en) |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1917808A1 (en) * | 2005-08-26 | 2008-05-07 | Thomson Licensing | Trick play using temporal layering |
BRPI0918619A2 (en) * | 2008-09-17 | 2019-09-03 | Sharp Kk | scalable video stream decoder and scalable video stream generator |
Also Published As
Publication number | Publication date |
---|---|
JP2018139443A (en) | 2018-09-06 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | A521 | Request for written amendment filed | Free format text: JAPANESE INTERMEDIATE CODE: A523; Effective date: 20180605 |
 | A621 | Written request for application examination | Free format text: JAPANESE INTERMEDIATE CODE: A621; Effective date: 20180605 |
 | A977 | Report on retrieval | Free format text: JAPANESE INTERMEDIATE CODE: A971007; Effective date: 20190320 |
 | A131 | Notification of reasons for refusal | Free format text: JAPANESE INTERMEDIATE CODE: A131; Effective date: 20190402 |
 | A521 | Request for written amendment filed | Free format text: JAPANESE INTERMEDIATE CODE: A523; Effective date: 20190510 |
 | TRDD | Decision of grant or rejection written | |
 | A01 | Written decision to grant a patent or to grant a registration (utility model) | Free format text: JAPANESE INTERMEDIATE CODE: A01; Effective date: 20191008 |
 | A61 | First payment of annual fees (during grant procedure) | Free format text: JAPANESE INTERMEDIATE CODE: A61; Effective date: 20191021 |
 | R151 | Written notification of patent or utility model registration | Ref document number: 6614275; Country of ref document: JP; Free format text: JAPANESE INTERMEDIATE CODE: R151 |
 | LAPS | Cancellation because of no payment of annual fees | |