DESCRIPTION
VIDEO STREAM ENCODING DEVICE AND METHOD, AND PICTURE CONVERSION PROCESSING UNIT
Technical Field
The present invention relates to a video stream encoding device and a picture conversion processing unit operable to convert an original video stream to a reduced video stream with smaller number of pixels and to encode the reduced video stream at a low bit rate.
Background Art
A video stream encoding device has been proposed which is operable to convert an interlaced-scan video stream, through resolution conversion, into a reduced-size video stream with a smaller number of pixels than the original interlaced-scan video stream, and subsequently to encode the reduced-size video stream (for example, refer to Document 1: published Japanese Patent Application Laid-Open No. H06-189278).
Fig. 13 is a block diagram of a conventional video stream encoding device which is constructed based on the art disclosed by Document 1 using the knowledge of a person skilled in the art. The video stream encoding device shown in Fig. 13 comprises a video input unit 1, a resolution conversion unit 2, a frame memory 3, and an encoding processing unit 4. In the video stream encoding device, an interlaced-scan video stream is inputted to the video input unit 1 from an input terminal 5, and the resolution conversion unit 2 decreases the resolution of the interlaced-scan video stream, thereby converting it into a reduced-size video stream which possesses a lower resolution than the interlaced-scan video stream. The reduced-size video stream is temporarily stored in the frame memory 3. The reduced-size video stream is
read out from the frame memory 3, encoded by the encoding processing unit 4, and outputted to an output terminal 6 as encoded data.
The interlaced-scan video stream to be inputted into the input terminal 5 is composed of pictures of, for example, 704 pixels x 240 lines at 60 fields per second. The interlaced-scan video stream is horizontally reduced to half by the resolution conversion unit 2, and thereby converted into the reduced-size video stream (an encoding target) composed of pictures of 352 pixels x 240 lines. The reduced-size video stream is then encoded by the encoding processing unit 4.
The simplest resolution conversion that the resolution conversion unit 2 may perform is, for example, to convert a video stream composed of pictures of 704 pixels x 240 lines at 60 fields per second into a video stream composed of pictures of 352 pixels x 240 lines at 30 frames per second, by taking out only the odd fields of the pictures and reducing the horizontal size of the pictures to half.
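For illustration only, this simple conversion may be sketched as follows; the function name and the choice of averaging adjacent pixel pairs for the horizontal reduction are assumptions of this sketch and are not taken from Document 1.

```python
def simple_field_conversion(fields):
    """Convert a 60-field/s interlaced stream of 704 x 240 fields into a
    30-frame/s stream of 352 x 240 pictures by keeping only the odd fields
    and halving each line horizontally (here by averaging adjacent pixel
    pairs, which is only one possible choice of reduction filter)."""
    frames = []
    for index, field in enumerate(fields):      # field: 240 lines of 704 luma samples
        if index % 2 == 0:                      # keep every other field (the odd fields)
            frames.append([[(line[x] + line[x + 1]) // 2
                            for x in range(0, len(line), 2)]
                           for line in field])
    return frames
```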
However, when the reduced-size video stream (the encoding target) is generated by performing resolution conversion on the field pictures according to the above-mentioned conventional structure and method, serrated edges often appear at the outline of objects in the reduced-size video stream. Such serrated edges not only make the video stream unpleasant to view, but also increase high frequency components in encoding, leading to an increased amount of encoded data. Especially when the reduced-size video stream is encoded at a low bit rate, the increased portion of the encoded data occupies a large part of the total amount of the encoded data, which causes a serious problem in encoding.
Disclosure of the Invention
In view of the above, an object of the present invention is to provide a video stream encoding device operable, when a video stream is reduced in size and encoded at a low bit rate, to reduce high frequency components which are included in the reduced video stream but have little influence on its picture quality, thereby suppressing a possible degradation in picture quality of the reduced video stream when decoded.
A first aspect of the present invention provides a video stream encoding device comprising: an input unit operable to input a video stream including a time series of pictures; a memory operable to store a picture constituting a part of the video stream and currently inputted to the input unit as a current picture, and to store a picture constituting a part of the video stream and inputted to the input unit previous to the current picture as a previous picture; a picture conversion processing unit operable to generate a converted picture from the current picture inputted to the input unit and the previous picture stored in the memory; a size conversion unit operable, utilizing the converted picture, to generate a reduced video stream, each picture of the reduced video stream possessing a smaller number of pixels than each picture of the video stream inputted to the input unit; and an encoding processing unit operable to encode the reduced video stream, thereby outputting an encoded, reduced video stream. According to the structure, an inputted video stream can be efficiently converted into a reduced-size video stream, each picture of which possesses a smaller number of pixels than each picture of the inputted video stream. Furthermore, the reduced-size video stream can be encoded efficiently.
A second aspect of the present invention provides the video stream encoding device as defined in the first aspect, wherein the picture conversion processing unit comprises: a pixel data conversion unit operable to convert pixel data of pixels included in the previous picture stored in the memory into a plurality of converted pixel data; a motion judging unit operable to detect motion of the current picture of the video stream inputted to the input unit and to compare the detected motion with a predetermined threshold, thereby outputting a motion judgment result; and a pixel data selection unit operable to select one of the plurality of the converted pixel data based on the motion judgment result, thereby generating the converted picture.
According to the structure, a motion of a picture of the inputted video stream can be detected per pixel, and the detected motion can be used to select the most suitable pixel conversion, thereby generating the reduced-size, converted picture with a smaller number of pixels.
A third aspect of the present invention provides the video stream encoding device as defined in the second aspect, wherein the pixel data conversion unit comprises: an inter field conversion unit operable to generate, as first converted pixel data, pixel data of a new pixel from pixel data of a plurality of pixels belonging to a plurality of fields of the previous picture stored in the memory; and an intra field conversion unit operable to generate, as second converted pixel data, pixel data of a new pixel from pixel data of a plurality of pixels belonging to a single field of the previous picture stored in the memory. The pixel data selection unit selects either the first converted pixel data or the second converted pixel data based on the motion judgment result, thereby generating the converted picture.
A fourth aspect of the present invention provides the video stream encoding device as defined in the third aspect, wherein the video stream is inputted in an interlaced-scan format. The inter field conversion unit generates, as the first converted pixel data, pixel data of pixels to be included in a new picture of a progressive-scan format, from pixel data of a plurality of pixels belonging to a plurality of fields of the interlaced-scan video stream. The intra field conversion unit generates, as the second converted pixel data, pixel data of pixels to be included in a new picture of a progressive-scan format, from pixel data of a plurality of pixels belonging to a single field of the interlaced-scan video stream.
According to these structures, a motion of a picture of the inputted interlaced-scan video stream can be detected and the detected motion can be used to select the more suitable pixel conversion between the inter field conversion and the intra field conversion, thereby generating the reduced-size, progressive-scan picture with a
smaller number of pixels.
A fifth aspect of the present invention provides the video stream encoding device as defined in the third aspect, wherein the pixel data conversion unit comprises a product-sum operation unit operable to perform a product-sum operation. The product-sum operation unit multiplies pixel data of a plurality of pixels belonging to a plurality of fields of the previous picture stored in the memory by predetermined coefficients and sums up the coefficient-multiplied pixel data, thereby outputting the first converted pixel data. The product-sum operation unit also multiplies pixel data of a plurality of pixels belonging to a single field of the previous picture stored in the memory by predetermined coefficients and sums up the coefficient-multiplied pixel data, thereby outputting the second converted pixel data.
According to the structure, when the inputted video stream is converted into the reduced-size video stream with a smaller number of pixels, pixel data of a pixel to be newly generated by the conversion can be calculated by a simple product-sum operation in which pixel data of a plurality of pixels selected from field pictures of the inputted video stream are multiplied by predetermined coefficients and the products are summed up. Furthermore, a more suitable converted, reduced-size video stream can be acquired simply by suitably changing the predetermined coefficients.
A sixth aspect of the present invention provides the video stream encoding device as defined in the third aspect, wherein the motion judging unit detects motion per pixel of the current picture of the video stream inputted to the input unit, using pixel data of pixels of the current picture and pixel data of the corresponding pixels of the previous picture stored in the memory, and compares the detected motion with a predetermined threshold, thereby outputting the motion judgment result. The pixel data selection unit selects the first converted pixel data to generate the converted picture when the motion judgment result indicates that the detected motion is smaller than the predetermined threshold, and the pixel data selection unit selects the second converted
pixel data to generate the converted picture when the motion judgment result indicates that the detected motion is greater than or equal to the predetermined threshold.
According to the structure, the motion of a picture of the inputted video stream can be detected per pixel, and the more suitable converted pixel data can be selected from the two kinds of converted pixel data based on the detected motion. Therefore, fine and minute picture conversion can be realized.
A seventh aspect of the present invention provides the video stream encoding device as defined in the first aspect, wherein the picture conversion processing unit comprises: a pixel data conversion unit including a product-sum operation unit operable to perform a product-sum operation by multiplying pixel data of a plurality of pixels belonging to a plurality of fields of the previous picture stored in the memory by coefficients and summing up the coefficient-multiplied pixel data, thereby generating the converted picture; and a motion judging unit operable to detect motion per pixel of the current picture of the video stream inputted to the input unit and to compare the detected motion with a predetermined threshold, thereby outputting a motion judgment result per pixel. Based on the motion judgment result per pixel, the pixel data conversion unit determines the coefficients that the product-sum operation unit employs in the product-sum operation.
According to the structure, it is possible to detect a motion of a picture of the inputted video stream, and based on the detected motion, to determine the coefficients to be used for the product-sum operation, thereby obtaining new converted pixel data. Therefore, different sets of converted pixel data corresponding to the amounts of the motion can be obtained by the single product-sum operation unit.
An eighth aspect of the present invention provides the video stream encoding device as defined in the second aspect, wherein the video stream includes a time series of pictures composed of YCbCr components, and the motion judging unit performs motion detection using Y-component data of each pixel.
A ninth aspect of the present invention provides the video stream encoding device as defined in the second aspect, wherein the video stream includes a time series of pictures composed of RGB components, and the motion judging unit performs motion detection using G-component data of each pixel.
According to these structures, irrespective of whether the inputted video stream is composed of YCbCr components or RGB components, a reduced-size video stream with a smaller number of pixels can always be suitably acquired.
A tenth aspect of the present invention provides a picture conversion processing unit comprising: a motion judging unit operable to input a current picture and a previous picture both included in a video stream, to detect motion per pixel of the current picture in comparison with the previous picture, and subsequently to compare the detected motion with a predetermined threshold, thereby outputting a motion judgment result; a pixel data conversion unit operable to convert pixel data of pixels belonging to the previous picture into a plurality of converted pixel data; and a pixel data selection unit operable to select one of the plurality of the converted pixel data based on the motion judgment result, thereby generating the converted picture.
According to the structure, it is possible to provide a picture conversion processing unit that can detect a per-pixel motion of a picture of the inputted video stream and select the more suitable converted pixel data based on the detected motion.
An eleventh aspect of the present invention provides the picture conversion processing unit as defined in the tenth aspect, further comprising: a memory unit operable to store the current picture and the previous picture both included in the video stream, wherein the pixel data conversion unit converts pixel data of pixels belonging to the previous picture stored in the memory unit into the plurality of the converted pixel data.
According to the structure, it is possible to provide a picture conversion processing unit which possesses a memory operable to store picture data necessary for picture conversion. The picture conversion processing unit can detect a per-pixel motion of a picture of the inputted video stream and select the more suitable converted pixel data based on the detected motion.
The above, and other objects, features and advantages of the present invention will become apparent from the following description read in conjunction with the accompanying drawings, in which like reference numerals designate the same elements.
Brief Description of the Drawings
Fig. 1 is a block diagram of a video stream encoding device in Embodiment 1 of the present invention; Fig. 2 is a block diagram of a video stream encoding device in Embodiment 2 of the present invention;
Fig. 3 is a block diagram of a picture conversion processing unit in Embodiment 3 of the present invention;
Fig. 4 is a block diagram of a chroma component conversion processing unit in Embodiment 3 of the present invention;
Fig. 5 is a block diagram of a picture conversion processing unit in Embodiment 4 of the present invention;
Fig. 6 is a block diagram of a video stream encoding device in Embodiment 5 of the present invention; Fig. 7 is an explanatory drawing illustrating line and pixel arrangement on field pictures in Embodiment 3 of the present invention;
Fig. 8 is an explanatory drawing illustrating arrangement of pixels to be used in product-sum operation in Embodiment 3 of the present invention;
Fig. 9 is an explanatory drawing illustrating product-sum operation in Embodiment 3 of the present invention;
Fig. 10 is an explanatory drawing illustrating product-sum operation for a still video mode in Embodiment 3 of the present invention;
Fig. 11 is an explanatory drawing illustrating product-sum operation for a quasi-moving video mode in Embodiment 3 of the present invention;
Fig. 12 is an explanatory drawing illustrating product-sum operation for a moving video mode in Embodiment 3 of the present invention; and Fig. 13 is a block diagram of the conventional video stream encoding device.
Best Mode for Carrying out the Invention
Next, Embodiments of the present invention are explained referring to drawings.
In the following explanation of the present invention, it is defined that a "video stream" is composed of a series of pictures arranged in time and that a "picture" is a collective term for a field or a frame. It is further defined that the "size of a picture" means the "number of pixels of a picture" and is given by the product of the number of pixels per line in the horizontal direction of the picture and the number of lines in the vertical direction of the picture. In the present invention, the "size of a video stream" does not mean the physical size of a display screen, the storage capacity occupied when the video stream is stored in a recording medium, or the total transmission quantity when the video stream is transmitted. Therefore, for example, a "reduced-size video stream" means a "video stream in which the number of pixels of a picture is reduced" in comparison with the original video stream.
Embodiment 1
Fig. 1 is a block diagram of a video stream encoding device in Embodiment 1 of the present invention.
The video stream encoding device of the present embodiment comprises a video input unit 10, a memory 20, a picture conversion processing unit 100, a size conversion unit 30, and an encoding processing unit 40. The picture conversion processing unit 100 includes a pixel data conversion unit 110, a pixel data selection unit
120, and a motion judging unit 130.
In the following, operation of the video stream encoding device of the present embodiment is explained, illustrating an example in which a video stream inputted in an interlaced-scan format is converted into a progressive-scan video stream, and the converted progressive-scan video stream is reduced in size and then encoded. The video stream is inputted from an input terminal 80 to the video input unit 10. The inputted video stream is sent to the motion judging unit 130 of the picture conversion processing unit 100, and at the same time, stored in the memory 20.
The motion judging unit 130 detects a motion of a picture of the inputted video stream per pixel by comparing the picture of the current field inputted from the video input unit 10 with the picture of the previous-but-one field stored in the memory 20. The motion judging unit 130 then compares the detected motion with predetermined thresholds, and judges which mode the comparison result belongs to: a "moving video mode" where the motion is intense, a "still video mode" where almost no motion exists, or a "quasi-moving video mode" where the motion is between the moving video mode and the still video mode, and outputs the judgment result to the pixel data selection unit 120.
The pixel data conversion unit 110 generates, from the video stream stored in the memory 20, a plurality of pixel data necessary for obtaining the converted picture, which is a reduced-size picture of the inputted video stream, and outputs the plurality of pixel data generated to the pixel data selection unit 120. At this time, the plurality of pixel data includes pixel data generated from only the pixel data of pixels of a single field of the video stream stored in the memory 20, and pixel data generated from the pixel data of pixels of a field and pixels of a previous field to the field.
The pixel data selection unit 120 selects the most suitable pixel data from the plurality of pixel data which the pixel data conversion unit 110 has outputted, based on the judgment result of the motion judging unit 130. Then, the pixel data selection unit 120 generates the converted picture in a progressive-scan format, and outputs the converted
picture to the size conversion unit 30. For example, when the judgment result of motion detection is for the still video mode, the pixel data selection unit 120 selects the pixel data that is generated using the pixel data of a plurality of pixels of the previous field and the pixel data of a plurality of pixels of the previous-but-one field. When the judgment result of motion detection is for the moving video mode, the pixel data selection unit 120 selects the pixel data that is generated using the pixel data of a plurality of pixels of only the previous field.
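For illustration only, the per-pixel judgment and selection described above may be sketched as follows; the function and parameter names, and the use of a single threshold, are assumptions of this sketch rather than part of the embodiment.

```python
def convert_pixel(current_value, prev_but_one_value, threshold,
                  inter_field_value, intra_field_value):
    """Sketch of one pixel of the picture conversion processing unit 100:
    the motion judging unit 130 measures motion as the absolute difference
    between the current field and the previous-but-one field, and the pixel
    data selection unit 120 picks the inter-field result (previous and
    previous-but-one fields) for still pixels and the intra-field result
    (previous field only) for moving pixels."""
    motion = abs(current_value - prev_but_one_value)  # motion detected per pixel
    if motion < threshold:                            # still video mode
        return inter_field_value
    return intra_field_value                          # moving video mode
```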
The size conversion unit 30 converts the size of the progressive-scan format converted picture, which the picture conversion processing unit 100 has converted from a picture of the video stream in the interlaced-scan format. For example, a progressive-scan video stream of 704 pixels x 240 lines is reduced to half in the horizontal direction, thereby generating a reduced, progressive-scan video stream of 352 pixels x 240 lines.
The encoding processing unit 40 encodes the reduced, progressive-scan video stream at a low bit rate, and outputs the encoded result to an output terminal 90 as encoded data.
As explained above, according to the video stream encoding device of the present embodiment, in consideration of the motion of a picture of the video stream, the interlaced-scan video stream can be converted into the progressive-scan video stream, and furthermore, the progressive-scan video stream can be reduced in size and encoded at the low bit rate. Thereby, the amount of encoded data for the encoded video stream of reduced size can be satisfactorily suppressed, and at the same time, when the encoded video stream is decoded, the appearance of unpleasantly serrated edges at the outline of objects within a picture can be prevented.
Embodiment 2
Fig. 2 is a block diagram of a video stream encoding device in Embodiment 2 of the present invention. The same symbols are given to elements each having the same
function as elements of Fig. 1 in order to omit explanation.
The video stream encoding device of the present embodiment shown in Fig. 2 comprises the video input unit 10, a picture conversion processing unit 200, the size conversion unit 30, the encoding processing unit 40, and a second frame memory 50. The picture conversion processing unit 200 includes a pixel data conversion unit 210, a selector 220, the motion judging unit 130, and a first frame memory 230. The selector 220 corresponds to the pixel data selection unit 120 shown in Fig. 1.
The features of the video stream encoding device of the present embodiment lie in the fact that the memory 20 of Embodiment 1 of the present invention shown in Fig. 1 is installed inside the picture conversion processing unit 200 as the first frame memory 230 in the present embodiment, and also in the fact that the second frame memory 50 is installed as a memory that temporarily stores the picture data of the reduced video stream outputted by the size conversion unit 30 after size conversion. Furthermore, in the video stream encoding device of the present embodiment, the pixel data conversion unit 210 includes a first inter field conversion unit 211 operable to perform pixel data conversion for a case where the judgment result of the motion detection of a picture of the inputted video stream is for the still video mode, a second inter field conversion unit 212 operable to perform pixel data conversion for a case where the judgment result is for the quasi-moving video mode, and an intra field conversion unit 213 operable to perform pixel data conversion for a case where the judgment result is for the moving video mode.
Next, the outline of operation of the video stream encoding device of the present embodiment is explained. The interlaced-scan video stream inputted into the video input unit 10 is fed to the picture conversion processing unit 200, more specifically fed to the motion judging unit 130 and stored in the first frame memory 230 at the same time.
The motion judging unit 130 detects the motion of a picture of the inputted video stream per pixel by comparing a picture of the current field inputted from the video input unit 10 with a picture of the previous-but-one field stored in the first frame memory 230. The motion judging unit 130 then judges the detected motion by comparing it with the predetermined thresholds, and classifies the detected motion into the three modes: the moving video mode where the motion is intense, the still video mode where almost no motion exists, and the quasi-moving video mode where the motion is between the moving video mode and the still video mode. The judgment result is outputted to the selector 220. In the pixel data conversion unit 210, the first inter field conversion unit 211 and the second inter field conversion unit 212 perform inter field conversion using the pixel data of pixels of the previous field and the pixel data of pixels of the previous-but-one field, both of which are stored in the first frame memory 230, and then output the first converted pixel data and the second converted pixel data, respectively. The intra field conversion unit 213 performs intra field conversion using the pixel data of pixels of the previous field stored in the first frame memory 230, and then outputs the third converted pixel data.
Based on the judgment result of the motion detection of the motion judging unit 130, the selector 220 selects one of the three kinds of the converted pixel data which the pixel data conversion unit 210 has outputted. Thus, when the judgment result of the motion detection is for the still video mode, the selector 220 selects the first converted pixel data which the first inter field conversion unit 211 has outputted. When the judgment result of the motion detection is for the quasi-moving video mode, the selector 220 selects the second converted pixel data which the second inter field conversion unit 212 has outputted. When the judgment result of the motion detection is for the moving video mode, the selector 220 selects the third converted pixel data which the intra field conversion unit 213 has outputted. The selector 220 thereby generates a converted
picture using the selected, converted pixel data.
The size conversion unit 30 reduces the size of the converted picture of the progressive-scan video stream which the picture conversion processing unit 200 has outputted, thereby generating a reduced video stream. The size conversion unit 30 temporarily stores the reduced video stream in the second frame memory 50.
The encoding processing unit 40 encodes the reduced video stream stored in the second frame memory 50 at a low bit rate, and outputs the encoded result to the output terminal 90 as encoded data.
An application example of the picture conversion processing unit 200 of the present embodiment is explained in detail in Embodiment 3 of the present invention in the following.
As explained above, according to the video stream encoding device of the present embodiment, similar to the video stream encoding device of Embodiment 1 of the present invention, the interlaced-scan video stream can be converted into the progressive-scan video stream, in consideration of the motion of a picture of the video stream, and furthermore, the progressive-scan video stream can be reduced in size and encoded at the low bit rate. Thereby, the amount of encoded data for the encoded video stream of reduced size can be satisfactorily suppressed, and at the same time, when the encoded video stream is decoded, the appearance of unpleasantly serrated edges at the outline of objects within a picture can be prevented.
Embodiment 3
Fig. 3 is a block diagram of a picture conversion processing unit in Embodiment 3 of the present invention. The picture conversion processing unit of the present embodiment is a more detailed application example of the picture conversion processing unit 200 of Embodiment 2 of the present invention.
The picture conversion processing unit of the present embodiment comprises a frame memory 410, a line memory group 420, a motion judging unit 430, a product-sum
operation unit 440, and a selector 450. The product-sum operation unit 440 includes a first-mode product-sum operation unit 441, a second-mode product-sum operation unit 442, and a third-mode product-sum operation unit 443. The first-mode product-sum operation unit 441 corresponds to the first inter field conversion unit 211 of Fig. 2. The second-mode product-sum operation unit 442 corresponds to the second inter field conversion unit 212 of Fig. 2. The third-mode product-sum operation unit 443 corresponds to the intra field conversion unit 213 of Fig. 2.
In the following, an example is explained where an interlaced-scan video stream composed of YCbCr components is inputted into the picture conversion processing unit of the present embodiment and converted into a progressive-scan video stream.
Y-component data of a current field (the (2n+1)th field) of the inputted video stream is supplied to an input terminal 480. Moreover, the frame memory 410 stores Y-component data of a previous field (the (2n)th field) and Y-component data of a previous-but-one field (the (2n-1)th field), as well as the Y-component data of the current field.
The line memory group 420 includes line memories a-h. The line memory h stores the Y-component data of one line of the (2n+1)th field that is supplied to the input terminal 480. The line memory g stores the Y-component data of one line of the (2n)th field that is read from the frame memory 410. Every time processing for one line is completed, the Y-component data of one line of the (2n)th field stored in the line memory g is transferred to the line memory f, the line memory e, and the line memory d one after another. The line memory c stores the Y-component data of one line of the (2n-1)th field that is read out from the frame memory 410. Every time processing for one line is completed, the Y-component data of one line of the (2n-1)th field stored in the line memory c is transferred to the line memory b and the line memory a one after another.
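A rough software analogue of this shifting arrangement is sketched below; the Python deques merely stand in for the hardware line memories, and the names are chosen for this sketch only.

```python
from collections import deque

# Line memories g, f, e, d hold four consecutive lines of the (2n)th field;
# line memories c, b, a hold three consecutive lines of the (2n-1)th field;
# line memory h holds the (2n+1)th-field line currently being inputted.
lines_2n = deque(maxlen=4)          # index 0 = line memory g, index 3 = line memory d
lines_2n_minus_1 = deque(maxlen=3)  # index 0 = line memory c, index 2 = line memory a
line_memory_h = None

def advance_one_line(new_line_2n_plus_1, new_line_2n, new_line_2n_minus_1):
    """Each time processing of one line is completed, newly read lines enter
    line memories h, g and c, and the older lines shift toward d and a."""
    global line_memory_h
    line_memory_h = new_line_2n_plus_1
    lines_2n.appendleft(new_line_2n)                    # g -> f -> e -> d
    lines_2n_minus_1.appendleft(new_line_2n_minus_1)    # c -> b -> a
```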
The Y-component data stored in line memories a-h of the line memory group 420 are taken out from the top of each line one by one, and are sent to the motion judging unit 430 and the product-sum operation unit 440. The motion judging unit 430 includes a difference calculation unit 433 that detects a motion per pixel, a first register 431 that stores a threshold A for classifying motions, a second register 432 that stores a threshold B for the same, and a comparator 434 that classifies the motion.
The first-mode product-sum operation unit 441 generates the first converted pixel data corresponding to the still video mode. The second-mode product-sum operation unit 442 generates the second converted pixel data corresponding to the quasi-moving video mode. The third-mode product-sum operation unit 443 generates the third converted pixel data corresponding to the moving video mode.
Based on the judgment result of the motion detection by the motion judging unit 430, the selector 450 selects either one of the first converted pixel data, the second converted pixel data, and the third converted pixel data, and outputs the selected result to an output terminal 490 as converted pixel data Yt of a target pixel.
In the following, operation of the picture conversion processing unit of the present embodiment is explained, referring to Figs. 7-12. Fig. 7 is an explanatory drawing illustrating line and pixel arrangement on field pictures in Embodiment 3 of the present invention. Fig. 7 shows the line arrangements and pixel arrangements of each of the (2n-1)th field (a previous-but-one field), the (2n)th field (a previous field), and the (2n+1)th field (a current field). It is assumed that the (2n-1)th field and the (2n+1)th field consist of lines of odd numbers and the (2n)th field consists of lines of even numbers.
Each line memory of the line memory group 420 of Fig. 3 stores Y-component data of pixels as follows:
the line memory a stores Y-component data of pixels of the (2m-1)th line of the (2n-1)th field;
the line memory b stores Y-component data of pixels of the (2m+1)th line of the (2n-1)th field;
the line memory c stores Y-component data of pixels of the (2m+3)th line of the (2n-1)th field;
the line memory d stores Y-component data of pixels of the (2m-2)th line of the (2n)th field;
the line memory e stores Y-component data of pixels of the (2m)th line of the (2n)th field;
the line memory f stores Y-component data of pixels of the (2m+2)th line of the (2n)th field;
the line memory g stores Y-component data of pixels of the (2m+4)th line of the (2n)th field; and
the line memory h stores Y-component data of pixels of the (2m+1)th line of the (2n+1)th field.
Now, it is assumed that the second pixel of each line is due to be processed.
Accordingly, the difference calculation unit 433 takes in the following pixel data: the pixel data Yb of the second pixel of the (2m+1)th line of the (2n-1)th field from the line memory b, and the pixel data Yh of the second pixel of the (2m+1)th line of the (2n+1)th field from the line memory h.
The first-mode product-sum operation unit 441 and the second-mode product-sum operation unit 442 take in the following pixel data: pixel data Ya of the second pixel of the (2m-1)th line of the (2n-1)th field from the line memory a; pixel data Yb of the second pixel of the (2m+1)th line of the (2n-1)th field from the line memory b; pixel data Yc of the second pixel of the (2m+3)th line of the (2n-1)th field from the line memory c; pixel data Yd of the second pixel of the (2m-2)th line of the (2n)th field from the line memory d; pixel data Ye of the second pixel of the (2m)th line of the (2n)th field from the line memory e; pixel data Yf of the second pixel of the (2m+2)th line of the (2n)th field from the line memory f; and pixel data Yg of the second pixel of the (2m+4)th line of the (2n)th field from the line memory g.
The third-mode product-sum operation unit 443 takes in the following pixel data: the pixel data Yd of the second pixel of the (2m-2)th line of the (2n)th field from the line memory d; the pixel data Ye of the second pixel of the (2m)th line of the (2n)th field from the line memory e; the pixel data Yf of the second pixel of the (2m+2)th line of the (2n)th field from the line memory f; and the pixel data Yg of the second pixel of the (2m+4)th line of the (2n)th field from the line memory g.
Fig. 8 is an explanatory drawing illustrating arrangement of pixels to be used in product-sum operation in Embodiment 3 of the present invention. That is to say, Fig. 8 shows the arrangement of pixels of each line in each field that are to be processed by the difference calculation unit 433 and the product-sum operation unit 440, at the time when picture conversion processing is performed for the second pixel of each line. Pixels on a segment p0-q0, a segment p1-q1, and a segment p2-q2 shown in Fig. 8 correspond respectively to pixels on the segment p0-q0 of the (2n-1)th field, the segment p1-q1 of the (2n)th field, and the segment p2-q2 of the (2n+1)th field shown in Fig. 7.
It is assumed that the pixel data of the target pixel t on the (2m+1)th line of a picture of the progressive-scan video stream is now calculated referring to Fig. 8. At this time, the pixels a-h of a picture of the interlaced-scan video stream in a reference area 600 are referred to.
The difference calculation unit 433 determines an absolute difference value |Yb-Yh| of the pixel data Yb of the pixel b and the pixel data Yh of the pixel h, and feeds the value to the comparator 434. The comparator 434 compares the inputted absolute difference value |Yb-Yh| with the threshold A stored in the first register 431 and the threshold B stored in the second register 432, and judges a motion of the target pixel t.
In other words, assuming that the threshold A < the threshold B, the comparator 434 judges as follows: when the absolute difference value |Yb-Yh| satisfies Equation (1), the motion of the target pixel t is judged to be in the still video mode; when the absolute difference value |Yb-Yh| satisfies Equation (2), the motion of the target pixel t is judged to be in the quasi-moving video mode; and when the absolute difference value |Yb-Yh| satisfies Equation (3), the motion of the target pixel t is judged to be in the moving video mode. Then, the judgment result of the comparator 434 is sent to the selector 450.
|Yb-Yh| < A (1)
A =< |Yb-Yh| < B (2)
B =< |Yb-Yh| (3)
Here, a symbol "=<" denotes that the value of the left side of the equation is equal to or less than the value of the right side of the equation. The selector 450 selects the first converted pixel data (an operation result of the first-mode product-sum operation unit 441) corresponding to the still video mode when the judgment result is for the still video mode. The selector 450 selects the second
converted pixel data (an operation result of the second-mode product-sum operation unit 442) corresponding to the quasi-moving video mode when the judgment result is for the quasi-moving video mode. The selector 450 selects the third converted pixel data (an operation result of the third-mode product-sum operation unit 443) corresponding to the moving video mode when the judgment result is for the moving video mode.
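For illustration, the judgment of Equations (1) through (3) and the selection just described may be sketched as follows; the numeric mode labels and the function name are assumptions of this sketch.

```python
STILL, QUASI_MOVING, MOVING = 0, 1, 2   # illustrative mode labels

def judge_motion(yb, yh, threshold_a, threshold_b):
    """Classify the motion of the target pixel t from the absolute difference
    |Yb - Yh| of the pixel data of the pixels b and h, following
    Equations (1) through (3), with threshold A < threshold B."""
    diff = abs(yb - yh)
    if diff < threshold_a:        # Equation (1): still video mode
        return STILL
    if diff < threshold_b:        # Equation (2): quasi-moving video mode
        return QUASI_MOVING
    return MOVING                 # Equation (3): moving video mode
```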
The threshold A and the threshold B are set up so that the converted progressive-scan video stream becomes the most natural regardless of the magnitude of the motion of a picture.
Next, an operation of the product-sum operation unit 440 is explained. Fig. 9 is an explanatory drawing illustrating product-sum operation in
Embodiment 3 of the present invention. The pixel data Yt of the target pixel t is calculated by multiplying each pixel data Ya, Yb, Yc, Yd, Ye, Yf and Yg of the pixels a, b, c, d, e, f and g in the reference area 600 by the weight coefficients Wa, Wb, Wc, Wd, We, Wf and Wg which are predetermined for the respective pixel, and adding each result of the multiplication. Accordingly, the pixel data Yt is generally calculated by Equation (4) shown below.
Yt=Wa*Ya+Wb*Yb+Wc*Yc+Wd*Yd+We*Ye+Wf*Yf+Wg*Yg (4) The first-mode product-sum operation unit 441 which generates the first converted pixel data for the still video mode sets up the weight coefficients as shown in Equations (5-1) through (5-7).
Wa = -2/16 (5-1)
Wb = 4/16 (5-2)
Wc = -2/16 (5-3)
Wd = 1/16 (5-4)
We = 7/16 (5-5)
Wf = 7/16 (5-6)
Wg = 1/16 (5-7)
Fig. 10 is an explanatory drawing illustrating product-sum operation for a still video mode in Embodiment 3 of the present invention. Fig. 10 illustrates that the pixel data Yt of the target pixel t is calculated by multiplying the pixel data of the pixels a through g, except for the pixel h, in the reference area 600 by the respective weight coefficients shown in Equations (5-1) through (5-7) and adding each result of the multiplication. As shown in Fig. 10, in conversion of a video stream with little motion, the pixel data Yt of the target pixel t is calculated as the product-sum of the pixel data of a plurality of pixels located near the target pixel t in the previous field and the previous-but-one field (the (2n)th field and the (2n-1)th field in the example of Fig. 10). Thus, the high frequency components originally possessed by the pixel data Yt of the target pixel t can be reduced by generating the pixel data Yt through interpolation from the pixel data of a plurality of the pixels near the target pixel t. Furthermore, the high frequency components of the pixel data Yt can be preferably reduced within a range which does not cause degradation of picture quality, by optimizing the weight coefficients shown in Equations (5-1) through (5-7). As a result, the amount of the encoded data for the reduced, encoded video stream can be suppressed; furthermore, when the reduced, encoded video stream obtained by converting a video stream with little motion is decoded, conspicuous serrated edges at the outline of objects in a picture can be suppressed.
The second-mode product-sum operation unit 442, which generates the second converted pixel data for the quasi-moving video mode, sets up the weight coefficients as shown in Equations (6-1) through (6-7).
Wa = -2/16 (6-1)
Wb = 4/16 (6-2)
Wc = -2/16 (6-3)
Wd = 2/16 (6-4)
We = 8/16 (6-5)
Wf = 8/16 (6-6)
Wg = 2/16 (6-7)
Fig. 11 is an explanatory drawing illustrating product-sum operation for a quasi-moving video mode in Embodiment 3 of the present invention. Fig. 11 shows that the pixel data Yt of the target pixel t is calculated by multiplying the pixel data of the pixels a through g, except for the pixel h, in the reference area 600 by the respective weight coefficients shown in Equations (6-1) through (6-7) and adding each result of the multiplication. Although the reference area 600 in the quasi-moving video mode is the same as in the still video mode, the respective weight coefficients differ, giving more weight to the pixels d through g in the product-sum operation in the quasi-moving video mode.
The third-mode product-sum operation unit 443 which generates the third converted pixel data of the moving video mode sets up the respective weight coefficients as shown in Equations (7-1) through (7-7).
Wa = 0 (7-1)
Wb = 0 (7-2)
Wc = 0 (7-3)
Wd = 2/16 (7-4)
We = 8/16 (7-5)
Wf = 8/16 (7-6)
Wg = 2/16 (7-7)
Fig. 12 is an explanatory drawing illustrating product-sum operation for a moving video mode in Embodiment 3 of the present invention. It should be noted that when a motion of the target pixel is in the moving video mode, the converted pixel data is generated through interpolation of pixel data of only pixels d, e, f and g near the target pixel t of the same field (the (2n)th field in the example of Fig. 12); thereby, degradation of picture quality can be prevented even in conversion of a video stream
with large motion.
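The product-sum operation of Equation (4) with the three coefficient sets of Equations (5-1) through (7-7) can be summarized in the following sketch; floating-point arithmetic is used here for simplicity, whereas an actual implementation of the operation units 441 through 443 would presumably use fixed-point hardware.

```python
# Mode labels and weight coefficient sets, in the order Wa, Wb, Wc, Wd, We, Wf, Wg.
STILL, QUASI_MOVING, MOVING = 0, 1, 2
WEIGHTS = {
    STILL:        (-2/16, 4/16, -2/16, 1/16, 7/16, 7/16, 1/16),  # Equations (5-1)-(5-7), unit 441
    QUASI_MOVING: (-2/16, 4/16, -2/16, 2/16, 8/16, 8/16, 2/16),  # Equations (6-1)-(6-7), unit 442
    MOVING:       ( 0.0,  0.0,   0.0,  2/16, 8/16, 8/16, 2/16),  # Equations (7-1)-(7-7), unit 443
}

def product_sum(weights, pixels_a_to_g):
    """Equation (4): Yt = Wa*Ya + Wb*Yb + Wc*Yc + Wd*Yd + We*Ye + Wf*Yf + Wg*Yg."""
    return sum(w * y for w, y in zip(weights, pixels_a_to_g))

def converted_luma(pixels_a_to_g, mode):
    """In Fig. 3 the three operation units compute their results and the selector
    450 outputs the one matching the motion judgment; this sketch simply
    evaluates the coefficient set belonging to the judged mode."""
    return product_sum(WEIGHTS[mode], pixels_a_to_g)
```

Combined with the mode judgment sketched after Equations (1) through (3), converted_luma((Ya, Yb, Yc, Yd, Ye, Yf, Yg), mode) then yields the converted pixel data Yt of the target pixel t.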
The above is description regarding conversion of the Y-component data of the pixel. Next, conversion of the chroma components of the pixel is explained.
Fig. 4 is a block diagram of the chroma component conversion processing unit according to Embodiment 3 of the present invention. The chroma component conversion processing unit of the present embodiment comprises a frame memory 410, a color line memory group 510, and a chroma data operation unit 520. The color line memory group 510 includes a color line memory e 512 and a color line memory f 511.
With reference to Fig. 8, operation of the chroma component conversion processing unit of the present embodiment as shown in Fig. 4 is explained. In the following description, chroma components Cb and Cr are collectively expressed as C-component data.
In generating the converted pixel data of the chroma components for the target pixel t of Fig. 8, the following assumption is first made: the C-component data of the (2n)th field is stored in the frame memory 410. The C-component data of the (2m+2)th line is read from the frame memory 410 and stored in the color line memory f 511, and the C-component data of the (2m)th line is transferred from the color line memory f 511 and stored in the color line memory e 512. In generating a chroma component Ct of the target pixel t of Fig. 8, a chroma component Ce of the pixel e is read from the color line memory e 512, and a chroma component Cf of the pixel f is read from the color line memory f 511. These chroma components are sent to the chroma data operation unit 520. The chroma data operation unit 520 performs the operation of Equation (8) shown below, calculates the chroma component Ct of the target pixel t, and outputs the result to an output terminal 590.
Ct = (Ce + Cf) / 2 (8)
The conversion processing of the chroma components described above is performed in the conversion from an interlaced-scan video stream to a progressive-scan
video stream. In the conversion processing of the chroma components, it is not necessary to take into consideration the motion of a picture of the video stream, which has been taken into consideration in the conversion of the Y-component data as described above. As described above, the interlaced-scan video stream comprising YCbCr components can be completely converted to the progressive-scan video stream by using the picture conversion processing unit of Fig. 3 and the chroma component conversion processing unit of Fig. 4.
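For illustration only, the averaging of Equation (8) applied along one output line may be sketched as follows; the function name and the use of floating-point averaging are assumptions of this sketch.

```python
def convert_chroma_line(line_e, line_f):
    """Equation (8): for each pixel position, the chroma component Ct of the
    target pixel is the average of the chroma components Ce and Cf held in the
    color line memory e 512 and the color line memory f 511."""
    return [(ce + cf) / 2 for ce, cf in zip(line_e, line_f)]
```

The same calculation applies to both the Cb and the Cr components, and, as noted above, no motion judgment is involved.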
In addition, in performing conversion to the pixel data Yt using the equation (4), it is preferable to set up the weight coefficients to satisfy the following Equation (9) as in the case of Equations (5-1) through (5-7). However, it is not necessary for the weight coefficients to be set up so as to satisfy Equation (9), as in the case of Equations (6-1) through (6-7) or Equations (7-1) through (7-7).
Wa+Wb+Wc+Wd+We+Wf+Wg = 1 (9)
Embodiment 4
Fig. 5 is a block diagram of a picture conversion processing unit according to Embodiment 4 of the present invention. In Fig. 5, the same symbols are given to the same components as in Fig. 3, and their description is omitted.
The picture conversion processing unit of the present embodiment comprises the frame memory 410, the line memory group 420, the motion judging unit 430, the product-sum operation unit 440, a coefficient register group 460, and a selector 470.
The coefficient register group 460 includes a first-mode register 461, a second-mode register 462, and a third-mode register 463.
The picture conversion processing unit of the present embodiment is realized with a simpler structure than the picture conversion processing unit of Embodiment 3 of the present invention. That is, in the picture conversion processing unit of the present embodiment, sets of coefficients to be used in the product-sum operation, performed by
the product-sum operation unit 440, are beforehand stored, set by set, in three registers of the coefficient register group 460, corresponding to classification of the motion of the target pixel. The selector 470 selects one of the sets of coefficients from the coefficient register group 460, according to the judgment result of the motion of the target pixel outputted from the motion judging unit 430, and sends the set of coefficients to the product-sum operation unit 440. The product-sum operation unit 440 performs a product-sum operation using the set of coefficients sent from the selector 470, generates and outputs converted pixel data.
In the following, operation of the picture conversion processing unit of the present embodiment is explained in more detail, regarding a case where picture conversion is performed to the target pixel t that is shown in Fig. 8 using the pixel data of a plurality of pixels in the reference area 600 that is close to the target pixel t.
The motion judging unit 430 detects the motion of the target pixel t using the pixel data Yb of the pixel b, and the pixel data Yh of the pixel h, and outputs the judgment result classified into either one of the still video mode, the quasi-moving video mode, and the moving video mode according to the magnitude of the motion. This operation is the same as the operation in Embodiment 3 of the present invention.
As in the case of the first-mode product-sum operation unit 441 of Embodiment 3 of the present invention, the following pixel data are fed to the product-sum operation unit 440 of the present embodiment: the pixel data Ya of the pixel a from the line memory a, the pixel data Yb of the pixel b from the line memory b, the pixel data Yc of the pixel c from the line memory c, the pixel data Yd of the pixel d from the line memory d, the pixel data Ye of the pixel e from the line memory e, the pixel data Yf of the pixel f from the line memory f, and the pixel data Yg of the pixel g from the line memory g.
The product-sum operation unit 440 performs an operation of the equation (4), using the set of coefficients selected by the selector 470.
The sets of coefficients Wa through Wg appearing in Equation (4), with which the product-sum operation unit 440 performs the operation, are stored in the coefficient register group 460. That is to say, a first set of coefficients Wa through Wg shown in Equations (5-1) through (5-7) for performing the operation in the still video mode is stored in the first-mode register 461. A second set of coefficients Wa through Wg shown in Equations (6-1) through (6-7) for performing the operation in the quasi-moving video mode is stored in the second-mode register 462. A third set of coefficients Wa through Wg shown in Equations (7-1) through (7-7) for performing the operation in the moving video mode is stored in the third-mode register 463.
The selector 470 selects the first set of coefficients stored in the first-mode register 461 when the judgment result of the motion of the target pixel outputted from the motion judging unit 430 is for the still video mode, selects the second set of coefficients stored in the second-mode register 462 when the judging result of the motion is for the quasi-moving video mode, selects the third set of coefficients stored in the third-mode register 463 when the judging result of the motion is for the moving video mode, and sends the selected set of coefficients to the product-sum operation unit 440. The product-sum operation unit 440 performs the operation of Equation (4) using the set of coefficients sent from the selector 470, and outputs the result to the output terminal 490 as the converted pixel data Yt of the target pixel t.
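For illustration, the register-based arrangement of Fig. 5 may be sketched as follows; the Python names are chosen for this sketch, while the register contents follow Equations (5-1) through (7-7).

```python
# Coefficient register group 460: one coefficient set per motion classification.
FIRST_MODE_REGISTER  = (-2/16, 4/16, -2/16, 1/16, 7/16, 7/16, 1/16)  # still video mode
SECOND_MODE_REGISTER = (-2/16, 4/16, -2/16, 2/16, 8/16, 8/16, 2/16)  # quasi-moving video mode
THIRD_MODE_REGISTER  = ( 0.0,  0.0,   0.0,  2/16, 8/16, 8/16, 2/16)  # moving video mode

def select_coefficients(mode):
    """Selector 470: pick one coefficient set according to the motion judgment."""
    return {"still": FIRST_MODE_REGISTER,
            "quasi-moving": SECOND_MODE_REGISTER,
            "moving": THIRD_MODE_REGISTER}[mode]

def product_sum_operation(pixels_a_to_g, mode):
    """The single product-sum operation unit 440 evaluates Equation (4) with
    whichever coefficient set the selector supplies, yielding the converted
    pixel data Yt of the target pixel t."""
    return sum(w * y for w, y in zip(select_coefficients(mode), pixels_a_to_g))
```

Compared with the sketch given for Embodiment 3, only the coefficient set changes; the multiplier-adder path itself is shared among the three modes, which is the simplification this embodiment aims at.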
As explained above, the picture conversion processing unit of the present embodiment can perform conversion of a video stream in the still video mode, the quasi-moving video mode, and the moving video mode, using only one product-sum operation unit 440. Therefore, the picture conversion processing unit of the present embodiment has the same function and the same features as the picture conversion processing unit of Embodiment 3 of the present invention.
Embodiment 5
Fig. 6 is a block diagram of a video stream encoding device according to Embodiment 5 of the present invention. In Fig. 6, the same symbols are given to the same components as in Fig. 2, and their description is omitted.
The video stream encoding device of the present embodiment shown in Fig. 6 comprises the video input unit 10, a picture conversion processing unit 300, the encoding processing unit 40, and the second frame memory 50. The picture conversion processing unit 300 includes the pixel data conversion unit 210, the selector 220, the motion judging unit 130, the first frame memory 230, and a size conversion control unit 240.
The feature of the video stream encoding device of the present embodiment is that the picture conversion processing unit 300 itself possesses the size conversion control unit 240 which controls the pixel data conversion unit 210. In Embodiment 1 of the present invention as mentioned above, after the picture conversion processing unit 100 converts the entire interlaced-scan video stream into a progressive-scan video stream, the picture size is reduced in the size conversion unit 30. Also in Embodiment 2, after the picture conversion processing unit 200 converts the entire interlaced-scan video stream into a progressive-scan video stream, the picture size is reduced in the size conversion unit 30.
Consequently, among the pixels of a picture of the converted progressive-scan video stream, there are pixels that are not used in a picture of the reduced-size, progressive-scan video stream, so that unnecessary processing is performed in the picture conversion.
The video stream encoding device of the present embodiment is devised in order to remove the unnecessary processing described above. In the picture conversion processing unit 300 of the present embodiment, the size conversion control unit 240 has information regarding the picture size of the reduced video stream that the encoding
processing unit 40 encodes at a low bit rate. Based on the information regarding the picture size, the size conversion control unit 240 gives the pixel data conversion unit 210 an instruction to select, as target pixels, only the pixels that compose a picture of the reduced video stream. Accordingly, the pixel data conversion unit 210 generates the converted pixel data for each of the target pixels, referring to the pixels in the reference area near each of the target pixels, in consideration of the motion of each of the target pixels as well. Thus, the pixel data conversion unit 210 performs direct conversion of the interlaced-scan video stream to the reduced-size, progressive-scan video stream, and stores the conversion result in the second frame memory 50. The more concrete practical application of the conversion processing of the video stream according to the present embodiment, in which the motion of the target pixels in a picture of the video stream is considered, is similar to the case where the size conversion control unit 240 is added to the picture conversion processing unit of Embodiment 3 of the present invention. Therefore, further description is omitted.
In Embodiment 3 and Embodiment 4 of the present invention described above, the explanation is made, as an example, for an interlaced-scan video stream which is composed of the YCbCr components. The present invention can be equally applied to an interlaced-scan video stream which is composed of the RGB components. In the case of the interlaced-scan video stream composed of the RGB components, motion detection of the target pixel is performed using the G-component data, followed by the classification judgment for the still video mode, the quasi-moving video mode and the moving video mode. Based on the classification judgment result, converted pixel data for each of the RGB components of the target pixel can be generated similarly.
According to Embodiments 1 through 5 of the present invention described above, the motion of a picture of the inputted video stream is classified into the three modes of the still video mode, the quasi-moving video mode and the moving video mode.
Classification may alternatively be made in two modes of the still video mode and the moving video mode. In this case, for example, the second inter field conversion unit 212 of the picture conversion processing unit 200 shown in Fig. 2 can be omitted, and a simpler video stream encoding device can be provided. Such a simpler video stream encoding device can also restore a reduced video stream with a picture quality sufficiently high for some purposes.
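Returning to the size conversion control of the present embodiment, the direct generation of only the target pixels may be sketched as follows; halving the picture horizontally by taking every second pixel position as a target pixel is an assumption made for illustration, as are the names.

```python
def convert_reduced(width, height, convert_pixel_at):
    """Generate only the pixels that compose a picture of the reduced video
    stream, as instructed by the size conversion control unit 240: here every
    second pixel horizontally, so that a 704 x 240 picture is converted
    directly into a 352 x 240 progressive-scan picture. convert_pixel_at(x, y)
    stands for the motion-aware conversion applied at one target pixel."""
    return [[convert_pixel_at(x, y) for x in range(0, width, 2)]
            for y in range(height)]
```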
As described above, the object of the present invention is to realize a video stream encoding device that can perform size reduction of an interlaced-scan video stream and subsequent encoding at a low bit rate by detecting a per-pixel motion of a picture of the video stream and reflecting the detected motion in the conversion of the video stream, thereby suppressing a possible degradation of the picture quality when the encoded, reduced video stream is decoded. Various applications may be made without deviating from the object of the present invention.
According to the present invention, it is possible to provide a video stream encoding device that is operable, when performing size reduction of an interlaced-scan video stream and encoding at a low bit rate, to reduce high frequency components not influencing the picture quality of encoding target video and to suppress a possible degradation of picture quality when decoding the encoded, reduced video stream.
Having described preferred embodiments of the invention with reference to the accompanying drawings, it is to be understood that the invention is not limited to those precise embodiments, and that various changes and modifications may be effected therein by one skilled in the art without departing from the scope or spirit of the invention as defined in the appended claims.
Industrial Applicability
The video stream encoding device according to the present invention can be applied to, for example, distributing video streams via the Internet, generating video streams in a reduced size to be encoded at a low bit rate, and a wide range of related applications.