EP1762092A1

EP1762092A1 - Image processor and image processing method using scan rate conversion

Info

Publication number: EP1762092A1
Application number: EP05750971A
Authority: EP
Inventors: Shaori Guo; Abraham K. Riemens; Chris Lee; Robert J. Schutten; Selliah Rathnam
Original assignee: Koninklijke Philips Electronics NV
Current assignee: Koninklijke Philips NV
Priority date: 2004-06-21
Filing date: 2005-06-20
Publication date: 2007-03-14
Also published as: JP2008509576A; CN1973541A; WO2006000977A1

Abstract

Image processor (100) connectable to a system bus (112) for exchanging data with an external device (110). The image processor (100) comprises a memory unit (102, 103), a motion estimator (104), and a motion compensation unit (107) for outputting a scan rate converted output image frame. The image processor (100) is arranged to perform motion-compensated scan rate conversion when the video data signal is a standard-definition TV signal and to perform non-motion-compensated scan rate conversion when the video data signal is a high-definition TV signal.

Description

Image processor and image processing method using scan rate conversion

FIELD OF THE INVENTION The present invention relates to an image processor for performing scan rate conversion on a video data signal. In further aspects, the present invention relates to an image processing method and an image receiving apparatus.

PRIOR ART Such an image processing method and image processor or co-processor are known from PCT publication WO-A-02102058. The publication discloses a motion-compensated up-conversion of the frame rate (scan rate conversion) of video sequences. The disclosed method selects one of a plurality of scan rate conversion algorithms to obtain the best possible visual quality or to use the available resources as well as possible. The known image processing method and image processor are arranged to provide a scan rate conversion that is optimized for a specific type of video source data.

SUMMARY OF THE INVENTION The present invention seeks to provide an image processing method and image processor which allow high quality scan rate conversion to be performed for different types of digital video data. The invention is defined by the independent claims. The dependent claims define advantageous embodiments.
The present invention provides an image processor connectable to a system bus for exchanging data with an external device. The image processor comprises a memory unit, a motion estimator, and a motion compensation unit for outputting a scan rate converted output image frame. The image processor is arranged to perform motion-compensated scan rate conversion when the video data signal is a standard-definition TV signal and to perform non-motion-compensated scan rate conversion when the video data signal is a high-definition TV signal. The standard-definition TV signal has a lower resolution than the high-definition TV signal, e.g. 1024 x 576 pixels against 1920 x 1020 pixels. The non-motion-compensated scan rate conversion requires fewer resources (memory access, processing), thus allowing functionality for HDTV signals to be added to a high quality image processor for SDTV signals. It is noted that the mentioned input images related to the video data signal may cover direct scan rate conversion operation, i.e. storing only actual input video images in the memory unit, or recursive operation, in which images calculated from previous images are also stored in the memory unit.
In a further embodiment, the motion compensation unit comprises a de-interlacing sub-unit. This is a particularly suitable implementation for scan rate conversion.
In an even further embodiment, the image processor is arranged to perform the motion-compensated scan rate conversion in a multi-pass processing mode, and to perform the non-motion-compensated scan rate conversion in a single pass processing mode while disabling the motion estimator and using a zero motion vector field as input to the motion compensation unit. Thus, for HDTV signals, only a single pass of de-interlacing is performed. Although more pixels have to be processed, the number of processing steps is lower, thus making the requirements for processing power substantially the same for SDTV and HDTV signal processing.
To obtain high quality image processing for SDTV signals, including motion-compensated scan rate conversion, the multi-pass processing mode comprises, in a further embodiment, a motion estimation pass and a motion compensation pass. The motion compensation pass includes de-interlacing and up-conversion. By using separate steps for motion estimation and motion compensation, it is even possible to have an additional external (host) CPU improve the motion vectors output by the motion estimation before they are used in the motion compensation step, resulting in an output of even higher quality.
In a further embodiment, the motion compensation unit may comprise an up-conversion sub-unit for performing temporal interpolation between successive video images in order to increase the video field or frame rate. This up-conversion sub-unit may be used for both the SDTV and HDTV signal processing.
In a particularly advantageous embodiment, the memory unit comprises a first local memory unit and a second local memory unit each being arranged to store a 256 x 48 pixel image area. This allows SDTV signal processing with a sufficiently large search area for the motion estimation step, e.g. when processing stripes 128 pixels wide. Also, it may accommodate the same number of pixels for HDTV processing.
In a further aspect, an image processing method for performing scan rate conversion on a video data signal is provided. As a processing step already present for processing SDTV signals is used for processing HDTV signals, the present method is very effective in providing additional HDTV signal processing functionality to an existing SDTV signal processing function, at only marginal (software and/or hardware) cost. The motion-compensated scan rate conversion mechanism may be executed in a multi-pass processing mode, and the non-motion-compensated scan rate conversion mechanism may be executed in a single-pass processing mode using a zero motion vector field. As the non-motion-compensated scan rate conversion requires less memory access and fewer processing resources, the same hardware platform can be used for both the SDTV and HDTV processing.
In a further embodiment, the multi-pass processing mode comprises a motion estimation pass and a motion compensation pass. The motion compensation pass includes de-interlacing and up-conversion. Furthermore, it is possible to have an external (host) CPU improve the motion vectors output by the motion estimation before they are used in the motion compensation pass.
The image processor and/or image processing method may be advantageously used in all kinds of image receiving apparatus, such as a television set or a video recorder (using tape, optical disc or hard disk media), comprising a receiver for receiving a video data signal, and an image processor according to the present invention. This allows a smoother transition from SDTV to HDTV broadcast reception, as the image receiving apparatus is able to process both types of signals, without any substantial extra cost.
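The mode selection summarized above can be sketched in a few lines of Python. This is only an illustrative model, assuming a software-visible notion of "passes"; the names PassPlan and plan_passes are hypothetical and do not appear in the disclosure.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PassPlan:
    """One processing pass (hypothetical structure, for illustration only)."""
    name: str
    motion_estimator_enabled: bool
    # Where the motion compensation unit takes its vectors from in this pass;
    # None means the pass does not run motion compensation at all.
    vector_source: Optional[str]


def plan_passes(signal_type: str) -> List[PassPlan]:
    """Return the passes used for a given signal type.

    SDTV: multi-pass (motion estimation, then motion compensation).
    HDTV: single pass, estimator disabled, zero motion vector field.
    """
    if signal_type == "SDTV":
        return [
            PassPlan("motion_estimation", True, None),
            PassPlan("motion_compensation", False, "estimated_vectors"),
        ]
    if signal_type == "HDTV":
        return [PassPlan("de_interlacing", False, "zero_vector_field")]
    raise ValueError(f"unsupported signal type: {signal_type!r}")


if __name__ == "__main__":
    for kind in ("SDTV", "HDTV"):
        print(kind, [p.name for p in plan_passes(kind)])
```

Because the estimation and compensation passes are separate in the SDTV plan, a host CPU could post-process the vectors between the two passes, as the summary notes.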

SHORT DESCRIPTION OF DRAWINGS The present invention will be discussed in more detail below, using a number of exemplary embodiments, with reference to the attached drawings, in which
Fig. 1 shows a block diagram of an image processor architecture for scan rate conversion;
Fig. 2 shows a graphic representation of the local memories used in the image processor architecture of Fig. 1;
Fig. 3 shows a conceptual data flow of the image processor architecture of Fig. 1 in motion-compensated scan rate conversion; and
Fig. 4 shows a conceptual data flow of the image processor architecture of Fig. 1 in non-motion-compensated scan rate conversion.
DETAILED DESCRIPTION OF EXEMPLARY EMBODIMENTS Scan-rate-conversion is necessary when pictures are received and displayed at different frequencies, and conversion between different scan rates is important since there exists a wide variety of scan rates for recording and displaying pictures. For example, NTSC and PAL television display at 60 Hz and 50 Hz, respectively. The frequencies for recording films are 24, 25 or 30 Hz. Furthermore, for high quality display, it is desirable to display pictures at a higher frequency than the one at which they are received or recorded. For instance, 50-Hz PAL video often produces annoying field flicker. The field flicker can be avoided, for example, by displaying the PAL video at 100 Hz.
Scan-rate-conversion refers to one or more of the following processes: de-interlacing, up-conversion, and horizontal and vertical scaling. De-interlacing is the process of converting an image field into an image frame by increasing the vertical sampling density. Up-conversion increases the number of pictures in a video sequence, normally by means of interpolation. Horizontal and vertical scaling is the process of increasing or decreasing the number of pixels in either the horizontal or the vertical direction.
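To make the terms de-interlacing and up-conversion concrete, the following Python sketch shows the simplest non-motion-compensated forms of both: missing lines are filled by averaging neighbouring field lines, and intermediate pictures are produced by averaging successive frames. These are textbook stand-ins chosen for brevity, not the algorithms of the embodiment; the function names are hypothetical.

```python
from typing import List

Picture = List[List[int]]  # rows of pixel values


def deinterlace_line_average(field: Picture) -> Picture:
    """Convert a field into a frame with twice the vertical sampling density.

    Each field line is kept, and the missing line below it is the average
    of that line and the next field line (the last line is repeated).
    """
    frame: Picture = []
    for i, line in enumerate(field):
        below = field[i + 1] if i + 1 < len(field) else line
        frame.append(list(line))
        frame.append([(a + b) // 2 for a, b in zip(line, below)])
    return frame


def upconvert_double_rate(frames: List[Picture]) -> List[Picture]:
    """Insert one temporally interpolated picture between successive frames,
    e.g. as a crude model of going from 50 Hz towards 100 Hz."""
    if not frames:
        return []
    out: List[Picture] = []
    for cur, nxt in zip(frames, frames[1:]):
        out.append(cur)
        out.append([[(a + b) // 2 for a, b in zip(r0, r1)]
                    for r0, r1 in zip(cur, nxt)])
    out.append(frames[-1])
    return out


if __name__ == "__main__":
    print(deinterlace_line_average([[10, 20], [30, 40]]))
    print(upconvert_double_rate([[[0]], [[10]]]))
```

Averaging without motion information blurs moving objects, which is the reason motion-compensated variants exist.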
Depending on whether motion information is used, scan-rate-conversion techniques are largely divided into two categories: motion-compensated and non-motion-compensated. Motion-compensated scan-rate-conversion is a technique in which motion information is first extracted, often in the form of motion vectors, from the input video sequence, and the motion information is then applied to the creation of output video pictures with the desired scan rate. Non-motion-compensated scan-rate-conversion generates output video pictures with the desired scan rate without making use of the motion information embedded in the input video sequence.
Fig. 1 shows a block diagram and architecture of an image processor 100 (or co-processor) for scan-rate-conversion according to an embodiment of the present invention. The image processor 100 is connectable to a data bus 112 that is designed to exchange e.g. data of input and output images and motion vectors. An external memory device 110 is connected to the data bus 112 and is arranged to store e.g. data of input and output images and motion vectors. The data bus 112 (or system bus) is used to connect the image processor 100 according to this embodiment with the external memory device 110. It is, however, conceivable that other communication means are used to connect the image processor 100 and its components to further external devices.
In the embodiment of Fig. 1, the image processor 100 comprises a number of sub-units:
- Two local memories 102, 103. As shown in Fig. 1, the co-processor 100 contains two local memories 102, 103 for temporarily storing pixels that are loaded from system memory 110 and are to be used for motion estimation, de-interlacing, and up-conversion. Local memory 102 is typically used to buffer image data of the previous image, whereas local memory 103 typically stores data from the current field (and optionally also data from the next field). This way, the memories contain all required data for de-interlacing the current field and for temporal interpolation between the time instances of the previous and the current images. The image data stored in local memory 102 is typically calculated by image processor 100 based on a previous input image from the video signal. Hence, this architecture supports temporal recursive video algorithms. In the embodiment shown, each local memory 102, 103 comprises 256 x 48 pixels = 12 k bytes, arranged in 16 horizontally adjacent 8 x 8 pixel blocks. Four pixels at all borders are used as filter run-in and to allow 16 x 16 block matches for the SAD (Sum of Absolute Differences) calculations in a motion estimation pass. This results in a vertical range of the motion vector of +/- 16 pixels, and a horizontal range of the motion vector of +/- 60 pixels, as shown in Fig. 2.
- A motion estimation unit 104. Motion estimation is the first stage of the two-stage process of motion-compensated scan-rate-conversion. It computes the motion vector of each 8 x 8 block of a video picture. The motion vectors are subsequently used by the motion-compensated de-interlacing and up-conversion sub-units (106 and 108 respectively, see below).
- A motion compensation unit 107, comprising two sub-units 106, 108:
  - A de-interlacing sub-unit 106. De-interlacing sub-unit 106 converts a video field (consisting of either the even or the odd lines of an image) into a video frame by increasing the vertical sampling density. In motion-compensated scan-rate conversion, de-interlacing is achieved by means of interpolation of the missing field lines based on motion-compensated image data.
  - An up-conversion sub-unit 108. The up-conversion sub-unit 108 of the co-processor 100 performs temporal interpolation between successive video pictures of a video sequence so as to increase the video field or frame rate. This is necessary when a video sequence needs to be displayed at a higher field or frame rate to avoid flickering or to enhance video display quality.
- A temporal noise reduction sub-unit 116. The temporal noise reduction sub-unit 116 performs noise reduction on de-interlaced image data. This is achieved iteratively using motion-detected information from temporal image data, making use of the recursive nature of the architecture.
- A spatial noise reduction sub-unit 114. The spatial noise reduction sub-unit 114 performs noise reduction on field-based sequences. Because of the recursive nature of the architecture, spatial noise reduction is only used to filter the input images of the current and next fields, i.e. video data that is stored in the second local memory 103.
- A vertical processing sub-unit 118. The vertical processing sub-unit 118 performs two operations on video image data in the vertical direction: vertical peaking and vertical scaling. Vertical peaking compensates for the information loss after up-conversion processing and increases the gain for high signal frequencies; it is controlled by programmable peaking coefficients that provide different filter characteristics, resulting in a peaked or an averaged signal. The vertical scaling operation performs the expansion or compression of a video image in the vertical direction. The vertical scaling function can also optionally generate interlaced output, taking care of proper low pass filtering to avoid excessive aliasing.
According to the required operating mode, one or more of the above-mentioned units may be disabled or enabled. For example, in the embodiment of Fig. 4 only a few of the above-mentioned units are used. If in an embodiment no noise reduction and/or vertical processing is required, units 114, 116, and/or 118 may be left out completely.
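As an illustration of the local memory geometry and the SAD matching described above, the following Python sketch shows one reading that reproduces the stated motion vector ranges: half of the surplus of the 256 x 48 pixel memory over the processed area, minus the four border pixels. It assumes, which the text does not state explicitly, that one row of 8 x 8 blocks across a 128-pixel stripe is processed at a time. The exhaustive search in best_vector is a deliberately simple stand-in and not the estimator of the embodiment, which follows a published algorithm cited elsewhere in this description. All function names are hypothetical.

```python
from typing import List, Tuple

Image = List[List[int]]


def search_range(memory_size: int, processed_size: int, border: int = 4) -> int:
    """Motion vector range in one direction (assumed formula, see lead-in)."""
    return (memory_size - processed_size) // 2 - border


def sad(prev: Image, cur: Image, bx: int, by: int,
        dx: int, dy: int, size: int = 8) -> int:
    """Sum of absolute differences between the block at (bx, by) in the
    current image and the block displaced by (dx, dy) in the previous image."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - prev[by + y + dy][bx + x + dx])
    return total


def best_vector(prev: Image, cur: Image, bx: int, by: int,
                max_dx: int, max_dy: int,
                size: int = 8) -> Tuple[Tuple[int, int], int]:
    """Exhaustive SAD search; returns ((dx, dy), sad) of the best match."""
    height, width = len(prev), len(prev[0])
    best, best_sad = (0, 0), sad(prev, cur, bx, by, 0, 0, size)
    for dy in range(-max_dy, max_dy + 1):
        for dx in range(-max_dx, max_dx + 1):
            inside = (0 <= by + dy and by + dy + size <= height
                      and 0 <= bx + dx and bx + dx + size <= width)
            if inside:
                s = sad(prev, cur, bx, by, dx, dy, size)
                if s < best_sad:
                    best, best_sad = (dx, dy), s
    return best, best_sad


if __name__ == "__main__":
    print("horizontal range:", search_range(256, 128))  # 128-pixel stripe
    print("vertical range:", search_range(48, 8))       # one 8-pixel block row
```

The block size is a parameter because the description mentions 16 x 16 matches for 8 x 8 blocks; the sketch does not try to model that overlap.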
Depending on the required input / output signals, the motion compensation unit 107 does not need to comprise both the de-interlacing sub-unit 106 and the up-conversion sub-unit 108. The motion estimator 104 and the motion compensation unit 107 advantageously operate according to algorithms as described in the article "IC for motion-compensated de-interlacing, noise reduction, and picture rate conversion", by G. de Haan, in IEEE Transactions on Consumer Electronics, Vol. 45, No. 3, August 1999, which is incorporated herein by reference. Optionally the de-interlacing is performed in accordance with another method as described in "De-interlacing - An Overview" by G. de Haan, in Proceedings of the IEEE, Vol. 86, No. 9, September 1998, which is also incorporated herein by reference.
It is advantageous when the image processor 100 is implemented on one IC. Alternatively, the image processor 100 is implemented with multiple ICs that are interconnected with connections that have a relatively large bandwidth.
Fig. 3 shows a logical diagram that represents the data flow of the motion-compensated scan-rate-conversion process, which according to the present invention is executed for standard-definition TV (SDTV) signals. The system bus 112 provides pixel inputs to local memory 102 and spatial noise reduction unit 114. The system bus 112 provides a motion vector input MVI to motion estimation unit 104 and three de-interlacing units 106. The system bus 112 receives recursive outputs RO from one of three temporal noise reduction units 116. The two other temporal noise reduction units 116 are coupled to respective vertical scaling units 118 that provide progressive outputs PO to the system bus 112.
The whole process consists of two passes: a motion estimation pass and a motion compensation pass. In the first pass, only the motion estimation unit 104 is enabled; all other sub-units are disabled.
The motion estimation runs on every luminance input field, always using the previous frame and the current field of video image data. No motion estimation should be done on chrominance data, although the motion vectors obtained from the motion estimation of luminance data are used for the de-interlacing and up-conversion of the chrominance data. The output of the motion estimation pass is a field of motion vectors and SAD (sum of absolute differences) values. These may e.g. be temporarily stored in the external memory 110 via the system bus 112.
In the second pass, motion estimation unit 104 is disabled, while other units are enabled or disabled as required, although de-interlacing sub-unit 106 is normally enabled for this pass of processing. Per execution, one or two output pictures are generated. Since the de-interlacing process is recursive, three pictures are generated per execution in the worst case: the de-interlaced and noise reduced current picture is required for the recursive de-interlacing and noise reduction sub-units, and two up-converted images are required as output pictures, which are vertically scaled and peaked at the proper temporal position.
At every execution, the co-processor 100 processes one vertical "stripe" of the picture. The width of this stripe is 16 blocks, i.e., 128 pixels. The height is equal to the picture height. The bandwidth overhead of reading motion-compensated data is two: in order to process 128 bytes horizontally, 256 bytes are read into the local memory. Given that the input video sequences targeted by the co-processor 100 are dual-input-stream with a total size of 1024 x 576 pixels per picture and a frequency of 50 Hz, this results in a maximum local memory bandwidth requirement of about 118 Mbytes per second. This is for motion compensation only; however, motion estimation is also required. If it is assumed that motion estimation takes 50% of the processing time, then the total input bandwidth during motion compensation is 236 Mbyte/s.
With a processing speed of two input samples per clock cycle, this would require the co-processor 100 to operate at a clock speed of 118 MHz. In a realistic implementation, some time will be required for pipeline latency and host-CPU interaction, so a design target of 140 MHz is appropriate. Processing two input samples per clock cycle is a compromise between clock speed and silicon area, and the indicated speed of 140 MHz is a good choice with current IC technology and a standard cell design approach.
The purpose of incorporating non-motion-compensated scan-rate conversion in the co-processor 100 is to process high-definition TV (HDTV) signals. According to the present invention, HDTV signals are subjected to scan-rate-conversion at only marginal additional hardware cost and with little added design complexity.
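The SDTV figures quoted above (about 118 Mbyte/s, 236 Mbyte/s and 118 MHz) can be checked with a short Python calculation. The description does not itemize every factor, so the breakdown below is an assumption that happens to reproduce the stated numbers: one byte per pixel, the read overhead of two, and a further factor of two that is taken here to cover reading both inputs (previous frame and current field). The function and parameter names are hypothetical.

```python
def sdtv_budget(width: int = 1024, height: int = 576, rate_hz: int = 50,
                read_overhead: int = 2, inputs_read: int = 2,
                compensation_time_share: float = 0.5,
                samples_per_clock: int = 2):
    """Return (local_bandwidth, peak_bandwidth, clock_hz) in bytes/s and Hz.

    inputs_read is an assumed factor; the text only states the result.
    """
    pixel_rate = width * height * rate_hz            # one byte per pixel assumed
    local_bandwidth = pixel_rate * read_overhead * inputs_read
    # Motion compensation only gets part of the time, so its data must be
    # moved correspondingly faster while it runs.
    peak_bandwidth = local_bandwidth / compensation_time_share
    clock_hz = peak_bandwidth / samples_per_clock
    return local_bandwidth, peak_bandwidth, clock_hz


if __name__ == "__main__":
    local_bw, peak_bw, clock = sdtv_budget()
    print(f"local memory bandwidth: {local_bw / 1e6:.1f} Mbyte/s")
    print(f"bandwidth during motion compensation: {peak_bw / 1e6:.1f} Mbyte/s")
    print(f"minimum clock: {clock / 1e6:.1f} MHz (design target 140 MHz)")
```

The gap between the computed minimum clock and the 140 MHz target is the margin the description reserves for pipeline latency and host-CPU interaction.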

Fig. 4 shows a data flow of the co-processor 100 when it operates in non-motion-compensated scan-rate-conversion mode for HDTV signal processing. The system bus 112 provides pixel inputs PI to the local memories 102 and 103, and a zero motion vector MVI to the de-interlacing unit 106. The system bus 112 receives a recursive output RO from the de-interlacing unit 106. Note that the HDTV scan-rate-conversion uses exactly the same hardware (local memories 102, 103, de-interlacing unit 106, system bus 112) as the motion-compensated SDTV scan-rate conversion. With the present architecture arrangement, no additional hardware is needed for HDTV non-motion-compensated scan-rate-conversion.
In this mode, the motion estimation unit 104 is disabled and the whole processing has only one pass, i.e. de-interlacing (in de-interlacing sub-unit 106). For the input motion vectors of the de-interlacing sub-unit 106, a zero motion vector is used. Furthermore, the scan-rate-conversion has only one output, i.e. the recursive output. The recursive output is reused for the display outputs, by properly timing the output signals to the external display via the system bus 112. So there is no added local memory access for generating display outputs.
Similar to the motion-compensated scan-rate-conversion described above, at every execution, the co-processor 100 processes one vertical "stripe" of the picture. The width of this stripe is 16 blocks, i.e., 128 pixels. The height is equal to the picture height. Unlike the motion-compensated scan-rate-conversion, however, the bandwidth overhead of reading data is one: in order to process 128 bytes horizontally, exactly 128 bytes are read into the local memory. Assuming that HDTV has a size of 1920 x 1020 pixels per picture, with a frequency of 60 Hz, the maximum local memory bandwidth is 117.5 Mbytes per second.
Again a factor of two is required, as both the previous frame and the current and next fields are needed; however, no memory overhead factor of two is present in this case. All (100%) of the processing time is now available as motion estimation is not required, thus the total input bandwidth during motion compensation becomes 235 Mbyte/s. This shows that the lower-bound running frequency of 140 MHz for the co-processor 100, processing two data samples per clock cycle, is also suitable for HDTV scan-rate-conversion. SDTV and HDTV scan-rate-conversion can be done with the same co-processor 100, with only marginal additional hardware cost.
The image processor 100, or the image processing method as described in relation to the image processor 100, can be used in high-end media processors, multimedia processors, digital display processors, etc. Examples are television sets, set-top boxes, and video recorders (tape, disc or hard disk recorders).
The individual components of the co-processor 100 may be enabled or disabled, depending on operating mode. This flexibility makes it possible to perform scan-rate-conversion for high-definition TV video signals with exactly the same hardware as used for standard-definition TV video signals. In other words, the HDTV scan-rate-conversion function is incorporated with only marginal additional hardware cost. For SDTV signals, the video processing is separated into the motion estimation pass and the motion compensation pass. Therefore, the host-CPU can further process the generated motion vectors and possibly make the motion vectors more accurate.
The above embodiment has been described in relation to a specific example. The skilled person will understand that modifications and alternatives are possible for components of the embodiment shown. These modifications and alternatives are within the scope of the present invention, which is defined by the claims as appended.
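The HDTV bandwidth figures given earlier in this description (117.5 Mbyte/s and 235 Mbyte/s) can be reproduced in the same way as the SDTV ones. The Python sketch below assumes one byte per pixel and uses the picture size and rate stated in the text; the function and parameter names are hypothetical.

```python
def hdtv_budget(width: int = 1920, height: int = 1020, rate_hz: int = 60,
                read_overhead: int = 1, inputs_read: int = 2,
                samples_per_clock: int = 2):
    """Return (local_bandwidth, total_bandwidth, clock_hz) in bytes/s and Hz.

    read_overhead is one because exactly 128 bytes are read per 128 processed.
    inputs_read is two for the previous frame plus the current and next fields.
    No time-share factor applies: motion estimation is disabled.
    """
    local_bandwidth = width * height * rate_hz * read_overhead
    total_bandwidth = local_bandwidth * inputs_read
    clock_hz = total_bandwidth / samples_per_clock
    return local_bandwidth, total_bandwidth, clock_hz


if __name__ == "__main__":
    local_bw, total_bw, clock = hdtv_budget()
    print(f"local memory bandwidth: {local_bw / 1e6:.1f} Mbyte/s")
    print(f"total input bandwidth: {total_bw / 1e6:.1f} Mbyte/s")
    print(f"minimum clock: {clock / 1e6:.1f} MHz, "
          f"fits 140 MHz target: {clock <= 140e6}")
```

Under these assumptions the HDTV single-pass mode needs almost the same clock rate as the SDTV two-pass mode, which is the basis for reusing the same hardware.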
In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps other than those listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the device claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage.

Claims

CLAIMS:

1. An image processor for performing scan rate conversion on a video data signal, the image processor (100) being connectable to communication means (112) for exchanging data with an external device (110), the image processor (100) comprising: a memory unit (102, 103) for storing input images related to the video data signal; a motion estimator (104) for estimating a motion vector field based on stored images; a motion compensation unit (107) for outputting a scan rate converted output image frame; wherein the image processor (100) is arranged to perform motion-compensated scan rate conversion when the video data signal is a standard-definition TV signal and to perform non-motion-compensated scan rate conversion when the video data signal is a high-definition TV signal.

2. The image processor according to claim 1, in which the motion compensation unit comprises a de-interlacing sub-unit (106).

3. The image processor according to claim 1, in which the image processor (100) is arranged to perform the motion-compensated scan rate conversion in a multi-pass processing mode, and to perform the non-motion-compensated scan rate conversion in a single pass processing mode while disabling the motion estimator (104) and using a zero motion vector field as input to the motion compensation unit (107).

4. The image processor according to claim 3, in which the multi-pass processing mode comprises a motion estimation pass and a motion compensation pass.

5. The image processor according to claim 1, in which the motion compensation unit (107) further comprises an up-conversion sub-unit (108) for performing temporal interpolation between successive video images in order to increase the video field or frame rate.

6. The image processor according to claim 1, in which the memory unit comprises a first local memory unit (102) and a second local memory unit (103) each being arranged to store a 256 x 48 pixel image area.

7. An image processing method for performing scan rate conversion on a video data signal, the method comprising: determining whether the video data signal is a standard-definition TV signal (SDTV) or a high-definition TV (HDTV) signal; in case the video data signal is a HDTV signal, performing scan rate conversion by a non-motion-compensated scan rate conversion mechanism; in case the video data signal is a SDTV signal, performing scan rate conversion by a motion-compensated scan rate conversion mechanism; wherein both the non-motion-compensated scan rate conversion mechanism and the motion-compensated scan rate conversion mechanism are executed using substantially the same hardware components.

8. The method according to claim 7, in which the motion-compensated scan rate conversion mechanism is executed in a multi-pass processing mode, and the non-motion-compensated scan rate conversion mechanism is executed in a single pass processing mode using a zero motion vector field.

9. The method according to claim 8, in which the multi-pass processing mode comprises a motion estimation pass and a motion compensation pass.

10. An image receiving apparatus, such as a television set or a video recorder, comprising a receiver for receiving a video data signal, and an image processor as claimed in claim 1.