WO2015049392A1

WO2015049392A1 - A method and system for improving the quality of colour images

Info

Publication number: WO2015049392A1
Application number: PCT/EP2014/071367
Authority: WO
Inventors: Michel Dauw; Pierre De Muelenaere; Olivier Dupont; Jianglin MA
Original assignee: I.R.I.S.
Priority date: 2013-10-04
Filing date: 2014-10-06
Publication date: 2015-04-09
Also published as: BE1021013B1

Abstract

A method for improving the quality of a colour image prior to performing subsequent processing on said colour image is proposed. The method involves the steps of converting the colour image into multiple image layers distinguishable from one another and simultaneously correcting and up-scaling each of the multiple image layers based on the values of a distortion correction and up-scaling transformation matrix.

Description

A method and system for improving the quality of colour images

Technical field

The present invention relates to a method for improving the quality of colour images, and more specifically to a method for simultaneously performing the steps of distortion correction and up-scaling of the image.

Background art

A digital image is formed by an array of rows and columns of pixels. For a greyscale image, each pixel has one value representing the average luminance of the corresponding area, while for a colour image each pixel has a red, green and blue value representing the average colour of the corresponding area.

It is known that a pixel colour can also be represented by the YUV (or YCrCb) representation, where Y is the luminance, and U and V are the red and blue chrominance channels. Therefore, it is possible to convert a colour from an RGB representation into the YUV representation and vice versa.
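As an illustration of the conversion described above, the following minimal Python sketch converts a single pixel between RGB and YUV. The weighting coefficients are the commonly used BT.601 values and are an assumption here, since the text does not prescribe a particular set; the function names are hypothetical.

```python
def rgb_to_yuv(r, g, b):
    """Convert one RGB pixel to YUV (BT.601 weights, assumed)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b   # luminance
    u = 0.492 * (b - y)                      # blue-difference chrominance
    v = 0.877 * (r - y)                      # red-difference chrominance
    return y, u, v


def yuv_to_rgb(y, u, v):
    """Invert rgb_to_yuv by solving the same three equations for R, G, B."""
    r = y + v / 0.877
    b = y + u / 0.492
    g = (y - 0.299 * r - 0.114 * b) / 0.587
    return r, g, b


y, u, v = rgb_to_yuv(200, 120, 30)
print(yuv_to_rgb(y, u, v))
```

A grey pixel, for which R, G and B are equal, yields chrominance values of approximately zero, so all of its information sits in the Y value.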

A digital image is characterized by its resolution, which is determined by the number of pixels in each direction per inch. For example, an image with a resolution of 300 dpi (dots per inch) is an image which has 300 rows and 300 columns per inch.

A document is a set of pages that contains text but can also contain graphics, pictures, logos, drawings, etc. For example, a document can be a letter, a business card, an invoice, a form, or a magazine or newspaper article. Electronic devices such as scanners or cameras can be used to convert documents into digital images. The conversion of a document to a digital image enables the document to be stored electronically and further processed by a computer. To this effect, the digital image can be further processed by a processing application, such as text recognition or OCR (Optical Character Recognition), so that its content can be used in a different context. For example, performing OCR on document images enables the content of the document image to be recognised and hence to become electronically searchable. Making the electronic documents searchable facilitates the compact storage of the document image and makes applications such as machine translation, text-to-speech and text mining possible.

The accuracy of OCR plays a critical role in these applications, which in turn relies heavily on the quality of the digitization process. Among all the elements that affect document digitization, the digitization device's resolution and geometric distortion are considered the most important. The quality of the document digitization process may be improved by capturing the document images at a higher resolution. However, such an approach may lead to longer scanning times, depending on the processing power of the capturing device, thereby making the overall document digitization process more time consuming. This is because, at higher resolutions, a higher number of pixels needs to be determined by the scanner and subsequently transferred to the computer for further processing of the digital image. In addition, capturing images at higher resolution requires a higher resolution device, which is generally more expensive than lower resolution devices. Therefore, directly capturing high resolution images is not only time consuming but can also be costly.

To overcome the disadvantages of directly capturing a high resolution image using a high resolution device, up-scaling is widely used for enhancing the quality of low resolution images. Up-scaling a digital image can be performed using a variety of known techniques, such as bilinear and bi-cubic interpolation. These techniques work by mapping the grid of the destination image onto the grid of the source image. The destination pixel values are estimated by interpolating the source pixel values using one of the up-scaling techniques. The accuracy of the up-scaled image greatly depends on the number of source pixels taken into account. For example, the bilinear method takes into account the 4 nearest neighbouring pixels in the source image, while bi-cubic interpolation takes into account the 16 nearest neighbours.

Depending on the interpolation method used, up-scaling of an image may introduce the undesirable effects of blurring and ringing to the destination image, which may affect the accuracy of the OCR. Therefore, careful consideration should be taken when selecting a suitable interpolation method.
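To make the neighbour counts concrete, the sketch below estimates one destination value from the 4 nearest source pixels, i.e. bilinear interpolation. It is an illustrative sketch only: the list-of-rows image layout, the clamping at the image border and the function name are assumptions.

```python
def bilinear(img, x, y):
    """Interpolate img (a list of rows) at column x and row y (floats)."""
    h, w = len(img), len(img[0])
    # Clamp the sampling position to the image area (assumed border policy).
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(x), int(y)                      # top-left neighbour
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0                      # fractional offsets
    top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
    bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


source = [[0, 10],
          [20, 30]]
print(bilinear(source, 0.5, 0.5))  # value halfway between all four pixels
```

Up-scaling then amounts to calling such a function once per destination pixel, with (x, y) being the destination pixel's position expressed in the source grid.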

As previously discussed, OCR accuracy is also very sensitive to geometric distortion. Geometric distortion is difficult, if not impossible, to avoid because it is very difficult to place the document in exactly the right position when it is captured. Moreover, the scanning process itself can often introduce some kind of geometric distortion. Therefore, it is necessary to remove these distortions before performing OCR. Many distortion correction algorithms, such as deskew algorithms, have been proposed, including methods based on the Hough transformation, cross correlation, projection profile, Fourier transformation and k nearest neighbours (k-NN) clustering.
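Of the deskew approaches listed, the projection profile is perhaps the simplest to illustrate. The sketch below is a simplified, single-scale variant and not a reproduction of any particular published algorithm: for each trial angle it rotates the foreground pixel coordinates back, builds a row histogram, and keeps the angle whose histogram is most sharply peaked. The scoring function, the angle range and step, the sign convention and the synthetic test data are all assumptions.

```python
import math
from collections import Counter


def profile_score(points, angle_deg):
    """Sharpness of the horizontal projection profile after undoing a trial skew."""
    t = math.radians(angle_deg)
    sin_t, cos_t = math.sin(t), math.cos(t)
    profile = Counter()
    for x, y in points:
        # Row of the pixel once the page is rotated back by the trial angle.
        profile[round(y * cos_t - x * sin_t)] += 1
    # Aligned text lines concentrate pixels in few rows, so this sum peaks.
    return sum(count * count for count in profile.values())


def estimate_skew(points, max_angle=5.0, step=0.5):
    """Return the trial angle (in degrees) with the sharpest projection profile."""
    n_steps = int(round(2 * max_angle / step))
    candidates = [-max_angle + i * step for i in range(n_steps + 1)]
    return max(candidates, key=lambda a: profile_score(points, a))


# Three synthetic "text lines" skewed by 3 degrees (assumed test data).
slope = math.tan(math.radians(3.0))
points = [(x, y0 + x * slope) for y0 in (10, 30, 50) for x in range(200)]
print(estimate_skew(points))
```

A multi-scale variant would typically run the same search on a reduced image first and then refine the angle at finer steps; that refinement is omitted here.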

Up-scaling and deskew are considered to be two different document image pre-processing techniques having the same aim of improving the quality of the captured images. Current OCR solutions perform these techniques as separate steps, for example by first up-scaling the image using one of the interpolation methods and then applying a deskew algorithm to correct the geometric distortion. As a result, improving the quality of the image using the solutions found in the prior art can be time consuming and computationally expensive. Therefore, there is a need for providing a method for improving the quality of the document image in a fast and accurate manner.

Disclosure of the invention

It is an aim of the present invention to provide a method for improving the quality of the colour image in a fast manner, while maintaining sufficient accuracy for a subsequent processing of the image.

This aim is achieved according to the invention with the method comprising the steps of the first independent claim.

According to a first aspect of the invention, a method is presented for improving the quality of a colour image prior to performing subsequent processing on the colour image. The method for improving the quality of the colour image comprises the steps of: converting the colour image into multiple image layers distinguishable from one another, formulating a distortion correction and up-scaling transformation matrix based on at least an image distortion value estimated from at least a first image layer of the multiple image layers and a target resolution, simultaneously correcting and up-scaling each of the multiple image layers based on the values of the transformation matrix, and recombining the corrected and up-scaled multiple image layers into a destination colour image.

The method of the present invention has the advantage that the steps of correcting and up-scaling the image can be performed simultaneously on each of the image layers. As a result, the number of interpolation operations required for correcting and up-scaling the colour image may be reduced from two operations to one, thereby significantly reducing the computational time taken for improving the quality of the image.

According to the first aspect of the present invention it may be possible to create a substantially distortion-free high resolution document image. For example, a document image of 200 dpi may be scaled up to 300 dpi, and a document image of 300 dpi may be scaled up to 400 dpi. By doing so, the spatial resolution requirement for OCR can be met, and decent OCR results can be expected in the subsequent operations. Furthermore, allowing document images to be captured in a low resolution can also reduce the cost of the image capturing devices required and speed up the scanning process. Moreover, the method is faster than separately performing the operations of distortion correction and up-scaling.

In one embodiment the simultaneously correcting and up-scaling is executed on each of the multiple image layers in parallel. In this embodiment a further reduction in computation time may be achieved.
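A minimal sketch of how the layers could be dispatched in parallel is given below, using Python's standard `concurrent.futures` module. The per-layer function is only a stand-in (a nearest-neighbour enlargement by an integer factor) and is not the interpolation of the invention; all names are hypothetical. Note that, for pure-Python CPU-bound work, threads in the standard CPython interpreter do not normally run simultaneously, so a real implementation would more likely rely on native code or separate processes.

```python
from concurrent.futures import ThreadPoolExecutor


def upscale_nearest(layer, factor):
    """Stand-in per-layer operation: enlarge by an integer factor."""
    height, width = len(layer), len(layer[0])
    return [[layer[r // factor][c // factor] for c in range(width * factor)]
            for r in range(height * factor)]


def process_layers(layers, factor):
    """Submit one task per image layer and collect the results in order."""
    with ThreadPoolExecutor(max_workers=len(layers)) as pool:
        return list(pool.map(lambda layer: upscale_nearest(layer, factor), layers))


layers = [[[1, 2], [3, 4]],      # e.g. Y
          [[5, 6], [7, 8]],      # e.g. U
          [[9, 0], [1, 2]]]      # e.g. V
print(process_layers(layers, 2)[0])
```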

In another embodiment the step of correcting and up-scaling is performed by applying a first interpolation method to the at least first image layer and at least a second interpolation method to a second image layer of the remaining multiple image layers.

In particular, the at least second interpolation method may be of a lesser quality than the first interpolation method.

By converting the captured image into multiple layers and applying different interpolation methods to the image layers, it may be possible to reduce the computational time taken for correcting and up-scaling the image. This is because each of the image layers may contain different types of information that may affect the accuracy of the captured image during further processing, for example during OCR. Therefore, it may be beneficial to apply computationally expensive but highly accurate interpolation methods to the image layer or layers containing the most relevant information for further processing. At the same time, it may be beneficial to apply interpolation methods of lesser quality to image layers that are less relevant to further processing.

In one embodiment the first interpolation method may be a bi-cubic interpolation method using splines, which may comprise at least a first and a second parameter. For example the first parameter may be a 'B' parameter, while the second parameter may be a 'C' parameter. The B and C parameters may be arranged to control the quality of the colour image, and more particularly the effects of blurring and ringing. Therefore, the values of the B and C parameters are preferably chosen such that the effects of blurring and ringing are minimised. In this way the method of the present invention can provide an image that has fewer blurring and ringing reconstruction errors and is more suitable for subsequent further processing such as OCR.

In one embodiment at least the first parameter may be estimated based on the sharpness of the colour image generated by a capturing device.

Alternatively, at least the first parameter may be based on the luminance values obtained from the colour image pixels.

In one embodiment at least the second interpolation method may be a bi-linear interpolation method. The use of the bi-linear method provides a low quality method having a reduced computational time for processing image layers containing less relevant information for further processing. It is considered that other methods available to the skilled person providing the same effect may also be used.

In embodiments of the present invention the distortion correction of the colour image involves any one of de-skew, projective correction and de-warping, which are three different types of distortion that may be introduced during capturing of the colour image. Once these distortion types are identified with corresponding algorithms, distortion correction and up-scaling of the colour image can be simultaneously performed.

In particular, in the case of de-skew, the image distortion value used in the transformation matrix may be estimated from the skew angle of at least the first image layer. In order to find the skew angle a multi-scale projection profile based method may be selected, which can be performed in a fast manner with accurate results. However, other methods known to the skilled person can also be used for finding the skew angle of the colour image.

In a further embodiment the formulation of the distortion correction and up-scaling matrix is based on the resolution of the colour image. The resolution of the image may be obtained by directly reading the specification of the imaging device and filling the required values in the matrix.

Alternatively, the image resolution may be estimated from the average size of the connected components obtained from an image binarization step, which binarization step is performed prior to the step of formulating the transformation matrix.
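The text does not spell out how the connected-component size is turned into a resolution, so the following is only a sketch under a stated assumption: that the average component height corresponds to a typical character height expressed in points (1 pt = 1/72 inch). The default of 7 pt, the 4-connectivity and the function names are all hypothetical choices.

```python
from collections import deque


def component_heights(binary):
    """Heights (in pixels) of 4-connected foreground components of a 0/1 image."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    heights = []
    for r in range(h):
        for c in range(w):
            if not binary[r][c] or seen[r][c]:
                continue
            top = bottom = r
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:                       # breadth-first flood fill
                cr, cc = queue.popleft()
                top, bottom = min(top, cr), max(bottom, cr)
                for nr, nc in ((cr + 1, cc), (cr - 1, cc), (cr, cc + 1), (cr, cc - 1)):
                    if 0 <= nr < h and 0 <= nc < w and binary[nr][nc] and not seen[nr][nc]:
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            heights.append(bottom - top + 1)
    return heights


def estimate_dpi(binary, assumed_char_height_pt=7.0):
    """Rough dpi estimate: average component height over an assumed physical height."""
    heights = component_heights(binary)
    if not heights:
        raise ValueError("no foreground components found")
    average_px = sum(heights) / len(heights)
    return average_px * 72.0 / assumed_char_height_pt


page = [[1, 0, 0, 1],
        [1, 0, 0, 1],
        [0, 0, 0, 1],
        [0, 0, 0, 1]]
print(component_heights(page), estimate_dpi(page, assumed_char_height_pt=6.0))
```

In practice the estimate would depend strongly on the assumed character size and on how noise and merged characters are filtered out, which this sketch does not attempt.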

It has been found that the rotation and up-scaling matrix may be determined by the identified image resolution, skew angle and the application's requirement for the target resolution. Furthermore, an input scale parameter can also be provided by the user and included in the transformation matrix. For example, the input scale parameter may be the desired resolution of the output image, the original resolution of the colour image, a parameter relating to the effective distortion of the image, and in general any parameter related to the quality of the colour image.

In one embodiment, the first image layer comprises luminance values of the colour image pixels. It has been found that the further processing accuracy is mainly determined by the luminance value of each pixel in the colour image. Therefore, the image layer or layers containing luminance information can be considered the most relevant for further processing applications, such as OCR. As a result, it may be sufficient to perform accurate but time-consuming interpolation methods on this luminance layer only.

In a further embodiment, the remaining multiple image layers other than the first image layer comprise chrominance values of the colour image pixels. It has been found that the chrominance values of the colour image affect to a lesser degree the accuracy of further processing application. As a result, it may be sufficient to apply a less accurate but fast interpolation method on chrominance layers without affecting the accuracy of the OCR.

In one embodiment of the present invention, a method is provided for performing Optical Character Recognition (OCR) on a colour image, the method comprises the steps of providing a colour image having a first resolution, applying the method according to the present invention; and subsequently performing Optical Character Recognition on the destination colour image.

By performing an OCR on a colour image that has been processed with the method of the present invention, the accuracy of the OCR processing can be significantly improved, while at the same time the computational time can be dramatically reduced.

In another embodiment of the present invention the colour image may be in RGB format. However, other image formats known to the person skilled in the art may be equally used.

In an aspect of the present invention, a computer program product comprises software code portions stored in a non-transitory computer medium for performing the steps of the embodiments of the methods described above when the program is run on a computer device. To this end the non-transitory computer medium used for storing the software code portions may be a storage medium suitable for use in a computer device or the like, such as a peripheral memory device, a CD, a DVD or any other, or a non-transitory computer medium to which a remote connection can be established for downloading the computer program product.

The use of the software program run on a computer can allow the user to apply the method of improving the quality of the colour image in an easy and fast manner. A further advantage of using a non-transitory medium for storing the software portions is that the above described methods can be transferred from one computer device to another, thereby enabling the user to perform the method irrespective of the location of the computer device.

According to another aspect of the invention, a system is provided for improving the quality of a colour image prior to performing subsequent processing on the colour image. The system comprises: first means for converting the colour image into multiple image layers distinguishable from one another; second means for formulating a distortion correction and up-scaling transformation matrix based on at least an image distortion value estimated from at least a first image layer of the multiple image layers and a target resolution; third means for simultaneously correcting and up-scaling each of the multiple image layers based on the values of the transformation matrix; and fourth means for recombining the corrected and up-scaled multiple image layers into a destination colour image.

To this end the system according to the second aspect of the present invention may comprise an electronic device comprising at least one Integrated Circuit (IC), such as a tablet, a phone or a computer device. Furthermore, the system may be an Application Specific IC (ASIC) or a Field Programmable Gate Array (FPGA). The first, second, third and fourth means may be formed by one or more processors of such an electronic device along with appropriate software code provided in a memory on such an electronic device. In embodiments, the first, second, third or fourth means may be provided remotely from such an electronic device and communication means may be provided on the electronic device and the remote device for establishing a communication link and communicating results from the various steps performed back and forth.

In embodiments, the system may be provided with other means for further performing any other steps of the embodiments of methods described above.

Brief description of the drawings

The invention will be further elucidated by means of the following description and the appended figures.

Figure 1 shows a method for improving the quality of a colour image according to solutions found in the prior art.

Figure 2 presents a processing flow algorithm for simultaneously applying the image deskew and up-scaling method according to a preferred embodiment of the present invention.

Figure 3 shows a processing flow of an OCR system based on preferred embodiments of the present invention.

Modes for carrying out the invention

The present invention will be described with respect to particular embodiments and with reference to certain drawings but the invention is not limited thereto but only by the claims. The drawings described are only schematic and are non-limiting.

Furthermore, the terms first, second, third and the like in the description and in the claims, are used for distinguishing between similar elements and not necessarily for describing a sequential or chronological order. The terms are interchangeable under appropriate circumstances and the embodiments of the invention can operate in other sequences than described or illustrated herein.

The term "comprising", used in the claims, should not be interpreted as being restricted to the means listed thereafter; it does not exclude other elements or steps. It needs to be interpreted as specifying the presence of the stated features, integers, steps or components as referred to, but does not preclude the presence or addition of one or more other features, integers, steps or components, or groups thereof. Thus, the scope of the expression "a device comprising means A and B" should not be limited to devices consisting only of components A and B. It means that with respect to the present invention, the only relevant components of the device are A and B.

The present invention may now be described with reference to the accompanying drawings, which provide exemplary embodiments of the present invention.

Figure 1 presents a method for improving the quality of an image according to solutions of the prior art. The method starts with step 10 of identifying the spatial resolution of the input image. In most cases the spatial resolution can be found inside the file header of the input image written by the scanner used for capturing the colour image. In case this information is unavailable, it may also be possible to estimate the image's spatial resolution with known image processing methods. The next step according to the method presented in Figure 1 is to estimate the skew angle, as shown in step 11. This is performed by using a known solution such as the multi-scale projection profile method. Once the skew angle is estimated, the image is rotated at step 12 to compensate for the skew angle, using a selected interpolation method applied on the three colour images and based on the estimated angle. After the step of image rotation, the colour image is up-scaled to the desired resolution in step 13, which is performed separately from step 12. As a result, generating the corrected and up-scaled image using the method presented in Figure 1 may take a considerable amount of time. This is because distortion correction and up-scaling are each performed by separately applying an interpolation method to the captured image. Performing distortion correction and up-scaling independently therefore involves applying the interpolation methods twice on the same image layer, thereby considerably increasing the time required for improving the quality of the image.

Figure 2 shows an exemplary embodiment of a processing flow algorithm employed by the method of the present invention. An advantage of the method of the present invention is that distortion correction, such as deskew, and up-scaling techniques are combined together to improve the quality of the image. Therefore, compared to the solutions of the prior art, whereby the interpolation is performed twice as shown in Figure 1, embodiments of the present invention can be less time-consuming. As a result, by using the methods of embodiments of the present invention the quality of a colour image can be improved in a fast and accurate manner.

An advantage of embodiments of the present invention is that the colour image layers can be processed differently during distortion correction and up-scaling. For example in the case of deskew, the skew angle may be estimated based on the luminance layer (Y). This is because the luminance layer of the colour image contains the most distinguishing information and can enable an accurate estimation of the skew angle. By processing a single image layer or a reduced number of image layers for detecting the skew angle, compared to processing all the image layers, the workload can be dramatically reduced. The same also applies to the rotation and up-scaling stage of the processing flow, as shown in Figure 2. It is known that OCR can rely more on the luminance information of the colour image than on the chrominance information. As a result, it can be sufficient to perform accurate but time-consuming interpolation methods on the luminance layer while applying a less accurate but fast interpolation method on the chrominance layers without affecting the accuracy of the OCR.

An advantage of the method of the present invention is that the image can be directly up-scaled to the resolution of the targeted application. For example, when converting a document containing normal text (10 pt and above) a minimum of 300 dpi is preferred for obtaining good OCR accuracy. However, when converting a business card to a colour image a minimum resolution of 400 dpi is preferred for OCR as the text there is often written with a small point size (e.g. 8 pt). Based on the resolution ratio before and after the operation as well as the identified skew angle, a rotation and up-scaling transformation matrix can be formulated as:

(1)

where S_x and S_y represent the resolution ratio before and after the operation in the horizontal direction and vertical direction respectively, θ stands for the identified skew angle, [x, y] is the output image pixel coordinate and its corresponding location in the input image is [x', y']. S_x and S_y are determined by the resolution of the input image, which can be either extracted from the specification of the imaging devices or be estimated by analyzing the image contents, and the application's requirement for target resolution (for example, 400 dpi for scanned business card).
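The following sketch shows one way such a rotation and up-scaling matrix could be built and used to map an output pixel to its location in the input image. The arrangement of the terms is an assumption (a rotation by the skew angle applied to coordinates scaled by S_x and S_y), as are the sign convention of the angle, the choice of S as source resolution divided by target resolution, and the function names.

```python
import math


def scale_ratio(source_dpi, target_dpi):
    """Resolution ratio before/after the operation (assumed: source / target)."""
    return source_dpi / target_dpi


def transform_matrix(theta_deg, sx, sy):
    """Assumed combined matrix: rotation by theta times diag(sx, sy)."""
    t = math.radians(theta_deg)
    cos_t, sin_t = math.cos(t), math.sin(t)
    return [[sx * cos_t, -sy * sin_t],
            [sx * sin_t,  sy * cos_t]]


def map_to_source(matrix, x, y):
    """Location [x', y'] in the input image of output pixel [x, y]."""
    return (matrix[0][0] * x + matrix[0][1] * y,
            matrix[1][0] * x + matrix[1][1] * y)


# Business-card example from the text: 300 dpi input, 400 dpi target.
s = scale_ratio(300, 400)
matrix = transform_matrix(1.5, s, s)     # 1.5 degrees is a made-up skew angle
print(map_to_source(matrix, 400, 200))
```

Because scaling and rotation are folded into a single matrix, each output pixel is mapped to the input grid once and therefore needs only one interpolation, rather than one for the rotation and another for the up-scaling.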

As the output image pixel's position in the input image grid [x', y'] is not always an integer, image interpolation must be involved. An analysis of the prior interpolation art has shown that bilinear interpolation is fast but gives an image that is too blurred for OCR. When the image is blurred, the OCR accuracy will drop because it becomes difficult to determine the character edges. Bi-cubic interpolation tends to preserve the edge information and results in less loss of valid pixel information, but it is time-consuming as it has to process 16 different source pixels to estimate one destination pixel.

Bi-cubic interpolation employs a spline, which is a piecewise polynomial. Mitchell and Netravali have introduced an interesting family of cubic splines, the BC-splines:

\[
f(x) = \frac{1}{6}
\begin{cases}
(12 - 9B - 6C)\,|x|^3 + (-18 + 12B + 6C)\,|x|^2 + (6 - 2B), & |x| < 1 \\
(-B - 6C)\,|x|^3 + (6B + 30C)\,|x|^2 + (-12B - 48C)\,|x| + (8B + 24C), & 1 \le |x| < 2 \\
0, & \text{otherwise}
\end{cases}
\qquad (2)
\]

where B and C are parameters that control the shape of the cubic curves and thus the appearance of the output image. The parameters are chosen such that the effects of blurring and ringing on the output image are balanced. Mitchell et al. have shown that for a good image reconstruction, 2C + B should be equal to 1, see Mitchell, Don P. and Netravali, Arun N., "Reconstruction Filters in Computer Graphics", Computer Graphics, vol. 22, no. 4, August 1988, pp. 221-228, which is incorporated herein by reference in its entirety.
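For reference, the BC-spline family of Mitchell and Netravali can be evaluated with a few lines of Python. The sketch below follows the published piecewise cubic; the function name and the printed sample values are illustrative choices.

```python
def bc_spline(x, b, c):
    """Mitchell-Netravali BC-spline weight at distance x from the sample."""
    ax = abs(x)
    if ax < 1:
        return ((12 - 9 * b - 6 * c) * ax ** 3
                + (-18 + 12 * b + 6 * c) * ax ** 2
                + (6 - 2 * b)) / 6
    if ax < 2:
        return ((-b - 6 * c) * ax ** 3
                + (6 * b + 30 * c) * ax ** 2
                + (-12 * b - 48 * c) * ax
                + (8 * b + 24 * c)) / 6
    return 0.0


# Weights of the four taps around a position 0.25 pixel past a source sample,
# for two (B, C) pairs that satisfy 2C + B = 1.
for b, c in ((0.0, 0.5), (1.0, 0.0)):
    taps = [bc_spline(0.25 - k, b, c) for k in (-1, 0, 1, 2)]
    print((b, c), taps, sum(taps))
```

With B = 0 and C = 0.5 the kernel equals 1 at x = 0 and 0 at x = ±1, so source pixel values are reproduced exactly at integer positions (a sharper result); with B = 1 and C = 0 the centre weight drops to 2/3 and neighbouring pixels are mixed in, which smooths the result. In both cases the four tap weights sum to 1.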

According to embodiments of the invention, it has been found that blurring is sometimes acceptable and even desirable for OCR. It makes it possible to remove noise or imperfections in the colours of the text or of the background of the text, in order to suppress "false" edges. As a result, the bi-cubic BC-splines are very interesting for up-scaling the colour image. This is because, if the acceptable blurring effect is known, the ringing error can be minimised, so that the combined effect of blurring and ringing is minimised.

If, for example, it is estimated that the source image is already too blurred, B can be set to 0 and C to 0.5.

If, for example, it is estimated that the source image is sharp enough, B can be set to 1 and C to 0.

The estimation of the B parameter can be performed by determining the scanner type and analyzing the sharpness of the images produced by the scanner.

The B parameter could also be estimated directly from the colour image by analyzing the luminance values of its pixels, in particular at the character edges.

Therefore, by choosing B and C carefully, the tradeoff between blurring and ringing reconstruction errors can be easily found, which is another advantage of the presented invention.

As previously discussed, OCR mainly relies on luminance information rather than chrominance information. Therefore, for the luminance image layer, the time-consuming but accurate bi-cubic interpolation method may be adopted, while for the remaining chrominance image layers the less accurate but fast bilinear method may be employed.

The processing flow algorithm shown in Figure 2, which is a preferred deskew and up-scaling method, may comprise the following steps:

a) Providing an input image, which will mostly be an RGB color image but may also be any other format known to the person skilled in the art, which is converted into 3 layers: Y (luminance), U and V (two chrominance).

b) Identify the resolution of the input image, as shown in step 20.

c) Converting the input image into Y, U, and V image layers, as shown in step 21.

d) Search for the skew angle using the multi-scale profile based method, as shown in step 22

e) Based on the user input, estimate the rotation and up-scaling transformation matrix (1), as shown in step 23.

f) The Y layer is interpolated with the bi-cubic method (BC-splines) with carefully selected B and C in (2), creating a new layer, as shown in step 24.

g) The U and V layers are interpolated into new layers with the bilinear method, as shown in step 25.

h) The Y', U' and V' layers are converted into the RGB destination color image, as shown in step 26.
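The steps above can be sketched end to end as follows. This is an illustrative pure-Python sketch under several assumptions that are not taken from the text: BT.601 weights for the YUV conversion, the form of the combined matrix (rotation by the skew angle applied to scaled coordinates), rotation about the image centre, clamping at the borders, and all function names. It is written for clarity rather than speed.

```python
import math


def bc_kernel(x, b, c):
    """Mitchell-Netravali BC-spline weight."""
    ax = abs(x)
    if ax < 1:
        return ((12 - 9*b - 6*c) * ax**3 + (-18 + 12*b + 6*c) * ax**2 + (6 - 2*b)) / 6
    if ax < 2:
        return ((-b - 6*c) * ax**3 + (6*b + 30*c) * ax**2
                + (-12*b - 48*c) * ax + (8*b + 24*c)) / 6
    return 0.0


def clamp(value, low, high):
    return max(low, min(value, high))


def sample_bicubic(layer, x, y, b, c):
    """Weighted sum of the 16 nearest source pixels (borders clamped)."""
    h, w = len(layer), len(layer[0])
    x0, y0 = math.floor(x), math.floor(y)
    total = 0.0
    for j in range(-1, 3):
        wy = bc_kernel(y - (y0 + j), b, c)
        row = layer[clamp(y0 + j, 0, h - 1)]
        for i in range(-1, 3):
            total += wy * bc_kernel(x - (x0 + i), b, c) * row[clamp(x0 + i, 0, w - 1)]
    return total


def sample_bilinear(layer, x, y):
    """Weighted sum of the 4 nearest source pixels (borders clamped)."""
    h, w = len(layer), len(layer[0])
    x, y = clamp(x, 0.0, w - 1.0), clamp(y, 0.0, h - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = layer[y0][x0] * (1 - fx) + layer[y0][x1] * fx
    bottom = layer[y1][x0] * (1 - fx) + layer[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def deskew_and_upscale(rgb, theta_deg, sx, sy, b=0.0, c=0.5):
    """rgb: rows of (R, G, B) tuples; sx, sy: source/target resolution ratios."""
    h, w = len(rgb), len(rgb[0])
    # Step c): split into Y, U and V layers (BT.601 weights, assumed).
    y_l = [[0.299 * r + 0.587 * g + 0.114 * bl for r, g, bl in row] for row in rgb]
    u_l = [[0.492 * (p[2] - yv) for p, yv in zip(row, yr)] for row, yr in zip(rgb, y_l)]
    v_l = [[0.877 * (p[0] - yv) for p, yv in zip(row, yr)] for row, yr in zip(rgb, y_l)]
    out_h, out_w = round(h / sy), round(w / sx)
    t = math.radians(theta_deg)
    cos_t, sin_t = math.cos(t), math.sin(t)
    icx, icy = (w - 1) / 2, (h - 1) / 2          # input centre
    ocx, ocy = (out_w - 1) / 2, (out_h - 1) / 2  # output centre
    out = []
    for oy in range(out_h):
        out_row = []
        for ox in range(out_w):
            # Step e): one combined mapping from output pixel to input position.
            dx, dy = sx * (ox - ocx), sy * (oy - ocy)
            xin = icx + cos_t * dx - sin_t * dy
            yin = icy + sin_t * dx + cos_t * dy
            yv = sample_bicubic(y_l, xin, yin, b, c)   # step f): bi-cubic on Y
            uv = sample_bilinear(u_l, xin, yin)        # step g): bilinear on U, V
            vv = sample_bilinear(v_l, xin, yin)
            r = yv + vv / 0.877                        # step h): back to RGB
            bl = yv + uv / 0.492
            g = (yv - 0.299 * r - 0.114 * bl) / 0.587
            out_row.append((r, g, bl))
        out.append(out_row)
    return out


flat = [[(100, 150, 200)] * 4 for _ in range(4)]
result = deskew_and_upscale(flat, 2.0, 0.5, 0.5)
print(len(result), len(result[0]))
```

Each destination pixel costs 16 source reads for the Y layer and 4 reads for each of the U and V layers, which is where the saving relative to bi-cubic interpolation of all three layers comes from.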

By applying the proposed method, the geometric distortion is corrected and the image is up-scaled to a higher resolution. The simultaneous deskew and up-scaling method is considered to be two times faster than the separated deskew and up-scaling method when the same image interpolation method is used for both deskew and up-scaling. The layer-dependent interpolation scheme further reduces the interpolation time: the layer-dependent method is 2.5 times faster than direct bi-cubic interpolation of the RGB colour image.

Figure 3 shows how embodiments of the present invention can be incorporated in an OCR processing chain. It is also possible that methods of the present invention can be optimized for purposes other than OCR. For example, methods of the present invention may be used to improve the quality of a colour image, which is obtained and displayed in front of the user for visual interpretation.

Claims

1. A method for improving the quality of a colour image prior to performing subsequent processing on the colour image, the method comprising the steps of :

converting the colour image into multiple image layers distinguishable from one another (21);

formulating a distortion correction and up-scaling transformation matrix (23) based on at least an image distortion value estimated from at least a first image layer of the multiple image layers and a target resolution;

simultaneously correcting and up-scaling each of the multiple image layers (24, 25) based on the values of the transformation matrix; and

recombining the corrected and up-scaled multiple image layers into a destination colour image (26).

2. The method according to claim 1, wherein the simultaneously correcting and up-scaling (24, 25) is executed on each of the multiple image layers in parallel.

3. The method according to claim 1 or claim 2, wherein correction and up-scaling is performed by applying a first interpolation method (24) to the at least first image layer and a second interpolation method (25) to a second image layer of the remaining multiple image layers.

4. The method according to claim 3, wherein the second interpolation method (25) has a lesser quality than the first interpolation method (24).

5. The method according to any one of claims 3 to 4, wherein the first interpolation method (24) is a bi-cubic interpolation method using splines.

6. The method according to claim 5, wherein the bi-cubic method comprises at least a first and a second parameter, which parameters control the quality of the colour image.

7. The method according to claim 6, wherein the first parameter is estimated based on the sharpness of the colour image generated by a capturing device.

8. The method according to claim 6, wherein the first parameter is estimated based on the luminance values obtained from the colour image pixels.

9. The method according to any one of claims 3 to 8, wherein the second interpolation method (25) is a bi-linear interpolation method.

10. The method according to any one of claims 1 to 9, wherein the distortion correction of the colour image involves any one of de-skew, rotate or projection correction methods.

11. The method according to any one of claims 1 to 10, wherein the image distortion value is estimated from a skew angle (22) of the first image layer.

12. The method according to any one of claims 1 to 11, wherein the formulation of the distortion correction and up-scaling matrix (23) is further based on the resolution of the colour image.

13. The method according to claim 12, wherein the colour image resolution is estimated from the average size of the connected components obtained from an image binarization step, which binarization step is performed prior to the step of formulating the transformation matrix.

14. The method according to any one of the preceding claims, wherein the first image layer comprises luminance values of the colour image pixels.

15. The method according to any one of the preceding claims, wherein the multiple image layers other than the first image layer comprise chrominance values of the colour image pixels.

16. A method for performing Optical Character Recognition on a colour image, the method comprising the steps of :

providing a colour image having a first resolution (30);

applying the method according to any one of claims 1 to 15 for improving the quality of the colour image (31); and

performing Optical Character Recognition on the destination colour image (32).

17. The method according to any of the preceding claims, wherein the colour image is in RGB format.

18. A computer program product comprising software code portions stored in a non-transitory computer medium for performing the steps of the method according to any one of claims 1 to 17 when the program is run on a computer.

19. A system configured for improving the quality of a colour image prior to performing subsequent processing on the colour image, the system comprising:

first means for converting the colour image into multiple image layers distinguishable from one another;

second means for formulating a distortion correction and up-scaling transformation matrix based on at least an image distortion value estimated from at least a first image layer of the multiple image layers and a target resolution;

third means for simultaneously correcting and up-scaling each of the multiple image layers based on the values of the transformation matrix; and,

fourth means for recombining the corrected and up-scaled multiple image layers into a destination colour image.