WO2007052957A1 - Device and method of classifying an image - Google Patents
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06K—GRAPHICAL DATA READING; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K7/00—Methods or arrangements for sensing record carriers, e.g. for reading patterns
- G06K7/10—Methods or arrangements for sensing record carriers, e.g. for reading patterns by electromagnetic radiation, e.g. optical sensing; by corpuscular radiation
- G06K7/14—Methods or arrangements for sensing record carriers, e.g. for reading patterns by electromagnetic radiation, e.g. optical sensing; by corpuscular radiation using light without selection of wavelength, e.g. sensing reflected white light
Definitions
- the present invention relates to a device and method of classifying an image, and more particularly, to a device and method in which an image type is classified into an image code, a logo, a picture, or the like and decoding according to the classified type of the image is performed internally or by means of an external server.
- Interface technology which is used for connecting to a computer or a server on a network using information extracted by recognizing and analyzing an image input by a camera and thus receiving a predetermined service has drawn much attention as the mobile computing environment becomes widely used.
- a technology related to image recognition there is a biometrics technology by which a face, finger prints, a hand vein, an iris, a gesture, or the like can be recognized, a watermark technology by which a predetermined pattern can be hidden in an image and extracted, and a recognition technology by which logos, letters on objects, or the like can be recognized.
- the present invention provides a device and method of classifying an image in which an image type input through a predetermined image input device is detected so as to determine whether the image is decodable internally and only characteristic information on the image is transmitted externally when decoding cannot be performed internally.
- the present invention also provides a computer-readable medium having embodied thereon a computer program for executing the method of classifying an image.
- a device for classifying an image comprising: an image input unit receiving a target image as an input; an external characteristic acquisition unit primarily determining an image type of the target image based on a similarity between external characteristics of the target image and various types of reference images; an internal characteristic acquisition unit secondarily determining an image type of the target image based on a similarity between internal characteristics of the target image and internal characteristics of the reference images; and an image classification unit determining whether the target image can be internally decoded based on the image type of the target image determined based on results of the primary and secondary determinations.
- A method of classifying an image comprising: receiving a target image as an input; primarily determining an image type of the target image based on a similarity between external characteristics of the target image and various types of reference images; secondarily determining an image type of the target image based on a similarity between internal characteristics of the target image and internal characteristics of the reference images; and determining whether the target image can be internally decoded based on the image type of the target image determined based on results of the primary and secondary determinations.
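The two-stage determination summarized above can be sketched as a small program. This is an illustrative sketch only: the reference-data layout, the similarity functions, and the threshold are assumptions made for the example, not part of the disclosure.

```python
# Illustrative sketch of the two-stage (external, then internal) classification.
# The reference table, similarity callables, and threshold are hypothetical.

def classify(target, references, ext_sim, int_sim, threshold=0.8):
    """Return (image_type, internally_decodable) for a target image."""
    # First determination: narrow the candidate types by external
    # characteristics (outline shape, boundary pattern).
    candidates = [r for r in references if ext_sim(target, r) >= threshold]
    if not candidates:
        return ("determination impossible", False)
    # Second determination: compare internal characteristics (inside
    # pattern and color) only within the narrowed range.
    best = max(candidates, key=lambda r: int_sim(target, r))
    if int_sim(target, best) < threshold:
        return ("determination impossible", False)
    return (best["type"], best["decodable_in_terminal"])
```

A terminal would supply real external/internal similarity measures; the sketch only shows the structure of the decision (narrow by external characteristics, then decide within the narrowed range).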
- a type of input image can be classified into an image code, a logo, a drawing, etc. by image processing of the input image, and appropriate decoding for each type can be performed. Accordingly, it can be quickly determined whether the image can be internally decoded in order to prevent unnecessary decoding, and image recognition can be performed more rapidly in decoding in a terminal or a server by extracting basic characteristic information in advance.
- In addition, for a complicated image or an image requiring many calculations, it can be determined in advance whether decoding is to be performed in the terminal or in a server, so as to improve speed and precision.
- Conventionally, when decoding is performed in a server, the terminal transmits an image without first determining whether the image is decodable, which wastes transmission time and resources.
- In the present invention, only images which are decodable in the server are transmitted, in order to process information efficiently.
- FIGS. 1A to 1F are diagrams illustrating examples of image codes;
- FIG. 2 is a block diagram illustrating an apparatus for classifying an image code according to an embodiment of the present invention;
- FIG. 3 is a block diagram illustrating a detailed structure of an image input unit of the device for classifying an image, as illustrated in FIG. 2;
- FIG. 4A is a block diagram illustrating a detailed structure of an external characteristic acquisition unit of a device for classifying an image according to an embodiment of the present invention;
- FIG. 4B is a block diagram illustrating a detailed structure of an internal characteristic acquisition unit of a device for classifying an image according to an embodiment of the present invention;
- FIG. 5 is a flowchart illustrating a method of classifying an image according to an embodiment of the present invention; and
- FIG. 6 is a flowchart illustrating a method of classifying an image according to an embodiment of the present invention.
- FIGS. 1A to 1F are diagrams illustrating examples of image codes.
- FIGS. 1A, 1B, and 1C are examples of a data matrix, a quick response (QR) code, and a PDF-417 code, respectively.
- FIGS. 1D, 1E, and 1F are examples of color and grey codes, a mixed code generated by overlaying a color code and a QR code, and a mixed code generated by overlaying a color code and a predetermined image, respectively.
- An embodiment of the present invention discloses a method of classifying the image codes illustrated in FIGS. 1A to 1F.
- the present invention is not limited thereto and a different image code, logo or photograph can be defined and classified.
- a method of classifying image codes will be mainly described.
- FIG. 2 is a block diagram illustrating an apparatus for classifying an image code according to an embodiment of the present invention.
- the apparatus for classifying an image code in the current embodiment includes an image input unit 200, an external characteristic acquisition unit 210, an internal characteristic acquisition unit 220, an image classification unit 230, a decoding unit 240, and a characteristic information transmission unit 250.
- the image input unit 200 receives a target image as an input from a predetermined image input device.
- the image input device may be a camera which is built into or externally attached to a personal portable terminal or a device which receives a target image as an input through wire/wireless networks in the form of an electronic document. Detailed structures of the image input unit 200 will be described in more detail with reference to FIG. 3.
- the external characteristic acquisition unit 210 performs a first determination on a type of target image based on the shape of an outline and a pattern of the image.
- the external characteristic acquisition unit 210 performs the first determination on a type of image that is to be analyzed by comparing an external characteristic of the target image with information on external characteristics and patterns of various types of images which are analyzed and stored in advance.
- the various images which are analyzed and stored in advance are collectively referred to as a reference image.
- the external characteristic acquisition unit 210 firstly determines whether the image is a two-dimensional image when a shape of an outline of the image to be analyzed is a predefined figure. Detailed structures of the external characteristic acquisition unit 210 will be described in more detail with reference to FIG. 4A.
- the internal characteristic acquisition unit 220 performs a second determination on a type of target image based on an inside pattern and color of the target image.
- the types of target image are confined within a range according to the first determination performed by the external characteristic acquisition unit 210. Accordingly, the inside pattern and color of the reference images within the range are compared with those of the target image in order to determine the type of target image.
- the image classification unit 230 determines the type of target image based on results of the first and second determination and determines whether the target image can be internally decoded based on the image type of the target image.
- For example, when the type of the target image is determined to be an image code based on the result of the first determination and is determined to be a color code among two-dimensional image codes based on the result of the second determination, if the number of colors used in the color code is equal to or smaller than a predetermined number, the target image is determined not to be a normal color code. Accordingly, the image classification unit 230 determines whether the target image can be decoded based on the results of the first and second determinations.
- a color code includes cells using four colors, eight colors, or the like or three or more pieces of brightness information.
- In such a case, the image classification unit 230 classifies the target image as 'determination impossible' in spite of the results of the first and second determinations.
- When the type cannot be determined, the target image is classified as 'determination impossible'. In this case, the image is severely damaged, and so binarization should be performed again, or the image should be input again.
- Even when the target image is shapeless, the type of the target image can be determined based on the similarity.
- the decoding unit 240 decodes the target image using a decoding method corresponding to the type of target image when the target image is determined to be internally decodable based on the image type of the target image by the image classification unit 230.
- the decoding unit 240 determines letter and number information by a pattern matching technique using characteristic information for a letter or a number. For a 2D image code or a color code, the decoding unit 240 can perform decoding by determining positions of cells using the characteristic information and normalization.
- When the target image is shapeless, the decoding unit 240 extracts the most similar image from a database, and a corresponding service is provided by using the extracted image. The extraction of the characteristic information will be described in detail with reference to FIGS. 4A and 4B.
- the decoding unit 240 decodes the target image internally.
- the internal decoding may not be performed due to a limitation in the information processing capability of a mobile terminal.
- the characteristic information transmission unit 250 transmits the target image to an external server having high performance in order for the target image to be decoded and receives the result of the decoding, when the target image cannot be processed internally.
- Examples of images which cannot be processed internally are logos and images that include many letters or numbers. However, these images may become internally decodable as the performance of a terminal having a device for classifying an image according to an embodiment of the present invention is improved.
- the image types of which images cannot be internally decoded are set so as to be flexible based on the performance of the terminal.
- the characteristic information transmission unit 250 may analyze the target image and transmit only the analyzed characteristic information on the target image to the external server instead of transmitting the target image to the external server, depending on the communication speed between the device and the external server. The extraction of the characteristic information for the transmission will be described in detail later with reference to FIGS. 4A and 4B.
- FIG. 3 is a block diagram illustrating a detailed structure of an image input unit of the device for classifying an image, as illustrated in FIG. 2.
- the image input unit 200 includes an image normalization unit 300, a binarization unit 310, and an image extraction unit 320.
- the image input unit 200 separates a portion representing predetermined information, that is, an image code, a logo, a letter, or the like, from an image input through a camera or the like in order to generate a target image.
- the image normalization unit 300 at first converts the image input through a camera into an image format which can be easily decoded. For example, the image normalization unit 300 converts an input image having a format such as YUV24 and RGB24 into an RGB24 format. Thereafter, the image normalization unit 300 corrects color, brightness, and the like of the image and arranges the direction and size of the image so as to generate a normalized image.
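The format conversion performed by the image normalization unit can be illustrated with a single-pixel YUV-to-RGB conversion. The full-range BT.601 coefficients used here are one common choice and an assumption of this sketch; the patent does not specify which conversion the normalization unit uses.

```python
def yuv_to_rgb24(y, u, v):
    """Convert one full-range BT.601 YUV (Y, Cb, Cr) pixel to (R, G, B).

    Coefficients are the common full-range BT.601 values; this is an
    illustrative assumption, not the conversion mandated by the patent.
    """
    r = y + 1.402 * (v - 128)
    g = y - 0.344136 * (u - 128) - 0.714136 * (v - 128)
    b = y + 1.772 * (u - 128)
    clamp = lambda c: max(0, min(255, int(round(c))))  # keep in 8-bit range
    return (clamp(r), clamp(g), clamp(b))
```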
- the color and brightness of the image input by means of a camera or the like are greatly influenced by the type of light, the time when the image is inputted (day or night), a printing medium on which the image is printed, performance of a camera sensor, etc.
- the image normalization unit 300 measures a color variance and a brightness variance of the input image, a distribution characteristic of the variances, and the like, and corrects the color and brightness of the image.
- the image normalization unit 300 at first divides the image into a predetermined number of blocks and acquires a brightness level and a characteristic value of light for the blocks.
- the image normalization unit 300 compares the characteristic values for the blocks with each other and performs interpolation in order to acquire brightness information of the whole image.
- the method described above is generally used for processing gradation brightness information, and correction can be performed using a variance ratio of the brightness.
- the brightness information can be precisely acquired when the same colors are compared with each other. Accordingly, the image normalization unit 300 captures a predetermined image under a predetermined standard light environment in order to analyze distributions of the brightness and the like in advance. The image normalization unit 300 can correct the brightness of the image more easily and precisely by comparing the distributions of the brightness which are analyzed in advance and the distribution of the brightness of the input image with each other.
- In other words, the image normalization unit 300 analyzes, in advance, a distribution ratio and maximal and minimal values of the R, G, and B channels for each color, and maximal, minimal, and median values and a distribution ratio of the brightness under a standard light.
- the image normalization unit 300 analyzes the distribution ratio of the R, G, B channels and the like of the image input under a general light environment and compares the results of the analysis with the values under the standard light which is analyzed in advance in order to correct the color and the brightness of the input image.
- the image normalization unit 300 may use a color correction algorithm such as a Gray World Assumption method, a Retinex algorithm, a Gamut mapping algorithm, etc., for generating a normalized image in which the brightness and color are corrected.
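As one example of such color correction algorithms, the Gray World Assumption can be sketched as follows: it assumes the average color of a scene is achromatic and scales each channel toward the overall mean brightness. The list-of-tuples image representation is a simplification for illustration only.

```python
def gray_world_correct(pixels):
    """Gray World white balance: scale each channel so its mean matches
    the overall mean brightness.

    `pixels` is a list of (r, g, b) tuples; returns corrected tuples.
    """
    n = len(pixels)
    means = [sum(p[c] for p in pixels) / n for c in range(3)]
    gray = sum(means) / 3.0            # target mean for every channel
    gains = [gray / m if m else 1.0 for m in means]
    clamp = lambda c: max(0, min(255, int(round(c))))
    return [tuple(clamp(p[c] * gains[c]) for c in range(3)) for p in pixels]
```

An image with a strong color cast (e.g. too much red) is pulled back toward neutral, while an already-neutral image is left unchanged.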
- the binarization unit 310 converts the normalized image in which the color and brightness are corrected into a binary image in which an image is represented by two colors or two brightness levels. In the current embodiment, the conversion into a binary image including black and white colors will now be described.
- the binarization unit 310 divides the normalized image by the color and the brightness levels. As an example, the binarization unit 310 converts the normalized image into a plurality of regions, each of which has the same color and the same brightness level by determining a color and a brightness level of a pixel to be the same as a color and a brightness level of an adjacent pixel when a difference of the color and the brightness level between the pixel and the adjacent pixel is within a predetermined error.
- the binarization unit 310 calculates a binary value for each region using a brightness level and a histogram for each R, G, and B channels.
- the binarization unit 310 may divide adjacent pixels into groups in units of several pixels and calculate a weighted average of the adjacent pixels in the group, and the average may be set as a frequency of a corresponding brightness level.
- the binarization unit 310 selects a value of the brightness or color of the group which has the least occurring frequency of the two groups as a binary value.
- the binarization unit 310 stores the brightness values and color values in each group having the least frequency as a binary environment variable in advance and selects and sets one of the values as a binary value.
- the binarization unit 310 converts the image into binary values of black and white and removes noise images. As an example, when the brightness of an image pixel is equal to or greater than the binary value, the binarization unit 310 converts the image pixel into a white color. On the other hand, when the brightness of an image pixel is smaller than the binary value, the binarization unit 310 converts the image pixel into a black color.
- the binarization unit 310 determines the black object to be a noise image and removes the black object.
- Since a 2D black-and-white image code includes a set of small black objects, a small black object cannot be removed routinely. Accordingly, the binarization unit 310 removes the outermost black pixels, that is, noise images connected to an outer frame, so as to protect a code image that is expected to be in the center of the image or in the proximity thereof. Then, the binarization unit 310 restores disconnected points of the image by performing Gaussian filtering, when the image is not even.
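The black/white conversion rule described above (pixels at or above the binary value become white, the rest black) can be sketched as follows. Choosing the binary value is simplified here to the mean brightness, a stand-in for the histogram-based grouping the binarization unit 310 actually performs; both helper names are illustrative.

```python
def mean_threshold(gray):
    """Simplified stand-in for the histogram-based choice of the binary
    value: the mean brightness of the whole image."""
    flat = [px for row in gray for px in row]
    return sum(flat) / len(flat)

def binarize(gray, threshold):
    """Convert a grayscale image (list of rows) to black (0) / white (255):
    pixels at or above the threshold become white, others black."""
    return [[255 if px >= threshold else 0 for px in row] for row in gray]
```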
- the image extraction unit 320 extracts a target image region from the binary image. At first, the image extraction unit 320 detects candidate regions in which objects exist by determining an outline of regions represented in black in the binary image. When the input image is a letter or a number, a shapeless figure which includes a line or is relatively thin is detected as a candidate region. On the other hand, when the input image is a color code or a logo, a shaped figure is detected as a candidate region. In addition, when the input image is a QR code, a data matrix, or the like, a candidate including a set of pixels and lines is detected.
- the candidate region detected by the image extraction unit 320 may include one or a plurality of regions.
- the image extraction unit 320 processes candidate regions which are separated from each other as one target image region. For example, since an image code such as a QR code is a set of pixels, the pixels within a predetermined distance may be regarded as one set after distances among the plurality of pixels are calculated. It is more useful to extend the region from an object located at the center of the image by calculating distances from adjacent pixels. Since, in most cases, the object of interest is located in the center of an image, this method can be easily used.
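The merging of nearby candidate regions into one target image region can be sketched as a single-linkage grouping of representative points: any two objects within a fixed distance of each other fall into the same group. The distance threshold and the point-based simplification are illustrative assumptions.

```python
def group_objects(points, max_dist=5.0):
    """Merge scattered objects (given by representative points) that lie
    within `max_dist` of each other into single target regions.

    Simple single-linkage grouping; the threshold is illustrative.
    """
    groups = []
    for p in points:
        # Find every existing group containing a point close to p.
        merged = [g for g in groups
                  if any(((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5 <= max_dist
                         for q in g)]
        # Fuse p with all such groups into one.
        new = [p] + [q for g in merged for q in g]
        groups = [g for g in groups if g not in merged] + [new]
    return groups
```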
- FIG. 4A is a block diagram illustrating a detailed structure of an external characteristic acquisition unit 210 of a device for classifying an image according to an embodiment of the present invention.
- the external characteristic acquisition unit 210 includes an external characteristic extraction unit 410 and a first pattern similarity calculation unit 420.
- the external characteristic extraction unit 410 extracts characteristic information of the target image by detecting an outline, a change in a direction, maximal and minimal coordinate points, and a distance of a unidirectional segment of the target image, in order to generate standardized pattern information.
- Characteristic information is, for example, coordinates of a characteristic point or a length of a chain code between the characteristic points. More specifically, information on a change in an angle of an outline of an image or segments of a characteristic pattern, coordinates of characteristic points, a pattern based on a relative position of the characteristic points, maximal and minimal points among external characteristic points of the image, and a length of the pattern may be included in the characteristic information.
- the characteristic points mean pixels indicating characteristics of an object which may be a segment or a figure of an image.
- various algorithms such as an edge detection algorithm and a method of detecting a termination or superposition of a character in character recognition are used together.
- Major information on each pixel structuring a segment of an object, such as a termination or a corner, information on a pixel at which the direction of the segment changes over a threshold angle, and information on a center point of 2D code cells are commonly used characteristic points.
- the external characteristic extraction unit 410 performs skeletonization, hook handling, smoothing, and normalization in order to generate standardized pattern information, as necessary. Each procedure is described as follows.
- Skeletonization means converting a segment or a letter into a segment which has a thickness of one pixel.
- the skeletonization is performed for an object having a thickness smaller than a predetermined thickness.
- the skeletonization can be performed for a thick object, as necessary.
- a segment resulting from the skeletonization clearly represents a shape or characteristics of the object, since the segment includes information on a central axis of the object. For example, when representing a square, a skeletonized segment in the shape of a letter 'X' in which diagonal vertices of the square are connected with each other can be obtained through skeletonization.
- the skeletonization procedure is performed by using a thinning algorithm.
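The patent does not name a specific thinning algorithm; one common choice is Zhang-Suen thinning, sketched below as one possible instantiation. It iteratively deletes boundary pixels that satisfy connectivity-preserving conditions until only a one-pixel-wide skeleton remains.

```python
def zhang_suen_thin(img):
    """Thin a binary image (lists of 0/1 rows) to a one-pixel skeleton
    using the classic Zhang-Suen algorithm (one possible choice of
    thinning algorithm; not specified by the patent)."""
    h, w = len(img), len(img[0])
    img = [row[:] for row in img]  # work on a copy

    def neighbours(y, x):
        # P2..P9, clockwise starting from the pixel directly above.
        return [img[y-1][x], img[y-1][x+1], img[y][x+1], img[y+1][x+1],
                img[y+1][x], img[y+1][x-1], img[y][x-1], img[y-1][x-1]]

    changed = True
    while changed:
        changed = False
        for step in (0, 1):
            to_clear = []
            for y in range(1, h - 1):
                for x in range(1, w - 1):
                    if not img[y][x]:
                        continue
                    n = neighbours(y, x)
                    b = sum(n)  # number of foreground neighbours
                    # a = number of 0->1 transitions around the pixel
                    a = sum(n[i] == 0 and n[(i + 1) % 8] == 1 for i in range(8))
                    p2, p4, p6, p8 = n[0], n[2], n[4], n[6]
                    if step == 0:
                        cond = p2 * p4 * p6 == 0 and p4 * p6 * p8 == 0
                    else:
                        cond = p2 * p4 * p8 == 0 and p2 * p6 * p8 == 0
                    if 2 <= b <= 6 and a == 1 and cond:
                        to_clear.append((y, x))
            for y, x in to_clear:
                img[y][x] = 0
                changed = True
    return img
```

Applied to a thick stroke, the result is a thin central segment that preserves the stroke's central axis, as described above.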
- Hook handling is for removing a hook portion of a characteristic point.
- a hook, which can be regarded as image-type noise, may exist in a start portion of a letter, and the hook portion may be removed.
- the hook portion has a short length between the characteristic points and an abrupt change in direction, and accordingly, the hook portion is removed based on the characteristics of the hook portion.
- Normalization readjusts the size or direction of an image according to a predetermined reference, since the size and direction of an image input by means of a camera can vary based on characteristics of the camera. Normalization includes size normalization and direction normalization.
- a length of the shortest segment from among structuring elements of an object or a length of an n-divided segment resulting from dividing the shortest segment by n is set as a unit distance, and the lengths of other segments are represented in multiples of this unit distance. Accordingly, a length ratio of the segments can be acquired.
- the size of the image can be standardized by enlarging or contracting the lengths of the segments structuring an outline of the object using the same ratio. In other words, since the number of unit distances can be acquired by dividing the sum of the lengths of the segments constructing the outline by the unit distance, the sizes of images can be standardized so as to make the images have a length the same as the number of unit distances.
- In direction normalization, direction information between predetermined characteristic points, that is, chain code information, is used.
- the direction information from a reference characteristic point to a characteristic point A which is connected to the reference characteristic point is acquired, and the direction information from the characteristic point A to a characteristic point B which is connected to the characteristic point A is acquired.
- the direction information on all the segments constructing an outline is acquired using the method described above in order to generate a continuous chain code.
- When the outline is closed, the start characteristic point becomes the final characteristic point; otherwise, the start characteristic point is different from the final characteristic point.
- a shape number is acquired using the difference between the direction numbers constructing the chain code.
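The chain-code difference and shape-number computation described above can be sketched for a 4-directional Freeman chain code (consistent with the digit range 0-3 in the shape numbers quoted below): the circular first difference removes dependence on the absolute direction, and taking the numerically smallest rotation removes dependence on the start point.

```python
def shape_number(chain):
    """Shape number of a 4-directional Freeman chain code: circular
    first difference, then the rotation that is numerically smallest,
    making the result start-point and rotation independent."""
    n = len(chain)
    diff = [(chain[i] - chain[i - 1]) % 4 for i in range(n)]  # circular
    rotations = [diff[i:] + diff[:i] for i in range(n)]
    return min(rotations)
```

For a unit square the difference code is all 1s regardless of where the traversal starts, so every traversal yields the same shape number.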
- the first pattern similarity calculation unit 420 searches for a most similar pattern by comparing a pattern of the characteristic information, which is extracted from the target image, and predetermined reference patterns.
- the reference patterns are patterns (or pattern shape numbers) determined by analyzing in advance reference images such as various types of image codes and logos which can be internally processed by a personal mobile terminal or the like having a device for classifying an image according to an embodiment of the present invention. Accordingly, the first pattern similarity calculation unit 420 can perform a first determination on the type of target image by comparing a pattern generated from the characteristic information on the target image and a reference pattern (or a pattern shape number).
- the first pattern similarity calculation unit 420 forms a set of reference patterns having the same number of elements as the unit distance number, that is, the number of elements of a set of shape numbers, obtained from the normalization procedure described above and compares orders of elements of shape numbers of the image pattern and a reference pattern in order to find a similar reference pattern.
- the comparison should be performed while considering that the shape number of the image pattern may start at a different position, even though both shape numbers are normalized so as to start with the minimal difference number '0'.
- For example, when a shape number of the image pattern is '03313303' and a shape number of a reference pattern is '03033133', both represent the same image pattern, since one is a cyclic rotation of the other.
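The rotation equivalence of shape numbers illustrated above can be checked with the standard doubled-string trick: one string is a cyclic rotation of another exactly when it occurs as a substring of the other concatenated with itself.

```python
def same_shape_number(a, b):
    """True when shape-number strings a and b describe the same pattern,
    i.e. one is a cyclic rotation of the other."""
    return len(a) == len(b) and b in a + a
```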
- When the target image has a single outline, the first pattern similarity calculation unit 420 calculates a pattern similarity between the target image pattern and a reference image pattern only once.
- However, when the target image is a set of small objects having a plurality of outlines, the first pattern similarity calculation unit 420 should calculate a set of pattern similarities corresponding to each object.
- In a QR code, finder patterns exist in every corner except the right lower corner of the code, and in a maxicode, a finder pattern having a shape of concentric circles called a 'bull's eye' exists in the center.
- In a PDF-417 code, characteristic patterns called guide bars exist on left and right sides of the code.
- other patterns which can characterize a corresponding code other than the finder patterns and the guide bars such as a timing pattern and an alignment pattern exist in a predetermined position.
- Since one code includes a set of distributed patterns, that is, a set of outlines and points, the similarity of each of the distributed patterns should be calculated so as to measure the overall similarity of the corresponding code.
- Similarly, since a logo may include distributed patterns, similarities between the distributed patterns are calculated, and it is required to calculate the overall similarity of the logo as a whole using the set of distributed patterns.
- a method in which relative distances between the distributed patterns are considered is the most commonly used method and is called a spiral search.
- In FIG. 4A, a method in which normalized shape numbers of characteristic patterns are generated and compared with shape numbers of a reference pattern is described.
- An aspect of the present invention is to first determine the type of the target image by comparing external characteristics of the target image with those of the reference images. Accordingly, in order to achieve this aspect of the present invention, various methods of comparing the external characteristics may be used.
- the shape of an outline of a color code object is a square or a rectangle, and accordingly it is determined whether the target image can be classified as a color code based on these shape characteristics.
- a bar code includes a predetermined guide pattern and a set of white/black bars, and accordingly it is determined whether the target image can be classified as a bar code based on this set of patterns.
- a 2D image code can be divided into a matrix type and a layered type
- most of the 2D image codes include finder patterns and pixels in a data region.
- the finder patterns exist at a vertex or in an outline portion of a code, but in some special codes they may exist in the center. Examples of the finder pattern are a square, a predetermined white/black bar pattern, concentric circles, concentric squares, and an outline form. Accordingly, when a finder pattern exists in a predetermined position of the target image, it is determined that the 2D image code can be classified.
- the remaining portions, apart from the finder pattern, include shapeless patterns of pixels. It is not necessary to check each remaining portion; instead, it is checked whether the complete shape of the object is a square, a rectangle, a predetermined figure, or the like.
- when the shape of the image of the object as a whole forms one of the predetermined figures, such as a square, a rectangle, a hexagon, or an octagon, even though it does not correspond to a shape of the codes described above, or when the shape of the image is similar to reference patterns already registered, it is determined that the figure or the segment can be classified. This is because a pattern with such an overall shape is likely an image code or a company logo, which have particular types of shapes.
- when the shape of a pattern is a set of relatively short straight lines and curved segments, a similarity is measured by comparing the shape of the image with a set of reference patterns corresponding to letters and numbers.
- the similarity is measured by comparing the image with a set of reference patterns corresponding to shapeless patterns.
- when the similarity is greater than a predetermined similarity, it is determined that the figure or the segment can be classified.
- FIG. 4B is a block diagram illustrating a detailed structure of an internal characteristic acquisition unit of a device for classifying an image according to an embodiment of the present invention.
- the internal characteristic acquisition unit 220 includes an internal characteristic extraction unit 430 and a second pattern similarity calculation unit 440.
- the internal characteristic extraction unit 430 detects characteristic points and color information inside a target image which has first been determined to be classifiable. For the extraction of internal characteristics, the various methods used by the external characteristic extraction unit 410 can be used. Since the outline of a bar code is itself the outline of the patterns constructing the object, it is not necessary to extract internal characteristics. However, in a two-dimensional code, another pattern exists inside a finder pattern.
- for a finder pattern having the shape of concentric circles, such as in a MaxiCode, the internal characteristic extraction unit 430 detects characteristic points of the concentric circles. Also, in a QR code, since a black square exists in a finder pattern, the internal characteristic extraction unit 430 detects the internal characteristic of the black square. In a color code, since figures in various forms exist inside a corresponding outline, the internal characteristic extraction unit 430 detects color information in the color code, along with characteristic points of those figures, as internal characteristics. In a logo image, since a letter or a number may be included in a figure, the internal characteristic extraction unit 430 detects characteristic points for the letter or number and determines a color for each region.
- the second pattern similarity calculation unit 440 calculates a distribution of internal characteristic points and their similarities to the reference patterns. In other words, the second pattern similarity calculation unit 440 measures similarities between the characteristic points and characteristic information detected by the internal characteristic extraction unit 430 and the reference patterns. Since the target image is classified into an image type, that is, the group of candidates to which the target image belongs, based on the external characteristics by the first pattern similarity calculation unit 420, the second pattern similarity calculation unit 440 performs the function of re-determining whether the first classification is correct. Accordingly, the second pattern similarity calculation unit 440 determines the similarity to be very high when the internal characteristic points and characteristic information of the target image match the reference characteristic information of the corresponding candidate group. Otherwise, the second pattern similarity calculation unit 440 determines the similarity to be low.
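The two-stage determination (external characteristics narrow the candidates, internal characteristics confirm or reject the first determination) can be sketched as follows. The type names, feature labels, reference table, and set-based matching are illustrative assumptions, not the patent's data structures:

```python
# Hypothetical reference table mapping image types to the external and
# internal characteristics expected of them (labels are invented).
REFERENCES = {
    "barcode":    {"external": {"black_bars"},     "internal": set()},
    "qr_code":    {"external": {"square_outline"}, "internal": {"black_square_finder"}},
    "color_code": {"external": {"square_outline"}, "internal": {"color_cells"}},
}

def classify(external: set, internal: set) -> str:
    """First determination: reference types whose external characteristics
    are all present become candidates. Second determination: the first
    candidate whose internal characteristics also match is accepted;
    otherwise classification is rejected."""
    candidates = [t for t, ref in REFERENCES.items() if ref["external"] <= external]
    for t in candidates:
        if REFERENCES[t]["internal"] <= internal:
            return t
    return "determination impossible"

print(classify({"square_outline"}, {"black_square_finder"}))  # qr_code
```

Note how a square outline alone leaves two candidates (QR code and color code); the internal characteristics decide between them, mirroring the re-determination role of the second pattern similarity calculation unit 440.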
- FIG. 5 is a flowchart illustrating a method of classifying an image according to an embodiment of the present invention.
- the target image is decoded using a method corresponding to the image code (Operation S525).
- when the target image is determined to be an image type that cannot be processed internally, that is, a logo, a letter, or the like, the target image is transmitted to an external server, and the decoding result is received (Operation S530). A corresponding service based on the decoding result is provided (Operation S535).
- FIG. 6 is a flowchart illustrating a method of classifying an image according to an embodiment of the present invention.
- an image printed on a predetermined printing medium is received by means of an image input device (Operation S600).
- a normalized image, in which color and brightness level distortions caused by the surrounding environment at the time the image is input are corrected, is generated (Operation S605).
- the normalized image is converted into a binary image including two colors of black and white (Operation S610).
- a target image is extracted from the binary image (Operation S615).
- Characteristic patterns are extracted by analyzing patterns of an outline and characteristic points of the target image (Operation S620).
- a pattern similarity between the characteristic patterns of the target image and characteristic patterns of reference images (that is, reference patterns) is calculated (Operation S625).
- the target image is first determined to be the same type as a predetermined reference image (Operation S630), and internal characteristic points of the target image are detected (Operation S635). Then it is determined whether the internal characteristics of the target image are similar to those of the image type detected in the first determination (Operation S640).
- the embodiments of the present invention can be written as computer programs and can be implemented in general-use digital computers that execute the programs using a computer readable recording medium.
- Examples of the computer readable recording medium include magnetic storage media (e.g., ROM, floppy disks, hard disks, etc.), optical recording media (e.g., CD-ROMs, or DVDs), and storage media such as carrier waves (e.g., transmission through the Internet).
Abstract
A device and method of classifying an image are provided. In the device and method, an image type of the target image is first determined based on a similarity between external characteristics of the target image and those of various types of reference images, and it is determined for a second time whether internal characteristics of the target image are similar to internal characteristics of the image type resulting from the first determination, in order to determine whether the target image can be internally decoded. Accordingly, it can be determined whether the input image can be internally decoded based on the image type of the target image, so as to prevent unnecessary decoding.
Description
DEVICE AND METHOD OF CLASSIFYING AN IMAGE
Technical Field
[1] The present invention relates to a device and method of classifying an image, and more particularly, to a device and method in which an image type is classified into an image code, a logo, a picture, or the like and decoding according to the classified type of the image is performed internally or by means of an external server.
Background Art
[2] Interface technology, which is used for connecting to a computer or a server on a network using information extracted by recognizing and analyzing an image input by a camera and thus receiving a predetermined service, has drawn much attention as the mobile computing environment has become widely used. Technologies related to image recognition include biometrics technology, by which a face, fingerprints, a hand vein, an iris, a gesture, or the like can be recognized, watermark technology, by which a predetermined pattern can be hidden in an image and extracted, and recognition technology, by which logos, letters on objects, or the like can be recognized.
[3] As the performance of personal mobile terminals such as mobile computers, cellular phones, and PDAs gradually improves, a large amount of information can be processed in the personal mobile terminal. However, since image recognition is a very complicated procedure and requires many calculations, not all of the procedures for image recognition can be processed in the personal mobile terminal. Therefore, only several image codes, such as a color code and a bar code, and simple face recognition can be processed in the mobile terminal. A complicated image code, biometric information, a logo, and the like are analyzed in an external server, instead of in the terminal, by transmitting the image to the external server.
[4] When the image is transmitted to the external server, a relatively slow network speed of the terminal should be considered, and after a certain period of time a decoding result should be received from the external server. Accordingly, it takes several seconds to several minutes to transmit the image to the external server and receive a processing result from the external server. The image should be inputted again and transmitted when the image quality is not good or when the decoding fails. A communication fee for transmitting data between the mobile terminal and the external server is charged to the mobile terminal.
[5] Generally, in order to provide a service of an image recognition area, only a predetermined type of image is recognized. For example, in a biometrics service, a corresponding biometric image can be recognized, and in an image code service, only an
image code image can be recognized. Of course, in the image code service, a one-dimensional bar code, a two-dimensional code, or the like can be recognized. However, there is no case where biometric recognition, code recognition, and logo recognition are performed in one terminal. This is because it cannot be determined whether an input image is an image code, a biometric image, a logo, or a letter in one terminal.
Disclosure of Invention
Technical Problem
[6] The present invention provides a device and method of classifying an image in which an image type input through a predetermined image input device is detected so as to determine whether the image is decodable internally and only characteristic information on the image is transmitted externally when decoding cannot be performed internally.
[7] The present invention also provides a computer-readable medium having embodied thereon a computer program for executing the method of classifying an image.
Technical Solution
[8] According to an aspect of the present invention, there is provided a device for classifying an image comprising: an image input unit receiving a target image as an input; an external characteristic acquisition unit primarily determining an image type of the target image based on a similarity between external characteristics of the target image and various types of reference images; an internal characteristic acquisition unit secondarily determining an image type of the target image based on a similarity between internal characteristics of the target image and internal characteristics of the reference images; and an image classification unit determining whether the target image can be internally decoded based on the image type of the target image determined based on results of the primary and secondary determinations.
[9] According to another aspect of the present invention, there is provided a method of classifying an image comprising: receiving a target image as an input; primarily determining an image type of the target image based on a similarity between external characteristics of the target image and various types of reference images; secondarily determining an image type of the target image based on a similarity between internal characteristics of the target image and internal characteristics of the reference images; and determining whether the target image can be internally decoded based on the image type of the target image determined based on results of the primary and secondary determinations.
Advantageous Effects
[11] According to the present invention, a type of input image can be classified into an
image code, a logo, a drawing, etc. by image processing of the input image, and appropriate decoding for each type can be performed. Accordingly, it can be quickly determined whether the image can be internally decoded in order to prevent unnecessary decoding, and image recognition can be performed more rapidly in decoding in a terminal or a server by extracting basic characteristic information in advance.
[12] In addition, it can be determined in advance whether decoding is performed in a terminal or a server by recognizing a complicated image or an image requiring many calculations so as to improve speed and precision. In addition, when decoding is performed in a server, the terminal generally transmits an image without determining whether decoding is performed in the terminal so as not to waste any time. However, according to the present invention, only images which are decodable in the server are transmitted in order to efficiently process information.
Description of Drawings
[13] FIGS. 1A to 1F are diagrams illustrating examples of image codes;
[14] FIG. 2 is a block diagram illustrating an apparatus for classifying an image code according to an embodiment of the present invention;
[15] FIG. 3 is a block diagram illustrating a detailed structure of an image input unit of the device for classifying an image, as illustrated in FIG. 2;
[16] FIG. 4A is a block diagram illustrating a detailed structure of an external characteristic acquisition unit of a device for classifying an image according to an embodiment of the present invention;
[17] FIG. 4B is a block diagram illustrating a detailed structure of an internal characteristic acquisition unit of a device for classifying an image according to an embodiment of the present invention;
[18] FIG. 5 is a flowchart illustrating a method of classifying an image according to an embodiment of the present invention; and
[19] FIG. 6 is a flowchart illustrating a method of classifying an image according to an embodiment of the present invention.
Best Mode
[20] According to an aspect of the present invention, there is provided a device for classifying an image comprising: an image input unit receiving a target image as an input; an external characteristic acquisition unit primarily determining an image type of the target image based on a similarity between external characteristics of the target image and various types of reference images; an internal characteristic acquisition unit secondarily determining an image type of the target image based on a similarity between internal characteristics of the target image and internal characteristics of the reference images; and an image classification unit determining whether the target
image can be internally decoded based on the image type of the target image determined based on results of the primary and secondary determinations.
[21] According to another aspect of the present invention, there is provided a method of classifying an image comprising: receiving a target image as an input; primarily determining an image type of the target image based on a similarity between external characteristics of the target image and various types of reference images; secondarily determining an image type of the target image based on a similarity between internal characteristics of the target image and internal characteristics of the reference images; and determining whether the target image can be internally decoded based on the image type of the target image determined based on results of the primary and secondary determinations.
Mode for Invention
[23] Now, exemplary embodiments of the present invention will be described in detail with reference to the accompanying drawings.
[24] FIGS. 1A to 1F are diagrams illustrating examples of image codes.
[25] FIGS. 1A, 1B, and 1C are examples of a data matrix, a quick response (QR) code, and a PDF-417 code, respectively. FIGS. 1D, 1E, and 1F are examples of color and grey codes, a mixed code generated by overlaying a color code and a QR code, and a mixed code generated by overlaying a color code and a predetermined image, respectively.
[26] An embodiment of the present invention discloses a method of classifying image codes illustrated in FIGS. IA to IF. However, the present invention is not limited thereto and a different image code, logo or photograph can be defined and classified. For convenience of description, however, a method of classifying image codes will be mainly described.
[27] FIG. 2 is a block diagram illustrating an apparatus for classifying an image code according to an embodiment of the present invention.
[28] Referring to FIG. 2, the apparatus for classifying an image code in the current embodiment includes an image input unit 200, an external characteristic acquisition unit 210, an internal characteristic acquisition unit 220, an image classification unit 230, a decoding unit 240, and a characteristic information transmission unit 250.
[29] The image input unit 200 receives a target image as an input from a predetermined image input device. The image input device may be a camera which is built into or externally attached to a personal portable terminal, or a device which receives a target image as an input through wire/wireless networks in the form of an electronic document. Detailed structures of the image input unit 200 will be described in more detail with reference to FIG. 3.
[30] The external characteristic acquisition unit 210 performs a first determination on a
type of target image based on the shape of an outline and a pattern of the image. The external characteristic acquisition unit 210 performs the first determination on a type of image that is to be analyzed by comparing an external characteristic of the target image with information on external characteristics and patterns of various types of images which are analyzed and stored in advance. Hereinafter, the various images which are analyzed and stored in advance are collectively referred to as a reference image. For example, since a bar code includes a plurality of black bars, and a two-dimensional image includes a predefined figure such as a square, a rectangle, or the like, the external characteristic acquisition unit 210 firstly determines whether the image is a two-dimensional image when a shape of an outline of the image to be analyzed is a predefined figure. Detailed structures of the external characteristic acquisition unit 210 will be described in more detail with reference to FIG. 4A.
[31] The internal characteristic acquisition unit 220 performs a second determination on a type of target image based on an inside pattern and color of the target image. The types of target image are confined within a range according to the first determination performed by the external characteristic acquisition unit 210. Accordingly, the inside pattern and color of the reference images within the range are compared with those of the target image in order to determine the type of target image.
[32] For example, for a bar code, there is no other pattern or color inside a black bar, and so additional detection of the internal characteristic is not required. However, for a two-dimensional image code, it cannot be determined whether the target image is a two-dimensional image code only from a shape of an outline of the target image. Accordingly, it is necessary to determine whether the inside pattern includes characteristics of a two-dimensional image code, even though the shape or pattern of the outline is determined to be a shape of an image code according to the first determination. Detailed structures of the internal characteristic acquisition unit 220 will be described in more detail with reference to FIG. 4B.
[33] The image classification unit 230 determines the type of the target image based on the results of the first and second determinations and determines whether the target image can be internally decoded based on that image type. However, even when the type of the target image is determined to be an image code based on the result of the first determination, and to be a color code among two-dimensional image codes based on the result of the second determination, the target image is determined not to be a normal color code when the number of colors used in the color code is equal to or smaller than a predetermined number. Accordingly, the image classification unit 230 determines whether the target image can be decoded based on the results of the first and second determinations.
[34] For example, a color code includes cells using four colors, eight colors, or the like
or three or more pieces of brightness information. When a color code includes only one color, the target image is determined not to be a normal color code, and accordingly the image classification unit 230 classifies the target image as 'determination impossible' in spite of the results of the first and second determinations. For a 2D code, when a considerable portion of the code image is damaged due to the reflection of light or ink, the similarity is determined to be low, and therefore the target image is classified as 'determination impossible'. In this case, the image is severely damaged, so binarization should be performed again, or the image should be input again. Even when the target image is shapeless, the type of the target image can be determined based on the similarity.
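The 'determination impossible' rule for color codes can be illustrated with a minimal check on the number of distinct cell colors. The color labels and the minimum of three distinct colors are assumptions for the sketch, not values taken from the patent:

```python
def check_color_code(cell_colors, min_distinct=3):
    """A normal color code uses several distinct cell colors (e.g. four or
    eight); when too few distinct colors appear, classification is rejected
    as 'determination impossible'."""
    if len(set(cell_colors)) < min_distinct:
        return "determination impossible"
    return "color code"

print(check_color_code(["red", "green", "blue", "red"]))  # color code
print(check_color_code(["red", "red", "red"]))            # determination impossible
```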
[35] The decoding unit 240 decodes the target image using a decoding method corresponding to the type of target image when the target image is determined to be internally decodable based on the image type of the target image by the image classification unit 230. The decoding unit 240 determines letter and number information by a pattern matching technique using characteristic information for a letter or a number. For a 2D image code or a color code, the decoding unit 240 can perform decoding by determining positions of cells using the characteristic information and normalization. Although the target image is shapeless, the decoding unit 240 extracts the most similar image from a database, and a corresponding service is provided by using the extracted image. The extraction of the characteristic information will be described in detail with reference to FIGS. 4A and 4B.
[36] When the type of the target image is determined to be a color code having a small data size, a two-dimensional code, a bar code, or the like, decoding can generally be performed, considering the performance of a terminal. Then, the decoding unit 240 decodes the target image internally. However, for image codes having large data sizes, the internal decoding may not be performed due to a limitation in the information processing capability of a mobile terminal.
[37] The characteristic information transmission unit 250 transmits the target image to an external server having high performance in order for the target image to be decoded, and receives the result of the decoding, when the target image cannot be processed internally. Examples of images which cannot be processed internally are logos and images that include many letters or numbers. However, these images may become internally decodable as the performance of a terminal having a device for classifying an image according to an embodiment of the present invention improves. The set of image types that cannot be internally decoded is therefore made flexible, based on the performance of the terminal.
[38] The characteristic information transmission unit 250 may analyze the target image and transmit only the analyzed characteristic information on the target image to the
external server instead of transmitting the target image to the external server, depending on the communication speed between the device and the external server. The extraction of the characteristic information for the transmission will be described in detail later with reference to FIGS. 4A and 4B.
[39] FIG. 3 is a block diagram illustrating a detailed structure of an image input unit of the device for classifying an image, as illustrated in FIG. 2.
[40] Referring to FIG. 3, the image input unit 200 includes an image normalization unit
300, a binarization unit 310, and an image extraction unit 320. The image input unit 200 separates a portion representing predetermined information, that is, an image code, a logo, a letter, or the like, from an image input through a camera or the like in order to generate a target image.
[41] The image normalization unit 300 at first converts the image input through a camera into an image format which can be easily decoded. For example, the image normalization unit 300 converts an input image having a format such as YUV24 and RGB24 into an RGB24 format. Thereafter, the image normalization unit 300 corrects color, brightness, and the like of the image and arranges the direction and size of the image so as to generate a normalized image.
[42] The color and brightness of the image input by means of a camera or the like are greatly influenced by the type of light, the time when the image is input (day or night), the printing medium on which the image is printed, the performance of the camera sensor, etc. The image normalization unit 300 measures a color variance, a brightness variance of the input image, a distribution characteristic of the variances, and the like, and corrects the color and brightness of the image.
[43] To explain the correction procedure in more detail, the image normalization unit 300 first divides the image into a predetermined number of blocks and acquires a brightness level and a characteristic value of light for each block. The image normalization unit 300 compares the characteristic values of the blocks with each other and performs interpolation in order to acquire brightness information of the whole image. The method described above is generally used for processing gradation brightness information, and correction can be performed using a variance ratio of the brightness.
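The block-based brightness estimation just described might be sketched as follows. The use of NumPy, the nearest-neighbour upsampling of block means, and the division-based correction are our simplifications (the image size is assumed divisible by the block count):

```python
import numpy as np

def estimate_illumination(gray: np.ndarray, blocks: int = 4) -> np.ndarray:
    """Divide the image into blocks x blocks regions, take each region's mean
    brightness, and upsample the block means back to pixel resolution
    (nearest-neighbour, via np.kron). Assumes the image size is divisible
    by `blocks`."""
    h, w = gray.shape
    bh, bw = h // blocks, w // blocks
    means = np.array([[gray[i*bh:(i+1)*bh, j*bw:(j+1)*bw].mean()
                       for j in range(blocks)]
                      for i in range(blocks)])
    return np.kron(means, np.ones((bh, bw)))

def flatten_illumination(gray: np.ndarray) -> np.ndarray:
    """Correct uneven lighting by dividing out the estimated illumination."""
    illum = estimate_illumination(gray)
    return gray * (gray.mean() / np.maximum(illum, 1e-6))
```

Applied to an image with a brightness gradient, the correction pulls the pixel values toward the global mean, reducing the overall brightness variance.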
[44] The brightness information can be precisely acquired when the same colors are compared with each other. Accordingly, the image normalization unit 300 captures a predetermined image under a predetermined standard light environment in order to analyze distributions of the brightness and the like in advance. The image normalization unit 300 can correct the brightness of the image more easily and precisely by comparing the distributions of the brightness which are analyzed in advance and the distribution of the brightness of the input image with each other.
[45] In other words, the image normalization unit 300 analyzes a distribution ratio, maximal and minimal values of R, G, and B channels for each color and maximal, minimal, and median values and a distribution ratio of the brightness under a standard light in advance. Thereafter, the image normalization unit 300 analyzes the distribution ratio of the R, G, B channels and the like of the image input under a general light environment and compares the results of the analysis with the values under the standard light which is analyzed in advance in order to correct the color and the brightness of the input image.
[46] The image normalization unit 300 may use a color correction algorithm such as a
Gray World Assumption method, a Retinex algorithm, a Gamut mapping algorithm, etc., for generating a normalized image in which the brightness and color are corrected.
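Of the algorithms named, the Gray World Assumption is the simplest to sketch: it assumes the average color of the scene is achromatic and scales each channel until the channel means agree. This is a textbook version, not necessarily the exact form used in the patent:

```python
import numpy as np

def gray_world(rgb: np.ndarray) -> np.ndarray:
    """Gray World Assumption: the scene's average color is assumed to be
    achromatic, so each channel is scaled until the channel means are equal."""
    means = rgb.reshape(-1, 3).mean(axis=0)        # mean of R, G, B channels
    gain = means.mean() / np.maximum(means, 1e-6)  # scale each channel toward gray
    return np.clip(rgb * gain, 0.0, 255.0)
```

An image with a red cast (high R mean, low G and B means) comes out with equal channel means after correction.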
[47] The binarization unit 310 converts the normalized image in which the color and brightness are corrected into a binary image in which an image is represented by two colors or two brightness levels. In the current embodiment, the conversion into a binary image including black and white colors will now be described.
[48] The binarization unit 310 divides the normalized image by the color and the brightness levels. As an example, the binarization unit 310 converts the normalized image into a plurality of regions, each of which has the same color and the same brightness level by determining a color and a brightness level of a pixel to be the same as a color and a brightness level of an adjacent pixel when a difference of the color and the brightness level between the pixel and the adjacent pixel is within a predetermined error.
[49] When the division of the normalized image according to the color and brightness level is completed, the binarization unit 310 calculates a binary value for each region using a brightness level and a histogram for each of the R, G, and B channels. The binarization unit 310 may divide adjacent pixels into groups in units of several pixels and calculate a weighted average of the adjacent pixels in the group, and the average may be set as a frequency of a corresponding brightness level. When values of the brightness level or the color can be clearly divided into two groups, the binarization unit 310 selects a value of the brightness or color of the group which has the lower frequency of occurrence as a binary value. When there are three or more groups, the binarization unit 310 stores the brightness values and color values of each group having the least frequency as a binary environment variable in advance and selects and sets one of the values as a binary value.
[50] The binarization unit 310 converts the image into binary values of black and white and removes noise images. As an example, when the brightness of an image pixel is equal to or greater than the binary value, the binarization unit 310 converts the image pixel into a white color. On the other hand, when the brightness of an image pixel is
smaller than the binary value, the binarization unit 310 converts the image pixel into a black color.
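The threshold rule in paragraph [50] can be sketched as follows; the function name and the list-of-lists image representation are illustrative assumptions, not part of the disclosed implementation.

```python
def binarize(gray_image, binary_value):
    """Convert a grayscale image (rows of 0-255 brightness values) to black and white.

    Pixels at or above the binary value become white (255); darker pixels
    become black (0), as described in paragraph [50].
    """
    return [[255 if pixel >= binary_value else 0 for pixel in row]
            for row in gray_image]
```

For example, `binarize([[30, 200]], 128)` yields `[[0, 255]]`: the dark pixel maps to black and the bright pixel to white.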
[51] When the size of a black object is smaller than a predetermined value in the converted black-and-white image, the binarization unit 310 determines the black object to be a noise image and removes the black object. However, since a 2D black-and-white image code includes a set of small black objects, a small black object cannot be removed routinely. Accordingly, the binarization unit 310 removes only the outermost black pixels, that is, noise images connected to the outer frame, so as to protect a code image that is expected to be in the center of the image or in the proximity thereof. Then, the binarization unit 310 restores disconnected points of the image by performing Gaussian filtering when the image is uneven.
[52] The image extraction unit 320 extracts a target image region from the binary image. First, the image extraction unit 320 detects candidate regions in which objects exist by determining the outlines of regions represented in black in the binary image. When the input image is a letter or a number, a shapeless figure which includes a line or is relatively thin is detected as a candidate region. On the other hand, when the input image is a color code or a logo, a shaped figure is detected as a candidate region. In addition, when the input image is a QR code, a data matrix, or the like, a candidate region including a set of pixels and lines is detected.
[53] The candidate region detected by the image extraction unit 320 may include one or a plurality of regions. The image extraction unit 320 processes candidate regions which are separated from each other as one target image region. For example, since an image code such as a QR code is a set of pixels, the pixels within a predetermined distance may be regarded as one set after distances among the plurality of pixels are calculated. It is more useful to extend the region from an object located at the center of the image by calculating distances from adjacent pixels. Since, in most cases, the object of interest is located in the center of an image, this method can be easily used.
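The distance-based grouping of paragraph [53] might be sketched as follows; representing each candidate region by its center point and using Manhattan distance are both assumptions made for illustration.

```python
def group_regions(centers, max_dist):
    # Transitively merge candidate regions whose centers lie within max_dist
    # (Manhattan distance) of a region already in a group, so that a cloud of
    # code pixels is treated as a single target image region.
    groups = []
    for c in centers:
        near = [g for g in groups
                if any(abs(c[0] - p[0]) + abs(c[1] - p[1]) <= max_dist for p in g)]
        merged = [c] + [p for g in near for p in g]
        groups = [g for g in groups if g not in near]
        groups.append(merged)
    return groups
```

With `max_dist` set from the expected cell pitch of a 2D code, the scattered pixels of one code collapse into a single group while a distant logo stays separate.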
[54] FIG. 4A is a block diagram illustrating a detailed structure of an external characteristic acquisition unit 210 of a device for classifying an image according to an embodiment of the present invention. Referring to FIG. 4A, the external characteristic acquisition unit 210 includes an external characteristic extraction unit 410 and a first pattern similarity calculation unit 420.
[55] The external characteristic extraction unit 410 extracts characteristic information of the target image by detecting an outline, a change in direction, maximal and minimal coordinate points, and a distance of a unidirectional segment of the target image in order to generate standardized pattern information. Characteristic information is, for example, coordinates of a characteristic point or a length of a chain code between characteristic points. More specifically, information on a change in an angle of an outline
of an image or segments of a characteristic pattern, coordinates of characteristic points, a pattern based on a relative position of the characteristic points, maximal and minimal points among external characteristic points of the image, and a length of the pattern may be included in the characteristic information.
[56] The characteristic points are pixels indicating characteristics of an object, which may be a segment or a figure of an image. In order to detect the characteristic points, various algorithms, such as an edge detection algorithm and a method of detecting a termination or superposition of a character in character recognition, are used together. Major information on each pixel structuring a segment of an object, such as a termination or a corner, information on a pixel at which the direction of the segment changes over a threshold angle, and information on a center point of 2D code cells are commonly used characteristic points. References concerning methods of detecting the characteristic points are as follows.
[57] Bar Code Technology and Application, written by Oh, Ho-Keun (Seung-An Dang,
1997)
[58] Character Recognition - Theory and Practices, written by Lee, Seung-Hwan
(Hong-Neung Science Publications, 1994)
[59] Pattern Recognition and Image Analysis, written by Earl Gose; Richard
Johnsonbaugh; Steve Jost (Prentice Hall, 1996)
[60] ISO/IEC 18004:2000 Information technology - Automatic identification and data capture techniques - Bar code symbology - QR Code
[61] ISO/IEC 16022:2000 Information technology - International symbology specification - Data Matrix
[62] ISO/IEC 16023:2000 Information technology - International symbology specification - MaxiCode
[63] The external characteristic extraction unit 410 performs skeletonization, hook handling, smoothing, and normalization in order to generate standardized pattern information, as necessary. Each procedure is described as follows.
[64] (1) Skeletonization
[65] Skeletonization means converting a segment or a letter into a segment which has a thickness of one pixel. Generally, skeletonization is performed for an object having a thickness smaller than a predetermined thickness. However, skeletonization can be performed for a thick object, as necessary. When skeletonization is performed for a thick object, the segment resulting from the skeletonization clearly represents the shape or characteristics of the object, since the segment includes information on a central axis of the object. For example, when representing a square, a skeletonized segment in the shape of the letter 'X', in which diagonal vertices of the square are connected with each other, can be obtained through skeletonization. The skeletonization procedure is performed by using a thinning algorithm.
[66] (2) Hook Handling
[67] Hook handling is for removing a hook portion of a characteristic point. For example, in a letter, a hook, which can be regarded as an image type noise, exists in a start portion of the letter, and the hook portion may be removed. The hook portion has a short length between the characteristic points and an abrupt change in direction, and accordingly, the hook portion is removed based on the characteristics of the hook portion.
[68] (3) Smoothing
[69] Smoothing is for removing small perturbations of a segment caused by distortion or the like. In other words, for a set of curves which show a change in direction to some degree and which construct a segment as a whole, the characteristic points are densely located and show changes in direction; smoothing is therefore performed in order to process the characteristic points so that they converge into a single straight line or a curved line.
[70] For example, there is a method of smoothing proposed by Ellozy which smoothes segments when the segments have small perturbations in writing or drawing a picture (Equation 1), and there is also a method of smoothing proposed by Plamondon which straightens points of partly straight line portions on the whole (Equation 2).
[71] [Equation 1]
x_i' = (-3x_{i-2} + 12x_{i-1} + 17x_i + 12x_{i+1} - 3x_{i+2}) / 35
[72] [Equation 2]
x_i' = (x_{i-3} + 3x_{i-2} + 6x_{i-1} + 7x_i + 6x_{i+1} + 3x_{i+2} + x_{i+3}) / 27
[73] Here, x denotes a vector representing coordinates.
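Assuming the coefficient sets (-3, 12, 17, 12, -3)/35 for Equation 1 and (1, 3, 6, 7, 6, 3, 1)/27 for Equation 2, the smoothing step can be sketched as a weighted moving average over the stroke points; leaving endpoints without a full neighborhood unchanged is an assumption, not part of the disclosure.

```python
def smooth(points, kernel, divisor):
    # Apply a weighted-average smoothing filter to the interior points of a
    # stroke; points near the ends without a full neighborhood are unchanged.
    k = len(kernel) // 2
    out = list(points)
    for i in range(k, len(points) - k):
        out[i] = sum(w * points[i + j - k] for j, w in enumerate(kernel)) / divisor
    return out

ELLOZY = ([-3, 12, 17, 12, -3], 35)       # Equation 1 (assumed reconstruction)
PLAMONDON = ([1, 3, 6, 7, 6, 3, 1], 27)   # Equation 2 (assumed reconstruction)
```

Both kernels sum to their divisor, so a perfectly straight stroke passes through unchanged while small perturbations are averaged away.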
[74] (4) Normalization
[75] Normalization is for readjusting the size or direction of an image according to a predetermined reference, since the size and direction of an image which is input by means of a camera can vary based on characteristics of the camera. Normalization includes size normalization and direction normalization.
[76] First, in regard to size normalization, the length of the shortest segment among the structuring elements of an object, or the length of an n-divided segment resulting from dividing the shortest segment by n, is set as a unit distance, and the lengths of the other segments are represented in multiples of this unit distance. Accordingly, a length ratio of the segments can be acquired. The size of the image can be standardized by enlarging or contracting the segments structuring an outline of the object using the same ratio. In other words, since the number of unit distances can be acquired by dividing the sum of the lengths of the segments constructing the outline by the unit distance, the sizes of images can be standardized so as to make the images have a length the same as the number of unit distances.
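The unit-distance computation of paragraph [76] can be sketched as follows; rounding each ratio to the nearest integer multiple is an assumption made for illustration.

```python
def normalize_lengths(segment_lengths):
    # The shortest segment defines the unit distance; every segment length is
    # then expressed as a multiple of that unit, giving a size-invariant
    # length ratio for the outline.
    unit = min(segment_lengths)
    return [round(length / unit) for length in segment_lengths]
```

A rectangle measured at two different camera distances yields the same ratios, e.g. `normalize_lengths([2.0, 4.1, 3.9, 2.0])` gives `[1, 2, 2, 1]`.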
[77] Next, direction information between predetermined characteristic points, that is, chain code information, is acquired. For example, the direction information from a reference characteristic point to a characteristic point A which is connected to the reference characteristic point is acquired, and the direction information from the characteristic point A to a characteristic point B which is connected to the characteristic point A is acquired. The direction information on all the segments constructing an outline is acquired using the method described above in order to generate a continuous chain code. When the object has the shape of a closed figure rather than an open segment, the start characteristic point is also the final characteristic point; otherwise, the start characteristic point differs from the final characteristic point. After the chain code is formed, a shape number is acquired using the differences between the direction numbers constructing the chain code. (For detailed information, please refer to 'Digital Image Processing' by Rafael C. Gonzalez and Richard E. Woods, Addison-Wesley, 2002.)
[78] In order to acquire a shape number, an object which is not rotated may be used.
When an object is rotated, a portion having the least difference number is set as a reference, and difference numbers are acquired in a clockwise direction using an eight directional encoding method.
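The chain-code and shape-number procedure of paragraphs [77]-[78] (following Gonzalez and Woods) can be sketched as follows; normalizing to the lexicographically smallest cyclic rotation is one common way to realize the "least difference number first" rule and is an assumption here.

```python
def shape_number(chain_code):
    # First difference of an 8-directional chain code: the number of direction
    # steps (mod 8) between consecutive segments of a closed outline.
    n = len(chain_code)
    diff = [(chain_code[(i + 1) % n] - chain_code[i]) % 8 for i in range(n)]
    # Rotate to the smallest cyclic shift so the shape number does not depend
    # on which characteristic point the traversal started from.
    return min(diff[i:] + diff[:i] for i in range(n))
```

A square traversed as directions `[0, 2, 4, 6]` yields `[2, 2, 2, 2]`, and the same square traversed from any other corner yields the identical shape number.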
[79] The first pattern similarity calculation unit 420 searches for a most similar pattern by comparing a pattern of the characteristic information, which is extracted from the target image, and predetermined reference patterns. Here, the reference patterns are patterns (or pattern shape numbers) determined by analyzing in advance reference images such as various types of image codes and logos which can be internally processed by a personal mobile terminal or the like having a device for classifying an image according to an embodiment of the present invention. Accordingly, the first pattern similarity calculation unit 420 can perform a first determination on the type of target image by comparing a pattern generated from the characteristic information on the target image and a reference pattern (or a pattern shape number).
[80] More specifically, the first pattern similarity calculation unit 420 forms a set of reference patterns having the same number of elements as the unit distance number, that is, the number of elements of a set of shape numbers obtained from the normalization procedure described above, and compares the orders of elements of the shape numbers of the image pattern and a reference pattern in order to find a similar reference pattern. At this time, the comparison should be performed while considering that the shape number of the image pattern may start at a different position even though it is normalized to a minimal difference number '0'. As an example, when the shape number of the image pattern is '03313303' and the shape number of a reference pattern is '03033133', both represent the same image pattern.
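The rotation-tolerant comparison in paragraph [80] can be sketched with the standard doubled-string trick; treating shape numbers as strings of direction differences is an assumption.

```python
def same_shape(a, b):
    # Two shape-number strings describe the same outline pattern if one is a
    # cyclic rotation of the other, i.e. the traversal merely started at a
    # different characteristic point. A rotation of `a` must appear in `b + b`.
    return len(a) == len(b) and a in b + b
```

With the example from paragraph [80], `same_shape('03313303', '03033133')` returns `True`, since the second string is a cyclic rotation of the first.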
[81] When the target image is a figure, the inside of which is filled-in, a characteristic of one outer shape is extracted, and accordingly the first pattern similarity calculation unit 420 calculates a pattern similarity between a target image pattern and a reference image pattern only once. On the other hand, when the target image is a set of small objects having a plurality of outlines, the first pattern similarity calculation unit 420 should calculate a set of pattern similarities corresponding to each object.
[82] For a 2D image code, there is a similarity between patterns constructing data, and similarities between finder patterns which are used for searching for a code or outlines. For a QR code, there are three finder patterns including three vertices of the code, and a set of patterns according to various forms of pixels is formed in other portions.
[83] For example, in a QR code among 2D image codes, finder patterns exist in all corners of the code except the lower right corner, and in a maxicode a finder pattern having the shape of concentric circles called a 'bull's eye' exists in the center. In a bar code, characteristic patterns called guide bars exist on the left and right sides of the code. In addition, other patterns which can characterize a corresponding code, other than the finder patterns and the guide bars, such as a timing pattern and an alignment pattern, exist at predetermined positions. In this case, since one code includes a set of distributed patterns, that is, a set of outlines and points, the similarity of each of the distributed patterns should be calculated so as to measure the overall similarity of the corresponding code. Since a logo may also include distributed patterns, similarities between the distributed patterns are calculated, and the overall similarity as one logo is calculated using the set of distributed patterns. In order to determine whether the distributed patterns belong to one image, a method in which relative distances between the distributed patterns are considered is the most commonly used method and is called a spiral search.
[84] In FIG. 4A, a method in which normalized shape numbers of characteristic patterns are generated and compared with shape numbers of a reference pattern is described. However, the present invention is not limited thereto. An aspect of the present invention is to firstly determine the type of target image by comparing external characteristics of the target image and the reference image. Accordingly, in order to achieve this aspect of the present invention, various methods of comparing the external characteristics may be used.
[85] Hereinafter, detailed examples of firstly determining an image type by comparing the similarity between characteristics of the target image extracted by the external
characteristic extraction unit 410 and a reference image such as a color code and a bar code will be described.
[86] (1) Classification of Color Codes
[87] For a color code, the inside area of which is filled-in, the shape of an outline of an object is a square or a rectangle, and accordingly it is determined whether the color code can be classified based on the characteristics of the shape of the color code.
[88] (2) Classification of a Bar Code Image Code
[89] A bar code includes a predetermined guide pattern and a set of white/black bars, and accordingly, it is determined whether the bar code can be classified based on the set of the patterns.
[90] (3) Classification of a 2D Image Code
[91] Although a 2D image code can be divided into a matrix type and a layered type, most 2D image codes include finder patterns and pixels in a data region. Generally, the finder patterns exist at a vertex or in an outline portion of a code, but in some special codes the finder pattern may exist in the center. Examples of the finder pattern are a square, a predetermined white/black bar pattern, concentric circles, concentric squares, and an outline form. Accordingly, when a finder pattern exists at a predetermined position of the target image, it is determined that the 2D image code can be classified. The remaining portions, apart from the finder pattern, include shapeless patterns of pixels. It is not necessary to check each remaining portion; instead, it is checked whether the complete shape of the object includes a square, a rectangle, a predetermined figure, or the like.
[92] (4) Other Figures and Segments
[93] When the shape of an image of the object forms the shape of one of the predetermined figures as a whole, such as a square, a rectangle, a hexagon, or an octagon, although the shape of the image does not correspond to a shape of the code described above, or when the shape of the image is similar to reference patterns already registered, it is determined that the figure or the segment can be classified. That is because image codes and company logos, as a whole, have particular types of shapes. When the shape of a pattern is a set of relatively short straight lines and curved segments, a similarity is measured by comparing the shape of the image with a set of reference patterns corresponding to letters and numbers. When the image is shapeless, the similarity is measured by comparing the image with a set of reference patterns corresponding to shapeless patterns. When the similarity is greater than a predetermined similarity, it is determined that the figure or the segment can be classified.
[94] (5) Others
[95] When the similarity between the characteristic pattern of the target image and a set
of reference patterns is smaller than a predetermined similarity, it is determined that the target image cannot be classified, and the binary values which were previously calculated and the characteristic information on the characteristic points are analyzed so as to be used as reference material for binary calculation. In other words, when the area of an object is small and there are many disconnected segments after a binarization process has been performed, it can be determined that a portion that should have been processed into black was processed into white due to a binary value that is too low, so that major information on the object was removed. In the opposite case, since the binary value needs to be increased, it is required to select an appropriate binary value among the binary value candidates which have been determined in advance in order to match the condition.
[96] FIG. 4B is a block diagram illustrating a detailed structure of an internal characteristic acquisition unit of a device for classifying an image according to an embodiment of the present invention. Referring to FIG. 4B, the internal characteristic acquisition unit 220 includes an internal characteristic extraction unit 430 and a second pattern similarity calculation unit 440.
[97] The internal characteristic extraction unit 430 detects characteristic points and color information inside a target image which has first been determined to be classifiable. For the extraction of internal characteristics, the various methods used by the external characteristic extraction unit 410 can be used. Since the outline of a bar code consists of the outlines of the patterns constructing the object, it is not necessary to extract internal characteristics. However, in a two-dimensional code, another pattern exists inside a finder pattern.
[98] For example, for a finder pattern of a maxicode, in a found outline, other concentric circles exist, and accordingly, the internal characteristic extraction unit 430 detects characteristic points of the concentric circles. Also, in a QR code, since a black square exists in a finder pattern, the internal characteristic extraction unit 430 detects the internal characteristic of the black square. In a color code, since a figure in various forms exists in a corresponding outline, the internal characteristic extraction unit 430 detects color information in the color code along with characteristic points of the figure in various forms as internal characteristics. In a logo image, since there are some cases where a letter and a number are included in a figure, the internal characteristic extraction unit 430 detects characteristic points for the letter or number and determines a color for each region.
[99] The second pattern similarity calculation unit 440 calculates a distribution of internal characteristic points and similarities to the reference patterns. In other words, the second pattern similarity calculation unit 440 measures similarities between characteristic points and characteristic information detected by the internal characteristic
extraction unit 430 and the reference patterns. Since the target image has been classified into an image type, that is, a candidate group to which the target image belongs, based on the external characteristics by the first pattern similarity calculation unit 420, the second pattern similarity calculation unit 440 performs a function of re-determining whether the first classification is correct. Accordingly, the second pattern similarity calculation unit 440 determines the similarity to be very high when the internal characteristic points and characteristic information of the target image match the reference characteristic information of the corresponding candidate group. Otherwise, the second pattern similarity calculation unit 440 determines the similarity to be low.
[100] FIG. 5 is a flowchart illustrating a method of classifying an image according to an embodiment of the present invention.
[101] Referring to FIG. 5, at first, a target image is received as an input (Operation S500). It is firstly determined whether external characteristics of the target image are similar to those of an image code (Operation S505). When it is determined at first that the external characteristics of the target image are similar to those of the image code (Operation S510), it is then determined whether internal characteristics of the target image are similar to those of the image code (Operation S515). For comparing the external and internal characteristics of the target image and the image code, the same methods illustrated in FIGs. 4A and 4B are used.
[102] When it is determined that the internal characteristics of the target image are similar to those of the image code (Operation S520), the target image is decoded using a method corresponding to the image code (Operation S525). When the target image is determined to be an image type that cannot be processed internally, that is, a logo, a letter, or the like, the target image is transmitted to an external server, and the decoding result is received (Operation S530). A corresponding service based on the decoding result is provided (Operation S535).
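The two-stage decision flow of FIG. 5 can be sketched as follows; the similarity callables and the 0-to-1 score range are assumptions made so the flow can be shown in isolation, and `None` stands for the "send to external server" branch (Operation S530).

```python
def classify_image(target, reference_types, ext_sim, int_sim, threshold=0.8):
    # First determination: compare external characteristics of the target
    # against each reference type and keep the best match (Operation S505).
    best = max(reference_types, key=lambda ref: ext_sim(target, ref))
    if ext_sim(target, best) < threshold:
        return None  # external characteristics match no known image code
    # Second determination: confirm the candidate type using internal
    # characteristics (Operation S515).
    if int_sim(target, best) < threshold:
        return None  # first determination rejected; not internally decodable
    return best      # decode with the method for this type (Operation S525)
```

A target that scores above the threshold on both stages for one reference type is decoded internally with that type's method; anything else falls through to server-side processing.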
[103] FIG. 6 is a flowchart illustrating a method of classifying an image according to an embodiment of the present invention.
[104] Referring to FIG. 6, an image printed on a predetermined printing medium is received by means of an image input device (Operation S600). A normalized image, in which color and brightness level distortions according to neighboring environments at the time the image is input are corrected, is generated (Operation S605). The normalized image is converted into a binary image including two colors of black and white (Operation S610). A target image is extracted from the binary image (Operation S615).
[105] Characteristic patterns are extracted by analyzing patterns of an outline and characteristic points of the target image (Operation S620). A pattern similarity between the characteristic patterns of the target image and characteristic patterns of reference
images (that is, reference patterns) is calculated (Operation S625). When the pattern similarity is equal to or greater than a predetermined threshold, the target image is determined at first to be the same type as the predetermined reference image (Operation S630), and internal characteristic points of the target image are detected (Operation S635). Then it is determined whether the internal characteristics of the target image and the image type detected according to the result of the first determination are similar (Operation S640). When the target image is determined to be internally decodable based on the type of target image determined according to results of the first and second determinations (Operation S650), a method of decoding corresponding to the type of image is selected and decoding is performed using the method (Operation S655). On the other hand, when the target image is determined not to be internally decodable based on the image type of the target image determined according to results of the first and second determinations, the target image is transmitted to an external server and the decoding result is received from the external server (Operation S660). After decoding, a corresponding service is provided (Operation S665).
[106] The embodiments of the present invention can be written as computer programs and can be implemented in general-use digital computers that execute the programs using a computer readable recording medium. Examples of the computer readable recording medium include magnetic storage media (e.g., ROM, floppy disks, hard disks, etc.), optical recording media (e.g., CD-ROMs, or DVDs), and storage media such as carrier waves (e.g., transmission through the Internet).
[107] According to the present invention, a type of input image can be classified into an image code, a logo, a drawing, etc. by image processing of the input image, and appropriate decoding for each type can be performed. Accordingly, it can be quickly determined whether the image can be internally decoded in order to prevent unnecessary decoding, and image recognition can be performed more rapidly in decoding in a terminal or a server by extracting basic characteristic information in advance.
[108] In addition, it can be determined in advance whether decoding is performed in a terminal or a server by recognizing a complicated image or an image requiring many calculations so as to improve speed and precision. In addition, when decoding is performed in a server, the terminal generally transmits an image without determining whether decoding is performed in the terminal so as not to waste any time. However, according to the present invention, only images which are decodable in the server are transmitted in order to efficiently process information.
[109] While the present invention has been particularly shown and described with reference to exemplary embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims. The
exemplary embodiments should be considered in descriptive sense only and not for purposes of limitation. Therefore, the scope of the invention is defined not by the detailed description of the invention but by the appended claims, and all differences within the scope will be construed as being included in the present invention.
Industrial Applicability The present invention relates to a device and method of classifying an image.
Claims
[1] A device for classifying an image comprising: an image input unit receiving a target image as an input; an external characteristic acquisition unit primarily determining an image type of the target image based on a similarity between external characteristics of the target image and various types of reference images; an internal characteristic acquisition unit secondarily determining an image type of the target image based on a similarity between internal characteristics of the target image and internal characteristics of the reference images; and an image classification unit determining whether the target image can be internally decoded based on the image type of the target image determined based on results of the primary and secondary determinations.
[2] The device of claim 1, further comprising: a decoding unit decoding the target image using a decoding method corresponding to the image type when the target image is determined to be internally decodable by the image classification unit; and a characteristic information transmission unit transmitting the target image to an external server in order to be decoded when the target image is determined not to be internally decodable by the image classification unit.
[3] The device of claim 2, wherein the external characteristic acquisition unit performs skeletonization, hook handling, smoothing, and normalization on the target image in order to generate external characteristic information including information on characteristic points constructing an outline of the target image, wherein the internal characteristic acquisition unit generates internal characteristic information including information on internal characteristic points and color information of the target image, and wherein the characteristic information transmission unit transmits the external characteristic information and the internal characteristic information to an external server.
[4] The device of claim 1, wherein the image input unit comprises: an image normalization unit generating a normalized image by correcting variances of color and brightness according to the image input environment, including light and the type of printing medium, at a time when the image is received; a binarization unit setting a predetermined brightness level as a threshold level and generating a binary image by converting a pixel of the normalized image into a white color when the pixel is brighter than the threshold level and converting a pixel of the normalized image into a black color when the pixel is darker than the threshold level; and an image extraction unit generating the target image by extracting black color regions.
[5] The device of claim 1, wherein the external characteristic acquisition unit comprises: an external characteristic extraction unit normalizing the size and direction of the target image, forming a chain code including direction information on segments connecting characteristic points to each other which constructs an outline of the target image, and acquiring a pattern shape number using differences of direction numbers constructing the chain code; and a first pattern similarity calculation unit primarily determining an image type of the target image by comparing similarities between a pattern shape number of the target image and pattern shape numbers of the reference images.
[6] The device of claim 5, wherein the external characteristic extraction unit performs at least one of skeletonization, hook handling, and smoothing.
[7] The device of claim 1, wherein the internal characteristic acquisition unit comprises: an internal characteristic extraction unit extracting characteristic points and color information for an inside pattern of the target image; and a second pattern similarity calculation unit secondarily determining whether an inside pattern which the characteristic points represent is similar to an inside pattern of the type resulting from the first determination.
[8] A method of classifying an image comprising: receiving a target image as an input; primarily determining an image type of the target image based on a similarity between external characteristics of the target image and various types of reference images; secondarily determining an image type of the target image based on a similarity between internal characteristics of the target image and internal characteristics of the reference images; and determining whether the target image can be internally decoded based on the image type of the target image determined based on results of the primary and secondary determinations.
[9] The method of claim 8, further comprising: decoding the target image using a decoding method corresponding to the image type when the target image is determined to be internally decodable; and transmitting the target image to an external server in order to be decoded when the target image is determined not to be internally decodable.
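The decode-or-forward branch of claim 9 amounts to a dispatch on the classified type: decode locally when a decoder for that type is available, otherwise hand the data off for external decoding. A minimal sketch (the function name, registry shape, and return convention are illustrative assumptions):

```python
def handle_target(image_type, decoders, payload):
    """Decode locally if a decoder is registered for the classified
    image type; otherwise mark the payload for transmission to an
    external server."""
    decoder = decoders.get(image_type)
    if decoder is not None:
        return ("local", decoder(payload))
    return ("remote", payload)  # would be sent to the external server
```

Keeping the decoders in a registry keyed by image type means new code formats can be supported by registering a decoder, without touching the classification path.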
[10] The method of claim 9, wherein the primarily determining of an image type of the target image comprises generating external characteristic information including information on characteristic points constructing an outline of the target image by performing skeletonization, hook handling, smoothing, and normalization on the target image, and wherein the secondarily determining of an image type of the target image comprises generating internal characteristic information including information on internal characteristic points and color information of the target image, and wherein the transmitting of the target image to an external server comprises transmitting the external characteristic information and the internal characteristic information to the external server when the target image cannot be internally decoded based on the image type of the target image.
[11] The method of claim 8, wherein the primarily determining of an image type comprises: normalizing the size and direction of the target image, forming a chain code including direction information on segments connecting characteristic points that construct an outline of the target image, and acquiring a pattern shape number using differences of the direction numbers constructing the chain code; and primarily determining an image type of the target image by comparing the similarity between the pattern shape number of the target image and the pattern shape numbers of the reference images.
[12] The method of claim 8, wherein the secondarily determining of an image type comprises: extracting characteristic points and color information for an inside pattern of the target image; and secondarily determining whether an inside pattern which the characteristic points represent is similar to an inside pattern of the image type resulting from the primary determination.
[13] A computer-readable medium having embodied thereon a computer program for executing the method of claim 8.
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR20050105746 | 2005-11-05 | ||
KR10-2005-0105746 | 2005-11-05 | ||
KR1020060024755A KR100726473B1 (en) | 2005-11-05 | 2006-03-17 | Apparatus for classifying an image and method therefor |
KR10-2006-0024755 | 2006-03-17 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2007052957A1 true WO2007052957A1 (en) | 2007-05-10 |
Family
ID=38006062
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/KR2006/004517 WO2007052957A1 (en) | 2005-11-05 | 2006-11-01 | Device and method of classifying an image |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2007052957A1 (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0448945A2 (en) * | 1990-03-29 | 1991-10-02 | International Business Machines Corporation | A method and apparatus for barcode recognition in a digital image |
JPH08315144A (en) * | 1995-05-16 | 1996-11-29 | Hitachi Ltd | Device and method for pattern classification |
JP2000270334A (en) * | 1999-03-15 | 2000-09-29 | Fuji Xerox Co Ltd | Coder, decoder, image processor, coding method, decoding method and image processing method |
WO2004040506A1 (en) * | 2002-10-31 | 2004-05-13 | Iconlab, Inc. | Two-dimensional code having superior decoding property which is possible to control the level of error correcting codes, and method for encoding and decoding the same |
- 2006-11-01 WO PCT/KR2006/004517 patent/WO2007052957A1/en active Application Filing
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105426356A (en) * | 2015-10-29 | 2016-03-23 | 杭州九言科技股份有限公司 | Target information identification method and apparatus |
CN107392068A (en) * | 2017-07-25 | 2017-11-24 | 朱宸 | A kind of method and apparatus of continuous data identification |
CN107392068B (en) * | 2017-07-25 | 2021-01-15 | 朱宸 | Method and device for identifying metering data |
JP7568212B2 (en) | 2020-08-27 | 2024-10-16 | リプロバイオ株式会社 | Sperm motility evaluation system and sperm motility evaluation method |
US12002186B1 (en) | 2021-06-11 | 2024-06-04 | Dolby Laboratories Licensing Corporation | Surround area detection and blending for image filtering |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108596197B (en) | Seal matching method and device | |
US8587685B2 (en) | Method and apparatus for retrieving label | |
CN108388822B (en) | Method and device for detecting two-dimensional code image | |
JP4232800B2 (en) | Line noise elimination device, line noise elimination method, line noise elimination program | |
CN104951940B (en) | A kind of mobile payment verification method based on personal recognition | |
CN102332084B (en) | Identity identification method based on palm print and human face feature extraction | |
KR101907414B1 (en) | Apparus and method for character recognition based on photograph image | |
EP1870858A2 (en) | Method of classifying colors of color based image code | |
US7512257B2 (en) | Coding system and method of a fingerprint image | |
US8744189B2 (en) | Character region extracting apparatus and method using character stroke width calculation | |
WO2017141802A1 (en) | Image processing device, character recognition device, image processing method, and program recording medium | |
CN108875623A (en) | A kind of face identification method based on multi-features correlation technique | |
US20040218790A1 (en) | Print segmentation system and method | |
KR100726473B1 (en) | Apparatus for classifying an image and method therefor | |
WO2007052957A1 (en) | Device and method of classifying an image | |
JP2006285956A (en) | Red eye detecting method and device, and program | |
CN110210467B (en) | Formula positioning method of text image, image processing device and storage medium | |
Yang et al. | Towards robust color recovery for high-capacity color QR codes | |
CN112818983B (en) | Method for judging character inversion by using picture acquaintance | |
CN104346596A (en) | Identification method and identification device for QR (Quick Response) code | |
US6694059B1 (en) | Robustness enhancement and evaluation of image information extraction | |
JP2001243465A (en) | Method and device for matching fingerprint image | |
JP5625196B2 (en) | Feature point detection device, feature point detection method, feature point detection program, and recording medium | |
CN116824647A (en) | Image forgery identification method, network training method, device, equipment and medium | |
CN113269136B (en) | Off-line signature verification method based on triplet loss |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | ||
NENP | Non-entry into the national phase |
Ref country code: DE |
122 | Ep: pct application non-entry in european phase |
Ref document number: 06812356 Country of ref document: EP Kind code of ref document: A1 |