
CN115100226B - A Contour Extraction Method Based on Monocular Digital Image - Google Patents


Info

Publication number
CN115100226B
CN115100226B (application CN202210674544.0A)
Authority
CN
China
Prior art keywords
image, gaussian, gradient, pyramid, edge
Prior art date
Legal status
Active
Application number
CN202210674544.0A
Other languages
Chinese (zh)
Other versions
CN115100226A (en)
Inventor
吴新丽
赵云
罗佳丽
杨文珍
Current Assignee
Zhejiang Sci Tech University ZSTU
Original Assignee
Zhejiang Sci Tech University ZSTU
Priority date
Filing date
Publication date
Application filed by Zhejiang Sci Tech University ZSTU
Priority to CN202210674544.0A
Publication of CN115100226A
Application granted
Publication of CN115100226B
Legal status: Active
Anticipated expiration


Classifications

    • G06T Image data processing or generation, in general (G Physics; G06 Computing; calculating or counting)
    • G06T7/13 Edge detection
    • G06T5/20 Image enhancement or restoration using local operators
    • G06T5/70 Denoising; Smoothing
    • G06T7/174 Segmentation; Edge detection involving the use of two or more images
    • G06T7/90 Determination of colour characteristics
    • G06T2207/10024 Color image
    • G06T2207/20028 Bilateral filtering
    • G06T2207/20192 Edge enhancement; Edge preservation
    • G06T2207/20221 Image fusion; Image merging
    • G06T2207/20224 Image subtraction

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a contour extraction method based on a monocular digital image. In the scale space of the image, a difference-of-Gaussians pyramid is used to combine the luminance and chrominance information of the digital image: the chrominance gradient image and the luminance gradient image are fused, the edge direction computed at each point forms an edge tangent flow, and a flow-based difference-of-Gaussians algorithm detects and extracts the contour edge lines of the digital image. This overcomes the poor edge continuity caused by the isotropy of the traditional difference-of-Gaussians kernel and produces a digital image contour map that preserves salient edge features while removing noise and interference. The beneficial effects of the invention are as follows: luminance information and color information are organically combined, the edge tangent flow information of the image is fully considered, noise interference is reduced, and a clear, accurate edge contour of the image is extracted.

Description

A Contour Extraction Method Based on a Monocular Digital Image

Technical Field

The invention relates to the technical field of image processing, and in particular to a contour extraction method based on a monocular digital image.

Background Art

Image contours are the lines and boundaries of regions of interest in a digital image, including object boundaries and region boundaries defined by abrupt changes in brightness, color, or texture. Human vision usually distinguishes a target object by its edges and contours; in digital images, edge contours are an important feature for separating different regions. Edge detection is a basic task in digital image processing and is crucial for analyzing image content, so it is widely used in computer vision, object recognition, and automated inspection.

Generally speaking, there are two main types of edge detection technique. The first, region-based, identifies and analyzes the color, brightness, and texture attributes in the digital image, groups highly similar pixels into regions, and extracts edge information by detecting the boundaries between regions. The second, line-based, uses high contrast in brightness, color, or texture to find lines or boundaries. Region-based contour extraction techniques include region growing, region splitting, and region merging; line-based techniques include active contours, edge-detection-based techniques, and edge-grouping-based techniques.

Several edge detection methods are available in digital image processing; they can be divided into first-order and second-order difference detection. In first-order difference detection, the input digital image is convolved with a suitable mask to produce a gradient image; edges are then detected by thresholding, converting edge pixels to white and non-edge pixels to black to obtain the edge information. Common first-order differential operators include the Roberts operator (cross-difference), the Prewitt operator, the Sobel operator, and the Canny operator. In second-order difference detection, the digital image is first smoothed by an adaptive filter; because the second derivative is very sensitive to noise, this filtering step is essential. Second-order differential operators include the Laplace operator, the Zuniga-Haralick localization operator, the LOG operator (Laplacian of Gaussian), and the DOG operator (difference of Gaussians).
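The first-order pipeline just described (mask convolution, gradient image, thresholding to a black-and-white edge map) can be sketched in pure Python with the Sobel masks; the 5x5 test image and the threshold value are hypothetical illustration choices, not values from the patent.

```python
# A minimal sketch of first-order edge detection: convolve with the
# Sobel masks, form an approximate gradient magnitude, then threshold
# so edge pixels become white (255) and non-edge pixels black (0).

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def convolve_at(img, kernel, r, c):
    """Apply a 3x3 kernel centred on pixel (r, c)."""
    return sum(kernel[i][j] * img[r - 1 + i][c - 1 + j]
               for i in range(3) for j in range(3))

def sobel_edges(img, threshold):
    """Return a binary edge map: 255 where |gx| + |gy| exceeds threshold."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            gx = convolve_at(img, SOBEL_X, r, c)
            gy = convolve_at(img, SOBEL_Y, r, c)
            if abs(gx) + abs(gy) > threshold:  # square-root-free magnitude
                out[r][c] = 255
    return out

# A vertical step edge between dark (10) and bright (200) columns.
image = [[10, 10, 200, 200, 200]] * 5
edges = sobel_edges(image, threshold=100)
```

The thresholded map is white only in the two columns straddling the intensity step, matching the description of edges as abrupt brightness changes.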

Digital image contour extraction faces problems such as discontinuity of lines and region boundaries, complex shapes and textures, and heavy noise.

Summary of the Invention

The present invention aims to overcome the above deficiencies in the prior art and provides a contour extraction method based on a monocular digital image that reduces noise interference and improves the clarity of edge contours.

In order to achieve the above object, the present invention adopts the following technical solutions:

A contour extraction method based on a monocular digital image specifically comprises the following steps:

(1) Image scale space: with the image size kept unchanged, the structural information of the digital image at each scale is obtained through smoothing and filter iteration; on this basis, the image at each scale is examined through a preset observation window, achieving the effect of image preprocessing.

(2) Difference-of-Gaussians edge detection: the window through which the image is observed is fixed, while the pixel size of the image observed through this window changes, which produces an image pyramid. Starting from the image pyramid, the image is processed by Gaussian smoothing and downsampling, and the resulting set of processed images is arranged to form a Gaussian pyramid. A prediction residual is then computed between each adjacent upper and lower image of the Gaussian pyramid to form a difference-of-Gaussians pyramid and obtain the difference-of-Gaussians operator DOG.

(3) Flow-based difference-of-Gaussians edge detection: the image is preprocessed with Gaussian smoothing and grayscale conversion to obtain a gradient image; a bilateral filtering operation then processes the extracted contour lines, constructing the edge tangent flow while harmonizing the image; finally, a difference-of-Gaussians algorithm based on the edge tangent flow extracts the contour edge lines of the digital image.

In the scale space of the image, the difference-of-Gaussians pyramid is used to combine the luminance and chrominance information of the digital image, and the chrominance gradient image and the luminance gradient image are fused. The edge direction computed at each point forms an edge tangent flow, and a flow-based difference-of-Gaussians algorithm detects and extracts the contour edge lines of the digital image. This overcomes the poor coherence caused by the isotropy of the traditional difference-of-Gaussians kernel and generates a digital image contour map that retains salient edge features and removes noise and interference information. Comparative experiments against several widely used contour extraction methods show that the contour extraction results of this method are more coherent and clearer. That is, this method proposes a simple edge detection method based on computational geometry techniques, which organically combines luminance and color information, fully considers the edge tangent flow information of the image, reduces noise interference, and extracts a clearer and more accurate edge contour of the image.

Preferably, in step (1), taking image I(x,y) as an example, the scale space T_t is generated by an image smoothing operator with scale parameter t. The scale parameter t reflects the degree to which the digital image is smoothed; {T_t}_{t∈R} is called the image of the two-dimensional image I(x,y) under scale parameter t, where R is the set of real numbers. The larger the parameter t, the simpler the image content.

Preferably, in step (1), the scale space operator T_t must satisfy the following visual invariances:

Grayscale invariance: T_t(I + h) = T_t(I) + h, where h is an arbitrary constant;

Contrast invariance: T_t(f(I)) = f(T_t(I)), where f is an arbitrary non-decreasing real function;

Translation invariance: T_t(τ_a(I)) = τ_a(T_t(I)), where (τ_a(I))(x) = I(x + a) and a is an arbitrary constant;

Scaling invariance: there exists t'(t, δ) > 0 such that H_δ T_t = T_t' H_δ, where (H_δ I)(x) = I(δx), δ is an arbitrary positive real number, and t is the scale parameter;

Euclidean invariance: for any orthogonal matrix R, T_t(R·I) = R·T_t(I), where (R·I)(x) = I(R·x);

Affine invariance: for any affine transformation A and any scale parameter t, there exists t'(t, A) > 0 such that A T_t = T_t' A;

Here T_t(I) denotes the scale space operator applied to the input image I(x,y).

Preferably, in step (2), the image pyramid, used in image compression and computer vision, arranges multiple resolutions of the same original image in a pyramid shape from large to small; it is obtained by stepwise downsampling, stopping only when some termination condition is reached. In the image pyramid, the higher the level, the smaller the image size and resolution. The base level J has pixel size 2^J x 2^J, or N x N with J = log2 N; the apex (level 0) has size 1 x 1, that is, a single pixel; and level j has pixel size 2^j x 2^j, where 0 <= j <= J. To limit image distortion, the pyramid is truncated to P+1 levels, where 1 <= P <= J and j = J-P, ..., J-2, J-1, J; limiting the levels to P reduces the resolution approximation of the original digital image. The total number of pixels in a (P+1)-level pyramid (P > 0) is:

N^2 (1 + 1/4^1 + 1/4^2 + ... + 1/4^P) <= (4/3) N^2
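The storage bound above can be checked numerically. The sketch below assumes the standard geometric-series count, with each level holding a quarter of the pixels of the level below it; the base size of 512 is an arbitrary example.

```python
# Total pixel count of a truncated image pyramid: an N x N base level
# plus P successively quartered levels. As P grows, the sum approaches
# (4/3) * N**2, i.e. the whole pyramid costs at most one third more
# storage than the original image alone.

def pyramid_pixels(N, P):
    """Pixels in a (P+1)-level pyramid with an N x N base."""
    return sum((N // 2**j) ** 2 for j in range(P + 1))

base = 512
total = pyramid_pixels(base, 5)
assert total < 4 * base * base / 3   # 349440 < 349525.33...
```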

Preferably, in step (2), the images of each resolution form one group, and each group of the Gaussian pyramid contains several layers; each layer is constructed from the previous layer by Gaussian smoothing and downsampling. The construction process is as follows:

(21) The first layer of the first group of the image pyramid is the original digital image; it is convolved with a Gaussian kernel of standard deviation σ, and the resulting new image is placed in the second layer of the first group, where the Gaussian convolution formula is:

H(x,y) = e^(-D^2(x,y) / (2σ^2))

where D(x,y) is the distance from the point (x,y) in the frequency domain to the center of the frequency rectangle;

(22) A new standard deviation σ' is used to convolve the image in the second layer of the first group with a Gaussian kernel, and the resulting image is placed in the third layer of the first group of the image pyramid;

(23) Step (22) is repeated until the N layers of the first group are obtained; within each group of the Gaussian pyramid, every layer is the same image convolved with a different standard deviation;

(24) The original image is downsampled, and the resulting image is placed in the first layer of the second group of the image pyramid; the preceding three steps are applied to it in the same way to obtain N layers;

(25) Step (24) is repeated to obtain M groups of N layers, finally constructing the Gaussian pyramid.
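Steps (21) through (25) can be sketched in pure Python as follows. The kernel radius, the base scale, and the per-layer sigma schedule are illustrative assumptions, not values taken from the patent.

```python
import math

# A sketch of Gaussian pyramid construction: each group starts from a
# downsampled copy of the previous group's base image, and the layers
# within a group are produced by repeated Gaussian convolution with a
# growing standard deviation.

def gaussian_kernel_1d(sigma, radius=2):
    k = [math.exp(-i * i / (2 * sigma * sigma)) for i in range(-radius, radius + 1)]
    s = sum(k)
    return [v / s for v in k]

def blur(img, sigma):
    """Separable Gaussian blur with border clamping; size is preserved."""
    k = gaussian_kernel_1d(sigma)
    r = len(k) // 2
    h, w = len(img), len(img[0])
    tmp = [[sum(k[j + r] * img[y][min(max(x + j, 0), w - 1)]
                for j in range(-r, r + 1)) for x in range(w)] for y in range(h)]
    return [[sum(k[j + r] * tmp[min(max(y + j, 0), h - 1)][x]
                 for j in range(-r, r + 1)) for x in range(w)] for y in range(h)]

def downsample(img):
    """Halve the resolution by keeping every second row and column."""
    return [row[::2] for row in img[::2]]

def gaussian_pyramid(img, groups, layers, sigma0=1.6):
    pyramid = []
    base = img
    for m in range(groups):
        group = [base]
        for n in range(1, layers):          # steps (22)-(23)
            group.append(blur(group[-1], sigma0 * (2 ** (n / layers))))
        pyramid.append(group)
        base = downsample(group[0])         # step (24): next group's base
    return pyramid

img = [[float((x + y) % 7) for x in range(16)] for y in range(16)]
pyr = gaussian_pyramid(img, groups=3, layers=4)
```

Each group keeps a fixed resolution across its layers, and the resolution halves from one group to the next, as in the construction above.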

Preferably, in step (2), the difference-of-Gaussians operator DOG is obtained as follows:

In the scale space constructed by the Gaussian pyramid, σ is the scale-space coordinate, M is the number of groups in the Gaussian pyramid structure, and N is the number of layers per group:

σ(m,n) = σ0 · 2^(m + n/N)

where σ0 is the base scale, m is the index of the group in the Gaussian pyramid, and n is the index of the layer within the group. The Gaussian pyramid contains M groups of N layers in total, and each image in it is located by, and corresponds one-to-one with, the pair (m,n);

In the Gaussian pyramid, the Gaussian filtering of the image is expressed as:

L(x,y,σ1) = G_σ1(x,y) * I(x,y)

Substituting a different parameter value of σ gives another Gaussian-filtered image:

L(x,y,σ2) = G_σ2(x,y) * I(x,y)

where the Gaussian filter function is:

G_σ(x,y) = (1 / (2πσ^2)) · e^(-(x^2 + y^2) / (2σ^2))

Then the difference of the two Gaussian-filtered images is computed, giving:

D(x,y) = L(x,y,σ1) - L(x,y,σ2) = (G_σ1 - G_σ2) * I(x,y)

Writing G_σ1 - G_σ2 as the difference-of-Gaussians operator DOG, we get:

D(x,y) = DOG * I(x,y)

Preferably, in step (3), the gradient image is obtained as follows:

At pixel (x,y) of image f, the intensity and direction of the surrounding pixel set are determined; the tool used is the gradient, denoted ∇f and defined as a vector:

∇f = [g_x, g_y]^T = [∂f/∂x, ∂f/∂y]^T

The gradient vector indicates that image f changes fastest, with the largest rate of change, along the gradient direction at pixel (x,y); this is a notable geometric property of the gradient vector. The length of ∇f is denoted M(x,y):

M(x,y) = sqrt(g_x^2 + g_y^2)

M(x,y) represents the magnitude of the rate of change along the gradient direction. Here g_x, g_y, and M(x,y) all have the same pixel dimensions as the original image and are obtained by differentiating with respect to x and y at every pixel position of image f; M(x,y) is called the gradient image;

The angle of the gradient vector relative to the x-axis is:

α(x,y) = arctan(g_y / g_x)

The edge direction at a pixel is perpendicular to the gradient vector at that point, so the gradient is used to determine the edge strength and direction at each point;

Only brightness information can be obtained from a grayscale image, so the image is converted from RGB space to YUV color space for processing; in YUV space both the luminance and the chrominance information of the image are available. The conversion matrix is:

[Y]   [ 0.299   0.587   0.114] [R]
[U] = [-0.147  -0.289   0.436] [G]
[V]   [ 0.615  -0.515  -0.100] [B]

where Y is the pixel luminance value, and the three letters R, G, B denote the three primary colors red, green, and blue. The luminance information of the image is therefore:

Y = 0.299R + 0.587G + 0.114B
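A minimal check of the luminance formula above; the sample pixel values are hypothetical.

```python
# Luminance from RGB with the BT.601 weights quoted above. The three
# weights sum to 1, so a pure white pixel maps to full brightness, and
# green contributes more to perceived brightness than blue.

def luminance(r, g, b):
    return 0.299 * r + 0.587 * g + 0.114 * b

assert abs(luminance(255, 255, 255) - 255.0) < 1e-9
assert luminance(0, 255, 0) > luminance(0, 0, 255)
```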

When taking first-order or second-order derivatives of image pixels, the influence of noise on the result cannot be ignored. The Sobel operator, as a discrete differential operator, is effective at suppressing noise, so the Sobel operator is used to compute the gradient of the image luminance:

ΔY_x = Y * S_x,  ΔY_y = Y * S_y,  where S_x = [[-1,0,1], [-2,0,2], [-1,0,1]] and S_y = S_x^T

The gradient length of the image luminance is:

|ΔY| = sqrt(ΔY_x^2 + ΔY_y^2)

Because solving squares and square roots takes considerable time, the following square-root-free approximation is taken to improve computational efficiency:

|ΔY| = |ΔY_x| + |ΔY_y|

When computing the gradient of the chrominance information, the internationally adopted colorimetric standard CIE-L*a*b* is applied; this standard applies to the computation of both the light source color and the object color of the image. To convert an RGB image to the CIE-L*a*b* color space, RGB is first converted to the XYZ color space:

[X]   [0.4124  0.3576  0.1805] [R]
[Y] = [0.2126  0.7152  0.0722] [G]
[Z]   [0.0193  0.1192  0.9505] [B]

where X, Y, Z are the tristimulus values of the object. The conversion formulas of the CIE-L*a*b* space are:

L* = 116 · f(Y/Y0) - 16
a* = 500 · [f(X/X0) - f(Y/Y0)]
b* = 200 · [f(Y/Y0) - f(Z/Z0)]

where X0, Y0, Z0 are the tristimulus values of the CIE standard illuminant, L* is the psychometric lightness, and a*, b* are the psychometric chromaticities;

From this, the gradient containing the image chrominance information is expressed as:

Then the gradient length of the image chrominance information is:

where the distance in CIE-L*a*b* chromaticity space is:

For convenience of computation, the approximate values of the two are taken as:

Similarly, an approximate value of the gradient length of the image chrominance information is taken:

The luminance gradient approximation and the chrominance gradient approximation obtained above are brought into roughly the same numerical range by normalization:

ΔY' = (ΔY - min) / (max - min)

ΔC' = (ΔC - min) / (max - min)

This yields the luminance gradient and the chrominance gradient, both satisfying a linear relationship, from which the fused gradient is computed:
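The normalization and fusion steps can be sketched as follows. The patent does not reproduce its exact fusion formula here, so the equal 0.5/0.5 linear blend below is an assumed placeholder, and the gradient values are hypothetical.

```python
# Min-max normalisation of the luminance and chrominance gradient
# maps, followed by a linear fusion into a single gradient map.

def normalize(values):
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def fuse(luma_grad, chroma_grad, w=0.5):
    """Blend normalised luminance and chrominance gradients (w assumed)."""
    y = normalize(luma_grad)
    c = normalize(chroma_grad)
    return [w * a + (1 - w) * b for a, b in zip(y, c)]

luma = [0.0, 10.0, 40.0, 5.0]      # hypothetical gradient magnitudes
chroma = [2.0, 2.0, 18.0, 10.0]
fused = fuse(luma, chroma)
```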

Preferably, in step (3), the specific method of constructing the edge tangent flow is as follows:

In order to retain sufficient detail in the contour extraction, the edge tangent flow ETF is used for smoothing;

The ETF construction filter is defined as:

t_new(x) = (1/k) Σ_{y∈Ω_μ(x)} Φ(x,y) · t_cur(y) · ω_s(x,y) · ω_m(x,y) · ω_d(x,y)

where Ω_μ denotes the neighborhood of x with radius μ, k is the vector normalization factor, t_cur(y) is the normalized tangent vector at y, and Φ(x,y) indicates the direction of the vector t_cur(y);

In the expression above, ω_s(x,y) is the spatial weight function, which is essentially a box filter of radius μ:

ω_s(x,y) = 1 if ||x - y|| < μ, and 0 otherwise

ω_m(x,y) is the magnitude weight function:

ω_m(x,y) = (1/2) · (1 + tanh(η · (ê(y) - ê(x))))

where ê(x) is the normalized gradient magnitude at point x and η is the falloff rate;

ω_d(x,y) is the direction weight function:

ω_d(x,y) = |t_cur(x) · t_cur(y)|

The magnitude weight function ω_m(x,y) and the direction weight function ω_d(x,y) play an important role in preserving image features; the value of ω_d(x,y) decreases as the two normalized tangent vectors approach perpendicularity and increases as they approach parallelism. The sign function Φ(x,y) ∈ {1, -1} represents the direction of t_cur(y):

Φ(x,y) = 1 if t_cur(x) · t_cur(y) > 0, and -1 otherwise

From the initial fused gradient M0(x), the edge tangent flow t0(x) is obtained from the perpendicularity relation with the gradient field; iterating the update t_i(x) → t_{i+1}(x) two to three times yields a well-behaved edge tangent flow ETF.
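One smoothing pass of the ETF filter can be sketched on a tiny tangent field. To keep the sketch short, the magnitude weight is treated as constant (1), so this is a simplification of the filter defined above; the 3x3 field and its noisy tangent are hypothetical.

```python
import math

# One iteration of the edge tangent flow filter: each tangent is
# replaced by a weighted, sign-aligned sum of its neighbours' tangents,
# using the box spatial weight w_s, the direction weight
# w_d = |t(x).t(y)|, and the sign function Phi.

def etf_iteration(tangents, radius=1):
    h, w = len(tangents), len(tangents[0])
    out = [[(0.0, 0.0)] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            tx = tangents[y][x]
            sx = sy = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    ny, nx = y + dy, x + dx
                    if not (0 <= ny < h and 0 <= nx < w):
                        continue                     # w_s = 0 outside window
                    ty = tangents[ny][nx]
                    dot = tx[0] * ty[0] + tx[1] * ty[1]
                    phi = 1.0 if dot > 0 else -1.0   # sign function Phi
                    wd = abs(dot)                    # direction weight w_d
                    sx += phi * wd * ty[0]
                    sy += phi * wd * ty[1]
            norm = math.hypot(sx, sy)
            out[y][x] = (sx / norm, sy / norm) if norm > 1e-12 else tx
    return out

# A mostly horizontal field with one tangent tilted off the dominant
# direction; after one iteration it rotates back toward horizontal.
field = [[(1.0, 0.0)] * 3 for _ in range(3)]
field[1][1] = (0.6, 0.8)
smoothed = etf_iteration(field)
```

The off-direction tangent at the centre is pulled toward its coherent horizontal neighbours, which is exactly the smoothing behaviour the ETF uses to produce a coherent flow field.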

Preferably, in step (3), the DOG computes a one-dimensional Gaussian blur along the direction orthogonal to the edge tangent, and edge-aligned smoothing is performed by line integral convolution along the edge tangent flow. This two-pass, flow-guided method is called the flow-based difference-of-Gaussians algorithm FDOG. The FDOG filter is defined as:

F(s) = ∫ I(l_s(t)) · f(t) dt

where l_s(t) is the point on the line l_s at parameter t; l_s is perpendicular to the curve C_x, intersects C_x at the point x, and represents the width of the filter across the curve; C_x is the flow curve through point x, and s is the arc length parameter with range [-S, S]. Thus I(l_s(t)) is the value of the input image I at the point l_s(t). For the function f, the DOG edge model is used:

f(t) = G_σc(t) - ρ · G_σs(t)

where G_σ(x) is a univariate Gaussian function with variance σ^2; σc and σs control, respectively, the center spacing and the surround spacing of the filter response, with σs = 1.6σc, which makes the function f close to the Laplacian of Gaussian; ρ controls the noise level and sensitivity, with values in [0.79, 1.0];

The accumulated response of the FDOG filter along C_x is:

H(x) = ∫ G_σm(s) · F(s) ds

After obtaining the flow-based difference-of-Gaussians filtering result, FDOG obtains the binary edge image by thresholding:

H'(x) = 0, if H(x) < 0 and 1 + tanh(H(x)) < τ;  H'(x) = 1, otherwise

where τ controls the magnitude of the center-surround difference and takes values in [0, 1].
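The thresholding step that turns the accumulated FDOG response into a binary edge image can be sketched as follows. This follows the common FDoG formulation (a pixel is marked as edge, value 0, when its response is negative and 1 + tanh(response) falls below τ); the response values are hypothetical.

```python
import math

# Binarising an FDOG response map: strongly negative responses become
# black edge pixels (0), everything else becomes white background (1).

def binarize(response, tau=0.5):
    return [[0 if (v < 0 and 1 + math.tanh(v) < tau) else 1 for v in row]
            for row in response]

resp = [[0.3, -0.1, -2.0],
        [-3.5, 0.0, -0.4]]
edges = binarize(resp, tau=0.5)   # [[1, 1, 0], [0, 1, 1]]
```

Lowering τ makes the detector stricter (fewer edge pixels), while raising it toward 1 admits weaker responses as edges.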

The beneficial effects of the present invention are: luminance information and color information are organically combined, the edge tangent flow information of the image is fully considered, noise interference is reduced, and a relatively clear and accurate edge contour of the image is extracted.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a schematic diagram of the multi-scale representation of an image;

FIG. 2 is a schematic diagram of an image pyramid;

FIG. 3 is a demonstration diagram of an image pyramid;

FIG. 4 is a system model diagram of the approximation and residual pyramids;

FIG. 5 is a schematic diagram of the Gaussian pyramid structure;

FIG. 6 is a schematic diagram of the difference-of-Gaussians pyramid structure;

FIG. 7 is a flow chart of the contour extraction method;

FIG. 8 is a schematic diagram of determining the edge direction from the gradient;

FIG. 9 shows the effect of edge tangent flow construction;

FIG. 10 is a diagram of the FDOG filter structure;

FIG. 11 is a schematic diagram of the multi-scale representation of an image.

FIG. 12 is a schematic diagram of the multi-scale representation of an image.

DETAILED DESCRIPTION

The present invention is further described below in conjunction with the accompanying drawings and specific embodiments.

A contour extraction method based on a monocular digital image specifically comprises the following steps:

(1) Image scale space: with the image size kept unchanged, the structural information of the digital image at each scale is obtained through smoothing and filter iteration; on this basis, the image at each scale is examined through a preset observation window, achieving the effect of image preprocessing.

In computer vision, digital images show different degrees of smoothness in different application contexts. Changes in grayscale value correspond to spatial phenomena of objects and can be identified in regions of varying size, ranging from a few pixels to large areas; identifying these grayscale changes is essential for decoding the information inherent in the image. The scale space of an image is the abstract framework formed by passing the same image through various smoothing operations and filter iterations, obtaining multiple images with different degrees of blur, and combining the resulting images for observation and study.

Under different scale parameters, the scale structures present in a digital image differ; as the scale parameter increases, fine scale structures are suppressed as much as possible, thereby achieving the goal of separating the structures of the digital image. Therefore, processing the digital image at a sufficiently coarse scale removes, as far as possible, fine scale structures that are irrelevant or outside the scope of study. Such an operation greatly reduces the difficulty of processing digital images, and experiments show that image processing guided by this idea is meaningful.

Taking image I(x,y) as an example, the scale space T_t is generated by an image smoothing operator with scale parameter t. The scale parameter t reflects the degree to which the digital image is smoothed; {T_t}_{t∈R} is called the image of the two-dimensional image I(x,y) under scale parameter t, where R is the set of real numbers. The larger the parameter t, the simpler the image content. The multi-scale representation of an image is illustrated in FIG. 1.

The scale space operator Tt must satisfy the following visual invariance properties:

Grayscale and contrast invariance: when the brightness and angle of the light received from a three-dimensional object in a real scene change, the grayscale and contrast of the image perceived by the human eye change accordingly. The scale space operator Tt must therefore keep grayscale and contrast invariant during analysis.

Grayscale invariance: Tt(I+h) = Tt(I) + h, where h is an arbitrary constant;

Contrast invariance: Tt(f(I)) = f(Tt(I)), where f is an arbitrary non-decreasing real function.

Translation, scaling, Euclidean and affine invariance: when the relative direction and angle between a real three-dimensional object and the observer change, the position, angle, size and shape of the image perceived by the human eye change accordingly. The scale space operator Tt must therefore be independent of relative position, relative angle, size and affine transformation during analysis.

Translation invariance: Tt(τa(I)) = τa(Tt(I)), where τa(I) = I(x+a) and a is an arbitrary constant;

Scaling invariance: there exists t'(t,δ) > 0 such that HδTt = Tt'Hδ holds, where HδI = I(δx), δ is an arbitrary positive real number and t is the scale parameter;

Euclidean invariance: for any orthogonal matrix R, Tt(R·I) = R·Tt(I), where (R·I)(x) = I(R·x);

Affine invariance: for any affine transformation A and any scale parameter t, there exists t'(t,A) > 0 such that ATt = Tt'A holds;

Here Tt(I) denotes the scale space operator function and I denotes the input image I(x,y).

(2) Gaussian difference operator edge detection: the window for observing the image is fixed while the pixel size of the observed image varies, which produces an image pyramid. On the basis of the image pyramid, the image is processed by Gaussian smoothing and downsampling, and the resulting set of processed images is arranged to obtain a Gaussian pyramid. The prediction residual between the upper- and lower-layer images of each group of the Gaussian pyramid is then computed to form a Gaussian difference pyramid and obtain the difference-of-Gaussians operator DOG;

In scale space, the number of pixels observed in a digital image differs from scale to scale. Viewed another way, if the observation window is fixed while the pixel size of the observed image varies, an image pyramid is produced. An image pyramid is constructed to obtain representations of the image at different scales, where the scale is usually the image resolution; through repeated smoothing and sampling, digital images of various pixel sizes are obtained.

The difference-of-Gaussians operator DOG (Difference of Gaussians) used in this method is built on the image pyramid. Its edge extraction algorithm has a small convolution kernel, runs fast, and preserves the detail of the original image well, so the DOG operator is notably effective in image edge detection.

In multi-scale image representation, the image pyramid is one structure that views image information from the perspective of resolution; it is efficient, clear and easy to understand. Image pyramids are used in image compression and computer vision: multiple resolutions of the same original image are arranged in a pyramid shape from large to small, obtained by successive downsampling until some termination condition is reached. As shown in Figure 2, the bottom of the pyramid is a high-pixel-count representation, usually the original image itself, while the top contains only a few pixels, possibly just one.

In the image pyramid, the higher the level, the smaller the image size and resolution. The base level J has pixel size 2^J×2^J or N×N, where J = log2 N; the apex at level 0 has size 1×1, a single pixel; level j has pixel size 2^j×2^j, where 0 ≤ j ≤ J. To limit image distortion, the pyramid is truncated to P+1 levels, where 1 ≤ P ≤ J and j = J−P, …, J−2, J−1, J. Restricting the levels to P yields reduced-resolution approximations of the original digital image; the total number of pixels in a (P+1)-level pyramid (P > 0) is: N²(1 + 1/4¹ + 1/4² + … + 1/4^P) ≤ (4/3)N²
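The level sizes and the pixel-count bound above can be checked with a short sketch (a minimal illustration; the function names are mine, not from the patent):

```python
import math

def pyramid_sizes(N, P):
    """Sizes of a (P+1)-level pyramid whose base level is N x N (N a power of 2)."""
    J = int(math.log2(N))
    # level j has size 2^j x 2^j; the truncated pyramid keeps j = J-P .. J
    return [(2 ** j, 2 ** j) for j in range(J - P, J + 1)]

def total_pixels(N, P):
    """Total pixel count N^2 * (1 + 1/4 + ... + 1/4^P)."""
    return sum(w * h for (w, h) in pyramid_sizes(N, P))

# The base level dominates: the total never exceeds (4/3) * N^2.
sizes = pyramid_sizes(1024, 3)   # [(128, 128), (256, 256), (512, 512), (1024, 1024)]
```

For the 1024×1024 example used below, a 4-level pyramid holds 1 392 640 pixels, still under the 4N²/3 ≈ 1 398 101 bound.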

Taking an original digital image of size 1024×1024 as the experimental object, the relationship between pyramid level and image size is shown in the following table:

Level j:  10         9        8        …   1     0
Size:     1024×1024  512×512  256×256  …   2×2   1×1

In the image pyramid, the higher the level at which the experimental digital image sits, the less information it contains and, of course, the smaller its pixel size. Figure 3 shows a demonstration of the image pyramid.

On the basis of the image pyramid, a system model of approximation and prediction-residual pyramids is constructed. Taking the level-J image as input, the approximation filter and downsampler output the image needed for level J−1 of the approximation pyramid, while the upsampler and interpolation filter output the image needed for the complementary level-J prediction-residual pyramid, as shown in Figure 4.

Both the approximation pyramid and the prediction-residual pyramid are built iteratively. In the first iteration the original digital image is placed at level J of the pyramid, and the following three steps are then executed P times, where j = J−P+1, …, J−2, J−1, J:

(1) Using the approximation filter and downsampler, compute a reduced-resolution approximation of the level-j input image and place it at level j−1 of the approximation pyramid;

(2) Pass the approximation produced in step (1) through the upsampler and interpolation filter to generate an estimate of the level-j image; this predicted image has the same dimensions as the level-j image;

(3) Compute the difference between the level-j input image of step (1) and the predicted image of step (2) to obtain the level-j prediction residual, and place it at level j of the prediction-residual pyramid.
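One iteration of steps (1)-(3) can be sketched as follows, using 2×2 mean pooling as the approximation filter and nearest-neighbor upsampling as the interpolation filter (both are simplifying choices of mine; the patent allows several filter types):

```python
import numpy as np

def approximate(img):
    """Approximation filter + downsampler: 2x2 mean pooling."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def predict(approx):
    """Upsampler + interpolation filter: nearest-neighbor upsampling."""
    return np.repeat(np.repeat(approx, 2, axis=0), 2, axis=1)

def residual_step(img):
    """One level: approximation for level j-1, prediction residual for level j."""
    approx = approximate(img)          # step (1)
    pred = predict(approx)             # step (2)
    return approx, img - pred          # step (3): residual goes to level j

img = np.arange(16, dtype=float).reshape(4, 4)
approx, res = residual_step(img)
# the decomposition is lossless: predicted image + residual = original
assert np.allclose(predict(approx) + res, img)
```

The final assertion shows why the residual pyramid is useful for compression: the original level can always be reconstructed exactly from the coarser approximation plus the stored residual.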

The interpolation and approximation filters in this system model can use a variety of filtering techniques. Interpolation filters include nearest-neighbor interpolation, bilinear interpolation, splines and wavelets. For the approximation filter: neighborhood averaging yields a mean pyramid; no filtering yields a subsampling pyramid; low-pass Gaussian filtering yields a Gaussian pyramid, which is the pyramid structure used in this method.

Operating on the image with Gaussian smoothing and subsampling and arranging the resulting set of processed images yields the Gaussian pyramid. Each image resolution forms one group, and each group of the Gaussian pyramid contains several layers, each constructed from the previous layer by Gaussian smoothing and downsampling. The construction proceeds as follows:

(21) The first layer of the first group of the image pyramid is the original digital image. It is convolved with a Gaussian kernel with standard deviation σ, and the resulting new image is placed in the second layer of the first group, where the Gaussian convolution (low-pass) function is: H(x,y) = e^(−D²(x,y)/(2σ²)), in which D(x,y) is the distance from the point (x,y) in the frequency domain to the center of the frequency rectangle;

(22) Convolve the image in the second layer of the first group with a Gaussian kernel of new standard deviation σ', and place the resulting image in the third layer of the first group of the image pyramid;

(23) Repeat step (22) to obtain the N layers of the first group; within each group of the Gaussian pyramid, every layer is the same image convolved with a different standard deviation;

(24) Downsample the original image and place the result in the first layer of the second group of the image pyramid; apply the preceding three steps to it in the same way to obtain N layers;

(25) Repeat step (24) to obtain M groups of N layers each, finally constructing the Gaussian pyramid, whose structure is shown in Figure 5.
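Steps (21)-(25) can be sketched as follows, with a separable Gaussian kernel built from scratch (the doubling of σ between layers is my assumption; the patent only says each layer uses a new standard deviation σ'):

```python
import numpy as np

def gaussian_kernel1d(sigma):
    """Normalized 1-D Gaussian kernel truncated at 3 sigma."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1, dtype=float)
    k = np.exp(-x ** 2 / (2 * sigma ** 2))
    return k / k.sum()

def gaussian_blur(img, sigma):
    """Separable Gaussian convolution along rows, then columns (edge-replicated)."""
    k = gaussian_kernel1d(sigma)
    pad = len(k) // 2
    blur = lambda v: np.convolve(np.pad(v, pad, mode='edge'), k, 'valid')
    out = np.apply_along_axis(blur, 1, img.astype(float))
    return np.apply_along_axis(blur, 0, out)

def gaussian_pyramid(img, groups, layers, sigma0=1.0):
    """M groups of N layers: blur within a group, downsample between groups."""
    pyramid = []
    base = img.astype(float)
    for m in range(groups):
        group = [base]
        for n in range(1, layers):
            group.append(gaussian_blur(base, sigma0 * 2 ** n))  # new sigma' per layer
        pyramid.append(group)
        base = base[::2, ::2]  # downsampled first layer of the next group
    return pyramid

img = np.random.default_rng(0).random((16, 16))
pyr = gaussian_pyramid(img, groups=3, layers=4)
```

Each group holds the same base image at increasing blur (step 23), while successive groups shrink the resolution (step 24), matching the structure of Figure 5.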

The difference-of-Gaussians operator DOG is a further development of the Gaussian pyramid. The key structure computes the prediction residual between the upper- and lower-layer images within each group of the Gaussian pyramid, forming a new Gaussian difference pyramid, whose structure is shown in Figure 6. The DOG operator is obtained as follows:

In the scale space constructed by the Gaussian pyramid, σ is the scale-space coordinate, M is the number of groups in the Gaussian pyramid structure, and N is the number of layers in each group;

where σ0 is the reference scale, m is the index of the group in the Gaussian pyramid and n the index of the layer within the group; the Gaussian pyramid contains M groups of N layers in total, and images in it are located and put in one-to-one correspondence by the pair (m, n);

In the Gaussian pyramid, Gaussian filtering of the image is expressed as: gσ1(x,y) = Gσ1(x,y) * I(x,y)  (5)

Substituting a different value of the parameter σ gives another Gaussian-filtered image: gσ2(x,y) = Gσ2(x,y) * I(x,y)  (6)

where the Gaussian filter function is: Gσ(x,y) = (1/(2πσ²)) e^(−(x²+y²)/(2σ²))

Then the difference of the two Gaussian-filtered images of equations (5) and (6) is computed: gσ1(x,y) − gσ2(x,y) = (Gσ1 − Gσ2) * I(x,y)  (7)

Denoting Gσ1 − Gσ2 as the difference-of-Gaussians operator DOG gives: DOG(x,y) = Gσ1(x,y) − Gσ2(x,y) = (1/2π)((1/σ1²) e^(−(x²+y²)/(2σ1²)) − (1/σ2²) e^(−(x²+y²)/(2σ2²)))  (8)
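Equation (8) can be illustrated numerically: since both Gaussians are normalized, the DOG kernel sums to (approximately) zero and gives no response on flat regions, which is what makes it an edge detector (a minimal sketch; the kernel radius and σ values are my choices):

```python
import numpy as np

def gaussian2d(sigma, radius):
    """Sampled 2-D Gaussian, normalized so the kernel sums to 1."""
    y, x = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    g = np.exp(-(x ** 2 + y ** 2) / (2 * sigma ** 2)) / (2 * np.pi * sigma ** 2)
    return g / g.sum()

def dog_kernel(sigma1, sigma2, radius):
    """DOG(x,y) = G_sigma1 - G_sigma2, with sigma1 < sigma2."""
    return gaussian2d(sigma1, radius) - gaussian2d(sigma2, radius)

k = dog_kernel(1.0, 1.6, radius=6)
assert abs(k.sum()) < 1e-6   # zero total weight: no response on constant regions
assert k[6, 6] > 0           # positive center lobe ...
assert k[0, 0] < 0           # ... negative surround, a Mexican-hat profile
```

The 1.6 ratio between the two standard deviations mirrors the σs = 1.6σc choice used later in the FDOG section, where it is noted that this makes the DOG a close approximation of the Laplacian of Gaussian.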

(3) Flow-based Gaussian difference edge detection: the image is preprocessed with Gaussian smoothing and grayscale conversion to obtain a gradient image; a bilateral filtering operation is then applied to the extracted contour lines, constructing the edge tangent flow while harmonizing the image; finally, the contour edge lines of the digital image are extracted by the difference-of-Gaussians algorithm based on the edge tangent flow.

In the gray values of a digital image, the edge pixels of a contour differ considerably from the background pixels over most of the contour, but regions of similar gray value may still exist. Therefore, so that the contour extracted by edge detection has good edge-information characteristics, this method detects and extracts contour edge lines with a flow-based difference-of-Gaussians algorithm: the image is preprocessed with Gaussian smoothing and grayscale conversion, and the extracted contour lines are then processed by filtering. The contour extraction algorithm used in this method is shown in Figure 7.

Edge detection often uses the first or second derivatives of image pixels to find the strength and direction of edge pixel sets, thereby identifying the regions where gray values change rapidly and segmenting the image accordingly. However, traditional gray-value-based methods consider only the luminance information of the grayscale image, so edge detection results may deviate somewhat from what the human eye perceives. On top of the luminance information, this method therefore adds the chrominance information of the image to improve edge detection accuracy. The gradient image is obtained as follows:

The tool used to determine the strength and direction of the set of pixels around the pixel (x,y) of image f is the gradient, denoted ∇f and defined as a vector: ∇f = [gx, gy]^T = [∂f/∂x, ∂f/∂y]^T

The gradient vector indicates that image f changes fastest, with the greatest rate of change, along the gradient direction at the pixel (x,y); this is a notable geometric property of the gradient vector. The length of ∇f is denoted M(x,y): M(x,y) = √(gx² + gy²)  (11)

M(x,y) represents the value of the rate of change in the gradient direction; gx, gy and M(x,y) all have the same pixel dimensions as the original image and are obtained by differentiating with respect to x and y at every pixel position of f. M(x,y) is called the gradient image;

The angle of the gradient vector relative to the x-axis is: α(x,y) = arctan(gy/gx)

The edge direction at a pixel is perpendicular to the gradient vector at that point, so the gradient is used to determine the edge strength and direction there. Figure 8 shows the principle of determining edge direction from the gradient, where each square represents one pixel;
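A minimal sketch of the gradient image M(x,y) of equation (11) and the angle α, estimating gx and gy with finite differences (the finite-difference choice is mine; on a vertical step edge the gradient points along x, so the edge itself runs vertically, perpendicular to it):

```python
import numpy as np

def gradient(f):
    """Finite-difference estimates of g_x, g_y at every pixel."""
    gy, gx = np.gradient(f.astype(float))   # np.gradient returns d/drow, d/dcol
    return gx, gy

def gradient_image(f):
    """M(x,y) = sqrt(gx^2 + gy^2) and angle alpha = arctan2(gy, gx)."""
    gx, gy = gradient(f)
    return np.hypot(gx, gy), np.arctan2(gy, gx)

# vertical step edge: gradient along +x, edge direction perpendicular to it
f = np.zeros((5, 6))
f[:, 3:] = 1.0
M, alpha = gradient_image(f)
assert M[2, 3] > 0 and abs(alpha[2, 3]) < 1e-9   # gradient points along the x-axis
```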

A grayscale image provides only luminance information, so the image is converted from the RGB space to the YUV color space for processing, in which both the luminance and chrominance information of the image are available. The conversion matrix (standard BT.601 coefficients) is:

[Y]   [ 0.299   0.587   0.114] [R]
[U] = [-0.147  -0.289   0.436] [G]
[V]   [ 0.615  -0.515  -0.100] [B]

where Y is the pixel luminance value in the image and the letters R, G, B denote the red, green and blue primaries; the luminance information of the image is therefore:

Y = 0.299R + 0.587G + 0.114B  (14)

When taking the first or second derivatives of image pixels, the influence of noise on the result cannot be ignored. The Sobel operator, a discrete differential operator, is effective at suppressing noise, so it is used to compute the gradient of the image luminance, with the horizontal and vertical kernels Sx = [[-1,0,1],[-2,0,2],[-1,0,1]] and Sy = [[-1,-2,-1],[0,0,0],[1,2,1]] giving ΔYx = Y * Sx and ΔYy = Y * Sy:

According to equation (11), the gradient length of the image luminance is: |ΔY| = √(ΔYx² + ΔYy²)  (16)

Because computing squares and square roots is time-consuming, the following square-root-free approximation is taken to improve computational efficiency:

|ΔY| = |ΔYx| + |ΔYy|  (17)
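Equations (14)-(17) can be sketched together; the 3×3 Sobel kernels are the standard ones, and the sliding-window correlation is written out explicitly to keep the example self-contained (a minimal illustration, not the patent's implementation):

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T

def luminance(rgb):
    """Y = 0.299 R + 0.587 G + 0.114 B (BT.601 weights, eq. 14)."""
    return rgb @ np.array([0.299, 0.587, 0.114])

def conv2_valid(img, k):
    """3x3 sliding-window correlation, 'valid' region only."""
    h, w = img.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            out[i, j] = np.sum(img[i:i + 3, j:j + 3] * k)
    return out

def luminance_gradient(rgb):
    """|dY| = |dYx| + |dYy|: square-root-free approximation of eq. (16)."""
    Y = luminance(rgb)
    return np.abs(conv2_valid(Y, SOBEL_X)) + np.abs(conv2_valid(Y, SOBEL_Y))

rgb = np.zeros((5, 6, 3))
rgb[:, 3:, :] = 1.0                 # vertical luminance step
grad = luminance_gradient(rgb)      # strongest response at the step, zero in flat areas
```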

When computing the gradient values of the chrominance information, the internationally accepted colorimetric standard CIE-L*a*b* is applied; this standard covers the calculation of both the light-source color and the object color of an image. To convert the RGB image to the CIE-L*a*b* color space, RGB is first converted to the XYZ color space:

where X, Y, Z are the tristimulus values of the object; the conversion formulas of the CIE-L*a*b* space are:

X0, Y0, Z0 denote the tristimulus values of the CIE standard illuminant, L* is the psychometric lightness, and a*, b* are the psychometric chromaticities;

The gradient containing the image chrominance information is therefore expressed as:

Then the gradient length of the image chrominance information is: |ΔC| = √(ΔCx² + ΔCy²)  (21)

where ΔCx and ΔCy are distances in the CIE-L*a*b* color space, measured as the color difference ΔE = √((ΔL*)² + (Δa*)² + (Δb*)²) between neighboring pixels in the x and y directions:

For ease of computation, both are approximated by the square-root-free form |ΔE| ≈ |ΔL*| + |Δa*| + |Δb*|:

Likewise, the gradient length of the image chrominance information in equation (21) is approximated as: |ΔC| ≈ |ΔCx| + |ΔCy|  (24)

The luminance gradient approximation obtained from equation (17) and the chrominance gradient approximation obtained from equation (24) are brought into roughly the same numerical range by normalization:

ΔY' = (ΔY − min)/(max − min)  (25)

ΔC' = (ΔC − min)/(max − min)  (26)

This yields the luminance gradient and the chrominance gradient, both satisfying a linear relationship, from which the fused gradient is computed as:
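A sketch of the min-max normalization of equations (25)-(26) followed by a linear fusion; the fusion weights of equation (27) are not given in this excerpt, so an equal-weight combination is assumed here for illustration:

```python
import numpy as np

def minmax_normalize(g):
    """(g - min) / (max - min): maps a gradient map into [0, 1] (eqs. 25-26)."""
    lo, hi = g.min(), g.max()
    return (g - lo) / (hi - lo) if hi > lo else np.zeros_like(g)

def fuse(delta_y, delta_c, w=0.5):
    """Linear fusion of normalized luminance/chrominance gradients (w assumed)."""
    return w * minmax_normalize(delta_y) + (1 - w) * minmax_normalize(delta_c)

dy = np.array([[0.0, 2.0], [4.0, 8.0]])   # toy luminance gradient map
dc = np.array([[1.0, 1.0], [3.0, 5.0]])   # toy chrominance gradient map
fused = fuse(dy, dc)
assert fused.min() >= 0 and fused.max() <= 1
```

Normalizing before fusion is what lets the two gradients, computed in unrelated units (luminance vs. CIE-L*a*b* distances), be combined linearly without one dominating the other.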

Images with random noise or texture may produce many small, disconnected edges. Some studies adopt a better solution: adjusting the filter according to the approximate edge direction. The idea behind these methods is to first respond to the brightness changes occurring at edges, and then smooth those responses with an edge-aligned blur. The edge directions computed at each point form the edge tangent flow (ETF). A one-dimensional Gaussian blur orthogonal to the edge tangent is computed by DOG, and edge-aligned smoothing is performed by line-integral convolution along the edge tangent flow; this two-pass flow-guided method is known as the flow-based difference of Gaussians (FDOG). The edge tangent flow is constructed as follows:

To retain sufficient detail during contour extraction of the image, the edge tangent flow (ETF) is used for smoothing;

The ETF construction filter is defined as: tnew(x) = (1/k) Σy∈Ωμ(x) Φ(x,y) tcur(y) ωs(x,y) ωm(x,y) ωd(x,y)  (28)

where Ωμ denotes the neighborhood of x with radius μ, k is the vector normalization factor, tcur(y) is the normalized tangent vector at y, and Φ(x,y) indicates the direction of the vector tcur(y);

In equation (28), ωs(x,y) is the spatial weight function, in essence a box filter of radius μ: ωs(x,y) = 1 if ‖x − y‖ < μ, and 0 otherwise  (29)

ωm(x,y) is the magnitude weight function: ωm(x,y) = (1/2)(1 + tanh(η(e(y) − e(x))))  (30)

where e(x) denotes the normalized gradient magnitude at the point x, and η denotes the falloff rate;

ωd(x,y) is the direction weight function:

ωd(x,y) = |tcur(x)·tcur(y)|  (31)

The magnitude weight function ωm(x,y) and the direction weight function ωd(x,y) play an important role in preserving image features; the value of ωd(x,y) decreases as the two normalized tangent vectors approach perpendicularity and increases as they approach parallelism. The sign function Φ(x,y) ∈ {1, −1} is used to represent the direction of tcur(y): Φ(x,y) = 1 if tcur(x)·tcur(y) > 0, and −1 otherwise  (32)

The initial fused gradient M0(x) is obtained from equation (27); the edge tangent flow t0(x) is then obtained from its perpendicularity to the gradient field, and iterating ti(x) → ti+1(x) two to three times yields a well-formed edge tangent flow. Figure 9 shows the ETF construction result with an ETF kernel size of 5.
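One ETF update of equation (28) can be sketched on a small tangent field; the weights follow the ωs, ωm, ωd and Φ definitions above, while the kernel radius μ and the falloff rate η are my choices. Note that ETF tangents are orientation-only: a sign-flipped vector is treated as the same direction by Φ, so the whole field stays aligned to one axis:

```python
import numpy as np

def etf_iteration(tangent, mag_norm, mu=2, eta=1.0):
    """One t_i -> t_{i+1} ETF update on an (H, W, 2) tangent field."""
    H, W, _ = tangent.shape
    out = np.zeros_like(tangent)
    for i in range(H):
        for j in range(W):
            acc = np.zeros(2)
            tx = tangent[i, j]
            for di in range(-mu, mu + 1):
                for dj in range(-mu, mu + 1):
                    y, x = i + di, j + dj
                    if not (0 <= y < H and 0 <= x < W):
                        continue                       # outside the image
                    if di * di + dj * dj > mu * mu:
                        continue                       # ws: box filter of radius mu
                    ty = tangent[y, x]
                    dot = float(tx @ ty)
                    phi = 1.0 if dot > 0 else -1.0     # sign function Phi (eq. 32)
                    wm = 0.5 * (1 + np.tanh(eta * (mag_norm[y, x] - mag_norm[i, j])))
                    wd = abs(dot)                      # direction weight (eq. 31)
                    acc += phi * ty * wm * wd
            n = np.linalg.norm(acc)
            out[i, j] = acc / n if n > 1e-12 else tx   # 1/k: normalization
    return out

# toy field: horizontal tangents with one sign-flipped vector
t = np.zeros((5, 5, 2))
t[..., 0] = 1.0
t[2, 2] = [-1.0, 0.0]
m = np.ones((5, 5))               # uniform normalized gradient magnitude
t1 = etf_iteration(t, m)          # the field stays parallel to the x-axis
```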

In contour extraction, the traditional difference-of-Gaussians filter suffers from poor coherence and unclear directionality, caused by the isotropic structure of its kernel. This method adopts the flow-based difference-of-Gaussians algorithm to overcome these problems. Figure 10 illustrates the difference-of-Gaussians method based on the edge tangent flow: panel (a) is taken from Figure 9; in panel (b), Cx denotes the flow curve at point x, s is the arc-length parameter with range [−S, S], and lS denotes the line perpendicular to the curve Cx that intersects Cx at the point x, used to represent the width of the filter.

The FDOG filter is defined as: F(s) = ∫ I(ls(t)) f(t) dt, integrated over the line lS  (33)

where ls(t) denotes the point on the line lS at parameter t; lS is perpendicular to the curve Cx and intersects Cx at the point x, representing the width of the filter; Cx denotes the flow curve at point x, and s is the arc-length parameter with range [−S, S]. Hence I(ls(t)) denotes the value of the input image I at the point ls(t). For the function f, the DOG edge model is used:

f(t) = Gσc(t) − ρ·Gσs(t)  (34), where Gσ(x) is a univariate Gaussian function; σc and σs control, respectively, the center interval and the surround interval of the filter response, with σs = 1.6σc, which makes the function f a close approximation of the Laplacian of Gaussian; ρ controls the noise level and sensitivity, with values in [0.79, 1.0];

The accumulated response of the FDOG filter along Cx is: H(x) = ∫ Gσm(s) F(s) ds, integrated over s ∈ [−S, S]  (35)

After the flow-based difference-of-Gaussians filtering result is obtained, FDOG thresholds it to produce a binary edge image: H̃(x) = 0 if H(x) < 0 and 1 + tanh(H(x)) < τ, and 1 otherwise  (36)

where τ controls the magnitude of the center-surround difference, with values in [0, 1]; 0.5 is used most of the time.
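The DOG edge model f(t) of equation (34) and the thresholding rule of equation (36) can be sketched in one dimension. This is a simplified illustration that filters straight across an image row instead of along flow curves; σc, the ratio σs = 1.6σc, ρ and τ follow the ranges given above:

```python
import numpy as np

def gauss(t, sigma):
    """Univariate Gaussian density."""
    return np.exp(-t ** 2 / (2 * sigma ** 2)) / (np.sqrt(2 * np.pi) * sigma)

def dog_profile(t, sigma_c=1.0, rho=0.99):
    """f(t) = G_sigma_c(t) - rho * G_sigma_s(t), with sigma_s = 1.6 sigma_c (eq. 34)."""
    return gauss(t, sigma_c) - rho * gauss(t, 1.6 * sigma_c)

def threshold(H, tau=0.5):
    """Binary edge map of eq. (36): 0 (edge) where H < 0 and 1 + tanh(H) < tau."""
    return np.where((H < 0) & (1 + np.tanh(H) < tau), 0, 1)

# filter a 1-D step edge (0 / 255 gray levels) with the DOG profile
t = np.arange(-6, 7)
f = dog_profile(t)
row = np.concatenate([np.zeros(20), np.full(20, 255.0)])
H = np.convolve(row, f, mode='same')
edges = threshold(H, tau=0.5)   # 0-pixels mark the step, 1 elsewhere
```

The negative lobe of H appears only next to the gray-level step, so the thresholded map marks edge pixels there and leaves both flat regions untouched.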

As shown in Figure 11, two monocular digital images were selected to test the contour extraction algorithm: (a) a simple calligraphy image with distinct edge features and a clean background, and (b) a complex cartoon image with many contours and background-color interference.

Edge detection algorithms using several different operators and filters were applied to extract contours from these two very different images so as to compare the contour edge detection results. Figure 12 shows the comparison of results.

In the contour extraction programs, the Canny algorithm used the two hysteresis thresholds 150 and 100 with an operator Gaussian radius of 3; the Sobel algorithm used a Gaussian radius of 3 with first-order differences in both the x and y directions; the Scharr filter used first-order differences in x and y with a derivative scaling factor of 1; the ETF algorithm used a kernel size of 5 and a threshold of 0.8; and the parameters of this method were chosen as σc = 0.3, σm = 3, τ = 0.8, ρ = 0.998.

The comparison in Figure 12 shows the differences among the algorithms. The contours extracted by the Canny algorithm contain relatively little noise and interference, but they are thin and fragmented, and image noise is easily identified as edges. The Sobel algorithm and the Scharr filter share the same problem: in the simple calligraphy contour without background the edges are not clear enough, especially for the Scharr filter, and in the cartoon image contour with a complex background their resistance to background interference is poor, producing considerable noise. The contour information extracted by the ETF algorithm breaks in the presence of interference, and the contours are somewhat distorted. Judging from the results, when extracting text contours and graphic image contours, the proposed algorithm preserves the image contours well, resists interference strongly, and generates clearer images.

In summary, in the scale space of the image, the Gaussian difference pyramid is combined with the luminance and chrominance information of the digital image: the chrominance-gradient image and the luminance-gradient image are fused, the edge direction computed at each point forms the edge tangent flow, and the flow-based difference-of-Gaussians algorithm is used to detect and extract the contour edge lines of the digital image, overcoming the poor coherence caused by the isotropy of the traditional DOG kernel. A digital image contour map is generated that retains salient edge features while removing noise and interference, and comparison experiments with several widely used contour extraction methods show that the contour extraction results obtained by this method are more coherent and clearer.

Claims (8)

1.一种基于单目数字图像的轮廓提取方法,其特征是,具体包括如下步骤:1. A contour extraction method based on a monocular digital image, characterized in that it specifically comprises the following steps: (1)图像尺度空间:在图像大小不变的前提保证下,通过平滑和滤波器迭代,得到数字图像在各个尺度下的结构信息,并在此基础上,用预设的实验窗口对各个尺度下的图像进行研究处理,达到图像预处理的效果;(1) Image scale space: Under the premise of keeping the image size unchanged, the structural information of the digital image at each scale is obtained through smoothing and filter iteration. On this basis, the image at each scale is studied and processed using a preset experimental window to achieve the effect of image preprocessing. (2)高斯差分算子边缘检测:把观察图像的窗口设置为固定值,在这个窗口下观察图像的像素尺寸发生变化,这就产生了图像金字塔;在图像金字塔的基础上以高斯平滑和下采样对图像进行操作,将获得的一组处理完成的图像进行集合排列,就得到了高斯金字塔;将高斯金字塔每一组上层图像与下层图像做预测残差计算,形成高斯差分金字塔,获得高斯差分算子DOG;(2) Gaussian difference operator edge detection: The window for observing the image is set to a fixed value. The pixel size of the image observed under this window changes, which generates an image pyramid. Based on the image pyramid, the image is operated by Gaussian smoothing and downsampling. The obtained set of processed images are arranged in a set to obtain a Gaussian pyramid. The prediction residual is calculated between each group of upper and lower images of the Gaussian pyramid to form a Gaussian difference pyramid and obtain the Gaussian difference operator DOG. (3)基于流的高斯差分边缘检测:以高斯平滑和图像灰度化对图像进行预处理获取梯度图像,再使用双边滤波操作对提取的轮廓线进行处理,构造边缘切线流的同时调和图像,之后基于边缘切线流的高斯差分算法对数字图像进行轮廓边缘线的提取;(3) Flow-based Gaussian difference edge detection: The image is preprocessed by Gaussian smoothing and image graying to obtain a gradient image, and then the extracted contour lines are processed using a bilateral filter operation to construct an edge tangent flow and harmonize the image. After that, the Gaussian difference algorithm based on the edge tangent flow is used to extract the contour edge lines of the digital image. 
在步骤(3)中,梯度图像的获取具体如下:In step (3), the gradient image is obtained as follows: 在图像f的(x,y)像素点确定周围像素点集合的强度和方向,所用到的工具就是梯度,梯度用来表示,并且用向量的方式来定义:At the (x,y) pixel of image f, the intensity and direction of the surrounding pixel set are determined. The tool used is the gradient. To represent it, and define it in vector form: 梯度向量表示了图像f在像素点(x,y)处沿着梯度的方向变化最快,变化率最大,这是梯度向量的在几何上的一个显著特点,梯度向量的长度用M(x,y)表示:The gradient vector indicates that the image f changes fastest and at the largest rate along the gradient direction at the pixel point (x, y). This is a significant geometric feature of the gradient vector. The length of is represented by M(x,y): M(x,y)所代表的含义是梯度向量方向变化率的值,这其中gx,gy和M(x,y)在像素尺寸大小上都和原图像相同,是x和y在图像f中所有像素位置上变化求导时得到的,M(x,y)就被称为梯度图像;M(x,y) represents the value of the rate of change of the gradient vector direction. Among them, gx , gy and M(x,y) are the same as the original image in pixel size. They are obtained by taking the derivative of x and y at all pixel positions in the image f. M(x,y) is called the gradient image. 梯度向量相对于x轴的角度为:The angle of the gradient vector relative to the x-axis is: 像素点的边缘方向与该点的梯度向量垂直,因此用梯度来确定某一点的边缘强度和方向;The edge direction of a pixel is perpendicular to the gradient vector of that point, so the gradient is used to determine the edge strength and direction of a point; 在灰度图像中只能得到亮度信息,因此将图像从RGB空间转到YUV彩色空间中进行处理,在YUV空间中获取到图像的亮度和色度信息,其转换矩阵为:Only brightness information can be obtained in grayscale images, so the image is converted from RGB space to YUV color space for processing, and the brightness and chromaticity information of the image is obtained in YUV space. The conversion matrix is: 其中,Y为图像中的像素亮度值,R、G、B三个字母分别代表了红、绿、蓝三个基底颜色,因此图像的亮度信息为:Among them, Y is the pixel brightness value in the image, and the three letters R, G, and B represent the three base colors of red, green, and blue respectively. 
Therefore, the luminance of the image is:

Y = 0.299R + 0.587G + 0.114B

When taking first- or second-order derivatives of image pixels, the influence of noise on the result cannot be ignored. The Sobel operator, being a discrete differential operator, is effective at suppressing noise, so it is used to compute the gradient of the image luminance:

S_x = [-1 0 1; -2 0 2; -1 0 1],  S_y = [-1 -2 -1; 0 0 0; 1 2 1]
ΔY_x = S_x * Y,  ΔY_y = S_y * Y

The gradient magnitude of the image luminance is:

|ΔY| = (ΔY_x^2 + ΔY_y^2)^(1/2)

Because evaluating squares and square roots is time-consuming, the square root is dropped in favor of the following approximation to improve computational efficiency:

|ΔY| = |ΔY_x| + |ΔY_y|

When computing the gradient of the chrominance information, the internationally adopted colorimetric standard CIE-L*a*b* is applied; this standard can be used to compute both the light-source colors and the object colors of an image.
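A minimal sketch of the Sobel gradient with the square-root-free approximation, assuming NumPy and a naive "valid"-mode convolution; all names are illustrative:

```python
import numpy as np

# Assumed 3x3 Sobel kernels for the horizontal and vertical derivatives.
SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T

def conv2_valid(img, kernel):
    """Plain 'valid'-mode 2-D correlation; enough for a 3x3 kernel demo."""
    h, w = img.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            out[i, j] = np.sum(img[i:i + 3, j:j + 3] * kernel)
    return out

def luminance(rgb):
    """BT.601 luma: Y = 0.299 R + 0.587 G + 0.114 B."""
    return rgb @ np.array([0.299, 0.587, 0.114])

def sobel_l1(Y):
    """|dY| approximated as |dY_x| + |dY_y| (square root dropped)."""
    return np.abs(conv2_valid(Y, SOBEL_X)) + np.abs(conv2_valid(Y, SOBEL_Y))

# Vertical step edge in luminance.
Y = np.zeros((4, 4))
Y[:, 2:] = 1.0
g = sobel_l1(Y)
```

For this vertical step the y-derivative is zero everywhere, so the L1 approximation reduces to |ΔY_x| and every interior position sees the full Sobel edge response.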
To convert the RGB image to the CIE-L*a*b* color space, RGB is first converted to the XYZ color space:

X = 0.4125R + 0.3576G + 0.1804B
Y = 0.2127R + 0.7152G + 0.0722B
Z = 0.0193R + 0.1192G + 0.9502B

where X, Y, Z are the tristimulus values of the object. The conversion formulas for CIE-L*a*b* space are:

L* = 116 (Y/Y_0)^(1/3) - 16
a* = 500 [(X/X_0)^(1/3) - (Y/Y_0)^(1/3)]
b* = 200 [(Y/Y_0)^(1/3) - (Z/Z_0)^(1/3)]

for Y/Y_0 > 0.01. X_0, Y_0, Z_0 are the tristimulus values of the CIE standard illuminant, L* is the psychometric lightness, and a*, b* are the psychometric chromaticities. The gradient containing the image chrominance information is therefore expressed through the color differences ΔC_x and ΔC_y between neighboring pixels along x and y, and the gradient magnitude of the image chrominance information is:

|ΔC| = (ΔC_x^2 + ΔC_y^2)^(1/2)

where ΔE is the distance in CIE-L*a*b* color space:

ΔE = [(ΔL*)^2 + (Δa*)^2 + (Δb*)^2]^(1/2)

For convenience of computation, its approximation is taken as:

ΔE ≈ |ΔL*| + |Δa*| + |Δb*|

and likewise the gradient magnitude of the chrominance information is approximated as:

|ΔC| ≈ |ΔC_x| + |ΔC_y|

The luminance-gradient approximation and the chrominance-gradient approximation obtained above are then brought into a common numerical range by normalization:

ΔY' = (ΔY - min) / (max - min)
ΔC' = (ΔC - min) / (max - min)

This yields a luminance gradient and a chrominance gradient that both satisfy a linear relationship, and the fused gradient M_0(x) is computed as a weighted linear combination of the two. 2. The contour extraction method based on a monocular digital image according to claim 1, characterized in that, in step (1), taking image I(x, y) as an example, the scale space T_t is generated by an image smoothing operator with scale parameter t; the scale parameter t reflects the degree to which the digital image has been smoothed.
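Returning to the gradient fusion in claim 1 above: the min-max normalization and linear fusion of the two gradient maps can be sketched as follows. The equal weight in `fused_gradient` is an assumption, since the claim only states that the fusion is linear; NumPy is assumed:

```python
import numpy as np

def minmax_norm(g):
    """Normalize a gradient map into [0, 1]: g' = (g - min) / (max - min)."""
    lo, hi = float(g.min()), float(g.max())
    if hi == lo:                      # flat map: avoid division by zero
        return np.zeros_like(g, dtype=float)
    return (g - lo) / (hi - lo)

def fused_gradient(dY, dC, w=0.5):
    """Linear fusion of the normalized luminance and chrominance gradients.
    The weight w = 0.5 is an assumption, not taken from the claim."""
    return w * minmax_norm(dY) + (1.0 - w) * minmax_norm(dC)

dY = np.array([[0.0, 2.0], [4.0, 8.0]])   # toy luminance gradient
dC = np.array([[8.0, 4.0], [2.0, 0.0]])   # toy chrominance gradient
M0 = fused_gradient(dY, dC)
```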
{T_t}_{t∈R} is called the image of the two-dimensional image I(x, y) under the scale parameter t, where R is the set of real numbers; the larger the parameter t, the simpler the content of the image. 3. The contour extraction method based on a monocular digital image according to claim 2, characterized in that, in step (1), the scale-space operator T_t must satisfy the following visual invariances:
Grayscale invariance: T_t(I + h) = T_t(I) + h, where h is an arbitrary constant;
Contrast invariance: T_t(f(I)) = f(T_t(I)), where f is an arbitrary non-decreasing real function;
Translation invariance: T_t(τ_a(I)) = τ_a(T_t(I)), where τ_a(I) = I(x + a) and a is an arbitrary constant;
Scale invariance: there exists t'(t, δ) > 0 such that H_δ T_t = T_{t'} H_δ, with (H_δ I)(x) = I(δx), where δ is an arbitrary positive real number and t is the scale parameter;
Euclidean invariance: for any orthogonal matrix R, T_t(R·I) = R·T_t(I), where (R·I)(x) = I(R·x);
Affine invariance: for any affine transformation A and any scale parameter t, there exists t'(t, A) > 0 such that A T_t = T_{t'} A;
here, T_t(I) denotes the scale-space operator function and I denotes the input image I(x, y).
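The grayscale invariance T_t(I + h) = T_t(I) + h can be checked numerically for a concrete smoothing operator; below is a sketch using a separable Gaussian blur as T_t (NumPy assumed, names illustrative). The identity holds exactly because the kernel sums to one:

```python
import numpy as np

def gauss_kernel1d(sigma):
    r = max(1, int(3 * sigma))
    x = np.arange(-r, r + 1, dtype=float)
    k = np.exp(-x ** 2 / (2 * sigma ** 2))
    return k / k.sum()                      # unit-sum kernel

def smooth(I, sigma):
    """Separable Gaussian smoothing T_t; edge padding keeps the image size."""
    k = gauss_kernel1d(sigma)
    r = len(k) // 2
    P = np.pad(I, r, mode="edge")
    P = np.apply_along_axis(lambda v: np.convolve(v, k, "valid"), 1, P)
    return np.apply_along_axis(lambda v: np.convolve(v, k, "valid"), 0, P)

rng = np.random.default_rng(0)
I = rng.random((8, 8))
h = 5.0
lhs = smooth(I + h, 1.0)        # T_t(I + h)
rhs = smooth(I, 1.0) + h        # T_t(I) + h
```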
4. A contour extraction method based on a monocular digital image according to claim 1, characterized in that, in step (2), the image pyramid is used in image compression and computer vision: multiple resolutions of the same original image are arranged in a pyramid shape from large to small, obtained by stepwise downsampling that stops only when a given termination condition is reached. In the image pyramid, the higher the level, the smaller the image size and resolution; the base level J has pixel size 2^J × 2^J, i.e. N × N with J = log2 N; the apex, level 0, has size 1 × 1, i.e. a single pixel; and level j has pixel size 2^j × 2^j, where 0 ≤ j ≤ J. Considering image distortion, the image pyramid is shortened to P + 1 levels, where 1 ≤ P ≤ J and j = J - P, ..., J - 2, J - 1, J; limiting the levels to P gives a reduced-resolution approximation of the original digital image, and the total number of pixels in a (P + 1)-level pyramid (P > 0) is:

N^2 (1 + 1/4^1 + 1/4^2 + ... + 1/4^P) ≤ (4/3) N^2
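The pixel-count bound above follows from the geometric series 1 + 1/4 + ... + 1/4^P < 4/3; a small sketch in plain Python (`pyramid_pixels` is an illustrative name):

```python
def pyramid_pixels(N, P):
    """Total pixel count of a (P + 1)-level pyramid whose base is N x N and
    whose side length halves at each level: sum of N^2 / 4^k for k = 0..P."""
    return sum(N * N // 4 ** k for k in range(P + 1))

# A 512 x 512 base with P = 3 extra levels.
total = pyramid_pixels(512, 3)
```

For N = 512 and P = 3 this gives 262144 + 65536 + 16384 + 4096 pixels, which stays below the (4/3)N^2 bound.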
5. The method for extracting contours based on a monocular digital image according to claim 4, wherein, in step (2), each image resolution corresponds to one group; the Gaussian pyramid contains several layers in each group, and each layer is constructed from the previous layer's image by means of Gaussian smoothing and downsampling. The specific construction process is as follows:
(21) The first layer of the first group of the image pyramid is the original digital image. It is convolved with a Gaussian kernel with standard deviation σ, and the resulting new image is placed in the second layer of the first group, where the Gaussian transfer function is:

H(x, y) = e^(-D^2(x, y) / (2σ^2)),  with D(x, y) = [(x - x_0)^2 + (y - y_0)^2]^(1/2)

where D(x, y) is the distance between the point (x, y) in the frequency domain and the center of the frequency rectangle;
(22) A new standard deviation σ' is used to convolve the image of the second layer of the first group with a Gaussian kernel, and the resulting image is placed in the third layer of the first group of the image pyramid;
(23) Step (22) is repeated until the N layers of the first group are obtained; within each group of the Gaussian pyramid, every layer is obtained by convolving the same image with a different standard deviation;
(24) The original image is downsampled, the resulting image is placed in the first layer of the second group of the image pyramid, and the preceding three steps are applied to it in the same way to obtain its N layers;
(25) Step (24) is repeated to obtain M groups of N layers each, which finally constitutes the Gaussian pyramid.
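Steps (21) to (25) can be sketched as follows. A binomial kernel stands in for the Gaussian, and downsampling each group's base image (rather than re-deriving every group from the original) is a simplifying assumption; NumPy is assumed and all names are illustrative:

```python
import numpy as np

BINOMIAL = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0  # cheap Gaussian stand-in

def blur(I):
    """Separable binomial smoothing with edge padding (size-preserving)."""
    r = len(BINOMIAL) // 2
    P = np.pad(I, r, mode="edge")
    P = np.apply_along_axis(lambda v: np.convolve(v, BINOMIAL, "valid"), 1, P)
    return np.apply_along_axis(lambda v: np.convolve(v, BINOMIAL, "valid"), 0, P)

def gaussian_pyramid(img, m_groups, n_layers):
    """M groups of N layers: within a group each layer smooths the previous
    one (steps 21-23); each new group starts from a 2x downsampling of the
    previous group's base image (steps 24-25, simplified)."""
    pyramid, base = [], img.astype(float)
    for _ in range(m_groups):
        layers = [base]
        for _ in range(n_layers - 1):
            layers.append(blur(layers[-1]))
        pyramid.append(layers)
        base = base[::2, ::2]           # downsample for the next group
    return pyramid

pyr = gaussian_pyramid(np.ones((16, 16)), m_groups=3, n_layers=4)
```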
6. The method for extracting contours based on a monocular digital image according to claim 5, characterized in that, in step (2), the difference-of-Gaussians operator DOG is obtained as follows:
In the scale space constructed by the Gaussian pyramid, σ is the scale-space coordinate, M is the number of groups in the Gaussian pyramid structure, and N is the number of layers in each group:

σ(m, n) = σ_0 · 2^(m + n/N)

where σ_0 is the reference scale, m is the index of the group in the Gaussian pyramid, and n is the index of the layer within the group. The Gaussian pyramid holds M groups of N layers of images in total, and each image in it is located by, and in one-to-one correspondence with, the pair (m, n).
In the Gaussian pyramid, the Gaussian filtering of an image is expressed as:

L(x, y, σ_1) = G_σ1(x, y) * I(x, y)

Substituting a different parameter value of σ gives another Gaussian-filtered image:

L(x, y, σ_2) = G_σ2(x, y) * I(x, y)

where the Gaussian filter function is:

G_σ(x, y) = (1 / (2πσ^2)) e^(-(x^2 + y^2) / (2σ^2))

Then the difference of the two Gaussian-filtered images is computed, giving:

L(x, y, σ_1) - L(x, y, σ_2) = (G_σ1 - G_σ2) * I(x, y)

Writing G_σ1 - G_σ2 as the difference-of-Gaussians operator DOG, we obtain:

DOG * I(x, y) = L(x, y, σ_1) - L(x, y, σ_2)
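The DOG computation — the difference of two Gaussian-filtered copies of the same image — can be sketched as follows (NumPy assumed, names illustrative):

```python
import numpy as np

def gauss_kernel1d(sigma):
    r = max(1, int(3 * sigma))
    x = np.arange(-r, r + 1, dtype=float)
    k = np.exp(-x ** 2 / (2 * sigma ** 2))
    return k / k.sum()

def gauss_blur(I, sigma):
    """Separable Gaussian smoothing with edge padding (size-preserving)."""
    k = gauss_kernel1d(sigma)
    r = len(k) // 2
    P = np.pad(I, r, mode="edge")
    P = np.apply_along_axis(lambda v: np.convolve(v, k, "valid"), 1, P)
    return np.apply_along_axis(lambda v: np.convolve(v, k, "valid"), 0, P)

def dog(I, sigma1, sigma2):
    """DOG response (G_sigma1 - G_sigma2) * I as a difference of blurs."""
    return gauss_blur(I, sigma1) - gauss_blur(I, sigma2)

# The DOG of a step edge responds near the edge and vanishes on flat regions.
step = np.zeros((8, 8))
step[:, 4:] = 1.0
response = dog(step, 1.0, 1.6)
```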
7. The method for contour extraction based on a monocular digital image according to claim 1, wherein, in step (3), the specific method for constructing the edge tangent flow is as follows:
To retain sufficient detail in the contour extraction of the image, the edge tangent flow (ETF) is used for the smoothing. The ETF construction filter is defined as:

t_new(x) = (1/k) Σ_{y∈Ω_μ(x)} Φ(x, y) t_cur(y) ω_s(x, y) ω_m(x, y) ω_d(x, y)

where Ω_μ(x) is the neighborhood of x with radius μ, k is the normalization factor of the vector, t_cur(y) is the normalized tangent vector at y, and Φ(x, y) denotes the direction of the vector t_cur(y).
In this expression, ω_s(x, y) is the spatial weight function, in essence a box filter of radius μ:

ω_s(x, y) = 1 if ||x - y|| < μ, and 0 otherwise

ω_m(x, y) is the magnitude weight function:

ω_m(x, y) = (1/2) (1 + tanh(η · (e(y) - e(x))))

where e(x) is the normalized gradient magnitude at the point x and η is the fall-off rate;
ω_d(x, y) is the direction weight function:

ω_d(x, y) = |t_cur(x) · t_cur(y)|

The magnitude weight function ω_m(x, y) and the direction weight function ω_d(x, y) play an important role in preserving image features; the value of ω_d(x, y) decreases as the two normalized tangent vectors approach perpendicularity and increases as they approach parallelism.
At this point the sign function Φ(x, y) ∈ {1, -1} is used to represent the direction of t_cur(y):

Φ(x, y) = 1 if t_cur(x) · t_cur(y) > 0, and -1 otherwise

Starting from the initial fused gradient M_0(x), the initial edge tangent flow t_0(x) is obtained from the perpendicular relation to the gradient field; iterating the update t_i(x) → t_{i+1}(x) two to three times yields an edge tangent flow (ETF) of good quality.
8. The contour extraction method based on a monocular digital image according to claim 7, characterized in that, in step (3), a one-dimensional Gaussian difference is computed by DOG along the direction orthogonal to the edge tangent, and edge-aligned smoothing is performed by line integral convolution along the edge tangent flow; this flow-guided method with two filtering passes is called the flow-based difference-of-Gaussians algorithm FDOG.
The FDOG filter is defined as:

F(s) = ∫_{-T}^{T} I(l_s(t)) f(t) dt

where l_s(t) is the point on the line l_s at parameter t; l_s is perpendicular to the curve C_x, intersects C_x at the point corresponding to arc length s, and t parameterizes the width of the filter across the curve; C_x is the flow curve through the point x, and s is the arc-length parameter with range [-S, S]. Accordingly, I(l_s(t)) is the value of the input image I at the point l_s(t).
For the function f, the DOG edge model is used:

f(t) = G_σc(t) - ρ · G_σs(t)

where G_σ(x) is a one-dimensional Gaussian function with standard deviation σ; σ_c and σ_s control, respectively, the center-edge spacing and the spacing to the surrounding edges in the filter output, with σ_s = 1.6σ_c, which makes the function f a close approximation of the Laplacian of Gaussian; ρ controls the noise level and the sensitivity, with a value in [0.79, 1.0].
The cumulative response of the FDOG filter along C_x is:

H(x) = ∫_{-S}^{S} G_σm(s) F(s) ds

where G_σm(s) is a one-dimensional Gaussian weighting the accumulation along the flow curve. After the flow-based Gaussian difference filtering result is obtained, FDOG applies thresholding to obtain the binary edge image:

H̃(x) = 0 if H(x) < 0 and 1 + tanh(H(x)) < τ; H̃(x) = 1 otherwise

where τ controls the magnitude of the center-surround difference and takes a value in [0, 1].
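A one-dimensional sketch of the DOG edge model f(t) = G_σc(t) - ρ·G_σs(t) and of the thresholding step, applied across a step edge. The 1 + tanh soft threshold is an assumed form in the spirit of flow-based DOG; NumPy is assumed and all names are illustrative:

```python
import numpy as np

def gauss(t, sigma):
    return np.exp(-t ** 2 / (2 * sigma ** 2)) / (np.sqrt(2 * np.pi) * sigma)

def dog_edge_model(t, sigma_c, rho=0.99):
    """f(t) = G_{sigma_c}(t) - rho * G_{sigma_s}(t), with sigma_s = 1.6 sigma_c."""
    return gauss(t, sigma_c) - rho * gauss(t, 1.6 * sigma_c)

def threshold_edges(u, tau=0.9):
    """Binary edge map: 0 (edge) where the response is clearly negative.
    The 1 + tanh soft step is an assumption, not taken from the claim."""
    return np.where((u < 0.0) & (1.0 + np.tanh(u) < tau), 0.0, 1.0)

# 1-D luminance profile with a step edge, filtered across the edge.
t = np.arange(-4.0, 5.0)
kernel = dog_edge_model(t, sigma_c=1.0)
signal = np.concatenate([np.zeros(10), np.ones(10)])
response = np.convolve(signal, kernel, mode="same")
edges = threshold_edges(response)
```

Across the step the response swings negative on one side and positive on the other, so the thresholded map marks edge pixels (0) near the transition and background (1) elsewhere.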
CN202210674544.0A 2022-06-15 2022-06-15 A Contour Extraction Method Based on Monocular Digital Image Active CN115100226B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210674544.0A CN115100226B (en) 2022-06-15 2022-06-15 A Contour Extraction Method Based on Monocular Digital Image


Publications (2)

Publication Number Publication Date
CN115100226A CN115100226A (en) 2022-09-23
CN115100226B true CN115100226B (en) 2024-10-01

Family

ID=83290444

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210674544.0A Active CN115100226B (en) 2022-06-15 2022-06-15 A Contour Extraction Method Based on Monocular Digital Image

Country Status (1)

Country Link
CN (1) CN115100226B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116834023B (en) * 2023-08-28 2023-11-14 山东嘉达装配式建筑科技有限责任公司 Nailing robot control system
CN117952857B (en) * 2024-03-22 2024-06-14 汉中神灯生物科技有限公司 Spectral image intelligent analysis method of natural food additive
CN118537819B (en) * 2024-07-25 2024-10-11 中国海洋大学 Low-calculation-force frame difference method road vehicle visual identification method, medium and system

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0692772A1 (en) * 1994-07-12 1996-01-17 Laboratoires D'electronique Philips S.A.S. Method and apparatus for detecting characteristic points on the contour of an object
CN110097626A (en) * 2019-05-06 2019-08-06 浙江理工大学 A kind of basse-taille object identification processing method based on RGB monocular image

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on shallow bas-relief forming technology based on monocular images; Zhao Yun; China Master's Theses Full-text Database; 2023-02-15 (No. 2); full text *

Also Published As

Publication number Publication date
CN115100226A (en) 2022-09-23

Similar Documents

Publication Publication Date Title
CN115100226B (en) A Contour Extraction Method Based on Monocular Digital Image
CN108376391B (en) Intelligent infrared image scene enhancement method
Park et al. Single image dehazing with image entropy and information fidelity
Fan et al. Homomorphic filtering based illumination normalization method for face recognition
CN104463814B (en) Image enhancement method based on local texture directionality
CN105139391B (en) A kind of haze weather traffic image edge detection method
Krishnan et al. A survey on different edge detection techniques for image segmentation
CN109190617B (en) Image rectangle detection method and device and storage medium
CN108389215A (en) A kind of edge detection method, device, computer storage media and terminal
CN111353955A (en) Image processing method, device, equipment and storage medium
CN114612359A (en) Visible light and infrared image fusion method based on feature extraction
WO2017128646A1 (en) Image processing method and device
CN115689960A (en) A fusion method of infrared and visible light images based on adaptive illumination in nighttime scenes
CN108038458B (en) Automatic acquisition method of outdoor scene text in video based on feature summary map
CN110599553B (en) A kind of skin color extraction and detection method based on YCbCr
Kumar et al. Enhancing scene perception using a multispectral fusion of visible–near‐infrared image pair
Han et al. Automatic illumination and color compensation using mean shift and sigma filter
Jose et al. Bilateral edge detectors
Chen et al. Image segmentation in thermal images
CN118379636A (en) Remote sensing image shadow automatic detection algorithm based on orthogonal decomposition
CN109410227B (en) GVF model-based land utilization pattern spot contour extraction algorithm
CN102855025A (en) Optical multi-touch contact detection method based on visual attention model
CN110930358A (en) Solar panel image processing method based on self-adaptive algorithm
Xu et al. Quaternion quasi-Chebyshev non-local means for color image denoising
CN115035350A (en) Method for detecting small targets against air ground and ground background based on edge detection enhancement

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant