CN115311553A - Target detection method and device, electronic equipment and storage medium
- Publication number: CN115311553A
- Application number: CN202210819568.0A
- Authority: CN (China)
- Prior art keywords: feature, target, detected, sample, target detection
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06V20/10 — Scenes; scene-specific elements: terrestrial scenes
- G06V10/42 — Global feature extraction by analysis of the whole pattern, e.g. using frequency domain transformations or autocorrelation
- G06V10/764 — Recognition or understanding using pattern recognition or machine learning, using classification, e.g. of video objects
- G06V10/7715 — Feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; mappings, e.g. subspace methods
- G06V10/774 — Generating sets of training patterns; bootstrap methods, e.g. bagging or boosting
- G06V10/806 — Fusion, i.e. combining data from various sources at the sensor, preprocessing, feature extraction or classification level, of extracted features
- G06V10/82 — Recognition or understanding using pattern recognition or machine learning, using neural networks
- G06V2201/07 — Indexing scheme relating to image or video recognition or understanding: target detection
Abstract
An embodiment of the present invention provides a target detection method. The method includes: acquiring an image to be detected, the image containing a rotated target to be detected; performing feature extraction on the image through a trained target detection model to obtain classification features and auxiliary features of the rotated target; and performing auxiliary processing on the classification features based on the auxiliary features to obtain a detection result for the rotated target. Because the classification features and auxiliary features of the rotated target are extracted directly, and the auxiliary features are used to assist the processing of the classification features, no anchors need to be designed and therefore no non-maximum suppression is needed. This improves the detection accuracy of rotated-target detection and thereby its overall detection performance.
Description
Technical Field

The present invention relates to the field of artificial intelligence, and in particular to a target detection method and device, an electronic device, and a storage medium.
Background

Rotated-target detection refers to detecting targets that have an orientation; that is, the center point, width, height, and angle of the target must all be detected. It is common in overhead imagery, such as target detection in remote sensing images and aerial photographs. Current rotated-target detection usually relies on detection algorithms based on anchors and non-maximum suppression. Because rotated targets vary in size and position within an image, anchor-based detection algorithms have inherent drawbacks: to detect rotated targets of all sizes and positions, the anchor design becomes very complex, with many aspect ratios and scales to choose, and poorly chosen anchor ratios and scales also degrade the subsequent non-maximum suppression, causing error accumulation that lowers the detection accuracy of the target detection model. Existing rotated-target detection algorithms therefore suffer from low detection accuracy.
Summary

An embodiment of the present invention provides a target detection method that aims to solve the problem of low detection accuracy in existing rotated-target detection algorithms. By extracting classification features and auxiliary features of the rotated target to be detected, and using the auxiliary features to assist the processing of the classification features, the detection result of the rotated target is obtained without designing anchors and, consequently, without non-maximum suppression. This improves the detection accuracy of rotated-target detection and thereby its overall detection performance.
In a first aspect, an embodiment of the present invention provides a target detection method for detecting rotated targets. The method includes:

acquiring an image to be detected, the image containing a rotated target to be detected;

performing feature extraction on the image to be detected through a trained target detection model to obtain classification features and auxiliary features of the rotated target to be detected; and

performing auxiliary processing on the classification features based on the auxiliary features to obtain a detection result of the rotated target to be detected.
Optionally, the trained target detection model includes a feature extraction network, a feature fusion network, a classification feature output network, and an auxiliary feature output network, and performing feature extraction on the image to be detected through the trained target detection model to obtain the classification features and auxiliary features of the rotated target to be detected includes:

performing feature extraction on the image to be detected through the feature extraction network to obtain multi-scale features of the image to be detected;

performing feature fusion on the multi-scale features through the feature fusion network to obtain fused features of the image to be detected;

predicting, through the classification feature output network, the classification features of the rotated target to be detected from the fused features, where the classification features comprise feature channels and different categories of rotated targets correspond to different feature channels; and

predicting, through the auxiliary feature output network, the auxiliary features of the rotated target to be detected from the fused features.
Optionally, the auxiliary features include a height-width feature and a rotation angle feature, and performing auxiliary processing on the classification features based on the auxiliary features to obtain the detection result of the rotated target to be detected includes:

extracting key points from the classification features to obtain target key points of the rotated target to be detected;

based on the target key points, indexing the corresponding target height-width attribute in the height-width feature, and indexing the corresponding target rotation angle attribute in the rotation angle feature; and

obtaining the detection result of the rotated target to be detected based on the target key points, the target height-width attribute, and the target rotation angle attribute.
Optionally, before feature extraction is performed on the image to be detected through the trained target detection model to obtain the classification features and auxiliary features of the rotated target to be detected, the method further includes:

acquiring a training data set, the training data set including sample images and annotation boxes, where each sample image contains a sample rotated target and each annotation box is the annotation box of that sample rotated target; and

acquiring a target detection model and training it with the training data set to obtain the trained target detection model, the target detection model including a feature extraction network, a feature fusion network, a classification feature output network, a height-width feature output network, and a rotation angle feature output network.
Optionally, acquiring the target detection model and training it with the training data set to obtain the trained target detection model includes:

inputting the sample image into the target detection model to obtain a sample detection box corresponding to the sample rotated target;

encoding the sample detection box and the annotation box respectively with a preset encoding function to obtain a sample function distribution corresponding to the sample detection box and an annotation function distribution corresponding to the annotation box;

adjusting the network parameters of the target detection model according to the metric distance between the sample function distribution and the annotation function distribution; and

iterating the network parameter adjustment process until the target detection model converges or a preset number of iterations is reached, to obtain the trained target detection model.
Optionally, inputting the sample image into the target detection model to obtain the sample detection box corresponding to the sample rotated target includes:

processing the sample image through the target detection model to obtain sample feature maps corresponding to the sample image, and constructing a matrix grid over the sample feature maps according to their height and width, the sample feature maps including a classification feature map, a height-width feature map, and a rotation angle feature map corresponding to the sample image;

establishing, at each grid point of the height-width feature map, an index to a height-width attribute, and, at each grid point of the rotation angle feature map, an index to a rotation angle attribute; and

obtaining the sample detection box corresponding to the sample rotated target according to each grid point in the sample feature maps and its corresponding indexed attributes.
Optionally, the annotation box contains annotated key points, and adjusting the network parameters of the target detection model according to the metric distance between the sample function distribution and the annotation function distribution includes:

calculating, from the sample key points of the classification feature map, a first loss between the sample key points and the annotated key points;

converting the metric distance into a second loss through a preset conversion function; and

adjusting the network parameters of the target detection model based on the first loss and the second loss.
In a second aspect, an embodiment of the present invention provides a target detection device, including:

a first acquisition module, configured to acquire an image to be detected, the image containing a rotated target to be detected;

an extraction module, configured to perform feature extraction on the image to be detected through a trained target detection model to obtain classification features and auxiliary features of the rotated target to be detected; and

a processing module, configured to perform auxiliary processing on the classification features based on the auxiliary features to obtain a detection result of the rotated target to be detected.
In a third aspect, an embodiment of the present invention provides an electronic device, including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor, when executing the computer program, implements the steps of the target detection method provided by the embodiments of the present invention.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the steps of the target detection method provided by the embodiments of the present invention.
In the embodiments of the present invention, an image to be detected is acquired, the image containing a rotated target to be detected; feature extraction is performed on the image through a trained target detection model to obtain classification features and auxiliary features of the rotated target; and auxiliary processing is performed on the classification features based on the auxiliary features to obtain a detection result of the rotated target. Because the classification features and auxiliary features are extracted directly and the auxiliary features are used to assist the processing of the classification features, no anchors need to be designed and therefore no non-maximum suppression is needed, which improves the detection accuracy of rotated-target detection and thereby its overall detection performance.
Brief Description of the Drawings

To describe the technical solutions in the embodiments of the present invention or the prior art more clearly, the drawings required for the description of the embodiments or the prior art are briefly introduced below. The drawings described below are merely some embodiments of the present invention; those of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of a target detection method provided by an embodiment of the present invention;

Fig. 2 is a schematic structural diagram of a target detection model provided by an embodiment of the present invention;

Fig. 3 is a schematic structural diagram of a target detection device provided by an embodiment of the present invention;

Fig. 4 is a schematic structural diagram of an electronic device provided by an embodiment of the present invention.
Detailed Description

The technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Referring to Fig. 1, which is a flowchart of a target detection method provided by an embodiment of the present invention, the method is used for detecting rotated targets and includes the following steps.
101. Acquire an image to be detected.
In this embodiment of the present invention, the image to be detected contains a rotated target to be detected. The image to be detected may be a side-view image, a top-view image, a bottom-view image, or the like: a side-view image is taken from the side of the target, a top-view image from above the target, and a bottom-view image from below the target.

The rotated target to be detected may be any physical target, such as a person, a vehicle, an aircraft, a building, or an object.
102. Perform feature extraction on the image to be detected through a trained target detection model to obtain classification features and auxiliary features of the rotated target to be detected.
In this embodiment of the present invention, the image to be detected may be input into the trained target detection model, and the model performs feature extraction on the image to obtain the classification features and auxiliary features of the rotated target to be detected.
In a possible embodiment, before the image to be detected is input into the trained target detection model, it may be preprocessed. The preprocessing may include normalizing the image pixels and scaling the width and height to a size of H0 × W0, where H0 and W0 are integer multiples of 32.
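A minimal sketch of this preprocessing step follows; the target size, the normalization range, and the function name are illustrative assumptions, not taken from the patent:

```python
import cv2
import numpy as np

def preprocess(image: np.ndarray, h0: int = 512, w0: int = 512) -> np.ndarray:
    """Scale to H0 x W0 (both multiples of 32) and normalize pixel values."""
    assert h0 % 32 == 0 and w0 % 32 == 0, "H0 and W0 must be integer multiples of 32"
    resized = cv2.resize(image, (w0, h0))        # cv2.resize takes (width, height)
    return resized.astype(np.float32) / 255.0    # normalize pixels to [0, 1]
```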
The classification features of the rotated target to be detected contain its category information, for example whether the target is a person, vehicle, aircraft, building, or object. The auxiliary features of the rotated target may contain attribute information such as its height, width, and rotation angle.
Specifically, the trained target detection model includes a classification-feature branch and an auxiliary-feature branch. The model may first extract shared features from the image to be detected, then output the corresponding classification features through the classification-feature branch and the corresponding auxiliary features through the auxiliary-feature branch. The two branches have different structural parameters.
Further, the target detection model may be built on a deep convolutional neural network; after the deep convolutional neural network is trained, the trained target detection model is obtained. Specifically, sample images may be collected, each containing a sample rotated target such as a person, vehicle, aircraft, building, or object. The sample rotated targets in the sample images are annotated to obtain the corresponding label data, including category labels corresponding to the classification features and attribute labels corresponding to the auxiliary features; the attribute labels may include height-width labels and rotation angle labels. The deep convolutional neural network is trained with the sample images and the corresponding label data, so that it learns to output the classification features and auxiliary features of rotated targets; when training is complete, the trained target detection model is obtained.
103. Perform auxiliary processing on the classification features based on the auxiliary features to obtain the detection result of the rotated target to be detected.
In this embodiment of the present invention, the auxiliary features of the rotated target may contain attribute information such as its height, width, and rotation angle; the attribute information indexed in the auxiliary features may be added to the classification features, thereby obtaining the detection result of the rotated target to be detected.

The detection result may include the position, category, height, width, and rotation angle of the rotated target to be detected.
In this embodiment of the present invention, an image to be detected is acquired, the image containing a rotated target to be detected; feature extraction is performed on the image through the trained target detection model to obtain classification features and auxiliary features of the rotated target; and auxiliary processing is performed on the classification features based on the auxiliary features to obtain the detection result of the rotated target. Because the classification features and auxiliary features are extracted directly and the auxiliary features are used to assist the processing of the classification features, no anchors need to be designed and therefore no non-maximum suppression is needed, which improves the detection accuracy of rotated-target detection and thereby its overall detection performance.
Optionally, the trained target detection model includes a feature extraction network, a feature fusion network, a classification feature output network, and an auxiliary feature output network. In the step of performing feature extraction on the image to be detected through the trained target detection model to obtain the classification features and auxiliary features of the rotated target, the feature extraction network may perform feature extraction on the image to obtain its multi-scale features; the feature fusion network may fuse the multi-scale features to obtain fused features of the image; the classification feature output network may predict, from the fused features, the classification features of the rotated target, where the classification features comprise feature channels and different categories of rotated targets correspond to different feature channels; and the auxiliary feature output network may predict, from the fused features, the auxiliary features of the rotated target.
In this embodiment of the present invention, the feature extraction network may be a backbone network such as VGG19, ResNet, or MobileNet; the embodiments of the present invention place no restriction on the feature extraction network. The feature extraction network extracts features of the image to be detected at different scales, yielding the multi-scale features of the image. Note that, because of the downsampling layers in the feature extraction network, the deeper the computation goes, the smaller the scale of the extracted features.

The feature fusion network may include upsampling layers and fusion layers: an upsampling layer upsamples smaller-scale features to a larger scale, and a fusion layer then fuses the upsampled features with features of the same scale. Specifically, the feature fusion network takes multi-scale features from different stages of the feature extraction network, upsamples each small-scale feature by a factor of 2 and fuses it with the same-scale feature from the feature extraction network, one level at a time, and finally outputs the highest-scale fused feature to the prediction networks.
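A minimal sketch of this top-down fusion, assuming all levels have already been projected to the same channel count (e.g., by 1x1 convolutions) and that element-wise addition serves as the fusion operation:

```python
import torch
import torch.nn.functional as F

def fuse_multiscale(features: list[torch.Tensor]) -> torch.Tensor:
    """features: per-stage maps of shape (N, C, H_k, W_k), ordered from the
    largest spatial scale to the smallest. Each smaller map is upsampled by 2
    and fused with the next larger one; the highest-scale fused map is returned."""
    fused = features[-1]  # start from the deepest, smallest feature map
    for feat in reversed(features[:-1]):
        fused = F.interpolate(fused, scale_factor=2, mode="nearest")  # 2x upsampling
        fused = fused + feat  # fuse with the same-scale backbone feature
    return fused
```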
The classification feature output network and the auxiliary feature output network may also be called prediction networks. The classification feature output network outputs the corresponding classification features, which may contain multiple feature channels, each channel corresponding to one category of rotated target.
Specifically, the classification feature output network may be a CenterNetR-based classification output network: the fused features are fed into it, and CenterNetR predicts a center-point heatmap of the rotated target to be detected as the classification features. Heatmaps of different categories are distributed over different feature channels, so the category of the rotated target can be determined from the feature channel.

The auxiliary feature output network may likewise be a CenterNetR-based attribute output network: the fused features are fed into it, and CenterNetR predicts the attribute information corresponding to each center point as the auxiliary features, where the auxiliary features have the same scale resolution as the classification features. The auxiliary features may contain attribute information such as the height, width, and rotation angle of the rotated target. In the auxiliary features, every position corresponds to a set of attribute information, so the attribute information at a position can be indexed through the center-point position in the classification features.
Optionally, the auxiliary features include a height-width feature and a rotation angle feature. In the step of performing auxiliary processing on the classification features based on the auxiliary features to obtain the detection result of the rotated target, key points may be extracted from the classification features to obtain the target key points of the rotated target; based on the target key points, the corresponding target height-width attribute is indexed in the height-width feature and the corresponding target rotation angle attribute is indexed in the rotation angle feature; and the detection result of the rotated target is obtained based on the target key points, the target height-width attribute, and the target rotation angle attribute.

In this embodiment of the present invention, the auxiliary features include a height-width feature and a rotation angle feature: the height-width feature corresponds to the height-width attribute of the rotated target, and the rotation angle feature corresponds to its rotation angle attribute. Every position in the classification features corresponds to one position in the height-width feature and one position in the rotation angle feature. Therefore, from a target key point in the classification features, the height-width attribute can be indexed at the corresponding position of the height-width feature, and the rotation angle attribute at the corresponding position of the rotation angle feature.
具体的,高宽特征以及旋转角度特征均与分类特征具有相同的尺度分辨率,上述分类特征的高为H,宽为W,同样的,高宽特征的高为H,宽为W,旋转角度特征的高为H,宽为W。分类特征可以是中心点热力图,热力图的中心点为热力值最高的位置点,该中心点也可以作为目标关键点。具体的,在得到分类特征后,可以通过一个n*n的最大池化核对分类特征进行采样,并根据预设的置信度阈值得到高置信度的关键点作为目标关键点。其中,n小于H,且n小于W。最大池化核的作用为从在热力图n*n区域中采样最大值。以n=3进行举例,可以在热力图3*3区域中采样出热力值最高的值作为最大池化核的采样值。Specifically, the height-width feature and the rotation angle feature have the same scale resolution as the classification feature. The height of the above-mentioned classification feature is H, and the width is W. Similarly, the height and width feature is H, the width is W, and the rotation angle The height of the feature is H and the width is W. The classification feature can be a center point heat map. The center point of the heat map is the point with the highest heat value, and the center point can also be used as the target key point. Specifically, after the classification features are obtained, the classification features can be sampled through an n*n maximum pooling kernel, and the key points with high confidence are obtained as the target key points according to the preset confidence threshold. Wherein, n is smaller than H, and n is smaller than W. The role of the maximum pooling kernel is to sample the maximum value from the n*n area of the heat map. Taking n=3 as an example, the value with the highest thermal value can be sampled in the 3*3 area of the thermal map as the sampling value of the maximum pooling kernel.
After a target key point (i, j) is obtained, where (i, j) denotes the position coordinates of the target key point in the classification features, the corresponding target height-width attribute (w, h) can be indexed in the height-width feature from (i, j), where w denotes the width of the rotated target in the classification features and h its height. Likewise, the corresponding target rotation angle attribute θ can be indexed in the rotation angle feature from (i, j), where θ denotes the rotation angle of the rotated target in the classification features.

From the target key point (i, j), the target height-width attribute (w, h), and the target rotation angle attribute θ, the detection result (i, j, w, h, θ) of the rotated target to be detected is obtained.
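A minimal decoding sketch of the two steps above, assuming single-image tensors and a hypothetical confidence threshold; the tensor layouts are illustrative:

```python
import torch
import torch.nn.functional as F

def decode_detections(heatmap, wh, angle, n: int = 3, conf_thresh: float = 0.3):
    """heatmap: (C, H, W) class heatmaps; wh: (2, H, W) height-width feature;
    angle: (1, H, W) rotation angle feature. An n x n max-pool leaves local
    maxima unchanged, so comparing against the pooled map keeps only peaks."""
    pooled = F.max_pool2d(heatmap.unsqueeze(0), n, stride=1, padding=n // 2).squeeze(0)
    peaks = (heatmap == pooled) & (heatmap > conf_thresh)   # high-confidence key points
    detections = []
    for c, i, j in peaks.nonzero(as_tuple=False).tolist():  # c: category channel
        w, h = wh[0, i, j].item(), wh[1, i, j].item()       # indexed height-width attribute
        theta = angle[0, i, j].item()                       # indexed rotation angle attribute
        detections.append((c, i, j, w, h, theta))           # (category, i, j, w, h, theta)
    return detections
```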
In a possible embodiment, the auxiliary features may further include an offset feature that describes the offset of the target key points. Specifically, an offset feature output network may be added to the target detection model, and the offset feature is obtained by predicting from the fused features through this network. The offset feature also has the same scale resolution as the classification features, with height H and width W, and every position in the classification features corresponds to one position in the offset feature. From the target key point (i, j), the corresponding target offset attribute (dx, dy) can be indexed in the offset feature, so the final position of the rotated target is (x, y), where x = i + dx and y = j + dy. Combined with the target height-width attribute (w, h) and the target rotation angle attribute θ, the detection result (x, y, w, h, θ) of the rotated target is obtained. Indexing the target offset attribute in the offset feature through the target key point locates the rotated target more accurately.
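Continuing the decoding sketch above, the offset refinement could look as follows; the offset map layout is again an assumption:

```python
def refine_with_offset(detection, off):
    """detection: (c, i, j, w, h, theta) from decode_detections; off: (2, H, W)
    offset feature with dx in channel 0 and dy in channel 1."""
    c, i, j, w, h, theta = detection
    dx, dy = off[0, i, j].item(), off[1, i, j].item()
    return (c, i + dx, j + dy, w, h, theta)   # refined center (x, y) = (i + dx, j + dy)
```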
Optionally, before feature extraction is performed on the image to be detected through the trained target detection model to obtain the classification features and auxiliary features of the rotated target, a training data set may be acquired. The training data set includes sample images and annotation boxes; each sample image contains a sample rotated target, and the annotation box is the annotation box of that sample rotated target. A target detection model is acquired and trained with the training data set to obtain the trained target detection model; the target detection model includes a feature extraction network, a feature fusion network, a classification feature output network, a height-width feature output network, and a rotation angle feature output network.
In this embodiment of the present invention, the target detection model may be trained before feature extraction is performed on the image to be detected through the trained model. Sample images containing sample rotated targets may be collected and annotated to form the training data set; the sample rotated targets have the same categories as the rotated targets to be detected, such as people, vehicles, aircraft, buildings, or objects. Experts may annotate the rotated targets in the sample images to obtain annotation boxes, each including the category of the sample rotated target, the position of the annotation box, its height and width, and its rotation angle. In a possible embodiment in which the auxiliary features include an offset feature, the annotation box also carries a target offset, and the target detection model further includes an offset feature output network. Note that the classification feature output network, the height-width feature output network, the rotation angle feature output network, and the offset feature output network are all independent, parallel branch networks.
Specifically, referring to Fig. 2, which is a schematic structural diagram of a target detection model provided by an embodiment of the present invention, in the target detection model the output of the feature extraction network is connected to the input of the feature fusion network, and the output of the feature fusion network is connected to the inputs of the classification feature output network, the height-width feature output network, the rotation angle feature output network, and the offset feature output network.

The target detection model is trained with the training data set. During training, the network parameters of the feature extraction network, the feature fusion network, the classification feature output network, the height-width feature output network, the rotation angle feature output network, and the offset feature output network are adjusted iteratively until the target detection model converges or a preset number of iterations is reached, yielding the trained target detection model.
Optionally, in the step of acquiring the target detection model and training it with the training data set to obtain the trained target detection model, the sample image may be input into the target detection model to obtain the sample detection box corresponding to the sample rotated target; the sample detection box and the annotation box are each encoded with a preset encoding function, yielding the sample function distribution corresponding to the sample detection box and the annotation function distribution corresponding to the annotation box; the network parameters of the target detection model are adjusted according to the metric distance between the sample function distribution and the annotation function distribution; and the adjustment process is iterated until the target detection model converges or a preset number of iterations is reached, giving the trained target detection model.
In this embodiment of the present invention, during training the sample image may be input into the target detection model to obtain the sample detection box it outputs. The sample detection box may be assembled from the predictions of the classification feature output network, the height-width feature output network, the rotation angle feature output network, and the offset feature output network. After the sample detection box is obtained, the loss between the sample detection box and the annotation box can be computed and backpropagated to adjust the network parameters of every network in the target detection model; iterating this process completes the training of the target detection model.
Further, to improve the assisting effect of the auxiliary features and hence the detection accuracy of the target detection model, in this embodiment of the present invention the sample detection box and the annotation box are each encoded with an encoding function. The encoding function couples the classification features in the sample detection box with the auxiliary features; specifically, it couples the box position with the height-width feature and the rotation angle feature, so that the target detection model learns this coupling and the trained model outputs more accurate auxiliary features.
Still further, the encoding function may be a nonlinear distribution function, for example a two-dimensional Gaussian distribution function, which may take the following form:

μ = (x, y)^T

Σ^(1/2) = R diag(w/2, h/2) R^T, with R = [[cosθ, −sinθ], [sinθ, cosθ]]

Here (x, y, w, h, θ) is the representation of a detection box: (x, y) are the coordinates of the box center, (w, h) are its width and height, and θ is the rotation angle of the rotated target in the box. Through the above expressions, the detection box (x, y, w, h, θ) is encoded as a two-dimensional Gaussian distribution (μ, Σ), where μ is the mean of the converted two-dimensional Gaussian distribution and Σ its covariance. Both the sample detection box and the annotation box can be encoded this way.
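A small sketch of this encoding, assuming the covariance form given above (Σ^(1/2) = R diag(w/2, h/2) R^T):

```python
import math
import numpy as np

def encode_gaussian(x, y, w, h, theta):
    """Encode a rotated box (x, y, w, h, theta) as a 2-D Gaussian (mu, Sigma)."""
    mu = np.array([x, y], dtype=np.float64)              # mean: the box center
    r = np.array([[math.cos(theta), -math.sin(theta)],
                  [math.sin(theta),  math.cos(theta)]])  # rotation matrix R
    sigma_sqrt = r @ np.diag([w / 2.0, h / 2.0]) @ r.T   # Sigma^(1/2)
    return mu, sigma_sqrt @ sigma_sqrt                   # (mu, Sigma)
```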
Encoding the sample detection box yields the sample function distribution (μ1, Σ1); encoding the annotation box yields the annotation function distribution (μ2, Σ2). The metric distance between (μ1, Σ1) and (μ2, Σ2) is then computed. When the sample image is a positive sample, a smaller metric distance means the sample detection box is more similar to the annotation box and the detection result closer to the ground truth, while a larger metric distance means the boxes are less similar and the result further from the ground truth. When the sample image is a negative sample, a smaller metric distance means the sample detection box is less similar to the annotation box and the detection result closer to the ground truth, while a larger metric distance means the boxes are more similar and the result further from the ground truth.
The metric distance may be computed with methods such as the Wasserstein distance or the KL divergence; the embodiments of the present invention preferably use the Wasserstein distance to compute the metric distance between the sample function distribution (μ1, Σ1) and the annotation function distribution (μ2, Σ2), which may take the following form:

d² = ||μ1 − μ2||² + Tr(Σ1 + Σ2 − 2(Σ1^(1/2) Σ2 Σ1^(1/2))^(1/2))

where d is the metric distance between the sample function distribution (μ1, Σ1) and the annotation function distribution (μ2, Σ2), and Tr() denotes the trace of the computed matrix.
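A direct numerical sketch of this distance, using SciPy's matrix square root (the small-imaginary-part cleanup is an implementation detail):

```python
import numpy as np
from scipy.linalg import sqrtm

def wasserstein_distance(mu1, sigma1, mu2, sigma2):
    """2-Wasserstein distance between the Gaussians produced by encode_gaussian:
    d^2 = |mu1 - mu2|^2 + Tr(S1 + S2 - 2 (S1^(1/2) S2 S1^(1/2))^(1/2))."""
    s1_half = sqrtm(sigma1)
    cross = np.real(sqrtm(s1_half @ sigma2 @ s1_half))  # drop tiny imaginary residue
    d2 = np.sum((mu1 - mu2) ** 2) + np.trace(sigma1 + sigma2 - 2.0 * cross)
    return float(np.sqrt(max(d2, 0.0)))
```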
During training, encoding the detection boxes and annotation boxes couples the classification features with the auxiliary features through the encoding function, which improves the target detection model's ability to learn the auxiliary features, so that the trained model can extract more accurate auxiliary features.
In a possible embodiment, the target detection model further includes an offset feature output network. The feature extraction network extracts multi-scale features from the sample image, and the feature fusion network fuses them. The classification feature output network predicts the classification features from the fused features; the height-width feature output network predicts the height-width feature; the rotation angle feature output network predicts the rotation angle feature; and the offset feature output network predicts the offset feature. Key points are extracted from the classification features to obtain target key points; based on the target key points, the corresponding target offset attribute is indexed in the offset feature; and the detection result of the sample rotated target is obtained based on the target key points, the target height-width attribute, and the target rotation angle attribute, this detection result corresponding to the sample detection box.

By adding output networks for the auxiliary features to the target detection model, the classification features and the auxiliary features are coupled, so that the trained target detection model can output more accurate auxiliary features.
Optionally, in the step of inputting the sample image into the target detection model to obtain the sample detection box corresponding to the sample rotated target, the sample image may be processed by the target detection model to obtain the sample feature maps corresponding to the sample image, and a matrix grid may be constructed over the sample feature maps according to their height and width, the sample feature maps including the classification feature map, the height-width feature map, and the rotation angle feature map corresponding to the sample image. An index to a height-width attribute is established at each grid point of the height-width feature map, and an index to a rotation angle attribute at each grid point of the rotation angle feature map. The sample detection box corresponding to the sample rotated target is then obtained from each grid point in the sample feature maps and its corresponding indexed attributes.
In this embodiment of the present invention, the sample image may be processed by the target detection model to obtain the sample feature maps corresponding to the sample image. The sample feature maps include a classification feature map, a height-width feature map, a rotation angle feature map and, in a possible embodiment, an offset feature map; all of these feature maps have the same height H and width W.

A matrix grid can be created for the sample feature maps according to their height H and width W, specifically with meshgrid. For a grid point (i0, j0), an index relationship can be established in the height-width feature map to the detection box width attribute w0 and height attribute h0 at that point; in the rotation angle feature map, to the rotation angle attribute θ0; and in the offset feature map, to the x-direction offset attribute dx0 and the y-direction offset attribute dy0. For the center point (i1, j1) of a sample detection box, the sample height-width attribute (w1, h1), the sample rotation angle attribute θ1, and the sample offset attribute (dx1, dy1) can thus be indexed, yielding the sample detection box (x1, y1, w1, h1, θ1).
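A minimal sketch of the grid construction and per-point indexing; the tensor layouts are illustrative assumptions:

```python
import torch

def build_grid(h: int, w: int) -> torch.Tensor:
    """H x W matrix grid; grid[i, j] = (i, j), mirroring the meshgrid step above."""
    ii, jj = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    return torch.stack([ii, jj], dim=-1)  # shape (H, W, 2)

# For a grid point (i0, j0), the indexed attributes come from the parallel maps:
#   w0, h0   = wh[:, i0, j0]       height-width feature map
#   theta0   = angle[0, i0, j0]    rotation angle feature map
#   dx0, dy0 = off[:, i0, j0]      offset feature map
```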
By adding an output network for the auxiliary features to the target detection model and establishing the index relation between the classification features and the auxiliary features, the trained target detection model can output more accurate auxiliary features.
Optionally, the annotation frame contains annotated key points. In the step of adjusting the network parameters of the target detection model according to the metric distance between the sample function distribution and the annotation function distribution, a first loss between the sample key points of the classification feature map and the annotated key points can be computed; the metric distance is converted into a second loss through a preset conversion function; and the network parameters of the target detection model are adjusted based on the first loss and the second loss.
In this embodiment of the present invention, the above sample key points are the center points of the sample detection frames, and they are computed in the same way as the aforementioned target key points, namely through a maximum-pooling kernel. The above annotated key points are obtained from the annotations. The first loss between the sample key points and the annotated key points can be computed through a first loss function, shown in the following formula:
loss_hm = Gaussian_focal_loss(hm_pred, hm_target)
where loss_hm is the first loss, hm_pred is the prediction at the center point of the sample detection frame, and hm_target is the ground-truth label at the center point of the corresponding annotation frame. Gaussian_focal_loss() is the first loss function.
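The body of the first loss function is not given here; the following is a minimal PyTorch sketch of a Gaussian focal loss in the CenterNet style, where the penalty exponents alpha and beta are assumed defaults rather than values from the source:

```python
import torch

def gaussian_focal_loss(hm_pred, hm_target, alpha=2.0, beta=4.0, eps=1e-6):
    """hm_pred, hm_target: (N, C, H, W) heatmaps; targets carry Gaussian-splatted peaks."""
    pos_mask = hm_target.eq(1).float()            # exact annotated key points
    neg_mask = 1.0 - pos_mask
    pos_loss = -torch.log(hm_pred + eps) * (1 - hm_pred) ** alpha * pos_mask
    neg_loss = (-torch.log(1 - hm_pred + eps) * hm_pred ** alpha
                * (1 - hm_target) ** beta * neg_mask)
    num_pos = pos_mask.sum().clamp(min=1.0)       # normalize by the number of key points
    return (pos_loss.sum() + neg_loss.sum()) / num_pos
```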
The above second loss is obtained through a preset conversion function, which may be a nonlinear function. Specifically, the preset conversion function may take the form shown in the following formula:
where loss_rbbox is the second loss, d is the metric distance between the sample function distribution and the annotation function distribution, and τ is an adjustable constant.
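The formula itself does not survive in this text. As an assumption, one common nonlinear conversion of a distribution distance into a bounded loss in rotated-box detection work is:

```latex
% A plausible form of the preset conversion function (an assumption; the
% source omits the formula). \tau > 0 is the adjustable constant above.
\mathrm{loss}_{rbbox} = 1 - \frac{1}{\tau + \ln(1 + d)}
```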
Based on the first loss and the second loss, the total loss over the classification features and auxiliary features can be obtained, as shown in the following formula:
Loss = loss_hm + λ·loss_rbbox
The above λ is a prior coefficient and can be adjusted according to prior knowledge during training.
In a possible embodiment, the auxiliary features further include offset features. The loss between the offset attributes of the sample detection frame and the annotated offset attributes of the annotation frame can be computed as a third loss, obtained through a third loss function shown in the following formula:
loss_offset = Smooth-L1(offset_pred, offset_target)
where loss_offset is the third loss, offset_pred is the prediction of the offset attributes of the sample detection frame, and offset_target is the ground-truth label of the offset attributes of the corresponding annotation frame. Smooth-L1() is the third loss function.
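Smooth-L1() is the standard smooth L1 loss; a minimal PyTorch sketch follows, with the usual transition threshold beta = 1.0 assumed, since the source does not state it:

```python
import torch

def smooth_l1(offset_pred, offset_target, beta=1.0):
    """Quadratic near zero, linear for large errors; robust to outlier offsets."""
    diff = torch.abs(offset_pred - offset_target)
    return torch.where(diff < beta,
                       0.5 * diff ** 2 / beta,
                       diff - 0.5 * beta).mean()
```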
In this embodiment, based on the first, second and third losses, the total loss over the classification features and auxiliary features can be obtained, as shown in the following formula:
Loss = loss_hm + λ1·loss_offset + λ2·loss_rbbox
Here λ1 is the first prior coefficient and λ2 is the second prior coefficient; both can be adjusted according to prior knowledge during training.
In this embodiment of the present invention, by adding output networks for the auxiliary features and training the auxiliary features through the corresponding first and second losses, the trained target detection model can output more accurate auxiliary features.
Optionally, in the step of performing auxiliary processing on the classification features based on the auxiliary features to obtain the detection result of the rotating target to be detected, the coordinates (i, j) of the obtained target key point can be used to index the target width attribute w and target height attribute h at the corresponding position in the height-width features, the x-direction offset dx and y-direction offset dy at the corresponding position in the offset features, and the target rotation angle at the corresponding position in the rotation-angle features. The rotating target to be detected can then be expressed in the form (x, y, w, h, θ), where x = i + dx and y = j + dy. The four elements (x, y, w, h) of every detection result are scaled back to the original scale of the image to be detected, giving (x′, y′, w′, h′), so the final detection result of the rotating target to be detected takes the form (x′, y′, w′, h′, θ).
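As an illustration, a minimal sketch of this assembly and rescaling step follows; the feature-map stride of 4 is an assumption, since the source does not state the downsampling factor:

```python
def assemble_detection(i, j, w, h, dx, dy, theta, stride=4):
    """Combine attributes indexed at key point (i, j) and rescale to the original image."""
    x, y = i + dx, j + dy                         # sub-pixel center on the feature map
    # Only the geometric elements are scaled; the rotation angle theta is unchanged.
    return (x * stride, y * stride, w * stride, h * stride, theta)
```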
In this embodiment of the present invention, the classification features and auxiliary features of the rotating target to be detected are extracted, and the auxiliary features are used to assist the processing of the classification features to obtain the detection result of the rotating target to be detected. No anchor design is required, and therefore no non-maximum suppression is needed, which improves the detection accuracy of rotating-target detection and in turn its detection performance.
It should be noted that the target detection method provided by the embodiments of the present invention can be applied to smartphones, computers, servers and other devices capable of target detection.
Optionally, referring to FIG. 3, FIG. 3 is a schematic structural diagram of a target detection device provided by an embodiment of the present invention. As shown in FIG. 3, the device includes:
a first acquisition module 301, configured to acquire an image to be detected, where the image to be detected includes a rotating target to be detected;
an extraction module 302, configured to perform feature extraction on the image to be detected through a trained target detection model to obtain the classification features and auxiliary features of the rotating target to be detected;
a processing module 303, configured to perform auxiliary processing on the classification features based on the auxiliary features to obtain the detection result of the rotating target to be detected.
Optionally, the trained target detection model includes a feature extraction network, a feature fusion network, a classification feature output network and an auxiliary feature output network (a structural sketch of this pipeline follows the list below), and the extraction module 302 includes:
a first extraction submodule, configured to perform feature extraction on the image to be detected through the feature extraction network to obtain multi-scale features of the image to be detected;
a fusion submodule, configured to perform feature fusion on the multi-scale features through the feature fusion network to obtain fused features of the image to be detected;
a first prediction submodule, configured to predict from the fused features through the classification feature output network to obtain the classification features of the rotating target to be detected, where the classification features include feature channels and different categories of rotating targets to be detected correspond to different feature channels;
a second prediction submodule, configured to predict from the fused features through the auxiliary feature output network to obtain the auxiliary features of the rotating target to be detected.
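A minimal structural sketch of this pipeline follows, assuming PyTorch modules, a fused-feature channel width of 256, and head names that are not from the source:

```python
import torch.nn as nn

class RotatedDetector(nn.Module):
    def __init__(self, backbone, fusion, num_classes):
        super().__init__()
        self.backbone = backbone                  # feature extraction network
        self.fusion = fusion                      # feature fusion network (e.g. an FPN)
        # Classification head: one feature channel per target category.
        self.cls_head = nn.Conv2d(256, num_classes, 3, padding=1)
        # Auxiliary heads: height-width, rotation angle, and optional offsets.
        self.wh_head = nn.Conv2d(256, 2, 3, padding=1)
        self.angle_head = nn.Conv2d(256, 1, 3, padding=1)
        self.offset_head = nn.Conv2d(256, 2, 3, padding=1)

    def forward(self, image):
        fused = self.fusion(self.backbone(image))
        return (self.cls_head(fused).sigmoid(), self.wh_head(fused),
                self.angle_head(fused), self.offset_head(fused))
```

Each head is a plain convolution over the shared fused features, so adding an auxiliary output costs only one extra branch rather than a separate network.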
Optionally, the auxiliary features include height-width features and rotation-angle features, and the processing module 303 includes:
a second extraction submodule, configured to perform key-point extraction on the classification features to obtain the target key points of the rotating target to be detected (a sketch of one possible extraction follows the list below);
an indexing submodule, configured to index, based on the target key points, the corresponding target height-width attributes in the height-width features and the corresponding target rotation-angle attributes in the rotation-angle features;
a first processing submodule, configured to obtain the detection result of the rotating target to be detected based on the target key points, the target height-width attributes and the target rotation-angle attributes.
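A minimal PyTorch sketch of this key-point extraction follows; the 3x3 pooling kernel and top-k selection are assumptions in the CenterNet style, which is what lets the method dispense with non-maximum suppression:

```python
import torch
import torch.nn.functional as F

def extract_keypoints(cls_map, k=100):
    """cls_map: (C, H, W) classification heatmap; returns the top-k local peaks."""
    C, H, W = cls_map.shape
    pooled = F.max_pool2d(cls_map.unsqueeze(0), kernel_size=3,
                          stride=1, padding=1).squeeze(0)
    peaks = cls_map * (pooled == cls_map)            # keep only local maxima
    scores, flat_idx = peaks.flatten().topk(k)
    c = torch.div(flat_idx, H * W, rounding_mode='floor')   # class channel
    ij = flat_idx % (H * W)
    i = torch.div(ij, W, rounding_mode='floor')             # grid row
    j = ij % W                                              # grid column
    return scores, c, i, j
```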
Optionally, the device further includes:
an acquisition module, configured to acquire a training data set, where the training data set includes sample images and annotation frames, the sample images include sample rotating targets, and the annotation frames are the annotation frames of the sample rotating targets;
a training module, configured to acquire a target detection model and train it with the training data set to obtain a trained target detection model, where the target detection model includes a feature extraction network, a feature fusion network, a classification feature output network, a height-width feature output network and a rotation-angle feature output network.
Optionally, the training module includes:
a second processing submodule, configured to input the sample images into the target detection model to obtain the sample detection frames corresponding to the sample rotating targets;
an encoding submodule, configured to encode the sample detection frame and the annotation frame respectively through a preset encoding function to obtain the sample function distribution corresponding to the sample detection frame and the annotation function distribution corresponding to the annotation frame (one possible encoding is sketched after this list);
an adjustment submodule, configured to adjust the network parameters of the target detection model according to the metric distance between the sample function distribution and the annotation function distribution;
an iteration submodule, configured to iterate the network-parameter adjustment process of the target detection model until the target detection model converges or a preset number of iterations is reached, obtaining the trained target detection model.
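One possible form of the preset encoding function, sketched below, converts a rotated box into a two-dimensional Gaussian distribution; this specific Gaussian encoding is an assumption drawn from common rotated-detection practice, not confirmed by the source:

```python
import numpy as np

def box_to_gaussian(x, y, w, h, theta):
    """Encode a rotated box (x, y, w, h, theta) as a 2-D Gaussian (mean, covariance)."""
    mu = np.array([x, y])                          # mean: the box center
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    S = np.diag([w / 2.0, h / 2.0])                # half-extents on the principal axes
    sigma = R @ S @ S @ R.T                        # covariance aligned with the box
    return mu, sigma
```

Under such an encoding, the metric distance d between the sample and annotation distributions can be taken as, for example, a Wasserstein-type distance between the two Gaussians.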
Optionally, the second processing submodule includes:
a first processing unit, configured to process the sample image through the target detection model to obtain the sample feature map corresponding to the sample image and to construct a matrix grid for the sample feature map according to its height and width, where the sample feature map includes the classification feature map, the height-width feature map and the rotation-angle feature map corresponding to the sample image;
an index establishing unit, configured to establish an index of a height-width attribute at each grid point in the height-width feature map and an index of a rotation-angle attribute at each grid point in the rotation-angle feature map;
a second processing unit, configured to obtain the sample detection frame corresponding to the sample rotating target according to each grid point in the sample feature map and its corresponding index attributes.
Optionally, the annotation frame contains annotated key points, and the adjustment submodule includes:
a calculation unit, configured to calculate the first loss between the sample key points of the classification feature map and the annotated key points;
a conversion unit, configured to convert the metric distance into the second loss through a preset conversion function;
an adjustment unit, configured to adjust the network parameters of the target detection model based on the first loss and the second loss.
It should be noted that the target detection device provided by the embodiments of the present invention can be applied to smartphones, computers, servers and other devices capable of target detection.
The target detection device provided by the embodiments of the present invention can implement each process implemented by the target detection method in the above method embodiments and can achieve the same beneficial effects. To avoid repetition, details are not repeated here.
Referring to FIG. 4, FIG. 4 is a schematic structural diagram of an electronic device provided by an embodiment of the present invention. As shown in FIG. 4, the device includes a memory 402, a processor 401 and a computer program of the target detection method stored on the memory 402 and executable on the processor 401, where:
the processor 401 is configured to call the computer program stored in the memory 402 and perform the following steps:
acquiring an image to be detected, where the image to be detected includes a rotating target to be detected;
performing feature extraction on the image to be detected through a trained target detection model to obtain the classification features and auxiliary features of the rotating target to be detected;
performing auxiliary processing on the classification features based on the auxiliary features to obtain the detection result of the rotating target to be detected.
Optionally, the trained target detection model includes a feature extraction network, a feature fusion network, a classification feature output network and an auxiliary feature output network, and the feature extraction performed by the processor 401 on the image to be detected through the trained target detection model, to obtain the classification features and auxiliary features of the rotating target to be detected, includes:
performing feature extraction on the image to be detected through the feature extraction network to obtain multi-scale features of the image to be detected;
performing feature fusion on the multi-scale features through the feature fusion network to obtain fused features of the image to be detected;
predicting from the fused features through the classification feature output network to obtain the classification features of the rotating target to be detected, where the classification features include feature channels and different categories of rotating targets to be detected correspond to different feature channels;
predicting from the fused features through the auxiliary feature output network to obtain the auxiliary features of the rotating target to be detected.
Optionally, the auxiliary features include height-width features and rotation-angle features, and the auxiliary processing performed by the processor 401 on the classification features based on the auxiliary features, to obtain the detection result of the rotating target to be detected, includes:
performing key-point extraction on the classification features to obtain the target key points of the rotating target to be detected;
indexing, based on the target key points, the corresponding target height-width attributes in the height-width features and the corresponding target rotation-angle attributes in the rotation-angle features;
obtaining the detection result of the rotating target to be detected based on the target key points, the target height-width attributes and the target rotation-angle attributes.
Optionally, before the feature extraction is performed on the image to be detected through the trained target detection model to obtain the classification features and auxiliary features of the rotating target to be detected, the method performed by the processor 401 further includes:
acquiring a training data set, where the training data set includes sample images and annotation frames, the sample images include sample rotating targets, and the annotation frames are the annotation frames of the sample rotating targets;
acquiring a target detection model and training it with the training data set to obtain a trained target detection model, where the target detection model includes a feature extraction network, a feature fusion network, a classification feature output network, a height-width feature output network and a rotation-angle feature output network.
Optionally, the acquiring of the target detection model and its training with the training data set, performed by the processor 401 to obtain a trained target detection model, includes:
inputting the sample image into the target detection model to obtain the sample detection frame corresponding to the sample rotating target;
encoding the sample detection frame and the annotation frame respectively through a preset encoding function to obtain the sample function distribution corresponding to the sample detection frame and the annotation function distribution corresponding to the annotation frame;
adjusting the network parameters of the target detection model according to the metric distance between the sample function distribution and the annotation function distribution;
iterating the network-parameter adjustment process of the target detection model until the target detection model converges or a preset number of iterations is reached, obtaining the trained target detection model.
Optionally, the inputting of the sample image into the target detection model, performed by the processor 401 to obtain the sample detection frame corresponding to the sample rotating target, includes:
processing the sample image through the target detection model to obtain the sample feature map corresponding to the sample image, and constructing a matrix grid for the sample feature map according to its height and width, where the sample feature map includes the classification feature map, the height-width feature map and the rotation-angle feature map corresponding to the sample image;
establishing an index of a height-width attribute at each grid point in the height-width feature map and an index of a rotation-angle attribute at each grid point in the rotation-angle feature map;
obtaining the sample detection frame corresponding to the sample rotating target according to each grid point in the sample feature map and its corresponding index attributes.
Optionally, the annotation frame contains annotated key points, and the adjusting of the network parameters of the target detection model according to the metric distance between the sample function distribution and the annotation function distribution, performed by the processor 401, includes:
calculating the first loss between the sample key points of the classification feature map and the annotated key points;
converting the metric distance into the second loss through a preset conversion function;
adjusting the network parameters of the target detection model based on the first loss and the second loss.
The electronic device provided by the embodiments of the present invention can implement each process implemented by the target detection method in the above method embodiments and can achieve the same beneficial effects. To avoid repetition, details are not repeated here.
An embodiment of the present invention further provides a computer-readable storage medium on which a computer program is stored. When the computer program is executed by a processor, it implements each process of the target detection method or the application-side target detection method provided by the embodiments of the present invention and can achieve the same technical effects; to avoid repetition, details are not repeated here.
Those of ordinary skill in the art can understand that all or part of the processes in the methods of the above embodiments can be implemented by instructing the relevant hardware through a computer program, which can be stored in a computer-readable storage medium; when executed, the program may include the processes of the embodiments of the above methods. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
The above disclosure is only the preferred embodiments of the present invention and certainly cannot limit the scope of rights of the present invention; equivalent changes made according to the claims of the present invention therefore still fall within the scope of the present invention.
Claims (10)
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210819568.0A CN115311553A (en) | 2022-07-12 | 2022-07-12 | Target detection method and device, electronic equipment and storage medium |
PCT/CN2022/143514 WO2024011873A1 (en) | 2022-07-12 | 2022-12-29 | Target detection method and apparatus, electronic device, and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210819568.0A CN115311553A (en) | 2022-07-12 | 2022-07-12 | Target detection method and device, electronic equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115311553A true CN115311553A (en) | 2022-11-08 |
Family
ID=83856438
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210819568.0A Pending CN115311553A (en) | 2022-07-12 | 2022-07-12 | Target detection method and device, electronic equipment and storage medium |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN115311553A (en) |
WO (1) | WO2024011873A1 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2024011873A1 (en) * | 2022-07-12 | 2024-01-18 | 青岛云天励飞科技有限公司 | Target detection method and apparatus, electronic device, and storage medium |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118644667B (en) * | 2024-08-15 | 2024-10-15 | 山东浪潮数字服务有限公司 | Method and device for detecting surface defects of hub based on deep learning |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111191566B (en) * | 2019-12-26 | 2022-05-17 | 西北工业大学 | Optical remote sensing image multi-target detection method based on pixel classification |
CN111931877B (en) * | 2020-10-12 | 2021-01-05 | 腾讯科技(深圳)有限公司 | Target detection method, device, equipment and storage medium |
CN114757250A (en) * | 2020-12-29 | 2022-07-15 | 华为云计算技术有限公司 | Image processing method and related equipment |
CN113420648B (en) * | 2021-06-22 | 2023-05-05 | 深圳市华汉伟业科技有限公司 | Target detection method and system with rotation adaptability |
CN115311553A (en) * | 2022-07-12 | 2022-11-08 | 青岛云天励飞科技有限公司 | Target detection method and device, electronic equipment and storage medium |
- 2022-07-12: CN application CN202210819568.0A filed (published as CN115311553A), status: active, Pending
- 2022-12-29: WO application PCT/CN2022/143514 filed (published as WO2024011873A1), status: unknown
Also Published As
Publication number | Publication date |
---|---|
WO2024011873A1 (en) | 2024-01-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109615611B (en) | Inspection image-based insulator self-explosion defect detection method | |
CN108108764B (en) | Visual SLAM loop detection method based on random forest | |
CN107274451A (en) | Isolator detecting method and device based on shared convolutional neural networks | |
CN110866934B (en) | Method and system for segmentation of complex point cloud based on normative coding | |
CN105005760B (en) | A kind of recognition methods again of the pedestrian based on Finite mixture model | |
CN110443258B (en) | Character detection method and device, electronic equipment and storage medium | |
CN112837315A (en) | A deep learning-based detection method for transmission line insulator defects | |
WO2024011873A1 (en) | Target detection method and apparatus, electronic device, and storage medium | |
CN113313703A (en) | Unmanned aerial vehicle power transmission line inspection method based on deep learning image recognition | |
CN111598942A (en) | A method and system for automatic positioning of power facility meters | |
CN111310690B (en) | Forest fire recognition method and device based on CN and three-channel capsule network | |
CN108537790A (en) | Heterologous image change detection method based on coupling translation network | |
CN112819008A (en) | Method, device, medium and electronic equipment for optimizing instance detection network | |
CN103093243A (en) | High resolution panchromatic remote sensing image cloud discriminating method | |
CN116012709A (en) | High-resolution remote sensing image building extraction method and system | |
Liu et al. | Indoor Visual Positioning Method Based on Image Features. | |
CN111291712B (en) | Forest fire recognition method and device based on interpolation CN and capsule network | |
CN106093074B (en) | A method for detecting solder joints of IC components based on robust principal component analysis | |
CN110895701B (en) | Forest fire online identification method and device based on CN and FHOG | |
CN116129280B (en) | Method for detecting snow in remote sensing image | |
CN116543298A (en) | Building Extraction Method of Remote Sensing Image Based on Fractal Geometric Features and Edge Supervision | |
CN109753896A (en) | An Unsupervised Heterogeneous Remote Sensing Image Change Detection Method Based on Common Autoencoder | |
CN113505650B (en) | Topographic feature line extraction method, device and equipment | |
CN116246161A (en) | Method and device for identifying target fine type of remote sensing image under guidance of domain knowledge | |
CN104715265B (en) | Radar scene classification method based on compression sampling Yu integrated coding grader |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |