
CN118071997A - Water surface target identification method and device based on visual image and electronic equipment - Google Patents


Info

Publication number
CN118071997A
CN118071997A (application CN202410254580.0A); granted as CN118071997B
Authority
CN
China
Prior art keywords
target
image
target object
water surface
height
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202410254580.0A
Other languages
Chinese (zh)
Other versions
CN118071997B (en)
Inventor
龙飞
陈姚节
葛启桢
张泽宇
耿鹏
夏叶亮
林云汉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan Research Institute Of Marine Electric Propulsion No 712 Research Institute Of China Shipbuilding Corp
Wuhan University of Science and Engineering WUSE
Original Assignee
Wuhan Research Institute Of Marine Electric Propulsion No 712 Research Institute Of China Shipbuilding Corp
Wuhan University of Science and Engineering WUSE
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan Research Institute Of Marine Electric Propulsion No 712 Research Institute Of China Shipbuilding Corp, Wuhan University of Science and Engineering WUSE filed Critical Wuhan Research Institute Of Marine Electric Propulsion No 712 Research Institute Of China Shipbuilding Corp
Priority to CN202410254580.0A priority Critical patent/CN118071997B/en
Publication of CN118071997A publication Critical patent/CN118071997A/en
Application granted granted Critical
Publication of CN118071997B publication Critical patent/CN118071997B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/25Determination of region of interest [ROI] or a volume of interest [VOI]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07Target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a water surface target identification method and device based on visual images, and electronic equipment, relating to the technical field of visual identification. The method comprises the following steps: shooting a plurality of target images of the same water surface target at different angles in a night environment; generating a comprehensive credibility coefficient from the illuminance value of each target image and the maximum shooting length and maximum shooting height of the target object; and identifying each target image to output a corresponding target object type. Compared with identification from a single target image, the disclosed method comprehensively considers the influence of illumination intensity and shooting angle on target object identification, analyses the credibility of the multiple output target object types, and screens out the most credible target object type, thereby improving the accuracy of water surface target identification in a night environment.

Description

Water surface target identification method and device based on visual image and electronic equipment
Technical Field
The invention relates to the technical field of visual identification, in particular to a visual image-based water surface target identification method and device and electronic equipment.
Background
Object detection finds all objects of interest in an image and comprises two subtasks, object localization and object classification: the category and the position of each object are determined simultaneously. Object detection is a popular direction of computer vision and image processing, widely applied in fields such as robot navigation, intelligent video surveillance and industrial inspection; by reducing labour costs through computer vision, it has important practical significance. It has therefore become a research hotspot of both theory and application in recent years, an important branch of image processing and the computer vision discipline, and a core part of intelligent monitoring systems. Object detection is also a basic algorithm in the field of general identity recognition, playing a vital role in subsequent tasks such as face recognition, gait recognition, crowd counting and instance segmentation.
Owing to the wide application of deep learning, object detection algorithms have developed rapidly. Since 2006, under the lead of Hinton, Bengio, LeCun and others, a large number of papers on deep neural networks have been published; in particular, in 2012 the Hinton group entered the ImageNet image recognition contest for the first time and won the championship with its CNN, AlexNet, after which neural networks received a great deal of attention. Deep learning uses multi-layer computational models to learn abstract data representations and can discover complex structures in large data sets; at present, the technique has been successfully applied to a variety of pattern classification problems, including the field of computer vision.
In the prior art, a ship detection and identification method in day and night images, with publication number CN110334703B (classification G06K 9/32), comprises the following steps: S1, detecting the illuminance of ship images at different time periods with a light sensing element, and dividing the ship images into daytime images and nighttime images according to their illuminance ranges; S2, for daytime images, first detecting all objects in the detection range and then screening out ship objects from the detected objects; S3, for night images, first detecting salient targets in the night image and screening out ship objects from the salient targets; S4, based on the screened ship objects, acquiring the real-time position and type of every ship in the current video frame, thereby achieving detection and identification of ship targets in a full-time scene with good robustness.
However, the prior art has a major drawback: it addresses the large difference in illumination intensity between daytime and night images by setting two different recognition methods, one for each, so as to achieve recognition in a full-time scene. In the recognition analysis, however, a night image is still affected by factors such as illumination intensity and shooting angle; for example, when the illumination intensity is low, or the shooting angle is unfavourable for displaying the overall characteristics of the water surface target, recognition errors easily occur, reducing the recognition accuracy for night images.
Disclosure of Invention
The invention aims to provide a visual image-based water surface target identification method and device and electronic equipment, so as to solve the problems in the background technology.
In order to achieve the above purpose, the present invention provides the following technical solutions:
A water surface target identification method based on visual images comprises the following steps:
S1, shooting a plurality of target images of the same water surface target at different angles in a night environment, and acquiring the illuminance value of each target image;
s2, positioning a target object in the target image, drawing a boundary frame of the target object, and calculating the maximum shooting length and the maximum shooting height of the target object according to the length and the height of the boundary frame;
s3, generating a comprehensive credibility coefficient of the target image according to the illuminance value of the target image, the maximum shooting length and the maximum shooting height of the target object;
S4, inputting the target images into a Faster R-CNN model, the Faster R-CNN model identifying the target object in each target image and outputting the corresponding target object type;
S5, accumulating and averaging the comprehensive credibility coefficients of the target images corresponding to the same target object type to generate the credibility coefficient of that target object type, and taking the target object type with the largest credibility coefficient as the most credible target object type.
Further, the illuminance value is denoted E_i, where i is the index of a target image, i = 1, 2, 3, ..., m; m is the total number of target images, m ≥ 5, m ∈ N+; the length of the bounding box is denoted L1_i, the height of the bounding box H1_i, the maximum shooting length of the target object L2_i, and the maximum shooting height of the target object H2_i.
Further, the specific logic of locating the target object in the target image and drawing a bounding box of the target object, and calculating the maximum shooting length and the maximum shooting height of the target object according to the length and the height of the bounding box is as follows:
S21, training and generating a Mask R-CNN model, inputting a target image into the Mask R-CNN model, and outputting the position information of a target object in the target image and a Mask of a pixel level by the Mask R-CNN model;
s22, drawing a rectangular boundary frame internally tangent to the outer edge of the target object according to the position information of the target object output by the Mask R-CNN model, and acquiring the length and the height of the boundary frame;
S23, calculating the size proportion of the target image according to a reference object of known size in the target image, and generating the maximum shooting length and the maximum shooting height of the target object from the size proportion and the length and height of the bounding box, with the calculation formulas:
λ_i = Hsj_i / Hxs_i, L2_i = λ_i * L1_i, H2_i = λ_i * H1_i
wherein Hsj_i denotes the actual height of the reference object selected in the i-th target image, Hxs_i denotes the height of that reference object in the i-th target image, and λ_i denotes the size ratio of the i-th target image.
Further, the specific logic for generating the comprehensive credibility coefficient of the target image according to the illuminance value of the target image and the maximum shooting length and maximum shooting height of the target object is as follows:
S31, letting w1_i = E_i / E_0 to generate the intermediate first credibility coefficient of each target image, and scaling w1_1, w1_2, w1_3, ..., w1_m to generate the first credibility coefficient w1_1', w1_2', w1_3', ..., w1_m' of each target image;
S32, letting w2_i = α1 * (L2_i / L2_max) + α2 * (H2_i / H2_max) + α3 * (L2_i * H2_i) / (L2_max * H2_max) to generate the intermediate second credibility coefficient of each target image, and scaling w2_1, w2_2, w2_3, ..., w2_m to generate the second credibility coefficient w2_1', w2_2', w2_3', ..., w2_m' of each target image;
wherein α1, α2 and α3 are preset proportionality coefficients, α1 + α2 + α3 = 1, 0 < α3 < α2 < α1 < 1, and L2_max and H2_max denote the maxima of L2_1, ..., L2_m and H2_1, ..., H2_m respectively;
S33, generating the comprehensive credibility coefficient w_i' of the target image from the first credibility coefficient w1_i' and the second credibility coefficient w2_i' of the target image, with the calculation formula:
w_i' = β1 * w1_i' + β2 * w2_i'
wherein β1 and β2 are preset proportionality coefficients, β1 + β2 = 1, and 0 < β2 < β1 < 1.
Further, α1 ranges from 0.5 to 0.7, α2 from 0.2 to 0.3, and α3 from 0.1 to 0.2.
Further, β1 ranges from 0.6 to 0.8 and β2 from 0.2 to 0.4.
Further, the target object types include passenger ships, cargo ships, beacon ships, warships, sailing ships and others.
A water surface target recognition device based on a visual image is used for the water surface target recognition method based on the visual image, and comprises the following steps:
The photographing and illuminance value acquisition module is used for photographing a plurality of target images under different angles on the same water surface target in a night environment and acquiring illuminance values of all the target images;
The shooting size acquisition module is used for positioning a target object in a target image, drawing a boundary frame of the target object, and calculating the maximum shooting length and the maximum shooting height of the target object according to the length and the height of the boundary frame;
the comprehensive credibility coefficient calculation module is used for generating a comprehensive credibility coefficient of the target image according to the illumination value of the target image, the maximum shooting length and the maximum shooting height of the target object;
The target object type output module is used for inputting target images into a Faster R-CNN model, the Faster R-CNN model identifying the target object in each target image and outputting the corresponding target object type;
the most reliable object type generating module is used for accumulating and averaging the comprehensive credibility coefficients of the object images corresponding to the same object type to generate the credibility coefficient of the object type, and taking the object type with the largest credibility coefficient as the most reliable object type.
An electronic device, comprising:
A memory for storing a computer program;
and the processor is used for executing the computer program to realize the visual image-based water surface target identification method.
Compared with the prior art, the invention has the beneficial effects that:
according to the visual image-based water surface target identification method, device and electronic equipment, multiple target images are shot aiming at the same target object and target object types are output respectively, comprehensive credibility coefficients of the target images are calculated according to illumination values of the target images, maximum shooting lengths and maximum shooting heights of the target objects, finally credibility coefficients of the target object types are calculated according to the comprehensive credibility coefficients to screen out the most credible target object types, compared with single target image identification, influence factors of illumination intensity and shooting angles on target object identification are comprehensively considered, the credibility of the output multiple target object types is analyzed and judged, the most credible target object types are screened out, and the accuracy of target identification on water in a night environment is improved.
Drawings
FIG. 1 is a flow chart of a visual image-based water surface target recognition method of the present invention;
fig. 2 is a block diagram of a visual image-based water surface target recognition device according to the present invention.
Detailed Description
The present invention will be further described in detail with reference to specific embodiments in order to make the objects, technical solutions and advantages of the present invention more apparent.
It is to be noted that unless otherwise defined, technical or scientific terms used herein should be taken in a general sense as understood by one of ordinary skill in the art to which the present invention belongs. The terms "first," "second," and the like, as used herein, do not denote any order, quantity, or importance, but rather are used to distinguish one element from another. The word "comprising" or "comprises", and the like, means that elements or items preceding the word are included in the element or item listed after the word and equivalents thereof, but does not exclude other elements or items. The terms "connected" or "connected," and the like, are not limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect. "up", "down", "left", "right" and the like are used only to indicate a relative positional relationship, and when the absolute position of the object to be described is changed, the relative positional relationship may be changed accordingly.
Embodiment one:
referring to fig. 1, the present invention provides a visual image-based water surface target recognition method, which includes the following steps:
S1, shooting a plurality of target images of the same water surface target at different angles in a night environment, and acquiring the illuminance value of each target image;
The illuminance value is denoted E_i, where i is the index of a target image, i = 1, 2, 3, ..., m; m is the total number of target images, m ≥ 5, m ∈ N+;
It should be noted that, when a water surface target in motion is photographed, an optical camera can be installed at a fixed position and its lens rotated to track the moving target, so that the camera captures a plurality of target images at different shooting angles; when a stationary water surface target is photographed, the optical camera can be installed on a surface vessel, which carries the camera past the target so that the camera likewise captures a plurality of target images at different shooting angles;
It should be noted that the illuminance value of the target image may be obtained by light-sensing-element detection, i.e. the light sensing element detects the illuminance value of the target image transmitted by the optical camera; this is prior art and is not described further here.
S2, positioning a target object in the target image, drawing a boundary frame of the target object, and calculating the maximum shooting length and the maximum shooting height of the target object according to the length and the height of the boundary frame;
The length of the bounding box is denoted L1_i, the bounding-box length of the target object in the i-th target image; the height of the bounding box is denoted H1_i, the bounding-box height of the target object in the i-th target image;
The maximum shooting length of the target object is denoted L2_i, the maximum length of the target object captured at the shooting angle of the i-th target image; the maximum shooting height of the target object is denoted H2_i, the maximum height of the target object captured at the shooting angle of the i-th target image;
the step S2 specifically comprises the following steps:
S21, training and generating a Mask R-CNN model, inputting a target image into the Mask R-CNN model, and outputting the position information of a target object in the target image and a Mask of a pixel level by the Mask R-CNN model;
It should be noted that, training to generate a Mask R-CNN model, and then outputting the position information of the target object in the target image according to the Mask R-CNN model as in the prior art, taking the target object as a ship as an example, the manner of training to generate the Mask R-CNN model is as follows:
S211, selecting a pre-trained Mask R-CNN model as a basic model, selecting a plurality of water surface ship images from an image library, and randomly dividing the water surface ship images into a training set or a testing set;
S212, inputting the water surface ship image in the training set into a pre-trained Mask R-CNN model for training, adjusting the model structure and parameters in the training process, specifically adopting a proper loss function to optimize the model parameters, wherein the loss function comprises a boundary box regression loss and a segmentation loss, the boundary box regression loss is used for accurately positioning the position of a target object, and the segmentation loss is used for generating a pixel level Mask of the target;
S213, evaluating the Mask R-CNN model obtained by training by using a test set to ensure that the model has good generalization capability and accuracy, wherein evaluation indexes can comprise accuracy, recall rate, F1 score and the like of target detection, so as to obtain the Mask R-CNN model capable of identifying a target object in a target image;
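As an illustrative sketch only (the function name and counts are hypothetical, not taken from the patent), the evaluation indexes named in step S213, namely accuracy (precision), recall rate and F1 score, can be computed from true-positive, false-positive and false-negative detection counts:

```python
def detection_metrics(tp: int, fp: int, fn: int):
    """Precision, recall and F1 score from detection counts, as used in S213.

    tp: correctly detected targets; fp: spurious detections; fn: missed targets.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1
```

For example, 8 correct detections with 2 false alarms and 2 misses give precision = recall = 0.8.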
s22, drawing a rectangular boundary frame internally tangent to the outer edge of the target object according to the position information of the target object output by the Mask R-CNN model, and acquiring the length and the height of the boundary frame;
It should be noted that the position of the bounding box is determined by the bounding-box coordinates of the target object output by the Mask R-CNN model; drawing the bounding box of the target object and obtaining its length and height are prior art and can be done with existing Python code;
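One way to obtain the bounding box of step S22 from the pixel-level mask output by Mask R-CNN is to take the tight axis-aligned rectangle around the non-zero mask pixels; a minimal pure-Python sketch (function name and mask representation are assumptions):

```python
def mask_to_bbox(mask):
    """Tight axis-aligned bounding box around the non-zero pixels of a binary
    mask given as a list of rows; returns (x_min, y_min, L1_i, H1_i), where
    L1_i is the box length (width) and H1_i its height, or None if empty."""
    coords = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not coords:
        return None
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return (min(xs), min(ys), max(xs) - min(xs) + 1, max(ys) - min(ys) + 1)
```

In practice the same box comes directly from the model's box head; this sketch only illustrates the geometric relationship between mask and box.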
S23, calculating the size proportion of the target image according to a reference object of known size in the target image, and generating the maximum shooting length and the maximum shooting height of the target object from the size proportion and the length and height of the bounding box, with the calculation formulas:
λ_i = Hsj_i / Hxs_i, L2_i = λ_i * L1_i, H2_i = λ_i * H1_i
It should be noted that Hsj_i denotes the actual height of the reference object selected in the i-th target image, Hxs_i denotes the height of that reference object in the i-th target image, and λ_i denotes the size ratio of the i-th target image; Hsj_i may be obtained by on-site measurement or retrieved from a known database, and Hxs_i may be obtained directly from the target image with image processing software, which is not described in detail here.
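The size-ratio computation of step S23 can be sketched as follows (the names are illustrative; the formula follows the definitions of Hsj_i, Hxs_i and λ_i above):

```python
def max_shot_size(l1, h1, hsj, hxs):
    """Convert the bounding-box pixel size (L1_i, H1_i) into the maximum
    shooting length and height (L2_i, H2_i) via a reference object whose
    actual height is hsj and whose height in the image is hxs pixels."""
    lam = hsj / hxs            # size ratio λ_i of the i-th target image
    return lam * l1, lam * h1  # (L2_i, H2_i)
```

For instance, a 200 x 100 pixel bounding box, with a 5 m reference object spanning 50 pixels, yields a maximum shooting length of 20 m and height of 10 m.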
S3, generating a comprehensive credibility coefficient of the target image according to the illumination value of the target image, the maximum shooting length and the maximum shooting height of the target object, wherein the comprehensive credibility coefficient comprises the following steps:
S31, letting w1_i = E_i / E_0 to generate the intermediate first credibility coefficient of each target image, and scaling w1_1, w1_2, w1_3, ..., w1_m to generate the first credibility coefficient w1_1', w1_2', w1_3', ..., w1_m' of each target image;
It should be noted that w1_i denotes the intermediate first credibility coefficient of the i-th target image, w1_i' denotes the first credibility coefficient of the i-th target image, and E_0 is a preset illuminance standard value, which may be set to 1 lx;
The larger E_i is, the larger the illuminance value of the i-th target image, i.e. the better the light and the clearer the captured features of the target object; the higher the accuracy of feature recognition analysis of the target object from the target image, the larger the intermediate first credibility coefficient w1_i and the first credibility coefficient w1_i' of the i-th target image;
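A sketch of step S31 under two assumptions: the intermediate coefficient is the ratio w1_i = E_i / E_0, and the scaling step (whose formula appears only as an image in the source) is taken to be sum-normalisation:

```python
E0 = 1.0  # preset illuminance standard value E_0, 1 lx

def first_coeffs(illuminance, e0=E0):
    """Intermediate first coefficients w1_i = E_i / E_0, scaled so that the
    coefficients of the m target images sum to 1 (the exact scaling formula
    is not reproduced in the source; normalisation is an assumption)."""
    w1 = [e / e0 for e in illuminance]
    total = sum(w1)
    return [w / total for w in w1]
```

Either way, the monotonic property stated above holds: a brighter image receives a larger coefficient.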
S32, letting w2_i = α1 * (L2_i / L2_max) + α2 * (H2_i / H2_max) + α3 * (L2_i * H2_i) / (L2_max * H2_max) to generate the intermediate second credibility coefficient of each target image, and scaling w2_1, w2_2, w2_3, ..., w2_m to generate the second credibility coefficient w2_1', w2_2', w2_3', ..., w2_m' of each target image;
It should be noted that α1, α2 and α3 are preset proportionality coefficients: α1 is the weight of the maximum shooting length of the target object in the calculation of the intermediate second credibility coefficient, α2 the weight of the maximum shooting height, and α3 the weight of the maximum shooting area; α1 + α2 + α3 = 1 and 0 < α3 < α2 < α1 < 1, with the specific values of α1, α2 and α3 generally determined by those skilled in the art according to actual conditions;
As one embodiment, α1 ranges from 0.5 to 0.7, α2 from 0.2 to 0.3, and α3 from 0.1 to 0.2;
It should be noted that L2_max is the maximum of L2_1, L2_2, L2_3, ..., L2_m and H2_max is the maximum of H2_1, H2_2, H2_3, ..., H2_m; both can be obtained from these values with data processing software, which is not described in detail here;
It should be noted that the larger L2_i and H2_i are, the larger the maximum shooting length and maximum shooting height of the target object captured at the shooting angle of the i-th target image, i.e. the more features of the target object are captured; the higher the accuracy of feature recognition analysis of the target object from the target image, the larger the intermediate second credibility coefficient w2_i and the second credibility coefficient w2_i' of the i-th target image;
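Following the definitions above, with α1, α2 and α3 weighting the length, height and area ratios against L2_max and H2_max, the intermediate second coefficients can be sketched as follows (the exact formula appears only as an image in the source, so this reconstruction is an assumption):

```python
def second_coeffs(l2, h2, a1=0.6, a2=0.25, a3=0.15):
    """Intermediate second coefficients w2_i built from the maximum shooting
    length, height and area of each image relative to L2_max and H2_max,
    weighted by α1, α2, α3 with α1 + α2 + α3 = 1."""
    l2max, h2max = max(l2), max(h2)
    return [a1 * l / l2max + a2 * h / h2max + a3 * (l * h) / (l2max * h2max)
            for l, h in zip(l2, h2)]
```

The image that attains both maxima scores w2_i = α1 + α2 + α3 = 1; every other image scores strictly less.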
S33, generating the comprehensive credibility coefficient w_i' of the target image from its first credibility coefficient w1_i' and second credibility coefficient w2_i', with the calculation formula:
w_i' = β1 * w1_i' + β2 * w2_i'
It should be noted that β1 and β2 are preset proportionality coefficients: β1 is the weight of the first credibility coefficient w1_i' and β2 the weight of the second credibility coefficient w2_i' in the calculation of the comprehensive credibility coefficient w_i'; β1 + β2 = 1 and 0 < β2 < β1 < 1, with the specific values of β1 and β2 generally determined by those skilled in the art according to actual conditions;
As one embodiment, β1 ranges from 0.6 to 0.8 and β2 from 0.2 to 0.4;
It should be noted that w_i' denotes the comprehensive credibility coefficient of the i-th target image; the larger the first credibility coefficient w1_i' and the second credibility coefficient w2_i', the higher the accuracy of feature recognition analysis of the target object from the i-th target image, i.e. the more credible the recognition judgement of the target object, and the larger the comprehensive credibility coefficient w_i'.
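Step S33 is then a straightforward weighted sum; a minimal sketch with illustrative β values taken from the stated ranges:

```python
def comprehensive_coeffs(w1p, w2p, b1=0.7, b2=0.3):
    """Comprehensive coefficients w_i' = β1*w1_i' + β2*w2_i', with β1 + β2 = 1
    and 0 < β2 < β1 < 1 (β1 = 0.7, β2 = 0.3 chosen from the stated ranges)."""
    return [b1 * a + b2 * b for a, b in zip(w1p, w2p)]
```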
S4, inputting the target images into a Faster R-CNN model, the Faster R-CNN model identifying the target object in each target image and outputting the corresponding target object type;
The target object type is denoted LX_j, where j is the index of a target object type, j = 1, 2, 3, ..., n; n is the total number of different target object types, n ∈ N+;
As one embodiment, for ship type recognition, the target object types output by the Faster R-CNN model include passenger ships, cargo ships, beacon ships, warships, sailing ships and others;
It should be noted that training a Faster R-CNN model to identify the target object type, identifying the target object in a target image with the Faster R-CNN model and outputting the corresponding target object type are prior art; taking ship type identification as an example, the Faster R-CNN model outputs the target object type as follows:
S41, selecting various types of ship images and other images from an image library and randomly dividing them into a training set and a test set; initializing the RPN network parameters with a pre-trained network model and fine-tuning them with the stochastic gradient descent and back-propagation algorithms;
S42, initializing the parameters of a Faster R-CNN target detection network by using a pre-training network model, extracting candidate areas by using the RPN network in the step S41, and training the target detection network by using a training set;
S43, re-initializing and fine-tuning RPN network parameters by using the target detection network in the step S42;
S44, extracting candidate areas by using the RPN network in the step S43 and fine-tuning the target detection network parameters;
s45, repeating the step S43 and the step S44 until the maximum iteration number or the network convergence is reached;
S46, verifying the performance of the model by using the test set, and storing a Faster R-CNN model with qualified performance;
S47, inputting target images into a Faster R-CNN model with qualified performance, identifying target objects in each target image by the Faster R-CNN model, and outputting corresponding target object types;
It should be noted that the initial learning rate in the Faster R-CNN network is 0.0002, with 25000 iterations.
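The fine-tuning in steps S41 to S44 relies on stochastic gradient descent; a single parameter update at the stated initial learning rate of 0.0002 looks like this (a schematic sketch, not the actual Faster R-CNN training code):

```python
def sgd_step(params, grads, lr=0.0002):
    """One stochastic-gradient-descent update, p <- p - lr * grad(p),
    at the patent's stated initial learning rate of 0.0002."""
    return [p - lr * g for p, g in zip(params, grads)]
```

In the real training loop this update is applied to every network parameter at each of the 25000 iterations, with the gradients supplied by back-propagation.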
S5, accumulating and averaging the comprehensive credibility coefficients of the target images corresponding to the same target object type to generate a credibility coefficient of the target object type, and taking the target object type with the largest credibility coefficient as the most credible target object type;
For example: suppose m1 target images correspond to the j-th target object type, numbered i1, i2, i3, ..., ik, each number being a positive integer between 1 and m. The credibility coefficient of the j-th target object type is denoted KX_j, and its calculation formula is:
KX_j = (w_i1' + w_i2' + w_i3' + ... + w_ik') / m1
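Step S5 can be sketched end to end: group the comprehensive coefficients by the type each image was classified as, average each group to get KX_j, and pick the largest (the function name and data layout are illustrative):

```python
from collections import defaultdict

def most_credible_type(types, w):
    """Average the comprehensive coefficients w_i' of the target images
    classified as each type j to obtain KX_j, and return the type with the
    largest KX_j together with all the KX_j values."""
    groups = defaultdict(list)
    for t, wi in zip(types, w):
        groups[t].append(wi)
    kx = {t: sum(v) / len(v) for t, v in groups.items()}
    return max(kx, key=kx.get), kx
```

For instance, two images classified as a cargo ship with coefficients 0.3 and 0.5 (KX = 0.4) outweigh a single warship classification with coefficient 0.35.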
Embodiment two:
Referring to fig. 2, the present invention provides a visual image-based water surface target recognition device for implementing the visual image-based water surface target recognition method, comprising:
The photographing and illuminance value acquisition module is used for photographing a plurality of target images at different angles of the same water surface target in a night environment, and acquiring the illuminance value of each target image;
The shooting size acquisition module is used for positioning a target object in the target image, drawing a boundary frame of the target object, and calculating the maximum shooting length and the maximum shooting height of the target object according to the length and the height of the boundary frame;
The comprehensive credibility coefficient calculation module is used for generating a comprehensive credibility coefficient of the target image according to the illumination value of the target image, the maximum shooting length and the maximum shooting height of the target object;
The target object type output module is used for inputting the target images into a Faster R-CNN model, the Faster R-CNN model identifying the target object in each target image and outputting the corresponding target object type;
The most reliable object type generating module is used for accumulating and averaging the comprehensive credibility coefficients of the object images corresponding to the same object type to generate a credibility coefficient of the object type, and taking the object type with the largest credibility coefficient as the most reliable object type.
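The shooting size acquisition module's scale recovery can be illustrated with a short sketch based on the reference-object approach of claim 3. The formula ratio = Hsj i / Hxs i is a reconstruction from the variable definitions in the claim, and all names below are assumptions:

```python
def shooting_size(box_len, box_h, ref_actual_h, ref_image_h):
    """Recover real-world target dimensions from a bounding box.

    ratio = ref_actual_h / ref_image_h is the image's size ratio (lambda_i);
    the maximum shooting length/height scale the box by that ratio.
    (Reconstruction under stated assumptions, not the patent's exact formula.)
    """
    ratio = ref_actual_h / ref_image_h
    return box_len * ratio, box_h * ratio

# e.g. a 3 m reference object spans 60 px in the image; target box is 400 x 120 px
L2, H2 = shooting_size(400, 120, ref_actual_h=3.0, ref_image_h=60)
```

With these assumed numbers the target's maximum shooting length and height come out to 20 m and 6 m respectively.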
Embodiment III:
An electronic device, comprising:
A memory for storing a computer program;
A processor for executing the computer program to implement the visual image-based water surface target identification method.
The above formulas all remove dimensions and compute on numerical values; each formula was obtained by software simulation over a large amount of collected data so as to be closest to the real situation, and the preset parameters in the formulas are set by those skilled in the art according to the actual situation.
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, the above-described embodiments may be implemented in whole or in part in the form of a computer program product. Those skilled in the art will appreciate that the units and algorithm steps described in connection with the embodiments disclosed herein can be implemented as electronic hardware, or as combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and the design constraints imposed on the solution.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
The foregoing is merely illustrative of the present application and does not limit it; any variation or substitution readily conceivable by a person skilled in the art within the scope disclosed herein shall fall within the protection scope of the present application.

Claims (9)

1. The water surface target identification method based on the visual image is characterized by comprising the following steps of:
S1, shooting a plurality of target images at different angles of the same water surface target in a night environment, and acquiring the illuminance values of the target images;
S2, positioning a target object in the target image, drawing a boundary frame of the target object, and calculating the maximum shooting length and the maximum shooting height of the target object according to the length and the height of the boundary frame;
S3, generating a comprehensive credibility coefficient of the target image according to the illuminance value of the target image, the maximum shooting length and the maximum shooting height of the target object;
S4, inputting the target images into a Faster R-CNN model, identifying the target objects in each target image by the Faster R-CNN model, and outputting corresponding target object types;
And S5, accumulating and averaging the comprehensive credibility coefficients of the target images corresponding to the same target object type to generate a credibility coefficient of the target object type, and taking the target object type with the largest credibility coefficient as the most credible target object type.
2. The visual image-based water surface target recognition method according to claim 1, wherein: the illuminance value is calibrated as E i, i represents the numbers of different target images, i=1, 2,3, … …, m represents the total number of different target images, m is more than or equal to 5, m epsilon N +, the length of the boundary box is calibrated as L1 i, the height of the boundary box is calibrated as H1 i, the maximum shooting length of the target object is calibrated as L2 i, and the maximum shooting height of the target object is calibrated as H2 i.
3. The visual image-based water surface target recognition method according to claim 2, wherein: the specific logic of locating the target object in the target image, drawing a boundary box of the target object, and calculating the maximum shooting length and the maximum shooting height of the target object according to the length and the height of the boundary box is as follows:
S21, training and generating a Mask R-CNN model, inputting a target image into the Mask R-CNN model, and outputting the position information of a target object in the target image and a Mask of a pixel level by the Mask R-CNN model;
s22, drawing a rectangular boundary frame internally tangent to the outer edge of the target object according to the position information of the target object output by the Mask R-CNN model, and acquiring the length and the height of the boundary frame;
S23, calculating the size ratio of the target image according to a reference object of known size in the target image, and generating the maximum shooting length and the maximum shooting height of the target object according to the size ratio and the length and height of the boundary box, the calculation formulas being:
λ i = Hsj i / Hxs i, L2 i = λ i * L1 i, H2 i = λ i * H1 i
wherein Hsj i denotes the actual height of the selected reference object in the i-th target image, Hxs i denotes the height of the selected reference object in the i-th target image, and λ i denotes the size ratio of the i-th target image.
4. The visual image-based water surface target recognition method according to claim 2, wherein: the specific logic for generating the comprehensive credibility coefficient of the target image according to the illuminance value of the target image, the maximum shooting length and the maximum shooting height of the target object is as follows:
S31, computing an intermediate first credibility coefficient w1 i for each target image according to a preset formula, and scaling w1 1, w1 2, w1 3, ..., w1 m to generate the first credibility coefficients w1 1', w1 2', w1 3', ..., w1 m' of the respective target images;
S32, computing an intermediate second credibility coefficient w2 i for each target image according to a preset formula, and scaling w2 1, w2 2, w2 3, ..., w2 m to generate the second credibility coefficients w2 1', w2 2', w2 3', ..., w2 m' of the respective target images;
wherein α1, α2 and α3 are all preset proportionality coefficients satisfying α1+α2+α3=1 and 0<α3<α2<α1<1;
S33, generating the comprehensive credibility coefficient w i' of the target image according to the first credibility coefficient w1 i' and the second credibility coefficient w2 i' of the target image, the calculation formula being:
w i' = β1*w1 i' + β2*w2 i'
wherein β1 and β2 are both preset proportionality coefficients satisfying β1+β2=1 and 0<β2<β1<1.
5. The visual image-based water surface target recognition method according to claim 4, wherein: the value range of the alpha 1 is 0.5-0.7, the value range of the alpha 2 is 0.2-0.3, and the value range of the alpha 3 is 0.1-0.2.
6. The visual image-based water surface target recognition method according to claim 4, wherein: the value range of beta 1 is 0.6-0.8, and the value range of beta 2 is 0.2-0.4.
7. The visual image-based water surface target recognition method according to claim 1, wherein: the types of objects include passenger ships, cargo ships, light beacon ships, warships, sailing ships, and others.
8. A visual image-based water surface target recognition apparatus for a visual image-based water surface target recognition method according to any one of claims 1 to 7, comprising:
The photographing and illuminance value acquisition module is used for photographing a plurality of target images under different angles on the same water surface target in a night environment and acquiring illuminance values of all the target images;
The shooting size acquisition module is used for positioning a target object in a target image, drawing a boundary frame of the target object, and calculating the maximum shooting length and the maximum shooting height of the target object according to the length and the height of the boundary frame;
the comprehensive credibility coefficient calculation module is used for generating a comprehensive credibility coefficient of the target image according to the illumination value of the target image, the maximum shooting length and the maximum shooting height of the target object;
The target object type output module is used for inputting the target images into a Faster R-CNN model, the Faster R-CNN model identifying the target object in each target image and outputting the corresponding target object type;
the most reliable object type generating module is used for accumulating and averaging the comprehensive credibility coefficients of the object images corresponding to the same object type to generate the credibility coefficient of the object type, and taking the object type with the largest credibility coefficient as the most reliable object type.
9. An electronic device, comprising:
A memory for storing a computer program;
A processor for executing the computer program to implement the visual image-based water surface target recognition method of any one of the preceding claims 1-7.
CN202410254580.0A 2024-03-06 2024-03-06 Water surface target identification method and device based on visual image and electronic equipment Active CN118071997B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410254580.0A CN118071997B (en) 2024-03-06 2024-03-06 Water surface target identification method and device based on visual image and electronic equipment

Publications (2)

Publication Number Publication Date
CN118071997A true CN118071997A (en) 2024-05-24
CN118071997B CN118071997B (en) 2024-09-10

Family

ID=91110931

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410254580.0A Active CN118071997B (en) 2024-03-06 2024-03-06 Water surface target identification method and device based on visual image and electronic equipment

Country Status (1)

Country Link
CN (1) CN118071997B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110334703A (en) * 2019-06-14 2019-10-15 武汉科技大学 Ship detecting and recognition methods in a kind of image round the clock
CN111444821A (en) * 2020-03-24 2020-07-24 西北工业大学 Automatic identification method for urban road signs
US20210319251A1 (en) * 2020-10-19 2021-10-14 Beijing Baidu Netcom Science Technology Co., Ltd. Method for processing multimodal images, apparatus, device and storage medium
CN114821022A (en) * 2022-06-27 2022-07-29 中国电子科技集团公司第二十八研究所 Credible target detection method integrating subjective logic and uncertainty distribution modeling
CN116935369A (en) * 2023-07-03 2023-10-24 武汉理工大学 Ship water gauge reading method and system based on computer vision
CN117031258A (en) * 2023-06-27 2023-11-10 三门三友科技股份有限公司 Method for realizing fault detection system of electrolytic circuit based on temperature and monitoring method thereof

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
ZHAN, Huarui; JIANG, Xiaorong: "Visual target image recognition algorithm for unmanned surface vehicle video under complex backgrounds", Ship Science and Technology, no. 16, 23 August 2020 (2020-08-23) *
MA, Xiao; SHAO, Limin; JIN, Xin; XU, Guanlei: "Research progress of ship target recognition technology", Science & Technology Review, no. 24, 28 December 2019 (2019-12-28) *

Also Published As

Publication number Publication date
CN118071997B (en) 2024-09-10

Similar Documents

Publication Publication Date Title
CN113705478B (en) Mangrove single wood target detection method based on improved YOLOv5
CN111476827B (en) Target tracking method, system, electronic device and storage medium
CN109558823B (en) Vehicle identification method and system for searching images by images
CN110334703B (en) Ship detection and identification method in day and night image
CN110689043A (en) Vehicle fine granularity identification method and device based on multiple attention mechanism
CN112634369A (en) Space and or graph model generation method and device, electronic equipment and storage medium
Zhang et al. I-MMCCN: Improved MMCCN for RGB-T crowd counting of drone images
Bai et al. Multimodal information fusion for weather systems and clouds identification from satellite images
CN113987251B (en) Method, system, equipment and storage medium for establishing face feature database
CN108960047A (en) Face De-weight method in video monitoring based on the secondary tree of depth
CN112541403B (en) Indoor personnel falling detection method by utilizing infrared camera
CN116311063A (en) Personnel fine granularity tracking method and system based on face recognition under monitoring video
CN117333948A (en) End-to-end multi-target broiler behavior identification method integrating space-time attention mechanism
CN113538513A (en) Method, device and equipment for controlling access of monitored object and storage medium
Gorodnichev et al. Research and Development of a System for Determining Abnormal Human Behavior by Video Image Based on Deepstream Technology
CN118071997B (en) Water surface target identification method and device based on visual image and electronic equipment
CN116665016B (en) Single-frame infrared dim target detection method based on improved YOLOv5
CN110909645B (en) Crowd counting method based on semi-supervised manifold embedding
CN112381017A (en) Vehicle heavy identification method based on sensing cascade context
Albalooshi et al. Deep belief active contours (DBAC) with its application to oil spill segmentation from remotely sensed sea surface imagery
CN117689995A (en) Unknown spacecraft level detection method based on monocular image
Elassal et al. Unsupervised crowd counting
CN115546668A (en) Marine organism detection method and device and unmanned aerial vehicle
Bharathi et al. A Conceptual Real-Time Deep Learning Approach for Object Detection, Tracking and Monitoring Social Distance using Yolov5
KR20230146269A (en) Deep neural network-based human detection system for surveillance

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant