
CN109143247B - Three-eye underwater detection method for acousto-optic imaging - Google Patents

Three-eye underwater detection method for acousto-optic imaging

Info

Publication number
CN109143247B
Authority
CN
China
Prior art keywords
target
image
pixel
super
sonar
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810795897.XA
Other languages
Chinese (zh)
Other versions
CN109143247A (en)
Inventor
刘艳
李庆武
霍冠英
周妍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Changzhou Campus of Hohai University
Original Assignee
Changzhou Campus of Hohai University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Changzhou Campus of Hohai University filed Critical Changzhou Campus of Hohai University
Priority to CN201810795897.XA priority Critical patent/CN109143247B/en
Publication of CN109143247A publication Critical patent/CN109143247A/en
Application granted granted Critical
Publication of CN109143247B publication Critical patent/CN109143247B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S15/00Systems using the reflection or reradiation of acoustic waves, e.g. sonar systems
    • G01S15/88Sonar systems specially adapted for specific applications
    • G01S15/89Sonar systems specially adapted for specific applications for mapping or imaging

Landscapes

  • Engineering & Computer Science (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • General Physics & Mathematics (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a three-eye underwater detection method for acousto-optic imaging, which comprises the following steps: mounting forward-looking sonar imaging equipment and two visible light cameras at the front end of a detector in a parallel trinocular form; automatically analyzing whether a target exists in a sonar imaging range, if so, automatically segmenting a target image, and estimating the target distance and direction; setting the navigation direction, speed and distance of the detector according to the target distance and direction, and acquiring an optical image; carrying out image restoration and significance detection on the optical left and right views to distinguish a foreground and a background; taking the detected foreground as a target, performing optimal search domain stereo matching on the left view and the right view, and calculating world coordinates of a reference image; and estimating the length, width, height and distance of the target, displaying and labeling in a result window, and completing detection. The invention automatically acquires the underwater optical and sonar images, determines the distance and the size of the underwater target, and provides visual, convenient and accurate monitoring data for underwater operation.

Description

Three-eye underwater detection method for acousto-optic imaging
Technical Field
The invention relates to the technical field of computers, in particular to a three-eye underwater detection method based on acousto-optic imaging.
Background
Underwater target detection and positioning plays an important role in fields such as national defense and civil use, and requires a suitable underwater imaging technology and corresponding processing methods. Currently, underwater imaging technologies mainly include sonar imaging and optical imaging. Sonar imaging has the advantages of long working distance and strong penetrating power, but it is affected by the complexity and variability of the water medium and by loss, scattering and the like of sound waves during propagation, so sonar images suffer from low contrast, strong speckle noise and blurred target edges. Optical imaging has higher resolution, and underwater scenes and targets in visible-light images are clearer and more faithful, but its working distance is shorter. A processing method based on a single sonar image or optical image alone cannot accurately detect and locate underwater targets, which brings difficulties to underwater surveying, underwater operation and underwater structure safety inspection.
Disclosure of Invention
In view of the above-mentioned defects in the prior art, the technical problem to be solved by the present invention is to provide a trinocular underwater detection method based on acousto-optic imaging, so as to automatically detect the position and size of an underwater target. The method integrates two visible light cameras, a forward-looking sonar imaging device, a data analysis module, a navigation control module and a communication module, automatically acquires underwater optical and sonar images, analyzes and determines the distance and size of the underwater target, and provides visual, convenient and accurate monitoring data for underwater operation.
In order to achieve the purpose, the invention provides a three-eye underwater detection method for acousto-optic imaging, which comprises the following steps:
step one, mounting a forward-looking sonar imaging device and two visible light cameras at the front end of a detector in a parallel three-eye mode, and respectively photographing a left view and a right view by the two visible light cameras mounted in a parallel two-eye mode;
automatically analyzing whether a target exists in the sonar imaging range, if so, automatically segmenting a target image, and estimating the target distance and direction;
thirdly, setting the navigation direction, speed and distance of the detector according to the target distance and direction, and acquiring an optical image;
fourthly, image restoration and significance detection are carried out on the optical left and right views, and a foreground and a background are distinguished;
step five, taking the detected foreground as a target, performing optimal search domain stereo matching on the left view and the right view, and calculating world coordinates of the reference image;
and sixthly, estimating the length, width, height and distance of the target, displaying and labeling in a result window, and finishing detection.
In the first step, the detector consists of two optical imaging devices, a forward-looking sonar imaging device, a three-axis electronic compass, a thruster module, a power supply and an information processing and control unit which has functions of sonar and optical image analysis, aircraft attitude judgment, navigation control and communication; and the two optical imaging devices, one forward-looking sonar imaging device, the three-axis electronic compass, the propeller module and the power supply are all connected with the information processing and control unit.
A sonar image filtering step is also arranged between the first step and the second step, and the filtered sonar image is represented as:
g(x, y) = Σ(s,t) w(s, t)·f(x + s, y + t)

wherein (x, y) is the pixel position in the sonar image, (s, t) is the window pixel position of the mask w, f is the original sonar image, g is the filtered sonar image, and w is a 5 × 5 stepped-mask filter.
In the second step, the automatic analysis of whether a target exists in the sonar imaging range is realized by sonar target detection; the target detection performs gradient analysis on the filtered sonar image g to judge whether a target exists in the image, and the image gradient G[g(x, y)] is calculated as:

G[g(x, y)] = [(∂g/∂x)² + (∂g/∂y)²]^(1/2)

A gradient threshold ThG is set and gradient values smaller than the threshold are truncated to obtain a gradient map G′; connectivity detection is performed on the gradient map and the number of pixel points in all connected regions is counted; if the number of pixel points in a certain connected region is larger than 0.01·m·n, a target is considered to exist in the sonar image.
The method for estimating the distance and the direction of the target comprises the following steps:
The connected region with the maximum number of pixel points in the gradient map G′ is taken as the target of interest, the outline of the target is extracted, the minimum circumscribed rectangle of the outline is calculated, and the coordinates of the central point of the rectangle are solved; taking the central point of the sonar image as the origin, the distance and offset angle between the rectangle central point and the origin are calculated to realize target positioning.
In the fourth step, the image restoration of the optical left and right views is based on the Jaffe-McGlamery underwater imaging model:
J(x, y) = [I(x, y) − Aθ] / max(t̂θ(x, y), t0) + Aθ

wherein I is the image to be restored; J is the restored clear image; A = (A1, A2, A3, A4, A5, A6) is the water body illumination intensity; t0 is the lower limit value of the transmittance; t̂θ is the estimated value of the transmittance,

t̂θ(x, y) = 1 − IL(x, y)/Aθ, (x, y) ∈ Ωθ

θ is the image region serial number, θ = 1, …, 6, Ωθ is the θ-th image region, Aθ is the water body illumination intensity of the θ-th image region, and IL is the L component of the image LAB space.
The method for calculating the illumination intensity A of the water body comprises the following steps:
For an optical image with resolution m × n, where m is greater than n, the pixel calculation region of the original image is (0–m, 0–n) and the effective pixel calculation region is (b/s – m−b/s, 0–n), where s and b are the pixel size and the baseline distance of the left and right optical cameras obtained by stereo calibration. Taking the central point of the effective calculation region as the circle center, the maximum inscribed circle of the region is found and its radius is denoted r. Within the five concentric rings of radii r/5, 2r/5, 3r/5, 4r/5 and r, the pixel points with the 0.01%, 0.02%, 0.03%, 0.04% and 0.05% largest brightness are found respectively, and the gray mean value of the pixel points found in each ring is taken as the water body illumination intensity of that ring region; the gray mean value of the 0.06% brightest pixel points in the non-effective calculation region is taken as the water body illumination intensity of that region.
In the fourth step, the significance detection step is as follows:
step 1: extracting color features and texture features of the image, calculating the similarity of the color features and the texture features, carrying out normalization and weighted fusion, and generating super pixels by taking the fused distance metric D as a similarity metric reference of pixels and a cluster center in an SLIC algorithm, wherein the calculation method of the distance metric D comprises the following steps:
[formula (3): fused distance metric D computed from dt, dc and ds]

In formula (3), dt, dc and ds respectively represent the texture, color and spatial similarity of the pixels; Nc and Ns are respectively the maximum class spatial distance and the maximum color distance; α is a weight adjusting coefficient of the texture and color space characteristics; the smaller the distance metric D value is, the greater the similarity between pixels is;
step 2: constructing a directed graph for the image after the super-pixel segmentation, taking the generated super-pixel bodies as a minimum unit, calculating color characteristic difference between the super-pixel bodies, and clustering the super-pixel bodies by taking the difference value as an edge weight value between the super-pixel bodies in the directed graph to realize the region combination based on the super-pixels;
The color feature CP of a single superpixel is calculated as:

CP = (1/n′) Σi∈P Ii

wherein P represents a single superpixel region, i is a pixel point within the superpixel, Ii is its value, and n′ is the total number of pixel points within the superpixel;
and step 3: calculating the global brightness mean characteristic avgL of the image:
avgL = (1/K) Σj CPj,  j = 1, …, K

where K is the number of superpixels. The significance FP of the superpixel P is expressed in terms of ||CP − avgL||2 and a significance reference coefficient: ||CP − avgL||2 characterizes the saliency of the superpixel, and a larger value indicates stronger saliency; the significance reference coefficient is derived from Sj, the contrast characteristic of the single superpixel region. If FP > 1, the superpixel P is marked as a significant superpixel;
Each superpixel is taken as a clustering data point with the value avgL, and all superpixels in the image are clustered to obtain the class label of each superpixel; the total number nP of superpixels in a class and the total number nFP of significant superpixels in it are counted, and finally the proportion of significant superpixels in the class is obtained:

η = nFP / nP

If the ratio η of significant superpixels in the class is greater than the preset threshold ThP, the class is marked as a target foreground class; otherwise it is marked as a background class. ThP takes the value 0.5.
In the fifth step, the optimal search domain stereo matching takes all pixel points in the foreground, namely the target, in the left view and the right view as nodes to form a minimum weight spanning tree, obtain the relation between each node and other pixels, namely edge weights, and carry out cost aggregation;
The edge weights are sorted from large to small according to formula (3) and all edges are traversed; if an edge weight ω meets the condition shown in formula (7), the disparity values of the two pixel points are considered consistent and the two points are merged; otherwise they are not merged:

ω ≤ min(Int(Tp) + k/|Tp|, Int(Tq) + k/|Tq|)    (7)

wherein Tp and Tq are subtrees (child nodes) of the spanning tree, |Tp| and |Tq| are their sizes, Int(Tp) and Int(Tq) are the maximum edge weights within Tp and Tq respectively, and k is an edge weight adjusting factor;
and performing the tree segmentation process again according to the obtained target parallax of the reference image, and performing parallax refinement on the foreground according to the connected region label to obtain the final parallax of the target.
In the sixth step, the length, width, height and distance of the target are estimated by the following specific method:
Connected regions in all the foreground are acquired and marked from top to bottom and from left to right as targets T1, …, TN; a tiltable bounding box, namely a minimum circumscribed rectangle, is created for the edge, namely the outline, of each connected region. According to the reference-image disparity map, histogram statistics are performed on the disparity values of the pixel points in each connected region to obtain the maximum peak position Pmax and the disparity values dP− and dP+ of its nearest troughs; connected-region judgment is performed on the pixels whose disparity values lie in the range (dP−, dP+), and a minimum circumscribed rectangle RN is created for the connected region with the largest number of pixels, the length and width of the rectangle being taken as the calculation basis of the width and height of the target. The maximum disparity position of the pixel points in each connected region is the closest position and the minimum disparity position is the farthest position; their difference is the possible maximum length of each target. The length, width, height and distance of the target are calculated as follows:

[formula (8): length, width, height and distance of the target computed from dmax, dmin, f′, lr1 and lr2]

In formula (8), dmax and dmin are respectively the maximum and minimum disparities of the target; f′ represents the camera focal length of the optical imaging apparatus; lr1 and lr2 are respectively the side lengths of the minimum circumscribed rectangle of the target RN, defined as Euclidean distances in three-dimensional space and calculated as follows:

lr1 = [(X1 − X2)² + (Y1 − Y2)² + (Z1 − Z2)²]^(1/2), lr2 = [(X2 − X3)² + (Y2 − Y3)² + (Z2 − Z3)²]^(1/2)    (9)

In formula (9), (X1, Y1, Z1), (X2, Y2, Z2) and (X3, Y3, Z3) are respectively the world coordinates of three vertices of the circumscribed rectangle.
The invention has the beneficial effects that:
1) according to the invention, the sonar imaging device and the two optical imaging devices are simultaneously carried at the front end of the underwater detector, combining the advantages of the two imaging methods in long-distance detection and short-distance accurate imaging, so that an underwater target can be accurately sensed;
2) the invention adopts an image restoration method with region-wise estimation of the illumination intensity, which solves the problem of underwater optical image degradation caused by light scattering in water;
3) the invention applies the multi-view vision technology to underwater detection, can accurately sense the position and the size of an underwater target and provides accurate data for underwater target detection and underwater operation.
The conception, the specific structure and the technical effects of the present invention will be further described with reference to the accompanying drawings to fully understand the objects, the features and the effects of the present invention.
Drawings
FIG. 1 is a schematic view of an underwater detector;
FIG. 2 is a flow chart of a data analysis process;
FIG. 3 is a diagram of the results of sonar image analysis;
fig. 4 is a schematic diagram of the target detection result.
Detailed Description
The following detailed description of the principles of the invention is provided in connection with the accompanying drawings.
Referring to fig. 1, the detector is composed of two optical imaging devices, a forward-looking sonar imaging device, a three-axis electronic compass, a thruster module, a power supply, and an information processing and control unit integrating data analysis, attitude determination, navigation control, and communication functions.
The two optical imaging devices have completely consistent parameters and are respectively arranged at the left and right ends of a straight line, with the forward-looking sonar imaging device arranged in the middle; the three imaging devices are fixed in parallel at the front end of the detector to simultaneously acquire binocular visible-light and sonar images. Underwater LED lamps are arranged above the two optical imaging devices.
The three-axis electronic compass, composed of a three-axis magnetic sensor and a three-axis acceleration sensor, outputs the heading, roll and pitch angles of the underwater detector and provides accurate direction and attitude information for navigation and approach to the target.
The thruster module consists of two vertical thrusters and two longitudinal thrusters, and finishes actions such as advancing, retreating, floating, submerging and the like according to a navigation control command, so that the navigation speed and the navigation direction of the detector can be controlled.
The power supply system adopts a silver-zinc battery with small volume, light weight and large capacity, is provided with a corresponding power supply controller and supplies power for the imaging equipment, the underwater searchlight LED lamp, the three-axis electronic compass, the propeller and the information processing and control unit.
Two optical imaging devices, a forward-looking sonar imaging device, a three-axis electronic compass, a propeller module and a power supply are all connected with the information processing and control unit. The information processing and control unit can analyze image data acquired by the imaging equipment, acquire triaxial electronic compass data and judge the posture of the detector, control four propellers of the propeller module so as to control the detector to sail, keep communication with a water control end, send the state and detection result of the detector, and receive a manual sailing instruction, a docking instruction and the like.
Data analysis includes sonar image processing, optical image processing, target detection and localization, and the like. The process flow is shown in FIG. 2.
And sonar image processing, namely filtering and denoising an image acquired by forward-looking sonar equipment, and then detecting a target. And automatically segmenting the sonar image with the target to obtain the target image.
The optical image processing is to restore an underwater optical image.
And target detection, namely segmenting and extracting foreground targets in the restored left view and right view by using significance detection.
And target positioning, including stereo matching, scene world coordinate calculation and target position marking. The scene world coordinate calculation is to perform stereo matching on the restored optical left and right views, to calculate a disparity map of the left or right view, and further to obtain world coordinates of each pixel point in the corresponding image; the target position mark is to create a tiltable rectangular bounding box for each target, calculate the length, width, height and distance of the bounding box, and display and mark the bounding box in a result window.
The three-eye underwater detection method of acousto-optic imaging comprises the following steps:
1. sonar image acquisition and processing
And (3) placing the detector into the water body, starting all imaging devices, acquiring the detector posture by the information processing and control unit, and controlling the propeller to sail according to the sailing instruction of the overwater control end. Receiving sonar imaging equipment data, preprocessing a sonar image, reducing noise interference universally existing in a forward sonar image, and improving image quality. Carrying out target detection on the preprocessed image to judge whether a target exists in an imaging range; if the target exists, automatically segmenting the target image, and estimating the target distance and the target direction;
and the sonar image preprocessing is to perform spatial filtering on the sonar image acquired by the sonar equipment. The resolution of the sonar imaging device is x y, the original image is f, and the resolution is calculated using a 5 x 5 filter with a stepped mask:
Figure GDA0002635722320000081
the filtered sonar images are:
Figure GDA0002635722320000082
wherein, (x, y) is the pixel position in the sonar image, (s, t) is the window pixel position of the mask w, and g is the filtered sonar image.
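As an illustration only (not part of the patent disclosure), the spatial filtering step above can be sketched in Python with OpenCV; the 5 × 5 stepped-mask weights below are placeholder assumptions, since the patent gives the mask only as an image.

```python
import cv2
import numpy as np

def filter_sonar(f):
    """Apply a normalized 5x5 weighted-mask filter to a sonar image f.

    The stepped-mask weights here are illustrative placeholders, not the
    patent's actual mask values.
    """
    w = np.array([[1, 1, 1, 1, 1],
                  [1, 2, 2, 2, 1],
                  [1, 2, 4, 2, 1],
                  [1, 2, 2, 2, 1],
                  [1, 1, 1, 1, 1]], dtype=np.float32)
    w /= w.sum()  # normalize so the overall brightness is preserved
    # g(x, y) = sum over the mask window of w(s, t) * f(x + s, y + t)
    g = cv2.filter2D(f, ddepth=-1, kernel=w, borderType=cv2.BORDER_REFLECT)
    return g
```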
Target detection performs gradient analysis on the filtered sonar image g and judges whether a target exists in the image. When a target exists in the image, the gray values of some pixel points change sharply. The gradient G[g(x, y)] is calculated by:

G[g(x, y)] = [(∂g/∂x)² + (∂g/∂y)²]^(1/2)

The filtering can reduce the influence of noise but cannot completely remove it; therefore a gradient threshold ThG is set, gradient values smaller than the threshold are truncated to obtain a gradient map G′, and connectivity detection is performed on the gradient map. If the number of pixel points contained in a certain connected region is greater than 0.01·m·n (m and n are respectively the numbers of horizontal and vertical pixel points of the image), it is determined that a target exists in the sonar image.
Target segmentation and positioning perform target extraction and position calculation on the sonar image containing the target. The connected region with the largest number of pixel points in the gradient map G′ is taken as the attention target, its outline is extracted, the minimum circumscribed rectangle of the outline is calculated, and the coordinates of the rectangle's center point are solved. Taking the sonar image center point as the origin, the distance and offset angle between the target center coordinates and the origin are calculated to realize target positioning.
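A minimal sketch of the gradient analysis, connectivity check and coarse positioning described above, assuming OpenCV; the gradient operator and the threshold value standing in for ThG are illustrative choices, not taken from the patent.

```python
import cv2
import numpy as np

def detect_sonar_target(g, th_g=30.0):
    """Gradient analysis, connectivity detection and coarse positioning on a
    filtered sonar image g (uint8). th_g stands in for the threshold ThG."""
    m, n = g.shape                               # vertical, horizontal pixel counts
    gx = cv2.Sobel(g, cv2.CV_32F, 1, 0)
    gy = cv2.Sobel(g, cv2.CV_32F, 0, 1)
    grad = np.sqrt(gx ** 2 + gy ** 2)
    grad_bin = (grad >= th_g).astype(np.uint8)   # truncate weak gradients -> G'

    num, labels, stats, _ = cv2.connectedComponentsWithStats(grad_bin)
    if num < 2:
        return None                              # no target in the imaging range
    areas = stats[1:, cv2.CC_STAT_AREA]
    if areas.max() <= 0.01 * m * n:
        return None
    target_label = 1 + int(np.argmax(areas))     # largest connected region
    mask = (labels == target_label).astype(np.uint8)

    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    rect = cv2.minAreaRect(max(contours, key=cv2.contourArea))   # min bounding rect
    (cx, cy), _, _ = rect
    oy, ox = m / 2.0, n / 2.0                    # sonar image center as origin
    dist = float(np.hypot(cx - ox, cy - oy))     # offset distance in pixels
    angle = float(np.degrees(np.arctan2(cy - oy, cx - ox)))      # offset angle
    return dist, angle, rect
```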
2. Optical image acquisition and restoration
The navigation direction, speed and distance of the detector are set according to the target distance and direction estimated from the sonar image, approach detection is carried out, and the optical image processing flow is entered when the distance reaches the threshold of 3 m. One frame is taken from the optical imaging video every 0.2 s, and image restoration is carried out on the pair of visible-light left and right views at that moment.
Underwater image restoration aims to correct the image noise and quality degradation caused by the absorption and backscattering of light by the water body. According to the Jaffe-McGlamery underwater imaging model:
I=J·t+A·(1-t) (4)
wherein, I is an original image (an image to be restored), J is a clear image after restoration, A is the illumination intensity of the water body, and t represents the transmissivity of the water body.
Because the left-eye and right-eye optical cameras use circular LED lamps for supplementary lighting, the light source distribution is considered to be circular; an effective calculation area is found according to the calibration parameters of the optical cameras, and then the water body illumination intensity A is calculated. The resolution of the optical imaging devices is known to be m × n (m larger than n), so the corresponding left and right views have size m × n; the focal length of the left-eye and right-eye optical cameras is f and the pixel size is s; the camera baseline distance b is obtained through stereo calibration, and the pixel width of the non-overlapping area imaged by the left and right cameras is b/s. The pixel calculation area of the original image is (0–m, 0–n) and the effective pixel calculation area is (b/s – m−b/s, 0–n). Taking the center point of the effective calculation area as the circle center, the maximum inscribed circle of the area is found and its radius is denoted r. Within the five concentric rings of radii r/5, 2r/5, 3r/5, 4r/5 and r, the pixel points with the 0.01%, 0.02%, 0.03%, 0.04% and 0.05% largest brightness are found respectively, and the gray mean value of the pixel points found in each ring is taken as the water body illumination intensity of that ring area. The gray mean value of the 0.06% brightest pixel points in the non-effective calculation area is taken as the water body illumination intensity of that area. The obtained water body illumination intensity can be expressed as A = (A1, A2, A3, A4, A5, A6).
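A sketch, under stated assumptions, of the region-wise illumination estimate just described: the effective area excludes a strip of width b/s on each side, the five ring radii are taken as r/5, …, r, and the brightness fractions follow the text; the function name and interface are illustrative.

```python
import numpy as np

def estimate_water_light(gray, b_over_s):
    """Estimate A = (A1, ..., A6) from one grayscale view.

    gray     : 2-D array of gray values
    b_over_s : pixel width b/s of the non-overlapping strip on each side
    """
    rows, cols = gray.shape
    x0, x1 = int(b_over_s), cols - int(b_over_s)
    eff = gray[:, x0:x1].astype(np.float32)        # effective calculation area
    h, w = eff.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    r = min(cy, cx)                                # radius of the max inscribed circle
    yy, xx = np.mgrid[0:h, 0:w]
    rad = np.hypot(yy - cy, xx - cx)

    A = []
    radii = [r * k / 5.0 for k in range(6)]        # 0, r/5, ..., r (assumed spacing)
    fracs = [0.0001, 0.0002, 0.0003, 0.0004, 0.0005]   # 0.01% ... 0.05% brightest
    for k in range(5):                             # five concentric rings
        ring = eff[(rad >= radii[k]) & (rad < radii[k + 1])]
        n_top = max(1, int(round(fracs[k] * ring.size)))
        A.append(float(np.sort(ring)[-n_top:].mean()))

    outside = np.concatenate([gray[:, :x0].ravel(),
                              gray[:, x1:].ravel()]).astype(np.float32)
    n_top = max(1, int(round(0.0006 * outside.size)))  # 0.06% brightest outside
    A.append(float(np.sort(outside)[-n_top:].mean()))
    return A
```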
Due to the change of different water body environments and water depths, the transmissivity t of the water body in a single image is not a fixed value, the image to be restored is divided into 6 areas according to the method, and the transmissivity of each area is estimated respectively.
t̂θ(x, y) = 1 − IL(x, y)/Aθ, (x, y) ∈ Ωθ    (5)

In formula (5), θ is the image region serial number, θ = 1, …, 6, Ωθ is the θ-th image region, Aθ is the water body illumination intensity of the θ-th image region, and IL is the L component of the image LAB space.
To avoid restoration errors caused by too low a transmittance, a lower limit t0 = 0.1 is set for the transmittance; the final restored image is:

J(x, y) = [I(x, y) − Aθ] / max(t̂θ(x, y), t0) + Aθ
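A sketch of the restoration step above, assuming OpenCV and a precomputed integer map assigning each pixel to one of the six regions Ωθ; how the regions are delimited is an assumption here (it follows the concentric-ring partition used for A), and the function name is illustrative.

```python
import cv2
import numpy as np

def restore_underwater(img_bgr, A, region_labels, t0=0.1):
    """Region-wise restoration J = (I - A_theta) / max(t_theta, t0) + A_theta.

    region_labels : integer map (values 0..5) assigning each pixel to a
                    region Omega_theta; its construction is an assumption.
    """
    lab = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2LAB)
    I_L = lab[:, :, 0].astype(np.float32)          # L component of LAB space
    I = img_bgr.astype(np.float32)
    J = np.zeros_like(I)
    for theta in range(6):
        mask = region_labels == theta
        A_t = float(A[theta])
        t_hat = 1.0 - I_L[mask] / max(A_t, 1e-6)   # estimated transmittance (eq. 5)
        t_hat = np.maximum(t_hat, t0)[:, None]     # lower bound t0
        J[mask] = (I[mask] - A_t) / t_hat + A_t
    return np.clip(J, 0, 255).astype(np.uint8)
```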
3. target detection
And respectively carrying out significance detection on the restored left view and right view. The significance detection steps are as follows:
step 1: and extracting color features and texture features of the image, calculating the similarity of the color features and the texture features, normalizing and weighting and fusing the color features and the texture features, and generating the superpixel by taking the fused distance metric D as a similarity metric standard between the pixels and the clustering center in the SLIC algorithm. The distance metric D is calculated by the following method:
[formula (7): fused distance metric D computed from dt, dc and ds]

In formula (7), dt, dc and ds respectively represent the texture, color and spatial similarity of the pixels; Nc and Ns are respectively the maximum class spatial distance and the maximum color distance; α is a weight adjusting coefficient of the texture and color space characteristics. The smaller the distance metric D value is, the greater the similarity between pixels is.
Step 2: and constructing a directed graph for the image after the super-pixel segmentation, taking the generated super-pixel bodies as a minimum unit, calculating color characteristic differences among the super-pixel bodies, and clustering the super-pixel bodies by taking the difference values as edge weights among the super-pixel bodies in the directed graph to realize the region merging based on the super-pixels.
The color feature CP of a single superpixel is calculated as:

CP = (1/n′) Σi∈P Ii

wherein P represents a single superpixel region, i is a pixel point within the superpixel, Ii is its value, and n′ is the total number of pixel points within the superpixel.
And step 3: calculating the global brightness mean characteristic avgL of the image:
avgL = (1/K) Σj CPj,  j = 1, …, K

wherein K is the number of superpixels. The significance FP of superpixel P is expressed in terms of ||CP − avgL||2 and a significance reference coefficient: ||CP − avgL||2 characterizes the saliency of the superpixel, and a larger value indicates stronger saliency; the significance reference coefficient is derived from Sj, the contrast characteristic of the single superpixel region. If FP > 1, the superpixel P is marked as a significant superpixel.
Each superpixel is taken as a clustering data point with the value avgL, and all superpixels in the image are clustered to obtain the class label of each superpixel; the total number nP of superpixels in a class and the total number nFP of significant superpixels in it are counted, and finally the proportion of significant superpixels in the class is obtained:

η = nFP / nP

If the ratio η of significant superpixels in the class is greater than the preset threshold ThP, the class is marked as a target foreground class; otherwise it is marked as a background class. ThP takes the value 0.5.
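A simplified sketch of the saliency-based foreground extraction, assuming scikit-image and scikit-learn: plain SLIC stands in for the texture-color fused superpixel generation of step 1, and the significance score and cluster count are simplified stand-ins for quantities the patent gives only as images.

```python
import numpy as np
from skimage.segmentation import slic
from skimage.color import rgb2lab
from sklearn.cluster import KMeans

def saliency_foreground(img_rgb, n_segments=300, th_p=0.5):
    """Superpixel-based saliency foreground mask (illustrative simplification)."""
    lab = rgb2lab(img_rgb)
    L = lab[:, :, 0]
    seg = slic(img_rgb, n_segments=n_segments, start_label=0)
    K = seg.max() + 1

    # per-superpixel mean brightness C_P and global mean avgL
    C = np.array([L[seg == p].mean() for p in range(K)])
    avgL = C.mean()

    # significance F_P: brightness contrast against the global mean,
    # normalized so that F_P > 1 marks a salient superpixel (simplified)
    F = np.abs(C - avgL) / (np.abs(C - avgL).mean() + 1e-6)
    salient = F > 1.0

    # cluster superpixels on brightness; keep classes dominated by salient ones
    labels = KMeans(n_clusters=4, n_init=10).fit_predict(C.reshape(-1, 1))
    fg_mask = np.zeros(seg.shape, dtype=bool)
    for c in range(4):
        members = np.where(labels == c)[0]
        eta = salient[members].mean()              # share of salient superpixels
        if eta > th_p:                             # foreground class
            for p in members:
                fg_mask |= seg == p
    return fg_mask
```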
4. Target localization
After a target in an image is detected, performing optimal search domain stereo matching based on a foreground target on left and right views; calculating the world coordinates of the reference image according to the reference image disparity map obtained by matching and the calibration parameters of the two visible light cameras; and calculating the size and the distance of the target, and determining the position of the target. The method comprises the following specific steps:
and taking all pixel points in the foreground (namely the target) in the left view and the right view as nodes to form a minimum weight spanning tree, obtaining the relation (namely the edge weight) between each node and other pixels, and carrying out cost aggregation.
Edge weights are calculated according to the distance metric D obtained in formula (7) and sorted from large to small; all edges are traversed, and if an edge weight ω meets the condition shown in formula (13), the disparity values of the two pixel points are considered consistent and the two points are merged; otherwise they are not merged.

ω ≤ min(Int(Tp) + k/|Tp|, Int(Tq) + k/|Tq|)    (13)

Tp and Tq are subtrees (child nodes) of the spanning tree; Int(Tp) and Int(Tq) are the maximum edge weights within Tp and Tq; |Tp| and |Tq| are their sizes; k is an edge weight adjusting factor with a value of 1000–1500.
And performing the tree segmentation process again according to the obtained target parallax of the reference image, and performing parallax refinement on the foreground according to the connected region label to obtain the final parallax of the target.
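A minimal sketch of the subtree-merging test of formula (13) with a union-find structure. The criterion Int(T) + k/|T| is the standard graph-segmentation form and is an assumption about the formula the patent gives only as an image; edges are traversed in ascending weight order as in that standard formulation.

```python
class DisjointSet:
    """Union-find with per-component size |T| and internal difference Int(T)."""
    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n
        self.internal = [0.0] * n          # Int(T): largest edge weight inside T

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]   # path halving
            x = self.parent[x]
        return x

    def union(self, a, b, w):
        a, b = self.find(a), self.find(b)
        self.parent[b] = a
        self.size[a] += self.size[b]
        self.internal[a] = max(self.internal[a], self.internal[b], w)


def merge_subtrees(n_pixels, edges, k=1200.0):
    """edges: iterable of (weight, p, q) taken from the spanning tree.

    Two subtrees Tp, Tq are merged when the connecting edge weight satisfies
    w <= min(Int(Tp) + k/|Tp|, Int(Tq) + k/|Tq|); k is the edge-weight
    adjusting factor (1000-1500 in the text).
    """
    ds = DisjointSet(n_pixels)
    for w, p, q in sorted(edges):          # ascending edge weight
        tp, tq = ds.find(p), ds.find(q)
        if tp == tq:
            continue
        if w <= min(ds.internal[tp] + k / ds.size[tp],
                    ds.internal[tq] + k / ds.size[tq]):
            ds.union(tp, tq, w)
    return ds
```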
And calculating the world coordinates of the target in the reference image by combining the calibrated parameters of the two visible light cameras.
Z = f′·b/d,  X = x·b/d,  Y = y·b/d

The focal length of the two visible light cameras is f′, the baseline distance between the two visible light cameras is b, and the disparity of each target pixel point in the reference image is d; x and y are respectively the abscissa and ordinate in the reference image, X and Y respectively represent the world coordinates corresponding to the horizontal and vertical image axes, and Z represents the scene depth.
The maximum foreground disparity corresponds to the minimum distance of the target from the camera; the corresponding Zmin is the target distance.
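A sketch of the disparity-to-world-coordinate conversion for a rectified parallel stereo pair, using the common form Z = f′·b/d; f_prime and b come from the stereo calibration, and the function name is illustrative.

```python
import numpy as np

def disparity_to_world(disp, f_prime, b):
    """Back-project a reference-view disparity map into world coordinates:
    Z = f'*b/d, X = x*b/d, Y = y*b/d.  Zero disparities are masked out."""
    h, w = disp.shape
    y, x = np.mgrid[0:h, 0:w].astype(np.float32)
    d = np.where(disp > 0, disp.astype(np.float32), np.nan)   # ignore invalid pixels
    Z = f_prime * b / d
    X = x * b / d
    Y = y * b / d
    z_min = np.nanmin(Z)                  # closest foreground point = target distance
    return X, Y, Z, z_min
```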
5. Display of detection results
From the foreground targets obtained in target detection, all connected regions in the foreground are acquired and marked from top to bottom and from left to right as targets T1, …, TN. A tiltable bounding box, i.e. a minimum circumscribed rectangle, is created for the edge (i.e. contour) of each connected region and displayed in the result window, as shown in fig. 3, 4.
Using the reference-image disparity map obtained in target positioning, histogram statistics are performed on the disparity values of the pixel points in each connected region to obtain the maximum peak position Pmax and the disparity values dP− and dP+ of its nearest troughs. Connected-region judgment is performed on the pixels whose disparity values lie in the range (dP−, dP+), and a minimum circumscribed rectangle RN is created for the connected region with the largest number of pixels; the length and width of this rectangle are used as the calculation basis of the width and height of the target. The maximum disparity position of the pixel points in each connected region is the closest position and the minimum disparity position is the farthest position; the difference between them is the possible maximum length of each target. The length, width, height and distance of the target are calculated as follows:

[formula: length, width, height and distance of the target computed from dmax, dmin, f′, lr1 and lr2]

wherein dmax and dmin are respectively the maximum and minimum disparities of the target; lr1 and lr2 are respectively the side lengths of the minimum circumscribed rectangle of target RN, defined as Euclidean distances in three-dimensional space and calculated as follows:

lr1 = [(X1 − X2)² + (Y1 − Y2)² + (Z1 − Z2)²]^(1/2), lr2 = [(X2 − X3)² + (Y2 − Y3)² + (Z2 − Z3)²]^(1/2)    (15)

In formula (15), (X1, Y1, Z1), (X2, Y2, Z2) and (X3, Y3, Z3) are respectively the world coordinates of three vertices of the circumscribed rectangle.
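A sketch of the size and distance estimation for a single foreground target, assuming OpenCV; the disparity-histogram peak analysis is omitted, and since the patent's size formula is given only as an image, the conversions below (pixel-to-metric scale at the nearest plane, depth extent from the disparity spread) are plausible assumptions rather than the patented formula.

```python
import cv2
import numpy as np

def measure_target(fg_mask, disp, f_prime, b):
    """Estimate width, height, length and distance of one foreground target.

    Width/height: side lengths of the tiltable minimum bounding rectangle,
    scaled to world units at the nearest depth plane.
    Length: depth extent from the disparity spread.  Distance: nearest point.
    """
    ys, xs = np.nonzero(fg_mask)
    d = disp[ys, xs].astype(np.float32)
    d = d[d > 0]
    if d.size == 0:
        return None
    d_max, d_min = float(d.max()), float(d.min())

    rect = cv2.minAreaRect(np.column_stack([xs, ys]).astype(np.float32))
    w_px, h_px = rect[1]                        # rectangle side lengths in pixels
    scale = b / d_max                           # world units per pixel at nearest plane
    width, height = w_px * scale, h_px * scale
    length = f_prime * b * (1.0 / d_min - 1.0 / d_max)   # depth extent
    distance = f_prime * b / d_max              # closest point of the target
    return dict(width=width, height=height, length=length, distance=distance)
```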
The foregoing detailed description of the preferred embodiments of the invention has been presented. It should be understood that numerous modifications and variations could be devised by those skilled in the art in light of the present teachings without departing from the inventive concepts. Therefore, the technical solutions available to those skilled in the art through logic analysis, reasoning and limited experiments based on the prior art according to the concept of the present invention should be within the scope of protection defined by the claims.

Claims (9)

1. A three-eye underwater detection method of acousto-optic imaging is characterized by comprising the following steps:
step one, mounting a forward-looking sonar imaging device and two visible light cameras at the front end of a detector in a parallel three-eye mode, and respectively photographing a left view and a right view by the two visible light cameras mounted in a parallel two-eye mode;
automatically analyzing whether a target exists in the sonar imaging range, if so, automatically segmenting a target image, and estimating the target distance and direction;
a sonar image denoising and filtering step is also arranged between the first step and the second step, and the sonar image after denoising and filtering is represented as:
g(x, y) = Σ(s,t) w(s, t)·f(x + s, y + t)

wherein (x, y) is the pixel position in the sonar image, (s, t) is the window pixel position of the mask w, g is the filtered sonar image, w is a 5 × 5 stepped-mask filter, f(x, y) is the value of the pixel point (x, y) in the sonar image, and f(x + s, y + t) represents the value of each pixel point in the window of the mask w centered on the pixel point (x, y);
thirdly, setting the navigation direction, speed and distance of the detector according to the target distance and direction, and acquiring an optical image;
fourthly, image restoration and significance detection are carried out on the optical left and right views, and a foreground and a background are distinguished;
step five, taking the detected foreground as a target, performing optimal search domain stereo matching on the left view and the right view, solving a target disparity map, and calculating world coordinates of the target in the reference image according to parameters obtained by calibration;
and sixthly, estimating the length, width, height and distance of the target, displaying and labeling in a result window, and finishing detection.
2. The acousto-optic imaging trinocular underwater detection method of claim 1, characterized in that: in the first step, the detector consists of two optical imaging devices, a forward-looking sonar imaging device, a three-axis electronic compass, a propeller module, a power supply and an information processing and control unit which has functions of sonar and optical image analysis, aircraft attitude judgment, navigation control and communication; and the two optical imaging devices, one forward-looking sonar imaging device, the three-axis electronic compass, the propeller module and the power supply are all connected with the information processing and control unit.
3. The acousto-optic imaging trinocular underwater detection method according to claim 2, characterized in that: in the second step, the automatic analysis of whether a target exists in the sonar imaging range is realized by sonar target detection; the target detection performs gradient analysis on the filtered sonar image g to judge whether a target exists in the image, and the image gradient G[g(x, y)] is calculated as:

G[g(x, y)] = [(∂g/∂x)² + (∂g/∂y)²]^(1/2)

A gradient threshold ThG is set and gradient values smaller than the threshold are truncated to obtain a gradient map G′; connectivity detection is performed on the gradient map and the number of pixel points in all connected regions is counted; if the number of pixel points contained in a certain connected region is larger than 0.01·m·n, wherein m and n are respectively the numbers of horizontal and vertical pixel points of the image, the sonar image is considered to contain a target.
4. The acousto-optic imaging trinocular underwater detection method according to claim 3, characterized in that the estimation method of the object distance and orientation is as follows:
the method comprises the steps of taking a connected region with the largest number of pixel points in a gradient map G' as an attention target, extracting the outline of the connected region, calculating the minimum external rectangle of the outline, solving the coordinate of a central point of a rectangle, taking the central point of a sonar image as an origin, and calculating the distance between the coordinate of the central point of the rectangle and the origin and the offset angle to realize target positioning.
5. The acousto-optic imaging trinocular underwater detection method according to claim 4, characterized in that in step four, the image restoration of the optical left and right views is based on the Jaffe-McGlamery underwater imaging model:
J(x, y) = [I(x, y) − Aθ] / max(t̂θ(x, y), t0) + Aθ

I is the image to be restored; J is the restored clear image; A = (A1, A2, A3, A4, A5, A6) is the water body illumination intensity; t0 is the lower limit value of the transmittance; t̂θ is the estimated value of the transmittance,

t̂θ(x, y) = 1 − IL(x, y)/Aθ, (x, y) ∈ Ωθ

θ is the image region serial number, θ = 1, …, 6, Ωθ is the θ-th image region, Aθ is the water body illumination intensity of the θ-th image region, and IL is the L component of the image LAB space.
6. The acousto-optic imaging trinocular underwater detection method according to claim 5, characterized in that the method for calculating the water body illumination intensity A is as follows:
For an optical image with resolution m × n, where m is greater than n, the pixel calculation region of the original image is (0–m, 0–n) and the effective pixel calculation region is (b/s – m−b/s, 0–n), wherein s and b are respectively the pixel size and the baseline distance of the left-eye and right-eye optical cameras obtained by stereo calibration; taking the central point of the effective calculation region as the circle center, the maximum inscribed circle of the region is found and its radius is denoted r; within the five concentric rings of radii r/5, 2r/5, 3r/5, 4r/5 and r, the pixel points with the 0.01%, 0.02%, 0.03%, 0.04% and 0.05% largest brightness are found respectively, and the gray mean value of the pixel points found in each ring is taken as the water body illumination intensity of that ring region; the gray mean value of the 0.06% brightest pixel points in the non-effective calculation region is taken as the water body illumination intensity of that region.
7. The acousto-optic imaging trinocular underwater detection method according to claim 6, characterized in that in step four, the saliency detection step is:
step 1: extracting color features and texture features of the image, calculating the similarity of the color features and the texture features, carrying out normalization and weighted fusion, and generating super pixels by taking the fused distance metric D as a similarity metric reference of pixels and a cluster center in an SLIC algorithm, wherein the calculation method of the distance metric D comprises the following steps:
[formula (3): fused distance metric D computed from dt, dc and ds]

In formula (3), dt, dc and ds respectively represent the texture, color and spatial similarity of the pixels; Nc and Ns are respectively the maximum class spatial distance and the maximum color distance; α is a weight adjusting coefficient of the texture and color space characteristics; the smaller the distance metric D value is, the greater the similarity between pixels is;
step 2: constructing a directed graph for the image after the super-pixel segmentation, taking the generated super-pixel bodies as a minimum unit, calculating color characteristic difference between the super-pixel bodies, and clustering the super-pixel bodies by taking the difference value as an edge weight value between the super-pixel bodies in the directed graph to realize the region combination based on the super-pixels;
The color feature CP of a single superpixel is calculated as:

CP = (1/n′) Σi∈P Ii

wherein P represents a single superpixel region, i is a pixel point within the superpixel, Ii is its value, and n′ is the total number of pixel points within the superpixel;
and step 3: calculating the global brightness mean characteristic avgL of the image:
avgL = (1/K) Σj CPj,  j = 1, …, K

where K is the number of superpixels; the significance FP of the superpixel P is expressed in terms of ||CP − avgL||2 and a significance reference coefficient: ||CP − avgL||2 characterizes the saliency of the superpixel, and a larger value indicates stronger saliency; the significance reference coefficient is derived from Sj, the contrast characteristic of the single superpixel region; if FP > 1, the superpixel P is marked as a significant superpixel;
Each superpixel is taken as a clustering data point with the value avgL, and all superpixels in the image are clustered to obtain the class label of each superpixel; the total number nP of superpixels in a class and the total number nFP of significant superpixels in it are counted, and finally the proportion of significant superpixels in the class is obtained:

η = nFP / nP

If the ratio η of significant superpixels in the class is greater than the preset threshold ThP, the class is marked as a target foreground class; otherwise it is marked as a background class; ThP takes the value 0.5.
8. The acousto-optic imaging trinocular underwater detection method according to claim 7, characterized in that in step five, the optimal search domain stereo matching takes all pixel points in the foreground, i.e. the target, in the left and right views as nodes to form a minimum weight spanning tree, and the relationship, i.e. the edge weight, between each node and other pixels is obtained, and cost aggregation is performed;
sorting the edge weight values from large to small according to formula (3), traversing all edges, and if an edge weight ω meets the condition shown in formula (7), considering that the disparity values of the two pixel points are consistent and merging the two points; otherwise, not merging:

ω ≤ min(Int(Tp) + k/|Tp|, Int(Tq) + k/|Tq|)    (7)

wherein k is an edge weight adjusting factor, Tp and Tq are subtrees (child nodes) of the spanning tree, |Tp| and |Tq| are their sizes, and Int(Tp) and Int(Tq) are the maximum edge weights within Tp and Tq;
and performing the tree segmentation process again according to the obtained target parallax of the reference image, and performing parallax refinement on the foreground according to the connected region label to obtain the final parallax of the target.
9. The acousto-optic imaging trinocular underwater detection method according to claim 8, characterized in that in step six, the length, width, height and distance of the target are estimated by the following specific method:
acquiring connected regions in all the foreground and marking them from top to bottom and from left to right as targets T1, …, TN; creating a tiltable bounding box, namely a minimum circumscribed rectangle, for the edge, namely the outline, of each connected region; performing histogram statistics on the disparity values of the pixel points in each connected region according to the reference-image disparity map to obtain the maximum peak position Pmax and the disparity values dP− and dP+ of its nearest troughs; performing connected-region judgment on the pixels whose disparity values lie in the range (dP−, dP+), and creating a minimum circumscribed rectangle RN for the connected region with the largest number of pixels, the length and width of the rectangle being taken as the calculation basis of the width and height of the target; the maximum disparity position of the pixel points in each connected region is the closest position to the detector and the minimum disparity position is the farthest position, their difference being the possible maximum length of each target; the length, width, height and distance of the target are calculated as follows:

[formula (8): length, width, height and distance of the target computed from dmax, dmin, f′, lr1 and lr2]

In formula (8), dmax and dmin are respectively the maximum and minimum disparities of the target; f′ represents the camera focal length of the optical imaging apparatus; lr1 and lr2 are respectively the side lengths of the minimum circumscribed rectangle of the target RN, defined as Euclidean distances in three-dimensional space and calculated as follows:

lr1 = [(X1 − X2)² + (Y1 − Y2)² + (Z1 − Z2)²]^(1/2), lr2 = [(X2 − X3)² + (Y2 − Y3)² + (Z2 − Z3)²]^(1/2)    (9)

In formula (9), (X1, Y1, Z1), (X2, Y2, Z2) and (X3, Y3, Z3) are respectively the world coordinates of three vertices of the circumscribed rectangle.
CN201810795897.XA 2018-07-19 2018-07-19 Three-eye underwater detection method for acousto-optic imaging Active CN109143247B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810795897.XA CN109143247B (en) 2018-07-19 2018-07-19 Three-eye underwater detection method for acousto-optic imaging

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810795897.XA CN109143247B (en) 2018-07-19 2018-07-19 Three-eye underwater detection method for acousto-optic imaging

Publications (2)

Publication Number Publication Date
CN109143247A CN109143247A (en) 2019-01-04
CN109143247B true CN109143247B (en) 2020-10-02

Family

ID=64800986

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810795897.XA Active CN109143247B (en) 2018-07-19 2018-07-19 Three-eye underwater detection method for acousto-optic imaging

Country Status (1)

Country Link
CN (1) CN109143247B (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110008951B (en) * 2019-03-14 2020-12-15 深兰科技(上海)有限公司 Target detection method and device
CN109884642B (en) * 2019-03-26 2022-12-13 南京砺剑光电技术研究院有限公司 Fusion imaging method adopting multi-beam sonar and laser auxiliary illumination imaging equipment
CN109788163A (en) * 2019-03-26 2019-05-21 南京砺剑光电技术研究院有限公司 A kind of fusion of imaging device of two dimension sonar and technique of laser range gated imaging equipment
CN110322572B (en) * 2019-06-11 2022-09-09 长江勘测规划设计研究有限责任公司 Binocular vision-based underwater culvert and tunnel inner wall three-dimensional information recovery method
CN110276388B (en) * 2019-06-14 2022-05-31 深圳市吉影科技有限公司 Image processing method and device applied to underwater unmanned aerial vehicle
CN112433219B (en) * 2020-11-03 2024-05-31 深圳市汇海潜水工程服务有限公司 Underwater detection method, system and readable storage medium
CN113450374B (en) * 2021-06-25 2022-10-14 山东航天电子技术研究所 Automatic real-time three-dimensional measurement method for underwater target based on laser imaging
CN113989628B (en) * 2021-10-27 2022-08-26 哈尔滨工程大学 Underwater signal lamp positioning method based on weak direction gradient
CN114114284B (en) * 2021-11-05 2024-07-05 哈尔滨工程大学 Forward-looking sonar image target segmentation method and system and electronic equipment
CN115294187B (en) * 2022-10-08 2023-01-31 合肥的卢深视科技有限公司 Image processing method of depth camera, electronic device and storage medium
CN116028838B (en) * 2023-01-09 2023-09-19 广东电网有限责任公司 Clustering algorithm-based energy data processing method and device and terminal equipment

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102879786A (en) * 2012-09-19 2013-01-16 上海大学 Detecting and positioning method and system for aiming at underwater obstacles
CN104103082A (en) * 2014-06-06 2014-10-15 华南理工大学 Image saliency detection method based on region description and priori knowledge
CN104808210A (en) * 2015-04-16 2015-07-29 深圳大学 Fusion imaging device and method for sonar and binocular vision imaging system
KR20180053091A (en) * 2016-11-11 2018-05-21 인천대학교 산학협력단 Method of identification for underwater object using sonar images and apparatus thereof

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102879786A (en) * 2012-09-19 2013-01-16 上海大学 Detecting and positioning method and system for aiming at underwater obstacles
CN104103082A (en) * 2014-06-06 2014-10-15 华南理工大学 Image saliency detection method based on region description and priori knowledge
CN104808210A (en) * 2015-04-16 2015-07-29 深圳大学 Fusion imaging device and method for sonar and binocular vision imaging system
KR20180053091A (en) * 2016-11-11 2018-05-21 인천대학교 산학협력단 Method of identification for underwater object using sonar images and apparatus thereof

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Schweizer, P.F. et al.; "Automatic target detection and cuing system for an autonomous underwater vehicle (AUV)"; Proceedings of the 6th International Symposium on Unmanned Untethered Submersible Technology; 2002-08-06; pp. 359-371 *
Chen Yao; "Sonar image processing system based on UUV" (基于UUV的声呐图像处理系统); 水雷战与舰船防护; 2015-08-31; Vol. 23, No. 3; pp. 37-40 *
Yan Zheping et al.; "Semi-physical simulation system for UUV obstacle avoidance based on binocular vision" (基于双目视觉的UUV避障半实物仿真系统); 鱼雷技术; 2012-04-30; Vol. 20, No. 2; pp. 143-148 *

Also Published As

Publication number Publication date
CN109143247A (en) 2019-01-04

Similar Documents

Publication Publication Date Title
CN109143247B (en) Three-eye underwater detection method for acousto-optic imaging
CN107844750B (en) Water surface panoramic image target detection and identification method
US12131502B2 (en) Object pose estimation in visual data
US20200234397A1 (en) Automatic view mapping for single-image and multi-view captures
EP2887315B1 (en) Camera calibration device, method for implementing calibration, program and camera for movable body
JP6442834B2 (en) Road surface height shape estimation method and system
CN104902261B (en) Apparatus and method for the road surface identification in low definition video flowing
CN106681353A (en) Unmanned aerial vehicle (UAV) obstacle avoidance method and system based on binocular vision and optical flow fusion
US11783443B2 (en) Extraction of standardized images from a single view or multi-view capture
CN112051853B (en) Intelligent obstacle avoidance system and method based on machine vision
CN112801074B (en) Depth map estimation method based on traffic camera
CN102774325A (en) Rearview reversing auxiliary system and method for forming rearview obstacle images
CN107677274A (en) Unmanned plane independent landing navigation information real-time resolving method based on binocular vision
CN105913013A (en) Binocular vision face recognition algorithm
WO2021017211A1 (en) Vehicle positioning method and device employing visual sensing, and vehicle-mounted terminal
CN108491810A (en) Vehicle limit for height method and system based on background modeling and binocular vision
CN109292099B (en) Unmanned aerial vehicle landing judgment method, device, equipment and storage medium
CN113313116B (en) Underwater artificial target accurate detection and positioning method based on vision
CN106570899A (en) Target object detection method and device
Concha et al. Real-time localization and dense mapping in underwater environments from a monocular sequence
CN115170648B (en) Carriage pose determining method and device
CN107301371A (en) A kind of unstructured road detection method and system based on image information fusion
CN109443319A (en) Barrier range-measurement system and its distance measuring method based on monocular vision
Petrovai et al. Obstacle detection using stereovision for Android-based mobile devices
Gracias et al. Application challenges of underwater vision

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant