CN118644640B - Underwater image three-dimensional reconstruction method and system based on deep learning
- Publication number
- CN118644640B; CN202411093922.1A
- Authority
- CN
- China
- Prior art keywords
- target
- underwater
- image
- data
- view
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A90/00—Technologies having an indirect contribution to adaptation to climate change
- Y02A90/30—Assessment of water resources
Landscapes
- Image Analysis (AREA)
- Image Processing (AREA)
Abstract
The invention relates to the technical field of underwater imaging, and in particular to a deep-learning-based method and system for three-dimensional reconstruction of underwater images. The method comprises the following steps: acquiring target underwater position data; performing single-view preliminary shooting according to the target underwater position data to obtain an underwater target preliminary shot image; performing image ambiguity (blur level) analysis on the underwater target preliminary shot image to generate underwater target single-view image ambiguity data; and performing target motion recognition on the underwater target single-view image ambiguity data to generate target motion recognition data. By applying image blur compensation, multi-view shooting and accurate depth estimation to the underwater target, the invention improves the comprehensiveness and accuracy of three-dimensional image reconstruction.
Description
Technical Field
The invention relates to the technical field of underwater imaging, and in particular to a deep-learning-based method and system for three-dimensional reconstruction of underwater images.
Background
The development of deep-learning-based three-dimensional reconstruction of underwater images can be summarized as follows. As underwater detection technology advanced, conventional underwater image processing methods ran into challenges such as large variations in underwater illumination, image blur and the lack of reliable depth information. With the advent of deep learning, researchers began applying convolutional neural networks (CNNs) to feature extraction and reconstruction of underwater images. Using large labeled datasets, such as virtual underwater datasets and limited field data, researchers first developed CNN-based underwater image enhancement algorithms to improve image sharpness and contrast. With the introduction of generative adversarial networks (GANs) and autoencoders, researchers began to explore how to reconstruct more accurate three-dimensional structures from a single underwater image. These methods use deep neural networks to learn features from complex underwater optical and physical scenes and, through end-to-end training, recover object shape and depth information in underwater environments. However, existing methods are often affected by factors such as light attenuation and water flow when processing underwater images, which produces blurred and motion-blurred images. In addition, the underwater target can often be imaged from only a single view, so the reconstructed three-dimensional model lacks comprehensiveness and accuracy.
Disclosure of Invention
Accordingly, it is necessary to provide a deep-learning-based method and system for three-dimensional reconstruction of underwater images that solve at least one of the above technical problems.
To achieve the above object, the invention provides a deep-learning-based three-dimensional reconstruction method for underwater images, the method comprising the following steps:
Step S1: acquiring target underwater position data; performing single-view preliminary shooting according to the target underwater position data to obtain an underwater target preliminary shot image; performing image ambiguity (blur level) analysis on the underwater target preliminary shot image to generate underwater target single-view image ambiguity data; performing target motion recognition on the underwater target single-view image ambiguity data to generate target motion recognition data;
Step S2: performing target classification on the standard underwater target single-view shot image according to the target motion recognition data to generate first underwater target type data and second underwater target type data; performing multi-view shooting compensation on the standard underwater target single-view shot image based on the first underwater target type data and the second underwater target type data to generate a standard underwater target multi-view shot image;
Step S3: performing image superposition on the standard underwater target multi-view shot image and the standard underwater target single-view shot image to generate a target view superposition image; performing depth estimation on the target view superposition image to generate an underwater target depth map; performing depth fusion reconstruction on the target view superposition image and the underwater target depth map to generate a target underwater fusion depth map;
Step S4: performing three-dimensional point cloud conversion on the target underwater fusion depth map and the standard underwater target multi-view shot image to generate target underwater three-dimensional point cloud data; constructing a three-dimensional model from the target underwater three-dimensional point cloud data to generate a target underwater three-dimensional preliminary model; and performing model rendering on the target underwater three-dimensional preliminary model to generate a high-precision target three-dimensional underwater model, with which the three-dimensional reconstruction of the underwater image is carried out.
According to the invention, acquiring the target underwater position data, performing single-view preliminary shooting and analyzing image ambiguity yields the underwater target single-view image ambiguity data and the target motion recognition data, which provide the basis for the subsequent depth estimation and multi-view shooting compensation. The target motion recognition data is used to classify the image into first underwater target type data and second underwater target type data, and multi-view shooting compensation is then applied, which increases the information content and accuracy of the underwater target image. The standard underwater target multi-view shot image and the single-view shot image are superposed, and depth estimation and depth fusion reconstruction are performed to generate a high-quality target underwater fusion depth map; these steps improve the accuracy of depth acquisition and image reconstruction in underwater scenes. Three-dimensional point cloud conversion and model construction are performed on the target underwater fusion depth map and the multi-view shot image to generate the target underwater three-dimensional point cloud data and a preliminary three-dimensional model. The preliminary model is then rendered to generate a high-precision target three-dimensional underwater model, which provides a detailed and realistic three-dimensional representation for the three-dimensional reconstruction of the underwater image. The method therefore improves the comprehensiveness and accuracy of three-dimensional image reconstruction through image blur compensation, multi-view shooting and accurate depth estimation of the underwater target.
Preferably, step S1 comprises the steps of:
Step S11: acquiring target underwater position data by using GPS;
Step S12: performing single-view preliminary shooting with an underwater camera array according to the target underwater position data to obtain an underwater target preliminary shot image;
Step S13: performing image preprocessing on the underwater target preliminary shot image to obtain a standard underwater target single-view shot image, wherein the image preprocessing comprises image brightness enhancement, image geometric transformation and image smoothing;
Step S14: performing image ambiguity analysis on the standard underwater target single-view shot image to generate underwater target single-view image ambiguity data; and performing target motion recognition on the underwater target single-view image ambiguity data to generate target motion recognition data.
According to the invention, GPS is used to acquire the target underwater position data accurately and quickly, which reduces positioning error and improves working efficiency. Single-view shooting with the underwater camera array makes it possible to capture a preliminary underwater target image in a complex underwater environment and provides the base data for subsequent processing. The brightness enhancement, geometric transformation and smoothing performed during image preprocessing improve the clarity and stability of the image, so that subsequent analysis is more accurate and reliable. Ambiguity analysis of the single-view image evaluates image quality and identifies images blurred by the underwater environment, which safeguards the accuracy of the subsequent 3D reconstruction. Performing target motion recognition on the ambiguity data distinguishes the motion characteristics of the target, which is important for tracking and recognizing dynamic targets.
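The three preprocessing operations named in step S13 can be illustrated with a minimal pure-Python sketch on a grayscale image stored as a list of rows. The patent does not specify the operators, so gamma correction (for brightness enhancement), nearest-neighbour resizing (for geometric transformation) and a 3x3 mean filter (for smoothing) are assumptions chosen for illustration, as are all function names and the default `gamma` value.

```python
def enhance_brightness(img, gamma=0.6):
    """Gamma correction on 8-bit gray levels; gamma < 1 brightens dark regions (assumed operator)."""
    return [[round(255 * (p / 255) ** gamma) for p in row] for row in img]


def resize_nearest(img, new_h, new_w):
    """Nearest-neighbour resize, used here as a stand-in geometric transformation."""
    h, w = len(img), len(img[0])
    return [[img[y * h // new_h][x * w // new_w] for x in range(new_w)]
            for y in range(new_h)]


def smooth_mean3(img):
    """3x3 mean filter; border pixels average over the neighbours that exist."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            vals = [img[j][i]
                    for j in range(max(0, y - 1), min(h, y + 2))
                    for i in range(max(0, x - 1), min(w, x + 2))]
            row.append(sum(vals) / len(vals))
        out.append(row)
    return out


def preprocess(img, out_h, out_w):
    """Hypothetical S13 pipeline: brighten, resample to a fixed size, then smooth."""
    return smooth_mean3(resize_nearest(enhance_brightness(img), out_h, out_w))
```

A production system would more likely rely on an image library and on restoration methods designed for underwater scenes; the sketch only shows the order of operations implied by step S13.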
Preferably, step S2 comprises the steps of:
Step S21: performing target optical flow tracking on the standard underwater target single-view shot image according to the target motion recognition data to generate target optical flow tracking data;
Step S22: performing target classification on the standard underwater target single-view shot image using the target optical flow tracking data and the target motion recognition data to generate first underwater target type data and second underwater target type data; determining the target type of the standard underwater target single-view shot image, and outputting the standard underwater target single-view shot image when the target is confirmed to correspond to the first underwater target type data;
Step S23: when the target is confirmed to correspond to the second underwater target type data, performing multi-view shooting compensation on the standard underwater target single-view shot image based on the underwater target single-view image ambiguity data to generate the standard underwater target multi-view shot image.
According to the invention, tracking the target optical flow with the target motion recognition data captures the motion trajectory of the target more accurately and improves the accuracy of the image data. For the second underwater target type data, multi-view shooting compensation overcomes the limitations of single-view shooting and yields more complete target image information; this is particularly important in complex underwater environments, because the additional views improve the quality of the 3D reconstruction. The shooting strategy is adjusted dynamically once the target type is confirmed: for the first underwater target type data, the standard single-view shot image is output directly, which improves processing efficiency; for the second underwater target type data, compensating shooting further optimizes image quality. As a whole, step S2 makes the system more flexible and robust when processing different types of underwater targets, allows it to adapt to changing underwater environments, and provides high-quality image data.
Preferably, performing target classification on the standard underwater target single-view shot image using the target optical flow tracking data and the target motion recognition data includes:
Screening the standard underwater target single-view shot images for adjacent frames to obtain adjacent frame images;
Performing target position offset analysis on the adjacent frame images using the target optical flow tracking data to generate target position offset data;
Performing target motion period analysis on the target position offset data using the target motion recognition data to generate target motion period analysis data; performing motion law exploration on the target motion period analysis data to obtain target motion law exploration data, wherein the target motion law exploration data comprises target motion law data (regular motion) and target motion irregular data (irregular motion);
Classifying the standard underwater target single-view shot image according to the target motion law exploration data: when the target motion law exploration data is the target motion law data, generating first underwater target type data; and when the target motion law exploration data is the target motion irregular data, generating second underwater target type data.
According to the invention, screening the standard underwater target single-view shot images for adjacent frames quickly establishes the relation between adjacent frames during target motion, which reduces the amount of data to be processed and increases processing speed. Target position offset analysis of the adjacent frame images using the target optical flow tracking data produces accurate target position offset data, which captures the displacement of the target from frame to frame and improves the accuracy of motion analysis. Target motion period analysis of the target position offset data using the target motion recognition data identifies the periodic characteristics of the target motion and produces detailed target motion period analysis data, which is the basis for the subsequent motion law exploration. Motion law exploration of the target motion period analysis data identifies whether the target moves regularly or irregularly and produces the target motion law exploration data; this gives a deeper understanding of the motion behavior of the target and improves classification accuracy. Classifying the standard underwater target single-view shot image according to the target motion law exploration data separates regularly moving targets from irregularly moving ones: target motion law data yields first underwater target type data, and target motion irregular data yields second underwater target type data, which improves the accuracy and reliability of target identification. Detailed analysis and classification of the target motion makes its motion characteristics better understood, so that the subsequent image processing and analysis steps can be optimized and the overall processing quality improved.
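As an illustration of the period analysis and regular/irregular split described above, the following sketch classifies a sequence of per-frame target position offsets by the strength of its autocorrelation peak. The patent does not state how periodicity is measured, so the autocorrelation criterion, the `threshold` value of 0.5, the handling of a motionless target and all function names are assumptions.

```python
def autocorrelation(offsets):
    """Normalized (biased) autocorrelation of the mean-removed offsets for lags 0..n//2."""
    n = len(offsets)
    mean = sum(offsets) / n
    centered = [v - mean for v in offsets]
    energy = sum(v * v for v in centered)
    if energy == 0:
        return None  # no motion at all
    return [sum(centered[t] * centered[t + k] for t in range(n - k)) / energy
            for k in range(n // 2 + 1)]


def classify_motion(offsets, threshold=0.5):
    """Return ("first", period_in_frames) for regular motion, ("second", None) otherwise."""
    r = autocorrelation(offsets)
    if r is None:
        # Assumption: a static target is treated like a regularly moving one.
        return ("first", None)
    # Skip the initial lobe around lag 0 so slow drift is not mistaken for periodicity.
    k = 1
    while k < len(r) and r[k] > 0:
        k += 1
    if k >= len(r):
        return ("second", None)
    best = max(range(k, len(r)), key=lambda lag: r[lag])
    if r[best] >= threshold:
        return ("first", best)
    return ("second", None)
```

With this convention, a "first" result corresponds to the first underwater target type data (regular motion) and a "second" result to the second underwater target type data (irregular motion), which is the case that triggers multi-view shooting compensation.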
Preferably, step S23 includes the steps of:
Step S231: when the target is confirmed to correspond to the second underwater target type data, performing target bounding box (boundary frame) identification on the standard underwater target single-view shot image to obtain target bounding box data;
Step S232: performing image core region segmentation on the standard underwater target single-view shot image using the target bounding box data to generate a target core region image;
Step S233: sampling gray values from the target core region image to obtain a one-dimensional signal sequence of the target core region image; performing Hilbert transform effect calculation on the target core region image according to the one-dimensional signal sequence to obtain the underwater target single-view image ambiguity data;
Step S234: performing multi-view shooting compensation on the standard underwater target single-view shot image based on the underwater target single-view image ambiguity data to generate the standard underwater target multi-view shot image.
According to the invention, target bounding box (boundary frame) identification on the standard underwater target single-view shot image determines the boundary of the target accurately and produces target bounding box data that the subsequent processing can rely on. Segmenting the core region of the image with the target bounding box data extracts the target core region image, so that the whole image does not have to be processed, which improves both processing efficiency and precision. Sampling gray values from the target core region image produces a one-dimensional signal sequence that retains the key characteristic information of the core region and provides the data for the ambiguity calculation. The Hilbert transform effect calculation on the one-dimensional signal sequence yields the underwater target single-view image ambiguity data, which quantifies how blurred the image is and serves as the reference for multi-view compensation. Multi-view shooting compensation based on the ambiguity data adds view information when the image is blurred, which improves image sharpness and detail and produces the standard underwater target multi-view shot image. Multi-view shooting compensation thus mitigates the blur of the single-view shot image, improves overall image quality and usability, and provides clearer and more complete data for the subsequent 3D reconstruction and analysis.
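The patent does not spell out how the Hilbert transform of the gray-value sequence is turned into an ambiguity value. The sketch below shows one plausible reading, stated as an assumption: compute the analytic signal of the gray-level differences along a sampled row and use the mean envelope as a sharpness score, so that a lower score indicates a blurrier core region. The function names and the scoring rule are hypothetical; the analytic signal itself follows the standard one-sided spectrum construction.

```python
import cmath
import math


def analytic_signal(x):
    """Analytic signal of a real sequence via a direct DFT (O(n^2), fine for short rows)."""
    n = len(x)
    spectrum = [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
                for k in range(n)]
    # One-sided weighting: keep DC (and Nyquist for even n), double positive frequencies.
    weights = [0.0] * n
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        upper = n // 2
    else:
        upper = (n + 1) // 2
    for k in range(1, upper):
        weights[k] = 2.0
    return [sum(weights[k] * spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)) / n
            for t in range(n)]


def sharpness_score(gray_row):
    """Mean Hilbert envelope of the gray-level differences (assumed ambiguity measure)."""
    diffs = [b - a for a, b in zip(gray_row, gray_row[1:])]
    if not diffs:
        return 0.0
    envelope = [abs(z) for z in analytic_signal(diffs)]
    return sum(envelope) / len(envelope)
```

The imaginary part of `analytic_signal(x)` is the discrete Hilbert transform of `x`. Because the score scales with image contrast, a real system would probably normalize it before comparing images taken under different lighting.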
Preferably, step S234 includes the steps of:
Step S2341: based on the underwater target single-view image ambiguity data, performing multi-view shooting with the underwater camera array and acquiring data with the IMU (inertial measurement unit) device mounted on it to obtain an initial target multi-view image and target inertial measurement data;
Step S2342: performing data preprocessing on the target inertial measurement data to obtain target displacement data and target attitude information data, wherein the data preprocessing comprises denoising and integration;
Step S2343: performing image space positioning compensation on the target displacement data, the target attitude information data and the initial target multi-view image by means of visual-inertial odometry to generate the standard underwater target multi-view shot image.
According to the invention, multi-view shooting and data acquisition with the IMU device on the underwater camera array, driven by the underwater target single-view image ambiguity data, yield a more complete and varied set of initial target multi-view images together with target inertial measurement data, which enriches the image information and improves its accuracy. Preprocessing the target inertial measurement data by denoising and integration reduces noise interference and yields more accurate target displacement data and target attitude information data, which serve as high-precision reference data for the subsequent image space positioning compensation. Visual-inertial odometry combines the target displacement data, the target attitude information data and the initial target multi-view image to perform image space positioning compensation, so that the position and attitude of the image at each view can be computed accurately and the standard underwater target multi-view shot image generated. Image space positioning compensation corrects the positional deviation caused by camera motion and target motion during multi-view shooting, so that the resulting standard multi-view shot images are more consistent and accurate and the error in image reconstruction is reduced. Combining multi-view shooting with space positioning compensation improves the sharpness and accuracy of the multi-view images and reduces image blur and distortion.
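The denoising and integration of step S2342 can be sketched for a single acceleration axis as follows. The moving-average filter, the trapezoidal rule and the function names are assumptions made for illustration; the sketch deliberately leaves out gravity compensation, gyroscope integration for attitude, and the drift correction that a full visual-inertial odometry pipeline would supply.

```python
def moving_average(samples, window=3):
    """Centered moving average; the window shrinks at the ends of the sequence."""
    half = window // 2
    out = []
    for i in range(len(samples)):
        lo, hi = max(0, i - half), min(len(samples), i + half + 1)
        out.append(sum(samples[lo:hi]) / (hi - lo))
    return out


def integrate_trapezoid(samples, dt, initial=0.0):
    """Cumulative trapezoidal integral with the same length as the input."""
    out = [initial]
    for prev, cur in zip(samples, samples[1:]):
        out.append(out[-1] + 0.5 * (prev + cur) * dt)
    return out


def displacement_from_acceleration(acc, dt):
    """Denoise one acceleration axis, then integrate twice to get displacement."""
    smoothed = moving_average(acc)
    velocity = integrate_trapezoid(smoothed, dt)
    return integrate_trapezoid(velocity, dt)
```

For a constant acceleration of 2 m/s² sampled every 0.1 s for one second, the final displacement is 1 m, matching the closed form a·t²/2. With real IMU data, double integration accumulates drift quickly, which is one reason the displacement is fused with image measurements rather than used on its own.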
Preferably, step S3 comprises the steps of:
Step S31: performing target feature point confirmation on the standard underwater target multi-view shot image and the standard underwater target single-view shot image to obtain multi-view image target feature point data and single-view image target feature point data;
Step S32: performing image superposition on the standard underwater target multi-view shot image and the standard underwater target single-view shot image to generate a target view superposition image;
Step S33: performing feature point matching on the target view superposition image using the multi-view image target feature point data and the single-view image target feature point data to generate a target matching feature point set;
Step S34: performing depth estimation on the target matching feature point set by using a deep learning network to generate an underwater target depth map; and performing depth fusion reconstruction on the target view superposition image and the underwater target depth map to generate a target underwater fusion depth map.
According to the invention, target feature points are confirmed in the standard underwater target multi-view shot image and the single-view shot image, and the resulting multi-view and single-view target feature point data provide accurate base data for the subsequent image superposition and feature point matching. Superposing the multi-view and single-view shot images integrates the multi-view information into a more complete image, which benefits the subsequent feature point matching and depth estimation. Feature point matching is then performed on the target view superposition image using the target feature point data of the multi-view and single-view images to generate a target matching feature point set. Depth estimation is performed on the target matching feature point set by a deep learning network to generate an underwater target depth map. The deep learning network can be trained on large-scale data and has strong feature extraction and depth estimation capabilities, which improves the accuracy and robustness of depth estimation. Depth fusion reconstruction of the target view superposition image and the underwater target depth map generates the target underwater fusion depth map; this combines multi-view and depth information, provides more accurate and complete 3D structure information, and improves the quality of the 3D reconstruction. Combining multi-view shooting with depth estimation improves the quality and precision of the 3D reconstruction, produces a more realistic and detailed 3D model of the underwater target, and provides high-quality data for subsequent analysis and application.
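The feature point matching of step S33 is not tied to a particular algorithm in the text. One common choice, shown here purely as an assumption, is mutual nearest-neighbour matching on descriptor vectors: a pair is kept only if each descriptor is the other's closest match. The descriptor representation (plain lists of numbers) and the function names are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two descriptor vectors of equal length."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def nearest(query, candidates):
    """Index of the candidate descriptor closest to the query."""
    return min(range(len(candidates)),
               key=lambda j: squared_distance(query, candidates[j]))


def match_features(single_view_desc, multi_view_desc):
    """Mutual nearest-neighbour matches as (single_view_index, multi_view_index) pairs."""
    if not single_view_desc or not multi_view_desc:
        return []
    matches = []
    for i, descriptor in enumerate(single_view_desc):
        j = nearest(descriptor, multi_view_desc)
        # Keep the pair only if the match is also the best one in the reverse direction.
        if nearest(multi_view_desc[j], single_view_desc) == i:
            matches.append((i, j))
    return matches
```

The mutual check discards ambiguous correspondences at the cost of fewer matches, which tends to matter in low-texture underwater scenes where many descriptors look alike.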
Preferably, step S34 includes the steps of:
Step S341: screening feature point motion trajectories in the target matching feature point set to obtain an image feature point trajectory set;
Step S342: constructing a depth estimation network; predicting pixel depth values for the image feature point trajectory set by using the depth estimation network to generate an underwater target preliminary depth map;
Step S343: performing multi-view depth map fusion on the underwater target preliminary depth map by using the target view superposition image to generate the target underwater fusion depth map.
According to the invention, screening feature point motion trajectories in the target matching feature point set yields an image feature point trajectory set that reflects the motion of the target across views more accurately, which improves the accuracy of depth estimation. A depth estimation network is constructed and used to predict pixel depth values for the image feature point trajectory set, generating an underwater target preliminary depth map; the network makes effective use of the trajectory information and thereby improves the accuracy of depth prediction. The preliminary depth map provides initial depth information for the target and is the starting point for the subsequent multi-view fusion. Multi-view depth map fusion is then performed on the underwater target preliminary depth map using the target view superposition image to generate the target underwater fusion depth map. Multi-view fusion combines the depth information of all views, reduces the error of single-view depth estimation, and improves the accuracy and reliability of the depth map. It also copes better with the complex lighting and target motion found underwater, so the resulting target underwater fusion depth map has higher precision and clarity. The high-precision target underwater fusion depth map supplies more accurate depth information to the subsequent 3D reconstruction, which improves its quality and precision and produces a more realistic and detailed 3D model of the underwater target.
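The multi-view depth map fusion of step S343 could, for example, take a per-pixel median over the depth maps that have a valid value at that pixel. This is an assumption, not the patent's stated rule; the sketch also assumes that the depth maps have already been warped into a common reference view and that invalid pixels are marked with `0.0`.

```python
import statistics


def fuse_depth_maps(depth_maps, invalid=0.0):
    """Per-pixel median over valid depths; pixels with no valid depth stay invalid."""
    height, width = len(depth_maps[0]), len(depth_maps[0][0])
    fused = []
    for y in range(height):
        row = []
        for x in range(width):
            values = [m[y][x] for m in depth_maps if m[y][x] != invalid]
            row.append(statistics.median(values) if values else invalid)
        fused.append(row)
    return fused
```

The median is used instead of the mean so that a single outlier view, for instance one corrupted by motion blur, does not pull the fused depth away from the consensus.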
Preferably, step S4 comprises the steps of:
Step S41: performing three-dimensional point cloud conversion on the target underwater fusion depth map and the standard underwater target multi-view shooting image to generate target underwater three-dimensional point cloud data;
Step S42: carrying out surface reconstruction on the target underwater three-dimensional point cloud data to generate target three-dimensional surface reconstruction data;
Step S43: performing texture mapping on the target three-dimensional surface reconstruction data to generate target three-dimensional texture mapping data;
Step S44: constructing a three-dimensional model based on the target three-dimensional surface reconstruction data and the target three-dimensional texture mapping data to generate a target underwater three-dimensional preliminary model; and performing model rendering on the target underwater three-dimensional preliminary model so as to generate a high-precision target three-dimensional underwater model to execute three-dimensional reconstruction operation of the underwater image.
In the present invention, three-dimensional point cloud conversion is performed on the target underwater fusion depth map and the standard underwater target multi-view shot image, so that the generated target underwater three-dimensional point cloud data accurately reflects the three-dimensional structure of the target and provides a reliable data basis for subsequent surface reconstruction. Surface reconstruction is performed on the target underwater three-dimensional point cloud data, and the generated target three-dimensional surface reconstruction data reproduces the surface details of the target, improving the accuracy and authenticity of the model. Texture mapping is performed on the target three-dimensional surface reconstruction data, and the generated target three-dimensional texture mapping data maps detail information in the shot images onto the surface of the three-dimensional model, enhancing the visual effect and detail of the model. A three-dimensional model is constructed based on the target three-dimensional surface reconstruction data and the target three-dimensional texture mapping data, and the generated target underwater three-dimensional preliminary model reflects both the three-dimensional structure and the texture details of the target, providing a foundation for the final high-precision model. Model rendering is performed on the target underwater three-dimensional preliminary model, and the generated high-precision target three-dimensional underwater model reproduces the three-dimensional form and texture details of the target, improving the visual effect and authenticity of the model.
Through the whole process of step S4, the three-dimensional reconstruction effect for the underwater image is improved, the generated three-dimensional model has higher precision and authenticity, and high-quality three-dimensional data support is provided for subsequent analysis and application.
In the present specification, there is provided a deep learning-based underwater image three-dimensional reconstruction system for performing the above-described deep learning-based underwater image three-dimensional reconstruction method, the deep learning-based underwater image three-dimensional reconstruction system comprising:
The target identification module is used for acquiring target underwater position data; performing single-view preliminary shooting according to the underwater position data of the target to obtain a preliminary shooting image of the underwater target; performing image ambiguity analysis on the primary shot image of the underwater target to generate single-view image ambiguity data of the underwater target; performing target motion recognition on the underwater target single-view image ambiguity data to generate target motion recognition data;
The multi-view compensation module is used for carrying out target classification on the standard underwater target single-view shot image according to the target motion identification data to generate first underwater target type data and second underwater target type data; performing multi-view shooting compensation on the standard underwater target single-view shooting image based on the first underwater target type data and the second underwater target type data, so as to generate a standard underwater target multi-view shooting image;
The depth estimation module is used for carrying out image superposition on the standard underwater target multi-view shot image and the standard underwater target single-view shot image to generate a target view superposition image; performing depth estimation on the target visual angle superposition image to generate an underwater target depth map; performing depth fusion reconstruction on the target visual angle superposition image and the underwater target depth map to generate a target underwater fusion depth map;
The three-dimensional reconstruction module is used for carrying out three-dimensional point cloud conversion on the target underwater fusion depth map and the standard underwater target multi-view shooting image to generate target underwater three-dimensional point cloud data; constructing a three-dimensional model of the target underwater three-dimensional point cloud data to generate a target underwater three-dimensional preliminary model; and performing model rendering on the target underwater three-dimensional preliminary model so as to generate a high-precision target three-dimensional underwater model to execute three-dimensional reconstruction operation of the underwater image.
The invention has the following beneficial effects. Acquiring the target underwater position data and carrying out single-view preliminary shooting and image ambiguity analysis provides preliminary image data and a sharpness analysis, laying a foundation for subsequent processing. Classifying the image by using the target motion identification data generates different underwater target type data, and the multi-view shooting compensation that follows captures the details and motion characteristics of the target more comprehensively, improving the information richness and accuracy of the image. Superposing the standard underwater target multi-view shot image and the single-view shot image, performing depth estimation, and performing depth fusion reconstruction generates a high-quality underwater target depth map; these steps improve depth information acquisition and image quality in underwater scenes. Performing three-dimensional point cloud conversion and model construction on the target underwater fusion depth map and the multi-view shot image generates a high-precision target underwater three-dimensional model, and the accompanying model rendering improves the accuracy and visual effect of the three-dimensional reconstruction operation. The method therefore improves the comprehensiveness and accuracy of three-dimensional image reconstruction by carrying out image blur compensation, multi-view shooting, and accurate depth estimation on the underwater target.
Drawings
FIG. 1 is a schematic flow chart of the steps of a three-dimensional reconstruction method of an underwater image based on deep learning;
FIG. 2 is a flowchart illustrating the detailed implementation of step S2 in FIG. 1;
FIG. 3 is a flowchart illustrating the detailed implementation of step S3 in FIG. 1;
FIG. 4 is a flowchart illustrating the detailed implementation of step S4 in FIG. 1;
The achievement of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
The following is a clear and complete description of the technical method of the present invention, taken in conjunction with the accompanying drawings, and it is evident that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present invention without making any inventive effort, are intended to fall within the scope of the present invention.
Furthermore, the drawings are merely schematic illustrations of the present invention and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and thus a repetitive description thereof will be omitted. Some of the block diagrams shown in the figures are functional entities and do not necessarily correspond to physically or logically separate entities. The functional entities may be implemented in software, in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
It will be understood that, although the terms "first," "second," etc. may be used herein to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element from another element. For example, a first element could be termed a second element, and, similarly, a second element could be termed a first element, without departing from the scope of example embodiments, with the term "and/or" as used herein including any and all combinations of one or more of the associated items listed.
To achieve the above objective, and referring to FIG. 1 to FIG. 4, the present invention provides a deep learning-based three-dimensional reconstruction method for underwater images, the method comprising the following steps:
Step S1: acquiring target underwater position data; performing single-view preliminary shooting according to the underwater position data of the target to obtain a preliminary shooting image of the underwater target; performing image ambiguity analysis on the primary shot image of the underwater target to generate single-view image ambiguity data of the underwater target; performing target motion recognition on the underwater target single-view image ambiguity data to generate target motion recognition data;
Step S2: performing target classification on the standard underwater target single-view shot image according to the target motion identification data to generate first underwater target type data and second underwater target type data; performing multi-view shooting compensation on the standard underwater target single-view shooting image based on the first underwater target type data and the second underwater target type data, so as to generate a standard underwater target multi-view shooting image;
Step S3: performing image superposition on a standard underwater target multi-view shot image and a standard underwater target single-view shot image to generate a target view superposition image; performing depth estimation on the target visual angle superposition image to generate an underwater target depth map; performing depth fusion reconstruction on the target visual angle superposition image and the underwater target depth map to generate a target underwater fusion depth map;
Step S4: performing three-dimensional point cloud conversion on the target underwater fusion depth map and the standard underwater target multi-view shooting image to generate target underwater three-dimensional point cloud data; constructing a three-dimensional model of the target underwater three-dimensional point cloud data to generate a target underwater three-dimensional preliminary model; and performing model rendering on the target underwater three-dimensional preliminary model so as to generate a high-precision target three-dimensional underwater model to execute three-dimensional reconstruction operation of the underwater image.
In the present invention, the underwater target single-view image ambiguity data and the target motion identification data are generated by acquiring the target underwater position data and carrying out single-view preliminary shooting and image ambiguity analysis; these data provide a basis for subsequent depth estimation and multi-view shooting compensation. The image is classified by using the target motion identification data to generate the first underwater target type data and the second underwater target type data, and multi-view shooting compensation is carried out, which increases the amount of information in, and the accuracy of, the underwater target image. The standard underwater target multi-view shot image and the single-view shot image are superposed, and depth estimation and depth fusion reconstruction are carried out to generate a high-quality target underwater fusion depth map; these steps improve the accuracy of depth information acquisition and image reconstruction in underwater scenes. Target underwater three-dimensional point cloud data and a preliminary three-dimensional model are generated by carrying out three-dimensional point cloud conversion and model construction on the target underwater fusion depth map and the multi-view shot image. Model rendering is then performed on the preliminary model to generate a high-precision target three-dimensional underwater model, providing a detailed and realistic three-dimensional representation for the three-dimensional reconstruction operation of the underwater image. The method therefore improves the comprehensiveness and accuracy of three-dimensional image reconstruction by carrying out image blur compensation, multi-view shooting, and accurate depth estimation on the underwater target.
In the embodiment of the present invention, referring to FIG. 1, which is a schematic flow chart of the steps of the deep learning-based three-dimensional reconstruction method for underwater images of the present invention, the method in this example includes the following steps:
Step S1: acquiring target underwater position data; performing single-view preliminary shooting according to the underwater position data of the target to obtain a preliminary shooting image of the underwater target; performing image ambiguity analysis on the primary shot image of the underwater target to generate single-view image ambiguity data of the underwater target; performing target motion recognition on the underwater target single-view image ambiguity data to generate target motion recognition data;
In the embodiment of the invention, the precise coordinates of the underwater position of the target are determined by using sonar positioning equipment or an underwater positioning system. The position data may be obtained by a combination of GPS and underwater sensors, or may be acquired by an underwater robot (ROV/AUV). On the basis of the acquired target underwater position data, underwater imaging equipment (such as an underwater camera or a camera carried by an ROV) is used to shoot the target from a single view angle. The imaging equipment is kept stable during shooting, and shake caused by water flow or other factors is reduced as much as possible. After shooting is completed, the captured underwater target images are collected as the basic data for subsequent processing. Ambiguity (blur) analysis is then performed on the captured image by using an image processing algorithm. The degree of blurring of the image can be evaluated by using the Laplacian operator, edge detection, and similar methods. Ambiguity data reflecting the degree of image blur is generated by calculating the gradient magnitude or the frequency-domain characteristics of the image. The resulting ambiguity data is stored as structured data, which may contain specific blur values, the locations of blurred areas, and a sharpness assessment of the image as a whole. The ambiguity data is analyzed by using a deep learning algorithm to identify target motion in the image. A convolutional neural network (CNN) or a recurrent neural network (RNN) may be used to analyze the image sequence to determine whether a target in the image is moving and, if so, its direction and speed of motion.
The identified target motion information is organized into data including the motion trajectory, speed, and direction; this data is used in subsequent multi-view image registration and three-dimensional reconstruction to improve the accuracy and consistency of the reconstruction.
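The Laplacian-based blur evaluation mentioned above can be sketched as follows. This is a minimal illustration that assumes a grayscale image held in a NumPy array and uses the variance of a 4-neighbour discrete Laplacian as the blur score (a lower variance indicates a blurrier image). The function names, the 3x3 box blur used to produce a test image, and the `ambiguity_record` layout are hypothetical and are not taken from the specification.

```python
import numpy as np

def laplacian_variance(gray):
    """Variance of the 4-neighbour discrete Laplacian over interior pixels."""
    g = np.asarray(gray, dtype=np.float64)
    lap = (g[:-2, 1:-1] + g[2:, 1:-1] + g[1:-1, :-2] + g[1:-1, 2:]
           - 4.0 * g[1:-1, 1:-1])
    return float(lap.var())

def box_blur3(gray):
    """3x3 mean filter with edge replication, used only to simulate blur."""
    g = np.asarray(gray, dtype=np.float64)
    p = np.pad(g, 1, mode="edge")
    h, w = g.shape
    acc = np.zeros_like(g)
    for dy in range(3):
        for dx in range(3):
            acc += p[dy:dy + h, dx:dx + w]
    return acc / 9.0

rng = np.random.default_rng(0)
sharp = rng.integers(0, 256, size=(64, 64)).astype(np.float64)
blurred = box_blur3(sharp)

# Hypothetical structured record of the ambiguity data for one image pair.
ambiguity_record = {
    "sharp_score": laplacian_variance(sharp),
    "blurred_score": laplacian_variance(blurred),
}
```

The score has no absolute scale, so in practice a threshold would have to be calibrated on images from the same camera and water conditions.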
Step S2: performing target classification on the standard underwater target single-view shot image according to the target motion identification data to generate first underwater target type data and second underwater target type data; performing multi-view shooting compensation on the standard underwater target single-view shooting image based on the first underwater target type data and the second underwater target type data, so as to generate a standard underwater target multi-view shooting image;
In the embodiment of the present invention, the targets in the single-view shot image are classified by using the target motion recognition data generated in step S1. Deep learning algorithms (e.g., convolutional neural networks, CNNs) may be used for target detection and classification. By training a predefined target classification model, targets in the image can be classified into different types. For example, the model may classify the targets into two categories, "stationary target" and "moving target", or into more refined categories. The classified target data is stored as first underwater target type data and second underwater target type data, respectively. The first underwater target type data may represent a stationary target, and the second underwater target type data may represent a moving target or another category defined according to particular needs. For stationary targets (first underwater target type data), conventional multi-view shooting may be performed, ensuring that enough images are shot from different angles for three-dimensional reconstruction. For moving targets (second underwater target type data), multi-view shooting compensation is required: based on the motion trajectory and speed data of the target, the shooting time and angle are adjusted, and images of the target from a plurality of view angles are shot at different time points. Multi-view shooting compensation may be implemented by using a multi-camera array or by controlling the motion path of the ROV/AUV. The image set obtained by multi-view shooting and compensation is organized into standard underwater target multi-view shot images, which are used in the subsequent three-dimensional reconstruction process. Good registration and consistency between the images are ensured in order to improve the accuracy of the three-dimensional reconstruction.
Step S3: performing image superposition on a standard underwater target multi-view shot image and a standard underwater target single-view shot image to generate a target view superposition image; performing depth estimation on the target visual angle superposition image to generate an underwater target depth map; performing depth fusion reconstruction on the target visual angle superposition image and the underwater target depth map to generate a target underwater fusion depth map;
In the embodiment of the invention, the multi-view shot image and the single-view shot image are aligned by using an image registration technique. Image registration may be performed by using feature matching algorithms (e.g., SIFT, SURF) or directly with a deep learning model (e.g., an image registration model based on a convolutional neural network). After all the images are registered, target view superposition images are generated, which contain target information shot from different angles and viewpoints. Depth estimation is performed on the superposition images by using a deep learning model. Common methods include single-view depth estimation based on convolutional neural networks (e.g., U-Net, ResNet) and multi-view depth estimation (e.g., stereo matching, Structure-from-Motion). The depth estimation produces a depth map of the underwater target that represents the depth corresponding to each pixel in the image. The target view superposition image is then combined with the underwater target depth map, and depth fusion reconstruction is carried out. A depth fusion algorithm (e.g., a 3D reconstruction algorithm based on a deep neural network) may be used to integrate the image information and the depth information. The depth fusion reconstruction generates a target underwater fusion depth map, which reflects the three-dimensional structure and shape of the target.
Step S4: performing three-dimensional point cloud conversion on the target underwater fusion depth map and the standard underwater target multi-view shooting image to generate target underwater three-dimensional point cloud data; constructing a three-dimensional model of the target underwater three-dimensional point cloud data to generate a target underwater three-dimensional preliminary model; and performing model rendering on the target underwater three-dimensional preliminary model so as to generate a high-precision target three-dimensional underwater model to execute three-dimensional reconstruction operation of the underwater image.
In the embodiment of the invention, three-dimensional point cloud data is generated by using the target underwater fusion depth map and the multi-view images. The depth map may be converted into three-dimensional point cloud data by using computer vision techniques such as back-projection. By converting the depth information of each pixel into three-dimensional space coordinates (x, y, z), point cloud data representing the three-dimensional structure of the underwater target is generated. A three-dimensional preliminary model of the target is then generated from the three-dimensional point cloud data. Point cloud processing and three-dimensional modeling techniques may be used, such as surface reconstruction algorithms (e.g., Poisson surface reconstruction or the Marching Cubes algorithm). The generated three-dimensional preliminary model is rendered to improve its visual effect. By adding effects such as textures, illumination, and shadows with rendering techniques from computer graphics, a high-precision three-dimensional underwater model can be generated.
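The back-projection of a depth map into (x, y, z) coordinates can be sketched as follows. This minimal example assumes an ideal pinhole camera with intrinsics `fx`, `fy`, `cx`, `cy`, which are illustrative parameters not given in the specification, and it ignores refraction at the underwater housing, which a real system would need to calibrate for.

```python
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project a depth map to an (N, 3) array of camera-frame points.

    Pixels with non-finite or non-positive depth are skipped.
    """
    d = np.asarray(depth, dtype=np.float64)
    h, w = d.shape
    # u is the column index (x direction), v is the row index (y direction).
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    valid = np.isfinite(d) & (d > 0)
    z = d[valid]
    x = (u[valid] - cx) * z / fx
    y = (v[valid] - cy) * z / fy
    return np.column_stack([x, y, z])

depth = np.array([[2.0, 0.0],
                  [2.0, 4.0]])
points = depth_to_point_cloud(depth, fx=2.0, fy=2.0, cx=0.0, cy=0.0)
```

The resulting array can be handed to a surface reconstruction routine; colours for texture mapping could be attached by sampling the registered multi-view images at the same valid pixels.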
Preferably, step S1 comprises the steps of:
Step S11: acquiring target underwater position data by using a GPS;
Step S12: performing single-view preliminary shooting by utilizing an underwater camera array according to the underwater position data of the target to obtain an underwater target preliminary shooting image;
Step S13: performing image preprocessing on the primary shooting image of the underwater target to obtain a standard single-view shooting image of the underwater target, wherein the image preprocessing comprises image brightness enhancement, image geometric transformation and image smoothing;
Step S14: performing image ambiguity analysis on a standard underwater target single-view shot image to generate underwater target single-view image ambiguity data; and carrying out target motion recognition on the underwater target single-view image ambiguity data to generate target motion recognition data.
In the embodiment of the invention, precise position coordinates on the water surface are acquired by using a GPS receiver, and these coordinates serve as the position reference for the underwater target. Underwater sonar equipment (such as multi-beam sonar) scans below the surface, and the precise underwater position of the target is determined in combination with the GPS coordinates of the water surface. According to the determined target underwater position, the underwater camera array is deployed at a suitable position and angle so that the target area is covered. The underwater camera array is then started to carry out single-view shooting, with every camera shooting at the same time point, so that a group of underwater target preliminary shot images is obtained. The image brightness is enhanced by using histogram equalization or contrast-limited adaptive histogram equalization (CLAHE), which improves the visual effect of the image. The images are geometrically corrected by using an affine or perspective transformation so that they are correctly aligned. The image is smoothed with a Gaussian filter to suppress noise, at the cost of some fine detail. The image blur is analyzed by using the Laplacian operator to generate ambiguity data. Target motion recognition is performed by using a deep learning algorithm or a traditional image processing technique to generate target motion recognition data.
As an example of the present invention, referring to FIG. 2, step S2 in this example includes:
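Two of the preprocessing operations described above, brightness enhancement and smoothing, can be sketched as follows. This is a simplified illustration under stated assumptions: it uses global histogram equalization rather than CLAHE, a separable Gaussian kernel truncated at three standard deviations, and 8-bit grayscale input. The geometric correction step is omitted, and the function names are hypothetical.

```python
import numpy as np

def equalize_histogram(gray_u8):
    """Global histogram equalization for an 8-bit grayscale image."""
    g = np.asarray(gray_u8, dtype=np.uint8)
    hist = np.bincount(g.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]
    denom = g.size - cdf_min
    if denom == 0:
        return g.copy()  # single gray level: nothing to stretch
    lut = np.clip(np.round((cdf - cdf_min) / denom * 255.0), 0, 255)
    return lut.astype(np.uint8)[g]

def gaussian_smooth(gray, sigma=1.0):
    """Separable Gaussian smoothing with edge replication."""
    radius = max(1, int(round(3 * sigma)))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    kernel /= kernel.sum()
    padded = np.pad(np.asarray(gray, dtype=np.float64), radius, mode="edge")
    # Filter along rows, then along columns.
    rows = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="valid"), 0, rows)

# A dim, low-contrast strip with gray levels 100..120.
narrow = np.tile(np.arange(100, 121, dtype=np.uint8), (4, 1))
enhanced = equalize_histogram(narrow)
smoothed = gaussian_smooth(enhanced, sigma=1.0)
```

Smoothing before the blur analysis lowers any gradient-based sharpness score, so the same preprocessing settings would need to be applied to every image whose ambiguity data is compared.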
Step S21: performing target optical flow tracking on the standard underwater target single-view shot image according to the target motion identification data to generate target optical flow tracking data;
Step S22: performing target classification on a standard underwater target single-view shot image through target optical flow tracking data and target motion identification data to generate first underwater target type data and second underwater target type data; judging the target type of the standard underwater target single-view shot image, and outputting the standard underwater target single-view shot image when the standard underwater target single-view shot image is confirmed to be the first underwater target type data;
Step S23: when the target is confirmed to be the second underwater target type data, multi-view shooting compensation is carried out on the standard underwater target single-view shooting image based on the underwater target single-view image ambiguity data, so that the standard underwater target multi-view shooting image is generated.
In the embodiment of the invention, the motion trajectory of the target is recognized by means of a sensor (such as an inertial measurement unit (IMU)) and image processing technology. The motion state of the target is analyzed in real time by using a deep learning algorithm, such as a convolutional neural network (CNN) combined with a recurrent neural network (RNN). The motion of the target in the image is tracked by using an optical flow algorithm, such as the Lucas-Kanade or the Farneback optical flow algorithm, and the optical flow vector of the target is calculated from the pixel changes between successive frames. The optical flow vector data is combined with the target motion recognition data to generate complete target optical flow tracking data, which includes the displacement, velocity, and direction of the target at different time points. The standard underwater target single-view shot image is classified according to the optical flow tracking data and the target motion recognition data by using a classification algorithm (such as a support vector machine (SVM) or a convolutional neural network (CNN)). The classification algorithm divides the targets into a first underwater target type and a second underwater target type. The first type is a static target and the second type is a dynamic (moving) target. The target type of each single-view image is judged to confirm whether the image belongs to the first type or the second type. If the image is confirmed to be the first underwater target type data, the original single-view image is output. An image quality evaluation algorithm (such as the Laplacian operator) is used to evaluate the blur of the image and obtain ambiguity data. Based on the ambiguity data, the images that require multi-view compensation are determined.
The target is re-shot from different angles by using a multi-view shooting technique (such as structured light or stereoscopic vision) to make up for the shortcomings of single-view shooting. Standard underwater target multi-view shot images are generated through image stitching and fusion algorithms (such as a multi-view geometric reconstruction algorithm); these images provide more comprehensive and detailed target information and improve the accuracy and quality of the three-dimensional reconstruction.
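The optical flow tracking described in this embodiment can be illustrated with a single-window Lucas-Kanade estimate. The sketch below assumes one rigid translation for the whole window, small (sub-pixel to about one pixel) displacement, and a smooth synthetic blob as the target; production code would more likely use a pyramidal, per-feature implementation from an existing library. The helper names and the frame interval `dt` are hypothetical.

```python
import numpy as np

def lucas_kanade_shift(prev, curr):
    """Least-squares solution of Ix*dx + Iy*dy = -It over the whole window."""
    p = np.asarray(prev, dtype=np.float64)
    c = np.asarray(curr, dtype=np.float64)
    iy, ix = np.gradient(p)  # derivatives along rows (y) and columns (x)
    it = c - p               # temporal derivative between the two frames
    a = np.column_stack([ix.ravel(), iy.ravel()])
    flow, *_ = np.linalg.lstsq(a, -it.ravel(), rcond=None)
    return float(flow[0]), float(flow[1])

def motion_from_flow(dx, dy, dt):
    """Convert a pixel displacement into speed and direction."""
    return {
        "displacement": (dx, dy),
        "speed": float(np.hypot(dx, dy)) / dt,          # pixels per second
        "direction_deg": float(np.degrees(np.arctan2(dy, dx))),
    }

def blob(cx, cy, size=64, sigma=6.0):
    """Synthetic smooth target centred at (cx, cy)."""
    y, x = np.mgrid[0:size, 0:size]
    return np.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2.0 * sigma ** 2))

# The target moves 0.5 pixel to the right between two frames.
dx, dy = lucas_kanade_shift(blob(30.0, 32.0), blob(30.5, 32.0))
track = motion_from_flow(dx, dy, dt=0.04)
```

Accumulating such per-frame records over time gives the displacement, velocity, and direction series that the classification step consumes.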
Preferably, the object classification of the standard underwater object single view shot image by the object optical flow tracking data and the object motion recognition data includes:
Screening adjacent frame images of the standard underwater target single-view shot images to obtain the adjacent frame images;
Performing target position offset analysis on the adjacent frame images through the target optical flow tracking data to generate target position offset data;
Performing target motion period analysis on the target position offset data by utilizing the target motion identification data to generate target motion period analysis data; performing motion law exploration on the target motion cycle analysis data to obtain target motion law exploration data, wherein the target motion law exploration data comprises target motion law data and target motion irregular data;
Classifying targets of the standard underwater target single-view shot images according to the target motion law exploration data, and generating first underwater target type data when the target motion law exploration data is the target motion law data; and generating second underwater target type data when the target motion law exploration data is the target motion irregular data.
In an embodiment of the invention, adjacent frames of the standard underwater target single-view shot image are selected based on the target motion identification data; these are typically images that are continuous in time and related to the target motion. The adjacent frame images are processed with an optical flow algorithm to calculate the optical flow vectors of the target between successive frames, which represent the motion trajectory and speed of the target over time. The target position offset is analyzed from the target optical flow tracking data, which describes the position change of the target between successive frames. The target position offset data is then analyzed for periodicity by using the target motion identification data, which includes identifying and analyzing any periodic motion pattern of the target, such as periodic oscillation or repetitive motion. Based on the target motion period analysis data, the motion law of the target is explored, and the motion behavior of the target is divided into two classes, regular and irregular. Classification then follows the target motion law exploration data: if the target exhibits a clear motion law (e.g., periodic motion), the single-view shot image is classified as first underwater target type data; if the target exhibits irregular motion (such as random motion or motion with no apparent period), the single-view shot image is classified as second underwater target type data.
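The period analysis and the regular/irregular split can be sketched with a normalized autocorrelation of the per-frame position offsets. This is an assumed stand-in for the motion period analysis, not the analysis defined by the invention: the score is the highest autocorrelation peak after the first negative lag, the 0.5 threshold is a hypothetical value, and treating a constant offset sequence as regular is likewise an assumption.

```python
import numpy as np

def periodicity_score(offsets):
    """Highest normalized autocorrelation peak beyond the first negative lag."""
    x = np.asarray(offsets, dtype=np.float64)
    x = x - x.mean()
    energy = float(np.dot(x, x))
    if energy < 1e-12:
        return 1.0  # assumption: a constant offset counts as regular motion
    ac = np.correlate(x, x, mode="full")[x.size - 1:] / energy
    negative = np.flatnonzero(ac < 0)
    if negative.size == 0:
        return 0.0  # no oscillation found within the observed window
    return float(ac[negative[0]:].max())

def classify_target(offsets, threshold=0.5):
    """Return 'first' for regular motion and 'second' for irregular motion."""
    return "first" if periodicity_score(offsets) >= threshold else "second"

t = np.arange(200)
regular = 5.0 * np.sin(2.0 * np.pi * t / 20.0)          # sway with period 20
irregular = np.random.default_rng(0).normal(size=200)   # erratic offsets

regular_type = classify_target(regular)
irregular_type = classify_target(irregular)
```

The lag at which the peak occurs also gives an estimate of the motion period, which could be used to time the compensating shots.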
Preferably, step S23 includes the steps of:
Step S231: when the target is confirmed to be the second underwater target type data, carrying out target boundary frame identification on the standard underwater target single-view shot image to obtain target boundary frame data;
Step S232: performing image core region segmentation on a standard underwater target single-view shot image through target boundary frame data to generate a target core region image;
Step S233: performing gray value sampling on the target core region image to obtain a one-dimensional signal sequence of the target core region image; performing Hilbert transform effect calculation on the target core region image according to the one-dimensional signal sequence of the target core region image to obtain underwater target single-view image ambiguity data;
Step S234: performing multi-view shooting compensation on the standard underwater target single-view shooting image based on the underwater target single-view image ambiguity data, so as to generate the standard underwater target multi-view shooting image.
In the embodiment of the invention, when the standard underwater target single-view shot image is confirmed to be the second underwater target type data, a target detection algorithm (such as YOLO or Fast R-CNN) is used to identify the target bounding box in the image; the bounding box defines the position and extent of the target in the image. The core region of the target is then extracted from the standard underwater target single-view shot image based on the target bounding box data. An image segmentation algorithm (e.g., a region-growing algorithm or a deep-learning-based semantic segmentation model) may be used to separate the target from the background and generate the target core region image. The target core region image is gray-value sampled and converted into a one-dimensional signal sequence that describes how the gray value varies across the target core region. The ambiguity (blur degree) data is obtained from this one-dimensional signal sequence by the Hilbert transform effect calculation. In this embodiment the Hilbert transform serves to evaluate the sharpness or blur of the image, which is particularly relevant to assessing image quality in underwater environments. Based on the underwater target single-view image ambiguity data, the image parts that need multi-view compensation are determined. Multi-view shooting is then used to re-shoot the blurred regions of the target image from different angles so as to improve the sharpness and detail of the image.
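A minimal sketch of the gray-value sampling and Hilbert-transform step is given below. The text does not state how the transform result is turned into an ambiguity value, so the score used here (the RMS envelope of the gray-level differences, computed from the analytic signal) is an assumption, as are the function names and the `(x, y, w, h)` bounding-box layout.

```python
import numpy as np

def core_region_signal(gray, bbox):
    """Crop bbox = (x, y, w, h) from a gray image and flatten it row by row."""
    x, y, w, h = bbox
    return np.asarray(gray, dtype=float)[y:y + h, x:x + w].ravel()

def analytic_signal(x):
    """Analytic signal x + i*H[x], where H is the Hilbert transform (FFT method)."""
    n = x.size
    spectrum = np.fft.fft(x)
    gain = np.zeros(n)
    gain[0] = 1.0                      # keep the DC term
    if n % 2 == 0:
        gain[n // 2] = 1.0             # keep the Nyquist term
        gain[1:n // 2] = 2.0           # double positive frequencies
    else:
        gain[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * gain)

def sharpness_score(signal):
    """Assumed blur proxy: RMS envelope of gray-level differences (lower = blurrier)."""
    diffs = np.diff(signal)
    envelope = np.abs(analytic_signal(diffs))
    return float(np.sqrt(np.mean(envelope ** 2)))
```

Row-by-row flattening introduces an artificial jump at every row boundary; a column-wise or scan-line-by-scan-line variant may be preferable in practice.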
Preferably, step S234 includes the steps of:
Step S2341: performing multi-view shooting and data acquisition through an IMU (inertial measurement unit) device on an underwater camera array based on single-view image ambiguity data of an underwater target to obtain an initial target multi-view image and target inertial measurement data;
Step S2342: performing data preprocessing on the target inertial measurement data to obtain target displacement data and target attitude information data, wherein the data preprocessing comprises denoising and integration;
Step S2343: performing image space positioning compensation on the target displacement data, the target attitude information data and the initial target multi-view image according to a visual-inertial odometry technique, and generating a standard underwater target multi-view shooting image.
In the embodiment of the invention, the underwater camera array is arranged to perform multi-view shooting based on the underwater target single-view image ambiguity data; the cameras can be placed at different positions and angles to cover multiple views of the target. At the same time, inertial measurement data of the target is acquired in real time by the IMU (inertial measurement unit) device on the underwater camera array. This data comprises accelerometer and gyroscope measurements and is used for subsequent motion state estimation and attitude calculation. The collected target inertial measurement data is preprocessed by denoising and integration. Denoising: sensor noise is removed using digital filters or other signal processing techniques. Integration: the denoised acceleration data is integrated over time (first to velocity, then to displacement) to calculate the displacement data of the target, and the attitude information of the target (such as the rotation angle) is calculated from the gyroscope data. Image space positioning compensation is then performed with a visual-inertial odometry technique, combining the target displacement data, the attitude information data and the initial target multi-view image. The displacement data determines the change in the target's position in space, and the attitude information data corrects the change in the target's pose across images shot at different angles. The initial multi-view shot images are adjusted using the compensated displacement and attitude information to generate the standard underwater target multi-view shooting image, which integrates views from different angles and provides more comprehensive and accurate target information.
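The denoising and integration of step S2342 can be illustrated with a deliberately simplified single-axis sketch. It assumes uniformly sampled data with gravity and sensor bias already removed, uses a moving-average filter as the "digital filter", and integrates with the trapezoidal rule; the function names and the window length are hypothetical.

```python
import numpy as np

def moving_average(x, window=5):
    """Denoise with an odd-length moving average; edges are padded by repetition."""
    pad = window // 2
    padded = np.pad(np.asarray(x, dtype=float), pad, mode="edge")
    return np.convolve(padded, np.ones(window) / window, mode="valid")

def cumulative_trapezoid(y, dt):
    """Running trapezoidal integral of y, starting at zero."""
    y = np.asarray(y, dtype=float)
    out = np.zeros_like(y)
    out[1:] = np.cumsum(0.5 * (y[1:] + y[:-1]) * dt)
    return out

def preprocess_imu(accel, gyro, dt, window=5):
    """Return (displacement, rotation_angle) along/about a single axis."""
    accel_f = moving_average(accel, window)
    gyro_f = moving_average(gyro, window)
    velocity = cumulative_trapezoid(accel_f, dt)       # first integration
    displacement = cumulative_trapezoid(velocity, dt)  # second integration
    angle = cumulative_trapezoid(gyro_f, dt)           # angular rate -> angle
    return displacement, angle
```

Pure integration of IMU data drifts quickly, which is one reason the displacement and attitude estimates are subsequently combined with the images in the visual-inertial odometry step rather than used on their own.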
As an example of the present invention, referring to fig. 3, the step S3 in this example includes:
Step S31: performing target feature point confirmation on the standard underwater target multi-view shot image and the standard underwater target single-view shot image to obtain multi-view image target feature point data and single-view image target feature point data;
Step S32: carrying out image superposition on the standard underwater target multi-view shot image and the standard underwater target single-view shot image to generate a target view superposition image;
Step S33: performing feature point matching on the target view superposition image through the multi-view image target feature point data and the single-view image target feature point data to generate a target matching feature point set;
Step S34: performing depth estimation on the target matching feature point set by using a deep learning network to generate an underwater target depth map; and carrying out depth fusion reconstruction on the target view superposition image and the underwater target depth map to generate a target underwater fusion depth map.
In the embodiment of the invention, feature points are extracted and confirmed in the standard underwater target multi-view shot image and the standard underwater target single-view shot image. These may be key points, corner points or other salient features that describe the target uniquely across different views and images. The standard underwater target multi-view shot image and the standard underwater target single-view shot image are then superposed; this step ensures that images from different views are aligned in the same coordinate system to form the target view superposition image. Feature point matching is performed on the target view superposition image using the multi-view image target feature point data and the single-view image target feature point data. Common approaches include matching algorithms based on feature descriptors such as SIFT or SURF, or on features extracted by a deep learning network. Depth estimation is performed on the target matching feature point set using a deep learning network; the depth distribution of the target in space can be inferred from the positions of the matched feature points and their disparity. Finally, depth fusion reconstruction is performed on the target view superposition image and the depth estimation result (the underwater target depth map); this step combines visual information with depth information to generate a more accurate and finer target underwater fusion depth map.
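As an illustration of descriptor-based feature point matching, the sketch below pairs two descriptor sets by nearest-neighbour distance, applying a ratio test and a mutual-consistency check. It assumes the descriptors have already been computed by some extractor (SIFT, SURF or a learned network) and are supplied as rows of two arrays; the function name and the ratio of 0.8 are assumptions for illustration only.

```python
import numpy as np

def match_descriptors(desc_a, desc_b, ratio=0.8):
    """Return index pairs (i, j) matching rows of desc_a to rows of desc_b.

    desc_b must contain at least two descriptors for the ratio test.
    """
    a = np.asarray(desc_a, dtype=float)
    b = np.asarray(desc_b, dtype=float)
    # Pairwise Euclidean distances, shape (len(a), len(b)).
    dist = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    order = np.argsort(dist, axis=1)
    best_for_b = np.argmin(dist, axis=0)
    matches = []
    for i in range(a.shape[0]):
        j, j2 = order[i, 0], order[i, 1]
        unambiguous = dist[i, j] < ratio * dist[i, j2]   # ratio test
        mutual = best_for_b[j] == i                      # cross-check
        if unambiguous and mutual:
            matches.append((i, int(j)))
    return matches
```

The resulting index pairs play the role of the target matching feature point set; rejecting ambiguous matches matters underwater, where repetitive texture and turbidity tend to produce many near-identical descriptors.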
Preferably, step S34 includes the steps of:
Step S341: screening the feature point motion trajectories of the target matching feature point set to obtain an image feature point trajectory set;
Step S342: constructing a depth estimation network; predicting pixel depth values of the image feature point trajectory set by using the depth estimation network, and generating an underwater target preliminary depth map;
Step S343: carrying out multi-view depth map fusion on the underwater target preliminary depth map by using the target view superposition image to generate a target underwater fusion depth map.
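One simple way to picture the multi-view depth map fusion of step S343 is a per-pixel robust average over depth maps that have already been aligned to a common reference view. The sketch below makes exactly that assumption (alignment is done elsewhere), treats a depth of 0 as "no measurement", and uses the median as the fusion rule; none of these choices is stated in the source.

```python
import warnings
import numpy as np

def fuse_depth_maps(depth_maps):
    """Per-pixel median of aligned depth maps, ignoring invalid (<= 0) samples."""
    stack = np.asarray(depth_maps, dtype=float)     # shape (views, H, W)
    stack = np.where(stack > 0, stack, np.nan)      # mark invalid samples
    with warnings.catch_warnings():
        # Pixels without any valid sample raise an "All-NaN slice" warning.
        warnings.simplefilter("ignore", category=RuntimeWarning)
        fused = np.nanmedian(stack, axis=0)
    return np.nan_to_num(fused, nan=0.0)            # 0 = still unknown
```

The median discards isolated outliers from a single view, at the cost of ignoring any per-view confidence that a learned network might provide.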
In the embodiment of the invention, the motion trajectories of the target matching feature point set are analyzed and screened. The feature points may be salient points in the images at different views, and tracking their trajectories yields information on their continuity in time and space. The depth estimation network is constructed, typically with a deep-learning-based approach such as a convolutional neural network (CNN) or a variant thereof; the network learns to predict pixel-level depth values from the image feature point trajectories. The image feature point trajectory set is processed by the depth estimation network to predict the depth value of each pixel, which reflects the depth information of the target in the multi-view images. With the target view superposition image as reference, multi-view depth map fusion is performed on the underwater target preliminary depth map; this combines the depth information from each view and improves the accuracy and completeness of the depth map. Specifically, a training data set is first collected and prepared. The data set should include underwater multi-view images and their corresponding ground-truth depth values or depth maps. Simulated or actually acquired data may be used, provided that different underwater environments and target types are covered. A suitable deep learning network architecture, such as a convolutional neural network (CNN), is then selected. Common depth estimation networks include the following. Single-view depth estimation networks process a single image and predict the depth value of each pixel. Multi-view depth estimation networks perform depth estimation using image information from multiple views, for example by processing several input images jointly in a convolutional neural network. Depth estimation methods based on autoencoders or generative adversarial networks use the generative capability of the network to produce the depth image.
The depth estimation network is trained using the prepared training data set. A loss function must be defined for training; typically a depth-difference loss or another loss function is used to measure the difference between the depth map generated by the network and the ground-truth depth map. Data augmentation and regularization are applied to improve the generalization and robustness of the network: data augmentation includes operations such as image rotation, scaling and flipping, and regularization can control overfitting by means such as batch normalization and dropout. The network parameters are optimized using gradient descent or a variant thereof to minimize the loss function. Learning rate adjustment strategies and mini-batch training can be employed to improve the convergence speed and stability of the network. During training, the performance of the network on a validation set is monitored, and the network architecture or training parameters are adjusted accordingly.
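To make the "depth difference" loss and the flipping augmentation concrete, the sketch below shows a masked mean absolute error and a horizontal flip applied jointly to an image and its depth map. It is written with NumPy for readability rather than in a training framework, and the convention that a ground-truth depth of 0 marks a missing value is an assumption.

```python
import numpy as np

def masked_l1_depth_loss(pred, target):
    """Mean absolute depth error over pixels with valid ground truth (> 0)."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    valid = target > 0
    if not valid.any():
        return 0.0
    return float(np.abs(pred[valid] - target[valid]).mean())

def flip_pair(image, depth):
    """Horizontal flip applied identically to the image and its depth map."""
    image = np.asarray(image)
    depth = np.asarray(depth)
    return image[:, ::-1].copy(), depth[:, ::-1].copy()
```

Geometric augmentations must always be applied to the image and the depth map together, otherwise the pixel-wise correspondence that the loss relies on is lost.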
As an example of the present invention, referring to fig. 4, the step S4 includes, in this example:
step S41: performing three-dimensional point cloud conversion on the target underwater fusion depth map and the standard underwater target multi-view shooting image to generate target underwater three-dimensional point cloud data;
Step S42: carrying out surface reconstruction on the target underwater three-dimensional point cloud data to generate target three-dimensional surface reconstruction data;
Step S43: performing texture mapping on the target three-dimensional surface reconstruction data to generate target three-dimensional texture mapping data;
Step S44: constructing a three-dimensional model based on the target three-dimensional surface reconstruction data and the target three-dimensional texture mapping data to generate a target underwater three-dimensional preliminary model; and performing model rendering on the target underwater three-dimensional preliminary model so as to generate a high-precision target three-dimensional underwater model to execute three-dimensional reconstruction operation of the underwater image.
In the embodiment of the invention, the depth information is converted into three-dimensional point cloud data using the target underwater fusion depth map and the standard underwater target multi-view shot image; this can be done by reconstructing spatial coordinates from the depth image and storing them in a point cloud format. Surface reconstruction is then performed on the generated target underwater three-dimensional point cloud data. Common surface reconstruction methods include point-cloud-based mesh reconstruction, in which a triangulated mesh surface is generated from the point cloud data to form the precise three-dimensional geometry of the target, and voxel-grid-based methods, in which the point cloud data is converted into a voxel grid and the three-dimensional surface is then reconstructed with a voxel surface extraction algorithm. Texture mapping is performed on the target three-dimensional surface reconstruction data: texture information from the standard underwater target multi-view shot image is mapped onto the three-dimensional surface, which can be achieved by image projection or by otherwise combining the texture information with the three-dimensional geometric model. A three-dimensional model is constructed from the target three-dimensional surface reconstruction data and the texture mapping data by integrating the surface geometry and the texture information to form the target underwater three-dimensional preliminary model. Finally, the target underwater three-dimensional preliminary model is rendered. The rendering process can use ray tracing or real-time rendering to project the three-dimensional model into a two-dimensional image and thereby produce the visual result of the high-precision target three-dimensional underwater model.
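The conversion from a depth map to a point cloud in step S41 can be sketched with the standard pinhole back-projection. The intrinsic parameters `fx`, `fy`, `cx`, `cy` are assumed to come from a prior camera calibration that the text does not describe, refraction at the underwater housing is ignored, and a depth of 0 is again taken to mean "no measurement".

```python
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project a depth map into an (N, 3) array of camera-frame points."""
    depth = np.asarray(depth, dtype=float)
    h, w = depth.shape
    # u indexes columns (x direction), v indexes rows (y direction).
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    valid = depth > 0
    z = depth[valid]
    x = (u[valid] - cx) * z / fx
    y = (v[valid] - cy) * z / fy
    return np.stack([x, y, z], axis=1)
```

Points from several views would additionally have to be transformed into a common world frame using the camera poses before surface reconstruction.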
In the present specification, there is provided a deep learning-based underwater image three-dimensional reconstruction system for performing the above-described deep learning-based underwater image three-dimensional reconstruction method, the deep learning-based underwater image three-dimensional reconstruction system comprising:
The target identification module is used for acquiring target underwater position data; performing single-view preliminary shooting according to the underwater position data of the target to obtain a preliminary shooting image of the underwater target; performing image ambiguity analysis on the primary shot image of the underwater target to generate single-view image ambiguity data of the underwater target; performing target motion recognition on the underwater target single-view image ambiguity data to generate target motion recognition data;
The multi-view compensation module is used for carrying out target classification on the standard underwater target single-view shot image according to the target motion identification data to generate first underwater target type data and second underwater target type data; performing multi-view shooting compensation on the standard underwater target single-view shooting image based on the first underwater target type data and the second underwater target type data, so as to generate a standard underwater target multi-view shooting image;
The depth estimation module is used for carrying out image superposition on the standard underwater target multi-view shot image and the standard underwater target single-view shot image to generate a target view superposition image; performing depth estimation on the target visual angle superposition image to generate an underwater target depth map; performing depth fusion reconstruction on the target visual angle superposition image and the underwater target depth map to generate a target underwater fusion depth map;
The three-dimensional reconstruction module is used for carrying out three-dimensional point cloud conversion on the target underwater fusion depth map and the standard underwater target multi-view shooting image to generate target underwater three-dimensional point cloud data; constructing a three-dimensional model of the target underwater three-dimensional point cloud data to generate a target underwater three-dimensional preliminary model; and performing model rendering on the target underwater three-dimensional preliminary model so as to generate a high-precision target three-dimensional underwater model to execute three-dimensional reconstruction operation of the underwater image.
The beneficial effects of the invention are as follows. Acquiring the underwater position data of the target and performing single-view preliminary shooting and image ambiguity analysis provides the initial image data and a sharpness analysis, which lays the foundation for subsequent processing. Classifying the image using the target motion identification data, generating the different underwater target type data and performing multi-view shooting compensation captures the details and motion characteristics of the target more comprehensively and improves the information richness and accuracy of the images. Superposing the standard underwater target multi-view shot image and the single-view shot image, performing depth estimation and then depth fusion reconstruction generates a high-quality underwater target depth map; these steps improve both the acquisition of depth information and the image quality in underwater scenes. Performing three-dimensional point cloud conversion and model construction on the target underwater fusion depth map and the multi-view shot image, followed by model rendering, generates a high-precision target underwater three-dimensional model and improves the accuracy and visual quality of the three-dimensional reconstruction. The method therefore improves the comprehensiveness and accuracy of three-dimensional image reconstruction by applying image blur compensation, multi-view shooting and accurate depth estimation to the underwater target.
The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein.
The foregoing is only a specific embodiment of the invention to enable those skilled in the art to understand or practice the invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (10)
1. The underwater image three-dimensional reconstruction method based on deep learning is characterized by comprising the following steps of:
Step S1: acquiring target underwater position data; performing single-view preliminary shooting according to the underwater position data of the target to obtain a preliminary shooting image of the underwater target; performing image ambiguity analysis on the primary shot image of the underwater target to generate single-view image ambiguity data of the underwater target; performing target motion recognition on the underwater target single-view image ambiguity data to generate target motion recognition data;
Step S2: performing target classification on the standard underwater target single-view shot image according to the target motion identification data to generate first underwater target type data and second underwater target type data; performing multi-view shooting compensation on the standard underwater target single-view shooting image based on the first underwater target type data and the second underwater target type data, so as to generate a standard underwater target multi-view shooting image;
step S3: performing image superposition on a standard underwater target multi-view shot image and a standard underwater target single-view shot image to generate a target view superposition image; performing depth estimation on the target visual angle superposition image to generate an underwater target depth map; performing depth fusion reconstruction on the target visual angle superposition image and the underwater target depth map to generate a target underwater fusion depth map;
step S4: performing three-dimensional point cloud conversion on the target underwater fusion depth map and the standard underwater target multi-view shooting image to generate target underwater three-dimensional point cloud data; constructing a three-dimensional model of the target underwater three-dimensional point cloud data to generate a target underwater three-dimensional preliminary model; and performing model rendering on the target underwater three-dimensional preliminary model so as to generate a high-precision target three-dimensional underwater model to execute three-dimensional reconstruction operation of the underwater image.
2. The deep learning-based underwater image three-dimensional reconstruction method according to claim 1, wherein the step S1 comprises the steps of:
step S11: acquiring target underwater position data by using a GPS;
Step S12: performing single-view preliminary shooting by utilizing an underwater camera array according to the underwater position data of the target to obtain an underwater target preliminary shooting image;
Step S13: performing image preprocessing on the primary shooting image of the underwater target to obtain a standard single-view shooting image of the underwater target, wherein the image preprocessing comprises image brightness enhancement, image geometric transformation and image smoothing;
Step S14: performing image ambiguity analysis on a standard underwater target single-view shot image to generate underwater target single-view image ambiguity data; and carrying out target motion recognition on the underwater target single-view image ambiguity data to generate target motion recognition data.
3. The deep learning-based underwater image three-dimensional reconstruction method according to claim 1, wherein the step S2 comprises the steps of:
Step S21: performing target optical flow tracking on the standard underwater target single-view shot image according to the target motion identification data to generate target optical flow tracking data;
Step S22: performing target classification on a standard underwater target single-view shot image through target optical flow tracking data and target motion identification data to generate first underwater target type data and second underwater target type data; judging the target type of the standard underwater target single-view shot image, and outputting the standard underwater target single-view shot image when the standard underwater target single-view shot image is confirmed to be the first underwater target type data;
Step S23: when the target is confirmed to be the second underwater target type data, multi-view shooting compensation is carried out on the standard underwater target single-view shooting image based on the underwater target single-view image ambiguity data, so that the standard underwater target multi-view shooting image is generated.
4. The deep learning-based underwater image three-dimensional reconstruction method according to claim 3, wherein performing target classification on the standard underwater target single-view shot image through the target optical flow tracking data and the target motion identification data comprises:
screening adjacent frame images of the standard underwater target single-view shot images to obtain the adjacent frame images;
Performing target position offset analysis on the adjacent frame images through the target optical flow tracking data to generate target position offset data;
Performing target motion period analysis on the target position offset data by utilizing the target motion identification data to generate target motion period analysis data; performing motion law exploration on the target motion cycle analysis data to obtain target motion law exploration data, wherein the target motion law exploration data comprises target motion law data and target motion irregular data;
Classifying the target of the standard underwater target single-view shot image according to the target motion law exploration data: when the target motion law exploration data is the target motion law data, generating first underwater target type data; and when the target motion law exploration data is the target motion irregular data, generating second underwater target type data.
5. The deep learning-based underwater image three-dimensional reconstruction method according to claim 3, wherein the step S23 comprises the steps of:
Step S231: when the target is confirmed to be the second underwater target type data, carrying out target boundary frame identification on the standard underwater target single-view shot image to obtain target boundary frame data;
Step S232: performing image core region segmentation on a standard underwater target single-view shot image through target boundary frame data to generate a target core region image;
Step S233: performing gray value sampling on the target core region image to obtain a one-dimensional signal sequence of the target core region image; performing Hilbert transform effect calculation on the target core region image according to the one-dimensional signal sequence of the target core region image to obtain underwater target single-view image ambiguity data;
step S234: and performing multi-view shooting compensation on the standard underwater target single-view shooting image based on the underwater target single-view image ambiguity data, so as to generate the standard underwater target multi-view shooting image.
6. The deep learning-based underwater image three-dimensional reconstruction method according to claim 5, wherein the step S234 comprises the steps of:
Step S2341: performing multi-view shooting and data acquisition through an IMU (inertial measurement unit) device on an underwater camera array based on single-view image ambiguity data of an underwater target to obtain an initial target multi-view image and target inertial measurement data;
Step S2342: performing data preprocessing on the target inertial measurement data to obtain target displacement data and target attitude information data, wherein the data preprocessing comprises denoising and integration;
Step S2343: performing image space positioning compensation on the target displacement data, the target attitude information data and the initial target multi-view image according to a visual-inertial odometry technique, and generating a standard underwater target multi-view shooting image.
7. The deep learning-based underwater image three-dimensional reconstruction method according to claim 1, wherein the step S3 comprises the steps of:
Step S31: performing target feature point confirmation on the standard underwater target multi-view shot image and the standard underwater target single-view shot image to obtain multi-view image target feature point data and single-view image target feature point data;
Step S32: carrying out image superposition on the standard underwater target multi-view shot image and the standard underwater target single-view shot image to generate a target view superposition image;
Step S33: performing feature point matching on the target view superposition image through the multi-view image target feature point data and the single-view image target feature point data to generate a target matching feature point set;
Step S34: performing depth estimation on the target matching feature point set by using a deep learning network to generate an underwater target depth map; and carrying out depth fusion reconstruction on the target view superposition image and the underwater target depth map to generate a target underwater fusion depth map.
8. The deep learning-based underwater image three-dimensional reconstruction method according to claim 7, wherein the step S34 comprises the steps of:
Step S341: screening the feature point motion trajectories of the target matching feature point set to obtain an image feature point trajectory set;
Step S342: constructing a depth estimation network; predicting pixel depth values of the image feature point trajectory set by using the depth estimation network, and generating an underwater target preliminary depth map;
Step S343: carrying out multi-view depth map fusion on the underwater target preliminary depth map by using the target view superposition image to generate a target underwater fusion depth map.
9. The deep learning-based underwater image three-dimensional reconstruction method according to claim 1, wherein the step S4 comprises the steps of:
step S41: performing three-dimensional point cloud conversion on the target underwater fusion depth map and the standard underwater target multi-view shooting image to generate target underwater three-dimensional point cloud data;
Step S42: carrying out surface reconstruction on the target underwater three-dimensional point cloud data to generate target three-dimensional surface reconstruction data;
Step S43: performing texture mapping on the target three-dimensional surface reconstruction data to generate target three-dimensional texture mapping data;
Step S44: constructing a three-dimensional model based on the target three-dimensional surface reconstruction data and the target three-dimensional texture mapping data to generate a target underwater three-dimensional preliminary model; and performing model rendering on the target underwater three-dimensional preliminary model so as to generate a high-precision target three-dimensional underwater model to execute three-dimensional reconstruction operation of the underwater image.
10. A deep learning-based underwater image three-dimensional reconstruction system for performing the deep learning-based underwater image three-dimensional reconstruction method according to claim 1, the deep learning-based underwater image three-dimensional reconstruction system comprising:
The target identification module is used for acquiring target underwater position data; performing single-view preliminary shooting according to the target underwater position data to obtain an underwater target preliminary shot image; performing image ambiguity analysis on the underwater target preliminary shot image to generate underwater target single-view image ambiguity data; and performing target motion recognition on the underwater target single-view image ambiguity data to generate target motion recognition data;
The multi-view compensation module is used for performing target classification on the standard underwater target single-view shot image according to the target motion recognition data to generate first underwater target type data and second underwater target type data; and performing multi-view shooting compensation on the standard underwater target single-view shot image based on the first underwater target type data and the second underwater target type data to generate a standard underwater target multi-view shot image;
The depth estimation module is used for performing image superposition on the standard underwater target multi-view shot image and the standard underwater target single-view shot image to generate a target visual angle superposition image; performing depth estimation on the target visual angle superposition image to generate an underwater target depth map; and performing depth fusion reconstruction on the target visual angle superposition image and the underwater target depth map to generate a target underwater fusion depth map;
The three-dimensional reconstruction module is used for performing three-dimensional point cloud conversion on the target underwater fusion depth map and the standard underwater target multi-view shot image to generate target underwater three-dimensional point cloud data; constructing a three-dimensional model from the target underwater three-dimensional point cloud data to generate a target underwater three-dimensional preliminary model; and performing model rendering on the target underwater three-dimensional preliminary model to generate a high-precision target three-dimensional underwater model, thereby carrying out the three-dimensional reconstruction of the underwater image.
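As an illustration of the three-dimensional point cloud conversion named in the three-dimensional reconstruction module, the following is a minimal sketch of back-projecting a depth map into camera-frame points under an ideal pinhole camera model. It is not the claimed implementation: the function name `depth_map_to_point_cloud` and the intrinsic parameters `fx`, `fy`, `cx`, `cy` are assumptions introduced for this example, and refraction at the underwater housing is ignored.

```python
from typing import List, Sequence, Tuple

Point3D = Tuple[float, float, float]


def depth_map_to_point_cloud(
    depth: Sequence[Sequence[float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Point3D]:
    """Back-project a depth map into 3D points in the camera frame.

    Assumes an ideal pinhole camera (hypothetical intrinsics fx, fy, cx, cy)
    and depth values measured along the optical axis.
    """
    points: List[Point3D] = []
    for v, row in enumerate(depth):          # v: pixel row index
        for u, z in enumerate(row):          # u: pixel column index
            if not z > 0:                    # skip zero, negative, or NaN depth
                continue
            x = (u - cx) * z / fx            # horizontal offset scaled by depth
            y = (v - cy) * z / fy            # vertical offset scaled by depth
            points.append((x, y, z))
    return points


if __name__ == "__main__":
    # A 2x2 depth map with one invalid (zero) pixel.
    toy_depth = [[2.0, 0.0], [2.0, 4.0]]
    for p in depth_map_to_point_cloud(toy_depth, fx=2.0, fy=2.0, cx=0.5, cy=0.5):
        print(p)
```

Pixels without a valid depth are dropped rather than filled in, so the resulting point cloud can contain fewer points than the depth map has pixels.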
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202411093922.1A CN118644640B (en) | 2024-08-09 | 2024-08-09 | Underwater image three-dimensional reconstruction method and system based on deep learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN118644640A CN118644640A (en) | 2024-09-13 |
CN118644640B CN118644640B (en) | 2024-10-29 |
Family
ID=92664602
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202411093922.1A Active CN118644640B (en) | 2024-08-09 | 2024-08-09 | Underwater image three-dimensional reconstruction method and system based on deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN118644640B (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115170746A (en) * | 2022-09-07 | 2022-10-11 | 中南大学 | Multi-view three-dimensional reconstruction method, system and equipment based on deep learning |
CN118071968A (en) * | 2024-04-17 | 2024-05-24 | 深圳市赛野展览展示有限公司 | Intelligent interaction deep display method and system based on AR technology |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
FR3111222B1 (en) * | 2020-06-06 | 2023-04-28 | Olivier Querbes | Generation of scaled 3D models from 2D images produced by a monocular imaging device |
CN113971691B (en) * | 2021-09-16 | 2024-08-23 | 中国海洋大学 | Underwater three-dimensional reconstruction method based on multi-view binocular structured light |
WO2023164845A1 (en) * | 2022-03-02 | 2023-09-07 | 深圳市大疆创新科技有限公司 | Three-dimensional reconstruction method, device, system, and storage medium |
2024-08-09: CN application CN202411093922.1A filed; patent CN118644640B active
Also Published As
Publication number | Publication date |
---|---|
CN118644640A (en) | 2024-09-13 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant |