
CN118470222B - Medical ultrasonic image three-dimensional reconstruction method and system based on SDF diffusion - Google Patents


Info

Publication number: CN118470222B
Application number: CN202410917497.7A
Authority: CN (China)
Prior art keywords: sdf, ultrasonic image, voxel, diffusion, noise
Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis)
Other languages: Chinese (zh)
Other versions: CN118470222A
Inventors: 蔡青, 童亨, 仇世纪, 刘治, 董军宇
Current Assignee: Ocean University of China (the listed assignees may be inaccurate; Google has not performed a legal analysis)
Original Assignee: Ocean University of China
Application filed by Ocean University of China; application granted; patent is active


Classifications

    • G06T17/00 — Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06N3/04 — Neural networks; architecture, e.g. interconnection topology
    • G06N3/084 — Neural network learning methods; backpropagation, e.g. using gradient descent
    • G06V10/454 — Local feature extraction; integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
    • G06V10/806 — Fusion, i.e. combining data from various sources, at the level of extracted features
    • G06V10/82 — Image or video recognition or understanding using neural networks
    • Y02T10/40 — Engine management systems


Abstract

The invention discloses a medical ultrasonic image three-dimensional reconstruction method and system based on SDF diffusion, belonging to the technical field of computer vision. Features are first extracted from an input ultrasound image by a visual model pre-trained on medical data. The SDF diffusion process then begins: the reverse diffusion process starts from completely random noise, which is gradually removed to obtain a clean SDF field. During diffusion, the SDF features and the ultrasound image features are fused using a state space model and cross attention, so that the three-dimensional surface represented by the SDF field is consistent with the ultrasound image. Practical verification shows that the proposed SDF-diffusion-based three-dimensional reconstruction method for medical ultrasound images is both efficient and accurate.

Description

Medical ultrasonic image three-dimensional reconstruction method and system based on SDF diffusion
Technical Field
The invention relates to a medical ultrasonic image three-dimensional reconstruction method and system based on SDF diffusion, and belongs to the technical field of computer vision.
Background
Three-dimensional reconstruction of ultrasound images refers to acquiring and processing a series of two-dimensional ultrasound images to finally generate a three-dimensional anatomical structure image or volume data; it has important clinical significance and application value. The three-dimensional shape of an object can be expressed in various ways, such as voxels, point clouds, and meshes. Voxel-based methods can use 3D convolutional neural networks (CNNs) to reconstruct objects of arbitrary topology; however, huge memory requirements and computation time limit most such methods to low-resolution results, so ultra-high-precision reconstruction cannot be achieved. The point cloud representation is simple and highly flexible, but because a point cloud is not a regular structure, it does not adapt well to traditional 3D CNN networks. Similarly, the mesh representation can express three-dimensional shapes with high accuracy, but it is likewise an irregular, discrete structure that conventional neural networks cannot process well.
The denoising diffusion probabilistic model is a generative model based on iteratively inverting a Markov noising process. In vision, early work formulated the problem as learning a variational lower bound, or framed it as optimizing a score-based generative model or a discretization of a continuous stochastic process. Many recent works have demonstrated the great potential of diffusion models in content generation tasks, so using diffusion for the three-dimensional reconstruction task can also take full advantage of this class of models.
In the field of three-dimensional reconstruction of ultrasound images, there are three conventional approaches. Position-sensor-based methods track the position and posture of the probe in three-dimensional space in real time using a position sensor (e.g., a magnetic or optical position sensor) attached to the probe, store the spatial position of each two-dimensional ultrasound image at acquisition time, and then reconstruct the two-dimensional images into three-dimensional data from this position information. However, these methods require an external position-sensor system, which greatly increases system complexity and cost, and problems such as cable interference and probe occlusion degrade accuracy. Freehand-scanning methods estimate the spatial relationship between the two-dimensional images from information such as the operator's motion trajectory and the ultrasound beam direction while the probe is freely moved over the body surface, and reconstruct three-dimensional data from these estimates. However, they demand high accuracy in estimating the probe trajectory and beam direction, which is not only computationally complex but also poor in real-time performance. Image-registration methods extract information such as gray levels, gradients, and features from a series of adjacent two-dimensional ultrasound images, register them via similarity measures, estimate the geometric transformations between the images, and finally obtain three-dimensional data. But they place high demands on image quality and on the uniformity of the gray-level distribution within the field of view, their precision is easily affected, and the computation is complex.
Analyzing and summarizing the existing ultrasound three-dimensional reconstruction methods reveals the following defects: (1) additional equipment or complex algorithms are needed to determine the probe's motion trajectory and direction, which increases implementation difficulty and cost; (2) the reconstruction process requires a continuous ultrasound video or a series of consecutive ultrasound images, so the required data volume is large and the computation is complex; (3) the reconstruction process is time-consuming and its accuracy is only moderate.
Disclosure of Invention
The invention aims to provide a medical ultrasonic image three-dimensional reconstruction method based on SDF (Signed Distance Field) diffusion, so as to improve the speed and accuracy of three-dimensional reconstruction from two-dimensional ultrasound images.
The SDF is an implicit representation of an object's three-dimensional shape. Its essence is to store, for each point, the nearest distance from that point to the shape's surface, with the surface dividing space: points outside the model surface take values greater than 0, and points inside take values less than 0. An SDF can conveniently be processed by 3D convolutional networks, and can also be converted into a high-precision mesh representation by the marching cubes algorithm.
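As a concrete illustration of the sign convention just described (negative inside, positive outside, zero on the surface), the following is a minimal sketch of a sphere's SDF sampled on a voxel grid; the grid extent and resolution here are illustrative choices, not values from the patent:

```python
import numpy as np

def sphere_sdf(resolution=32, radius=0.5):
    """SDF of a sphere of given radius centered at the origin, sampled on a
    [-1, 1]^3 voxel grid: negative inside, zero on the surface, positive outside."""
    axis = np.linspace(-1.0, 1.0, resolution)
    x, y, z = np.meshgrid(axis, axis, axis, indexing="ij")
    return np.sqrt(x**2 + y**2 + z**2) - radius

sdf = sphere_sdf()
center_val = sdf[16, 16, 16]   # near the center: negative (inside the sphere)
corner_val = sdf[0, 0, 0]      # grid corner: positive (outside the sphere)
```

Such a dense scalar grid is exactly the kind of data a 3D convolutional network or marching cubes can consume directly.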
In order to achieve the aim of the invention, the invention adopts the following specific technical scheme:
A medical ultrasonic image three-dimensional reconstruction method based on SDF diffusion comprises the following steps:
S1: acquire medical ultrasound image data and extract features from the ultrasound image, to guide the subsequent diffusion process;
S2: randomly initialize the SDF field, drawing the initial field x_T from a normal distribution;
S3: perform the SDF diffusion process using a diffusion model, which contains two key processes: a forward diffusion process and a reverse diffusion process; fuse the voxel and ultrasound image features with a state space model to obtain voxel features infused with the ultrasound image features, and fuse the voxel features and ultrasound image features with cross attention at the stages of smaller voxel resolution;
S4: extract the mesh using marching cubes, finally reconstructing a triangular mesh model guided by the ultrasound image.
Further, in S1, a visual model MedSAM (Medical Segment Anything Model) is first trained using a large amount of medical ultrasound image data; the trained MedSAM image encoder is then used to extract features from a single input medical ultrasound image, and the output is mapped to a feature vector c. This feature is used to guide the SDF toward the image features in the subsequent diffusion process, realizing SDF-based three-dimensional reconstruction of the ultrasound image.
Further, in S2, the SDF field is a continuous three-dimensional scene representation that treats a three-dimensional object or scene as a three-dimensional scalar field consisting of distance values. Specifically, for any point in the scene, the value of the SDF is defined as the signed distance from that point to the object surface: if the point is inside the object, the SDF value is the negative of the distance from that point to the nearest surface; if the point is outside, it is the positive distance; and if the point lies exactly on the object surface, the SDF value is 0.
Further, the step S3 specifically includes:
S3-1: The forward diffusion process (Forward Diffusion Process) is used during training; it gradually adds Gaussian noise to the data until the data is completely corrupted into pure Gaussian noise. This process can be represented by a Markov chain: a certain amount of Gaussian noise is added at each time step t until, at the final step T, the original data has become pure noise. Given a data sample $x_0 \sim q(x_0)$, the forward process $q(x_{1:T} \mid x_0)$ gradually converts the data sample into pure Gaussian noise:

$$q(x_{1:T} \mid x_0) = \prod_{t=1}^{T} q(x_t \mid x_{t-1}) \tag{1}$$

where $x_t$ denotes the sample at time step $t$, $x_{t-1}$ the sample at the previous time step, and $q(x_t \mid x_{t-1})$ the Gaussian transition probability:

$$q(x_t \mid x_{t-1}) = \mathcal{N}\big(x_t;\ \sqrt{1-\beta_t}\,x_{t-1},\ \beta_t \mathbf{I}\big) \tag{2}$$

where $\mathcal{N}$ denotes a normal distribution and $\beta_t$ is the noise-scheduling hyperparameter, which is either a learnable coefficient or set as a constant. The forward process allows sampling $x_t$ at any time step $t$ in closed form:

$$q(x_t \mid x_0) = \mathcal{N}\big(x_t;\ \sqrt{\bar{\alpha}_t}\,x_0,\ (1-\bar{\alpha}_t)\,\mathbf{I}\big) \tag{3}$$

where $\alpha_t = 1-\beta_t$ and $\bar{\alpha}_t = \prod_{s=1}^{t} \alpha_s$. Thus, $x_t$ can be sampled directly by:

$$x_t = \sqrt{\bar{\alpha}_t}\,x_0 + \sqrt{1-\bar{\alpha}_t}\,\epsilon \tag{4}$$

where $\epsilon \sim \mathcal{N}(0, \mathbf{I})$ is noise randomly sampled from a standard normal distribution.
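The closed-form sampling of equation (4) can be implemented in a few lines. A minimal sketch with a linear β schedule (the schedule values below are illustrative defaults, not the patent's):

```python
import numpy as np

def forward_noise(x0, t, betas, rng=np.random.default_rng(0)):
    """Sample x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps  (eq. 4)."""
    alphas = 1.0 - betas
    alpha_bar = np.cumprod(alphas)[t]          # \bar{alpha}_t = prod_{s<=t} alpha_s
    eps = rng.standard_normal(x0.shape)        # eps ~ N(0, I)
    return np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps

betas = np.linspace(1e-4, 0.02, 1000)          # linear noise schedule (illustrative)
x0 = np.ones((4, 4, 4))                        # toy "clean SDF" voxel grid
xT = forward_noise(x0, t=999, betas=betas)     # at t = T almost all signal is gone
```

With this schedule, $\sqrt{\bar{\alpha}_T}$ is nearly zero, so $x_T$ is effectively pure noise, matching the description of the forward process.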
S3-2: The reverse diffusion process (Reverse Diffusion Process) occurs at generation time and aims to reconstruct the original data from pure noise samples, i.e. the joint distribution $p_\theta(x_{0:T})$. This reverse process is defined as a Markov chain with learned Gaussian transitions:

$$p_\theta(x_{0:T}) = p(x_T) \prod_{t=1}^{T} p_\theta(x_{t-1} \mid x_t) \tag{5}$$

$$p_\theta(x_{t-1} \mid x_t) = \mathcal{N}\big(x_{t-1};\ \mu_\theta(x_t, t),\ \sigma_t^2 \mathbf{I}\big) \tag{6}$$

where $p(x_T) = \mathcal{N}(x_T; 0, \mathbf{I})$ is a standard normal distribution, $\mu_\theta$ denotes the denoising function parameterized by $\theta$, and $\sigma_t^2$ is a time-step-dependent variance set to $\sigma_t^2 = \beta_t$. Sampling starts from $x_T \sim \mathcal{N}(0, \mathbf{I})$ and then draws $x_{t-1} \sim p_\theta(x_{t-1} \mid x_t)$ via equation (6), namely:

$$x_{t-1} = \mu_\theta(x_t, t) + \sigma_t z,\quad z \sim \mathcal{N}(0, \mathbf{I}) \tag{7}$$

$$\mu_\theta(x_t, t) = \frac{1}{\sqrt{\alpha_t}}\left(x_t - \frac{\beta_t}{\sqrt{1-\bar{\alpha}_t}}\,\epsilon_\theta(x_t, t)\right) \tag{8}$$

where $\epsilon_\theta$ is a neural network parameterized by $\theta$ that takes the noisy input $x_t$ and predicts the noise. By repeating this process, $x_0$ can finally be generated.

Thus, the objective function compares the predicted noise $\epsilon_\theta(x_t, t)$ and the applied noise $\epsilon$:

$$L = \mathbb{E}_{x_0,\, \epsilon,\, t}\left[\left\| \epsilon - \epsilon_\theta(x_t, t) \right\|^2\right] \tag{9}$$

Alternatively, the neural network can predict the noise-free data $\hat{x}_\theta(x_t, t)$ rather than the noise $\epsilon$, which modifies $\mu_\theta$ to:

$$\mu_\theta(x_t, t) = \frac{\sqrt{\bar{\alpha}_{t-1}}\,\beta_t}{1-\bar{\alpha}_t}\,\hat{x}_\theta(x_t, t) + \frac{\sqrt{\alpha_t}\,(1-\bar{\alpha}_{t-1})}{1-\bar{\alpha}_t}\,x_t \tag{10}$$

where $\hat{x}_\theta$ is a neural network parameterized by $\theta$ that predicts the noise-free data. In this case, the objective function becomes:

$$L = \mathbb{E}_{x_0,\, \epsilon,\, t}\left[\left\| x_0 - \hat{x}_\theta(x_t, t) \right\|^2\right] \tag{11}$$
Specifically, given a clean data sample $x_0$ (here, a ground-truth SDF voxel grid), a noisy sample $x_t$ of the same shape is obtained by the forward process according to equation (4):

$$x_t = \sqrt{\bar{\alpha}_t}\,x_0 + \sqrt{1-\bar{\alpha}_t}\,\epsilon \tag{12}$$

Then a network $\hat{x}_\theta$ is trained to predict the noise-free data, with the following MSE objective function:

$$L = \mathbb{E}_{x_0,\, \epsilon,\, t}\left[\left\| x_0 - \hat{x}_\theta(x_t, t, c) \right\|^2\right] \tag{13}$$

where $c$ is the ultrasound image feature extracted in S1.
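One training iteration of this x_0-prediction objective can be sketched as follows. The "model" argument is a stand-in callable so the example stays self-contained; it is in no way the patent's U-shaped architecture:

```python
import numpy as np

def training_loss(x0, c, t, betas, model, rng=np.random.default_rng(1)):
    """MSE objective (13): noise x0 to x_t via eq. (12), then compare the
    network's prediction of the clean sample against x0."""
    alpha_bar = np.cumprod(1.0 - betas)[t]
    eps = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps   # eq. (12)
    x0_hat = model(x_t, t, c)                                        # \hat{x}_theta(x_t, t, c)
    return np.mean((x0 - x0_hat) ** 2)                               # eq. (13)

betas = np.linspace(1e-4, 0.02, 1000)
x0 = np.zeros((4, 4, 4))                                             # toy clean SDF grid
loss = training_loss(x0, c=None, t=500, betas=betas,
                     model=lambda x, t, c: np.zeros_like(x))         # perfect toy predictor
```

A real training step would backpropagate this loss through the network; here the toy predictor simply returns the (all-zero) clean sample, so the loss is exactly zero.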
During inference, the trained neural network $\hat{x}_\theta$ starts from $x_T \sim \mathcal{N}(0, \mathbf{I})$ and progressively predicts less-noisy samples $x_{t-1}$ according to equation (14), generating new SDF voxels:

$$x_{t-1} = \frac{\sqrt{\bar{\alpha}_{t-1}}\,\beta_t}{1-\bar{\alpha}_t}\,\hat{x}_\theta(x_t, t, c) + \frac{\sqrt{\alpha_t}\,(1-\bar{\alpha}_{t-1})}{1-\bar{\alpha}_t}\,x_t + \sigma_t z \tag{14}$$
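The inference recursion iterates from pure noise down to t = 1. A sketch using x_0-prediction with $\sigma_t = \sqrt{\beta_t}$, again with a stand-in predictor in place of the trained network:

```python
import numpy as np

def sample_sdf(model, c, betas, shape=(4, 4, 4), rng=np.random.default_rng(2)):
    """Generate an SDF voxel grid from noise via the x0-prediction update (eq. 14)."""
    alphas = 1.0 - betas
    alpha_bar = np.cumprod(alphas)
    x = rng.standard_normal(shape)                 # x_T ~ N(0, I)
    T = len(betas)
    for t in range(T - 1, 0, -1):
        ab_t, ab_prev = alpha_bar[t], alpha_bar[t - 1]
        x0_hat = model(x, t, c)                    # predicted clean sample
        mean = (np.sqrt(ab_prev) * betas[t] * x0_hat
                + np.sqrt(alphas[t]) * (1.0 - ab_prev) * x) / (1.0 - ab_t)
        x = mean + np.sqrt(betas[t]) * rng.standard_normal(shape)  # sigma_t * z
    return x

# Toy predictor: clamp the current sample into a plausible SDF range.
out = sample_sdf(lambda x, t, c: np.clip(x, -1.0, 1.0), c=None,
                 betas=np.linspace(1e-4, 0.02, 50))
```

In the patent's system the predictor is the U-shaped network conditioned on the ultrasound feature c; the clamp here only keeps the toy loop numerically tame.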
S3-3: The diffusion process uses a U-shaped network in which the voxel feature resolution of each layer is halved (e.g. 64 → 32 → 16) and the channel count is doubled. To keep the object represented by the generated SDF field consistent with the ultrasound image, the SDF features and the ultrasound image features are fused during diffusion, so that the ultrasound image features guide the SDF diffusion process. At the top of the model, because the feature resolution is large, the number of voxel features remains on a large order of magnitude and cross attention cannot be used directly to fuse the voxel and ultrasound image features; therefore a state space model is used to process them.
The State Space Model (SSM) is a mathematical model framework commonly used in time-series analysis and control theory. It describes the evolution of a dynamic system as a combination of a state equation and an observation equation, mapping a one-dimensional sequence $x(t) \in \mathbb{R}$ to $y(t) \in \mathbb{R}$ through a hidden state $h(t) \in \mathbb{R}^N$. The system uses $\mathbf{A}$ as the evolution parameter and $\mathbf{B}, \mathbf{C}$ as projection parameters:

$$h'(t) = \mathbf{A}\,h(t) + \mathbf{B}\,x(t) \tag{15}$$

$$y(t) = \mathbf{C}\,h(t) \tag{16}$$

Mamba is a discrete version of this continuous system; it includes a time-scale parameter $\Delta$ that converts the continuous parameters $\mathbf{A}, \mathbf{B}$ into discrete parameters $\bar{\mathbf{A}}, \bar{\mathbf{B}}$. A common transformation method is the zero-order hold (ZOH), defined as follows:

$$\bar{\mathbf{A}} = \exp(\Delta \mathbf{A}) \tag{17}$$

$$\bar{\mathbf{B}} = (\Delta \mathbf{A})^{-1}\big(\exp(\Delta \mathbf{A}) - \mathbf{I}\big)\cdot \Delta \mathbf{B} \tag{18}$$

After discretizing $\mathbf{A}, \mathbf{B}$, equations (15)(16) can be expressed as:

$$h_t = \bar{\mathbf{A}}\,h_{t-1} + \bar{\mathbf{B}}\,x_t \tag{19}$$

$$y_t = \mathbf{C}\,h_t \tag{20}$$

Finally, the model output is computed via a global convolution:

$$\bar{\mathbf{K}} = \big(\mathbf{C}\bar{\mathbf{B}},\ \mathbf{C}\bar{\mathbf{A}}\bar{\mathbf{B}},\ \ldots,\ \mathbf{C}\bar{\mathbf{A}}^{M-1}\bar{\mathbf{B}}\big) \tag{21}$$

$$y = x * \bar{\mathbf{K}} \tag{22}$$

where $M$ is the length of the input sequence $x$ and $\bar{\mathbf{K}} \in \mathbb{R}^M$ is a structured convolution kernel.
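Equations (17)-(20) can be checked numerically with scalar parameters. A minimal sketch of ZOH discretization followed by the recurrent scan (a toy scalar SSM, not Mamba's selective, input-dependent version):

```python
import numpy as np

def ssm_scan(x, A=-1.0, B=1.0, C=1.0, delta=0.1):
    """Discretize (A, B) with zero-order hold (eqs. 17-18) and run the recurrence
    h_t = Abar*h_{t-1} + Bbar*x_t, y_t = C*h_t (eqs. 19-20), scalar case."""
    Abar = np.exp(delta * A)              # eq. (17)
    Bbar = (Abar - 1.0) / A * B           # eq. (18) reduced to the scalar case
    h, ys = 0.0, []
    for xt in x:
        h = Abar * h + Bbar * xt          # eq. (19): state update
        ys.append(C * h)                  # eq. (20): observation
    return np.array(ys)

y = ssm_scan(np.ones(20))                 # step response of a leaky integrator
```

With A = -1 and a constant unit input, the state decays toward the steady-state value 1, so the output rises monotonically toward 1; the convolutional form (21)-(22) would produce the identical sequence.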
To apply this to the SDF diffusion process, the voxel features are unfolded into a one-dimensional sequence, and the time step t from S3-1 is passed through a sine-cosine transformation, linearly mapped to a one-dimensional feature of the same size as the voxel features, and added to them. To fuse the ultrasound image features with the voxel features, the ultrasound image feature c is linearly mapped to the same channel count as the voxel features and then concatenated with them to form the SSM input x.
After the Mamba computation, the output y is obtained; the voxel-feature part is taken out, and the one-dimensional sequence is reshaped back to the original voxel-feature shape, yielding voxel features fused with the ultrasound image features.
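The flatten-concatenate-reshape bookkeeping described above could look like the following sketch. The shapes are illustrative, and the Mamba block is replaced by an identity stand-in so the round trip can be checked:

```python
import numpy as np

def fuse_with_ssm(voxel_feat, image_feat, t_embed, mamba=lambda seq: seq):
    """voxel_feat: (C, D, H, W) voxel features; image_feat: (L, C) ultrasound
    features already mapped to C channels; t_embed: (C,) time-step embedding."""
    C, D, H, W = voxel_feat.shape
    seq = voxel_feat.reshape(C, -1).T               # unfold voxels to (D*H*W, C)
    seq = seq + t_embed                             # add the time-step embedding
    x = np.concatenate([seq, image_feat], axis=0)   # append image tokens -> SSM input
    y = mamba(x)                                    # state space model over the sequence
    vox = y[: D * H * W]                            # take back only the voxel part
    return vox.T.reshape(C, D, H, W)                # reshape to the original voxel shape

out = fuse_with_ssm(np.zeros((8, 4, 4, 4)), np.zeros((16, 8)), np.zeros(8))
```

With an identity "mamba" and zero embeddings the function is an exact round trip, which confirms the unfold/reshape pair is consistent.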
S3-4: After the voxel features pass through the top of the model, downsampling reduces the voxel feature resolution while the channel count increases, decreasing the overall data volume; at these small-voxel-resolution stages, the voxel features and ultrasound image features are fused using cross attention.
The attention function is described as mapping a query and a set of key-value pairs to an output, where the query, keys, values, and output are all vectors and the output is a weighted sum of the values. In practice, the attention function is computed for a set of queries simultaneously, packed into a matrix Q, with the keys and values likewise packed into matrices K and V; the output matrix is computed as:

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d_k}}\right) V \tag{23}$$

Multi-head attention allows the model to jointly attend to information from different representation subspaces at different positions; the formulas are:

$$\mathrm{MultiHead}(Q, K, V) = \mathrm{concat}(\mathrm{head}_1, \ldots, \mathrm{head}_h)\, W^O \tag{24}$$

$$\mathrm{head}_i = \mathrm{Attention}\big(Q W_i^Q,\ K W_i^K,\ V W_i^V\big) \tag{25}$$

where concat denotes a splicing (concatenation) operation and $W^O, W_i^Q, W_i^K, W_i^V$ are all learnable weight matrices.
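Equations (23)-(25) are the standard scaled dot-product and multi-head attention; a compact numpy sketch with illustrative dimensions:

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)   # numerically stable softmax
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V   (eq. 23)"""
    d_k = Q.shape[-1]
    return softmax(Q @ K.T / np.sqrt(d_k)) @ V

def multi_head(Q, K, V, WQ, WK, WV, WO):
    """MultiHead = concat(head_1..head_h) W^O with head_i per eq. (25)."""
    heads = [attention(Q @ wq, K @ wk, V @ wv) for wq, wk, wv in zip(WQ, WK, WV)]
    return np.concatenate(heads, axis=-1) @ WO

rng = np.random.default_rng(0)
Q = K = V = rng.standard_normal((5, 8))       # 5 tokens, model dim 8
h, d_h = 2, 4                                 # 2 heads of dim 4 (illustrative)
WQ = [rng.standard_normal((8, d_h)) for _ in range(h)]
WK = [rng.standard_normal((8, d_h)) for _ in range(h)]
WV = [rng.standard_normal((8, d_h)) for _ in range(h)]
WO = rng.standard_normal((h * d_h, 8))
out = multi_head(Q, K, V, WQ, WK, WV, WO)
```

Each attention row is a convex combination of the value rows, since the softmax weights sum to 1.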
For a voxel $v$ in the grid, its center is projected onto the image to obtain projection coordinates $p(v)$. Image patches adjacent to $p(v)$ are selected to interact with $v$, because their features are most likely to affect the local geometry governed by $v$. The neighborhood image patch set, denoted $\mathcal{U}(v)$, is chosen as follows: if the distance between $p(v)$ and the center of a patch $u$ is less than a distance threshold $\tau$, then patch $u$ belongs to $\mathcal{U}(v)$.

Multi-head attention is used to model the feature interactions between the voxel feature $f_v$ at voxel $v$ and the ultrasound image patch feature set $F_{\mathcal{U}(v)}$ of the patches belonging to $\mathcal{U}(v)$, as follows:

$$Q = f_v W^Q,\quad K = F_{\mathcal{U}(v)} W^K,\quad V = F_{\mathcal{U}(v)} W^V \tag{26}$$

$$f_v' = \mathrm{MultiHead}(Q, K, V;\ M) \tag{27}$$

Here MultiHead is the standard multi-head attention operation, $W^Q, W^K, W^V$ are learnable matrices, M is the attention mask induced by the view projection, and absolute position encodings are used for the voxels and ultrasound image patches.
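The projection-induced attention mask M can be built from the projected voxel centers and the patch centers. A sketch assuming both sets of coordinates are already given in 2-D image space (the projection itself depends on the probe geometry and is omitted here):

```python
import numpy as np

def neighborhood_mask(voxel_proj, patch_centers, tau):
    """voxel_proj: (Nv, 2) projected voxel centers p(v); patch_centers: (Np, 2).
    Returns a boolean (Nv, Np) mask: mask[i, j] is True iff patch j lies within
    distance tau of p(v_i), i.e. patch j belongs to the neighborhood set of v_i."""
    diff = voxel_proj[:, None, :] - patch_centers[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    return dist < tau   # True entries are attended to; False entries are masked out

mask = neighborhood_mask(np.array([[0.0, 0.0]]),
                         np.array([[0.0, 1.0], [3.0, 3.0]]), tau=2.0)
```

In the attention computation, masked-out (False) pairs would receive a weight of $-\infty$ before the softmax so they contribute nothing to the voxel's output feature.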
After the cross attention, the ultrasound image features and voxel features are further fused, and the output voxel features form a high-precision SDF field.
Further, the marching cubes algorithm in S4 is a classical algorithm for extracting iso-surfaces from a three-dimensional scalar field, widely applied in volume data visualization, geometric reconstruction, and related fields. Algorithm steps: set an iso-surface threshold T and traverse all cube cells C in the SDF field:
(1) Compute the vertex state code index from the comparison of C's 8 vertex scalar values with T;
(2) Look up the corresponding intersection configuration pattern in a predefined lookup table using the index;
(3) Compute the intersection points with the iso-surface on the 12 edges of the cube using linear interpolation;
(4) Connect the corresponding intersection points according to the pattern to form one or more triangles;
(5) Add the triangles to the mesh surface data.
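Step (3), the linear interpolation of each edge intersection, is the numerical core of the algorithm. A sketch of that single step for one cube edge:

```python
import numpy as np

def edge_intersection(p0, p1, s0, s1, iso=0.0):
    """Linearly interpolate the point on segment p0-p1 where the scalar field
    crosses the iso-value, given the vertex scalars s0 and s1 (assumes s0 != s1
    and that the edge actually straddles the iso-value)."""
    t = (iso - s0) / (s1 - s0)       # fraction along the edge
    return p0 + t * (p1 - p0)

# Edge from (0,0,0) with SDF -0.25 to (1,0,0) with SDF +0.75:
pt = edge_intersection(np.array([0.0, 0.0, 0.0]),
                       np.array([1.0, 0.0, 0.0]), -0.25, 0.75)
# -> [0.25, 0., 0.]: a quarter of the way along, where the SDF crosses zero
```

For the SDF produced in S3, the iso-value is 0, so these interpolated points sample the reconstructed surface.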
A medical ultrasonic image three-dimensional reconstruction system based on SDF diffusion consists of an ultrasound image feature extraction module, an SDF diffusion module, and a mesh generation module.
The ultrasound image feature extraction module extracts features from the input ultrasound image: the image encoder of the visual model SAM-Med2D, pre-trained on a large amount of medical image data, performs the feature extraction, providing rich and accurate feature guidance for the subsequent diffusion process.
The SDF diffusion module consists of a U-shaped neural network with an embedded state space model and cross attention. It executes the SDF diffusion process, denoising step by step from random noise to obtain a clean SDF field; during diffusion, the state space model and cross attention fuse the voxel features and ultrasound image features so that the reconstructed mesh remains consistent with the ultrasound image features.
The mesh generation module executes the marching cubes algorithm to extract a high-precision three-dimensional mesh from the reconstructed SDF field.
The advantages and beneficial effects of the invention are as follows:
By introducing the state space model and cross attention into the SDF diffusion process and fully fusing the ultrasound image features with the SDF features, the invention achieves end-to-end, fast, and accurate three-dimensional reconstruction of ultrasound images. This alleviates, to a certain extent, problems in ultrasound three-dimensional reconstruction such as the need for additional equipment or complex algorithms to determine the probe's motion trajectory and direction, and long processing times.
Practical verification shows that the proposed ultrasound image three-dimensional reconstruction method is both efficient and accurate.
Drawings
Fig. 1 is an overall flow chart of the present invention.
Fig. 2 is a frame diagram of the present invention.
FIG. 3 is a graph of the results of the present invention.
Detailed Description
The invention will be further described with reference to fig. 1-3 and the specific examples.
Example 1:
A medical ultrasonic image three-dimensional reconstruction method based on SDF diffusion, whose overall flow is shown in Fig. 1, comprises the following steps:
S1: The image encoder of the visual model SAM-Med2D, pre-trained with a large amount of medical image data, extracts features from the input single view and outputs a feature vector c. In the subsequent diffusion process, this feature is fused with the SDF features through the state space model and cross attention, guiding the reconstruction and thereby realizing three-dimensional reconstruction of the ultrasound image.
S2: The voxel SDF field is randomly initialized, where SDF stands for Signed Distance Function. An SDF field is a continuous three-dimensional scene representation that treats a three-dimensional object or scene as a three-dimensional scalar field consisting of distance values. Specifically, for any point in the scene, the value of the SDF is defined as the signed distance from that point to the object surface: if the point is inside the object, the SDF value is the negative of the distance to the nearest surface; if the point lies exactly on the surface, the SDF value is 0. At initialization, the SDF field x_T is drawn randomly from a normal distribution.
S3: The SDF field undergoes the diffusion process. Diffusion models are an emerging class of deep generative models that train a neural network through a reversible diffusion process to generate the required data from a noise distribution. The generative diffusion model comprises two key processes: a forward diffusion process and a reverse diffusion process. Specifically:
S3-1: the forward diffusion process (Forward Diffusion Process) is a process used in training that adds data gradually to gaussian noise until eventually completely corrupted, becoming pure noise with gaussian distribution. This process can be represented by a Markov chain, adding a certain amount of Gaussian noise per time step T until the final time T, the original data becomes pure noise; given a data sample Forward processIt gradually converts the data samples into pure gaussian noise:
(1);
Wherein the method comprises the steps of A sample at the time step t is represented,A sample representing a previous time step is taken,The gaussian transition probability:
(2);
Wherein the method comprises the steps of A normal distribution is indicated and the distribution is determined,Representing the noise scheduling super parameter, which is a leachable coefficient or is set as a constant; the forward process allows sampling at any time step t in a closed form
(3);
Wherein the method comprises the steps of; Thus, the first and second substrates are bonded together,Sampling is directly performed by the following formula:
(4);
therein, wherein Is a randomly sampled noise from a normal distribution.
S3-2: the back diffusion process (Reverse Diffusion Process) occurs at generation time and aims to reconstruct the original data from the pure noise samples, i.e. joint distributionThis inverse process is defined as a markov chain with learned gaussian transitions:
(5);
(6);
Wherein the method comprises the steps of Is a standard normal distribution of the distribution,The representation is composed ofA parameterized denoising function,Is a time-step dependent variance set to; From the slaveMiddle sampling and then plotting by equation (6)The method comprises the following steps:
(7);
(8);
Wherein the method comprises the steps of Is a group consisting ofParameterized neural network, which is input from noiseAnd predicting noise. By repeating this process, it is finally possible to generate
Thus, the objective function compares the prediction noiseAnd apply noiseThe difference between:
(9);
in addition, prediction by neural networks Rather than noiseTo modify
(10);
Wherein, Is composed ofParameterized neural network that predicts noise-free data; In this case, the objective function becomes:
(11);
specifically, a clean data sample is given Sampling the samples having the same shape according to the formula (4) in the forward direction
(12);
Then, train a networkTo predict noiseless data whose MSE objective function is as follows:
(13);
therein, wherein The ultrasonic image features extracted in the S1;
in the reasoning process, a trained neural network is used Through the process ofGradually predicting smaller noise samples according to equation (14)To generate new SDF voxels:
(14);
S3-3: the diffusion process uses a U-like network, the voxel feature resolution of each layer is halved (e.g. 64- > 32- > 16), and the channel is doubled; in order to keep the object represented by the generated SDF field consistent with the ultrasonic image, the SDF features and the ultrasonic image features are fused in the diffusion process, so that the ultrasonic image features guide the SDF diffusion process; at the top of the model, the voxel features remain in a larger order of magnitude due to the larger feature resolution, and the cross-attention cannot be directly used for fusing the voxel and ultrasonic image features, so that the state space model is used for processing the voxel and ultrasonic image features;
the state space Model (STATE SPACE Model) is a mathematical Model framework commonly used for time series analysis and control theory. It describes the evolution process of a dynamic system as a combination of state and observation equations. A one-dimensional sequence is formed Through a hidden stateMapping toThis system uses a as the evolution parameter, B, C as the projection parameter:
(15);
(16);
mamba is used as a discrete version of the continuous system, which includes a time scale parameter Converting continuous parameters A, B to discrete parameters; A common transformation method is zero-order hold (ZOH), defined as follows:
(17);
(18);
At the position of After discretization, equations (15) (16) can be expressed as:
(19);
(20);
finally, the model is output through global convolution calculation:
(21);
(22
Where M is the length of the input sequence x, Is a structured convolution kernel.
In order to apply the method to the SDF diffusion process, the voxel features are unfolded into one-dimensional features, and the time step t in S3-1 is subjected to sine-cosine transformation and then is mapped to one-dimensional features with the same size as the voxel features in a linear mode and added with the voxel features. To fuse the ultrasound image features with voxel features, the ultrasound image features are compared with each otherAnd (3) linearly mapping the input x to the same channel number as the voxel characteristic, and then connecting the input x with the voxel characteristic to obtain the SSM.
After the Mamba computation the output y is obtained; the voxel-feature part is taken out and the one-dimensional sequence is reshaped back to the original voxel-feature shape, giving voxel features fused with the ultrasound image features.
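The data plumbing described above (unfolding the voxel grid, adding a sine-cosine time-step embedding, projecting and concatenating the image features to form the SSM input, then reshaping the voxel part back) can be sketched as follows; all sizes and the random projection matrix are illustrative assumptions:

```python
import numpy as np

def timestep_embedding(t, dim):
    """Sine-cosine embedding of a scalar diffusion step t (as used in S3-1)."""
    half = dim // 2
    freqs = np.exp(-np.log(10000.0) * np.arange(half) / half)
    ang = t * freqs
    return np.concatenate([np.sin(ang), np.cos(ang)])

rng = np.random.default_rng(0)
C, R = 8, 4                                # channels, voxel resolution (toy sizes)
voxel = rng.normal(size=(C, R, R, R))      # voxel feature grid
img_feat = rng.normal(size=(16, 32))       # 16 ultrasound patch tokens, dim 32

# 1) unfold the voxels into a 1-D token sequence of shape (R^3, C)
seq = voxel.reshape(C, -1).T

# 2) add the time-step embedding (broadcast over all tokens)
seq = seq + timestep_embedding(t=37, dim=C)

# 3) project the image features to C channels and concatenate along the sequence
W = rng.normal(size=(32, C))               # stand-in for the learnable linear map
x_ssm = np.concatenate([seq, img_feat @ W], axis=0)   # the SSM input x
assert x_ssm.shape == (R**3 + 16, C)

# after the Mamba block, the first R^3 tokens are taken back out and
# reshaped to (C, R, R, R) to recover the fused voxel feature grid
fused = x_ssm[:R**3].T.reshape(C, R, R, R)
assert fused.shape == voxel.shape
```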
S3-4: after the voxel features pass through the top of the model, they are downsampled to reduce the voxel feature resolution and increase the number of channels, which reduces the overall data volume; at this stage of small voxel resolution the voxel features and the ultrasonic image features are fused using cross attention.
The attention function is described as mapping a query and a set of key-value pairs to an output, where the query, keys, values and output are all vectors and the output is a weighted sum of the values. In practice, the attention functions of a set of queries are computed simultaneously by packing the queries into a matrix Q and the keys and values into matrices K and V; the computed output matrix is:
Attention(Q,K,V)=softmax(QK^T/√d_k)V (23);
Multi-head attention allows the model to jointly attend to information from different representation subspaces at different positions; the formulas are as follows:
MultiHead(Q,K,V)=Concat(head_1,…,head_h)W^O (24);
head_i=Attention(QW_i^Q, KW_i^K, VW_i^V) (25);
where Concat denotes the concatenation operation and W^O, W_i^Q, W_i^K, W_i^V are all learnable weight matrices.
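A minimal NumPy sketch of equations (23)-(25); the token count, dimensions and random weight matrices are illustrative assumptions:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attention(Q, K, V):
    """Eq. (23): softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    return softmax(Q @ K.T / np.sqrt(d_k)) @ V

def multi_head(Q, K, V, Wq, Wk, Wv, Wo, h):
    """Eqs. (24)-(25): run h heads on projected inputs, concatenate, project."""
    heads = [attention(Q @ Wq[i], K @ Wk[i], V @ Wv[i]) for i in range(h)]
    return np.concatenate(heads, axis=-1) @ Wo

rng = np.random.default_rng(1)
n, d, h, dh = 6, 16, 4, 4                 # tokens, model dim, heads, head dim
Q = K = V = rng.normal(size=(n, d))       # self-attention: same tokens throughout
Wq, Wk, Wv = (rng.normal(size=(h, d, dh)) for _ in range(3))
Wo = rng.normal(size=(h * dh, d))
out = multi_head(Q, K, V, Wq, Wk, Wv, Wo, h)
assert out.shape == (n, d)

# each attention row is a convex combination: the weights sum to 1
w = softmax(Q @ K.T / np.sqrt(d))
assert np.allclose(w.sum(axis=1), 1.0)
```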
For each voxel v in the grid, its center is projected onto the image to obtain projection coordinates P; image patches close to P are selected to interact with v, because their features are the most likely to influence the local geometry controlled by v; the neighborhood image patch set N_v is chosen as follows: if the distance between P and the center of patch P_j is less than a distance threshold d_δ, then patch P_j belongs to N_v.
Multi-head self-attention is used to model the feature interactions between the voxel feature f_v at v and the ultrasound image patch feature set f_I of the patches belonging to N_v, as follows:
Q=f_vW^Q, K=f_IW^K, V=f_IW^V (26);
f_v′=MultiHead(Q,K,V,M) (27);
Here MultiHead(·) is the standard multi-head attention operation, W^Q, W^K, W^V are learnable matrices, M is the attention-computation mask induced by the view projection, and absolute position encodings are used for the voxels and the ultrasound image patches.
After the cross attention, the ultrasonic image features and the voxel features are further fused, and the output voxel features are high-precision SDF fields.
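The neighborhood selection of S3-4 can be sketched as a mask over the attention logits; the patch grid, the projected coordinate P and the threshold d_δ below are illustrative assumptions:

```python
import numpy as np

def neighborhood_mask(proj_xy, patch_centers, d_delta):
    """Mask M for the cross attention: patch j participates in the voxel's
    attention iff |P - center_j| < d_delta; excluded entries get -inf, so
    adding M to the logits before the softmax zeroes their weights."""
    dist = np.linalg.norm(patch_centers - proj_xy, axis=1)
    return np.where(dist < d_delta, 0.0, -np.inf)

# toy example: a 4x4 grid of 16x16-pixel patches, one projected voxel center
centers = np.array([[8 + 16 * i, 8 + 16 * j] for i in range(4) for j in range(4)],
                   dtype=float)
P = np.array([20.0, 20.0])               # projected voxel center (illustrative)
M = neighborhood_mask(P, centers, d_delta=18.0)

kept = np.isfinite(M)                    # patches inside the neighborhood
assert kept.sum() == 4                   # the four patches nearest to P survive
```

Restricting attention to this neighborhood keeps the interaction local, matching the intuition that only nearby patches influence the geometry a voxel controls.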
S4: the marching cubes algorithm is executed to extract the mesh from the SDF field. An isosurface threshold T is set, and all cube cells C in the SDF field are traversed:
(1) Calculating vertex state code index according to the size relation between the 8 vertex scalar values of C and T;
(2) Searching a corresponding intersection point configuration pattern in a predefined condition table according to the index;
(3) Calculating the intersection points with the isosurface on 12 sides of the cube by utilizing linear interpolation;
(4) Connecting corresponding intersection points according to pattern to form one or more triangles;
(5) Adding triangles into the grid surface data;
The result is the final high-precision triangular mesh model reconstructed under the guidance of the ultrasonic image.
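Steps (1) and (3) of the marching cubes loop can be sketched as follows; the full 256-entry triangle configuration table used in steps (2) and (4) is omitted, and the corner values are toy SDF samples:

```python
import numpy as np

def vertex_index(corner_vals, T):
    """Step (1): one bit per cube corner, set when the SDF value is below
    the iso-threshold T; the resulting 8-bit code indexes the triangle table."""
    idx = 0
    for bit, v in enumerate(corner_vals):
        if v < T:
            idx |= 1 << bit
    return idx

def edge_crossing(p0, p1, v0, v1, T):
    """Step (3): linear interpolation of the isosurface crossing on the
    edge p0-p1, where v0, v1 are the SDF values at the two endpoints."""
    t = (T - v0) / (v1 - v0)
    return p0 + t * (p1 - p0)

vals = [-0.4, 0.3, 0.5, -0.1, 0.2, 0.6, 0.9, 0.7]   # SDF at the 8 corners
idx = vertex_index(vals, T=0.0)
assert idx == 0b00001001          # corners 0 and 3 lie inside the surface

p = edge_crossing(np.zeros(3), np.array([1.0, 0.0, 0.0]), -0.4, 0.3, T=0.0)
assert np.allclose(p, [4 / 7, 0.0, 0.0])
```

In practice the lookup table maps each of the 256 index values to the set of cube edges whose crossings form the triangles added in steps (4)-(5).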
Example 2: building on the method of Example 1, this example describes the module design.
The medical ultrasonic image three-dimensional reconstruction system based on SDF diffusion consists of an ultrasonic image feature extraction module, an SDF diffusion module and a grid generation module, as shown in fig. 2, the following details are given:
An ultrasonic image feature extraction module: the module is used for extracting the characteristics of an input ultrasonic image, and the image encoder of a visual model SAM-Med2D pre-trained by a large amount of medical image data is used for extracting the characteristics of the input ultrasonic image, so that rich and accurate characteristic guidance is provided for the subsequent diffusion process.
SDF diffusion module: the module consists of a U-shaped neural network embedded with a state space model and cross attention, and is used for executing an SDF diffusion process, denoising gradually from random noise to obtain a clean SDF field, and fusing voxel characteristics and ultrasonic picture characteristics by the state space model and the cross attention in the diffusion process to achieve the aim of keeping consistency of a reconstruction grid and the ultrasonic picture characteristics.
And a grid generation module: the module executes a marching cubes algorithm to extract a high precision three-dimensional grid from the reconstructed SDF field.
Example 3: this example verifies the method and system experimentally. To validate the accuracy and efficiency of the ultrasonic image SDF three-dimensional reconstruction, experiments were performed on the USOVA D dataset. The results shown in Fig. 3 indicate that the proposed method (Unet-mamba) generates three-dimensional shapes from the input ultrasonic images more faithfully and with a shorter output time; the constructed model achieves both higher accuracy and higher speed in ultrasonic image SDF three-dimensional reconstruction.
The above is merely one implementation of the present invention, and the scope of the invention is not limited thereto; any substitution or alteration conceivable by those skilled in the art falls within the scope of the invention, which shall therefore be defined by the claims.

Claims (4)

1. The medical ultrasonic image three-dimensional reconstruction method based on SDF diffusion is characterized by comprising the following steps of:
S1: acquiring medical ultrasonic image data and extracting features of the ultrasonic image to guide the subsequent diffusion process;
S2: randomly initializing the SDF field V according to a normal distribution;
S3: the SDF diffusion process is performed using a diffusion model that contains two key processes: a forward diffusion process and a reverse diffusion process; fusing the voxel and the ultrasonic image features by using a state space model to obtain voxel features fused with the ultrasonic image features, and fusing the voxel features and the ultrasonic image features by using cross attention at a stage with smaller voxel resolution; the method specifically comprises the following steps:
S3-1: the forward diffusion process is represented by a Markov chain: a certain amount of Gaussian noise is added at each time step t until, at the final time T, the original data has become pure noise; given a data sample x_0~q(x_0), the forward process q(x_(1:T)|x_0) gradually converts the data sample into pure Gaussian noise:
q(x_(1:T)|x_0)=∏_(t=1)^T q(x_t|x_(t-1)) (1)
where x_t denotes the sample at time step t, x_(t-1) denotes the sample at the previous time step, and q(x_t|x_(t-1)) is the Gaussian transition probability:
q(x_t|x_(t-1))=N(x_t; √(1-β_t)·x_(t-1), β_t·I) (2)
where N denotes a normal distribution and β_t is a noise-scheduling hyper-parameter, either learned or set as a constant; the forward process samples x_t in closed form at any time step t:
q(x_t|x_0)=N(x_t; √(ᾱ_t)·x_0, (1-ᾱ_t)·I) (3)
where α_t=1-β_t and ᾱ_t=∏_(s=1)^t α_s; thus x_t is sampled directly by:
x_t=√(ᾱ_t)·x_0+√(1-ᾱ_t)·ε (4)
where ε is a noise randomly sampled from a normal distribution;
S3-2: the reverse diffusion process is defined as a Markov chain with learned Gaussian transitions:
p_θ(x_(0:T))=p(x_T)·∏_(t=1)^T p_θ(x_(t-1)|x_t) (5)
p_θ(x_(t-1)|x_t)=N(x_(t-1); μ_θ(x_t,t), σ_t²·I) (6)
where p(x_T)=N(0,I) is a standard normal distribution, μ_θ(x_t,t) denotes a denoising function parameterized by θ, and σ_t² is a time-step-dependent variance set to β_t; x_T is sampled from p(x_T), and x_(t-1)~p_θ(x_(t-1)|x_t) is then drawn via equation (6) as:
x_(t-1)=(1/√(α_t))·(x_t-(β_t/√(1-ᾱ_t))·ε_θ(x_t,t))+σ_t·z, z~N(0,I) (7)
where ε_θ(x_t,t) is a neural network parameterized by θ that predicts the noise from the noisy input x_t; by repeating this process, x_0 is finally generated;
thus, the objective function compares the difference between the predicted noise ε_θ(x_t,t) and the applied noise ε:
L=E_(t,x_0,ε)[‖ε-ε_θ(x_t,t)‖²] (8)
Furthermore, μ_θ(x_t,t) is modified so that the neural network predicts x_0 instead of the noise ε:
μθ(xt,t)=γtfθ(xt,t)+δtxt (10)
where γ_t=√(ᾱ_(t-1))·β_t/(1-ᾱ_t) and δ_t=√(α_t)·(1-ᾱ_(t-1))/(1-ᾱ_t) are time-dependent coefficients, and f_θ(x_t,t) is a neural network parameterized by θ that predicts the noise-free data x_0; in this case, the objective function becomes:
L=E_(t,x_0)[‖x_0-f_θ(x_t,t)‖²] (11)
specifically, given a clean data sample V_0, a noisy sample V_t of the same shape is sampled in the forward direction according to equation (4);
then a network f_θ is trained to predict the noiseless data, with the following MSE objective function:
L=E[‖V_0-f_θ(V_t,t,F_p)‖²] (12)
F p is the ultrasonic image characteristic extracted in S1;
in the inference process, the trained neural network f_θ is used over t=T,…,1 to predict progressively less noisy samples V_(t-1) according to equation (14), generating new SDF voxels:
V_(t-1)=γ_t·f_θ(V_t,t,F_p)+δ_t·V_t+σ_t·z (14)
s3-3: the diffusion process uses a U-shaped network, the voxel feature resolution of each layer is halved, and the channel is doubled; in order to keep the object represented by the generated SDF field consistent with the ultrasonic image, the SDF features and the ultrasonic image features are fused in the diffusion process, so that the ultrasonic image features guide the SDF diffusion process;
the state space model will map a one-dimensional sequence x (t) to y (t) through a hidden state h (t), this system using a as evolution parameter, B, C as projection parameter:
h′(t)=Ah(t)+Bx(t) (15)
y(t)=Ch(t) (16)
mamba including a time scale parameter delta converts the continuous parameter A, B into a discrete parameter A common transformation method is zero-order preservation, defined as follows:
after discretization with Ā, B̄, equations (15)-(16) are expressed as:
h_t=Ā·h_(t-1)+B̄·x_t (19)
yt=Cht (20)
finally, the model output is computed through a global convolution:
K̄=(CB̄, CĀB̄, …, CĀ^(M−1)B̄) (21)
y=x∗K̄ (22)
where M is the length of the input sequence x and K̄ is a structured convolution kernel;
In order to fuse the ultrasonic image characteristics and the voxel characteristics, the ultrasonic image characteristics F p are linearly mapped to the same channel number as the voxel characteristics and then are connected with the voxel characteristics, so that an input x of the SSM is obtained;
Obtaining output y after Mamba calculation, taking out the voxel characteristic part, and remolding the one-dimensional sequence back to the original shape of the voxel characteristic to obtain the voxel characteristic after the ultrasonic image characteristic is fused;
S3-4: after the voxel features pass through the top of the model, they are downsampled to reduce the voxel feature resolution and increase the number of channels, reducing the overall data volume; at this stage of smaller voxel resolution the voxel features and the ultrasonic image features are fused using cross attention; the attention functions of a set of queries are computed simultaneously by packing the queries into a matrix Q and the keys and values into matrices K and V; the computed output matrix is:
Attention(Q,K,V)=softmax(QK^T/√d_k)V (23)
multi-head attention is used, with the following formulas:
MultiHead(Q,K,V)=Concat(head1,...,headh)WO (24)
headi=Attention(QWi Q,KWi K,VWi V) (25)
Wherein concat represents a stitching operation, and W O、Wi Q、Wi K、Wi V are all learnable weight matrices;
For the voxels v in the grid, projecting the center of the voxel v onto an image to obtain a projection coordinate P; selecting adjacent image blocks close to P to interact with v, wherein a neighborhood image patch set is denoted by N V, and the selection is as follows: if the distance between P and the center of P j is less than the distance threshold d δ, then patch P j belongs to N V;
Feature interactions between voxel feature f V at v and ultrasound image patch feature set f I belonging to the patch in N V are modeled using multi-headed self-attention as follows:
Q=f_V·W^Q, K=f_I·W^K, V=f_I·W^V (26)
f_V′=MultiHead(Q,K,V,M) (27)
Here MultiHead (·) is a standard multi-headed attention operation, W Q、WK、WV is a learnable matrix, M is a mask of attention calculations caused by view projection, absolute position encoding is used for voxels and ultrasound image tiles;
s3-5: further fusing the ultrasonic image features and the voxel features after the cross attention, wherein the output voxel features are high-precision SDF fields;
S4: the meshes are extracted by using the marching cubes, and finally, the triangular mesh model guided by the ultrasonic image is reconstructed.
2. The SDF diffusion-based medical ultrasonic image three-dimensional reconstruction method of claim 1, wherein in S1, a visual model MedSAM is first trained using medical ultrasonic image data, then features of an input single medical ultrasonic image are extracted using the trained MedSAM image encoder, and the output is mapped to a feature vector F_p, which is used to guide the SDF toward the picture features in the subsequent diffusion process, realizing ultrasonic image SDF three-dimensional reconstruction.
3. The SDF diffusion-based medical ultrasound image three-dimensional reconstruction method according to claim 1, wherein the marching cubes algorithm step in S4: setting an isosurface threshold value T, and traversing all cube units C in the SDF field:
(1) Calculating vertex state code index according to the size relation between the 8 vertex scalar values of C and T;
(2) Searching a corresponding intersection point configuration pattern in a predefined condition table according to the index;
(3) Calculating the intersection points with the isosurface on 12 sides of the cube by utilizing linear interpolation;
(4) Connecting corresponding intersection points according to pattern to form one or more triangles;
(5) Triangles are added to the mesh surface data.
4. The system based on the medical ultrasonic image three-dimensional reconstruction method based on SDF diffusion according to claim 1, which is characterized by comprising an ultrasonic image feature extraction module, an SDF diffusion module and a grid generation module;
The ultrasonic image feature extraction module is used for: the module is used for extracting the characteristics of an input ultrasonic image, and the characteristics of the input ultrasonic image are extracted by using a visual model SAM-Med2D image encoder which is pre-trained by a large amount of medical image data;
The SDF diffusion module: the module consists of a U-shaped neural network embedded with a state space model and cross attention, and is used for executing an SDF diffusion process, denoising gradually from random noise to obtain a clean SDF field, and fusing voxel characteristics and ultrasonic picture characteristics by the state space model and the cross attention in the diffusion process to achieve the aim of keeping consistency of a reconstruction grid and the ultrasonic picture characteristics;
The grid generation module: the module executes a marching cubes algorithm to extract a high precision three-dimensional grid from the reconstructed SDF field.
CN202410917497.7A 2024-07-10 2024-07-10 Medical ultrasonic image three-dimensional reconstruction method and system based on SDF diffusion Active CN118470222B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410917497.7A CN118470222B (en) 2024-07-10 2024-07-10 Medical ultrasonic image three-dimensional reconstruction method and system based on SDF diffusion


Publications (2)

Publication Number Publication Date
CN118470222A CN118470222A (en) 2024-08-09
CN118470222B true CN118470222B (en) 2024-09-06

Family

ID=92162282


Country Status (1)

Country Link
CN (1) CN118470222B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116580172A (en) * 2023-05-19 2023-08-11 湖南大学 Diffusion model-based three-dimensional shape generation method applied to industrial vision requirements
CN116912419A (en) * 2023-07-21 2023-10-20 广州大学 Three-dimensional human body reconstruction method, system, equipment and medium based on diffusion model

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11087459B2 (en) * 2015-08-14 2021-08-10 Elucid Bioimaging Inc. Quantitative imaging for fractional flow reserve (FFR)
CN117689805A (en) * 2023-05-09 2024-03-12 北京航空航天大学 Large-scale cloud scene simulation method based on noise and particles
CN117115355A (en) * 2023-09-07 2023-11-24 上海微创电生理医疗科技股份有限公司 Three-dimensional ultrasonic modeling method, system, electronic device and readable storage medium
CN117541471B (en) * 2023-11-09 2024-06-07 西安电子科技大学 SPH heuristic PG-SPECT image super-resolution reconstruction method
CN117953180B (en) * 2024-03-26 2024-10-08 厦门大学 Text-to-three-dimensional object generation method based on dual-mode latent variable diffusion
CN118247414A (en) * 2024-03-26 2024-06-25 杭州电子科技大学 Small sample image reconstruction method based on combined diffusion texture constraint nerve radiation field
CN118196121B (en) * 2024-04-08 2024-09-20 兰州交通大学 Breast ultrasound image segmentation method based on denoising diffusion probability model
CN118229886A (en) * 2024-04-28 2024-06-21 中国人民解放军空军工程大学 Engine blade in-situ three-dimensional reconstruction method, equipment and storage medium based on deep learning and nerve radiation field




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant