CN118430650A

CN118430650A - Chromosome key point sequence prediction method and device

Info

Publication number: CN118430650A
Application number: CN202410897387.9A
Authority: CN
Inventors: 刘利枚; 卢沁阳; 彭伟雄; 汤滔
Original assignee: Xiangjiang Laboratory
Current assignee: Xiangjiang Laboratory
Priority date: 2024-07-05
Filing date: 2024-07-05
Publication date: 2024-08-02
Anticipated expiration: 2044-07-05
Also published as: CN118430650B

Abstract

The invention discloses a chromosome key point sequence prediction method and device. The method comprises the following steps: S1, acquiring a chromosome image dataset annotated with 5 key points; S2, constructing a Transformer-based sequence prediction network model and training it on the dataset acquired in step S1 to obtain a fully trained chromosome key point sequence prediction model; S3, inputting a new single chromosome image to be tested into the chromosome key point sequence prediction model together with a start placeholder and beginning cyclic prediction, in which each predicted point is fed back into the input to predict the next point until all 5 key points have been predicted, whereupon the chromosome image annotated with the 5 key point positions is drawn. The invention aims to establish a key point sequence that captures the relationships between key points, replacing the existing discrete key point prediction methods and making the predicted key point set more reasonable and reliable.

Description

Chromosome key point sequence prediction method and device

Technical Field

The invention relates to the technical field of chromosome key point prediction under optical cell images, in particular to a chromosome key point sequence prediction method and device.

Background

Key point detection is an important technique for identifying abnormal chromosome karyotypes in optical cell images. It helps identify specific regions of a chromosome, such as centromeres and telomeres, and can reveal structural changes such as breaks, deletions, inversions and translocations, which are common forms of chromosomal abnormality.

Key points generally refer to the end points at both ends of a chromosome and its centromere point. The centromere is the junction of the two sister chromatids of a metaphase chromosome and lies at the primary constriction of the chromosome; it divides each chromatid into a short arm (p) and a long arm (q). After chromosome segmentation and classification, each chromosome must be aligned by its centromere with its homologous chromosome. According to the international standard, all chromosomes in one metaphase spread must not only be arranged by number but also be oriented with the short arm up and the long arm down; in addition, doctors prefer that two homologous chromosomes be placed so that their centromeres are adjacent.

In prior research, convolutional neural networks are generally used to predict all key points of a chromosome image in a single pass, with the key points activated in the form of a heatmap. The point sets predicted this way are noisy, because the convolutional neural network may produce too many heatmap extrema, of which only the largest can be kept. Moreover, the predicted points are discrete and have no relationship to one another, so obvious low-level errors sometimes occur between key points, including points that almost coincide. As a result, existing key point prediction produces output that is hard to post-process and lacks internal logic, which makes a more reasonable prediction algorithm particularly important.

Disclosure of Invention

The invention aims to provide a chromosome key point sequence prediction method and device. The method aims at establishing a key point set sequence to construct an association relation between key points, replaces the existing key point discrete prediction method, and finally enables the prediction of the key point set to be more reasonable and reliable.

In a first aspect, the present invention provides a method for predicting a sequence of chromosomal keypoints, comprising:

S1, acquiring a chromosome image dataset annotated with 5 key points, wherein, with the chromosome correctly oriented, the 5 key points are the upper-left corner, the upper-right corner, the lower-left corner, the lower-right corner and the centromere center point of the chromosome;

S2, constructing a Transformer-based sequence prediction network model and training it on the dataset acquired in step S1 to obtain a fully trained chromosome key point sequence prediction model;

S3, inputting a new single chromosome image to be tested into the chromosome key point sequence prediction model together with a start placeholder and beginning cyclic prediction, in which each predicted point is fed back into the input to predict the next point until all 5 key points have been predicted, and drawing the chromosome image annotated with the 5 key point positions.

More specifically, the Transformer is composed of an encoder, which encodes the chromosome image, and a decoder, which predicts the 5 key points of the chromosome in sequence from the encoding of the chromosome image.

More specifically, in one data flow pass, the Transformer first encodes a chromosome image X of dimension 64×64 into a feature vector V, and then inputs the point vector P_last obtained from the previous predictions together with V into the decoder to obtain the point P_pred predicted in the current round; V has a length of 256, and P_last and P_pred have shape (n, 4098), where n is the number of points predicted so far.

More specifically, the encoder is responsible for encoding the chromosome image, including:

a process of encoding X into a vector V;

The convolution layers preceding global pooling in a ResNet neural network encode the chromosome image X into a feature matrix of dimension (C, H, W); the second and third dimensions of this matrix are flattened to obtain a feature matrix F of dimension (C, H×W); and the feature vector V is finally obtained through a self-attention operation, calculated as follows:

$$V = \mathrm{softmax}\left(\frac{(F W^Q)(F W^K)^{\top}}{\sqrt{d_k}}\right) F W^V$$

where W^Q, W^K and W^V are parameters of the model, d_k is the dimension, and softmax is the activation function commonly used in the machine learning field.

More specifically, P_last and P_pred are one-dimensional point vectors, which are converted to and from two-dimensional coordinates in Euclidean space. The conversion from the coordinates x, y in two-dimensional Euclidean space to the point vector P is as follows:

$$P = \mathrm{onehot}(y \times c + x + 2)$$

where onehot() denotes one-hot encoding of a number into a vector of length 4098, c denotes the number of columns of the image, and 2 is added because the first and second positions of the point vector P are the start and end placeholders, respectively. Similarly, the conversion from P back to x, y is as follows:

$$x = (\mathrm{argmax}(P) - 2) \mathbin{\%} c, \qquad y = (\mathrm{argmax}(P) - 2) \mathbin{//} c$$

where the symbol % denotes the remainder, the symbol // denotes integer division, and argmax() returns the index of the maximum element of the vector.
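The two conversions above can be illustrated with a minimal Python sketch. It assumes 0-based list positions and a 64-column image; the function names are hypothetical and not taken from the patent. The pair (19, 58) and 3733 is the one used in the embodiment's worked example.

```python
# Sketch of the coordinate <-> point-vector conversion.
# Assumptions: 0-based list positions, 64-column image, 2 placeholder slots.
IMAGE_COLS = 64                                            # c in the formulas
NUM_PLACEHOLDERS = 2                                       # start and end placeholders
VECTOR_LEN = IMAGE_COLS * IMAGE_COLS + NUM_PLACEHOLDERS    # 4098


def coord_to_index(x, y, c=IMAGE_COLS):
    """Position of the single 1 in the one-hot point vector for (x, y)."""
    return y * c + x + NUM_PLACEHOLDERS


def onehot(index, length=VECTOR_LEN):
    """One-hot vector of the given length with a 1 at `index`."""
    vec = [0.0] * length
    vec[index] = 1.0
    return vec


def argmax(vec):
    """Index of the largest element (the first one on ties)."""
    return max(range(len(vec)), key=vec.__getitem__)


def vector_to_coord(vec, c=IMAGE_COLS):
    """Recover (x, y) from a point vector via argmax, % and //."""
    k = argmax(vec) - NUM_PLACEHOLDERS
    return k % c, k // c


point = onehot(coord_to_index(19, 58))
print(argmax(point), vector_to_coord(point))   # prints: 3733 (19, 58)
```

Under these assumptions the largest coordinate (63, 63) maps to position 4097, the last slot of a length-4098 vector, which is consistent with the stated vector length.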

More specifically, the decoder combines a masked multi-head attention module, a multi-head attention module and several fully connected layers. The masked multi-head attention module processes the point vector P, and the multi-head attention module processes the preceding information together with the feature vector V of the image.

More specifically, the model is trained using the KL divergence as the loss function, as follows:

$$L_{KL} = \sum_{i} P_y(i) \log \frac{P_y(i)}{P(i)}$$

where P(i) denotes the i-th value of the predicted point vector and P_y(i) denotes the i-th value of the labeled point vector. With the KL divergence as the loss function, prediction of the point vector is treated as a probability distribution fitting process, so that every node of the point vector receives a gradient and the predicted distribution keeps moving toward a reasonable one. Training is repeated until the average Euclidean distance error converges to 1, at which point the model is saved and training ends.
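The loss described above can be sketched in plain Python as follows. The direction of the divergence (label distribution as reference), the softmax normalisation of the prediction and the `eps` guard are assumptions made for illustration; the text only states that the KL divergence between the predicted and labeled point vectors is used.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p_label, p_pred, eps=1e-12):
    """KL(P_y || P) = sum_i P_y(i) * log(P_y(i) / P(i)).

    Terms with P_y(i) == 0 contribute nothing; eps avoids log of zero
    (an assumption of this sketch, not something stated in the text).
    """
    total = 0.0
    for p_y, p in zip(p_label, p_pred):
        if p_y > 0.0:
            total += p_y * math.log(p_y / max(p, eps))
    return total


VECTOR_LEN = 4098
label = [0.0] * VECTOR_LEN
label[3733] = 1.0                                # one-hot labeled point vector

uniform_pred = softmax([0.0] * VECTOR_LEN)       # an untrained, flat prediction
loss_uniform = kl_divergence(label, uniform_pred)
loss_perfect = kl_divergence(label, label)
print(loss_uniform, loss_perfect)
```

With a one-hot label the sum reduces to the negative log of the probability assigned to the labeled position, so every position of the predicted vector influences the loss through the normalisation.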

In a second aspect, the present invention provides a chromosome keypoint sequence prediction apparatus comprising:

an acquisition unit for acquiring a chromosome image dataset annotated with 5 key points, wherein, with the chromosome correctly oriented, the 5 key points are the upper-left corner, the upper-right corner, the lower-left corner, the lower-right corner and the centromere center point of the chromosome;

a training unit for constructing a Transformer-based sequence prediction network model and training it on the dataset acquired in step S1 to obtain a fully trained chromosome key point sequence prediction model;

and a prediction output unit for inputting a new single chromosome image to be tested into the chromosome key point sequence prediction model together with a start placeholder and beginning cyclic prediction, in which each predicted point is fed back into the input to predict the next point until all 5 key points have been predicted, whereupon the chromosome image annotated with the 5 key point positions is drawn.

In a third aspect, the present invention provides an electronic device comprising: a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to perform the steps of the method of the first aspect above.

The beneficial effects of the invention are as follows: the Transformer-based chromosome key point sequence prediction method uses the self-attention mechanism of the Transformer to capture complex relationships and patterns in chromosome images effectively, thereby achieving accurate prediction of chromosome key points. The Euclidean distance error of the key points predicted by the trained model on the test set converges to less than 1, and the model detects key points very well on bifurcated and bent chromosomes. It can provide accurate prior information for subsequent chromosome alignment, classification, principal axis extraction, banding analysis and the like.

Drawings

For a further understanding of the nature and technical aspects of the present invention, reference should be made to the following detailed description of the invention and to the accompanying drawings, which are provided for purposes of reference only and are not intended to limit the invention.

FIG. 1 is a flow chart of the Transformer-based chromosome key point sequence prediction of the present invention;

FIG. 2 is a schematic diagram of 5 key points of a chromosome according to the present invention;

FIG. 3 is a schematic diagram of a process for sequentially predicting chromosomal keypoints according to the present invention;

FIG. 4 is a block diagram of the Transformer of the present invention;

FIG. 5 is a schematic diagram of the predictive process of the model of the present invention;

Marked in the figure as: 1-upper left corner point, 2-upper right corner point, 3-lower left corner point, 4-lower right corner point and 5-centromere center point.

Detailed Description

In order to further explain the technical means adopted by the present invention and the effects thereof, the following detailed description is given with reference to the preferred embodiments of the present invention and the accompanying drawings.

The following description of the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to fall within the scope of the application.

In the description of the present application, it should be understood that terms such as "center", "longitudinal", "lateral", "length", "width", "thickness", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner" and "outer" indicate orientations or positional relationships based on the drawings. They are used merely for convenience in describing the present application and simplifying the description, and do not indicate or imply that the apparatus or elements referred to must have a specific orientation or be configured and operated in a specific orientation; they should therefore not be construed as limiting the present application. Furthermore, the terms "first", "second" and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features referred to. Thus, a feature defined by "first" or "second" may explicitly or implicitly include one or more such features. In the description of the present application, "a plurality" means two or more, unless explicitly defined otherwise.

In the present application, the term "exemplary" is used to mean "serving as an example, instance, or illustration". Any embodiment described as "exemplary" in this disclosure is not necessarily to be construed as preferred or advantageous over other embodiments. The following description is presented to enable any person skilled in the art to make and use the application. In the following description, details are set forth for purposes of explanation. It will be apparent to one of ordinary skill in the art that the present application may be practiced without these specific details. In other instances, well-known structures and processes have not been described in detail so as not to obscure the description of the application with unnecessary detail. Thus, the present application is not intended to be limited to the embodiments shown, but is to be accorded the widest scope consistent with the principles and features disclosed herein.

The invention aims to establish a key point sequence that captures the relationships between key points, replacing the existing discrete key point prediction methods and making the predicted key point set more reasonable and reliable. The Transformer-based chromosome key point sequence prediction of the present invention is described in detail below.

Referring to fig. 1, an embodiment of the present invention provides a method for predicting a sequence of chromosome keypoints, including:

S1, acquiring a chromosome image dataset annotated with 5 key points, wherein, with the chromosome correctly oriented, the 5 key points are the upper-left corner 1, the upper-right corner 2, the lower-left corner 3, the lower-right corner 4 and the centromere center point 5 of the chromosome. FIG. 2 shows a chromosome 1; even though the sister chromatids of its long arm are not separated, the two key points 4 and 5 are still marked.

The chromosome image to be tested is segmented by an existing method to obtain the region of interest (ROI) of the chromosome image. The ROI image is input directly, without alignment, into the model trained in step S2 together with a start placeholder, and cyclic prediction begins: each predicted point is fed back into the input to predict the next point until all 5 key points have been predicted. As shown in FIG. 3, the 5 key points of the chromosome are predicted in turn, and each predicted point influences the prediction of the following points. Note that the Transformers shown in the figure are one and the same model.
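The cyclic prediction can be sketched as a simple loop. In this sketch `scripted_decoder` is a hypothetical stand-in for the trained decoder that merely replays the index sequence from the embodiment's worked example, and the feature list is a dummy; only the control flow (start placeholder, append each prediction, stop after 5 points) reflects the text.

```python
from typing import Callable, List, Sequence

START = 1            # start placeholder, written as index 1 in the embodiment
NUM_KEYPOINTS = 5    # the method always predicts exactly 5 key points


def predict_sequence(
    features: Sequence[float],
    decode_step: Callable[[Sequence[float], List[int]], int],
) -> List[int]:
    """Autoregressive loop: every predicted index is appended to the input
    sequence before the next index is predicted."""
    sequence = [START]
    for _ in range(NUM_KEYPOINTS):
        next_index = decode_step(features, list(sequence))
        sequence.append(next_index)
    return sequence


def scripted_decoder(features, prefix):
    """Hypothetical stand-in for a trained decoder: replays fixed indices
    chosen by how many entries the prefix already holds."""
    script = [3733, 2887, 829, 307, 2399]
    return script[len(prefix) - 1]


features = [0.0] * 256   # dummy stand-in for the encoder output V
result = predict_sequence(features, scripted_decoder)
print(result)            # prints: [1, 3733, 2887, 829, 307, 2399]
```

Because the same `decode_step` is called in every round, the sketch also mirrors the remark that the Transformers drawn in FIG. 3 are a single model applied repeatedly.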

The region of interest (ROI) is used to reduce the amount of computation and to filter out targets outside the region of interest, thereby increasing the speed and accuracy of the algorithm.

As shown in FIG. 4, the Transformer used in the present invention is composed of one encoder and one decoder. In one data flow pass, a chromosome image X of dimension 64×64 is first encoded into a feature vector V, and the point vector P_last obtained from the previous predictions is then input into the decoder together with the feature vector V to obtain the point P_pred predicted in the current round. The length of V is set to 256, and P_last and P_pred have shape (n, 4098), where n is the number of points predicted so far.

To encode X into the vector V, the present invention uses a ResNet-type neural network. The fully connected and classification layers of ResNet are removed and only the convolution layers preceding global pooling are retained; these encode the chromosome image X into a feature matrix of dimension (C, H, W). The second and third dimensions of the feature matrix are then flattened to obtain a feature matrix F of dimension (C, H×W), and the feature vector V is finally obtained through a self-attention operation, calculated as follows:

$$V = \mathrm{softmax}\left(\frac{(F W^Q)(F W^K)^{\top}}{\sqrt{d_k}}\right) F W^V$$

where W^Q, W^K and W^V are parameters of the model, d_k is the dimension, set to 256 in this example, and softmax is the activation function commonly used in the machine learning field, which is not described in detail here.
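As a rough illustration of the self-attention step, the following pure-Python sketch applies standard scaled dot-product attention to a small feature matrix F. The toy sizes (3 rows, 4 columns, d_k = 2) and weight values are arbitrary assumptions, and the sketch stops at the attention output: how that output is reduced to a vector of length 256 is not specified in the text and is not guessed here.

```python
import math


def matmul(a, b):
    """Matrix product of two lists-of-rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def softmax_rows(m):
    """Row-wise numerically stable softmax."""
    out = []
    for row in m:
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def self_attention(f, w_q, w_k, w_v):
    """softmax((F W^Q)(F W^K)^T / sqrt(d_k)) (F W^V)."""
    q, k, v = matmul(f, w_q), matmul(f, w_k), matmul(f, w_v)
    d_k = len(w_k[0])
    scores = [[s / math.sqrt(d_k) for s in row] for row in matmul(q, transpose(k))]
    return matmul(softmax_rows(scores), v)


# Toy feature matrix F with C = 3 rows and H*W = 4 columns (assumed sizes).
f = [[1.0, 0.0, 2.0, 1.0],
     [0.0, 1.0, 1.0, 0.0],
     [2.0, 1.0, 0.0, 1.0]]
w_q = [[0.5, 0.0], [0.0, 0.5], [0.5, 0.0], [0.0, 0.5]]
w_k = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
w_v = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]

out = self_attention(f, w_q, w_k, w_v)
print(len(out), len(out[0]))   # prints: 3 2
```

Each output row is a weighted average of the rows of F W^V, with weights given by the softmax of the scaled query-key scores, which is how every part of the encoded image can contribute to every other part.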

As described above, P_pred and P_last in FIG. 4 are both one-dimensional point vectors, whereas what the present invention needs to predict is two-dimensional coordinates in Euclidean space. Specifically, the conversion from the coordinates x, y in two-dimensional Euclidean space to the point vector P is as follows:

$$P = \mathrm{onehot}(y \times c + x + 2)$$

where onehot() denotes one-hot encoding of a number into a vector of length 4098, c denotes the number of columns of the image, and 2 is added because the first and second positions of the point vector P are the start and end placeholders, respectively. Similarly, the conversion from P back to x, y is as follows:

$$x = (\mathrm{argmax}(P) - 2) \mathbin{\%} c, \qquad y = (\mathrm{argmax}(P) - 2) \mathbin{//} c$$

The decoder in FIG. 4 combines a masked multi-head attention module, a multi-head attention module and several fully connected layers, wherein the masked multi-head attention module processes the point vector P and the multi-head attention module processes the preceding information together with the feature vector V of the image.

After the data have been processed as described above, the model can be trained. In the present invention, the prediction for each sample means predicting the (n+1)-th point given n already known points. In this example, the length of a point vector P is 4098, comprising the two start and end placeholders and 4096 coordinate points. If an MSE loss function were used, as is usual in point prediction, gradient back-propagation would be unnecessarily indirect, and the conversion from P to (x, y) involves an argmax operation, which is not differentiable. The more suitable KL divergence is therefore adopted as the loss function, with the following formula:

$$L_{KL} = \sum_{i} P_y(i) \log \frac{P_y(i)}{P(i)}$$

where P(i) denotes the i-th value of the predicted point vector and P_y(i) denotes the i-th value of the labeled point vector. With the KL divergence as the loss function, prediction of the point vector is treated as a probability distribution fitting process, so that every node of the point vector receives a gradient and the predicted distribution keeps moving toward a reasonable one. Training is repeated until the average Euclidean distance error converges to 1, at which point the model is saved and training ends.
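The stopping criterion refers to an average Euclidean distance between predicted and labeled key points. A minimal sketch of such a metric is given below; the function name, the use of pixel units and the threshold comparison are assumptions, and the labeled coordinates are hypothetical values invented for the example.

```python
import math


def mean_euclidean_distance(predicted, labeled):
    """Average straight-line distance between matching (x, y) points."""
    if len(predicted) != len(labeled) or not predicted:
        raise ValueError("need two non-empty point lists of equal length")
    distances = [
        math.hypot(px - lx, py - ly)
        for (px, py), (lx, ly) in zip(predicted, labeled)
    ]
    return sum(distances) / len(distances)


predicted = [(19, 58), (5, 45), (59, 12), (49, 4)]
labeled = [(19, 58), (5, 44), (60, 12), (49, 4)]   # hypothetical labels

error = mean_euclidean_distance(predicted, labeled)
should_stop = error <= 1.0   # assumed reading of "converges to 1"
print(error, should_stop)    # prints: 0.5 True
```

Unlike the KL loss, this metric is computed on the decoded coordinates, so it is suited to monitoring training rather than to back-propagation.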

This embodiment describes a complete prediction process using the model trained by the present invention. For convenience, a point vector P is represented here by a single number: for example, P = 1 means that the first node of the length-4098 vector P has the value 1 and all other nodes have the value 0, and so on.
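Using this index shorthand, the training setup in which the (n+1)-th point is predicted from n known points can be illustrated by splitting one labeled sequence into prefix and target pairs. The helper name is hypothetical, the example sequence is the one from the embodiment, and whether an end placeholder is also used as a final target during training is not stated in the text and is left out here.

```python
def prefix_target_pairs(sequence):
    """Split [start, k1, ..., k5] into (known prefix, next index) pairs."""
    return [(sequence[:n], sequence[n]) for n in range(1, len(sequence))]


labeled_sequence = [1, 3733, 2887, 829, 307, 2399]
pairs = prefix_target_pairs(labeled_sequence)
for prefix, target in pairs:
    print(prefix, "->", target)
```

Each pair corresponds to one decoding round: the prefix is what the decoder receives, and the target is the index whose one-hot vector serves as the label distribution.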

FIG. 5 shows a prediction process according to the present invention. A chromosome image with a resolution of 64×64 is encoded into a feature vector V by the encoder; this step needs to be performed only once, and the resulting feature vector V is reused in every subsequent decoding round. In the first round of decoding, the start placeholder point vector 1 is input, and the decoder outputs the first point vector of the chromosome, 3733, which represents the coordinates (19, 58). In the second round, the point vector 3733 obtained in the first round and the start placeholder point vector 1 are combined into a new point vector a, written [1, 3733], and input into the decoder, which yields the second-round point vector b, written [1, 3733, 2887], whose newly predicted point represents the coordinates (5, 45). Repeating this process yields in turn the third-round point vector c, [1, 3733, 2887, 829], with coordinates (59, 12); the fourth-round point vector d, [1, 3733, 2887, 829, 307], with coordinates (49, 4); and the fifth-round point vector e, [1, 3733, 2887, 829, 307, 2399], with coordinates (26, 37). Prediction stops once all 5 key points of the chromosome have been predicted.

The final predicted point vector in this example is [1, 3733, 2887, 829, 307, 2399]. After the leading placeholder is removed, the 5 points are converted to the coordinates (19, 58), (5, 45), (59, 12), (49, 4) and (26, 37) and drawn as shown in the figure. At this point all 5 key points of the chromosome have been obtained, and subsequent analyses such as alignment, classification, banding analysis, principal axis analysis and straightening can be carried out.
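The conversion of the first four indices can be checked by hand with c = 64, subtracting the 2 placeholder positions and then taking the remainder and the integer quotient:

```latex
\begin{aligned}
3733 - 2 &= 3731, & x &= 3731 \bmod 64 = 19, & y &= \lfloor 3731 / 64 \rfloor = 58 \\
2887 - 2 &= 2885, & x &= 2885 \bmod 64 = 5,  & y &= \lfloor 2885 / 64 \rfloor = 45 \\
 829 - 2 &= 827,  & x &= 827 \bmod 64 = 59,  & y &= \lfloor 827 / 64 \rfloor = 12 \\
 307 - 2 &= 305,  & x &= 305 \bmod 64 = 49,  & y &= \lfloor 305 / 64 \rfloor = 4
\end{aligned}
```

Applying the same steps to 2399 gives 2397 mod 64 = 29 and ⌊2397 / 64⌋ = 37, that is (29, 37) rather than the (26, 37) stated for the fifth round, so one of the two fifth-round values appears to contain a misprint.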

The invention provides a chromosome key point sequence prediction device, which comprises:

an acquisition unit for acquiring a chromosome image dataset annotated with 5 key points, wherein, with the chromosome correctly oriented, the 5 key points are the upper-left corner 1, the upper-right corner 2, the lower-left corner 3, the lower-right corner 4 and the centromere center point 5 of the chromosome;

The Transformer-based chromosome key point sequence prediction device uses the self-attention module of the Transformer to capture complex relationships and patterns in chromosome images efficiently, thereby achieving accurate prediction of chromosome key points. The Euclidean distance error of the key points predicted by the trained model on the test set converges to less than 1, and the model detects key points very well on bifurcated and bent chromosomes. It can provide accurate prior information for subsequent chromosome alignment, classification, principal axis extraction, banding analysis and the like.

An embodiment of the invention also provides an electronic device comprising a memory and a processor, wherein the memory stores a computer program which, when executed by the processor, causes the processor to perform the steps of the method described above.

An embodiment of the invention also provides a storage medium storing a computer program which, when executed by a processor, implements some or all of the steps of each embodiment of the Transformer-based chromosome key point sequence prediction provided by the invention. The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), a random access memory (RAM), or the like.

It will be apparent to those skilled in the art that the techniques of embodiments of the present invention may be implemented in software plus a necessary general purpose hardware platform. Based on such understanding, the technical solutions in the embodiments of the present invention may be embodied in essence or what contributes to the prior art in the form of a software product, which may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the embodiments or some parts of the embodiments of the present invention.

The same or similar parts of the various embodiments in this specification may be referred to one another. In particular, since the embodiments of the chromosome key point sequence prediction device are substantially similar to the method embodiments, their description is relatively brief, and reference is made to the description of the method embodiments where relevant.

The embodiments of the present invention described above do not limit the scope of the present invention.

Claims

1. A method for predicting a sequence of chromosomal keypoints, comprising:

2. The method for predicting a sequence of key points of a chromosome according to claim 1, wherein the Transformer is composed of an encoder for encoding the chromosome image and a decoder for predicting the 5 key points of the chromosome in sequence from the encoding of the chromosome image.

3. The method for predicting a sequence of key points of a chromosome according to claim 2, wherein, in one data flow pass, the Transformer encodes a chromosome image X of dimension 64×64 into a feature vector V, and then inputs the point vector P_last obtained from the previous predictions together with V into the decoder to obtain the point P_pred predicted in the current round; V has a length of 256, and P_last and P_pred have shape (n, 4098), where n is the number of points predicted so far.

4. A method of predicting a sequence of chromosomal keypoints according to claim 2, wherein the encoder is responsible for encoding the chromosomal image, comprising:

a process of encoding X into a vector V;

5. A method for predicting a sequence of chromosomal keypoints according to claim 3, wherein P_last and P_pred are one-dimensional point vectors, which are converted to and from two-dimensional coordinates in Euclidean space, and the conversion from the coordinates x, y in two-dimensional Euclidean space to the point vector P is as follows:

$$P = \mathrm{onehot}(y \times c + x + 2)$$

6. A method of predicting a sequence of chromosomal keypoints according to claim 2, wherein the decoder combines a masked multi-head attention module, a multi-head attention module and several fully connected layers, wherein the masked multi-head attention module processes the point vector P and the multi-head attention module processes the preceding information together with the feature vector V of the image.

7. The method for predicting a sequence of chromosomal keypoints according to claim 1, wherein the model is trained using KL divergence as a loss function, as follows:

$$L_{KL} = \sum_{i} P_y(i) \log \frac{P_y(i)}{P(i)}$$

8. A chromosome keypoint sequence prediction apparatus, comprising:

9. An electronic device, comprising: a memory and a processor, the memory storing a computer program that, when executed by the processor, causes the processor to perform the steps of the method of any of claims 1-7.