
CN115131551A - Target feature extraction method based on cross-correlation self-attention mechanism - Google Patents

Target feature extraction method based on cross-correlation self-attention mechanism

Info

Publication number
CN115131551A
CN115131551A
Authority
CN
China
Prior art keywords
matrix
attention mechanism
channel
cross
self
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210778826.5A
Other languages
Chinese (zh)
Inventor
袁帅
许景科
栾方军
张笑闻
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenyang Jianzhu University
Original Assignee
Shenyang Jianzhu University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenyang Jianzhu University filed Critical Shenyang Jianzhu University
Priority to CN202210778826.5A priority Critical patent/CN115131551A/en
Publication of CN115131551A publication Critical patent/CN115131551A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/25 Determination of region of interest [ROI] or a volume of interest [VOI]
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V10/443 Local feature extraction by matching or filtering
    • G06V10/449 Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
    • G06V10/451 Biologically inspired filters with interaction between the filter responses, e.g. cortical complex cells
    • G06V10/454 Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06V10/82 Arrangements for image or video recognition or understanding using neural networks
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07 Target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Multimedia (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Databases & Information Systems (AREA)
  • Computing Systems (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Biodiversity & Conservation Biology (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to the technical field of target detection and identification, and discloses a target feature extraction method based on a cross-correlation self-attention mechanism. The method first takes as input a feature map X of size H × W × C; a window-division operation is then applied to the feature map; a linear layer expands the channel dimension to 2 × C, and the result is split along the channel dimension into a matrix M and a matrix V; a cross-correlation matrix is obtained and an activation operation applied; self-attention and channel attention calculations follow; finally, a feature map Y of size H × W × C is output. The invention searches for correlations among the elements of the feature map to obtain features similar to the target, while sharing information among channels, thereby selecting attention regions in both the spatial and channel dimensions. The method improves the model's recognition of the information to be detected in an image and raises its recognition precision.

Description

Target feature extraction method based on cross-correlation self-attention mechanism
Technical Field
The invention relates to the technical field of target detection and identification, in particular to a target feature extraction method based on a cross-correlation self-attention mechanism.
Background
With the development of deep learning, image processing methods based on convolutional neural networks are becoming mainstream. In deep learning, as the amount of data grows rapidly, how to focus limited computing power on a target area with an attention mechanism has become a current research hotspot. Many studies integrate attention mechanisms into feature extraction: first, a convolutional neural network can use an attention mechanism with learnable weights to automatically determine which feature regions to highlight; second, attention imitates human visual behavior by finding the focal regions of an image.
Attention mechanisms fall into three categories: spatial attention, channel attention, and self-attention. Hou et al. proposed the Coordinate Attention (CA) mechanism, which processes the feature map in the spatial dimensions, pooling along each of the two directions separately, and can capture both long-range dependencies and the precise location of information. Wang et al. first pool the feature map and then apply a one-dimensional convolution over its channel dimension to obtain the interrelations between channels. To combine spatial attention with channel attention, Woo et al. proposed the CBAM module, which first assigns weights in the channel dimension and then searches for the target in the spatial dimension. The Transformer model first applied the self-attention mechanism in natural language processing, and the ViT model later extended it to computer vision. Self-attention can locate a target through the connections between image pixels and can capture global information in one pass. It follows that convolution-based spatial and channel attention mechanisms lack self-attention's ability to capture global information, while self-attention by itself provides no information interaction between channels. Based on the strengths and weaknesses of existing attention mechanisms, the invention provides a target feature extraction method based on a cross-correlation self-attention mechanism.
Disclosure of Invention
The invention aims to provide a target feature extraction method based on a cross-correlation self-attention mechanism. The method uses the global-information-capturing ability of the conventional self-attention mechanism, realizes information interaction among channels through a channel attention mechanism, and on this basis uses a cross-correlation matrix to find similar information in the feature map and thus locate the attention region in the image, thereby achieving efficient and accurate target identification.
The invention is realized as follows: a target feature extraction method based on a cross-correlation self-attention mechanism, executed according to the following steps.
S1: first, input a feature map X of size H × W × C.
S2: perform the window-division operation on the feature map: divide the input tensor into groups of n pixels along the height and width directions, so that each window has size n × n; finally, expand the three-dimensional tensor within each window from H × W × C to HW × 1 × C (a sketch of this step follows).
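As an illustration of step S2, below is a minimal PyTorch sketch of the window-division operation; the function name, the omitted batch dimension, and the assumption that H and W are divisible by n are ours, not the patent's:

```python
import torch

def window_partition(x: torch.Tensor, n: int) -> torch.Tensor:
    """Split an H x W x C feature map into non-overlapping n x n windows and
    flatten, taking the tensor from H x W x C to HW x 1 x C (step S2)."""
    H, W, C = x.shape
    x = x.view(H // n, n, W // n, n, C)        # group rows/cols into windows
    x = x.permute(0, 2, 1, 3, 4).contiguous()  # (H/n, W/n, n, n, C)
    return x.reshape(-1, 1, C)                 # all window pixels: (HW, 1, C)
```

Reshaping after the permute keeps the pixels of each window contiguous, which is what lets the later cross-correlation operate within windows.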
S3: expand the channel dimension to 2 × C through a linear layer, and split the matrix along the channel dimension into a matrix M and a matrix V, where each column of M and of V is given by formula (1) and formula (2):
[Formula (1), the definition of column i of matrix M, is rendered as an image in the original.]
[Formula (2), the definition of column i of matrix V, is rendered as an image in the original.]
where i ∈ [0, C] and C represents the number of channels of the matrix. A sketch of this step follows.
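A minimal PyTorch sketch of step S3; the channel count and tensor sizes are illustrative assumptions:

```python
import torch
import torch.nn as nn

C = 64                              # illustrative channel count
expand = nn.Linear(C, 2 * C)        # linear layer doubles the channel dimension

x = torch.randn(16 * 16, 1, C)      # flattened feature map, HW x 1 x C
m, v = expand(x).chunk(2, dim=-1)   # split along channels: M and V, each HW x 1 x C
```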
S4: obtain the cross-correlation matrix, proceeding as follows (a sketch of this step follows the sub-steps).
S4.1: copy each column vector of the M matrix into H × W columns; duplicate the resulting matrix of size (H·W) × (H·W), transpose one copy, and subtract the transposed copy from the other to obtain the difference between each element of the M matrix and every other element; this matrix is defined as M_dis.
S4.2: add the elements at the corresponding positions of each channel of the M_dis matrix to obtain the numerator matrix.
[The symbol for the numerator matrix is rendered as an image in the original.]
S4.3: define the denominator matrix as
[symbol rendered as an image in the original],
with the expression given by formula (3):
[Formula (3) is rendered as an image in the original.]
S4.4: divide the numerator matrix by the denominator matrix to obtain the similarity matrix M_mask, calculated as in formula (4):
[Formula (4) is rendered as an image in the original.]
S4.5: convolve the matrix M_mask with a convolution kernel of size 1 × 1.
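The PyTorch sketch below illustrates step S4. The pairwise-difference construction (S4.1-S4.2) follows the text; since formulas (3) and (4) appear only as images in the original, the denominator and the resulting normalization here are illustrative assumptions, not the patented expressions:

```python
import torch
import torch.nn as nn

def cross_correlation_mask(m: torch.Tensor, conv1x1: nn.Conv2d) -> torch.Tensor:
    """m: (HW, C) matrix from step S3. Returns a convolved similarity mask."""
    # S4.1: difference between every element of M and every other element,
    # per channel, via broadcasting instead of explicit copy-and-transpose
    m_dis = m.unsqueeze(1) - m.unsqueeze(0)              # (HW, HW, C)
    # S4.2: add corresponding positions over channels -> numerator matrix
    numerator = m_dis.sum(dim=-1)                        # (HW, HW)
    # S4.3-S4.4: assumed denominator; divide to get the similarity matrix
    denominator = m_dis.abs().sum(dim=-1).clamp(min=1e-6)
    m_mask = numerator / denominator                     # (HW, HW)
    # S4.5: 1 x 1 convolution over the mask treated as a one-channel map
    return conv1x1(m_mask.unsqueeze(0).unsqueeze(0)).squeeze(0).squeeze(0)

# usage: conv = nn.Conv2d(1, 1, kernel_size=1); cross_correlation_mask(torch.randn(64, 8), conv)
```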
S5: activation operation: activate the convolved tensor with an activation function, defined as formula (5):
M′ = softmax(1 − Sigmoid(X))    formula (5)
where X is the input tensor and M′ is the matrix after the activation function.
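Formula (5) translates directly into code; a sketch assuming the softmax runs over the last dimension of the mask:

```python
import torch

def attention_activation(x: torch.Tensor) -> torch.Tensor:
    """Formula (5): M' = softmax(1 - Sigmoid(X))."""
    return torch.softmax(1.0 - torch.sigmoid(x), dim=-1)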
S6: perform the self-attention calculation: multiply the activated matrix M′ by the matrix V; rearrange the resulting tensor into H × W × C; then multiply the rearranged result, channel by channel, by the channel weights obtained from the channel attention mechanism.
S7: perform the channel attention calculation.
S8: output a feature map Y of size H × W × C.
Further, a channel attention mechanism is added: first, average-pool the input feature map of size H × W × C to obtain a feature map of size 1 × 1 × C; apply a convolution with a one-dimensional kernel of size 3 × 1 to this feature map; then activate the convolved feature values with a Sigmoid function, implemented as formula (6):
E(X) = Sigmoid(C3×1(AvgPool(X)))    formula (6)
where X denotes the input feature and C3×1 denotes the convolution operation with kernel size 3 × 1.
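A minimal sketch of this channel attention branch (formula (6)), assuming PyTorch and NCHW layout; the module name is ours:

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """E(X) = Sigmoid(C3x1(AvgPool(X))): global average pooling to one value
    per channel, a 1-D convolution of kernel size 3 across the channel axis,
    then a Sigmoid, yielding one attention weight per channel."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv1d(1, 1, kernel_size=3, padding=1, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:   # x: (B, C, H, W)
        y = x.mean(dim=(2, 3))                    # average pool -> (B, C)
        y = self.conv(y.unsqueeze(1)).squeeze(1)  # 1-D conv across channels
        return torch.sigmoid(y)                   # channel weights, (B, C)
```

In step S6 these weights multiply the rearranged self-attention output channel by channel.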
Further, the network model used is YOLOv5, operated as follows:
S8.1: acquire a data set and apply Mosaic data enhancement to it; feed the enhanced data into the network for training.
S8.2: apply the target feature extraction method based on the cross-correlation self-attention mechanism to the YOLOv5 network, replacing the last three C3 modules in the neck structure as follows:
S8.2.1: copy the input tensor into two parts and process them through two branches;
S8.2.2: pass one branch through a 1 × 1 convolution and the modified self-attention mechanism; pass the other branch through a 1 × 1 convolution;
S8.2.3: concatenate the outputs of the two branches along the channel dimension and apply a 1 × 1 convolution.
S9: the optimization algorithm uses stochastic gradient descent (SGD) as the optimizer, with 16 pictures per training batch, an initial learning rate of 1e-2, a weight decay of 5e-4, and a momentum of 0.937, training for 300 epochs; 3 epochs of warm-up training are used at the start (see the optimizer sketch below).
S10: after the model is trained, predict on a picture to obtain the result.
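The training configuration of step S9 maps onto a PyTorch optimizer as follows; the model object is a stand-in, and the warm-up schedule itself is not detailed in the text:

```python
import torch

model = torch.nn.Conv2d(3, 16, 3)  # stand-in for the YOLOv5 network
optimizer = torch.optim.SGD(
    model.parameters(),
    lr=1e-2,            # initial learning rate
    momentum=0.937,     # momentum
    weight_decay=5e-4,  # weight attenuation parameter
)
EPOCHS, WARMUP_EPOCHS, BATCH_SIZE = 300, 3, 16
```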
Further, the present invention provides a computer readable storage medium having stored thereon a computer program which, when executed by a master controller, implements a method as claimed in any one of the above.
Compared with the prior art, the invention has the beneficial effects that:
The method searches for correlations among the elements of the feature map, obtains features similar to the target, and shares information among channels, thereby selecting attention regions in both the spatial and channel dimensions. This improves the model's recognition of the information to be detected in an image and raises its recognition precision.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required in the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present invention and therefore should not be considered limiting of scope; those skilled in the art can obtain other related drawings from them without inventive effort.
FIG. 1 is a flow chart of a model of the present invention.
Fig. 2 is a block diagram of the operation of the present invention in the YOLOv5 network.
FIG. 3 is a block diagram of the cross-correlation self-attention mechanism of the present invention.
FIG. 4 is a flow chart of the cross-correlation self-attention mechanism of the present invention.
Fig. 5 is a block diagram of the channel attention operation of the present invention.
FIG. 6 is a schematic diagram of the numerator matrix in the cross-correlation self-attention mechanism of the present invention.
Detailed Description
In order to make the objects, technical solutions, and advantages of the embodiments of the present invention more apparent, the technical solutions are described below clearly and completely with reference to the accompanying drawings. The described embodiments are some, but not all, embodiments of the present invention; all other embodiments obtained by a person skilled in the art without inventive effort on the basis of these embodiments fall within the scope of protection of the present invention. The following detailed description, presented in the figures, is therefore not intended to limit the scope of the invention as claimed, but merely represents selected embodiments.
Referring to FIGS. 1-6, in the target feature extraction method based on the cross-correlation self-attention mechanism, the network model is YOLOv5, and the YOLOv5 modules used in the invention include the following (a sketch of the Focus module follows this list):
1. Focus module: expands the three channels of an RGB image fourfold into twelve channels, then applies a convolution to obtain a 2x-downsampled feature map;
2. Conv module: comprises a two-dimensional convolution, batch normalization, and an activation function;
3. SPP module: applies max-pooling operations of 5 × 5, 9 × 9, and 13 × 13 to the input respectively and obtains fused features through convolution;
4. C3 module: comprises three Conv modules and a Bottleneck structure; one branch passes through a Conv module and the Bottleneck module, the other through a Conv module, and the results of the two branches undergo a Concat operation before passing through a final Conv module;
5. Cross-correlation C3 module: replaces the Bottleneck structure of the C3 module with the cross-correlation self-attention mechanism.
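A sketch of the Focus module above, assuming PyTorch; the output width and kernel size are illustrative assumptions:

```python
import torch
import torch.nn as nn

class Focus(nn.Module):
    """Samples every other pixel in each spatial direction, turning the three
    RGB channels into twelve, then convolves to get a 2x-downsampled map."""
    def __init__(self, c_out: int = 32):
        super().__init__()
        self.conv = nn.Conv2d(12, c_out, kernel_size=3, padding=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:   # x: (B, 3, H, W)
        x = torch.cat([x[..., ::2, ::2], x[..., 1::2, ::2],
                       x[..., ::2, 1::2], x[..., 1::2, 1::2]], dim=1)
        return self.conv(x)                                # (B, c_out, H/2, W/2)
```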
In this embodiment, after combining the invention with the YOLOv5 network, the operation steps are as follows:
S1: acquire a data set, divide the data, and apply Mosaic data enhancement to the data set.
S2: build the YOLOv5 network model.
S3: apply the target feature extraction method based on the cross-correlation self-attention mechanism to the YOLOv5 network, replacing the last three C3 modules in the neck structure as follows (see the sketch after these steps):
S3.1: copy the input tensor into two parts and process them through two branches.
S3.2: pass one branch through a 1 × 1 convolution and the modified self-attention mechanism.
S3.3: pass the other branch through a 1 × 1 convolution.
S3.4: concatenate the outputs of the two branches along the channel dimension and apply a 1 × 1 convolution.
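A sketch of the replacement block of steps S3.1-S3.4, assuming PyTorch; `attention` stands in for the cross-correlation self-attention block and must preserve the tensor shape:

```python
import torch
import torch.nn as nn

class CrossCorrelationC3(nn.Module):
    """Two branches: 1x1 conv + modified self-attention, and 1x1 conv alone;
    outputs are concatenated along channels and fused by a final 1x1 conv."""
    def __init__(self, c: int, attention: nn.Module):
        super().__init__()
        self.branch1 = nn.Sequential(nn.Conv2d(c, c, 1), attention)
        self.branch2 = nn.Conv2d(c, c, 1)
        self.fuse = nn.Conv2d(2 * c, c, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y = torch.cat([self.branch1(x), self.branch2(x)], dim=1)  # concat
        return self.fuse(y)
```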
S4: feed the processed data set into the model for training, and evaluate the trained model on test images.
S5: the optimization algorithm uses stochastic gradient descent (SGD) as the optimizer, with 16 pictures per training batch, an initial learning rate of 1e-2, a weight decay of 5e-4, and a momentum of 0.937, training for 300 epochs; 3 epochs of warm-up training are used at the start. The invention is built with the PyTorch framework and runs on an Intel Xeon Gold 5320 CPU @ 2.20 GHz and an NVIDIA RTX A4000 GPU under Ubuntu 18.04.
S6: comparing the original YOLOv5 model with the YOLOv5 model augmented with the proposed method, the test results of the two networks are shown in Table 1:

TABLE 1 Comparison of results

Model             Precision (%)   Recall (%)   AP (%)
YOLOv5            78.6            71.2         73.3
Proposed method   79.2            76.9         77.1
The precision index measures, of all the targets predicted by the model, how many are real targets. The recall index measures, of all real targets, how many the model successfully predicted. Average precision (AP) balances the two: with recall as the abscissa and precision as the ordinate, it is the area under the resulting precision-recall (PR) curve.
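As a worked illustration of the AP definition above (the PR points are made up, not values from the paper):

```python
def average_precision(recall, precision):
    """Area under the precision-recall curve by trapezoidal integration;
    points are assumed sorted by increasing recall."""
    area = 0.0
    for i in range(1, len(recall)):
        area += (recall[i] - recall[i - 1]) * (precision[i] + precision[i - 1]) / 2.0
    return area

r = [0.0, 0.5, 0.75, 1.0]       # illustrative recall points
p = [1.0, 0.9, 0.8, 0.6]        # illustrative precision points
print(average_precision(r, p))  # 0.8625
```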
By completing the steps, efficient and accurate target identification can be realized, and the target prediction accuracy can be improved.
In this embodiment, the invention provides a computer-readable storage medium having stored thereon a computer program which, when executed by a host controller, implements the method of any one of the above.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (9)

1. A target feature extraction method based on a cross-correlation self-attention mechanism, characterized by comprising the following steps:
S1: first, input a feature map X of size H × W × C;
S2: perform a window-division operation on the feature map;
S3: expand the channel dimension to 2 × C through a linear layer, and split the matrix along the channel dimension into a matrix M and a matrix V;
S4: obtain the cross-correlation matrix;
S5: perform the activation operation;
S6: perform the self-attention calculation;
S7: perform the channel attention calculation;
S8: output a feature map Y of size H × W × C.
2. The target feature extraction method based on the cross-correlation self-attention mechanism according to claim 1, characterized in that step S2 comprises the following steps:
S2.1: divide the input tensor into groups of n pixels along the height and width directions, each window having size n × n;
S2.2: expand the three-dimensional tensor within each window from H × W × C to HW × 1 × C.
3. The target feature extraction method based on the cross-correlation self-attention mechanism according to claim 1, characterized in that step S3 comprises the following steps:
expand the channel dimension to 2 × C through a linear layer; split the matrix along the channel dimension into a matrix M and a matrix V, where each column of M and of V is given by formulas (1) and (2):
[Formula (1), the definition of column i of matrix M, is rendered as an image in the original.]
[Formula (2), the definition of column i of matrix V, is rendered as an image in the original.]
where i ∈ [0, C] and C represents the number of channels of the matrix.
4. The target feature extraction method based on the cross-correlation self-attention mechanism according to claim 1, characterized in that step S4 comprises the following steps:
S4.1: copy each column vector of the M matrix into H × W columns; duplicate the resulting matrix of size (H·W) × (H·W), transpose one copy, and subtract the transposed copy from the other to obtain the difference between each element of the M matrix and every other element; this matrix is defined as M_dis;
S4.2: add the elements at the corresponding positions of each channel of the M_dis matrix to obtain the numerator matrix
[symbol rendered as an image in the original];
S4.3: define the denominator matrix as
[symbol rendered as an image in the original],
with the expression given by formula (3):
[Formula (3) is rendered as an image in the original.]
S4.4: divide the numerator matrix by the denominator matrix to obtain the similarity matrix M_mask, calculated as in formula (4);
[Formula (4) is rendered as an image in the original.]
S4.5: convolve the matrix M_mask with a convolution kernel of size 1 × 1.
5. The target feature extraction method based on the cross-correlation self-attention mechanism according to claim 1, characterized in that step S5 is performed as follows: activate the convolved tensor with an activation function, defined as formula (5):
M′ = softmax(1 − Sigmoid(X))    formula (5)
where X is the input tensor and M′ is the matrix after the activation function.
6. The target feature extraction method based on the cross-correlation self-attention mechanism according to claim 1, characterized in that step S6 comprises the following steps: multiply the activated matrix M′ by the matrix V; rearrange the resulting tensor into H × W × C; then multiply the rearranged result, channel by channel, by the channel weights obtained from the channel attention mechanism.
7. The target feature extraction method based on the cross-correlation self-attention mechanism according to claim 1 or 6, characterized in that a channel attention mechanism is added: first, average-pool the input feature map of size H × W × C to obtain a feature map of size 1 × 1 × C;
apply a convolution with a one-dimensional kernel of size 3 × 1 to the feature map; activate the convolved feature values with a Sigmoid function, implemented as formula (6):
E(X) = Sigmoid(C3×1(AvgPool(X)))    formula (6)
where X denotes the input feature and C3×1 denotes the convolution operation with kernel size 3 × 1.
8. The target feature extraction method based on the cross-correlation self-attention mechanism according to claim 1, characterized in that the network model is YOLOv5, and the method comprises the following steps:
S8.1: acquire a data set and apply Mosaic data enhancement to it; feed the enhanced data into the network for training;
S8.2: apply the target feature extraction method based on the cross-correlation self-attention mechanism to the YOLOv5 network and replace the last three C3 modules in the neck structure, the replacement comprising:
S8.2.1: copy the input tensor into two parts and process them through two branches;
S8.2.2: pass one branch through a 1 × 1 convolution and the improved self-attention mechanism; pass the other branch through a 1 × 1 convolution;
S8.2.3: concatenate the outputs of the two branches along the channel dimension and apply a 1 × 1 convolution;
S8.3: the optimization algorithm uses stochastic gradient descent (SGD) as the optimizer, with 16 pictures per training batch, an initial learning rate of 1e-2, a weight decay of 5e-4, and a momentum of 0.937, training for 300 epochs, with 3 epochs of warm-up training at the start;
S8.4: after the model is trained, predict on a picture to obtain the result.
9. A computer-readable storage medium, on which a computer program is stored, which, when executed by a master controller, implements the method of any one of claims 1-8.
CN202210778826.5A 2022-06-30 2022-06-30 Target feature extraction method based on cross-correlation self-attention mechanism Pending CN115131551A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210778826.5A CN115131551A (en) 2022-06-30 2022-06-30 Target feature extraction method based on cross-correlation self-attention mechanism


Publications (1)

Publication Number Publication Date
CN115131551A true CN115131551A (en) 2022-09-30

Family

ID=83380949

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210778826.5A Pending CN115131551A (en) 2022-06-30 2022-06-30 Target feature extraction method based on cross-correlation self-attention mechanism

Country Status (1)

Country Link
CN (1) CN115131551A (en)


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115937197A (en) * 2023-01-05 2023-04-07 哈尔滨市科佳通用机电股份有限公司 Method for detecting breaking fault of pull rod chain of manual brake
CN115937197B (en) * 2023-01-05 2023-09-08 哈尔滨市科佳通用机电股份有限公司 Method for detecting breaking fault of pull rod chain of manual brake


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination