CN115147703B - Garbage segmentation method and system based on GinTrans network - Google Patents
- Publication number
- CN115147703B CN202210901322.8A
- Authority
- CN
- China
- Prior art keywords
- module
- feature
- output
- characteristic
- frequency
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/10—Image acquisition
- G06V10/12—Details of acquisition arrangements; Constructional details thereof
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
- G06V10/806—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/07—Target detection
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Evolutionary Computation (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Computing Systems (AREA)
- Software Systems (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Medical Informatics (AREA)
- Databases & Information Systems (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Molecular Biology (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a garbage segmentation method based on a GinTrans network, comprising the following steps: (1) image segmentation system setup; (2) image acquisition input; (3) feature map extraction; (4) cutting identification recombination; (5) remodelling and fusion; (6) jump aggregation; (7) segmentation output. The invention also discloses an image segmentation system. The invention generates the feature maps required for feature mining through linear transformations, which reduces the number of network parameters and the complexity, preserves segmentation efficiency, and improves response speed. The Bi-Frequency Transformer (BiFTrans block) module processes high-frequency and low-frequency features separately, capturing high-frequency details together with low-frequency global information; this ensures the speed and stability of image processing, improves the precision and accuracy of garbage target segmentation, and improves classification efficiency.
Description
Technical Field
The invention relates to the technical field of vision processing, in particular to a garbage segmentation method and system based on a GinTrans network.
Background
With rapid economic development, living standards have improved and the garbage people generate has become increasingly varied, so uniform landfilling no longer fits the goals of protecting nature and pursuing green development. Sorting garbage before treatment enables resource recycling, reduces harm to soil, and prevents air pollution. Garbage classification is therefore an important step toward a green, environmentally friendly society and the recycling of resources. When classification starts with each household and each individual, a large amount of manpower and material resources is saved in subsequent work, recyclable garbage can be recycled, and non-recyclable garbage can be correctly discarded and disposed of. At present, in the 46 key cities that piloted garbage classification in China, the coverage rate of residential communities with household garbage classification is 86.6%, the average household garbage recycling rate is 30.4%, and kitchen garbage treatment capacity rose from 34,700 tons per day in 2019 to 62,800 tons per day in 2020.
However, although China has made substantial progress in household garbage classification and treatment in recent years, garbage classification remains a difficult problem in some cities and communities. Most garbage sorting centers still rely on manual sorting, which is inefficient and exposes workers to injury from sharp items. Using artificial intelligence, computer vision, and image processing to segment garbage can greatly improve sorting efficiency and enable accurate classification, picking out recyclable garbage that is mixed in with other garbage so that resources are recycled efficiently. Existing vision-based garbage classification systems nevertheless have the following problems: 1. existing garbage segmentation methods based on traditional image processing have low segmentation precision; 2. existing segmentation methods based on conventional deep learning networks have many model parameters and high computational complexity, which reduces the speed and efficiency of subsequent image processing; 3. garbage targets are highly varied and severely deformed and displaced, and the boundaries between targets are extremely unclear, so existing methods cannot capture detail and contour information well; segmentation is therefore difficult, the mis-segmentation rate is high, and subsequent classification accuracy suffers.
Disclosure of Invention
The invention provides a garbage segmentation method based on a GinTrans network, aiming at the defects of the prior art.
The invention also discloses an image segmentation system.
The technical scheme adopted by the invention for achieving the purpose is as follows:
A garbage segmentation method based on a GinTrans network comprises the following steps:
(1) Image segmentation system setup: the image segmentation system comprises an image acquisition module, an extraction module, a cutting identification recombination module, a remodelling fusion output module, a jump connection aggregation module, and a segmentation output module, wherein the extraction module is provided with a GhostNet network, the remodelling fusion output module is provided with a Bi-Frequency Transformer (BiFTrans block) module, the jump connection aggregation module is provided with an up-sampling recovery module and a GhostNet down-sampling feature extraction module, and the segmentation output module is provided with a segmentation head module;
(2) Image acquisition input: the image acquisition module is connected to an image acquisition camera; the camera captures images of garbage on the intelligent garbage collection and sorting line and transmits them to the image acquisition module, and each image acquired by the image acquisition module is passed as image input data to the extraction module for processing;
(3) Feature map extraction: the extraction module processes the image input data, using the GhostNet network to extract low-level features of the image input data and produce output feature maps for the next step;
(4) Cutting identification recombination: the cutting identification recombination module performs a cutting operation on the extracted output feature maps, dividing each output feature map into a grid; after gridding, marker features are generated by linear mapping, the marker features form a marker feature sequence, and the marker feature sequence serves as the input of the Bi-Frequency Transformer (BiFTrans block) module;
(5) Remodelling and fusion: the remodelling fusion output module reshapes the input marker feature sequence and uses the dual-frequency mixer in the Bi-Frequency Transformer (BiFTrans block) module to fuse the features from different frequencies, yielding a fused feature output;
(6) Jump aggregation: the up-sampling recovery module restores the fused feature output to the same size as the image input data; the GhostNet down-sampling feature extraction module extracts low-level features from the feature map of the corresponding resolution level, and the feature map obtained by the up-sampling recovery module is joined by a jump (skip) link to the same-resolution feature map obtained by the GhostNet down-sampling feature extraction module, so that feature maps from different resolution levels are aggregated;
(7) Segmentation output: the segmentation head module of the segmentation output module performs segmentation on the aggregated feature map to obtain the target image.
As a further improvement, the GhostNet network is provided with a Ghost module, which is designed as follows:
(3.1) the garbage image to be identified from step (2) is taken as input data, and the GhostNet network generates original feature maps $f_i'$ of different resolutions by convolution;
(3.2) the extraction module contains linear operations $\Phi$ and uses them to generate Ghost feature maps as the output feature maps, according to $f_{ij} = \Phi_{i,j}(f_i')$, where $f_i'$ is the $i$-th original feature map obtained by convolution of the garbage image to be identified, $\Phi_{i,j}$ is the $j$-th linear operation applied to the $i$-th original feature map, and $f_{ij}$ is the resulting $j$-th Ghost feature map.
As a further improvement, step (4) further comprises the following steps:
(4.1) each Ghost feature map output in step (3) is cut into a grid, where each feature map has size $H \times W$, with $H$ and $W$ being the length and width of the feature map, and each grid sub-feature map has size $s \times s$, with $s$ being the side length of a grid cell; $s$ divides both $H$ and $W$, so the feature map can be cut evenly into grid sub-feature maps;
(4.2) the gridded sub-feature maps are linearly mapped, each grid sub-feature map being mapped to one marker feature, and all of them together form the marker feature sequence $\{X_i\}$, $i = 1, 2, \dots, N$, where $N$ is the number of marker features.
As a further improvement, step (5) further comprises up-sampling aggregation and jump linking, including the following steps:
(5.1) layer normalization is first applied to the marker feature sequence: $X = \mathrm{LN}(\hat{X})$, where $X$ is the marker feature sequence after layer normalization and $\hat{X}$ is the marker feature sequence obtained in step (4);
(5.2) each marker feature in the sequence from the previous step is decomposed by frequency into a high-frequency marker feature and a low-frequency marker feature;
(5.3) the high-frequency marker features first undergo a max pooling operation, then pass through a linear layer, and then through a depthwise separable convolution layer; the pooling layer, the linear layer, and the depthwise separable convolution layer form the high-frequency feature processor, whose output is $Y_{high} = \mathrm{DConv}(\mathrm{FC}(\mathrm{MaxPool}(X_{high})))$, where $Y_{high}$ is the feature output of the high-frequency feature processor, DConv is depthwise separable convolution, FC is the linear fully connected layer, MaxPool is max pooling, and $X_{high}$ is the high-frequency marker feature obtained by the decomposition in step (5.2);
(5.4) the low-frequency marker features first undergo an average pooling operation, then pass through a self-attention layer MSA, and are then up-sampled to compensate for the dimension reduction caused by average pooling; the average pooling layer, the self-attention layer, and the up-sampling layer form the low-frequency feature processor, whose output is $Y_{low} = \mathrm{Upsample}(\mathrm{MSA}(\mathrm{AvePool}(X_{low})))$, where $Y_{low}$ is the feature output of the low-frequency feature processor, Upsample is up-sampling, MSA is the self-attention mechanism, AvePool is average pooling, and $X_{low}$ is the low-frequency marker feature obtained by the decomposition in step (5.2);
(5.5) high- and low-frequency feature fusion: the high-frequency feature output and the low-frequency feature output are fused to obtain the fused output $Y_o = \mathrm{Concat}(Y_{high}, Y_{low})$, where $Y_o$ is the fused feature output and Concat is the feature concatenation function;
(5.6) layer normalization is applied again to the fused feature output;
(5.7) after normalization, the result passes through a feed-forward network FFN to give the final output of the BiFTrans block module [equation missing in source], where $\hat{X}$ is the marker feature sequence of step (4), $Y_o$ is the feature output after high- and low-frequency fusion, FFN denotes feed-forward network processing, and LN denotes layer normalization.
The beneficial effects of the invention are as follows. Through the linear transformations of GhostNet, the feature maps required for feature mining are generated with simple linear operations, which reduces the number of network parameters and the complexity, preserves the efficiency of subsequent segmentation, and improves response speed. The dual-frequency mixer in the Bi-Frequency Transformer (BiFTrans block) module processes high-frequency and low-frequency features separately, capturing high-frequency details while also taking low-frequency global information into account, which ensures the speed and stability of image processing. By fusing the processed high-frequency and low-frequency features, the GhostNet-based segmentation network learns the combined high- and low-frequency information in garbage image data more effectively, which improves the precision and accuracy of segmenting garbage targets that are severely deformed and displaced and have blurred boundaries, and improves subsequent classification efficiency.
The invention will be further described with reference to the drawings and the detailed description.
Drawings
Fig. 1 is a flow chart of a garbage segmentation method based on GinTrans network in the present embodiment;
Fig. 2 is a schematic diagram of the GinTrans network structure of the present embodiment;
Fig. 3 is a schematic structural diagram of the Ghost module of the present embodiment;
Fig. 4 is a schematic structural diagram of the BiFTrans block module of the present embodiment;
Fig. 5 is a schematic structural diagram of the dual-frequency mixer of the present embodiment.
Detailed Description
The following description is of the preferred embodiments of the invention, and is not intended to limit the scope of the invention.
Referring to Figs. 1 to 5, this embodiment of the garbage segmentation method based on the GinTrans network includes the following steps:
(1) Image segmentation system setup: the image segmentation system comprises an image acquisition module, an extraction module, a cutting identification recombination module, a remodelling fusion output module, a jump connection aggregation module, and a segmentation output module, wherein the extraction module is provided with a GhostNet network, the remodelling fusion output module is provided with a Bi-Frequency Transformer (BiFTrans block) module, the jump connection aggregation module is provided with an up-sampling recovery module and a GhostNet down-sampling feature extraction module, and the segmentation output module is provided with a segmentation head module;
(2) Image acquisition input: the image acquisition module is connected to an image acquisition camera; the camera captures images of garbage on the intelligent garbage collection and sorting line and transmits them to the image acquisition module, and each image acquired by the image acquisition module is passed as image input data to the extraction module for processing;
(3) Feature map extraction: the extraction module processes the image input data, using the GhostNet network to extract low-level features of the image input data and produce output feature maps for the next step;
(4) Cutting identification recombination: the cutting identification recombination module performs a cutting operation on the extracted output feature maps, dividing each output feature map into a grid; after gridding, marker features are generated by linear mapping, the marker features form a marker feature sequence, and the marker feature sequence serves as the input of the Bi-Frequency Transformer (BiFTrans block) module;
(5) Remodelling and fusion: the remodelling fusion output module reshapes the input marker feature sequence and uses the dual-frequency mixer in the Bi-Frequency Transformer (BiFTrans block) module to fuse the features from different frequencies, yielding a fused feature output;
(6) Jump aggregation: the up-sampling recovery module restores the fused feature output to the same size as the image input data; the GhostNet down-sampling feature extraction module extracts low-level features from the feature map of the corresponding resolution level, and the feature map obtained by the up-sampling recovery module is joined by a jump (skip) link to the same-resolution feature map obtained by the GhostNet down-sampling feature extraction module, so that feature maps from different resolution levels are aggregated;
(7) Segmentation output: the segmentation head module of the segmentation output module performs segmentation on the aggregated feature map to obtain the target image.
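The jump aggregation of step (6) can be illustrated with a small sketch. This is only a minimal NumPy illustration under stated assumptions, not the patented implementation: the text does not say how up-sampling is done or how the jump link combines the two feature maps, so nearest-neighbour up-sampling and channel-wise concatenation are assumed here, and the function names `upsample_nearest` and `skip_aggregate` are hypothetical.

```python
import numpy as np


def upsample_nearest(feat, factor):
    """Nearest-neighbour up-sampling of a (C, H, W) feature map (assumed method)."""
    return feat.repeat(factor, axis=1).repeat(factor, axis=2)


def skip_aggregate(decoder_feat, encoder_feat):
    """Join an up-sampled decoder map with the same-resolution encoder map.

    decoder_feat: (C1, h, w) fused feature output from the transformer stage.
    encoder_feat: (C2, H, W) GhostNet feature map at the target resolution.
    Channel-wise concatenation is an assumption; the source only says the
    two maps are joined by a jump link and aggregated.
    """
    if encoder_feat.shape[1] % decoder_feat.shape[1]:
        raise ValueError("encoder height must be a multiple of decoder height")
    factor = encoder_feat.shape[1] // decoder_feat.shape[1]
    up = upsample_nearest(decoder_feat, factor)
    if up.shape[1:] != encoder_feat.shape[1:]:
        raise ValueError("spatial sizes do not match after up-sampling")
    return np.concatenate([up, encoder_feat], axis=0)
```

For example, a decoder map of shape (8, 2, 2) combined with an encoder map of shape (4, 4, 4) gives an aggregated map of shape (12, 4, 4), which a segmentation head could then consume.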
As a further improvement, the GhostNet network is provided with a Ghost module, which is designed as follows:
(3.1) the garbage image to be identified from step (2) is taken as input data, and the GhostNet network generates original feature maps $f_i'$ of different resolutions by convolution;
(3.2) the extraction module contains linear operations $\Phi$ and uses them to generate Ghost feature maps as the output feature maps, according to $f_{ij} = \Phi_{i,j}(f_i')$, where $f_i'$ is the $i$-th original feature map obtained by convolution of the garbage image to be identified, $\Phi_{i,j}$ is the $j$-th linear operation applied to the $i$-th original feature map, and $f_{ij}$ is the resulting $j$-th Ghost feature map.
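The relation $f_{ij} = \Phi_{i,j}(f_i')$ can be made concrete with a short sketch. The following NumPy code is a hedged illustration only: it assumes a 1x1 convolution as the primary convolution that produces the original maps $f_i'$, a per-channel 3x3 filter as each cheap linear operation $\Phi_{i,j}$, and concatenation of the original maps with the Ghost maps as the module output. None of these choices is stated in the text above, and all function names are hypothetical.

```python
import numpy as np


def pointwise_conv(x, weight):
    """1x1 convolution. x: (C_in, H, W), weight: (C_out, C_in)."""
    return np.einsum("oc,chw->ohw", weight, x)


def depthwise_conv3x3(x, kernels):
    """Per-channel 3x3 filtering with zero padding. kernels: (C, 3, 3)."""
    c, h, w = x.shape
    padded = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros((c, h, w), dtype=float)
    for dy in range(3):
        for dx in range(3):
            out += kernels[:, dy, dx][:, None, None] * padded[:, dy:dy + h, dx:dx + w]
    return out


def ghost_module(x, primary_w, cheap_kernels):
    """Sketch of a Ghost module.

    primary_w:     (m, C_in) weights of the assumed primary 1x1 convolution.
    cheap_kernels: (s, m, 3, 3); cheap_kernels[j][i] plays the role of Phi_{i,j}.
    Returns the m original maps followed by s * m Ghost maps (assumed layout).
    """
    original = pointwise_conv(x, primary_w)                          # f_i'
    ghosts = [depthwise_conv3x3(original, k) for k in cheap_kernels]  # f_ij
    return np.concatenate([original] + ghosts, axis=0)
```

Because each Ghost map comes from a small per-channel filter rather than a full convolution over all input channels, the number of parameters grows far more slowly than with an ordinary convolution of the same output width, which is the parameter saving the description refers to.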
As a further improvement, step (4) further comprises the following steps:
(4.1) each Ghost feature map output in step (3) is cut into a grid, where each feature map has size $H \times W$, with $H$ and $W$ being the length and width of the feature map, and each grid sub-feature map has size $s \times s$, with $s$ being the side length of a grid cell; $s$ divides both $H$ and $W$, so the feature map can be cut evenly into grid sub-feature maps;
(4.2) the gridded sub-feature maps are linearly mapped, each grid sub-feature map being mapped to one marker feature, and all of them together form the marker feature sequence $\{X_i\}$, $i = 1, 2, \dots, N$, where $N$ is the number of marker features.
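Steps (4.1) and (4.2) can be sketched as follows. This is a minimal NumPy illustration, assuming the feature map is stored as a (C, H, W) array and that the linear mapping is a single matrix `proj` of shape (C*s*s, D); the function name `grid_tokens`, the array layout, and the output width D are assumptions made for the example.

```python
import numpy as np


def grid_tokens(feat, s, proj):
    """Cut a (C, H, W) feature map into s x s grid cells and map each to a marker feature.

    proj: (C * s * s, D) linear mapping (assumed form).
    Returns an (N, D) marker feature sequence with N = (H / s) * (W / s).
    """
    c, h, w = feat.shape
    if h % s or w % s:
        raise ValueError("H and W must both be divisible by s")
    cells = feat.reshape(c, h // s, s, w // s, s)   # split both spatial axes into cells
    cells = cells.transpose(1, 3, 0, 2, 4)          # (H/s, W/s, C, s, s)
    flat = cells.reshape(-1, c * s * s)             # one row per grid cell, row-major order
    return flat @ proj
```

With $H = W = 4$ and $s = 2$, the map is cut into $N = 4$ cells, so the sequence contains four marker features, ordered row by row across the grid.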
As a further improvement, step (5) further comprises up-sampling aggregation and jump linking, including the following steps:
(5.1) layer normalization is first applied to the marker feature sequence: $X = \mathrm{LN}(\hat{X})$, where $X$ is the marker feature sequence after layer normalization and $\hat{X}$ is the marker feature sequence obtained in step (4);
(5.2) each marker feature in the sequence from the previous step is decomposed by frequency into a high-frequency marker feature and a low-frequency marker feature;
(5.3) the high-frequency marker features first undergo a max pooling operation, then pass through a linear layer, and then through a depthwise separable convolution layer; the pooling layer, the linear layer, and the depthwise separable convolution layer form the high-frequency feature processor, whose output is $Y_{high} = \mathrm{DConv}(\mathrm{FC}(\mathrm{MaxPool}(X_{high})))$, where $Y_{high}$ is the feature output of the high-frequency feature processor, DConv is depthwise separable convolution, FC is the linear fully connected layer, MaxPool is max pooling, and $X_{high}$ is the high-frequency marker feature obtained by the decomposition in step (5.2);
(5.4) the low-frequency marker features first undergo an average pooling operation, then pass through a self-attention layer MSA, and are then up-sampled to compensate for the dimension reduction caused by average pooling; the average pooling layer, the self-attention layer, and the up-sampling layer form the low-frequency feature processor, whose output is $Y_{low} = \mathrm{Upsample}(\mathrm{MSA}(\mathrm{AvePool}(X_{low})))$, where $Y_{low}$ is the feature output of the low-frequency feature processor, Upsample is up-sampling, MSA is the self-attention mechanism, AvePool is average pooling, and $X_{low}$ is the low-frequency marker feature obtained by the decomposition in step (5.2);
(5.5) high- and low-frequency feature fusion: the high-frequency feature output and the low-frequency feature output are fused to obtain the fused output $Y_o = \mathrm{Concat}(Y_{high}, Y_{low})$, where $Y_o$ is the fused feature output and Concat is the feature concatenation function;
(5.6) layer normalization is applied again to the fused feature output;
(5.7) after normalization, the result passes through a feed-forward network FFN to give the final output of the BiFTrans block module [equation missing in source], where $\hat{X}$ is the marker feature sequence of step (4), $Y_o$ is the feature output after high- and low-frequency fusion, FFN denotes feed-forward network processing, and LN denotes layer normalization.
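Steps (5.1) to (5.7) can be traced in a compact sketch. The code below is a hedged NumPy illustration, not the patented block, and it rests on several assumptions that the text does not settle: the frequency decomposition is taken to be a channel split, pooling and the depthwise filter act along the token axis, MSA is reduced to a single attention head, the FFN is a two-layer ReLU network, and the final output is assumed to be the residual form $\hat{X} + \mathrm{FFN}(\mathrm{LN}(Y_o))$ because the output equation is missing from the source. All function and parameter names are hypothetical.

```python
import numpy as np


def layer_norm(x, eps=1e-5):
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)


def max_pool3(x):
    """Max pooling, window 3, stride 1, along the token axis (edge padded)."""
    p = np.pad(x, ((1, 1), (0, 0)), mode="edge")
    return np.maximum(np.maximum(p[:-2], p[1:-1]), p[2:])


def depthwise_conv3(x, k):
    """Per-channel 3-tap filter along the token axis. x: (N, D), k: (3, D)."""
    p = np.pad(x, ((1, 1), (0, 0)))
    return k[0] * p[:-2] + k[1] * p[1:-1] + k[2] * p[2:]


def self_attention(x, wq, wk, wv):
    """Single-head scaled dot-product attention (simplification of MSA)."""
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)
    attn = np.exp(scores)
    attn /= attn.sum(axis=-1, keepdims=True)
    return attn @ v


def biftrans_block(x_hat, p, d_high):
    """x_hat: (N, D) marker feature sequence, N even; p: dict of weight arrays."""
    x = layer_norm(x_hat)                                       # (5.1)
    x_high, x_low = x[:, :d_high], x[:, d_high:]                # (5.2) assumed channel split
    y_high = depthwise_conv3(max_pool3(x_high) @ p["fc"], p["dconv"])   # (5.3)
    pooled = x_low.reshape(-1, 2, x_low.shape[1]).mean(axis=1)  # average pooling, stride 2
    attended = self_attention(pooled, p["wq"], p["wk"], p["wv"])
    y_low = np.repeat(attended, 2, axis=0)                      # (5.4) up-sample back to N
    y_o = np.concatenate([y_high, y_low], axis=1)               # (5.5)
    h = layer_norm(y_o)                                         # (5.6)
    ffn = np.maximum(h @ p["w1"], 0.0) @ p["w2"]                # (5.7) two-layer FFN
    return x_hat + ffn                                          # assumed residual connection
```

The high-frequency branch uses only local operations (pooling and a small filter), while the low-frequency branch applies attention to a pooled, shorter sequence, so the more expensive global operation runs on fewer tokens.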
Through the linear transformations of GhostNet, the invention generates the feature maps required for feature mining with simple linear operations, which reduces the number of network parameters and the complexity, preserves the efficiency of subsequent segmentation, and improves response speed. The dual-frequency mixer in the Bi-Frequency Transformer (BiFTrans block) module processes high-frequency and low-frequency features separately, capturing high-frequency details while also taking low-frequency global information into account, which ensures the speed and stability of image processing. By fusing the processed high-frequency and low-frequency features, the GhostNet-based segmentation network learns the combined high- and low-frequency information in garbage image data more effectively, which improves the precision and accuracy of segmenting garbage targets that are severely deformed and displaced and have blurred boundaries, and improves subsequent classification efficiency.
The present invention is not limited to the above embodiments; other garbage segmentation methods and systems based on a GinTrans network that are obtained using structures, devices, processes, or methods identical or similar to those of the above embodiments all fall within the scope of protection of the present invention.
Claims (4)
1. A garbage segmentation method based on a GinTrans network is characterized by comprising the following steps:
(1) Image segmentation system settings: the image segmentation system comprises an image acquisition module, an extraction module, a cutting identification recombination module, a remodelling fusion output module, a jump connection aggregation module and a segmentation output module, wherein the extraction module is provided with a GhostNet network, the remodelling fusion output module is provided with a Bi-Frequency Transformer (BiFTrans block) module, the jump connection aggregation module is provided with an up-sampling recovery module and a GhostNet down-sampling feature extraction module, and the segmentation output module is provided with a segmentation head module;
(2) Image acquisition input: the image acquisition module is connected to an image acquisition camera; the camera captures images of the garbage on the intelligent garbage collection and sorting line and transmits them to the image acquisition module, and the image acquired by the image acquisition module is transmitted, as image input data, to the extraction module for processing;
(3) Extracting a feature map: the extraction module processes the image input data, using the GhostNet network to extract low-level features of the image input data, and outputs an output feature map for the next step;
(4) Cutting identification recombination: the cutting identification recombination module performs a cutting operation on the extracted output feature map, which grids the output feature map; after gridding is completed, marker features are generated through linear mapping, the marker features form a marker feature sequence, and the marker feature sequence is taken as the input of the Bi-Frequency Transformer (BiFTrans block) module;
(5) Remodelling and fusing: the remodelling fusion output module reshapes the input marker feature sequence and uses the dual-frequency mixer provided in the Bi-Frequency Transformer (BiFTrans block) module to fuse the features from different frequencies, obtaining a fused feature output;
(6) Jump aggregation: the fused feature output is restored by the up-sampling recovery module to an image of the same size as the image input data; the GhostNet down-sampling feature extraction module extracts low-level features from the corresponding same-layer resolution feature map; and the image obtained by the up-sampling recovery module is skip-connected with the same-layer resolution feature map obtained by the GhostNet down-sampling feature extraction module, so that feature maps at different resolution levels are aggregated;
(7) Segmentation output: the segmentation head module of the segmentation output module processes and segments the feature map to obtain a target image.
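The jump aggregation of step (6) can be illustrated with a minimal NumPy sketch. The claim does not state how the up-sampled map and the same-resolution GhostNet map are combined, so nearest-neighbour up-sampling and channel concatenation are assumptions here, and the function names `upsample_nearest` and `skip_aggregate` are hypothetical.

```python
import numpy as np

def upsample_nearest(x, factor):
    """Nearest-neighbour up-sampling of a (C, H, W) feature map (assumed method)."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

def skip_aggregate(decoder_feat, encoder_feat):
    """Up-sample decoder features to the encoder resolution, then
    concatenate both maps along the channel axis (assumed aggregation)."""
    factor = encoder_feat.shape[1] // decoder_feat.shape[1]
    up = upsample_nearest(decoder_feat, factor)
    assert up.shape[1:] == encoder_feat.shape[1:], "resolutions must match"
    return np.concatenate([up, encoder_feat], axis=0)

decoder_feat = np.random.rand(8, 4, 4)   # fused output at low resolution
encoder_feat = np.random.rand(4, 8, 8)   # same-layer GhostNet feature map
fused = skip_aggregate(decoder_feat, encoder_feat)
print(fused.shape)  # (12, 8, 8)
```

Concatenation keeps both sources intact for the segmentation head; element-wise addition would be an equally plausible reading of the claim.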
2. The garbage segmentation method based on the GinTrans network according to claim 1, wherein the GhostNet network is provided with a Ghost module, and the Ghost module is designed as follows:
(3.1) the garbage image to be identified from step (2) is input as data, and original feature maps $f_i'$ with different resolutions are generated by convolution through the GhostNet network;
(3.2) the extraction module contains linear operations $\Phi$, which are used to generate Ghost feature maps as the output feature maps according to $f_{ij} = \Phi_{i,j}(f_i')$, wherein $f_i'$ is the $i$-th original feature map of the garbage image to be identified, and $\Phi_{i,j}$ is the $j$-th linear operation, which yields the $j$-th Ghost feature map $f_{ij}$.
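As an illustration of the formula in step (3.2), the following NumPy sketch applies several cheap linear operations to each original feature map. The claim does not specify the form of the linear operation, so a 3×3 single-channel convolution with zero padding is assumed; the names `linear_op3x3` and `ghost_features` are hypothetical.

```python
import numpy as np

def linear_op3x3(fmap, kernel):
    """Apply one 3x3 kernel to a single (H, W) map with zero padding.
    This stands in for one linear operation Phi_{i,j} (assumed form)."""
    h, w = fmap.shape
    padded = np.pad(fmap, 1)
    out = np.zeros((h, w))
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

def ghost_features(originals, kernels):
    """originals: (m, H, W) maps f_i'; kernels: (m, n_ops, 3, 3).
    Returns (m * n_ops, H, W), where row i * n_ops + j holds f_ij."""
    m, n_ops = kernels.shape[:2]
    ghosts = [linear_op3x3(originals[i], kernels[i, j])
              for i in range(m) for j in range(n_ops)]
    return np.stack(ghosts)

rng = np.random.default_rng(0)
originals = rng.random((2, 6, 6))
kernels = rng.random((2, 3, 3, 3))
kernels[:, 0] = 0.0
kernels[:, 0, 1, 1] = 1.0   # make the first operation the identity
ghost_maps = ghost_features(originals, kernels)
print(ghost_maps.shape)  # (6, 6, 6)
```

Each original map yields `n_ops` Ghost maps at the cost of a small per-channel filter rather than a full convolution.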
3. The GinTrans network-based garbage segmentation method according to claim 2, wherein the step (4) further comprises the steps of:
(4.1) carrying out gridding cutting on each Ghost feature map output in step (3), wherein the size of each feature map is H × W, H and W being respectively the length and the width of the feature map; the size of each grid sub-feature map is s × s, s being the side length of a grid; and s divides both H and W, so that the feature map can be cut exactly into grid sub-feature maps;
(4.2) performing linear mapping on the grid sub-feature maps, wherein each grid sub-feature map is mapped to one marker feature, and all marker features form a marker feature sequence $X_i$, $i = 1, 2, \ldots, N$, where $N$ is the number of marker features.
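Steps (4.1) and (4.2) can be sketched in NumPy as follows. The token dimension, the weight matrix `weight` and the bias `bias` are assumptions introduced for illustration, and the function name `grid_tokens` is hypothetical.

```python
import numpy as np

def grid_tokens(fmap, s, weight, bias):
    """Cut a (C, H, W) feature map into s x s grid cells and map each
    flattened cell to one marker feature with a linear layer."""
    c, h, w = fmap.shape
    assert h % s == 0 and w % s == 0, "s must divide both H and W"
    cells = fmap.reshape(c, h // s, s, w // s, s)
    # Reorder to (grid row, grid col, channel, row in cell, col in cell).
    cells = cells.transpose(1, 3, 0, 2, 4).reshape(-1, c * s * s)
    return cells @ weight + bias   # shape (N, D), N = (H / s) * (W / s)

fmap = np.arange(32, dtype=float).reshape(2, 4, 4)
weight = np.eye(8)     # identity mapping, so each token equals its flattened cell
bias = np.zeros(8)
tokens = grid_tokens(fmap, 2, weight, bias)
print(tokens.shape)  # (4, 8)
print(tokens[0])     # [ 0.  1.  4.  5. 16. 17. 20. 21.]
```

With H = W = 4 and s = 2 the map yields N = 4 marker features, one per grid cell, in row-major grid order.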
4. The GinTrans network-based garbage segmentation method according to claim 3, wherein the step (5) further comprises up-sampling aggregation and skip linking, which comprises the steps of:
(5.1) first performing layer normalization on the marker feature sequence, wherein X is the marker feature sequence after layer normalization, computed from the marker feature sequence obtained in step (4);
(5.2) decomposing each marking characteristic in the marking characteristic sequence in the previous step into a high-frequency marking characteristic and a low-frequency marking characteristic by frequency decomposition;
(5.3) for the high-frequency marker features, a max pooling operation is performed first, followed by a linear layer and then a depthwise separable convolution layer; the pooling layer, the linear layer and the depthwise separable convolution layer form a high-frequency feature processor, whose output is $Y_{high} = \mathrm{DConv}(\mathrm{FC}(\mathrm{MaxPool}(X_{high})))$, wherein $Y_{high}$ is the feature output of the high-frequency feature processor, DConv is depthwise separable convolution processing, FC is linear fully connected processing, MaxPool is max pooling processing, and $X_{high}$ is the high-frequency marker feature obtained by decomposing the marker features in step (5.2);
(5.4) for the low-frequency marker features, an average pooling operation is performed first, followed by a self-attention mechanism layer MSA and then up-sampling to compensate for the dimension reduction caused by the average pooling; the average pooling layer, the self-attention mechanism layer and the up-sampling layer form a low-frequency feature processor, whose output is $Y_{low} = \mathrm{Upsample}(\mathrm{MSA}(\mathrm{AvePool}(X_{low})))$, wherein $Y_{low}$ is the feature output of the low-frequency feature processor, Upsample is up-sampling processing, MSA is self-attention mechanism processing, AvePool is average pooling processing, and $X_{low}$ is the low-frequency marker feature obtained by decomposing the marker features in step (5.2);
(5.5) high-low frequency feature fusion, namely fusing the high-frequency feature output and the low-frequency feature output to obtain the fused output $Y_o = \mathrm{Concat}(Y_{high}, Y_{low})$, wherein $Y_o$ is the fused feature output and Concat is the feature concatenation function;
(5.6) performing layer normalization operation again on the fused characteristic output;
(5.7) after the normalization operation, the result is passed through a feed-forward network FFN to give the final output of the BiFTrans block module, wherein the output expression is defined in terms of the marker feature sequence of step (4) and $Y_o$, the feature output after high-low frequency fusion, and FFN is the feed-forward network processing applied after layer normalization.
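Steps (5.1) to (5.7) can be sketched end to end in NumPy. Several details are assumptions, not taken from the claims: the frequency decomposition is modelled as a channel split, MSA is reduced to a single attention head, pooling windows are 3×3 (max) and 2×2 (average), and the block stops at FFN(LN(Y_o)) because the final output expression is not legible in the text. All function names and the `p` weight dictionary are hypothetical.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    mu = x.mean(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(x.var(axis=-1, keepdims=True) + eps)

def max_pool3(x):
    """3x3 max pooling, stride 1, edge padding, on a (Gh, Gw, C) token grid."""
    gh, gw, _ = x.shape
    p = np.pad(x, ((1, 1), (1, 1), (0, 0)), mode="edge")
    return np.max([p[dy:dy + gh, dx:dx + gw]
                   for dy in range(3) for dx in range(3)], axis=0)

def depthwise_conv3(x, k):
    """Depthwise 3x3 convolution with zero padding; k has shape (3, 3, C)."""
    gh, gw, _ = x.shape
    p = np.pad(x, ((1, 1), (1, 1), (0, 0)))
    return sum(k[dy, dx] * p[dy:dy + gh, dx:dx + gw]
               for dy in range(3) for dx in range(3))

def avg_pool2(x):
    """2x2 average pooling, stride 2; Gh and Gw must be even."""
    gh, gw, c = x.shape
    return x.reshape(gh // 2, 2, gw // 2, 2, c).mean(axis=(1, 3))

def self_attention(tokens, wq, wk, wv):
    """Single-head scaled dot-product attention (simplification of MSA)."""
    q, k, v = tokens @ wq, tokens @ wk, tokens @ wv
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = np.exp(scores - scores.max(axis=-1, keepdims=True))
    return (scores / scores.sum(axis=-1, keepdims=True)) @ v

def bif_block(x, p):
    """x: (Gh, Gw, D) grid of marker features; returns (Gh, Gw, D)."""
    d = x.shape[-1]
    x = layer_norm(x)                                      # (5.1)
    x_high, x_low = x[..., :d // 2], x[..., d // 2:]       # (5.2) assumed split
    y_high = max_pool3(x_high) @ p["fc"]                   # (5.3) MaxPool, FC
    y_high = depthwise_conv3(y_high, p["dw"]) @ p["pw"]    # (5.3) DConv
    pooled = avg_pool2(x_low)                              # (5.4) AvePool
    att = self_attention(pooled.reshape(-1, pooled.shape[-1]),
                         p["wq"], p["wk"], p["wv"])        # (5.4) attention
    y_low = att.reshape(pooled.shape).repeat(2, axis=0).repeat(2, axis=1)
    y_o = np.concatenate([y_high, y_low], axis=-1)         # (5.5) Concat
    h = layer_norm(y_o)                                    # (5.6)
    return np.maximum(h @ p["w1"], 0.0) @ p["w2"]          # (5.7) FFN only

rng = np.random.default_rng(0)
params = {name: rng.normal(size=shape) for name, shape in {
    "fc": (4, 4), "dw": (3, 3, 4), "pw": (4, 4),
    "wq": (4, 4), "wk": (4, 4), "wv": (4, 4),
    "w1": (8, 32), "w2": (32, 8)}.items()}
block_out = bif_block(rng.normal(size=(4, 4, 8)), params)
print(block_out.shape)  # (4, 4, 8)
```

The high-frequency branch keeps the full grid resolution, while the low-frequency branch runs attention on a pooled grid with a quarter of the tokens, which is where the cost saving of the two-branch design comes from.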
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210901322.8A CN115147703B (en) | 2022-07-28 | 2022-07-28 | Garbage segmentation method and system based on GinTrans network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115147703A CN115147703A (en) | 2022-10-04 |
CN115147703B (en) | 2023-11-03 |
Family
ID=83413245
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210901322.8A Active CN115147703B (en) | 2022-07-28 | 2022-07-28 | Garbage segmentation method and system based on GinTrans network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115147703B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115830375B (en) * | 2022-11-25 | 2024-09-24 | 中国科学院自动化研究所 | Point cloud classification method and device |
Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6381363B1 (en) * | 1999-03-15 | 2002-04-30 | Grass Valley (U.S.), Inc. | Histogram-based segmentation of images and video via color moments |
CN112017191A (en) * | 2020-08-12 | 2020-12-01 | 西北大学 | Method for establishing and segmenting liver pathology image segmentation model based on attention mechanism |
CN112541503A (en) * | 2020-12-11 | 2021-03-23 | 南京邮电大学 | Real-time semantic segmentation method based on context attention mechanism and information fusion |
WO2021115483A1 (en) * | 2019-12-13 | 2021-06-17 | 华为技术有限公司 | Image processing method and related apparatus |
CN113051430A (en) * | 2021-03-26 | 2021-06-29 | 北京达佳互联信息技术有限公司 | Model training method, device, electronic equipment, medium and product |
CN113102266A (en) * | 2021-03-16 | 2021-07-13 | 四川九通智路科技有限公司 | Multi-dimensional garbage recognition and classification system |
CN113159051A (en) * | 2021-04-27 | 2021-07-23 | 长春理工大学 | Remote sensing image lightweight semantic segmentation method based on edge decoupling |
CN113591939A (en) * | 2021-07-09 | 2021-11-02 | 上海智臻智能网络科技股份有限公司 | Layer classification method and device |
CN113744292A (en) * | 2021-09-16 | 2021-12-03 | 安徽世绿环保科技有限公司 | Garbage classification station garbage throwing scanning system |
US11222217B1 (en) * | 2020-08-14 | 2022-01-11 | Tsinghua University | Detection method using fusion network based on attention mechanism, and terminal device |
CN114708464A (en) * | 2022-06-01 | 2022-07-05 | 广东艺林绿化工程有限公司 | Municipal sanitation cleaning garbage truck cleaning method based on road garbage classification |
WO2022141723A1 (en) * | 2020-12-29 | 2022-07-07 | 江苏大学 | Image classification and segmentation apparatus and method based on feature guided network, and device and medium |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111259809B (en) * | 2020-01-17 | 2021-08-17 | 五邑大学 | Unmanned aerial vehicle coastline floating garbage inspection system based on DANet |
Non-Patent Citations (2)
Title |
---|
Deformable transformers for end-to-end object detection; Zhu X et al.; arXiv; full text * |
Recognition of construction waste dumping sites based on DeeplabV3+; 刘小玉 et al.; 《测绘通报》 (No. 04); full text * |
Also Published As
Publication number | Publication date |
---|---|
CN115147703A (en) | 2022-10-04 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN103761531B (en) | The sparse coding license plate character recognition method of Shape-based interpolation contour feature | |
CN101763440B (en) | Method for filtering searched images | |
CN109410184B (en) | Live broadcast pornographic image detection method based on dense confrontation network semi-supervised learning | |
CN107239751A (en) | High Resolution SAR image classification method based on the full convolutional network of non-down sampling contourlet | |
CN106203492A (en) | The system and method that a kind of image latent writing is analyzed | |
CN112367273B (en) | Flow classification method and device of deep neural network model based on knowledge distillation | |
CN112364944B (en) | Deep learning-based household garbage classification method | |
CN111597983B (en) | Method for realizing identification of generated false face image based on deep convolutional neural network | |
CN111814750A (en) | Intelligent garbage classification method and system based on deep learning target detection and image recognition | |
CN111723657A (en) | River foreign matter detection method and device based on YOLOv3 and self-optimization | |
CN111461000A (en) | Intelligent office garbage classification method based on CNN and wavelet analysis | |
CN104463200A (en) | Satellite remote sensing image sorting method based on rule mining | |
CN115147703B (en) | Garbage segmentation method and system based on GinTrans network | |
CN109003275A (en) | The dividing method of weld defect image | |
Pradeep et al. | Diagonal feature extraction based handwritten character system using neural network | |
CN111462090A (en) | Multi-scale image target detection method | |
Meng et al. | X-DenseNet: deep learning for garbage classification based on visual images | |
CN104834891A (en) | Method and system for filtering Chinese character image type spam | |
Kumari et al. | YOLOv8 Based Deep Learning Method for Potholes Detection | |
CN109657082A (en) | Remote sensing images multi-tag search method and system based on full convolutional neural networks | |
CN105469095A (en) | Vehicle model identification method based on pattern set histograms of vehicle model images | |
CN101527001B (en) | Secret information detecting system based on expert system method | |
CN117951576A (en) | Power system malicious flow detection method based on transform time sequence multi-mode characteristics | |
CN110071884A (en) | A kind of Modulation Recognition of Communication Signal method based on improvement entropy cloud feature | |
Rashida et al. | Implementation of faster region-based convolutional neural network for waste type classification |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||