
CN108805874B - Multispectral image semantic cutting method based on convolutional neural network - Google Patents


Info

Publication number
CN108805874B
CN108805874B (application CN201810595762.9A)
Authority
CN
China
Prior art keywords
data
different
convolution
neural network
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810595762.9A
Other languages
Chinese (zh)
Other versions
CN108805874A (en)
Inventor
李含伦
戴玉成
张小博
张晓灿
唐文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Third Research Institute Of China Electronics Technology Group Corp
Original Assignee
Third Research Institute Of China Electronics Technology Group Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Third Research Institute Of China Electronics Technology Group Corp
Priority to CN201810595762.9A
Publication of CN108805874A
Application granted
Publication of CN108805874B
Legal status: Active
Anticipated expiration

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10032Satellite or aerial image; Remote sensing
    • G06T2207/10036Multispectral image; Hyperspectral image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a multispectral image semantic segmentation method based on a convolutional neural network. Through a network with multi-resolution input and channel-independent convolution, the invention overcomes the limitation that the standard U-NET network accepts only a single fixed-scale RGB or grayscale image, effectively improves the efficiency of multispectral image semantic segmentation, and preserves segmentation precision.

Description

Multispectral image semantic cutting method based on convolutional neural network
Technical Field
The invention relates to a multispectral image semantic segmentation method based on a convolutional neural network.
Background
Currently, state-of-the-art semantic segmentation frameworks for RGB images commonly employ end-to-end deep convolutional neural networks (DCNNs). The usual practice is to build on pre-trained classification models, most commonly VGG, ResNet and the like. A DCNN for semantic segmentation usually consists of two parts: a front end, which is a well-established classification DCNN, and a back end, which maps the feature maps to per-pixel labels. To economize on training samples, the front end directly adopts the pre-trained model parameters, and only the back-end parameters are fine-tuned.
A representative image semantic segmentation network is the fully convolutional network (FCN), whose initial version is based on VGG-16. The second half of VGG-16, designed for classification, is fully connected, and the fully connected operations discard the spatial information of the feature maps, making them unusable for semantic segmentation. The FCN therefore replaces the fully connected portion of VGG-16 with convolutions, recovers a per-pixel feature representation using upsampling and deconvolution, and then computes a class label for each pixel. The main disadvantage of this network is that the feature map, shrunk by five 2× poolings (a factor of 32), must be restored by upsampling; a large amount of spatial information lost during downsampling cannot be recovered during upsampling, so the segmentation result is very coarse. A common refinement is to post-process the FCN output with a conditional random field (CRF). This alleviates the coarseness of the FCN result to some extent, but further increases memory consumption and computation time.
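The coarseness problem can be seen in a few lines of NumPy. The sketch below is illustrative only, not the FCN implementation: it shrinks a feature map through five 2× max poolings, as in VGG-16, then restores its size with five 2× nearest-neighbour upsamplings standing in for learned deconvolution. The size comes back; the content does not.

```python
import numpy as np

def downsample2x(x):
    """2x2 max pooling, as in each of VGG-16's five pooling stages."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

def upsample2x(x):
    """Nearest-neighbour 2x upsampling (a crude stand-in for deconvolution)."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

rng = np.random.default_rng(0)
fmap = rng.random((32, 32))

# Five 2x poolings shrink the map by a factor of 2**5 = 32 ...
small = fmap
for _ in range(5):
    small = downsample2x(small)
assert small.shape == (1, 1)

# ... and five 2x upsamplings restore the size but not the content.
restored = small
for _ in range(5):
    restored = upsample2x(restored)
assert restored.shape == fmap.shape
err = np.abs(restored - fmap).mean()
assert err > 0.0  # the lost spatial detail is not recoverable -> coarse masks
```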
DeepLab is another influential image semantic segmentation network; it is built on a deep residual network (ResNet) DCNN. DeepLab addresses the downsampling problem by replacing conventional convolution kernels with atrous (dilated) convolution kernels. An atrous kernel inserts zero-weight taps at fixed intervals between the elements of a conventional kernel, so the receptive field of the kernel grows without any increase in trainable parameters; an image processed by a stack of such convolutions can therefore retain its original size. However, researchers have found that using atrous kernels throughout the entire network is very inefficient, so conventional and atrous kernels must be used together.
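The zero-insertion construction can be sketched directly. In the snippet below (a minimal illustration; `dilate_kernel` is a hypothetical helper name, not DeepLab code), a 3×3 kernel dilated at rate 2 covers a 5×5 receptive field while keeping its 9 trainable weights.

```python
import numpy as np

def dilate_kernel(k, rate):
    """Insert rate-1 zero-weight taps between kernel elements (atrous convolution)."""
    kh, kw = k.shape
    out = np.zeros(((kh - 1) * rate + 1, (kw - 1) * rate + 1), dtype=k.dtype)
    out[::rate, ::rate] = k  # original weights land on a strided grid
    return out

k3 = np.ones((3, 3))
k_dil = dilate_kernel(k3, rate=2)
assert k_dil.shape == (5, 5)      # receptive field grows from 3x3 to 5x5 ...
assert (k_dil != 0).sum() == 9    # ... with the same 9 trainable weights
```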
U-NET is a neural network originally designed for biomedical image segmentation. Because its downsampling and upsampling halves are essentially symmetric, the authors drew the architecture as a figure shaped like the letter U, hence the name. Broadly, U-NET is a type of FCN. Since the network won the 2015 ISBI cell tracking challenge and was widely reported, it has been highly influential in image semantic segmentation research, especially biomedical image segmentation. The U-NET design is ingenious: the left half is a resolution-contracting path, the right half a resolution-expanding path, and the two are mirror-symmetric. In the expanding path, the feature map produced by each upsampling step is fused by concatenation with the corresponding feature map from the contracting path, so the convolution kernels at each resolution of the expanding path see both the upsampled lower-level features and the same-resolution features from the left side; this minimizes the spatial information lost to feature-map rescaling.
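The skip fusion of the expanding path amounts to an upsample-then-concatenate step. A minimal NumPy sketch, with illustrative channel counts and sizes (channels-first layout):

```python
import numpy as np

def upsample2x(x):
    """Nearest-neighbour 2x upsampling of a (C, H, W) feature map."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

rng = np.random.default_rng(0)
decoder_low = rng.random((128, 16, 16))   # decoder features one level down
encoder_skip = rng.random((64, 32, 32))   # symmetric encoder features

up = upsample2x(decoder_low)              # -> (128, 32, 32)
fused = np.concatenate([up, encoder_skip], axis=0)  # channel concatenation
assert fused.shape == (192, 32, 32)
```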
These image semantic segmentation results were designed for ordinary digital images or medical scans, and their basic assumption is that the images used for training and segmentation share a uniform specification. For example, they assume all training samples have the same number of channels (three-channel RGB or single-channel grayscale images). This makes them difficult to apply to multispectral imagery. First, multispectral images typically have many channels carrying very different amounts of information, yet in conventional convolution each kernel convolves all feature maps and accumulates the results, implicitly treating every channel as equally informative for classification. Second, a single satellite is often equipped with several types of multispectral sensors. For example, the WorldView-3 commercial remote sensing satellite simultaneously acquires a panchromatic image, multispectral data in the visible/near-infrared range beginning near 400 nm (coastal, blue, green, yellow, red, red edge, near-IR1 and near-IR2 bands), and shortwave-infrared data; the three products have 1, 8 and 8 bands respectively, with nadir resolutions of 0.31 m, 1.24 m and 7.5 m. Data collected by different sensor types therefore differ not only in band count and band type but also in resolution. If all low-resolution images are forcibly interpolated up to match the high-resolution images, some convolution operations on the low-resolution portions are wasted, which not only costs a large amount of computation time but may also interfere with the segmentation result. If instead the high-resolution images are downsampled to match the low-resolution scale, a large amount of their spatial information is lost.
Disclosure of Invention
The invention aims to provide a multispectral image semantic segmentation method based on a convolutional neural network that improves the efficiency of multispectral image semantic segmentation while preserving segmentation precision.
The technical scheme for realizing the purpose of the invention is as follows:
a multispectral image semantic cutting method based on a convolutional neural network is characterized by comprising the following steps: and independently convolving each data channel of the multispectral image by using a convolutional neural network, and then fusing the feature maps after independent convolution of each data channel.
Further, when independently convolving each data channel of the multispectral image, convolution kernels of different sizes and numbers are selected for different spectral bands.
Further, when independently convolving each data channel of the multispectral image, different numbers of convolution layers are selected for different spectral bands.
Further, data of different resolutions are input to the convolution layers of the corresponding scale levels.
Further, when data of different resolutions are input to the convolution layers of the corresponding scale levels, the input data are fused with the pooled feature map at that convolution layer.
Further, data of different resolutions are normalized to the highest resolution among them, concatenated, and input to the network in a single pass; the different categories of data are then separated inside the network and each processed back to its required size; and the data of different resolutions are input to the convolution layers of the corresponding scale levels.
The invention has the following beneficial effects:
aiming at the condition that the data difference between different channels of multispectral data is large, the method uses a convolution neural network to independently convolve each data channel of the multispectral image, and then fuses the feature maps after independent convolution of each data channel. When independently convolving each data channel of the multispectral image, different sizes and different numbers of convolution kernels can be selected according to different wave bands, and different numbers of convolution layers can be selected according to different wave bands.
For the case where multispectral data differ greatly in resolution, the U-NET network is transformed into a convolutional neural network supporting multi-resolution input: data of each resolution are input to the convolution layers of the corresponding scale level and fused with the feature map obtained by pooling the output of the preceding layer, where the preceding layer is the layer immediately above the convolution layer that receives the input. Through multi-resolution input and channel-independent convolution, the invention overcomes the limitation that the standard U-NET network accepts only a single fixed-scale RGB or grayscale image, effectively improves the efficiency of multispectral image semantic segmentation, and preserves segmentation precision.
For the case where a network model on most deep learning development platforms accepts only a single input, the invention normalizes data of different resolutions to the highest resolution among them, concatenates them, and feeds them to the network in one pass; the different categories of data are then separated inside the network and each processed back to its appropriate size, which effectively guarantees the reliability of multispectral image semantic segmentation.
Drawings
FIG. 1 is a schematic diagram of the independent convolution of multiple image channels of the present invention;
fig. 2 is a schematic diagram of the channel independent convolution and multi-scale input U-NET deep neural network of the present invention.
Detailed Description
Embodiment one:
For a multispectral image, the bands are first separated by wavelength, and an independent convolution operation is then performed on each band: a convolutional neural network convolves each data channel of the multispectral image independently, and the independently convolved feature maps of the channels are then fused (concatenated). When convolving each channel independently, convolution kernels of different sizes and numbers are selected for different spectral bands, as are different numbers of convolution layers. In this implementation, the convolutional neural network is a U-NET neural network.
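As a hedged illustration of this embodiment (shapes, band names and per-band kernel plans below are invented for the example, and a real network would learn the kernels rather than draw them at random), the NumPy sketch convolves each band with its own set of kernels — different sizes and counts per band — and fuses the results by channel concatenation:

```python
import numpy as np

def conv2d_same(img, k):
    """Naive single-channel 'same' convolution with zero padding."""
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(img, ((ph, ph), (pw, pw)))
    h, w = img.shape
    out = np.empty((h, w), dtype=float)
    for i in range(h):
        for j in range(w):
            out[i, j] = (padded[i:i + kh, j:j + kw] * k).sum()
    return out

rng = np.random.default_rng(1)
bands = {"red": rng.random((16, 16)), "nir1": rng.random((16, 16))}

# Hypothetical per-band plans: (kernel_size, n_kernels) differ per band.
plans = {"red": (3, 4), "nir1": (5, 2)}

feature_maps = []
for name, img in bands.items():
    ksize, nk = plans[name]
    for _ in range(nk):  # these kernels see this band only: channel-independent
        k = rng.standard_normal((ksize, ksize))
        feature_maps.append(conv2d_same(img, k))

fused = np.stack(feature_maps)  # fusion of the per-channel feature maps
assert fused.shape == (6, 16, 16)  # 4 red maps + 2 nir1 maps
```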
Embodiment two:
when the multispectral image has multiple resolutions, on the basis of adopting the multi-channel independent convolution in the first embodiment, the multi-channel independent convolution and multi-resolution input network is adopted in the second embodiment. As shown in fig. 2, the U-NET network is modified to support a convolutional neural network with multiple resolution inputs. Similar to the traditional U-NET network, the network of the invention is composed of a scale contraction part and a scale expansion part, wherein the scale contraction part is composed of a classical convolution network, the image size is reduced along with the increase of the convolution pooling times along with the increase of the convolution hierarchy, and the number of convolution kernels is increased along with the increase of the pooling times. The scale expansion part is the same as that of the U-NET network, the scale is increased by two times in each up-sampling step of the scale expansion part, and the number of convolution kernels is reduced by half. After each up-sampling, the sampled feature map and the feature map with the same scale as the symmetric part (the contracted part) need to be subjected to merging operation (summation). Different from the traditional U-NET network, the data with different resolutions are input into the convolutional layers with corresponding different scale levels, the input data is fused with the characteristic diagram of the convolutional layer which is formed by pooling the previous layer, and the previous layer is the previous layer of the convolutional layer corresponding to the input data. In fig. 
2, the open arrows indicate channel independent convolution, the right thin arrows indicate channel duplication, the juxtaposition of solid and open rectangles indicates a fusion operation, the downward wide arrows indicate a lower pooling operation, the upward wide arrows indicate an upper pooling operation, and the rightward wide arrows indicate a classical convolution operation.
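The multi-resolution injection can be sketched as follows (a NumPy toy with illustrative shapes, not the patented network): the high-resolution band enters at the top level; after one pooling, the resulting feature map has the same scale as the lower-resolution bands, which are fused in at that level.

```python
import numpy as np

def maxpool2x(x):
    """2x2 max pooling of a (C, H, W) feature map."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).max(axis=(2, 4))

rng = np.random.default_rng(2)
pan = rng.random((1, 64, 64))   # high-resolution band, fed to the top level
ms = rng.random((8, 32, 32))    # lower-resolution bands, fed one level down

feat0 = pan                     # top-level features (convolutions omitted)
feat1 = maxpool2x(feat0)        # pooled features now at the 32x32 scale
fused = np.concatenate([feat1, ms], axis=0)  # inject ms where scales match
assert fused.shape == (9, 32, 32)
```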
For the case where a network model on most deep learning development platforms accepts only a single input, the invention normalizes data of different resolutions to the highest resolution among them, concatenates them, and feeds them to the network in one pass; the different categories of data are then separated inside the network and each processed back to its appropriate size, which effectively guarantees the reliability of multispectral image semantic segmentation.
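A minimal sketch of this single-input workaround, with invented shapes: everything is upsampled to the highest resolution and stacked for the one-shot input, then split inside the "network" and pooled back to the native scales.

```python
import numpy as np

def upsample_to(x, h, w):
    """Nearest-neighbour upsampling of a (C, H, W) map to (C, h, w)."""
    c, h0, w0 = x.shape
    return x.repeat(h // h0, axis=1).repeat(w // w0, axis=2)

def avgpool_to(x, h, w):
    """Average pooling of a (C, H, W) map back down to (C, h, w)."""
    c, h0, w0 = x.shape
    return x.reshape(c, h, h0 // h, w, w0 // w).mean(axis=(2, 4))

rng = np.random.default_rng(3)
pan = rng.random((1, 64, 64))   # highest-resolution band
ms = rng.random((8, 32, 32))    # coarser bands

# One-shot input: normalize everything to the highest resolution and stack.
packed = np.concatenate([pan, upsample_to(ms, 64, 64)], axis=0)
assert packed.shape == (9, 64, 64)

# Inside the network: split the categories, pool each back to its own scale.
pan_in, ms_in = packed[:1], avgpool_to(packed[1:], 32, 32)
assert pan_in.shape == (1, 64, 64) and ms_in.shape == (8, 32, 32)
```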

Claims (2)

1. A multispectral image semantic segmentation method based on a convolutional neural network, characterized in that: each data channel of the multispectral image is convolved independently by a convolutional neural network, and the independently convolved feature maps of the channels are then fused;
when independently convolving each data channel of the multispectral image, convolution kernels of different sizes and numbers are selected for different spectral bands;
when independently convolving each data channel of the multispectral image, different numbers of convolution layers are selected for different spectral bands;
the convolutional neural network is a U-NET neural network;
the U-NET neural network supports input of data at multiple resolutions, with data of each resolution input to the convolution layers of the corresponding scale level;
when data of different resolutions are input to the convolution layers of the corresponding scale levels, the input data are fused with the feature map obtained by pooling the preceding layer, where the preceding layer is the layer immediately above the convolution layer that receives the input data.
2. The convolutional neural network-based multispectral image semantic segmentation method of claim 1, characterized in that: data of different resolutions are normalized to the highest resolution among them, concatenated, and input to the network in a single pass; the different categories of data are then separated inside the network and each processed back to its required size; and the data of different resolutions are input to the convolution layers of the corresponding scale levels.
CN201810595762.9A 2018-06-11 2018-06-11 Multispectral image semantic cutting method based on convolutional neural network Active CN108805874B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810595762.9A CN108805874B (en) 2018-06-11 2018-06-11 Multispectral image semantic cutting method based on convolutional neural network


Publications (2)

Publication Number Publication Date
CN108805874A CN108805874A (en) 2018-11-13
CN108805874B true CN108805874B (en) 2022-04-22

Family

ID=64088190

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810595762.9A Active CN108805874B (en) 2018-06-11 2018-06-11 Multispectral image semantic cutting method based on convolutional neural network

Country Status (1)

Country Link
CN (1) CN108805874B (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113168684B (en) * 2018-11-26 2024-04-05 Oppo广东移动通信有限公司 Method, system and computer readable medium for improving quality of low brightness images
CN111382761B (en) * 2018-12-28 2023-04-07 展讯通信(天津)有限公司 CNN-based detector, image detection method and terminal
CN110009637B (en) * 2019-04-09 2021-04-16 北京化工大学 Remote sensing image segmentation network based on tree structure
CN110163852B (en) * 2019-05-13 2021-10-15 北京科技大学 Conveying belt real-time deviation detection method based on lightweight convolutional neural network
CN110969182A (en) * 2019-05-17 2020-04-07 丰疆智能科技股份有限公司 Convolutional neural network construction method and system based on farmland image
CN110852385B (en) * 2019-11-12 2022-07-12 北京百度网讯科技有限公司 Image processing method, device, equipment and storage medium
CN113034535A (en) * 2019-12-24 2021-06-25 无锡祥生医疗科技股份有限公司 Fetal head segmentation method, fetal head segmentation device and storage medium
CN111428781A (en) * 2020-03-20 2020-07-17 中国科学院深圳先进技术研究院 Remote sensing image ground object classification method and system
CN112184554B (en) * 2020-10-13 2022-08-23 重庆邮电大学 Remote sensing image fusion method based on residual mixed expansion convolution
CN112633171B (en) * 2020-12-23 2024-08-02 北京恒达时讯科技股份有限公司 Sea ice identification method and system based on multisource optical remote sensing image
CN113159038B (en) * 2020-12-30 2022-05-27 太原理工大学 Coal rock segmentation method based on multi-mode fusion

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101916435A (en) * 2010-08-30 2010-12-15 武汉大学 Method for fusing multi-scale spectrum projection remote sensing images
US9760980B1 (en) * 2015-03-25 2017-09-12 Amazon Technologies, Inc. Correcting moiré pattern effects
CN107993229A (en) * 2017-12-15 2018-05-04 西安中科微光影像技术有限公司 A kind of tissue classification procedure and device based on cardiovascular IVOCT images


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
AdapNet: Adaptive semantic segmentation in adverse environmental conditions; Abhinav Valada et al.; 2017 IEEE International Conference on Robotics and Automation (ICRA); 2017-07-24; full text *
Analysis of unstructured road surfaces based on multi-channel convolutional neural networks; Cui Wei et al.; Computer Applications and Software (《计算机应用与软件》); 2016-01-31; vol. 33, no. 1, section 2.2 *


Similar Documents

Publication Publication Date Title
CN108805874B (en) Multispectral image semantic cutting method based on convolutional neural network
WO2021184891A1 (en) Remotely-sensed image-based terrain classification method, and system
CN108717569B (en) Expansion full-convolution neural network device and construction method thereof
CN110232394B (en) Multi-scale image semantic segmentation method
CN108596330B (en) Parallel characteristic full-convolution neural network device and construction method thereof
US11151403B2 (en) Method and apparatus for segmenting sky area, and convolutional neural network
CN113222835B (en) Remote sensing full-color and multi-spectral image distributed fusion method based on residual error network
CN111259905B (en) Feature fusion remote sensing image semantic segmentation method based on downsampling
CN109102469B (en) Remote sensing image panchromatic sharpening method based on convolutional neural network
WO2020056791A1 (en) Method and apparatus for super-resolution reconstruction of multi-scale dilated convolution neural network
CN111768432A (en) Moving target segmentation method and system based on twin deep neural network
CN112418176A (en) Remote sensing image semantic segmentation method based on pyramid pooling multilevel feature fusion network
CN110428387A (en) EO-1 hyperion and panchromatic image fusion method based on deep learning and matrix decomposition
CN108038519B (en) Cervical image processing method and device based on dense feature pyramid network
CN110866879B (en) Image rain removing method based on multi-density rain print perception
CN112001928B (en) Retina blood vessel segmentation method and system
CN111401455B (en) Remote sensing image deep learning classification method and system based on Capsules-Unet model
CN112966580B (en) Remote sensing image green tide information extraction method based on deep learning and super-resolution
CN106910202B (en) Image segmentation method and system for ground object of remote sensing image
CN111008936A (en) Multispectral image panchromatic sharpening method
CN110717921B (en) Full convolution neural network semantic segmentation method of improved coding and decoding structure
CN116309070A (en) Super-resolution reconstruction method and device for hyperspectral remote sensing image and computer equipment
CN110930409A (en) Salt body semantic segmentation method based on deep learning and semantic segmentation model
CN110348411A (en) A kind of image processing method, device and equipment
CN111951164A (en) Image super-resolution reconstruction network structure and image reconstruction effect analysis method

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant