
CN114187590A - Method and system for identifying target fruits under homochromatic system background - Google Patents


Info

Publication number
CN114187590A
CN114187590A (application CN202111228465.9A)
Authority
CN
China
Prior art keywords
target
model
processing
orchard environment
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111228465.9A
Other languages
Chinese (zh)
Inventor
贾伟宽
孟虎
卢宇琪
贾艺鸣
牛屹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Normal University
Original Assignee
Shandong Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Normal University filed Critical Shandong Normal University
Priority to CN202111228465.9A priority Critical patent/CN114187590A/en
Publication of CN114187590A publication Critical patent/CN114187590A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/047Probabilistic or stochastic networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a method and a system for identifying target fruits under a same-color-system background, belonging to the technical field of computer vision. An orchard environment image to be identified is acquired and processed with a pre-trained recognition model to obtain a target fruit recognition result; during this processing, a spatial positional encoding is added to the extracted image features to compensate for the lost spatial information. By using a Sparse-Transformer encoder-decoder model, the invention addresses the poor fruit-detection efficiency and the insensitivity to small targets of fruit-picking-robot vision systems; it achieves high precision and high speed, better meeting agricultural needs such as fruit-picking robots and yield prediction. A small-target enhancement technique expands the sample space, so the method adapts well to small-sample data sets and has strong generalization capability.

Description

Method and system for identifying target fruits under homochromatic system background
Technical Field
The invention relates to the technical field of computer vision, and in particular to a Sparse-Transformer-based, small-target-sensitive method and system for identifying target fruits under a same-color-system background.
Background
In agricultural production, machine vision is widely applied to fields such as fruit and vegetable yield prediction, automatic picking, and pest and disease identification, and the precision and efficiency of target detection have become the key constraints on the performance of the operating equipment. To date, success has been achieved in detecting static target fruits, dynamic target fruits, and occluded or overlapping target fruits.
Most existing detection models are based either on traditional machine learning or on emerging deep network models. Machine-learning-based detection methods rely mainly on target-fruit features such as color and shape; they work reasonably well for targets that differ strongly from the background, but for green target fruits, whose color is similar to the background, detection is comparatively poor. Deep-learning-based detection methods depend heavily on the number of training samples, and in real orchard environments some orchards cannot supply enough samples to train an accurate detection model. In a complex orchard environment, changing fruit postures, green target fruits, and insufficient samples caused by hard-to-collect environmental data all pose great challenges to accurate target detection.
Machine-learning-based identification methods usually involve preprocessing, feature selection, and similar operations, cannot realize an end-to-end detection process, and are easily disturbed by interference in the natural environment. Deep-learning-based identification methods markedly improve precision and do realize end-to-end detection, but convolution operations and the model's dependence on anchor boxes consume large amounts of computation and storage, so the recognition speed cannot meet real-time requirements.
Disclosure of Invention
The invention aims to provide a method and a system for identifying target fruits under a same-color-system background which, on the premise of guaranteed precision, exploit the small-target sensitivity and parallel-computing characteristics of the Sparse-Transformer to improve speed, reduce training time, and optimize small-target detection precision and speed, better meeting agricultural needs such as fruit-picking robots and yield prediction, so as to solve at least one technical problem in the background art.
In order to achieve the purpose, the invention adopts the following technical scheme:
in one aspect, the invention provides a method for identifying target fruits in the background of the same color system, which comprises the following steps:
acquiring an orchard environment image to be identified;
processing the orchard environment image to be recognized with a pre-trained recognition model to obtain a target fruit recognition result; the pre-trained recognition model is obtained by training on a training set comprising a plurality of orchard environment images and labels annotating the target fruits in those images;
wherein,
when the orchard environment image to be recognized is processed with the pre-trained recognition model, a spatial positional encoding is added to the extracted image features to compensate for the lost spatial information.
Preferably, training the recognition model comprises: processing the training set with a deep convolutional neural network to extract features; constructing a Sparse-Transformer model to process those features; passing the result through a feedforward neural network and outputting the final detection result; then inputting test samples, evaluating the detection results with the evaluation indices, adjusting the model parameters according to the evaluation, and repeatedly training the improved model until the optimal network model is obtained.
Preferably, a single-lens reflex camera is used to collect green target-fruit images under different illumination, time periods, and angles; a small-target enhancement technique copies target fruits smaller than a preset pixel size within each image to expand the samples, which are then classified and labeled to construct a data set; the expanded data set is divided into a training set, a validation set, and a test set.
Preferably, constructing the encoder of the Sparse-Transformer model comprises: replacing the attention module that processes the feature map in the Transformer mechanism with a dilated (atrous) self-attention module; processing the image features and reducing their dimension, adding a spatial positional encoding to compensate for the lost spatial information, feeding the result through the dilated self-attention mechanism with a residual module and a regularization layer, and outputting the encoder result through a feedforward neural network with a residual module and a regularization layer.
Preferably, constructing the decoder of the Sparse-Transformer model comprises: feeding the parameters learned by the encoder through a dilated self-attention mechanism with a residual module and a regularization layer, feeding that result through a multi-head self-attention mechanism with a residual module and a regularization layer, and obtaining the detection result through a feedforward neural network with a residual module and a regularization layer.
Preferably, the feedforward neural network computes its result through a multi-layer perceptron with a ReLU activation function and a hidden dimension, followed by a linear projection layer.
Preferably, a final loss function is constructed from the Hungarian loss function and the SoftMax loss function, the network model is optimized, and model training is carried out.
In a second aspect, the present invention provides a system for identifying a target fruit in a same color family background, comprising:
the acquiring module is used for acquiring an orchard environment image to be identified;
the recognition module is used for processing the orchard environment image to be recognized with a pre-trained recognition model to obtain a target fruit recognition result; the pre-trained recognition model is obtained by training on a training set comprising a plurality of orchard environment images and labels annotating the target fruits in those images;
wherein,
when the orchard environment image to be recognized is processed with the pre-trained recognition model, a spatial positional encoding is added to the extracted image features to compensate for the lost spatial information.
In a third aspect, the present invention provides a non-transitory computer readable storage medium for storing computer instructions which, when executed by a processor, implement the method for identifying a target fruit in a homochromatic context as described above.
In a fourth aspect, the present invention provides an electronic device comprising: a processor, a memory, and a computer program; wherein the processor is connected with the memory, the computer program is stored in the memory, and when the electronic device runs, the processor executes the computer program stored in the memory, so as to make the electronic device execute the instruction for realizing the target fruit identification method in the same color system background.
The invention has the following beneficial effects: the Sparse-Transformer encoder-decoder model solves the poor fruit-detection efficiency and small-target insensitivity of fruit-picking-robot vision systems; the method is precise and fast, better meeting agricultural needs such as fruit-picking robots and yield prediction; and the small-target enhancement technique expands the sample space, so the method adapts well to small-sample data sets and has strong generalization capability.
Additional aspects and advantages of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
Fig. 1 is a flowchart illustrating training of a recognition model in a target fruit recognition method in the same color system background according to an embodiment of the present invention.
Fig. 2 is a structural diagram of a Sparse-transformer encoder of the Sparse transformer model according to the embodiment of the present invention.
Fig. 3 is a structural diagram of a Sparse-transformer decoder according to an embodiment of the present invention.
Fig. 4 is a schematic diagram of the effect of the feedforward neural network FNN according to the embodiment of the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below by way of the drawings are illustrative only and are not to be construed as limiting the invention.
It will be understood by those skilled in the art that, unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the prior art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
As used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, various embodiments or examples and features of different embodiments or examples described in this specification can be combined and combined by one skilled in the art without contradiction.
For the purpose of facilitating an understanding of the present invention, the present invention will be further explained by way of specific embodiments with reference to the accompanying drawings, which are not intended to limit the present invention.
It should be understood by those skilled in the art that the drawings are merely schematic representations of embodiments and that the elements shown in the drawings are not necessarily required to practice the invention.
Example 1
This embodiment 1 provides a target fruit identification system under the background of the same color system, which includes:
the acquiring module is used for acquiring an orchard environment image to be identified;
the recognition module is used for processing the orchard environment image to be recognized with a pre-trained recognition model to obtain a target fruit recognition result; the pre-trained recognition model is obtained by training on a training set comprising a plurality of orchard environment images and labels annotating the target fruits in those images;
wherein,
when the orchard environment image to be recognized is processed with the pre-trained recognition model, a spatial positional encoding is added to the extracted image features to compensate for the lost spatial information.
In this embodiment 1, the method for identifying a target fruit in a homochromatic background is implemented by using the above system for identifying a target fruit in a homochromatic background, and includes:
using the acquiring module to acquire an orchard environment image to be identified; for example, a Canon single-lens reflex camera can be used to acquire the orchard environment image.
Processing the orchard environment image to be recognized with the recognition module and a pre-trained recognition model to obtain a target fruit recognition result; the pre-trained recognition model is obtained by training on a training set comprising a plurality of orchard environment images and labels annotating the target fruits in those images. When the image to be recognized is processed with the pre-trained recognition model, a spatial positional encoding is added to the extracted image features to compensate for the lost spatial information.
In this embodiment 1, training the recognition model comprises: processing the training set with a deep convolutional neural network to extract features; constructing a Sparse-Transformer model to process those features; passing the result through a feedforward neural network and outputting the final detection result; then inputting test samples, evaluating the detection results with the evaluation indices, adjusting the model parameters according to the evaluation, and repeatedly training the improved model until the optimal network model is obtained.
Making the data set for training the model comprises: collecting green target-fruit images under different illumination, time periods, and angles with a single-lens reflex camera; copying target fruits smaller than a preset pixel size within each image with a small-target enhancement technique to expand the samples; classifying and labeling them to construct a data set; and dividing the expanded data set into a training set, a validation set, and a test set.
Constructing the encoder of the Sparse-Transformer model comprises: replacing the attention module that processes the feature map in the Transformer mechanism with a dilated (atrous) self-attention module; processing the image features and reducing their dimension, adding a spatial positional encoding to compensate for the lost spatial information, feeding the result through the dilated self-attention mechanism with a residual module and a regularization layer, and outputting the encoder result through a feedforward neural network with a residual module and a regularization layer.
Constructing the decoder of the Sparse-Transformer model comprises: feeding the parameters learned by the encoder through a dilated self-attention mechanism with a residual module and a regularization layer, feeding that result through a multi-head self-attention mechanism with a residual module and a regularization layer, and obtaining the detection result through a feedforward neural network with a residual module and a regularization layer.
The feedforward neural network computes its result through a multi-layer perceptron with a ReLU activation function and a hidden dimension, followed by a linear projection layer. A final loss function is constructed from the Hungarian loss function and the SoftMax loss function, the network model is optimized, and the model is trained.
Example 2
This embodiment 2 provides a method for identifying target fruits under a same-color-system background, which comprises:
acquiring an orchard environment image to be identified;
processing the orchard environment image to be recognized with a pre-trained recognition model to obtain a target fruit recognition result; the pre-trained recognition model is obtained by training on a training set comprising a plurality of orchard environment images and labels annotating the target fruits in those images. When the image to be recognized is processed with the pre-trained recognition model, a spatial positional encoding is added to the extracted image features to compensate for the lost spatial information.
In this embodiment 2, training the recognition model comprises: processing the training set with a deep convolutional neural network to extract features; constructing a Sparse-Transformer model to process those features; passing the result through a feedforward neural network and outputting the final detection result; then inputting test samples, evaluating the detection results with the evaluation indices, adjusting the model parameters according to the evaluation, and repeatedly training the improved model until the optimal network model is obtained.
As shown in fig. 1, specifically, images of green target fruits against a green background are first collected, preprocessed, and target-labeled to generate a data set; the small-target enhancement technique copies target fruits smaller than 64×64 pixels within each image, preprocessing the data, expanding the samples, and improving model precision; a Sparse-Transformer encoder-decoder network model is constructed, and a feedforward neural network is built to predict the final result; a loss function is constructed to optimize the result; finally, test samples are input, the detection results of the green-target-fruit detection model are evaluated with the evaluation indices, and the model parameters are adjusted according to the evaluation; the improved model is then trained repeatedly until the optimal network model is obtained.
The creation of the data set for training the model comprises: collecting green target-fruit images under different illumination, time periods, and angles with a single-lens reflex camera; copying target fruits smaller than a preset pixel size within each image with a small-target enhancement technique to expand the samples; classifying and labeling them to construct a data set; and dividing the expanded data set into a training set, a validation set, and a test set. Specifically, for image acquisition and classification, a Canon EOS 80D single-lens reflex camera is used to collect abundant green-fruit images in an orchard environment, and the collected images are classified to ease data-set processing. The data are preprocessed by copying target fruits smaller than 64×64 pixels within the images using the small-target enhancement technique. The images are annotated with the LabelMe software, each target fruit being labeled as an independent connected domain, to produce a COCO-format data set.
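The copy-based small-target enhancement described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name, the (x, y, w, h) box format, and the random-paste policy are assumptions. Every labeled fruit smaller than 64×64 pixels is copied to a random location in the same image, and a matching label is appended.

```python
import numpy as np

def augment_small_targets(image, boxes, max_size=64, copies=1, rng=None):
    """Copy-paste augmentation: duplicate every target smaller than
    max_size x max_size pixels to a random location in the image.
    boxes are (x, y, w, h) in pixel coordinates; returns the augmented
    image and the enlarged box list."""
    rng = np.random.default_rng(rng)
    out = image.copy()
    new_boxes = list(boxes)
    H, W = image.shape[:2]
    for (x, y, w, h) in boxes:
        if w >= max_size or h >= max_size:
            continue  # only small targets are duplicated
        patch = image[y:y + h, x:x + w]
        for _ in range(copies):
            nx = int(rng.integers(0, W - w))
            ny = int(rng.integers(0, H - h))
            out[ny:ny + h, nx:nx + w] = patch
            new_boxes.append((nx, ny, w, h))
    return out, new_boxes
```

In practice the paste location would also be checked against existing boxes to avoid occluding other labeled fruits; that check is omitted here for brevity.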
Constructing the encoder of the Sparse-Transformer model comprises: replacing the attention module that processes the feature map in the Transformer mechanism with a dilated (atrous) self-attention module; processing the image features and reducing their dimension, adding a spatial positional encoding to compensate for the lost spatial information, feeding the result through the dilated self-attention mechanism with a residual module and a regularization layer, and outputting the encoder result through a feedforward neural network with a residual module and a regularization layer.
Constructing the decoder of the Sparse-Transformer model comprises: feeding the parameters learned by the encoder through a dilated self-attention mechanism with a residual module and a regularization layer, feeding that result through a multi-head self-attention mechanism with a residual module and a regularization layer, and obtaining the detection result through a feedforward neural network with a residual module and a regularization layer.
The feedforward neural network computes its result through a multi-layer perceptron with a ReLU activation function and a hidden dimension, followed by a linear projection layer.
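Such a prediction head can be sketched numerically as follows. The dimensions (256-dimensional model, 2048-dimensional hidden layer, 4 box outputs) and the random initialization are illustrative assumptions; the patent does not specify them.

```python
import numpy as np

def ffn_head(x, w1, b1, w2, b2, w_proj, b_proj):
    """Two-layer perceptron with ReLU and a hidden dimension,
    followed by a linear projection layer."""
    h = np.maximum(x @ w1 + b1, 0.0)   # hidden layer, ReLU activation
    h = h @ w2 + b2                    # back to the model dimension
    return h @ w_proj + b_proj         # linear projection layer

rng = np.random.default_rng(0)
d_model, d_hidden, n_out = 256, 2048, 4    # n_out=4: one (cx, cy, w, h) box
x = rng.standard_normal((100, d_model))    # 100 decoder output embeddings
out = ffn_head(
    x,
    0.02 * rng.standard_normal((d_model, d_hidden)), np.zeros(d_hidden),
    0.02 * rng.standard_normal((d_hidden, d_model)), np.zeros(d_model),
    0.02 * rng.standard_normal((d_model, n_out)), np.zeros(n_out),
)
```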
Specifically, in this embodiment 2, a network backbone is constructed to extract features. The conventional CNN backbone starts from the initial image $x \in \mathbb{R}^{3 \times H_0 \times W_0}$ (3 color channels) and generates a low-resolution activation-map feature $f \in \mathbb{R}^{C \times H \times W}$. The feature values used in this embodiment 2 are $C = 2048$, $H = H_0/32$, $W = W_0/32$.
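Assuming the stride-32 backbone implied by these values, the feature-map shape follows directly; a trivial helper (not from the patent) makes the arithmetic explicit:

```python
def backbone_feature_shape(h0, w0, channels=2048, stride=32):
    """Shape (C, H, W) of the low-resolution activation map produced
    by a stride-32 CNN backbone from a 3 x H0 x W0 input image."""
    return channels, h0 // stride, w0 // stride
```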
As shown in fig. 2, constructing the Sparse-Transformer encoder comprises: using the dilated (atrous) self-attention module in place of the attention module in the Transformer mechanism that processes the feature map. The image features are processed and reduced in dimension, a spatial positional encoding is added to compensate for the lost spatial information, the result is fed through the dilated self-attention mechanism with a residual module and a regularization layer, and the encoder result is output through a feedforward neural network with a residual module and a regularization layer. As shown in fig. 4, the result improves after processing by the feedforward neural network FNN.
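The spatial positional encoding added before the encoder can be sketched with fixed sinusoids, a common choice for Transformer detectors; the patent does not fix the exact formula, so everything below is an assumed illustration. Half of the channels encode the row index and half the column index of each flattened feature-map cell.

```python
import numpy as np

def positional_encoding_2d(h, w, d_model):
    """Fixed sinusoidal spatial positional encoding for an h x w feature
    map flattened to h*w tokens. Returns an (h*w, d_model) array."""
    assert d_model % 4 == 0
    d = d_model // 2

    def encode(pos, d):
        i = np.arange(d // 2)
        freq = 1.0 / (10000 ** (2 * i / d))          # geometric frequencies
        ang = pos[:, None] * freq[None, :]
        return np.concatenate([np.sin(ang), np.cos(ang)], axis=1)

    row = encode(np.arange(h, dtype=float), d)       # (h, d): row identity
    col = encode(np.arange(w, dtype=float), d)       # (w, d): column identity
    return np.concatenate(
        [np.repeat(row, w, axis=0), np.tile(col, (h, 1))], axis=1)
```

The encoding is simply added to the flattened feature tokens before self-attention, so the model can recover the 2-D layout that flattening destroys.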
As shown in fig. 3, constructing the Sparse-Transformer decoder comprises: the decoder is built from several attention mechanisms, namely a multi-head self-attention mechanism and a dilated self-attention mechanism. The parameters learned by the encoder are first fed through the dilated self-attention mechanism with a residual module and a regularization layer; the result is then fed through the multi-head self-attention mechanism with a residual module and a regularization layer, and the detection result is obtained through a feedforward neural network with a residual module and a regularization layer.
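The multi-head self-attention step can be sketched numerically as follows. This is a generic single-layer sketch under assumed dimensions, not the patent's exact configuration, and it omits the residual and regularization layers described above.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def multi_head_self_attention(x, wq, wk, wv, wo, n_heads):
    """Minimal multi-head self-attention.
    x: (n, d_model); wq/wk/wv/wo: (d_model, d_model) projections."""
    n, d = x.shape
    dh = d // n_heads
    # project and split into heads: (n_heads, n, dh)
    q = (x @ wq).reshape(n, n_heads, dh).transpose(1, 0, 2)
    k = (x @ wk).reshape(n, n_heads, dh).transpose(1, 0, 2)
    v = (x @ wv).reshape(n, n_heads, dh).transpose(1, 0, 2)
    att = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(dh))  # (heads, n, n)
    out = (att @ v).transpose(1, 0, 2).reshape(n, d)       # merge heads
    return out @ wo
```

The dilated variant would restrict each query to a strided subset of key positions, which is what makes the attention "sparse".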
In this embodiment 2, the model is evaluated and the network model is optimized: test samples are input, the detection results of the green-fruit detection model are evaluated with the evaluation indices, the model parameters are adjusted according to the evaluation, and the improved model is trained repeatedly until the optimal network model is obtained. The specific process is as follows:
The model is evaluated with recall and precision, which provide the basis for optimizing the model; training and evaluation are repeated according to the recall and precision until an optimized result is obtained.
In this embodiment 2, a final loss function is constructed from the Hungarian loss function and the SoftMax loss function, the network model is optimized, and model training is performed. The specific steps are as follows:
Using $y$ to denote the ground-truth set and $\hat{y}$ the prediction set, a bipartite matching between the two sets is found with the following formula:

$\hat{\sigma} = \arg\min_{\sigma \in \mathfrak{S}_N} \sum_{i=1}^{N} \mathcal{L}_{match}(y_i, \hat{y}_{\sigma(i)})$

where $\mathcal{L}_{match}(y_i, \hat{y}_{\sigma(i)})$ is the pairwise matching loss between the ground truth $y_i$ and the prediction with index $\sigma(i)$, $\mathfrak{S}_N$ denotes the permutations of $N$ elements, and $N$ is the fixed size of the prediction set. The optimization works on the basis of the Hungarian algorithm.
The Softmax function, used frequently in deep learning, maps several input numbers to real numbers between 0 and 1 that sum to 1 after normalization. Its formula is:

$S_i = \dfrac{e^{V_i}}{\sum_{j=1}^{T} e^{V_j}}$

where $T$ is the number of elements; the ratio of the exponential of each element to the sum of the exponentials of all elements is computed.
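The formula is a one-liner in practice; the max-shift in the sketch below is a standard numerical-stability trick not mentioned in the text:

```python
import numpy as np

def softmax(v):
    """S_i = exp(V_i) / sum_j exp(V_j): maps T real inputs to values
    in (0, 1) that sum to 1. Shifting by max(v) avoids overflow
    without changing the result."""
    e = np.exp(v - np.max(v))
    return e / e.sum()
```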
That is, the loss function is:

$\mathcal{L}_{Hungarian}(y, \hat{y}) = \sum_{i=1}^{N} \left[ -\log \hat{p}_{\hat{\sigma}(i)}(c_i) + \mathbb{1}_{\{c_i \neq \varnothing\}} \, \mathcal{L}_{box}(b_i, \hat{b}_{\hat{\sigma}(i)}) \right]$
step 4.3: will l1Loss function and GLOU loss function
Figure BDA0003315124950000107
Combining the two functions on the basis of scale invariance to establish a boundary frame loss function of the user and combine the boundary frame loss function with the boundary frame loss function
Figure BDA0003315124950000108
Is defined as:
Figure BDA0003315124950000109
$l_1$ loss function: the images are compared pixel by pixel and the absolute value of the difference is taken, where x(p) denotes a pixel of the original image and y(p) the corresponding pixel of the computed image. The formula is:

$$\mathcal{L}_1 = \sum_{p} \left| x(p) - y(p) \right|$$
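The pixel-wise comparison can be sketched directly; the tiny flattened "images" below are hypothetical values chosen only to exercise the formula:

```python
def l1_loss(x, y):
    """Sum of pixel-wise absolute differences |x(p) - y(p)|.

    In practice this sum is often further normalized, e.g. by the
    number of objects in the batch as described in the text.
    """
    assert len(x) == len(y)
    return sum(abs(a - b) for a, b in zip(x, y))

# Two tiny flattened "images" (hypothetical pixel values).
loss = l1_loss([0.0, 0.5, 1.0, 0.25], [0.1, 0.5, 0.8, 0.25])
```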
The GIoU loss function $\mathcal{L}_{iou}$ is shown below, where A and B denote the two bounding-box regions and C is the smallest box enclosing both:

$$\mathcal{L}_{iou} = 1 - \left( \frac{|A \cap B|}{|A \cup B|} - \frac{|C \setminus (A \cup B)|}{|C|} \right)$$

$\lambda_{iou} \in \mathbb{R}$ and $\lambda_{L1} \in \mathbb{R}$ are hyperparameters, and both terms are normalized by the number of objects in the batch; $\mathcal{L}_1$ denotes the $l_1$ loss function.
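A minimal sketch of the GIoU loss for axis-aligned boxes in (x1, y1, x2, y2) form; the helper is ours and follows the standard GIoU definition rather than any patent-specific variant:

```python
def giou_loss(a, b):
    """Generalised-IoU loss for axis-aligned boxes (x1, y1, x2, y2).

    L_giou = 1 - (IoU - |C \\ (A u B)| / |C|), where C is the smallest
    box enclosing both A and B; unlike plain IoU it stays informative
    even when the boxes do not overlap.
    """
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    iou = inter / union
    # Area of the smallest enclosing box C.
    c_area = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    giou = iou - (c_area - union) / c_area
    return 1.0 - giou

# Identical boxes give zero loss; disjoint boxes give a loss above 1.
zero = giou_loss((0, 0, 2, 2), (0, 0, 2, 2))
```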
In summary, embodiment 2 uses a Sparse-Transformer encoder-decoder model to address two weaknesses of fruit-picking robot vision systems: poor fruit-detection efficiency and insensitivity to small targets. The method is accurate and fast, and thus better meets agricultural needs such as fruit-picking robots and yield prediction. A small-target enhancement technique expands the sample space, so the model adapts well to small datasets, generalizes strongly, and can be applied to the vision systems of robots that pick, or estimate the yield of, a variety of fruits.
Example 3
Embodiment 3 provides a fruit picking robot that includes a system for identifying target fruits against a background of the same color family; the system implements the identification method described above, comprising:
acquiring an orchard environment image to be identified;
processing the orchard environment image to be recognized with a pre-trained recognition model to obtain a target fruit recognition result; the pre-trained recognition model is obtained by training on a training set, wherein the training set comprises a plurality of orchard environment images and labels marking the target fruits in the orchard environment images;
wherein,
and when the orchard environment image to be recognized is processed by the pre-trained recognition model, spatial position codes are added to the extracted image features to compensate for the lost information.
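The patent does not disclose the exact form of its spatial position code, so the standard sinusoidal positional encoding of Transformers is used below only as a plausible stand-in; both function names are ours:

```python
import math

def sincos_encoding(pos, d_model):
    """1-D sinusoidal positional code of length d_model for index pos.

    Even dimensions use sine, odd dimensions cosine, with wavelengths
    forming a geometric progression -- the standard Transformer scheme.
    """
    enc = []
    for i in range(d_model):
        angle = pos / (10000 ** (2 * (i // 2) / d_model))
        enc.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return enc

def add_position(features):
    """Add a positional code to each vector of a flattened feature map,
    so attention layers can recover where each feature came from."""
    d = len(features[0])
    return [[f + p for f, p in zip(vec, sincos_encoding(pos, d))]
            for pos, vec in enumerate(features)]

# Three zero feature vectors: after encoding, each position is distinct.
encoded = add_position([[0.0, 0.0, 0.0, 0.0]] * 3)
```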
Example 4
Embodiment 4 of the present invention provides a non-transitory computer-readable storage medium storing computer instructions that, when executed by a processor, implement the method for identifying a target fruit in a same-color-family background as described above, the method comprising:
acquiring an orchard environment image to be identified;
processing the orchard environment image to be recognized with a pre-trained recognition model to obtain a target fruit recognition result; the pre-trained recognition model is obtained by training on a training set, wherein the training set comprises a plurality of orchard environment images and labels marking the target fruits in the orchard environment images;
wherein,
and when the orchard environment image to be recognized is processed by the pre-trained recognition model, spatial position codes are added to the extracted image features to compensate for the lost information.
Example 5
Embodiment 5 of the present invention provides a computer program (product) comprising a computer program, which when run on one or more processors, is configured to implement a method for identifying a target fruit in a homochromatic background as described above, the method comprising:
acquiring an orchard environment image to be identified;
processing the orchard environment image to be recognized with a pre-trained recognition model to obtain a target fruit recognition result; the pre-trained recognition model is obtained by training on a training set, wherein the training set comprises a plurality of orchard environment images and labels marking the target fruits in the orchard environment images;
wherein,
and when the orchard environment image to be recognized is processed by the pre-trained recognition model, spatial position codes are added to the extracted image features to compensate for the lost information.
Example 6
An embodiment 6 of the present invention provides an electronic device, including: a processor, a memory, and a computer program; wherein a processor is connected to the memory, a computer program is stored in the memory, and when the electronic device runs, the processor executes the computer program stored in the memory to make the electronic device execute the instructions for implementing the target fruit identification method in the same color family background as described above, the method includes:
acquiring an orchard environment image to be identified;
processing the orchard environment image to be recognized with a pre-trained recognition model to obtain a target fruit recognition result; the pre-trained recognition model is obtained by training on a training set, wherein the training set comprises a plurality of orchard environment images and labels marking the target fruits in the orchard environment images;
wherein,
and when the orchard environment image to be recognized is processed by the pre-trained recognition model, spatial position codes are added to the extracted image features to compensate for the lost information.
In summary, the method and the system for identifying the target fruit in the homochromatic background according to the embodiments of the present invention use the Sparse-transformer encoder-decoder model to solve the problems of poor fruit detection efficiency and insensitivity to small target in the visual system of the fruit picking robot. The identification precision is high, the speed is high, and the agricultural requirements of fruit picking robots, yield prediction and the like are well met. The small target enhancement technology is used for expanding the sample space, the small sample data set is well adapted, the generalization capability is strong, and the method can be applied to robot vision systems for picking or pre-producing various fruits.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Although the embodiments of the present invention have been described above with reference to the accompanying drawings, they do not limit the scope of the invention; those skilled in the art should understand that various modifications and variations can be made, without inventive effort, on the basis of the technical solutions disclosed herein.

Claims (10)

1. A method for identifying target fruits in the same color system background is characterized by comprising the following steps:
acquiring an orchard environment image to be identified;
processing the orchard environment image to be recognized with a pre-trained recognition model to obtain a target fruit recognition result; the pre-trained recognition model is obtained by training on a training set, wherein the training set comprises a plurality of orchard environment images and labels marking the target fruits in the orchard environment images;
wherein,
and when the orchard environment image to be recognized is processed by the pre-trained recognition model, spatial position codes are added to the extracted image features to compensate for the lost information.
2. The method of claim 1, wherein training the recognition model comprises: processing the training set with a deep convolutional neural network to extract features; constructing a sparse Transformer model to process the features; processing through a feedforward neural network and outputting the final detection result; and inputting test samples, evaluating the obtained detection results with evaluation indices, adjusting the model parameters according to the evaluation results, and repeatedly training the improved model until an optimal network model is obtained.
3. The method for identifying target fruits in the same color family background as claimed in claim 2, wherein a single-lens reflex camera is used to collect images of green target fruits under different illumination, in different time periods and from different angles; target fruits smaller than a preset number of pixels in the images are copied with a small-target enhancement technique to expand the samples, which are classified and labeled to construct a data set; and the expanded data set is divided into a training set, a validation set and a test set.
4. The method for identifying target fruits in the same color family background as claimed in claim 2, wherein the encoder of the constructed sparse Transformer model: replaces the attention module that processes feature maps in the Transformer mechanism with a dilated (atrous) self-attention module; processes the image features and reduces their dimension, adds spatial position codes to compensate for lost information, and inputs the result into the dilated self-attention mechanism, a residual module and a regularization layer for processing; and outputs the encoder result through a feedforward neural network, a residual module and a regularization layer.
5. The method for identifying target fruits in the same color family background as claimed in claim 4, wherein the decoder of the constructed sparse Transformer model: inputs the parameters learned by the encoder into a dilated self-attention mechanism, a residual module and a regularization layer for processing; inputs the processed result into a multi-head self-attention mechanism, a residual module and a regularization layer; and obtains the detection result through a feedforward neural network, a residual module and a regularization layer.
6. The method of claim 5, wherein the feedforward neural network computes the result through a multi-layer perceptron with a ReLU activation function and a hidden dimension, followed by a linear projection layer.
7. The method for identifying the target fruit under the homochromatic system background as claimed in claim 2, wherein a Hungarian loss function and a SoftMax loss function are used for constructing a final loss function, optimizing a network model and performing model training.
8. A system for identifying a target fruit in a homochromatic background, comprising:
the acquiring module is used for acquiring an orchard environment image to be identified;
the recognition module is used for processing the orchard environment image to be recognized with a pre-trained recognition model to obtain a target fruit recognition result; the pre-trained recognition model is obtained by training on a training set, wherein the training set comprises a plurality of orchard environment images and labels marking the target fruits in the orchard environment images;
wherein,
and when the orchard environment image to be recognized is processed by the pre-trained recognition model, spatial position codes are added to the extracted image features to compensate for the lost information.
9. A non-transitory computer readable storage medium for storing computer instructions which, when executed by a processor, implement the method of identifying a target fruit in a homochromatic context of any of claims 1-6.
10. An electronic device, comprising: a processor, a memory, and a computer program; wherein a processor is connected to a memory, a computer program being stored in the memory, the processor executing the computer program stored in the memory when the electronic device is running, to cause the electronic device to execute instructions to implement the method for identifying a target fruit in a homochromatic context as claimed in any of the claims 1-6.
CN202111228465.9A 2021-10-21 2021-10-21 Method and system for identifying target fruits under homochromatic system background Pending CN114187590A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111228465.9A CN114187590A (en) 2021-10-21 2021-10-21 Method and system for identifying target fruits under homochromatic system background


Publications (1)

Publication Number Publication Date
CN114187590A true CN114187590A (en) 2022-03-15

Family

ID=80539840

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111228465.9A Pending CN114187590A (en) 2021-10-21 2021-10-21 Method and system for identifying target fruits under homochromatic system background

Country Status (1)

Country Link
CN (1) CN114187590A (en)


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210224998A1 (en) * 2018-11-23 2021-07-22 Tencent Technology (Shenzhen) Company Limited Image recognition method, apparatus, and system and storage medium
CN111210010A (en) * 2020-01-15 2020-05-29 上海眼控科技股份有限公司 Data processing method and device, computer equipment and readable storage medium
CN113076819A (en) * 2021-03-17 2021-07-06 山东师范大学 Fruit identification method and device under homochromatic background and fruit picking robot
CN113269182A (en) * 2021-04-21 2021-08-17 山东师范大学 Target fruit detection method and system based on small-area sensitivity of variant transform
CN113221874A (en) * 2021-06-09 2021-08-06 上海交通大学 Character recognition system based on Gabor convolution and linear sparse attention

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
JIA Weikuan et al.: "Efficient detection model for green target fruits based on an optimized Transformer network", Transactions of the Chinese Society of Agricultural Engineering, vol. 37, no. 14, 23 July 2021 (2021-07-23), pages 163-170 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114663814A (en) * 2022-03-28 2022-06-24 安徽农业大学 Fruit detection and yield estimation method and system based on machine vision
CN114700941A (en) * 2022-03-28 2022-07-05 中科合肥智慧农业协同创新研究院 Strawberry picking method based on binocular vision and robot system
CN114700941B (en) * 2022-03-28 2024-02-27 中科合肥智慧农业协同创新研究院 Strawberry picking method based on binocular vision and robot system
CN114663814B (en) * 2022-03-28 2024-08-23 安徽农业大学 Fruit detection and yield estimation method and system based on machine vision
CN115952830A (en) * 2022-05-18 2023-04-11 北京字跳网络技术有限公司 Data processing method and device, electronic equipment and storage medium
CN115952830B (en) * 2022-05-18 2024-04-30 北京字跳网络技术有限公司 Data processing method, device, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
CN111768432B (en) Moving target segmentation method and system based on twin deep neural network
Mathur et al. Crosspooled FishNet: transfer learning based fish species classification model
CN111191583B (en) Space target recognition system and method based on convolutional neural network
CN108021947B (en) A kind of layering extreme learning machine target identification method of view-based access control model
CN114187590A (en) Method and system for identifying target fruits under homochromatic system background
CN112364931B (en) Few-sample target detection method and network system based on meta-feature and weight adjustment
CN114332621B (en) Disease and pest identification method and system based on multi-model feature fusion
Lee et al. Plant Identification System based on a Convolutional Neural Network for the LifeClef 2016 Plant Classification Task.
CN114187450A (en) Remote sensing image semantic segmentation method based on deep learning
CN113920472B (en) Attention mechanism-based unsupervised target re-identification method and system
CN113269182A (en) Target fruit detection method and system based on small-area sensitivity of variant transform
CN114724155A (en) Scene text detection method, system and equipment based on deep convolutional neural network
CN111125397B (en) Cloth image retrieval method based on convolutional neural network
CN113034506A (en) Remote sensing image semantic segmentation method and device, computer equipment and storage medium
CN114359631A (en) Target classification and positioning method based on coding-decoding weak supervision network model
CN116434045B (en) Intelligent identification method for tobacco leaf baking stage
CN114612802A (en) System and method for classifying fine granularity of ship target based on MBCNN
CN113420173A (en) Minority dress image retrieval method based on quadruple deep learning
CN113076819A (en) Fruit identification method and device under homochromatic background and fruit picking robot
CN113496221B (en) Point supervision remote sensing image semantic segmentation method and system based on depth bilateral filtering
CN114463614A (en) Significance target detection method using hierarchical significance modeling of generative parameters
CN113011506A (en) Texture image classification method based on depth re-fractal spectrum network
Si et al. Image semantic segmentation based on improved DeepLab V3 model
CN117872127A (en) Motor fault diagnosis method and equipment
CN117710841A (en) Small target detection method and device for aerial image of unmanned aerial vehicle

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination