CN116704264B - Animal classification method, classification model training method, storage medium, and electronic device - Google Patents
- Publication number
- CN116704264B (CN202310852863.0A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G06N3/0455—Auto-encoder networks; Encoder-decoder networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/048—Activation functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/52—Scale-space analysis, e.g. wavelet analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/54—Extraction of image or video features relating to texture
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/56—Extraction of image or video features relating to colour
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
- G06V10/806—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/18—Eye characteristics, e.g. of the iris
- G06V40/193—Preprocessing; Feature extraction
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A40/00—Adaptation technologies in agriculture, forestry, livestock or agroalimentary production
- Y02A40/70—Adaptation technologies in agriculture, forestry, livestock or agroalimentary production in livestock or poultry
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Evolutionary Computation (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Multimedia (AREA)
- Artificial Intelligence (AREA)
- Software Systems (AREA)
- Computing Systems (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Data Mining & Analysis (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Molecular Biology (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Medical Informatics (AREA)
- Databases & Information Systems (AREA)
- Human Computer Interaction (AREA)
- Ophthalmology & Optometry (AREA)
- Image Analysis (AREA)
Abstract
Embodiments of the application disclose an animal classification method, a classification model training method, a storage medium, and an electronic device. The animal classification method comprises: acquiring a head picture of an animal to be identified; obtaining, through a trained classification model, a classification number corresponding to the head picture, wherein the classification number comprises an ID number for each attribute feature in the head picture; and comparing the classification number with the classification numbers in an animal variety library to determine the variety of the animal to be identified. The method can improve the accuracy of classifying the animal to be identified.
Description
Technical Field
The application relates to the field of picture processing, and in particular to an animal classification method, a classification model training method, a storage medium, and an electronic device.
Background
Animal iris recognition is an active research direction in the field of biometric recognition. As the technology has developed, animal iris recognition has made notable progress and has become an effective tool in fields such as the identification of farmed animals and wild animal population surveys. By capturing the texture features of an animal's iris, iris recognition can identify individual animals with high accuracy, overcoming the limitations of traditional methods and improving both the accuracy and the efficiency of identification.
A monitoring device acquires a head picture of the animal and extracts the texture features of the animal's iris from the head picture to identify the animal, which provides an efficient and accurate method for animal management. However, animals are highly diverse: head attribute features differ greatly between species, and even between different varieties of the same species, so the irises differ greatly as well. The type of the animal to be identified therefore needs to be determined before iris recognition, so that the iris recognition method can be adjusted to the head attribute features of each type of animal.
In the prior art, animal classification methods can identify only the species of an animal, so the accuracy of classifying the animal to be identified is low, which in turn lowers the accuracy of iris recognition.
Disclosure of Invention
The application provides an animal classification method, a classification model training method, a storage medium, and an electronic device, which can improve the accuracy of classifying animals to be identified.
In a first aspect of the present application, there is provided a method of classifying animals comprising:
acquiring a head picture of an animal to be identified;
acquiring a classification number corresponding to the head picture through a trained classification model, wherein the classification number comprises an ID number of each attribute feature in the head picture;
and comparing the classification number with classification numbers in an animal variety library to determine the variety of the animal to be identified.
With this technical solution, the trained classification model determines the classification number corresponding to the head picture of the animal to be identified, and the variety is then determined by comparing that classification number with the classification numbers in the animal variety library. Because the classification number encodes the attribute features of the animal, identification focuses on those attribute features. Compared with the prior art, the specific variety of the animal can be determined from the attribute features, so the classification is more accurate.
Optionally, after comparing the classification number with the classification numbers in the animal variety library to determine the variety of the animal to be identified, the method further comprises:
determining the identity of the animal to be identified according to the variety of the animal to be identified and the iris picture in the head picture.
With this technical solution, the iris picture of the animal to be identified can be extracted from the head picture, texture features can be extracted from the iris picture, and the extracted texture features can be compared with the texture features in the iris database that correspond to the determined variety, thereby determining the identity of the animal. Compared with the prior art, fewer iris texture features need to be compared, which reduces the comparison time and improves the efficiency of iris recognition.
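The reduction in comparisons described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that iris texture features are stored as equal-length binary codes grouped by variety and that a match is the stored code with the fewest mismatching positions; the names `identify`, `iris_db` and `max_mismatches` are hypothetical and do not come from the application.

```python
from typing import Dict, List, Optional, Tuple

# (animal identity, binary iris texture code); the binary encoding is an assumption.
IrisRecord = Tuple[str, List[int]]


def identify(
    variety: str,
    iris_code: List[int],
    iris_db: Dict[str, List[IrisRecord]],
    max_mismatches: int,
) -> Optional[str]:
    """Compare the query code only against records stored for the given variety."""
    best_id: Optional[str] = None
    best = max_mismatches + 1
    for animal_id, stored in iris_db.get(variety, []):
        # Number of positions where the two codes differ.
        mismatches = sum(a != b for a, b in zip(iris_code, stored))
        if mismatches < best:
            best_id, best = animal_id, mismatches
    return best_id
```

Because only the records filed under the determined variety are scanned, the number of comparisons is bounded by the size of that variety's group rather than by the size of the whole database.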
Optionally, comparing the classification number with the classification numbers in the animal variety library to determine the variety of the animal to be identified comprises:
calculating the Hamming distance between the classification number and each classification number in the animal variety library to obtain a plurality of Hamming distances;
and determining the variety corresponding to the smallest of the Hamming distances that are below a threshold as the variety of the animal to be identified.
With this technical solution, when every Hamming distance is above the threshold, the animal to be identified differs substantially from all varieties in the animal variety library. In that case the head picture may be unclear or affected by other factors, or the animal may belong to an unknown variety. Setting the threshold therefore improves the accuracy and reliability of the recognition result.
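The thresholded Hamming comparison can be sketched as follows. The sketch assumes a classification number is a fixed-length sequence of attribute ID numbers, one per position; the library contents, the variety names and the function names are hypothetical.

```python
from typing import Dict, Optional, Sequence


def hamming_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Count the positions at which two equal-length ID sequences differ."""
    if len(a) != len(b):
        raise ValueError("classification numbers must have the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def match_variety(
    query: Sequence[int],
    variety_library: Dict[str, Sequence[int]],
    threshold: int,
) -> Optional[str]:
    """Return the variety with the smallest distance below the threshold, else None."""
    best_name: Optional[str] = None
    best_dist: Optional[int] = None
    for name, number in variety_library.items():
        dist = hamming_distance(query, number)
        if dist < threshold and (best_dist is None or dist < best_dist):
            best_name, best_dist = name, dist
    return best_name


# Hypothetical library: each position holds the ID of one head attribute feature.
LIBRARY = {
    "variety_a": [1, 0, 2, 3],
    "variety_b": [1, 1, 2, 0],
    "variety_c": [4, 4, 4, 4],
}
```

A return value of `None` corresponds to the case described above in which no distance falls below the threshold, so the picture may be unclear or the variety unknown.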
In a second aspect, the present application provides a classification model training method, comprising:
building an initial classification model, wherein the initial classification model comprises a Backbone module, a Mask module, a branch head module, an FPN module and a detection head module, the Mask module comprises Mask units for different attribute features, the branch head module comprises branch heads for different attribute features, and the detection head module comprises at least one eye attribute feature detection head;
acquiring an animal head picture set, and performing a feature extraction operation and a downsampling operation on the animal head picture set through the Backbone module to obtain a plurality of downsampled picture sets with feature scales from high to low;
marking each downsampled picture set through the Mask units to obtain a first attribute feature picture set corresponding to each downsampled picture set;
inputting each first attribute feature picture set into the branch heads in the branch head module, and adjusting parameters of the branch head module according to the accuracy of the output first classification result and an accuracy threshold;
performing splicing and fusion processing on the first attribute feature picture sets through the FPN module to obtain at least one second attribute feature picture set;
inputting each first attribute feature picture set into the branch heads in the adjusted branch head module, inputting each second attribute feature picture set into the eye attribute feature detection heads in the detection head module, and outputting a second classification result;
and if the loss value of the second classification result does not reach a loss threshold, adjusting the parameters of the initial classification model according to the loss value until the loss value reaches the loss threshold, to obtain the trained classification model.
With this technical solution, the initial classification model is provided with a Backbone module, a Mask module, a branch head module, an FPN module and a detection head module. The Backbone module extracts from the animal head picture set a plurality of downsampled picture sets with feature scales from high to low, so that different attribute features of the animal are classified and identified from downsampled picture sets at different scales. This captures the attribute features of the animal's head more comprehensively and improves the accuracy and robustness of the classification model. In addition, features at different scales provide complementary information, which further improves the performance of the classification model.
Optionally, marking each downsampled picture set through the Mask units to obtain a first attribute feature picture set corresponding to each downsampled picture set comprises:
inputting each downsampled picture set into the corresponding Mask unit according to the feature scale of the downsampled picture set;
determining the distribution of attribute feature points in the downsampled picture set through the Mask unit, and obtaining a feature point probability distribution map according to the distribution of attribute feature points;
determining at least one first region in the feature point probability distribution map in which the feature point distribution probability is greater than a threshold;
and generating a two-dimensional Gaussian distribution according to each first region and a probability distribution formula, and assigning weight values to the feature points in the downsampled picture set according to the two-dimensional Gaussian distribution to obtain the first attribute feature picture set.
With this technical solution, the position and distribution of the attribute features can be determined more accurately from the feature point distribution and the probability distribution map. Using a two-dimensional Gaussian distribution to assign weight values to the feature points weights the attribute features more accurately, which improves the accuracy with which the classification model classifies the attribute features.
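The weighting step can be illustrated with a small sketch. The text here does not give the probability distribution formula, so the sketch assumes an isotropic two-dimensional Gaussian, w(r, c) = exp(-((r - r0)^2 + (c - c0)^2) / (2 * sigma^2)), centred on the centroid of the cells whose probability exceeds the threshold. It also treats all such cells as a single region. The function names and the `sigma` parameter are hypothetical.

```python
import math
from typing import List, Tuple

Grid = List[List[float]]


def region_center(prob_map: Grid, threshold: float) -> Tuple[float, float]:
    """Centroid (row, col) of all cells whose probability exceeds the threshold."""
    cells = [
        (r, c)
        for r, row in enumerate(prob_map)
        for c, p in enumerate(row)
        if p > threshold
    ]
    if not cells:
        raise ValueError("no cell exceeds the threshold")
    return (
        sum(r for r, _ in cells) / len(cells),
        sum(c for _, c in cells) / len(cells),
    )


def gaussian_weights(
    shape: Tuple[int, int], center: Tuple[float, float], sigma: float
) -> Grid:
    """Isotropic 2-D Gaussian weight map with peak value 1.0 at the center."""
    rows, cols = shape
    cr, cc = center
    return [
        [
            math.exp(-((r - cr) ** 2 + (c - cc) ** 2) / (2.0 * sigma ** 2))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def apply_weights(features: Grid, weights: Grid) -> Grid:
    """Element-wise product of a feature map and a weight map."""
    return [[f * w for f, w in zip(fr, wr)] for fr, wr in zip(features, weights)]
```

Feature points near the high-probability region keep most of their value, while points far from it are attenuated, which is one way to realise the weighting the paragraph describes.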
Optionally, inputting each first attribute feature picture set into the branch heads in the branch head module and adjusting parameters of the branch head module according to the output first classification result comprises:
freezing each branch head in the branch head module;
inputting a first attribute feature picture set into a first branch head in the branch head module, unfreezing the first branch head, taking the first branch head as a target branch head, and adjusting parameters of the target branch head according to the first classification result output by the target branch head until the first classification result reaches a preset accuracy;
freezing the target branch head, unfreezing the next branch head and taking it as the target branch head, and repeating the step of adjusting parameters of the target branch head according to the first classification result output by the target branch head until the first classification result reaches the preset accuracy, until parameter adjustment of the branch head module is complete.
With this technical solution, the branch heads are frozen and unfrozen in turn and the parameters of each are adjusted according to the output of the target branch head, which avoids mutual interference between branch heads and improves the training effect of the initial classification model.
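The freeze, unfreeze and refreeze sequence can be sketched with a toy stand-in for the branch heads. Nothing below is the application's implementation: `BranchHead`, its single `weight` parameter, the `accuracy_of` callback and the fixed `step` update are hypothetical simplifications chosen only to show the control flow.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class BranchHead:
    """Toy stand-in for one attribute branch head (hypothetical)."""
    name: str
    weight: float = 0.0
    frozen: bool = False

    def update(self, step: float) -> None:
        # A frozen head ignores parameter updates.
        if not self.frozen:
            self.weight += step


def train_heads_sequentially(
    heads: List[BranchHead],
    accuracy_of: Callable[[BranchHead], float],
    preset_accuracy: float,
    step: float = 0.25,
    max_steps: int = 1000,
) -> List[str]:
    """Freeze all heads, then unfreeze, tune and refreeze them one at a time."""
    for head in heads:
        head.frozen = True
    order: List[str] = []
    for target in heads:
        target.frozen = False  # unfreeze only the target head
        steps = 0
        while accuracy_of(target) < preset_accuracy and steps < max_steps:
            # The update is offered to every head, but only the unfrozen one changes.
            for head in heads:
                head.update(step)
            steps += 1
        target.frozen = True  # refreeze before moving to the next head
        order.append(target.name)
    return order
```

In a deep learning framework the same effect is usually obtained by disabling gradient updates for the frozen parameters; in PyTorch, for example, this is commonly done by setting `requires_grad` to `False` on them.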
Optionally, the FPN module comprises at least two FPN units, and performing splicing and fusion processing on the first attribute feature picture sets through the FPN module to obtain at least one second attribute feature picture set comprises:
inputting at least one first attribute feature picture set into each FPN unit, in order of feature scale from high to low;
extracting feature information from the first attribute feature picture set through a first FPN unit, and outputting the feature information to the next FPN unit;
and taking the next FPN unit as a target FPN unit, extracting first feature information from the first attribute feature picture set received by the target FPN unit, fusing the first feature information with the feature information output by the previous FPN unit to obtain target feature information and a second attribute feature picture set, and outputting the target feature information to the next FPN unit, repeating this step until all FPN units have produced their output.
With this technical solution, the FPN module processes and fuses the feature information stage by stage: each FPN unit extracts feature information at a different scale from the first attribute feature picture sets and fuses it into the second attribute feature picture set. This helps the downstream detection head module better capture the attribute feature information in the second attribute feature picture set and improves classification accuracy.
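The unit-by-unit fusion can be sketched on one-dimensional toy features. The sketch makes several assumptions that are not taken from the application: feature extraction inside a unit is the identity, the carried-over features are resized by nearest-neighbour resampling, and fusion is element-wise addition. The names `resize` and `fuse_chain` are hypothetical.

```python
from typing import List

Feature = List[float]


def resize(feature: Feature, length: int) -> Feature:
    """Nearest-neighbour resampling of a 1-D feature to the given length."""
    n = len(feature)
    return [feature[i * n // length] for i in range(length)]


def fuse_chain(inputs: List[Feature]) -> List[Feature]:
    """Run the units in order; each fuses its own input with the previous output."""
    outputs: List[Feature] = []
    previous: Feature = []
    for own in inputs:
        if previous:
            carried = resize(previous, len(own))
            fused = [a + b for a, b in zip(own, carried)]
        else:
            fused = list(own)  # the first unit has nothing to fuse with
        outputs.append(fused)
        previous = fused  # passed on to the next unit
    return outputs
```

For inputs ordered from the largest feature scale to the smallest, every output after the first carries information from all earlier scales, which is the stage-by-stage behaviour the paragraph describes.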
Optionally, adjusting the parameters of the initial classification model according to the loss value comprises:
determining a first number of training iterations within the total number of training iterations as a target number, enabling the Mask unit corresponding to the target number, and adjusting the initial classification model according to the loss value until the number of training iterations reaches the target number;
and determining a second number of training iterations within the total number of training iterations as the target number, and repeating the step of enabling the Mask unit corresponding to the target number and adjusting the parameters of the initial classification model according to the loss value until the number of training iterations reaches the target number, until the number of training iterations of the initial classification model reaches the total number of training iterations.
With this technical solution, the initial classification model is trained iteratively, stage by stage, and its parameters are adjusted according to the loss value until each target number of training iterations is reached, which improves the performance of the classification model.
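The staged schedule can be sketched as a generator that reports which Mask unit is enabled at each training iteration. It assumes the total number of iterations is split into consecutive stages and that stage k enables Mask unit k; the name `mask_schedule` and the stage lengths are hypothetical.

```python
from typing import Iterator, List, Tuple


def mask_schedule(stage_lengths: List[int]) -> Iterator[Tuple[int, int]]:
    """Yield (iteration, active_mask_index) for every training iteration."""
    iteration = 0
    for mask_index, length in enumerate(stage_lengths):
        for _ in range(length):
            yield iteration, mask_index
            iteration += 1


# Hypothetical split: 2 iterations with Mask unit 0, then 3 with Mask unit 1.
SCHEDULE = list(mask_schedule([2, 3]))
```

A training loop would iterate over such a schedule, enable the indicated Mask unit, and apply the loss-based parameter update at each step; the total number of iterations is the sum of the stage lengths.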
In a third aspect, the present application provides a computer storage medium storing a plurality of instructions adapted to be loaded by a processor to perform the above method steps.
In a fourth aspect, the present application provides an electronic device comprising a processor and a memory, wherein the memory stores a computer program adapted to be loaded by the processor to perform the above method steps.
In summary, one or more technical solutions provided in the embodiments of the present application at least have the following technical effects or advantages:
1. With the method of the present application, the classification number corresponding to the head picture of the animal to be identified is determined by the trained classification model, and the variety of the animal is determined by comparing that classification number with the classification numbers in the animal variety library. Because the classification number encodes the attribute features of the animal, identification focuses on those attribute features. Compared with the prior art, the specific variety of the animal can be determined, so the classification is more accurate;
2. The classification model provided by the present application includes a Backbone module, a Mask module, a branch head module, an FPN module and a detection head module. The Backbone module extracts from the animal head picture set a plurality of downsampled picture sets with feature scales from high to low, so that different attribute features of the animal are classified and identified from downsampled picture sets at different scales. This captures the attribute features of the animal's head more comprehensively and improves the accuracy and robustness of the classification model. In addition, features at different scales provide complementary information, which further improves the performance of the classification model.
Drawings
FIG. 1 is a schematic illustration of an implementation environment provided by an embodiment of the present application;
FIG. 2 is a schematic diagram of a system architecture of a classification model according to an embodiment of the present application;
FIG. 3 is a flowchart of a training method for classification models according to an embodiment of the present application;
FIG. 4 is a schematic diagram illustrating interactions between modules in a training process of a classification model according to an embodiment of the present application;
FIG. 5 is a schematic flow chart of an animal classification method according to an embodiment of the present application;
FIG. 6 is a schematic diagram of Hamming distance comparison according to an embodiment of the present disclosure;
FIG. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Reference numerals: 900, electronic device; 901, processor; 902, memory; 903, user interface; 904, network interface; 905, communication bus.
Description of the embodiments
To enable those skilled in the art to better understand the technical solutions in this specification, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently only some of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments herein without inventive effort fall within the scope of protection of the present application.
In order to facilitate understanding of the technical solutions provided in the embodiments of the present application, some key terms used in the embodiments of the present application are explained here:
Artificial intelligence (AI) is the theory, method, technology, and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain optimal results. Artificial intelligence is a comprehensive technology of computer science that attempts to understand the essence of intelligence and to produce new intelligent machines that can react in a way similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning, and decision-making.
Artificial intelligence is a comprehensive discipline covering a wide range of fields, including both hardware-level and software-level technologies. Basic artificial intelligence technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, and mechatronics. Artificial intelligence software technologies mainly include computer vision, speech processing, natural language processing, machine learning/deep learning, autonomous driving, and intelligent transportation.
Machine learning (ML) is a multi-field interdisciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory, and other disciplines. It studies how a computer simulates or implements human learning behavior to acquire new knowledge or skills, and how it reorganizes existing knowledge structures to continuously improve its own performance. Machine learning is the core of artificial intelligence and the fundamental way to give computers intelligence; it is applied throughout the various fields of artificial intelligence. Machine learning and deep learning typically include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from demonstration.
Animal classification means classifying and identifying pictures that contain animals, so that each animal is accurately assigned a class label.
Animal iris recognition is a biometric recognition technology that uses digital image processing to analyze and compare the texture patterns of the iris in an animal's eye, so as to confirm the animal's identity. The iris is a ring-shaped structure in the animal's eye; its texture pattern is unique, and the number and shape of the lines vary with the animal's species and the individual. Therefore, iris recognition has high accuracy and is tamper-resistant, and can be applied in fields such as animal identity authentication, data acquisition, and behavior monitoring.
In animal iris recognition, image information of the animal's eyes is acquired first, the texture features of the iris are extracted by image processing, and comparison and recognition are then carried out. The key to iris recognition is accurate extraction and comparison of texture features, so the recognition effect is improved by means of artificial intelligence technologies such as machine learning and deep learning. Iris recognition has been applied in zoos and animal protection centers in various countries and has achieved good results.
The head attribute features of animals of different species differ greatly, and even the head attribute features of different varieties within the same species differ considerably, so their irises also differ greatly. Therefore, the animal needs to be classified before iris recognition is performed on it. However, animal classification methods in the prior art can only identify the species of an animal (cat, dog, etc.) and cannot identify its detailed variety (for example, a Siamese cat). When large-scale animal iris recognition tasks are carried out, accuracy and efficiency are low because the number of animals is large and the time complexity of comparing iris texture features is high.
Based on the above, the embodiments of the application provide an animal classification method, a classification model training method, a storage medium, and an electronic device, in which a plurality of attribute features of an animal head picture are obtained through a classification model, and the animal variety is determined according to those attribute features. Further, the iris texture features in the animal head picture are compared only with the iris texture features stored for that animal variety, which reduces the time complexity of iris texture feature comparison and improves the efficiency and accuracy of animal iris recognition.
Referring to fig. 1, fig. 1 is a schematic diagram of an implementation environment provided in an embodiment of the present application, where the implementation environment includes a monitoring device and an executing device, and the monitoring device and the executing device are connected through a communication network.
For example, when the animal to be identified is within the monitoring range of the monitoring device, the monitoring device may acquire a head picture of the animal to be identified and send the head picture to the execution device. The execution device stores a trained classification model and a computer program for animal iris recognition. The classification model can identify each attribute characteristic of the head picture, and then the variety of the animal to be identified can be determined according to the attribute characteristics output by the classification model.
Further, after determining the variety of the animal to be identified, the executing device may further obtain an iris image of the animal to be identified in the head image, so as to extract texture features in the iris image. And comparing the extracted texture features with texture features stored under the variety of the animal to be identified to determine the identity ID of the animal to be identified.
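By way of illustration only, the variety-restricted comparison described above may be sketched as follows. The gallery layout, the function name `identify`, the squared-distance metric, the distance bound, and all sample data are assumptions made for this sketch; the application does not specify how texture features are stored or compared.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical gallery: variety -> list of (animal ID, iris texture feature vector).
Gallery = Dict[str, List[Tuple[str, List[float]]]]


def squared_distance(a: List[float], b: List[float]) -> float:
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def identify(gallery: Gallery, variety: str, probe: List[float],
             max_distance: float = 0.1) -> Tuple[Optional[str], int]:
    """Compare the probe only against templates stored under `variety`.

    Returns (best matching animal ID or None, number of comparisons made).
    """
    best_id, best_dist, comparisons = None, float("inf"), 0
    for animal_id, template in gallery.get(variety, []):
        comparisons += 1
        dist = squared_distance(probe, template)
        if dist < best_dist:
            best_id, best_dist = animal_id, dist
    if best_dist > max_distance:  # assumed rejection bound
        return None, comparisons
    return best_id, comparisons


gallery: Gallery = {
    "siamese": [("cat-001", [0.1, 0.9, 0.3]), ("cat-002", [0.8, 0.2, 0.5])],
    "husky": [("dog-001", [0.4, 0.4, 0.7]), ("dog-002", [0.9, 0.1, 0.1]),
              ("dog-003", [0.2, 0.6, 0.6])],
}
match_id, n_compared = identify(gallery, "siamese", [0.12, 0.88, 0.31])
```

In this toy gallery only the two templates stored under the predicted variety are compared, rather than all five, which is the source of the reduced comparison cost.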
The execution devices include, but are not limited to, servers, computer devices, Android system devices, iOS devices (the mobile operating system developed by Apple Inc.), personal computers (PCs), World Wide Web (Web) devices, virtual reality (VR) devices, augmented reality (AR) devices, and the like.
With the application scenario described above, in order to enable a person skilled in the art to better understand the implementation principle of the animal classification method, the system architecture of the classification model is described first. Referring to fig. 2, fig. 2 shows a schematic diagram of the system architecture of the classification model according to the embodiment of the present application. As shown in fig. 2, the classification model includes a Backbone module, a Mask module, a branch head module, an FPN module, and a detection head module.
The Backbone module (backbone network module) refers to the module that serves as the backbone of a convolutional neural network model; common backbone network modules include ResNet, VGG, MobileNet, and the like.
The Mask module (masking module) is a module for the target semantic segmentation task, also called a fully convolutional network (Fully Convolutional Network) or a segmentation network (Segmentation Network). The Mask module generally consists of an encoder and a decoder, wherein the encoder is responsible for extracting features of the image and the decoder is responsible for restoring the feature map into a pixel-level segmentation result. The Mask module in the application is computed from the probability distribution of the feature points of interest in the animal head picture set, and its main purpose is to separate the regions of interest from the regions of no interest in the animal head picture.
The branch head module (Branch head module) is used to predict information such as the position or class of a target on the basis of image features. It is usually located at the tail or in the middle of the backbone network module and makes its predictions during the feature extraction process of the image. Because the branch head module is attached to the network as a branch, it affects the feature extraction process only minimally, and several prediction tasks can be added to improve the multi-task performance of the model.
The FPN module (feature pyramid module) is a module designed specifically for solving the problem of multi-scale feature fusion in target detection. In the target detection task, target objects of different sizes and dimensions often require features of different sizes and dimensions to detect. However, as the features extracted from the convolutional neural network in different convolutional layers are different, the FPN module can well extract the features on different scales by adding more convolutional layers at different scales of the feature image. Meanwhile, the FPN module can extract features on a large scale through special up-sampling and down-sampling operations, and the features extracted on different scales are fused together to understand the target object in a more comprehensive mode.
Detection head module (Detection head module): the main module in a target detection model, responsible for tasks such as generating and adjusting detection prediction boxes and judging their categories. Common detection head modules include YOLO, RetinaNet, Faster R-CNN, and the like.
In the embodiment of the application, the Backbone module is mainly used for extracting image features. As shown in FIG. 2, the Backbone module comprises a plurality of Backbone units, and each Backbone unit is formed by stacking a plurality of convolution layers of different sizes, pooling layers, and activation functions. After the head picture of the animal to be identified is input to the classification model, Backbone unit 1 extracts the high-dimensional features of the head picture and outputs them to Backbone unit 2. Backbone unit 2 performs a downsampling operation on the high-dimensional features and further extracts features from the downsampled result. Each Backbone unit in turn performs a downsampling operation on the head picture, yielding a plurality of downsampled pictures with feature scales from high to low, and outputs the downsampled pictures to the Mask module.
Further, the Mask module includes a plurality of Mask units, and each Backbone unit corresponds to at least one Mask unit; that is, one Backbone unit may output downsampled pictures to several corresponding Mask units. Because the feature scales of the downsampled pictures output by the respective Backbone units differ, the parameters of each Mask unit need to match its corresponding Backbone unit. After a Mask unit receives the downsampled picture output by its Backbone unit, it marks the attribute features of interest in the downsampled picture to obtain an attribute feature picture, so that the branch head module can identify those attribute features.
In addition, because iris recognition is carried out on the animal to be recognized, the eye feature of the animal to be recognized needs to be determined more accurately, and the problem of low precision exists in the attribute feature picture output by the Mask module. Therefore, the FPN module can be combined with the attribute features in the attribute feature pictures output by the Mask module to form a multi-layer feature pyramid, so that the detection head module can acquire richer and more accurate eye feature information from the feature pictures with multiple scales and layers, and the recognition accuracy is improved.
As shown in fig. 2, the FPN module includes a plurality of FPN units, and the FPN units may respectively receive the pictures output by the Mask module. And each FPN unit sequentially extracts attribute features in the attribute feature pictures in left-to-right order, and performs feature fusion. And outputting the attribute feature pictures with different dimensions to the detection head module so that the detection head module can identify different eye features of the animal to be identified.
The above system architecture of the classification model and interactions between the modules and units in the system architecture are described, and further, please refer to fig. 3, which specifically provides a flow chart of a training method of the classification model, where the method may be implemented by a computer program, may be implemented by a single-chip microcomputer, or may be run on the execution device. Referring to fig. 4 on the basis of fig. 3, fig. 4 is a schematic interaction diagram of each module in the training process of the classification model according to the embodiment of the present application. Specifically, the method includes steps 301 to 307, which are as follows:
Step 301: Construct an initial classification model and acquire an animal head picture set.
The initial classification model may be understood as an untrained classification model with the model structure shown in fig. 2; it includes a Backbone module, a Mask module, a branch head module, an FPN module, and a detection head module. The Mask module comprises Mask units for different attribute features, the branch head module comprises branch heads for different attribute features, and the detection head module comprises at least one eye attribute feature detection head.
Further, the animal head picture set refers to a training set adopted when the initial classification model is subjected to model training, and the picture set comprises a plurality of animal head pictures. In the embodiment of the application, the parameters in each module in the initial classification model can be corrected by inputting the animal head picture set into the initial classification model and calculating the loss value and the accuracy of the classification result output by the initial classification model so as to obtain the classification model after training.
In one possible embodiment, the initial classification model may be trained species by species, by sequentially feeding in animal head picture sets of different varieties of the same species, so as to improve the suitability of the classification model for identifying head pictures of animals of that species. The following takes a head picture set of canine varieties as an example; the specific training procedure is shown in steps 302 to 307.
Step 302: Perform a feature extraction operation and a downsampling operation on the animal head picture set through the Backbone module to obtain a plurality of downsampled picture sets with feature scales from high to low.
Specifically, because the animal head picture set exhibits different attribute information at different feature scales, information at a plurality of feature scales needs to be extracted from it. In a possible implementation, as shown in fig. 2, the Backbone module may be divided into 4 Backbone units, each provided with convolution layers, pooling layers, and activation functions of different depths, which are used to extract the various attribute features in the animal head picture set.
Backbone unit 1 and Backbone unit 2 are provided with deeper convolution layers for extracting attribute features with higher feature scales (such as hair and texture features), and Backbone unit 3 and Backbone unit 4 are provided with shallower convolution layers for extracting attribute features with lower feature scales (such as outline and shape features). Extracting attribute features at different feature scales through the 4 Backbone units describes objects more comprehensively and accurately, which further improves the precision and efficiency of the classification model.
After the animal head picture set is input into the initial classification model, Backbone unit 1 extracts the high-dimensional attribute features of the animal head picture set to obtain a first downsampled picture set and outputs it to Backbone unit 2. Backbone unit 2 downsamples the first downsampled picture set and extracts its attribute features to obtain a second downsampled picture set. Each Backbone unit in turn performs a downsampling operation on the animal head picture set, yielding a plurality of downsampled picture sets with feature scales from high to low, and outputs them to the Mask module.
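As a rough illustration of the chained downsampling only, the sketch below replaces each Backbone unit with a 2x2 average pooling stage. This is an assumption made for brevity: the Backbone units described above use learned convolution, pooling, and activation layers, and the function names and the 16x16 test image are hypothetical.

```python
from typing import List

Image = List[List[float]]


def downsample_2x(image: Image) -> Image:
    """Halve height and width by averaging each non-overlapping 2x2 block."""
    rows, cols = len(image) // 2, len(image[0]) // 2
    return [
        [
            (image[2 * r][2 * c] + image[2 * r][2 * c + 1]
             + image[2 * r + 1][2 * c] + image[2 * r + 1][2 * c + 1]) / 4.0
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def backbone_pyramid(image: Image, num_units: int = 4) -> List[Image]:
    """Chain `num_units` stages; each stage consumes the previous stage's output."""
    outputs, current = [], image
    for _ in range(num_units):
        current = downsample_2x(current)
        outputs.append(current)  # one output per stand-in Backbone unit
    return outputs


# A 16x16 ramp image stands in for one animal head picture.
picture = [[float(r + c) for c in range(16)] for r in range(16)]
pyramid = backbone_pyramid(picture)
shapes = [(len(level), len(level[0])) for level in pyramid]
```

Each entry of `pyramid` corresponds to the output that one unit would pass both to the next unit and to its Mask unit or units.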
Step 303: Mark each downsampled picture set through the respective Mask units to obtain a first attribute feature picture set corresponding to each downsampled picture set.
Illustratively, as shown in fig. 2, the Mask module is provided with 4 Mask units, and each Mask unit in the Mask module may be respectively defined as a hair Mask unit, an age Mask unit, a body type Mask unit, and a head type Mask unit.
Further, each Backbone unit corresponds to at least one Mask unit. Since the scale characteristics of the downsampled picture sets output by the respective Backbone units differ, the attribute features that the respective Mask units attend to also differ. The age Mask unit focuses on the position distribution of the canine facial features in the downsampled picture set; the body type attribute of dogs correlates strongly with ear spacing and interpupillary distance, so the body type Mask unit focuses more on the position distribution of the ears and eyes in the downsampled picture set; the head type attribute of dogs correlates strongly with head shape, so the head type Mask unit focuses more on the head position distribution in the downsampled picture set.
Since the hair Mask unit focuses on the hair characteristics of dogs in the downsampled picture set, the downsampled picture set output by Backbone unit 2, which has relatively high scale features, is used as the input of the hair Mask unit. Since the head type Mask unit focuses on the head shape of dogs in the downsampled picture set, the downsampled picture set output by Backbone unit 4, which has relatively low scale features, is used as the input of the head type Mask unit. Further, the age Mask unit and the body type Mask unit focus on attribute features such as the distribution of the facial features, which may include both outline features and detail features, and thus the downsampled picture set output by Backbone unit 2 is used as their input.
By assigning corresponding attribute features to the different Mask units and matching them to the downsampled picture sets of different scale characteristics output by the Backbone module, classification and recognition tasks can be carried out more accurately, further improving classification efficiency and accuracy.
Specifically, after each Mask unit receives the downsampled picture set output by its corresponding Backbone unit, it marks the attribute features it focuses on, so that a first attribute feature picture set corresponding to the downsampled picture set is obtained. In one possible embodiment, step 303 specifically comprises the following steps:
Step 401: Input each downsampled picture set to its corresponding Mask unit according to the feature scale of the downsampled picture set.
Step 402: Determine the attribute feature point distribution in the downsampled picture set through the Mask unit, and obtain a feature point probability distribution map according to the attribute feature point distribution.
In this embodiment, the Mask unit focuses on the points of the attribute features in the downsampled picture set. For example, the hair Mask unit focuses on the hair distribution in the downsampled picture set, and after the hair Mask unit receives the input downsampled picture set, the hair Mask unit marks the hair feature point position as 1 and the rest positions as 0 in the downsampled picture set. And after marking is completed, superposing all the downsampled pictures in the downsampled picture set to obtain a probability distribution map about the hair characteristic points.
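The marking and superposition of Step 402 may be sketched as below. Dividing the superposed counts by the number of pictures, so that each cell holds a value between 0 and 1, is an assumption of this sketch; the text only states that the marked pictures are superposed. The function name and sample masks are hypothetical.

```python
from typing import List

Mask = List[List[int]]  # 1 marks a feature point position, 0 marks any other position


def probability_map(masks: List[Mask]) -> List[List[float]]:
    """Superpose binary masks and normalise by the number of pictures."""
    if not masks:
        raise ValueError("at least one mask is required")
    height, width = len(masks[0]), len(masks[0][0])
    totals = [[0] * width for _ in range(height)]
    for mask in masks:
        for r in range(height):
            for c in range(width):
                totals[r][c] += mask[r][c]
    return [[count / len(masks) for count in row] for row in totals]


# Three marked 2x3 downsampled pictures (toy data).
masks = [
    [[1, 1, 0], [0, 0, 0]],
    [[1, 0, 0], [0, 0, 1]],
    [[1, 1, 0], [0, 0, 0]],
]
prob = probability_map(masks)
```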
Step 403: a first region of the feature point probability distribution map having at least one feature point distribution probability greater than a threshold is determined.
Illustratively, after obtaining the feature point probability distribution map, a critical region with dense feature point distribution needs to be focused. In the embodiment of the present application, the key area is defined as a first area, and the number of the first areas is generally controlled to be in a smaller range in consideration of diversity of different scenes and data sets, so as to reduce complexity in the calculation process, and is generally controlled to be 1 to 5. Specifically, the region with the feature point distribution probability larger than the threshold value can be determined as the first region by comparing the feature point distribution probability in the region with the preset threshold value.
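One possible way to pick the first regions of Step 403 is sketched below. The fixed square tiling, the use of the mean tile probability, and the parameter names are assumptions; the text only requires that regions whose feature point distribution probability exceeds a threshold be selected and that their number stay small (1 to 5).

```python
from typing import List, Tuple

# (top, left, height, width, mean probability) of a candidate region
Region = Tuple[int, int, int, int, float]


def first_regions(prob_map: List[List[float]], tile: int = 2,
                  threshold: float = 0.5, max_regions: int = 5) -> List[Region]:
    """Return up to `max_regions` tiles whose mean probability exceeds `threshold`."""
    regions: List[Region] = []
    for top in range(0, len(prob_map) - tile + 1, tile):
        for left in range(0, len(prob_map[0]) - tile + 1, tile):
            cells = [prob_map[r][c]
                     for r in range(top, top + tile)
                     for c in range(left, left + tile)]
            mean = sum(cells) / len(cells)
            if mean > threshold:
                regions.append((top, left, tile, tile, mean))
    # Keep the densest regions first, then cap the count.
    regions.sort(key=lambda region: region[4], reverse=True)
    return regions[:max_regions]


prob_map = [
    [0.9, 0.8, 0.1, 0.0],
    [0.7, 0.9, 0.0, 0.1],
    [0.1, 0.0, 0.6, 0.7],
    [0.0, 0.2, 0.5, 0.6],
]
selected = first_regions(prob_map)
```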
Step 404: Generate a two-dimensional Gaussian distribution according to each first region and the probability distribution formula, and assign weight values to the feature points in the downsampled picture set according to the two-dimensional Gaussian distribution to obtain a first attribute feature picture set.
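Wherein, the probability distribution formula is:
$$F(x, y) = 1 - G(x, y);$$
wherein $G(x, y)$ denotes the probability value of the two-dimensional Gaussian (normal) distribution function at $(x, y)$; the explicit expression of the Gaussian term and the two auxiliary definitions that accompany it did not survive in this text and are left unfilled;
wherein $F(x, y)$ is the suppressed probability value at $(x, y)$ and is used as the weight value corresponding to $(x, y)$; $(x_0, y_0)$ are the center point coordinates of the first region, $x_0$ being the abscissa value and $y_0$ the ordinate value of the center point; $(x, y)$ is the coordinate value of any feature point in the first region; $w$ is the width of the first region; and $h$ is the height of the first region. The symbols $G$, $x_0$, $y_0$, $w$, and $h$ are notation adopted here because the original symbols were lost; one further quantity defined in relation to $(x_0, y_0)$ is also missing.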
Illustratively, in the probability distribution formula provided in the embodiment of the application, the Gaussian term gives, for each feature point in the first region, the probability value of a two-dimensional Gaussian (normal) distribution function. Subtracting this probability value from 1 yields the suppressed probability value of (x, y), which is used as the weight value of (x, y). The weight values are assigned to the feature points of the first region for suppression, so that the subsequent neural network focuses on the feature points of the first region.
The horizontal line may be understood as a straight line parallel to the horizontal axis in the first region coordinate system, and the vertical line may be understood as a straight line parallel to the vertical axis in the first region coordinate system. Specifically, for each first region, a center position may be determined to represent the distribution of feature points of the first region, and the center position may be determined as the center point in the above-described process, so as to simplify the calculation process.
Specifically, a two-dimensional Gaussian distribution is generated according to a first region and a probability distribution formula, and a weight value is distributed to characteristic points in a downsampled picture set according to the two-dimensional Gaussian distribution, so that a first attribute characteristic picture set is obtained. The weight value is generated according to the two-dimensional Gaussian distribution, so that the branching head module and the detecting head module can pay attention to information of the key feature area better.
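A minimal numerical sketch of the weighting in Step 404 follows, using the only relationship the text states explicitly: the weight is 1 minus a two-dimensional Gaussian value. The unnormalised Gaussian (peak value 1), the choice of standard deviations as half the region width and height, and the placement of the center at the geometric middle of the region are all assumptions of this sketch and may differ from the formula in the application.

```python
import math
from typing import List


def suppression_weight(x: float, y: float, x0: float, y0: float,
                       w: float, h: float) -> float:
    """Weight = 1 - Gaussian value at (x, y) for a region centered at (x0, y0)."""
    sigma_x, sigma_y = w / 2.0, h / 2.0  # assumed spread, not taken from the source
    exponent = ((x - x0) ** 2) / (2 * sigma_x ** 2) \
        + ((y - y0) ** 2) / (2 * sigma_y ** 2)
    return 1.0 - math.exp(-exponent)


def weight_map(top: int, left: int, height: int, width: int) -> List[List[float]]:
    """Weights for every pixel position of one first region."""
    x0 = left + (width - 1) / 2.0   # assumed center: geometric middle of the region
    y0 = top + (height - 1) / 2.0
    return [
        [suppression_weight(c, r, x0, y0, width, height)
         for c in range(left, left + width)]
        for r in range(top, top + height)
    ]


weights = weight_map(top=0, left=0, height=3, width=3)
```

Under these assumptions the weight is 0 at the center of the region and grows toward 1 with distance from it.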
Step 304: Input each first attribute feature picture set into the respective branch heads in the branch head module, and adjust the parameters of the branch head module according to the accuracy of the output first classification result and the accuracy threshold.
Illustratively, as shown in FIG. 2, the branch head module includes a plurality of branch heads, each provided with independent parameters and weights. Each Mask unit outputs the first attribute feature picture set it generates to one or more corresponding branch heads, and each branch head predicts the classification of each first attribute feature picture according to the probability distribution of the feature points in the first attribute feature picture set, obtaining a first classification result. For example, the hair Mask unit outputs its first attribute feature picture set to branch head 1 and branch head 2. Branch head 1 may be a hair type branch head, whose predicted classification result may be short hair, long hair, hairless, silky hair, and the like; branch head 2 may be a hair color branch head, whose predicted classification result may be black, white, gray, brown, yellow, black-and-white, gray-and-white, yellow-and-white, and the like.
As shown in fig. 4, the branch head module outputs a predicted first classification result according to the input first attribute feature picture set, and the accuracy of the first classification result is compared with the accuracy threshold. Each branch head in the branch head module is trained using a gradual thawing strategy, and after all branch heads have been trained, the output classification result is taken as part of the second classification result.
The accuracy of the first classification result may be expressed as the number of correctly predicted classification results divided by the number of first attribute feature pictures in the first attribute feature picture set.
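The accuracy check can be written out directly. The function names and the sample labels below are hypothetical.

```python
from typing import List


def accuracy(predicted: List[str], expected: List[str]) -> float:
    """Correct predictions divided by the number of pictures in the set."""
    if not expected or len(predicted) != len(expected):
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(1 for p, e in zip(predicted, expected) if p == e)
    return correct / len(expected)


def needs_adjustment(predicted: List[str], expected: List[str],
                     threshold: float) -> bool:
    """True when the branch head has not yet reached its accuracy threshold."""
    return accuracy(predicted, expected) < threshold


predicted = ["short", "long", "short", "hairless"]
expected = ["short", "long", "long", "hairless"]
score = accuracy(predicted, expected)
```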
Based on the above embodiment, as an optional implementation manner, the training process of the branching head module by adopting a gradual thawing strategy may specifically further include the following steps:
Step 501: Freeze each branch head in the branch head module.
Specifically, to ensure that only a particular target branch head is trained when training is initiated, all branch heads in the branch head module are frozen prior to initiating training. During training, the frozen branch head keeps the parameters fixed.
Step 502: Input a first attribute feature picture set into the first branch head in the branch head module, thaw the first branch head and take it as the target branch head, and adjust the parameters of the target branch head according to the first classification result it outputs until the first classification result reaches the preset precision.
Step 503: Freeze the target branch head, thaw the next branch head and take it as the target branch head, and adjust the parameters of the target branch head according to the first classification result it outputs until the first classification result reaches the preset precision; repeat until parameter adjustment of the whole branch head module is completed.
The preset precision corresponding to each branch head in the branch head module is different, and the preset precision can be specifically set according to specific tasks.
Specifically, the gradual thawing training strategy makes it possible to control precisely which branch heads participate in training and which remain fixed. This avoids the confusion and instability that may result from training all branch heads simultaneously. Because the underlying feature extraction layers are shared, gradual thawing lets every branch head share the common features, reducing the number of parameters and the amount of computation, which helps improve the efficiency and generalization ability of the model. In addition, during training each branch head adjusts its parameters only after the previous branch head has finished, so that each branch head can benefit from the features already optimized for the previous branch head and thus adapt better to its classification task.
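Steps 501 to 503 may be sketched with toy branch heads as follows. Each head here is a small perceptron, which is an assumption made so that the example runs without a deep learning library; the class name `BranchHead`, the `frozen` flag, the toy data, and the per-head target values are hypothetical, and the shared feature extraction layers are not modelled.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Sample = Tuple[List[float], int]  # (feature vector, label in {0, 1})


@dataclass
class BranchHead:
    name: str
    weights: List[float]
    bias: float = 0.0
    frozen: bool = True

    def predict(self, features: List[float]) -> int:
        score = sum(w * f for w, f in zip(self.weights, features)) + self.bias
        return 1 if score > 0 else 0

    def update(self, features: List[float], label: int, lr: float = 0.1) -> None:
        """Perceptron update; a frozen head ignores it and keeps its parameters."""
        if self.frozen:
            return
        error = label - self.predict(features)
        self.weights = [w + lr * error * f for w, f in zip(self.weights, features)]
        self.bias += lr * error


def head_accuracy(head: BranchHead, data: List[Sample]) -> float:
    return sum(head.predict(x) == y for x, y in data) / len(data)


def gradual_thaw(heads: List[BranchHead], datasets: Dict[str, List[Sample]],
                 targets: Dict[str, float], max_epochs: int = 100) -> None:
    for head in heads:                      # Step 501: freeze every branch head
        head.frozen = True
    for head in heads:                      # Steps 502 and 503: one target at a time
        head.frozen = False                 # thaw the target branch head
        data = datasets[head.name]
        for _ in range(max_epochs):
            if head_accuracy(head, data) >= targets[head.name]:
                break
            for x, y in data:
                for candidate in heads:     # only the thawed head actually changes
                    candidate.update(x, y)
        head.frozen = True                  # refreeze before moving on


heads = [BranchHead("hair_type", [0.0, 0.0]), BranchHead("hair_color", [0.0, 0.0])]
datasets = {
    "hair_type": [([1.0, 0.0], 1), ([0.0, 1.0], 0), ([0.8, 0.2], 1), ([0.1, 0.9], 0)],
    "hair_color": [([0.9, 0.8], 1), ([0.1, 0.2], 0), ([0.7, 0.9], 1), ([0.3, 0.1], 0)],
}
targets = {"hair_type": 1.0, "hair_color": 0.75}  # a different preset per head
gradual_thaw(heads, datasets, targets)
```

In a neural network framework the `frozen` flag would typically correspond to disabling gradient computation for the parameters of that head.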
Step 305: Perform splicing and fusion processing on the first attribute feature picture sets through the FPN module to obtain at least one second attribute feature picture set.
Illustratively, as shown in FIG. 2, the FPN module includes 3 FPN units, namely FPN1, FPN2, and FPN3, and a Mask unit may output its first attribute feature picture set to at least one FPN unit. FPN1, FPN2, and FPN3 have different convolution kernel sizes and output channel dimensions so as to extract attribute features at different levels from the first attribute feature picture set. Adjacent FPN units are connected and coordinated by a CBA module, which unifies the output dimensions of the FPN units so that the attribute features in the first attribute feature picture set can be conveniently spliced to obtain a second attribute feature picture set. Richer feature information is thereby extracted, giving more accurate results for subsequent attribute classification and identification.
Based on the above embodiment, as an alternative implementation, step 305 (performing splicing and fusion processing on the first attribute feature picture sets through the FPN module to obtain at least one second attribute feature picture set) specifically comprises the following steps:
Step 601: Input at least one first attribute feature picture set into each FPN unit in order of feature scale from high to low.
Step 602: Extract the feature information in the first attribute feature picture set through the first FPN unit, and output the feature information to the next FPN unit.
Step 603: Take the next FPN unit as the target FPN unit; extract first feature information from the first attribute feature picture set received by the target FPN unit; fuse the first feature information with the feature information output by the previous FPN unit to obtain target feature information and a second attribute feature picture set; and output the target feature information to the next FPN unit. Repeat until all FPN units have produced their output.
Specifically, feature information is processed and fused step by step through the FPN module, and each FPN unit in the FPN module can extract feature information on different scales in the first attribute feature picture set and fuse the feature information into the second attribute feature picture set. The multi-scale attribute feature fusion method can help the next-stage detection head module to better capture attribute feature information in the second attribute feature picture set, and improves classification accuracy.
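The step-by-step fusion of Steps 601 to 603 may be sketched on one-dimensional feature vectors as follows. Average pooling to a common length stands in for the extraction and dimension unification, and element-wise averaging stands in for the fusion; both are assumptions of this sketch, as are the function names and the sample values.

```python
from typing import List, Optional


def unify_length(features: List[float], length: int) -> List[float]:
    """Average-pool a vector down to `length` entries (len must be a multiple)."""
    step = len(features) // length
    return [sum(features[i * step:(i + 1) * step]) / step for i in range(length)]


def fpn_fuse(first_sets: List[List[float]], length: int = 2) -> List[List[float]]:
    """Process inputs ordered from high to low feature scale, one FPN unit each."""
    fused_outputs: List[List[float]] = []
    previous: Optional[List[float]] = None
    for features in first_sets:
        extracted = unify_length(features, length)   # extraction with unified size
        if previous is None:
            target = extracted                        # Step 602: first FPN unit
        else:                                         # Step 603: fuse with previous
            target = [(a + b) / 2.0 for a, b in zip(extracted, previous)]
        fused_outputs.append(target)
        previous = target                             # passed on to the next unit
    return fused_outputs


first_sets = [
    [1.0, 3.0, 5.0, 7.0, 2.0, 4.0, 6.0, 8.0],  # highest feature scale
    [2.0, 4.0, 6.0, 8.0],
    [10.0, 20.0],                              # lowest feature scale
]
second_sets = fpn_fuse(first_sets)
```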
Step 306: Input each first attribute feature picture set into the respective branch heads in the adjusted branch head module, input each second attribute feature picture set into the respective eye attribute feature detection heads in the detection head module, and output a second classification result.
Illustratively, as shown in FIG. 2, the detection head module may include two detection heads, the number of detection heads being less than the number of FPN units in the FPN module. Owing to the diversity of animal texture features, eye features and iris texture morphology vary from species to species and from variety to variety. Even animals of the same variety may differ in eye features (e.g., eye shape, size, and color) owing to genetic and environmental factors. When iris recognition is performed on an animal, these eye factors need to be considered and the iris recognition algorithm adjusted and optimized accordingly, so as to improve the accuracy and stability of iris recognition. Therefore, in the embodiment of the application, the detection head module is mainly used for predicting the eye features of the animal in the animal head picture.
Specifically, each detection head in the detection head module may specifically include a rectangular frame regression subunit, a confidence branch subunit, and an eye attribute feature classification subunit. The rectangular frame regression subunit is used for positioning and regressing a rectangular frame of the animal eyes from the picture so as to determine the position and the size of the animal eyes in the second attribute characteristic picture set; the confidence branch subunit is used for carrying out confidence evaluation on the detected eyes of the animal and calculating the confidence score of an eye area; the eye attribute feature classification subunit performs classification prediction mainly according to the eye attribute features in the second attribute feature picture set. As shown in fig. 4, after the training of the branching head module by adopting the gradual thawing strategy is completed, the classification result output by the branching head and the classification result output by the detecting head are used as the second classification result together.
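The three outputs of a detection head may be grouped as in the sketch below. The field names, the eye attribute labels, the box layout, and the confidence cut-off are hypothetical and serve only to show how the outputs of the three subunits could be combined.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class EyeDetection:
    box: Tuple[float, float, float, float]  # (x, y, width, height) from box regression
    confidence: float                       # score from the confidence branch
    class_scores: Dict[str, float]          # scores from the eye attribute classifier

    def label(self) -> str:
        """Eye attribute class with the highest score."""
        return max(self.class_scores, key=self.class_scores.get)


def best_detection(detections: List[EyeDetection],
                   min_confidence: float = 0.5) -> Optional[EyeDetection]:
    """Keep detections above the cut-off and return the most confident one."""
    kept = [d for d in detections if d.confidence >= min_confidence]
    return max(kept, key=lambda d: d.confidence) if kept else None


detections = [
    EyeDetection((10.0, 12.0, 8.0, 6.0), 0.42, {"round": 0.7, "almond": 0.3}),
    EyeDetection((30.0, 12.0, 8.0, 6.0), 0.91, {"round": 0.2, "almond": 0.8}),
]
best = best_detection(detections)
```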
Step 307: if the loss value of the second classification result does not reach the loss threshold, adjusting the parameters of the initial classification model according to the loss value until the loss value reaches the loss threshold, to obtain the trained classification model.
For example, as shown in fig. 4, after the second classification result is obtained, its loss value may be compared with the loss threshold, and a stage training strategy is adopted to jointly train the Backbone module, the branch head module, the FPN module, and the detection head module in the initial classification model. That is, the parameters of each module are adjusted through repeated iterative training until the loss value reaches the loss threshold, so as to obtain the trained classification model.
It should be noted that the parameters of the Mask module need to be kept fixed during joint training. This is because the Mask module generates masks for the target; it is not a key module through which the branch head module and the detection head module predict the classification results of the animal's attribute features, and it plays only an auxiliary role in the overall classification process. Furthermore, if the Mask module were trained at the same time, gradient updates driven by the classification losses of the other modules could disturb the training process of the entire initial classification model.
Therefore, in the stage training strategy, the parameters of the Mask module are kept fixed in order to better optimize the other modules of the initial classification model (the Backbone module, the branch head module, the FPN module, and the detection head module). This effectively reduces interference during training, allows the other parts to adapt better to the task requirements, and makes the training process more stable and efficient. In addition, because the Mask module has little influence on the second classification result, fixing its parameters also saves computing resources and training time.
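The loop described in step 307, with the Mask module frozen, can be illustrated with a toy example. This is a sketch under stated assumptions: each module is reduced to a single scalar parameter, the loss and gradient functions are hypothetical stand-ins, and "reaching the loss threshold" is interpreted as the loss falling to or below the threshold.

```python
def train_until_threshold(params, trainable, loss_fn, grad_fn,
                          loss_threshold, lr=0.1, max_steps=1000):
    """Gradient descent that stops once the loss reaches the threshold.

    Parameters whose `trainable` flag is False (here, the Mask module)
    receive no update even though a gradient is computed for them.
    """
    for step in range(max_steps):
        loss = loss_fn(params)
        if loss <= loss_threshold:
            return loss, step
        grads = grad_fn(params)
        for name, grad in grads.items():
            if trainable[name]:
                params[name] -= lr * grad
    return loss_fn(params), max_steps


# Toy stand-ins (hypothetical): one scalar per module.
def toy_loss(p):
    return (p["backbone"] - 1.0) ** 2 + (p["detection_head"] - p["mask"]) ** 2


def toy_grad(p):
    diff = p["detection_head"] - p["mask"]
    return {
        "backbone": 2.0 * (p["backbone"] - 1.0),
        "detection_head": 2.0 * diff,
        "mask": -2.0 * diff,  # computed, but never applied
    }


params = {"backbone": 0.0, "detection_head": 0.0, "mask": 0.5}
trainable = {"backbone": True, "detection_head": True, "mask": False}
final_loss, steps = train_until_threshold(
    params, trainable, toy_loss, toy_grad, loss_threshold=1e-4)
print(final_loss, steps, params["mask"])
```

The other modules converge around the fixed Mask parameter, which is the behaviour the paragraph above attributes to keeping the Mask module frozen.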
Based on the above embodiment, as an optional embodiment, the process of training the initial classification model by using the stage training strategy may specifically further include the following steps:
Step 701: determining the first training times in the total training times as the target training times, enabling the Mask unit corresponding to the target training times, and adjusting the parameters of the initial classification model according to the loss value until the training times reach the target training times.
Step 702: determining the second training times in the total training times as the target training times, and repeating the step of enabling the Mask unit corresponding to the target training times and adjusting the parameters of the initial classification model according to the loss value until the training times reach the target training times, until the training times of the initial classification model reach the total training times.
Illustratively, the Mask module keeps its parameters fixed but still plays an auxiliary role during training. When the initial classification model is trained with the stage training strategy, the Mask units in the Mask module are enabled in sequence according to the current training round.
Specifically, when the training round (epoch) is less than or equal to the total number of training times (t1), the Mask units are enabled one at a time. When epoch = 1, Mask unit 1 is enabled, the other Mask units are not enabled, and the initial classification model is trained. When epoch = 2, Mask unit 2 is enabled, the other Mask units are not enabled, and the initial classification model is trained. When epoch is greater than t1, no auxiliary Mask unit is enabled any more, and the initial classification model is trained directly.
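The enabling schedule above can be sketched as a small lookup. The function name `active_mask_unit` is hypothetical, and generalizing the two stated examples (epoch 1 enables unit 1, epoch 2 enables unit 2) to "epoch k enables unit k for every k up to t1" is an assumption.

```python
from typing import Optional


def active_mask_unit(epoch: int, t1: int) -> Optional[int]:
    """Return the index of the single Mask unit enabled in this epoch.

    Epochs 1..t1 enable Mask unit 1..t1 respectively (assumed generalization);
    after t1 no Mask unit is enabled and the model is trained directly.
    """
    if epoch < 1:
        raise ValueError("epoch numbering starts at 1")
    return epoch if epoch <= t1 else None


schedule = [active_mask_unit(epoch, t1=3) for epoch in range(1, 6)]
print(schedule)
```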
Through such a stage training strategy, each module of the initial classification model can be gradually optimized during training while the parameters of the Mask module are kept fixed. The model can therefore adapt more effectively to the task requirements, improving its classification accuracy and performance. The total number of training times can also be set according to the specific situation to control the stage training process.
The above embodiment describes the training process of the initial classification model. On that basis, the process of classifying animals with the trained classification model is described below. Fig. 5 is a schematic flow chart of an animal classification method according to an embodiment of the present application. The process may include steps 801 to 803, as follows:
Step 801: acquiring a head picture of the animal to be identified.
Step 802: obtaining a classification number corresponding to the head picture through the trained classification model, wherein the classification number comprises the ID number of each attribute feature in the head picture.
Specifically, as shown in fig. 4, when the animal to be identified is within the monitoring range of the monitoring device, the monitoring device acquires a head picture of the animal and sends it to the execution device. The execution device performs classification and recognition on each attribute feature in the head picture through the branch head module and the detection head module of the trained classification model, and obtains a classification result for each attribute feature together with the corresponding ID number.
In one possible embodiment, the execution device may construct in advance an animal attribute feature data set in which the respective attribute features, the categories of the attribute features, and the mapping relationship between the ID numbers are stored. For example, the categories of hair category attribute features may include short hair, long hair, no hair, silk hair, corresponding ID numbers 00, 01, 10, 11; the categories of hair color attribute features may include black, white, gray, brown, yellow, black and white, gray and yellow and white, with corresponding ID numbers of 000, 001, 010, 011, 100, 110, 101, 111, etc. After the ID numbers corresponding to the attribute features are obtained, the ID numbers corresponding to the attribute features can be connected according to a preset sequence to obtain the classification numbers.
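Concatenating ID numbers into a classification number can be sketched as follows. This is a minimal illustration: the hair-type table follows the example above, the hair-color table keeps only the first five colors and pairs them with ID numbers by position (an assumption), and the feature order and all identifier names are hypothetical.

```python
# Attribute feature -> category -> ID number.
HAIR_TYPE_IDS = {"short": "00", "long": "01", "hairless": "10", "silky": "11"}
HAIR_COLOR_IDS = {  # positional pairing of colors and IDs is an assumption
    "black": "000", "white": "001", "gray": "010", "brown": "011", "yellow": "100",
}
ATTRIBUTE_TABLES = {"hair_type": HAIR_TYPE_IDS, "hair_color": HAIR_COLOR_IDS}

# Preset order in which ID numbers are connected (hypothetical).
FEATURE_ORDER = ["hair_type", "hair_color"]


def build_classification_number(predictions, order=FEATURE_ORDER):
    """Look up each predicted category and join the ID numbers in preset order."""
    return "".join(ATTRIBUTE_TABLES[feature][predictions[feature]] for feature in order)


number = build_classification_number({"hair_type": "long", "hair_color": "gray"})
print(number)
```

Because every attribute has a fixed-width ID number, classification numbers built with the same preset order have equal length and can be compared position by position.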
Step 803: comparing the classification number with the classification numbers in the animal variety library to determine the variety of the animal to be identified.
The execution device also stores an animal variety library containing the classification number corresponding to each animal variety. The variety of the animal to be identified can be determined by comparing its classification number with the classification numbers in the animal variety library.
In particular, in a possible embodiment, step 803 may specifically further comprise the following steps:
calculating the Hamming distance between the classification number and the classification number in the animal variety library to obtain a plurality of Hamming distances; and determining the variety corresponding to the minimum hamming distance in the hamming distances lower than the threshold as the variety of the animal to be identified.
Referring to fig. 6, a hamming distance comparison schematic diagram provided in an embodiment of the present application is shown. As shown in fig. 6, after the classification number of the animal to be identified is determined, it is compared with the classification numbers in the variety library (the cauchy 1 classification number and the cauchy 2 classification number). The classification number may consist of 0 and 1 codes. Whenever a code in a library classification number differs from the code at the corresponding position in the classification number of the animal to be identified, that code is flipped and the hamming distance is increased by 1, until the library classification number is identical to the classification number of the animal to be identified. As shown in fig. 6, the hamming distance of the cauchy 1 classification number is 1 and that of the cauchy 2 classification number is 2, so the variety of the animal to be identified is finally determined as cauchy 1.
It should be noted that calculating the hamming distance between the classification number and each classification number in the animal variety library yields a plurality of hamming distance values, which represent the degree of difference between the animal to be identified and the different animal varieties. A smaller hamming distance indicates that the animal to be identified is more similar to a variety, while a larger hamming distance indicates that it differs more from that variety. When all hamming distances are higher than the threshold, the animal to be identified differs considerably from every variety in the variety library; this may be because the head picture is unclear or because of other factors, or the animal may belong to an unknown variety. Setting the threshold therefore improves the accuracy and reliability of the recognition result.
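The comparison in step 803 can be sketched as below. The library entries, variety names, and bit strings are hypothetical; they are chosen only so that the two distances come out as 1 and 2, as in the fig. 6 example. "Lower than the threshold" is implemented as a strict inequality.

```python
from typing import Dict, Optional


def hamming_distance(a: str, b: str) -> int:
    """Count the positions at which two equal-length code strings differ."""
    if len(a) != len(b):
        raise ValueError("classification numbers must have equal length")
    return sum(x != y for x, y in zip(a, b))


def match_variety(number: str, library: Dict[str, str], threshold: int) -> Optional[str]:
    """Return the variety with the smallest distance below the threshold, else None."""
    best_name, best_dist = None, None
    for name, reference in library.items():
        dist = hamming_distance(number, reference)
        if dist < threshold and (best_dist is None or dist < best_dist):
            best_name, best_dist = name, dist
    return best_name


VARIETY_LIBRARY = {"variety_1": "01011", "variety_2": "01100"}  # hypothetical
print(match_variety("01010", VARIETY_LIBRARY, threshold=3))
print(match_variety("01010", VARIETY_LIBRARY, threshold=1))
```

Returning `None` when no distance falls below the threshold corresponds to the unclear-picture or unknown-variety case described above.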
Further, on the basis of the above embodiment, as an optional embodiment, after the variety of the animal to be identified is determined, the identity of the animal may be determined according to its variety and the iris picture in the head picture.
Specifically, the iris picture of the animal to be identified can be extracted from the head picture, texture features can then be extracted from the iris picture, and the extracted texture features are compared with the texture features stored in the iris database for the variety of the animal to be identified, so as to determine the identity of the animal. Compared with the prior art, this reduces the number of iris texture feature comparisons, lowers the time complexity of the comparison, and improves iris recognition efficiency. In addition, because the variety of the animal to be identified carries eye attribute features, the iris recognition algorithm can be adaptively adjusted and optimized according to the eye attributes of the animal, which improves the accuracy of iris recognition.
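The reduction in comparisons can be illustrated by filtering the iris database by variety before matching. This is a sketch under assumptions: the present application does not specify how texture features are encoded or compared, so binary iris codes of equal length matched by hamming distance are used here purely as a stand-in, and the record fields and identifiers are hypothetical.

```python
def identify_by_iris(iris_code, variety, iris_db, max_distance):
    """Match an iris code only against records of the already-determined variety.

    Returns (animal_id or None, number of comparisons performed).
    All codes are assumed to have the same length.
    """
    candidates = [rec for rec in iris_db if rec["variety"] == variety]
    best_id, best_dist = None, None
    for rec in candidates:
        dist = sum(a != b for a, b in zip(iris_code, rec["code"]))
        if dist <= max_distance and (best_dist is None or dist < best_dist):
            best_id, best_dist = rec["animal_id"], dist
    return best_id, len(candidates)


IRIS_DB = [  # hypothetical records
    {"animal_id": "A-001", "variety": "variety_1", "code": "10110010"},
    {"animal_id": "A-002", "variety": "variety_1", "code": "01101100"},
    {"animal_id": "B-001", "variety": "variety_2", "code": "10110011"},
]
identity, compared = identify_by_iris("10110000", "variety_1", IRIS_DB, max_distance=2)
print(identity, compared)
```

Only two of the three records are compared in this example, since the record of the other variety is excluded before matching.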
The embodiment of the present application further provides a computer storage medium, where the computer storage medium may store a plurality of instructions, where the instructions are adapted to be loaded and executed by a processor, where the specific execution process may refer to the specific description of the foregoing embodiment, and the description is omitted herein.
The application also discloses electronic equipment. Referring to fig. 7, fig. 7 is a schematic structural diagram of an electronic device according to the disclosure in an embodiment of the present application. The electronic device 900 may include: at least one processor 901, at least one network interface 904, a user interface 903, memory 902, at least one communication bus 905.
Wherein a communication bus 905 is used to enable connected communications between these components.
The user interface 903 may include a display screen (Display) and a camera (Camera); optionally, the user interface 903 may further include a standard wired interface and a wireless interface.
The network interface 904 may optionally include a standard wired interface and a wireless interface (e.g., a WI-FI interface).
The processor 901 may include one or more processing cores. The processor 901 connects the various parts of the entire server using various interfaces and lines, and performs the various functions of the server and processes data by running or executing instructions, programs, code sets, or instruction sets stored in the memory 902 and by invoking data stored in the memory 902. Alternatively, the processor 901 may be implemented in hardware in at least one of digital signal processing (Digital Signal Processing, DSP), field programmable gate array (Field-Programmable Gate Array, FPGA), and programmable logic array (Programmable Logic Array, PLA). The processor 901 may integrate one or a combination of a central processing unit (Central Processing Unit, CPU), a graphics processing unit (Graphics Processing Unit, GPU), a modem, and the like. The CPU mainly handles the operating system, the user interface, application programs, and the like; the GPU is responsible for rendering and drawing the content to be displayed on the display screen; the modem handles wireless communication. It will be appreciated that the modem may also not be integrated into the processor 901 and may instead be implemented by a separate chip.
The memory 902 may include a random access memory (Random Access Memory, RAM) or a read-only memory (Read-Only Memory, ROM). Optionally, the memory 902 includes a non-transitory computer-readable storage medium. The memory 902 may be used to store instructions, programs, code, code sets, or instruction sets. The memory 902 may include a program storage area and a data storage area, wherein the program storage area may store instructions for implementing an operating system, instructions for at least one function (such as a touch function, a sound playing function, an image playing function, etc.), instructions for implementing the above-described method embodiments, and the like; the data storage area may store the data involved in the above method embodiments. The memory 902 may also optionally be at least one storage device located remotely from the processor 901. Referring to fig. 7, the memory 902, as a computer storage medium, may include an operating system, a network communication module, a user interface module, and an application program of an animal classification method.
In the electronic device 900 shown in fig. 7, the user interface 903 is mainly used for providing an input interface for a user, and acquiring data input by the user; and processor 901 may be used to invoke an application program in memory 902 that stores an animal classification method, which when executed by one or more processors 901, causes electronic device 900 to perform the method as described in one or more of the embodiments above. It should be noted that, for simplicity of description, the foregoing method embodiments are all expressed as a series of action combinations, but it should be understood by those skilled in the art that the present application is not limited by the order of actions described, as some steps may be performed in other order or simultaneously in accordance with the present application. Further, those skilled in the art will also appreciate that the embodiments described in the specification are all preferred embodiments, and that the acts and modules referred to are not necessarily required in the present application.
In the foregoing embodiments, the descriptions of the embodiments are emphasized, and for parts of one embodiment that are not described in detail, reference may be made to related descriptions of other embodiments.
In the several embodiments provided herein, it should be understood that the disclosed apparatus may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; the division of units is merely a division by logical function, and other divisions are possible in actual implementation, for example multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some service interfaces, devices, or units, and may be electrical or in other forms.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in each embodiment of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable memory. Based on such understanding, the technical solution of the present application may be embodied, in essence, or in the part contributing to the prior art, or in whole or in part, in the form of a software product stored in a memory, including several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods of the embodiments of the present application. The aforementioned memory includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a magnetic disk, or an optical disk.
The foregoing is merely exemplary embodiments of the present disclosure and is not intended to limit the scope of the present disclosure. That is, equivalent changes and modifications are contemplated by the teachings of this disclosure, which fall within the scope of the present disclosure. Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure.
This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with the scope and spirit of the disclosure being indicated by the claims.
Claims (9)
1. A method of training a classification model, comprising:
building an initial classification model, wherein the initial classification model comprises a Backbone module, a Mask module, a branch head module, an FPN module and a detection head module, the Mask module comprises Mask units with different attribute characteristics, the branch head module comprises branch heads with different attribute characteristics, and the detection head module comprises at least one eye attribute characteristic detection head;
acquiring an animal head picture set, and performing feature extraction operation and downsampling operation on the animal head picture set through the Backbone module to obtain a plurality of downsampled picture sets with feature scales from high to low;
marking each downsampled picture set through each Mask unit to obtain a first attribute characteristic picture set corresponding to each downsampled picture set;
inputting each first attribute characteristic picture set into each branch head in the branch head module, and adjusting parameters of the branch head module according to the accuracy and the accuracy threshold of the output first classification result;
performing splicing and fusion processing on each first attribute feature picture set through an FPN module to obtain at least one second attribute feature picture set;
inputting each first attribute characteristic picture set into each branch head in the adjusted branch head module, inputting each second attribute characteristic picture set into each eye attribute characteristic detection head in the detection head module, and outputting a second classification result;
if the loss value of the second classification result does not reach the loss threshold value, adjusting the parameters of the initial classification model according to the loss value until the loss value reaches the loss threshold value, and obtaining a classification model after training;
the marking each downsampled picture set by each Mask unit to obtain a first attribute feature picture set corresponding to each downsampled picture set includes:
respectively inputting each downsampled picture set to a corresponding Mask unit according to the characteristic scale of each downsampled picture set;
determining attribute feature point distribution in the downsampled picture set through the Mask unit, and obtaining a feature point probability distribution map according to the attribute feature point distribution;
determining a first area, in the feature point probability distribution map, in which at least one feature point distribution probability is greater than a threshold value;
and generating two-dimensional Gaussian distribution according to each first region and a probability distribution formula, and distributing weight values to characteristic points in the downsampled picture set according to the two-dimensional Gaussian distribution to obtain the first attribute characteristic picture set.
2. The method of claim 1, wherein inputting each of the first attribute feature picture sets into each of the branching heads in the branching head module, and adjusting parameters of the branching head module according to the output first classification result, comprises:
freezing each branch head in the branch head module;
inputting a first attribute characteristic picture set into a first branch head in the branch head module, thawing the first branch head, taking the first branch head as a target branch head, and adjusting parameters of the target branch head according to the precision and the precision threshold of a first classification result output by the target branch head until the first classification result reaches a preset precision;
freezing the target branch head, and repeatedly executing the steps of thawing the next branch head, taking the next branch head as the target branch head, and adjusting parameters of the target branch head according to the accuracy and the accuracy threshold of the first classification result output by the target branch head until the first classification result reaches the preset accuracy, until the parameter adjustment of each branch head in the branch head module is completed.
3. The method for training a classification model according to claim 1, wherein the FPN module includes at least two FPN units, and the performing, by the FPN module, a stitching and fusing process on each of the first attribute feature picture sets to obtain at least one second attribute feature picture set includes:
according to the sequence of the feature scale from high to low, at least one first attribute feature picture set is respectively input into each FPN unit;
extracting feature information in the first attribute feature picture set through a first FPN unit, and outputting the feature information to a next FPN unit;
and repeatedly executing the steps of taking the next FPN unit as a target FPN unit, extracting, by the target FPN unit, first feature information from the first attribute feature picture set it receives, fusing the first feature information with the feature information output by the previous FPN unit to obtain target feature information and a second attribute feature picture set, and outputting the target feature information to the next FPN unit, until all FPN units have produced their outputs.
4. The method of claim 1, wherein said adjusting parameters of the initial classification model according to the loss value comprises:
determining a first training frequency in the total training frequency as a target training frequency, starting a Mask unit corresponding to the target training frequency, and adjusting the initial classification model according to the loss value until the training frequency reaches the target training frequency;
and determining the second training times in the total training times as the target training times, and repeatedly executing the steps of enabling the Mask unit corresponding to the target training times and adjusting parameters of the initial classification model according to the loss value until the training times reach the target training times, until the training times of the initial classification model reach the total training times.
5. A method of classifying animals, comprising:
acquiring a head picture of an animal to be identified;
obtaining a classification number corresponding to the head picture through a trained classification model, wherein the classification number comprises an ID number of each attribute feature in the head picture, and the classification model is obtained after training through the classification model training method according to any one of claims 1-4;
and comparing the classification number with classification numbers in an animal variety library to determine the variety of the animal to be identified.
6. The method of claim 5, wherein the comparing the classification number with classification numbers in a library of animal species, after determining the species of the animal to be identified, further comprises:
and determining the identity of the animal to be identified according to the variety of the animal to be identified and the iris picture in the head picture.
7. The method of claim 5, wherein comparing the classification number with classification numbers in a library of animal species to determine the species of the animal to be identified comprises:
calculating the Hamming distance between the classification number and the classification number in the animal variety library to obtain a plurality of Hamming distances;
and determining the variety corresponding to the minimum hamming distance in the hamming distances lower than the threshold as the variety of the animal to be identified.
8. An electronic device comprising a processor, a memory, a user interface, and a network interface, the memory for storing instructions, the user interface and the network interface for communicating to other devices, the processor for executing the instructions stored in the memory to cause the electronic device to perform the method of any one of claims 1-4 or claims 5-7.
9. A computer readable storage medium storing instructions which, when executed, perform the method of any one of claims 1-4 or claims 5-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310852863.0A CN116704264B (en) | 2023-07-12 | 2023-07-12 | Animal classification method, classification model training method, storage medium, and electronic device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116704264A CN116704264A (en) | 2023-09-05 |
CN116704264B true CN116704264B (en) | 2024-01-30 |
Family
ID=87835887
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310852863.0A Active CN116704264B (en) | 2023-07-12 | 2023-07-12 | Animal classification method, classification model training method, storage medium, and electronic device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116704264B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117636400B (en) * | 2024-01-11 | 2024-07-23 | 中农华牧集团股份有限公司 | Method and system for identifying animal identity based on image |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108230240A (en) * | 2017-12-31 | 2018-06-29 | 厦门大学 | It is a kind of that the method for position and posture in image city scope is obtained based on deep learning |
CN108921026A (en) * | 2018-06-01 | 2018-11-30 | 平安科技(深圳)有限公司 | Recognition methods, device, computer equipment and the storage medium of animal identification |
CN111046858A (en) * | 2020-03-18 | 2020-04-21 | 成都大熊猫繁育研究基地 | Image-based animal species fine classification method, system and medium |
CN112508126A (en) * | 2020-12-22 | 2021-03-16 | 北京百度网讯科技有限公司 | Deep learning model training method and device, electronic equipment and readable storage medium |
CN113850890A (en) * | 2021-09-29 | 2021-12-28 | 北京字跳网络技术有限公司 | Method, device, equipment and storage medium for generating animal image |
CN114358178A (en) * | 2021-12-31 | 2022-04-15 | 东北林业大学 | Airborne thermal imaging wild animal species classification method based on YOLOv5 algorithm |
CN115035361A (en) * | 2022-05-11 | 2022-09-09 | 中国科学院声学研究所南海研究站 | Target detection method and system based on attention mechanism and feature cross fusion |
CN115546668A (en) * | 2022-10-13 | 2022-12-30 | 北京卓翼智能科技有限公司 | Marine organism detection method and device and unmanned aerial vehicle |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||