
CN116757255A - Method for improving weight reduction of mobile NetV2 distracted driving behavior detection model - Google Patents

Method for improving weight reduction of mobile NetV2 distracted driving behavior detection model Download PDF

Info

Publication number
CN116757255A
CN116757255A
Authority
CN
China
Prior art keywords
model
training
network
weight reduction
behavior
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310653803.6A
Other languages
Chinese (zh)
Inventor
白雪梅
李佳璐
张晨洁
胡汉平
史新瑞
侯聪聪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Changchun University of Science and Technology
Original Assignee
Changchun University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Changchun University of Science and Technology filed Critical Changchun University of Science and Technology
Priority to CN202310653803.6A priority Critical patent/CN116757255A/en
Publication of CN116757255A publication Critical patent/CN116757255A/en
Pending legal-status Critical Current

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0495Quantised networks; Sparse networks; Compressed networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/082Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/0985Hyperparameter optimisation; Meta-learning; Learning-to-learn
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/59Context or environment of the image inside of a vehicle, e.g. relating to seat occupancy, driver state or inner lighting conditions
    • G06V20/597Recognising the driver's state or behaviour, e.g. attention or drowsiness
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20Movements or behaviour, e.g. gesture recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Human Computer Interaction (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

In recent years, research on recognizing distracted driving behavior has advanced considerably, and deep-learning-based methods have attracted growing attention from researchers. Most models, however, produce large weight files, which makes practical application and deployment difficult, so lightweight improvement of such models is necessary. To address the problems that existing distracted-driving recognition models are too large and poorly suited to low-compute environments, the lightweight network MobileNetV2 is selected as the backbone and improved: a Ghost module replaces the point-wise convolution to reduce computation, and the Leaky ReLU function is added to avoid the dying-neuron problem. On this basis, a channel pruning algorithm further reduces the model parameters. The improved MobileNetV2 network model is trained; finally, an image to be detected is input into the trained detection model, which outputs the driving behavior type.

Description

Method for improving weight reduction of MobileNetV2 distracted driving behavior detection model
Technical Field
The invention relates to the field of deep-learning image classification, and in particular to model weight reduction achieved by improving the MobileNetV2 bottleneck structure and compressing the model.
Background
Traditional methods for detecting distracted driving behavior mainly comprise driving behavior monitoring based on the driver's facial features and monitoring based on the driver's body posture features. However, most of these methods rely on manually extracted texture or shape features, which cannot simultaneously satisfy the accuracy and real-time requirements of driving behavior monitoring and therefore have certain limitations in practical application.
With the development of deep learning, some researchers have proposed deep-learning-based distracted driving behavior recognition, i.e. using deep learning to recognize the driver's face or locate the driver's facial key points to judge whether distracted driving behavior occurs. Although the accuracy is higher, the model structures are large and redundant, computing-resource consumption keeps increasing, training is difficult, and real-time performance is poor, so the deployment requirements of practical applications are hard to meet. In recent years, lightweight network models have become a research hotspot: a network can be reconstructed for smaller volume and higher speed while maintaining accuracy as much as possible, in order to make deployment and application feasible.
The invention builds on MobileNet and its variant MobileNetV2 proposed by Howard et al., which make full use of depthwise separable convolution and the inverted residual structure to achieve real-time, efficient, easily deployed models. However, because of the large amount of computation caused by the 1×1 convolution in MobileNetV2, the weight reduction of such models is still insufficient. Therefore, a Ghost module is used to replace the point-wise convolution in the MobileNetV2 network to reduce computation, the Leaky ReLU function replaces the original activation function to avoid the dying-neuron problem, and a channel pruning algorithm further reduces the model parameters while keeping accuracy high, achieving the goal of network weight reduction.
Disclosure of Invention
The invention classifies 10 driving behaviors: normal driving, texting with the left hand, calling with the left hand, texting with the right hand, calling with the right hand, operating the radio, drinking water, turning the body backward, grooming (hair and makeup), and talking to passengers. The lightweight network model MobileNetV2 serves as the basis of the driving behavior recognition model: a Ghost module replaces the point-wise convolution, reducing a large number of floating-point operations; the Leaky ReLU function replaces the original activation function, avoiding the dying-neuron problem; and a channel pruning algorithm further reduces the model parameters, improving the computational efficiency of the neural network and realizing end-to-end detection of driver distraction behavior. This high-accuracy, high-efficiency, low-memory detection algorithm addresses the fact that most current driving behavior recognition research focuses on improving accuracy while neglecting light weight.
The method can be realized by the following steps:
Step one, download the open-source State Farm data set, divide it into training and test sets at a ratio of 8:2, and preprocess all the data.
Step two, replace the point-wise convolution in the MobileNetV2 network with the Ghost module, and replace the ReLU6 function with the Leaky ReLU function.
Step three, compress the model structure with a channel pruning algorithm.
Step four, set the training hyperparameters and input the data set images into the improved MobileNetV2 network model to obtain a fully trained distracted driving behavior detection model.
Step five, input the image to be detected into the trained detection model and output the driving behavior detection result.
The invention has the following advantages and beneficial effects:
1. The lightweight distracted driving detection model obtains high test accuracy;
2. The method solves the problem of the large computation caused by point-wise convolution in the model, greatly reduces the number of model parameters with only small fluctuation in accuracy, and is beneficial to practical application and deployment.
Drawings
FIG. 1 is a flow chart of the algorithm in the present invention.
Fig. 2 is a schematic representation of different distraction behavior patterns.
Fig. 3 is a schematic diagram of the Ghost module.
Fig. 4 shows a modified MobileNetV2 bottleneck structure.
Fig. 5 is a flow chart of a model pruning algorithm.
Fig. 6 is a schematic diagram of the channel pruning principle.
FIG. 7 shows the accuracy and loss curves during the training process of the proposed algorithm.
Detailed Description
The specific use process of the invention is realized by the following steps:
Step one, download the State Farm data set. The State Farm data set contains 10 classes of actions: normal driving, texting with the left hand, calling with the left hand, texting with the right hand, calling with the right hand, operating the radio, drinking water, turning the body backward, grooming, and talking to passengers. It is a competition data set from the Kaggle platform and was the first publicly downloadable distracted driving behavior recognition data set; it contains 22,424 labeled pictures with an image size of 480 × 460. In this experiment the training and test sets were divided at a ratio of 8:2.
After the data set is acquired, data preprocessing is performed. Because the actual size of the images in the State Farm data set differs from the model's input size, all images are uniformly resized and randomly cropped to the 224 × 224 model input size and then processed by a normalization function; normalized image data speeds up model convergence.
Step two, introduce the Ghost module and the Leaky ReLU function to improve the MobileNetV2 network. Some feature maps produced by convolution are very similar, so one part of the feature maps can be obtained by linear transformation of another part. A portion of intrinsic feature maps is computed first; the remaining feature maps are generated from them with operations of much lower computational cost, and the intrinsic and generated feature maps are concatenated along the channel dimension as the module's output. To address the large computation of point-wise convolution, the Ghost module replaces the point-wise convolution in the MobileNetV2 inverted residual module.
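The intrinsic-plus-cheap-generation idea can be sketched in PyTorch as below. This is a minimal interpretation of the Ghost module; the ratio of 2, the 3×3 depthwise kernel, and the use of LeakyReLU inside the module are illustrative assumptions, not values fixed by the patent.

```python
# Ghost module sketch: a small pointwise convolution produces "intrinsic"
# feature maps, a cheap depthwise convolution generates the remaining
# "ghost" maps from them, and the two sets are concatenated channel-wise.
import math
import torch
import torch.nn as nn

class GhostModule(nn.Module):
    def __init__(self, in_ch, out_ch, ratio=2, dw_kernel=3):
        super().__init__()
        init_ch = math.ceil(out_ch / ratio)       # intrinsic channels
        cheap_ch = out_ch - init_ch               # ghost channels
        self.primary = nn.Sequential(             # replaces the costly 1x1 conv
            nn.Conv2d(in_ch, init_ch, 1, bias=False),
            nn.BatchNorm2d(init_ch),
            nn.LeakyReLU(0.01, inplace=True),
        )
        self.cheap = nn.Sequential(               # depthwise "cheap" operation
            nn.Conv2d(init_ch, cheap_ch, dw_kernel, padding=dw_kernel // 2,
                      groups=init_ch, bias=False),
            nn.BatchNorm2d(cheap_ch),
            nn.LeakyReLU(0.01, inplace=True),
        )

    def forward(self, x):
        y = self.primary(x)                       # intrinsic maps
        return torch.cat([y, self.cheap(y)], dim=1)
```

With ratio=2, only half of the output channels pass through a full pointwise convolution; the rest cost one depthwise pass, which is where the reduction in floating-point operations comes from.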
An activation function is typically used in neural networks to add nonlinearity and improve the expressive power of the model; the ReLU function is given in formula (1). It is a common convolutional-network activation function: for x > 0 no gradient saturation or gradient vanishing occurs, its computational complexity is low, no exponential operation is needed, and the activation is obtained with a single threshold. However, for x ≤ 0 the gradient is 0, so the gradients of that neuron and the neurons behind it remain 0, the neuron no longer responds to any data, and its parameters are never updated; the neuron dies. The Leaky ReLU function introduces a very small slope α for x ≤ 0 on the basis of the ReLU function, as shown in formula (2), which avoids neuron death by supplying a gradient.
ReLU(x) = max(0, x)    (1)
LeakyReLU(x) = x for x > 0; αx for x ≤ 0    (2)
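A quick numeric check of formulas (1) and (2): for negative inputs ReLU outputs zero, so its gradient there is zero (the dying-neuron case), while Leaky ReLU keeps a small slope α. The value α = 0.01 below is PyTorch's default negative slope; the patent does not state the exact α used.

```python
# Compare ReLU and Leaky ReLU on the negative side, including gradients.
import torch
import torch.nn.functional as F

x = torch.tensor([-2.0, -0.5, 0.0, 0.5, 2.0], requires_grad=True)

relu_out = F.relu(x)                              # negatives clamped to 0
leaky_out = F.leaky_relu(x, negative_slope=0.01)  # negatives scaled by alpha

# Backpropagate through Leaky ReLU: the negative side receives gradient
# alpha instead of zero, so the corresponding weights keep updating.
leaky_out.sum().backward()
print(relu_out.tolist())
print(leaky_out.tolist())
print(x.grad.tolist())
```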
Step three, further reduce the model parameters with a pruning algorithm, which comprises three steps:
(1) Perform scaling-factor sparsification training on the improved MobileNetV2 to obtain a model with sparse scaling factors, in order to find unimportant channels in the network;
(2) In the pruning stage, prune the channels whose scaling factors are below a threshold to obtain the pruned network;
(3) Retrain the pruned network to compensate for the accuracy loss caused by pruning.
During sparse training, each channel of every convolution layer is assigned a scaling factor γ as the basis for pruning. Under the added regularization, the γ values are driven toward 0; each γ is multiplied with its channel's input, so the features extracted by each channel of each layer contribute with different strengths, and the absolute value of the scaling factor represents the importance of the channel. Let z_in and z_out denote the input and output of a BN layer, and B the current mini-batch; the BN layer transformation is shown in (3) and (4):
ẑ = (z_in − μ_B) / √(σ_B² + ε)    (3)
z_out = γ · ẑ + β    (4)
where ẑ denotes the normalized value, μ_B and σ_B are the mean and standard deviation of B, ε is a small constant, and γ and β are the trainable parameters of the BN layer.
Meanwhile, a penalty term on γ is introduced into the network's training objective; the sparse-training loss function is shown in formula (5):
L = Σ_(x,y) l(f(x, W), y) + λ Σ_(γ∈Γ) g(γ)    (5)
In formula (5), (x, y) denotes a training input and its label, W the trainable parameters, and f(x, W) the network output; the first term is the original network training loss and the second term is the L1 regularization on γ, with g(γ) = |γ| as the sparsity-inducing penalty on the scaling factors. The hyperparameter λ balances the normal training loss against the channel-scaling-factor penalty, and Γ denotes the set of scaling factors γ.
After sparsification training, the pre-trained network contains many near-zero scaling factors; the BN-layer scaling factors are sorted, and the channels whose scaling factors fall below the threshold are pruned.
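The threshold step described above can be sketched as follows: gather all BN scaling factors, sort them globally, take the value at a chosen prune ratio as the threshold, and build a per-layer keep mask. Actually removing the channels (rebuilding the convolution layers) is omitted here, and the prune ratio is illustrative.

```python
# Global BN-gamma thresholding sketch for channel pruning: channels whose
# |gamma| falls at or below the global threshold are marked for removal.
import torch
import torch.nn as nn

def bn_channel_masks(model, prune_ratio=0.5):
    gammas = torch.cat([bn.weight.detach().abs().flatten()
                        for bn in model.modules()
                        if isinstance(bn, nn.BatchNorm2d)])
    threshold = torch.sort(gammas).values[int(len(gammas) * prune_ratio)]
    return {name: (bn.weight.detach().abs() > threshold)   # True = keep channel
            for name, bn in model.named_modules()
            if isinstance(bn, nn.BatchNorm2d)}
```

After pruning, the surviving channels' weights are copied into a slimmer network, which is then retrained to recover accuracy.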
Step four, set the training hyperparameters, input the training images into the improved feature extraction network, and train the model to convergence to obtain the fully trained distracted driving behavior detection model.
First, a training image is input into the model; the cross-entropy loss between the prediction and the image's known true label is computed to obtain the loss value. Back propagation is then performed and the model weights are updated with the Adam optimizer. After one pass over the training set, the inputs are fed through the model again for the next iteration, and iteration continues until the loss value or the accuracy no longer fluctuates significantly within a given range, at which point training is finished.
In the invention, the initial learning rate is set to 0.0001 and the batch_size to 16. The network training process uses the cross-entropy function as the distracted-driving loss function (the cross-entropy loss function provided by PyTorch); the formula is shown in (6):
L = −(1/n) Σ_i y_i · log(ŷ_i)    (6)
where n is the number of inputs, y_i is the true one-hot label, and ŷ_i is the predicted probability. Training for 100 epochs yields the final distracted driving behavior network model parameters. Results on the State Farm data set show that the model reaches a test accuracy of 94.66% while the model parameters occupy only 0.23 M.
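A minimal training-loop sketch matching the hyperparameters stated above (Adam, learning rate 0.0001, batch size 16, cross-entropy loss). The tiny linear model and random tensors stand in for the improved MobileNetV2 and the State Farm images; only the loop structure is the point here, and the loop is shortened from the 100 epochs used in the patent.

```python
# Training-loop sketch: forward pass, cross-entropy loss, back propagation,
# Adam update -- repeated per epoch, as described in step four.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 224 * 224, 10))  # placeholder net
criterion = nn.CrossEntropyLoss()                  # formula (6)
optimizer = torch.optim.Adam(model.parameters(), lr=0.0001)

images = torch.randn(16, 3, 224, 224)              # one fake batch of 16
labels = torch.randint(0, 10, (16,))               # 10 behavior classes

for epoch in range(2):                             # patent trains 100 epochs
    optimizer.zero_grad()
    loss = criterion(model(images), labels)        # forward + loss value
    loss.backward()                                # back propagation
    optimizer.step()                               # Adam weight update
```

In a full run the inner statements iterate over a DataLoader of preprocessed State Farm batches, and training stops once the loss and accuracy stabilize.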
Step five, input the acquired image to be detected into the trained distracted driving behavior detection model and output the behavior type, i.e. the prediction result.

Claims (6)

1. A method for improving weight reduction of a MobileNetV2 distracted driving behavior detection model, characterized in that the method is realized by the following steps:
step one, downloading the open-source State Farm data set, dividing it into training and test sets at a ratio of 8:2, and preprocessing all the data;
step two, selecting MobileNetV2 as the model, replacing the point-wise convolution with the Ghost module, and replacing the original function with the Leaky ReLU function;
step three, compressing the model structure with a channel pruning algorithm;
step four, setting the training hyperparameters and inputting the training set images into the improved MobileNetV2 network model to obtain a fully trained distracted driving behavior detection model;
step five, inputting the image to be tested into the trained detection model and outputting the driving behavior type.
2. The method for improving weight reduction of a MobileNetV2 distracted driving behavior detection model according to claim 1, characterized in that: before training, the images are preprocessed; the images in the data set are uniformly resized to the 224 × 224 model input size, and the preprocessed image data set is obtained through normalization.
3. The method for improving weight reduction of a MobileNetV2 distracted driving behavior detection model according to claim 1, characterized in that: to address the large computation of point-wise convolution, a Ghost module replaces the point-wise convolution in MobileNetV2; the Leaky ReLU function introduces a very small slope α for x ≤ 0 on the basis of the original function, avoiding neuron death by supplying a gradient.
4. The method for improving weight reduction of a MobileNetV2 distracted driving behavior detection model according to claim 1, characterized in that the pruning algorithm in step three comprises: first, to find unimportant channels in the network, performing scaling-factor sparsification training on the improved MobileNetV2 to obtain a model with sparse scaling factors; then, in the pruning stage, pruning the channels whose scaling factors are below a threshold to obtain the pruned network; and retraining the pruned network to compensate for the accuracy loss caused by pruning.
5. The method for improving weight reduction of a MobileNetV2 distracted driving behavior detection model according to claim 1, characterized in that: since a batch normalization (BN) layer follows each convolution layer in the detection network, let z_in and z_out denote the input and output of the BN layer, and B the current mini-batch; the BN layer transformation is shown in (1) and (2):
ẑ = (z_in − μ_B) / √(σ_B² + ε)    (1)
z_out = γ · ẑ + β    (2)
where ẑ denotes the normalized value, μ_B and σ_B are the mean and standard deviation of B, ε is a small constant, and γ and β are the trainable parameters of the BN layer; it can be seen that the BN layer can rescale the standardized linear activation to various scales.
Meanwhile, a penalty term on γ is introduced into the network's training objective; the sparse-training loss function is shown in formula (3):
L = Σ_(x,y) l(f(x, W), y) + λ Σ_(γ∈Γ) g(γ)    (3)
where (x, y) denotes a training input and its label, W the trainable parameters, and f(x, W) the network output; the first term is the original network training loss and the second term is the L1 regularization on γ, with g(γ) = |γ| as the sparsity-inducing penalty on the scaling factors; the hyperparameter λ balances the normal training loss against the channel-scaling-factor penalty term, and Γ denotes the set of scaling factors γ.
6. The method for improving weight reduction of a MobileNetV2 distracted driving behavior detection model according to claim 1, characterized in that: the hyperparameters are set with an initial learning rate of 0.0001 and a batch_size of 16, and 100 epochs are trained to obtain the final distracted driving behavior network model parameters.
CN202310653803.6A 2023-06-05 2023-06-05 Method for improving weight reduction of mobile NetV2 distracted driving behavior detection model Pending CN116757255A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310653803.6A CN116757255A (en) 2023-06-05 2023-06-05 Method for improving weight reduction of mobile NetV2 distracted driving behavior detection model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310653803.6A CN116757255A (en) 2023-06-05 2023-06-05 Method for improving weight reduction of mobile NetV2 distracted driving behavior detection model

Publications (1)

Publication Number Publication Date
CN116757255A true CN116757255A (en) 2023-09-15

Family

ID=87947008

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310653803.6A Pending CN116757255A (en) 2023-06-05 2023-06-05 Method for improving weight reduction of mobile NetV2 distracted driving behavior detection model

Country Status (1)

Country Link
CN (1) CN116757255A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117910536A (en) * 2024-03-19 2024-04-19 浪潮电子信息产业股份有限公司 Text generation method, and model gradient pruning method, device, equipment and medium thereof
CN117910536B (en) * 2024-03-19 2024-06-07 浪潮电子信息产业股份有限公司 Text generation method, and model gradient pruning method, device, equipment and medium thereof

Similar Documents

Publication Publication Date Title
CN110020682B (en) Attention mechanism relation comparison network model method based on small sample learning
CN108764471B (en) Neural network cross-layer pruning method based on feature redundancy analysis
CN110135580B (en) Convolution network full integer quantization method and application method thereof
CN111461322B (en) Deep neural network model compression method
CN111079781A (en) Lightweight convolutional neural network image identification method based on low rank and sparse decomposition
CN109902745A (en) A kind of low precision training based on CNN and 8 integers quantization inference methods
CN112699958A (en) Target detection model compression and acceleration method based on pruning and knowledge distillation
Paupamah et al. Quantisation and pruning for neural network compression and regularisation
CN107085704A (en) Fast face expression recognition method based on ELM own coding algorithms
CN104866900A (en) Deconvolution neural network training method
Yang et al. Harmonious coexistence of structured weight pruning and ternarization for deep neural networks
CN112329922A (en) Neural network model compression method and system based on mass spectrum data set
CN114186672A (en) Efficient high-precision training algorithm for impulse neural network
CN112183742A (en) Neural network hybrid quantization method based on progressive quantization and Hessian information
CN115966010A (en) Expression recognition method based on attention and multi-scale feature fusion
CN117333497A (en) Mask supervision strategy-based three-dimensional medical image segmentation method for efficient modeling
CN113610227A (en) Efficient deep convolutional neural network pruning method
Li et al. A deep learning method for material performance recognition in laser additive manufacturing
CN113516133A (en) Multi-modal image classification method and system
CN116757255A (en) Method for improving weight reduction of mobile NetV2 distracted driving behavior detection model
CN110942106A (en) Pooling convolutional neural network image classification method based on square average
Qi et al. Learning low resource consumption cnn through pruning and quantization
CN116343109A (en) Text pedestrian searching method based on self-supervision mask model and cross-mode codebook
CN115223158A (en) License plate image generation method and system based on adaptive diffusion prior variation self-encoder
CN113962262A (en) Radar signal intelligent sorting method based on continuous learning

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination