
CN113516099A - Traffic behavior recognition method and device, electronic equipment and storage medium


Info

Publication number
CN113516099A
Authority
CN
China
Prior art keywords
vehicle
rider
area
manned
region
Prior art date
Legal status
Withdrawn
Application number
CN202110873586.2A
Other languages
Chinese (zh)
Inventor
范佳柔
甘伟豪
武伟
Current Assignee
Zhejiang Shangtang Technology Development Co Ltd
Original Assignee
Zhejiang Shangtang Technology Development Co Ltd
Priority date
Filing date
Publication date
Application filed by Zhejiang Shangtang Technology Development Co Ltd filed Critical Zhejiang Shangtang Technology Development Co Ltd
Priority: CN202110873586.2A
Publication: CN113516099A
Related application: PCT/CN2022/087745 (WO2023005275A1)
Legal status: Withdrawn


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/24 Classification techniques
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)
  • Traffic Control Systems (AREA)

Abstract

The application provides a traffic behavior recognition method and device, an electronic device, and a storage medium. The method may include: acquiring an image to be recognized; determining, from among the vehicle areas included in the image to be recognized, an associated vehicle area that is associated with a rider area in the image, where the rider area comprises a vehicle and at least one human body; carrying out manned number recognition on the rider area to obtain a manned number recognition result, and carrying out vehicle type recognition on the associated vehicle area to obtain a vehicle type recognition result; and determining whether the target rider in the rider area has illegal manned behavior according to the manned number recognition result and the vehicle type recognition result.

Description

Traffic behavior recognition method and device, electronic equipment and storage medium
Technical Field
The present application relates to computer technologies, and in particular, to a traffic behavior recognition method and apparatus, an electronic device, and a storage medium.
Background
As regulatory authorities strengthen supervision, traffic behaviors need to be identified. In some scenarios, traffic behavior recognition may include recognition of non-motor-vehicle manned behavior; if illegal manned behavior is found, penalties and safety education are required.
At present, identification of illegal manned behavior on non-motor vehicles mainly determines the manned number by counting the human heads or human bodies appearing in an image; if the manned number is excessive, illegal manned behavior is determined.
This approach has at least the following drawbacks:
on one hand, in a non-motor-vehicle manned scene, people sit close to each other and human bodies and heads are easily occluded, so an accurate head or body count cannot be obtained and the manned number is recognized incorrectly.
On the other hand, different types of non-motor vehicles permit different numbers of passengers, and the existing method cannot assess the legality of manned behavior for different types of non-motor vehicles.
Disclosure of Invention
In view of the above, the present application at least discloses a traffic behavior recognition method. The method may include:
acquiring an image to be recognized; determining, from among the vehicle areas included in the image to be recognized, an associated vehicle area that is associated with a rider area in the image, where the rider area comprises a vehicle and at least one human body; carrying out manned number recognition on the rider area to obtain a manned number recognition result, and carrying out vehicle type recognition on the associated vehicle area to obtain a vehicle type recognition result; and determining whether the target rider in the rider area has illegal manned behavior according to the manned number recognition result and the vehicle type recognition result.
In some embodiments, the determining, from among the vehicle regions included in the image to be recognized, of an associated vehicle region associated with a rider region in the image includes: detecting the image to be recognized to obtain the vehicle regions and the rider region; and determining, among the obtained vehicle regions, a target vehicle region having the greatest degree of overlap with the rider region, and determining the target vehicle region as the associated vehicle region associated with the rider region.
In some embodiments, the determining, from among the vehicle regions included in the image to be recognized, of an associated vehicle region associated with a rider region in the image includes: detecting the image to be recognized to obtain the vehicle regions and the rider region; determining an association score between each vehicle region and the rider region through a pre-trained association score prediction model; and determining, among the vehicle regions, the target vehicle region having the highest association score with the rider region as the associated vehicle region associated with the rider region.
In some embodiments, the performing manned number recognition on the rider area to obtain a manned number recognition result includes: recognizing the number of people in the rider area to obtain a manned number and a corresponding first confidence; and in response to the first confidence reaching a first confidence threshold, determining the manned number as the manned number recognition result for the rider region. The performing vehicle type recognition on the associated vehicle area to obtain a vehicle type recognition result includes: recognizing the vehicle type of the associated vehicle area to obtain a vehicle type and a corresponding second confidence; and in response to the second confidence reaching a second confidence threshold, determining the vehicle type as the vehicle type recognition result for the vehicle region.
In some embodiments, the determining whether the target rider in the rider area has illegal manned behavior according to the manned number recognition result and the vehicle type recognition result includes: in response to the manned number recognition result being a first recognition result, determining that the target rider violates the manned rule, where the first recognition result represents that the manned number reaches a first preset number; or, in response to the manned number recognition result being a second recognition result and the vehicle type represented by the vehicle type recognition result being a preset non-motor vehicle type, determining that the target rider violates the manned rule, where the second recognition result represents that the manned number reaches a second preset number smaller than the first preset number; or, in response to the manned number recognition result being the second recognition result and the vehicle type represented by the vehicle type recognition result not being the preset non-motor vehicle type, determining that the target rider does not violate the manned rule; or, in response to the manned number recognition result being a third recognition result, determining that the target rider does not violate the manned rule, where the third recognition result represents that the manned number is a third preset number smaller than the second preset number; or, in response to the manned number recognition result being a fourth recognition result, determining that the traffic behavior recognition for the target rider is invalid.
In some embodiments, the fourth recognition result represents that at least one of the following invalid scenes occurs in the image to be recognized: a scene of a rider pushing the vehicle; a scene in which a rider stands beside a vehicle; a scene in which a plurality of riders are close to each other; a low-definition scene; a scene in which the vehicle is occluded.
In some embodiments, the method further comprises: sending out warning information in response to determining that the target rider violates the manned rule.
In some embodiments, the manned number recognition result is obtained by detecting the rider area through a manned recognition network, where the training method of the manned recognition network is as follows: obtaining a first training sample, where the first training sample comprises sample images of a plurality of riders and first labeling information corresponding to the manned number in each image, the first labeling information comprising one of the following labels: 1 person, 2 people, 3 people, or an invalid label, the invalid label covering at least one of: rider pushing the vehicle, rider standing beside the vehicle, multiple riders close to each other, low definition, and vehicle occluded; inputting the first training sample into a preset first initial network to obtain a sample manned number recognition result for each sample image; and optimizing the first initial network based on a first loss determined from the sample manned number recognition results and the first labeling information to obtain the manned recognition network.
In some embodiments, the vehicle type recognition result is obtained by detecting the vehicle area through a vehicle recognition network, where the training method of the vehicle recognition network is as follows: acquiring a second training sample, where the second training sample comprises sample images of a plurality of vehicles and second labeling information corresponding to the vehicle type of each image; inputting the second training sample into a preset second initial network to obtain a sample vehicle type recognition result for each sample image; and optimizing the second initial network based on a second loss determined from the sample vehicle type recognition results and the second labeling information to obtain the vehicle recognition network.
The present application further provides a traffic behavior recognition device, the device including: the acquisition module is used for acquiring an image to be identified; the first determination module is used for determining a relevant vehicle area which is associated with a rider area in the image to be recognized in a vehicle area included in the image to be recognized, wherein the rider area comprises a vehicle and at least one human body; the identification module is used for identifying the number of people in the rider area to obtain a result of identifying the number of people in the rider area and identifying the vehicle type in the associated vehicle area to obtain a result of identifying the vehicle type; and the second determination module is used for determining whether the target rider in the rider area has illegal manned behavior according to the manned number recognition result and the vehicle type recognition result.
The present application further proposes an electronic device, the device comprising: a processor; a memory for storing processor-executable instructions; wherein the processor executes the executable instructions to implement the traffic behavior recognition method as shown in any one of the foregoing embodiments.
The present application also proposes a computer-readable storage medium, which stores a computer program for causing a processor to execute the traffic behavior recognition method as shown in any of the preceding embodiments.
In the technical solution shown in the application, on one hand, the method uses a neural network model to recognize the manned number in the rider area; the model can adaptively learn the number of people in the rider area, so that an accurate manned number can be recognized even when occlusion or similar conditions occur in the image to be recognized, improving the accuracy of traffic behavior recognition.
On the other hand, the legality of the manned behavior can be identified according to the result of identifying the number of people and the result of identifying the vehicle type, and the vehicle type and the number of people are comprehensively considered during the legality identification, so that the effect of identifying the traffic behavior aiming at different types of vehicles is achieved.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
Drawings
In order to more clearly illustrate one or more embodiments of the present application or technical solutions in the related art, the drawings needed to be used in the description of the embodiments or the related art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments described in one or more embodiments of the present application, and other drawings can be obtained by those skilled in the art without inventive exercise.
Fig. 1 is a flow chart illustrating a method of traffic behavior recognition according to the present application;
FIG. 2 is a flow chart illustrating a method of determining an associated vehicle zone according to the present application;
FIG. 3 is a schematic view of an object detection process shown in the present application;
FIG. 4 is a flow chart of an associated vehicle zone determination method illustrated herein;
FIG. 5 is a flowchart illustrating a manned behavior recognition method according to the present application;
FIG. 6 is a schematic diagram of a rule for determining illegal manned behavior according to the present application;
fig. 7 is a schematic structural diagram of a traffic behavior recognition apparatus according to the present application;
fig. 8 is a schematic diagram of a hardware structure of an electronic device shown in the present application.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present application, as detailed in the appended claims.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the application. As used in this application and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items. It should also be understood that the word "if" as used herein may be interpreted as "when" or "upon" or "in response to determining", depending on the context.
The application aims to provide a traffic behavior identification method (hereinafter referred to as an identification method).
Referring to fig. 1, fig. 1 is a flowchart of a traffic behavior recognition method according to the present application.
As shown in fig. 1, the method may include:
s102, acquiring an image to be identified;
s104, determining a related vehicle area which is related to a rider area in the image to be recognized and comprises a vehicle and at least one human body in the vehicle area comprising the image to be recognized;
s106, carrying out manned number identification on the rider area to obtain a manned number identification result, and carrying out vehicle type identification on the associated vehicle area to obtain a vehicle type identification result;
and S108, determining whether the target rider in the rider area has illegal manned behavior according to the manned number identification result and the vehicle type identification result.
In the traffic behavior recognition method disclosed in the application, on one hand, the manned number in the rider area is recognized by a neural network model that can adaptively learn the number of people in the rider area, so that an accurate manned number can be recognized even when occlusion or similar conditions occur in the image to be recognized, improving the accuracy of traffic behavior recognition.
On the other hand, the legality of the manned behavior can be identified according to the result of identifying the number of people and the result of identifying the vehicle type, and the vehicle type and the number of people are comprehensively considered during the legality identification, so that the effect of identifying the traffic behavior aiming at different types of vehicles is achieved.
The recognition method illustrated in fig. 1 may be applied to an electronic device. The electronic device may execute the method by loading software logic corresponding to the recognition method. The electronic device may be a notebook computer, a server, a mobile phone, a PAD terminal, or the like; the type of the electronic device is not particularly limited in this application. The electronic device may be a client device or a server device, and is not particularly limited herein. It is understood that the recognition method may be executed by the client device alone, by the server device alone, or by the client device and the server device in cooperation. The server may be a single server or a cloud built from a server cluster. The following description takes an electronic device (hereinafter simply referred to as a device) as the execution body.
In some embodiments, the device may acquire the image to be recognized from an image capture device deployed at the road site. The image capture device may capture images of a preset field of view of the road site at a fixed or adjustable angle, and may send the captured images to be recognized to the device.
The image to be recognized may include at least one of a vehicle and a rider. The vehicle may be a non-motor vehicle. The non-motor vehicle may be a motorcycle, a tricycle, an electric vehicle, or the like. The rider may refer to a person having driving behavior.
After acquiring the image to be recognized, the apparatus may perform S104. The rider region shown in the present application refers to a region surrounded by the detection frame of the target rider in the image to be recognized. The target rider may be designated according to business needs. For example, the target rider may be a randomly selected rider from among riders included in the image to be recognized. For another example, the target rider may be a rider with the highest definition among riders included in the image to be recognized. As another example, the target rider may be a rider about to leave the field of view. For another example, each of the riders included in the image to be recognized may be designated as a target rider, respectively. The rider area may include a vehicle and at least one human body.
The vehicle region shown in the present application refers to a region surrounded by the detection frame of the vehicle in the image to be recognized.
In the present application, the associated vehicle region associated with the rider region may be determined at least by the relevance prediction score or the degree of coincidence between the rider region and the vehicle region. The association may be characterized by the rider region being closely spatially connected with the associated vehicle region. The following description will be made for each of the two modes.
In some embodiments, the target vehicle may be determined by a degree of overlap between the rider region and the vehicle region.
Referring to fig. 2, fig. 2 is a flowchart illustrating a method for determining a related vehicle area according to the present application.
As shown in fig. 2, when S104 is executed, S202 may be executed to detect the image to be recognized, and obtain the vehicle region and the rider region. S204 may then be executed to determine a target vehicle region having the greatest degree of overlap with the rider region among the obtained vehicle regions, and determine the target vehicle region as an associated vehicle region associated with the rider region.
Therefore, the target vehicle region with the greatest degree of overlap with the rider region is determined as the associated vehicle region associated with the rider region, so that the spatial association between the vehicle and the rider can be used to determine an accurate associated vehicle region; the type of the vehicle driven by the target rider can then be determined accurately, improving the accuracy of traffic behavior recognition.

In some embodiments, in S202, object detection may be performed by an object detection model to obtain the detection frames corresponding to each rider and each vehicle in the image to be recognized; then, the region surrounded by the target detection frame corresponding to the target rider in the image to be recognized may be determined as the rider region, and the region surrounded by the detection frame corresponding to a vehicle in the image to be recognized may be determined as the vehicle region.
The object detection model may be a model constructed based on R-CNN (Region-based Convolutional Neural Network), Fast R-CNN, or Faster R-CNN. The network structure of the object detection model is not particularly limited in the present application.
Referring to fig. 3, fig. 3 is a schematic view illustrating an object detection process according to the present application. Fig. 3 is a schematic diagram illustrating an object detection flow, and is not intended to limit the present invention.
The object detection model 30 shown in fig. 3 may be a model built based on the Faster R-CNN network. The model may include at least a backbone network 31, an RPN (Region Proposal Network) 32, and an RCNN (Region-based Convolutional Neural Network) head 33.
The backbone network 31 may perform several convolution operations on the image to be recognized to obtain a target feature map of the image. The RPN 32 processes the target feature map to obtain anchors (anchor frames) corresponding to the riders and vehicles in the image. The RCNN head 33 performs bbox (bounding box, i.e., detection box) regression and classification based on the anchor frames output by the RPN and the target feature map output by the backbone network, so as to obtain a rider frame for each rider and a vehicle frame for each vehicle in the image to be recognized. In some examples, position information and/or size information for each object frame may be obtained; in some embodiments, the position and/or size of an object frame may be characterized by 4 vertex coordinates.
In some examples, the object detection model may first be supervised-trained on a number of training samples. In some examples, the positions and sizes of the object frames corresponding to each object (including riders and vehicles) in the sample images can be labeled to obtain training samples. The model can then be supervised-trained with the training samples in a conventional manner until the model converges.
After the training is completed, the object detection model may be used to perform object detection on the image to be recognized, so as to obtain a rider frame corresponding to each rider in the image to be recognized and a vehicle frame corresponding to each vehicle. If multiple riders and/or multiple vehicles are included in the image, different rider frames and/or different vehicle frames may also be numbered in the recognition result.
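As a concrete illustration, the detection step can be sketched with an off-the-shelf Faster R-CNN from torchvision. This is a hedged sketch, not the patent's actual implementation: it assumes a model fine-tuned for two foreground classes (rider, vehicle), and the class ids and score threshold are illustrative.

```python
# Minimal detection sketch (assumed setup): a torchvision Faster R-CNN
# with 3 classes (background, rider, vehicle).
import torch
import torchvision

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(num_classes=3)
model.eval()  # in practice the model would be fine-tuned first

def detect_regions(image, score_thresh=0.5):
    """image: CxHxW float tensor in [0, 1]; returns rider and vehicle boxes."""
    with torch.no_grad():
        output = model([image])[0]
    riders, vehicles = [], []
    for box, label, score in zip(output["boxes"], output["labels"], output["scores"]):
        if score < score_thresh:
            continue
        # Assumed label convention: 1 = rider, 2 = vehicle.
        (riders if label.item() == 1 else vehicles).append(tuple(box.tolist()))
    return riders, vehicles
```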
After obtaining the rider frame and the vehicle frame included in the image to be recognized, a target rider frame corresponding to a target rider may be selected, and a region surrounded by the target rider frame in the image to be recognized is determined as the rider region, and a region surrounded by the vehicle frame in the image to be recognized is determined as the vehicle region.
Thereafter, in executing S204, the degree of coincidence between each vehicle region and the rider region may be calculated separately. Then, the vehicle regions may be sorted in the order of the calculated degrees of overlap from large to small, and the vehicle region that is ranked first may be determined as the target vehicle region. Thereafter, the target vehicle region may be determined as an associated vehicle region associated with the rider region.
In some embodiments, the degree of overlap may be the ratio of the area of the intersection of the vehicle region and the rider region to the area of their union; that is, the degree of overlap between the vehicle region and the rider region is characterized by the Intersection over Union (IoU) between the two.
In making the IoU calculation, it may first be determined whether the vehicle region (hereinafter referred to as region 1) and the rider region (hereinafter referred to as region 2) overlap. If they overlap, the intersection-over-union ratio IoU(region 1, region 2) can be obtained by dividing the area of the intersection of region 1 and region 2 by the area of their union.
Suppose the upper-left corner of region 1 has coordinates (p_x1, p_y1) and its lower-right corner has coordinates (p_x2, p_y2), and the upper-left corner of region 2 has coordinates (h_x1, h_y1) and its lower-right corner has coordinates (h_x2, h_y2).

If the condition p_x1 > h_x2 || p_x2 < h_x1 || p_y1 > h_y2 || p_y2 < h_y1 evaluates to 1, it may be determined that region 1 does not overlap region 2, that is, the vehicle corresponding to region 1 is not spatially associated with the target rider corresponding to region 2.

If the condition evaluates to 0, the length of the intersection region may be determined as Len = min(p_x2, h_x2) - max(p_x1, h_x1), and its width as Wid = min(p_y2, h_y2) - max(p_y1, h_y1).

After Len and Wid are determined, the area of the intersection of region 1 and region 2 is S1 = Len * Wid.

The area of the union of region 1 and region 2 is then S2 = S(p) + S(h) - S1, where:

S(p) = (p_y2 - p_y1) * (p_x2 - p_x1);
S(h) = (h_y2 - h_y1) * (h_x2 - h_x1).

Finally, IoU = S1 / S2 gives the degree of overlap between the vehicle region and the rider region. In this way, the degree of overlap between the vehicle region and the rider region can be calculated accurately, the associated vehicle region associated with the rider region can be determined accurately, and the accuracy of traffic behavior recognition is improved.
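The formulas above transcribe directly into a few lines of Python; the sketch below follows the same corner-coordinate convention, and the final comment shows the S204 selection of the maximum-overlap vehicle region.

```python
def iou(region1, region2):
    """IoU between two boxes given as (x1, y1, x2, y2) corner coordinates."""
    p_x1, p_y1, p_x2, p_y2 = region1  # vehicle region (region 1)
    h_x1, h_y1, h_x2, h_y2 = region2  # rider region (region 2)
    # Non-overlap test from the text: one box entirely outside the other.
    if p_x1 > h_x2 or p_x2 < h_x1 or p_y1 > h_y2 or p_y2 < h_y1:
        return 0.0
    length = min(p_x2, h_x2) - max(p_x1, h_x1)  # Len
    width = min(p_y2, h_y2) - max(p_y1, h_y1)   # Wid
    s1 = length * width                          # intersection area S1
    s_p = (p_y2 - p_y1) * (p_x2 - p_x1)
    s_h = (h_y2 - h_y1) * (h_x2 - h_x1)
    s2 = s_p + s_h - s1                          # union area S2
    return s1 / s2

# S204: pick the vehicle region with the greatest overlap with the rider region.
# associated_region = max(vehicle_regions, key=lambda v: iou(v, rider_region))
```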
In some embodiments, the target vehicle may also be determined by an association prediction score between the rider and the vehicle.
Referring to fig. 4, fig. 4 is a flowchart illustrating a method for determining a related vehicle area according to the present application.
As shown in fig. 4, when S104 is executed, S402 may be executed to detect the image to be recognized, and obtain the vehicle region and the rider region. Then, S404 is executed to determine a relevance score between the vehicle region and the rider region through a pre-trained relevance score prediction model. S406 may then be performed to determine, as an associated vehicle region associated with the rider region, a target vehicle region having a highest association score with the rider region, among the vehicle regions.
Therefore, the relevance between the rider area and the vehicle area is accurately represented by the association score, so that the associated vehicle area most strongly associated with the rider area can be determined, the type of the vehicle driven by the target rider can be determined accurately, and the accuracy of traffic behavior recognition is improved.

The description of S402 may refer to the description of S202 and will not be repeated here.
The association score prediction model may be constructed based on a deep neural network. To train the model, images including a plurality of pairs of vehicle regions and rider regions may be acquired, and the association score between each pair may be labeled to obtain a number of training samples: if the rider area is associated with the vehicle area, the association score is labeled 1; otherwise it is labeled 0. The model may then be supervised-trained on the training samples until it converges. After training is complete, the association score prediction model may be used to predict the association score between a vehicle area and the rider area of the image to be recognized.
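The selection in S404-S406 then amounts to scoring every detected vehicle region against the rider region and keeping the maximum. The sketch below assumes a trained binary scorer mapping a pair of region crops to an association probability; the scorer, the crop helper, and the pairing scheme are all illustrative assumptions, since the patent does not fix the model's input format.

```python
import torch

def pick_associated_vehicle(image, rider_box, vehicle_boxes, score_model, crop):
    """Return the vehicle box with the highest predicted association score.

    score_model and crop are hypothetical helpers: crop(image, box) is assumed
    to return a fixed-size CxHxW tensor, and score_model a probability in [0, 1].
    """
    best_box, best_score = None, float("-inf")
    with torch.no_grad():
        for vbox in vehicle_boxes:
            # Stack the rider crop and the candidate vehicle crop channel-wise.
            pair = torch.cat([crop(image, rider_box), crop(image, vbox)], dim=0)
            score = score_model(pair.unsqueeze(0)).item()
            if score > best_score:
                best_box, best_score = vbox, score
    return best_box
```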
After determining the associated vehicle region, the apparatus may proceed to S106. The object region (including the rider region and the vehicle region) in this step may be a region surrounded by an object frame corresponding to the object in the image to be recognized. The object region may carry image features related to the object.
The rider area described herein may include a first image feature related to rider manned behavior. For example, the first image feature may include a vehicle driven by a rider and an image feature corresponding to a human body carried on the vehicle. The number of people can be judged through the first image characteristics.
The vehicle region described herein may include a second image feature associated with a vehicle type. For example, the second image feature may include a corresponding image feature of the vehicle. The type of the vehicle can be judged through the second image characteristic.
In some embodiments, in executing S106, S1062 may be executed to perform people-carrying number recognition on the rider area to obtain a people-carrying number recognition result. And S1064, identifying the vehicle type of the associated vehicle area to obtain a vehicle type identification result. The execution order of S1062 and S1064 is not limited in this application.
In some embodiments, in performing S1062, a rider area map corresponding to the rider area may be acquired first. In some embodiments, a rider frame corresponding to the target rider and the received image to be recognized (or a target feature map obtained by feature extraction of the image to be recognized by using a backbone network) may be input to the region feature extraction unit, so as to obtain a rider region map corresponding to the target rider.
The region feature extraction unit may be an ROI Align (Region of Interest Alignment) unit or an ROI Pooling (Region of Interest Pooling) unit. The region feature extraction unit may perform pooling, convolution, and similar processing on the rider region surrounded by the rider frame to obtain a rider region map. The rider region map may include high-dimensional or low-dimensional image features.
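For illustration, torchvision ships an ROI Align operator that performs exactly this cropping of the feature map; in the sketch below the 7x7 output size and the 1/16 feature-map stride are assumptions, not values given by the patent.

```python
import torch
from torchvision.ops import roi_align

def rider_region_map(feature_map, rider_box):
    """feature_map: 1xCxHxW backbone output; rider_box: (x1, y1, x2, y2)
    in image coordinates. Returns a 1xCx7x7 rider region map."""
    boxes = [torch.tensor([list(rider_box)], dtype=torch.float32)]
    # spatial_scale maps image coordinates onto the feature map (assumed stride 16).
    return roi_align(feature_map, boxes, output_size=(7, 7), spatial_scale=1.0 / 16)
```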
After the rider area map is obtained, the rider area map may be subjected to manned number recognition to obtain a manned number recognition result.
In some embodiments, manned number recognition may be performed by a pre-trained manned number recognition model. The manned number recognition model may comprise a classifier built on a neural network. The manned number recognition result output by the model may include a first recognition result, a second recognition result, a third recognition result, and the confidence corresponding to each. The first recognition result represents that the manned number reaches a first preset number; the second recognition result represents that the manned number reaches a second preset number; the third recognition result represents that the manned number is a third preset number. The first, second, and third preset numbers can be set according to business requirements. For example, the first preset number may be 3, the second preset number may be 2, and the third preset number may be 1.
When determining the final manned number recognition result, the recognition result corresponding to the highest confidence coefficient may be selected. For example, the recognition result obtained by classifying the number of people in the rider region map by using the number of people recognition model indicates: the confidence degrees of the first recognition result, the second recognition result and the third recognition result are respectively 0.7, 0.2 and 0.1. The recognition result of the number of people can be determined to be the first recognition result corresponding to the highest confidence coefficient of 0.7.
When the model is trained, training samples with manned number labeling information can be obtained, and multiple rounds of iteration are then performed with the training samples in a supervised training manner until the model converges. After training is complete, the model can be used to recognize the manned number. In this way, the adaptive learning ability of the neural network improves the accuracy of manned number recognition in various situations, including occlusion.
Traffic behavior recognition may be unnecessary, or impossible to perform normally, in many scenarios; such scenarios may be referred to herein as invalid scenarios. For example, in a scene where the rider pushes the vehicle or stands beside it, a rider and a vehicle are both present, but no driving behavior occurs, so no manned behavior needs to be detected. For another example, in a scene where multiple riders are close to each other, a low-definition scene, or a scene where the vehicle is occluded, the recognizability of the riders or vehicles in the image is too low for them to be recognized normally, so traffic behavior recognition cannot be performed normally.
To handle such cases, a fourth recognition result indicating that the current recognition is invalid may be added to the possible manned number recognition results. If the manned number recognition result for the rider area is the fourth recognition result, the scene in the rider area is an invalid scene in which traffic behavior recognition is unnecessary or impossible, so traffic behavior recognition need not be performed for that rider area.
In this case, the manned number recognition result output by the aforementioned manned number recognition model may include the first recognition result, the second recognition result, the third recognition result, the fourth recognition result, and the confidence corresponding to each.
The fourth recognition result represents that at least one of the following invalid scenes occurs in the image to be recognized:
a scene of a rider pushing the vehicle; a scene in which a rider stands beside a vehicle; a scene in which a plurality of riders are close to each other; a low-definition scene; a scene in which the vehicle is occluded.
When determining the final manned number recognition result, the recognition result with the highest confidence may be selected. For example, suppose classifying the rider region map with the manned number recognition model yields confidences of 0.1, 0.2, 0.1, and 0.6 for the first, second, third, and fourth recognition results respectively. The manned number recognition result may then be determined to be the fourth (invalid) recognition result, corresponding to the highest confidence of 0.6.
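A minimal sketch of this four-way classifier head and the highest-confidence selection follows; the class order, feature dimensions, and layer sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Assumed class order: 1 person, 2 people, 3 people, invalid scene.
CLASSES = ["1_person", "2_people", "3_people", "invalid"]

class MannedNumberHead(nn.Module):
    def __init__(self, in_features=256 * 7 * 7):
        super().__init__()
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(in_features, 512),
            nn.ReLU(),
            nn.Linear(512, len(CLASSES)),
        )

    def forward(self, region_map):
        return self.classifier(region_map)  # raw logits

def recognize_manned_number(head, region_map):
    """Return (label, confidence) for the highest-confidence class."""
    probs = torch.softmax(head(region_map), dim=1)[0]
    conf, idx = probs.max(dim=0)
    return CLASSES[idx.item()], conf.item()
```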
In some embodiments, the training method of the people recognition network may include S11-S13.
Wherein, S11, a first training sample is obtained.
The first training sample comprises sample images of a plurality of riders and first labeling information corresponding to the manned number in each image, the first labeling information comprising one of the following labels: 1 person, 2 people, 3 people, or an invalid label, the invalid label covering at least one of: rider pushing the vehicle, rider standing beside the vehicle, multiple riders close to each other, low definition, and vehicle occluded;
and S12, inputting the first training sample into a preset first initial network to obtain a sample manned number identification result of each sample image.
The first initial network may be any type of neural network. The first initial network may output a manned number recognition result.
S13, optimizing the first initial network based on a first loss determined from the sample manned number recognition results and the first labeling information to obtain the manned recognition network.

After the computation result is obtained for the first training sample, the first loss may be determined according to the first labeling information, and the parameters of the first initial network are updated through a back-propagation operation to complete one parameter iteration. In some embodiments, the number of parameter iterations may be preset, and the manned recognition network is obtained after the preset number of iterations has been completed for the first initial network.
With this training method, when the manned number is recognized, on one hand invalid scenes in which traffic behavior recognition cannot or need not be performed can be filtered out, improving the efficiency of traffic behavior recognition; on the other hand, the manned number can be recognized accurately, improving the traffic behavior recognition effect.
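The S11-S13 procedure corresponds to an ordinary supervised classification loop; the sketch below uses cross-entropy as the first loss and SGD for the back-propagation update, which are assumed choices since the patent does not name a loss function or optimizer.

```python
import torch
import torch.nn as nn

def train_manned_recognition(network, loader, epochs=10, lr=1e-3):
    """loader yields (rider_region_map, label_index) batches; label indices
    0-3 follow the CLASSES order above (assumed)."""
    criterion = nn.CrossEntropyLoss()  # the "first loss" (assumed form)
    optimizer = torch.optim.SGD(network.parameters(), lr=lr, momentum=0.9)
    for _ in range(epochs):
        for region_maps, labels in loader:
            logits = network(region_maps)     # S12: sample recognition results
            loss = criterion(logits, labels)  # S13: first loss
            optimizer.zero_grad()
            loss.backward()                   # back-propagation
            optimizer.step()                  # one parameter iteration
    return network
```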
In some embodiments, after manned number recognition is performed on the rider area, a manned number and a corresponding first confidence are obtained; in response to the first confidence reaching a first confidence threshold, the manned number may be determined as the manned number recognition result for the rider area.
The first confidence threshold may be set according to the actual traffic situation. For example, if the model outputs a confidence of 0.7 for the manned number being 1, then the first confidence of that manned number recognition result is 0.7; the confidence characterizes how credible the result of 1 person is.
By setting a confidence threshold and outputting the manned number recognition result only when the confidence reaches the threshold, the reliability of the output recognition result can be ensured, which in turn ensures the accuracy of traffic behavior recognition.
In some embodiments, multiple traffic behavior recognition scenarios can be flexibly accommodated by adjusting the magnitude of the first confidence threshold.
For example, in a scenario where priority is given to ensuring accuracy of an offending manned behavior, the first confidence threshold may be set to a higher value (e.g., 0.9). Therefore, the reliability of the output manned number identification result is high enough, and the accuracy of the traffic behavior identification is improved. For another example, in a scenario where the sensitivity of the passenger behavior recognition is preferentially ensured, the first confidence threshold may be set to a lower value (for example, 0.6), so that the number of output passenger number recognition results may be increased, and the sensitivity of the traffic behavior recognition may be further improved.
In some embodiments, when S1064 is executed, a corresponding vehicle area map may be obtained from the associated vehicle area. In some embodiments, the vehicle frame corresponding to the associated vehicle region and the target feature map corresponding to the image to be recognized may be input to the region feature extraction unit (e.g., the ROI Pooling unit) to obtain the vehicle region map.
And then, vehicle type identification can be carried out on the vehicle area map to obtain a vehicle type identification result. In some embodiments, vehicle type identification may be performed by a pre-trained vehicle identification network. The vehicle identification network may include a classifier constructed based on a neural network. The calculation output by the model may include a confidence (i.e., probability) that the vehicle in the vehicle region map is recognized as each of the predetermined vehicle types. When determining the final vehicle type, the vehicle type corresponding to the highest confidence may be selected, that is, the vehicle type corresponding to the highest confidence may be determined as the vehicle type identification result.
In some embodiments, the training method of the vehicle recognition network may include S21-S23.
Wherein, S21, a second training sample is obtained.
The second training sample comprises sample images of a plurality of vehicles and second labeling information corresponding to the vehicle type of each image.
And S22, inputting the second training sample into a preset second initial network to obtain a sample vehicle type identification result of each sample image.
The second initial network may be any type of neural network. The second initial network may output a vehicle type recognition result.
S23, based on the sample vehicle type identification result and the second loss determined by the second marking information, optimizing the second initial network to obtain the vehicle identification network.
After the calculation result is obtained for the second training sample, a second loss may be determined according to the second labeling information, and the parameters of the second initial network are updated through a back propagation operation to complete one parameter iteration. In some embodiments, the number of parameter iterations may be preset, and the vehicle identification network may be obtained after the preset number of iterations is completed for the second initial network.
By the training method, the accuracy of vehicle type identification can be improved by utilizing the characteristic of neural network self-adaptive learning when the vehicle type identification is carried out.
In some embodiments, after vehicle type recognition is performed on the associated vehicle region, a vehicle type and a corresponding second confidence are obtained. In response to the second confidence reaching a second confidence threshold, the vehicle type may be determined as the vehicle type recognition result for the vehicle region.
The second confidence threshold may be set according to a traffic situation.
By setting the confidence threshold and outputting the vehicle type recognition result only when the confidence reaches the confidence threshold, the reliability of the output recognition result can be ensured, and the accuracy of the traffic behavior recognition is further ensured.
In some embodiments, multiple traffic behavior recognition scenarios can be flexibly accommodated by adjusting the magnitude of the second confidence threshold. For example, in a scenario where accuracy of illegal manned behavior recognition is the priority, the second confidence threshold may be set to a higher value (e.g., 0.9), so that the reliability of the output vehicle type recognition result is high enough and the accuracy of traffic behavior recognition is improved. For another example, in a scenario where sensitivity of manned behavior recognition is the priority, the second confidence threshold may be set to a lower value (e.g., 0.6), so that more vehicle type recognition results are output and the sensitivity of traffic behavior recognition is improved.
After the manned number recognition result and the vehicle type recognition result are obtained, the apparatus may perform S108.
In some embodiments, a legitimacy recognition result that is in line with reality may be output for different vehicle type scenarios.
In executing S108, in the first aspect, it may be determined that the target rider violates the manned rule in response to the manned number recognition result being the first recognition result; the first recognition result represents that the manned number reaches the first preset number.
The first preset number may be an empirical number. For example, in a non-motor vehicle scenario, regardless of the vehicle type, the number of people including the driver must be fewer than 3, so carrying 3 or more people can be considered illegal behavior. In this case, the first preset number may be set to 3; if the manned number reaches 3 or more, a manned rule violation may be determined.
In the second aspect, it may be determined that the target rider violates the manned rule in response to the manned number recognition result being the second recognition result and the vehicle type represented by the vehicle type recognition result being a preset non-motor vehicle type; the second recognition result represents that the manned number reaches a second preset number, which is smaller than the first preset number.
In the third aspect, it may be determined that the target rider does not violate the manned rule in response to the manned number recognition result being the second recognition result and the vehicle type represented by the vehicle type recognition result not being the preset non-motor vehicle type.
When the number of people is the second preset number, corresponding people carrying behaviors of vehicles of different types may be legal or illegal. If the target rider-driven vehicle (target vehicle) is of a preset non-motor vehicle type, it can be determined that the manned behavior is illegal, and otherwise, it can be determined that the manned behavior is legal.
The second preset number may be an empirical value. The preset non-motor vehicle type may refer to a vehicle whose legally permitted number of occupants is smaller than the second preset number.
For example, if the second preset number is 2, the preset non-motor vehicle type may be a utility vehicle such as a tricycle. Such a utility vehicle can legally carry only 1 person, so if the recognized manned number is 2, illegal manned behavior can be determined. If the vehicle is not of this utility type, for example a motorcycle or an electric bicycle, the manned behavior can be determined to be legal when the recognized manned number is 2.
In the fourth aspect, it may be determined that the target rider does not violate the manned rule in response to the manned number recognition result being the third recognition result; the third recognition result represents that the manned number is a third preset number, which is smaller than the second preset number.
The third preset number may be an empirical value. For example, in a non-motor vehicle scenario, regardless of the vehicle type, a count of 1 person including the driver may be considered legal. In this case, the third preset number may be set to 1; if the manned number is 1, the manned behavior may be determined to be legal.
In the fifth aspect, it may be determined that the traffic behavior recognition for the target rider is invalid in response to the manned number recognition result being the fourth recognition result, so that traffic behavior recognition need not continue.
Through the legality judgment logic of the above five aspects, a legality recognition result that accords with reality can be output for different vehicle type scenarios, improving the applicability of legality recognition; a minimal sketch of this logic follows.
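The five branches map onto a small decision function. The sketch below uses the example thresholds from the text (first preset number 3, second 2, third 1) and a tricycle as the preset non-motor vehicle type; the string labels are illustrative.

```python
def judge_manned_behavior(manned_result, vehicle_type):
    """Five-branch legality logic; labels follow the assumed CLASSES order."""
    if manned_result == "invalid":       # fourth recognition result
        return "recognition_invalid"
    if manned_result == "3_people":      # first recognition result: 3 or more
        return "illegal"
    if manned_result == "2_people":      # second recognition result
        # Illegal only for the preset non-motor vehicle type (e.g., tricycle).
        return "illegal" if vehicle_type == "tricycle" else "legal"
    if manned_result == "1_person":      # third recognition result
        return "legal"
    raise ValueError(f"unknown manned number recognition result: {manned_result}")
```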
In some embodiments, warning information may be issued in the event that the target rider is determined to have violated the manned rule.
In some embodiments, the device may be connected to an interactive terminal held by a traffic police officer. When the device recognizes illegal manned behavior, the identity information of the target rider, information on the vehicle being driven, the reason for the violation, and other information can be packaged into warning information and sent to the interactive terminal. The traffic police can then take corresponding action after receiving the warning. In this way, illegal manned behavior can be reported automatically and promptly, which facilitates handling of the violation.
The following embodiment is described with reference to a non-motor vehicle manned behavior recognition scenario.
A number of surveillance cameras are deployed in the scene. The monitoring camera can send the image to be identified collected in the monitoring area to the identification equipment to detect the rider behavior.
The recognition device may be equipped with a pre-trained rider-vehicle recognition network (hereinafter referred to as network 1), a man recognition network (hereinafter referred to as network 2), and a vehicle recognition network (hereinafter referred to as network 3).
Wherein the network 1 is used to detect a rider and a vehicle present in an image to be recognized, and the corresponding vehicle region and the rider region. The network 2 may be used to identify the number of people. The network 3 may be used to identify the type of vehicle.
The recognition device can also perform multi-target tracking on each rider appearing in the image to be recognized according to the recognition result of the network 1 to obtain the corresponding running track of each rider, so as to recognize the rider newly appearing in the monitoring area, the rider still moving in the monitoring area and the rider about to leave the monitoring area. The rider about to leave the monitored area may then be determined as the target rider.
Referring to fig. 5, fig. 5 is a schematic flow chart of a manned behavior recognition method shown in the present application.
As shown in fig. 5, after the recognition device receives the image to be recognized, S501 may be executed to recognize a rider frame corresponding to a rider appearing in the image to be recognized and a vehicle frame corresponding to a vehicle through the network 1, screen out a target rider frame corresponding to a target rider, determine a region surrounded by the target rider frame in the image to be recognized as a rider region, and determine a region surrounded by the vehicle frame in the image to be recognized as a vehicle region.
S502 may then be performed, and a degree of overlap between each vehicle region and the rider region is determined by the IoU method, and a target vehicle region corresponding to the greatest degree of overlap is determined as an associated vehicle region spatially associated with the rider region. Therefore, the relevant vehicle area associated with the rider area can be accurately determined by utilizing the spatial overlapping relation between the rider and the vehicle driven by the rider, the vehicle type identification accuracy is improved, and an accurate manned behavior identification result is obtained.
S503 may then be executed to acquire a rider region map corresponding to the rider region, and obtain a man-riding amount recognition result using the network 2. S504, a vehicle area map corresponding to the related vehicle area is obtained, and a vehicle type identification result is obtained by utilizing the network 3. In this example, it can be judged whether the confidence degrees corresponding to the manned number identification result and the vehicle type identification result reach 0.8, so that the credible manned number identification result and the vehicle type identification result can be screened out, and the legality identification accuracy is further improved.
S505 may be executed to identify whether the manned behavior of the target rider is illegal according to the manned number identification result and the type identification result.
Referring to fig. 6, fig. 6 is a schematic diagram illustrating an illegal manned behavior determination rule according to the present application.
As shown in fig. 6, S602 may be executed first to determine the number of people represented by the manned number recognition result. If the manned number recognition result is invalid, traffic behavior recognition may not be performed for the target rider.
If the number of people represented by the manned number recognition result reaches 3, it is determined that the target rider violates the manned rule.
If the number of people represented by the manned number recognition result is 2, S604 may be further executed to determine whether the vehicle type represented by the vehicle type recognition result is a tricycle.
If the vehicle type is a tricycle, it is determined that the target rider violates the manned rule; otherwise, it is determined that the target rider does not violate the manned rule.
If the number of people represented by the manned number recognition result is 1, it is determined that the target rider does not violate the manned rule.
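The rule of fig. 6 can be summarized in a few lines; below is a minimal sketch under the assumptions of this example (the label strings, and tricycle as the preset non-motor vehicle type), not a definitive implementation.

# Sketch of the fig. 6 decision rule for this example. The label strings
# and return values are assumptions; "tricycle" is the preset non-motor
# vehicle type used in this scenario.
from typing import Optional

def manned_violation(num_people: str, vehicle_type: Optional[str]) -> Optional[bool]:
    """Return True for a violation, False for no violation, None if invalid."""
    if num_people == "invalid":
        return None                  # skip recognition for invalid scenes
    if num_people == "3 persons":
        return True                  # 3 people always violates the manned rule
    if num_people == "2 persons":
        return vehicle_type == "tricycle"
    return False                     # 1 person does not violate the rule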
In this way, on the one hand, manned behavior recognition is not performed for invalid scenes, which improves the efficiency and effect of the manned behavior recognition; on the other hand, legality recognition results that match reality can be output for different vehicle types, which improves the applicability of the legality recognition.
If illegal manned behavior is identified, warning information can be generated based on the rider information, the vehicle information, and the violation reason corresponding to the illegal manned behavior, and sent in time to the handheld device of the corresponding traffic police officer, so that the violation can be handled promptly.
Corresponding to any of the foregoing embodiments, the present application further provides a traffic behavior recognition apparatus.
Referring to fig. 7, fig. 7 is a schematic structural diagram of a traffic behavior recognition device shown in the present application.
As shown in fig. 7, the apparatus 70 may include:
an obtaining module 71, configured to obtain an image to be identified;
a first determining module 72, configured to determine, from among vehicle regions included in the image to be recognized, an associated vehicle region associated with a rider region in the image to be recognized, where the rider region includes a vehicle and at least one human body;
an identification module 73, configured to perform manned number recognition on the rider region to obtain a manned number recognition result, and to perform vehicle type recognition on the associated vehicle region to obtain a vehicle type recognition result;
and a second determining module 74, configured to determine, according to the manned number recognition result and the vehicle type recognition result, whether the target rider in the rider region exhibits illegal manned behavior.
In some embodiments, the first determining module 72 is specifically configured to:
detecting the image to be identified to obtain the vehicle area and the rider area;
and determining, from among the obtained vehicle regions, the target vehicle region having the greatest degree of overlap with the rider region, and determining the target vehicle region as the associated vehicle region associated with the rider region.
In some embodiments, the first determining module 72 is specifically configured to:
detecting the image to be identified to obtain the vehicle area and the rider area;
determining an association score between the vehicle region and the rider region through a pre-trained association score prediction model;
and determining, from among the vehicle areas, the target vehicle area having the highest association score with the rider area as the associated vehicle area associated with the rider area.
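As a hedged illustration of this learned alternative to IoU matching, one common design scores each (vehicle, rider) region pair with a small network over simple pairwise box features; the features, layer sizes, and everything else below are assumptions, not the disclosed model.

# Hypothetical association-score predictor: a small MLP over pairwise box
# features; architecture and feature choice are assumptions.
import torch
import torch.nn as nn

class AssociationScorer(nn.Module):
    def __init__(self, in_dim: int = 8, hidden: int = 32):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1), nn.Sigmoid(),
        )

    def forward(self, pair_features: torch.Tensor) -> torch.Tensor:
        # pair_features: (N, in_dim), e.g. the normalized centers and
        # sizes of the vehicle box and the rider box for each pair.
        return self.mlp(pair_features).squeeze(-1)  # scores in [0, 1]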
In some embodiments, the identifying module 73 is specifically configured to:
identifying the number of people in the rider area to obtain the number of people and a corresponding first confidence;
in response to the first confidence reaching a first confidence threshold, determining the number of people as the manned number recognition result for the rider area;
identifying the vehicle type of the associated vehicle area to obtain the vehicle type and a corresponding second confidence;
and in response to the second confidence reaching a second confidence threshold, determining the vehicle type as the vehicle type recognition result for the associated vehicle area.
In some embodiments, the second determining module 74 is specifically configured to:
in response to the manned number recognition result being a first recognition result, determining that the target rider violates the manned rule, wherein the first recognition result represents that the number of people carried reaches a first preset number; or,
in response to the manned number recognition result being a second recognition result and the vehicle type represented by the vehicle type recognition result being a preset non-motor vehicle type, determining that the target rider violates the manned rule, wherein the second recognition result represents that the number of people carried reaches a second preset number smaller than the first preset number; or,
in response to the manned number recognition result being the second recognition result and the vehicle type represented by the vehicle type recognition result not being the preset non-motor vehicle type, determining that the target rider does not violate the manned rule; or,
in response to the manned number recognition result being a third recognition result, determining that the target rider does not violate the manned rule, wherein the third recognition result represents that the number of people carried is a third preset number smaller than the second preset number; or,
in response to the manned number recognition result being a fourth recognition result, determining that the traffic behavior recognition for the target rider is invalid.
In some embodiments, the fourth recognition result indicates that at least one of the following invalid scenes appears in the image to be recognized:
a scene in which a rider pushes the vehicle; a scene in which a rider stands beside a vehicle; a scene in which multiple riders are in close proximity to one another; a low-definition scene; a scene in which the vehicle is occluded.
In some embodiments, the apparatus 70 further comprises:
a warning module, configured to send warning information in response to the target rider violating the manned rule.
In some embodiments, the manned number recognition result is obtained by detecting the rider region through a manned recognition network; the apparatus 70 further comprises:
a training module for the manned recognition network, configured to obtain a first training sample, wherein the first training sample comprises sample images of a plurality of riders and first labeling information corresponding to the number of people carried in each image, the first labeling information comprising one of the following labels: 1 person, 2 persons, 3 persons, or an invalid label, the invalid label covering at least one of: a rider pushing the vehicle, a rider standing beside the vehicle, multiple riders in close proximity, low definition, and an occluded vehicle;
input the first training sample into a preset first initial network to obtain a sample manned number recognition result for each sample image;
and optimize the first initial network based on a first loss determined from the sample manned number recognition results and the first labeling information, to obtain the manned recognition network.
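For illustration, a minimal classification training loop matching the steps above; the vehicle recognition network of the following module can be trained with the same pattern using vehicle type labels. The backbone, optimizer, and hyperparameters are assumptions, not the disclosed configuration.

# Minimal sketch of the described training loop: 4-way classification
# over {1 person, 2 persons, 3 persons, invalid}. Backbone, optimizer,
# and hyperparameters are assumptions, not the disclosed configuration.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader

NUM_CLASSES = 4  # 1 person, 2 persons, 3 persons, invalid; the model's
                 # final layer is assumed to output NUM_CLASSES logits

def train(model: nn.Module, loader: DataLoader, epochs: int = 10) -> nn.Module:
    criterion = nn.CrossEntropyLoss()               # the "first loss"
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    model.train()
    for _ in range(epochs):
        for images, labels in loader:               # first training samples
            logits = model(images)                  # sample recognition results
            loss = criterion(logits, labels)        # compare with labeling info
            optimizer.zero_grad()
            loss.backward()                         # optimize the initial network
            optimizer.step()
    return model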
In some embodiments, the vehicle type recognition result is obtained by detecting the vehicle region through a vehicle recognition network; the apparatus 70 further comprises:
a training module for the vehicle recognition network, configured to acquire a second training sample, wherein the second training sample comprises sample images of a plurality of vehicles and second labeling information corresponding to the vehicle type of each image;
input the second training sample into a preset second initial network to obtain a sample vehicle type recognition result for each sample image;
and optimize the second initial network based on a second loss determined from the sample vehicle type recognition results and the second labeling information, to obtain the vehicle recognition network.
The traffic behavior recognition apparatus shown in the present application can be applied to an electronic device. Accordingly, the present application discloses an electronic device, which may comprise a processor and a memory for storing processor-executable instructions, wherein the processor is configured to call the executable instructions stored in the memory to implement the traffic behavior recognition method shown in any of the foregoing embodiments.
Referring to fig. 8, fig. 8 is a schematic diagram of a hardware structure of an electronic device shown in the present application.
As shown in fig. 8, the electronic device may include a processor for executing instructions, a network interface for making network connections, a memory for storing operating data for the processor, and a non-volatile memory for storing instructions corresponding to the traffic behavior recognition apparatus.
The apparatus embodiments may be implemented by software, by hardware, or by a combination of hardware and software. Taking a software implementation as an example, the apparatus is formed, as a logical device, by the processor of the electronic device in which it is located reading the corresponding computer program instructions from the non-volatile memory into the memory for execution. In terms of hardware, in addition to the processor, memory, network interface, and non-volatile memory shown in fig. 8, the electronic device in which the apparatus is located may also include other hardware according to its actual function, which is not described again here.
It can be understood that, to increase processing speed, the instructions corresponding to the traffic behavior recognition apparatus may also be stored directly in the memory, which is not limited herein.
The present application further provides a computer-readable storage medium storing a computer program, which can be used to cause a processor to execute the traffic behavior recognition method shown in any of the foregoing embodiments.
One skilled in the art will recognize that one or more embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, one or more embodiments of the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, one or more embodiments of the present application may take the form of a computer program product embodied on one or more computer-usable storage media (which may include, but are not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
"and/or" in this application means having at least one of the two, for example, "a and/or B" may include three schemes: A. b, and "A and B".
The embodiments in the present application are described in a progressive manner; the same or similar parts among the embodiments may refer to one another, and each embodiment focuses on its differences from the others. In particular, since the apparatus embodiment is substantially similar to the method embodiment, its description is relatively brief; for relevant details, reference may be made to the description of the method embodiment.
Specific embodiments of the present application have been described. Other embodiments are within the scope of the following claims. In some cases, the acts or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
Embodiments of the subject matter and functional operations described in this application may be implemented in digital electronic circuitry, in tangibly embodied computer software or firmware, in computer hardware that may include the structures disclosed in this application and their structural equivalents, or in combinations of one or more of them. Embodiments of the subject matter described in this application can be implemented as one or more computer programs, i.e., one or more modules encoded in computer program instructions that are carried by a tangible, non-transitory program carrier for execution by, or to control the operation of, data processing apparatus. Alternatively or additionally, the program instructions may be encoded in an artificially generated propagated signal, e.g., a machine-generated electrical, optical, or electromagnetic signal, that is generated to encode information for transmission to suitable receiver apparatus for execution by the data processing apparatus. The computer storage medium may be a machine-readable storage device, a machine-readable storage substrate, a random or serial access memory device, or a combination of one or more of them.
The processes and logic flows described in this application can be performed by one or more programmable computers executing one or more computer programs to perform corresponding functions by operating on input data and generating output. The processes and logic flows can also be performed by, and apparatus can also be implemented as, special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application-specific integrated circuit).
Computers suitable for executing computer programs may include, for example, general and/or special purpose microprocessors, or any other type of central processing unit. Generally, a central processing unit will receive instructions and data from a read-only memory and/or a random access memory. The basic components of a computer may include a central processing unit for implementing or executing instructions and one or more memory devices for storing instructions and data. Generally, a computer will also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data, e.g., magnetic, magneto-optical disks, or optical disks. However, a computer does not necessarily have such a device. Moreover, a computer may be embedded in another device, e.g., a mobile telephone, a Personal Digital Assistant (PDA), a mobile audio or video player, a game console, a Global Positioning System (GPS) receiver, or a portable storage device such as a Universal Serial Bus (USB) flash drive, to name a few.
Computer-readable media suitable for storing computer program instructions and data can include all forms of non-volatile memory, media and memory devices, including by way of example semiconductor memory devices (e.g., EPROM, EEPROM, and flash memory devices), magnetic disks (e.g., internal hard disk or removable disks), magneto-optical disks, and CD ROM and DVD-ROM disks. The processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.
Although this application contains many specific implementation details, these should not be construed as limiting the scope of any disclosure or of what may be claimed, but rather as merely describing features of particular disclosed embodiments. Certain features that are described in this application in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination. Moreover, although features may be described as acting in certain combinations and even initially claimed as such, one or more features from a claimed combination can in some cases be excised from the combination, and the claimed combination may be directed to a subcombination or variation of a subcombination.
Similarly, while operations are depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. In some cases, multitasking and parallel processing may be advantageous. Moreover, the dispersion of various system modules and components in the described embodiments should not be understood as requiring such dispersion in all embodiments, and it should be understood that the described program components and systems can generally be integrated together in a single software product or packaged into multiple software products.
Thus, particular embodiments of the subject matter have been described. Other embodiments are within the scope of the following claims. In some cases, the actions recited in the claims can be performed in a different order and still achieve desirable results. Further, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some implementations, multitasking and parallel processing may be advantageous.
While the invention has been described in terms of what is presently considered to be the most practical and preferred embodiment, it is to be understood that the invention is not to be limited to the disclosed embodiment, but on the contrary, is intended to cover various modifications, equivalents, and alternatives falling within the spirit and scope of the invention as defined by the appended claims.

Claims (12)

1. A traffic behavior recognition method, characterized in that the method comprises:
acquiring an image to be identified;
determining, from among the vehicle areas included in the image to be recognized, an associated vehicle area associated with a rider area in the image to be recognized, wherein the rider area comprises a vehicle and at least one human body;
carrying out manned number identification on the rider area to obtain a manned number identification result, and carrying out vehicle type identification on the associated vehicle area to obtain a vehicle type identification result;
and determining whether the target rider in the rider area has illegal manned behavior according to the manned number identification result and the vehicle type identification result.
2. The method according to claim 1, wherein the determining, from among the vehicle regions included in the image to be recognized, the associated vehicle region associated with the rider region in the image to be recognized comprises:
detecting the image to be identified to obtain the vehicle area and the rider area;
and determining, from among the obtained vehicle regions, the target vehicle region having the greatest degree of overlap with the rider region, and determining the target vehicle region as the associated vehicle region associated with the rider region.
3. The method according to claim 1, wherein the determining, from among the vehicle regions included in the image to be recognized, the associated vehicle region associated with the rider region in the image to be recognized comprises:
detecting the image to be identified to obtain the vehicle area and the rider area;
determining an association score between the vehicle region and the rider region through a pre-trained association score prediction model;
and determining, from among the vehicle areas, the target vehicle area having the highest association score with the rider area as the associated vehicle area associated with the rider area.
4. The method according to claim 1, wherein the carrying out manned number identification on the rider area to obtain a manned number identification result comprises:
identifying the number of people in the rider area to obtain the number of people and a corresponding first confidence;
in response to the first confidence reaching a first confidence threshold, determining the number of people as the manned number identification result for the rider area;
and the carrying out vehicle type identification on the associated vehicle area to obtain a vehicle type identification result comprises:
identifying the vehicle type of the associated vehicle area to obtain the vehicle type and a corresponding second confidence;
and in response to the second confidence reaching a second confidence threshold, determining the vehicle type as the vehicle type identification result for the associated vehicle area.
5. The method according to any one of claims 1 to 4, wherein the determining whether the target rider in the rider area has the illegal manned behavior according to the manned number recognition result and the vehicle type recognition result comprises:
in response to the manned number recognition result being a first recognition result, determining that the target rider violates the manned rule, wherein the first recognition result represents that the number of people carried reaches a first preset number; or,
in response to the manned number recognition result being a second recognition result and the vehicle type represented by the vehicle type recognition result being a preset non-motor vehicle type, determining that the target rider violates the manned rule, wherein the second recognition result represents that the number of people carried reaches a second preset number smaller than the first preset number; or,
in response to the manned number recognition result being the second recognition result and the vehicle type represented by the vehicle type recognition result not being the preset non-motor vehicle type, determining that the target rider does not violate the manned rule; or,
in response to the manned number recognition result being a third recognition result, determining that the target rider does not violate the manned rule, wherein the third recognition result represents that the number of people carried is a third preset number smaller than the second preset number; or,
in response to the manned number recognition result being a fourth recognition result, determining that the traffic behavior recognition for the target rider is invalid.
6. The method according to claim 5, wherein the fourth recognition result indicates that at least one of the following invalid scenes appears in the image to be recognized:
a scene in which a rider pushes the vehicle; a scene in which a rider stands beside a vehicle; a scene in which multiple riders are in close proximity to one another; a low-definition scene; a scene in which the vehicle is occluded.
7. The method according to any one of claims 1-6, further comprising:
sending out warning information in response to the target rider violating the manned rule.
8. The method according to any one of claims 1 to 6, wherein the manned number recognition result is obtained by detecting the rider area through a manned recognition network, wherein the manned recognition network is trained as follows:
obtaining a first training sample, wherein the first training sample comprises sample images of a plurality of riders and first labeling information corresponding to the number of people carried in each image, the first labeling information comprising one of the following labels: 1 person, 2 persons, 3 persons, or an invalid label, the invalid label covering at least one of: a rider pushing the vehicle, a rider standing beside the vehicle, multiple riders in close proximity, low definition, and an occluded vehicle;
inputting the first training sample into a preset first initial network to obtain a sample manned number recognition result for each sample image;
and optimizing the first initial network based on a first loss determined from the sample manned number recognition results and the first labeling information, to obtain the manned recognition network.
9. The method according to any one of claims 1 to 6, wherein the vehicle type identification result is obtained by detecting the vehicle area through a vehicle identification network, wherein the vehicle identification network is trained as follows:
acquiring a second training sample, wherein the second training sample comprises sample images of a plurality of vehicles and second labeling information corresponding to the vehicle type of each image;
inputting the second training sample into a preset second initial network to obtain a sample vehicle type identification result for each sample image;
and optimizing the second initial network based on a second loss determined from the sample vehicle type identification results and the second labeling information, to obtain the vehicle identification network.
10. A traffic behavior recognition apparatus, characterized in that the apparatus comprises:
the acquisition module is used for acquiring an image to be identified;
a first determining module, configured to determine, from among the vehicle areas included in the image to be recognized, an associated vehicle area associated with a rider area in the image to be recognized, wherein the rider area comprises a vehicle and at least one human body;
an identification module, configured to perform manned number recognition on the rider area to obtain a manned number recognition result, and to perform vehicle type recognition on the associated vehicle area to obtain a vehicle type recognition result;
and a second determining module, configured to determine, according to the manned number recognition result and the vehicle type recognition result, whether the target rider in the rider area exhibits illegal manned behavior.
11. An electronic device, characterized in that the device comprises:
a processor;
a memory for storing processor-executable instructions;
wherein the processor implements the traffic behavior recognition method according to any one of claims 1 to 9 by executing the executable instructions.
12. A computer-readable storage medium, characterized in that the storage medium stores a computer program for causing a processor to execute the traffic behavior recognition method according to any one of claims 1 to 9.