CN114550036B - Searching method and system for optimal cascade configuration of video target detection - Google Patents
- Publication number
- CN114550036B (application CN202210147889.0A)
- Authority
- CN
- China
- Prior art keywords
- configuration
- data set
- video data
- video
- optimal
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G06F18/2155—Generating training patterns; Bootstrap methods, e.g. bagging or boosting characterised by the incorporation of unlabelled data, e.g. multiple instance learning [MIL], semi-supervised techniques using expectation-maximisation [EM] or naïve labelling
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The invention discloses a searching method and a searching system for optimal cascade configuration of video target detection, wherein the method comprises the following steps: acquiring a video data set and performing feature calculation on the video data set to obtain scene feature combinations; performing a frame filtering operation on the video dataset based on the inter-frame similarity; based on an optimized configuration search algorithm, an optimal cascading configuration scheme which meets the precision requirement and has the lowest cost in a video data set is obtained, and a training set with a label is constructed; training the cascading scheme mapper based on the training set to obtain a trained cascading scheme mapper; and acquiring the video to be detected, searching the optimal configuration based on the trained cascading scheme mapper, and completing the target detection task. The system comprises: the device comprises a characteristic calculation module, a filtering module, a searching module, a training module and a detection module. By using the method and the device, the optimal cascading scheme can be automatically and efficiently acquired according to the video scene, and target detection is completed. The invention can be widely applied to the field of target detection.
Description
Technical Field
The invention relates to the field of target detection, in particular to a searching method and a searching system for optimal cascade configuration of video target detection.
Background
Currently, with the development of computer vision technology, target detection models based on deep neural networks produce increasingly accurate results. Applying them to large-scale video data sets, however, remains a significant challenge, mainly because most video target detection tasks are executed on mobile devices such as cameras, which cannot afford a high computational cost. It is therefore important to study how to reduce the computational cost while still meeting the detection accuracy requirement.
Disclosure of Invention
In order to solve the technical problems, the invention aims to provide a searching method and a searching system for optimal cascading configuration of video target detection, which can automatically and efficiently acquire an optimal cascading scheme according to video scenes to finish target detection.
The first technical scheme adopted by the invention is as follows: a searching method for optimal cascade configuration of video target detection comprises the following steps:
acquiring a video data set and performing feature calculation on the video data set to obtain scene feature combinations;
Performing frame filtering operation on the video data set based on the inter-frame similarity to obtain a video data set with redundant frames filtered;
based on an optimized configuration search algorithm, an optimal cascading configuration scheme which meets the precision requirement and has the lowest cost in a video data set is obtained, and a training set with a label is constructed by combining scene feature combination;
Training the cascading scheme mapper based on the training set to obtain a trained cascading scheme mapper;
and acquiring the video to be detected, searching the optimal configuration based on the trained cascading scheme mapper, and completing the target detection task.
Further, before the step of obtaining the video data set and performing feature calculation on the video data set to obtain the scene feature combination, the method further includes:
And counting the running cost of all the configuration schemes.
Further, the step of obtaining a video data set and performing feature computation on the video data set to obtain a scene feature combination specifically includes:
acquiring a video data set, wherein the video data set comprises a detection data set, a network video and a shooting video;
performing scene analysis on the video data set, calculating the characteristics of each frame of picture in the video data set, preprocessing, and extracting to obtain scene characteristic combinations;
The scene feature combination includes a detection target number, a detection target speed, a detection target displacement, a scene offset, and a CNN feature.
Further, the preprocessing comprises normalization processing and extreme outlier removal processing by adopting a 0-1 normalization method.
Further, the step of performing a frame filtering operation on the video data set based on the inter-frame similarity to obtain a video data set with redundant frames filtered, specifically includes:
Acquiring an inter-frame difference algorithm;
Comparing the calculation cost of the inter-frame difference algorithm, the adaptability of the scene and the filtering threshold interval;
And selecting an inter-frame difference algorithm to calculate the inter-frame similarity, and filtering the video frames to obtain filtered video frame data.
Further, the step of obtaining an optimal cascade configuration scheme which meets the precision requirement and has the lowest cost in the video data set based on the optimized configuration search algorithm and combining scene feature combination to construct a training set with a tag specifically comprises the following steps:
Sequentially dividing videos in the video data set into preset lengths to obtain a fragment set;
performing filtering strategy and cascading configuration analysis on each fragment in the fragment set to obtain a corresponding combination scheme;
Generating tagged video data according to scene feature combination and combination scheme corresponding to the video data set, and constructing and obtaining a tagged training set.
Further, the training of the cascade scheme mapper based on the training set, to obtain a trained cascade scheme mapper, specifically includes:
Based on the training set, training the cascade scheme mapper with the scene feature combination as input and the combination scheme as output;
Drawing an accuracy rate change curve in the training process, and debugging the cascade scheme mapper until the accuracy rate reaches a preset value, so as to obtain the trained cascade scheme mapper.
Further, the step of obtaining the video to be detected and searching the optimal configuration based on the trained cascading scheme mapper to complete the target detection task specifically includes:
Building and running a trained cascading scheme mapper on NCNN;
inputting a video to be detected on line, and carrying out feature extraction processing on the video to be detected to obtain scene features to be detected;
outputting an optimal combination according to the scene characteristics to be detected, wherein the optimal combination comprises cascade configuration and a filtering strategy;
and filtering the video to be detected according to the optimal combination, and completing the video target detection task by combining with cascade configuration.
The second technical scheme adopted by the invention is as follows: a search system for optimal cascade configuration for video object detection, comprising:
the feature calculation module is used for acquiring a video data set and carrying out feature calculation on the video data set to obtain scene feature combinations;
The filtering module is used for performing frame filtering operation on the video data set based on the inter-frame similarity to obtain a video data set with redundant frames filtered;
the searching module is used for acquiring an optimal cascading configuration scheme which meets the precision requirement and has the lowest cost in the video data set based on an optimized configuration searching algorithm, and constructing a training set with a label by combining scene characteristic combination;
The training module trains the cascading scheme mapper based on the training set to obtain a trained cascading scheme mapper;
the detection module is used for acquiring the video to be detected and searching the optimal configuration based on the trained cascading scheme mapper to complete the target detection task.
The method and the system have the following beneficial effects: the invention can efficiently select an effective filtering strategy and cascade configuration according to scene characteristics. After the frames are filtered, the lightweight configuration in the cascade configuration of the current scene is tried first, and the high-precision configuration is considered only if the requirement is not met, so that the whole target detection process is efficient, accurate and low in cost.
Drawings
FIG. 1 is a flow chart of the steps of a search method for an optimal cascade configuration for video object detection according to the present invention;
fig. 2 is a block diagram of a search system of an optimal cascade configuration for video object detection according to the present invention.
Detailed Description
The invention will now be described in further detail with reference to the drawings and to specific examples. The step numbers in the following embodiments are set for convenience of illustration only, and the order between the steps is not limited in any way, and the execution order of the steps in the embodiments may be adaptively adjusted according to the understanding of those skilled in the art.
As shown in fig. 1, the present invention provides a searching method for an optimal cascade configuration of video object detection, which includes the following steps:
S0, counting the running cost of all the configuration schemes.
S1, acquiring a video data set and performing feature calculation on the video data set to obtain a scene feature combination;
S1.1, acquiring a video data set, wherein the video data set comprises a detection data set, a network video and a shooting video;
Specifically, data covering a large number of common target detection scenes can be obtained through channels such as downloading mainstream target detection data sets (such as KITTI, VOC and COCO), collecting network videos, or shooting videos oneself.
S1.2, performing scene analysis on a video data set, calculating the characteristics of each frame of picture in the video data set, preprocessing, and extracting to obtain a scene characteristic combination;
The scene characteristic combination comprises the number of detection targets, the speed of the detection targets, the displacement of the detection targets, the scene offset and the CNN characteristic, and the preprocessing comprises the steps of carrying out normalization processing and extreme outlier removal processing by adopting a 0-1 normalization method.
Specifically, the normalized data have a mean of 0 and a standard deviation of 1 (this is what the 0-1 normalization above refers to), so that differences in scale between the feature values are effectively removed.
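The preprocessing described above can be sketched as follows. This is a minimal illustration only: the function names, the 3-standard-deviation cut-off for extreme outliers, and the sample values are assumptions, since the text does not specify how outliers are identified.

```python
from statistics import mean, pstdev


def remove_extreme_outliers(values, max_z=3.0):
    """Drop values lying more than max_z standard deviations from the mean.

    The cut-off max_z is an assumed parameter, not taken from the patent.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return list(values)
    return [v for v in values if abs(v - mu) / sigma <= max_z]


def standardize(values):
    """Rescale values to mean 0 and standard deviation 1."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical per-frame target counts; 120 plays the role of an extreme outlier.
target_counts = [3, 4, 5, 4, 3, 5, 4, 4, 3, 5, 4, 120]
cleaned = remove_extreme_outliers(target_counts, max_z=3.0)
normalized = standardize(cleaned)
```

Removing outliers before standardizing keeps a single extreme frame from inflating the standard deviation and compressing every other value toward zero.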
S2, performing frame filtering operation on the video data set based on the inter-frame similarity to obtain a video data set with redundant frames filtered;
s2.1, acquiring an inter-frame difference algorithm;
s2.2, comparing the calculation cost of the inter-frame difference algorithm, the adaptability of the scene and the filtering threshold interval;
Specifically, the inter-frame difference measures commonly used in frame filtering strategies are surveyed, such as mean squared error (MSE), mean absolute error (MAE), peak signal-to-noise ratio (PSNR) and structural similarity (SSIM). The calculation cost and scene adaptability of the different difference algorithms are compared, and the filtering threshold interval of each inter-frame difference calculation method is determined.
S2.3, selecting an inter-frame difference algorithm to filter the video frames to obtain filtered video frame data.
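Steps S2.1 to S2.3 can be illustrated with a small sketch. Frames are represented here as flat lists of grayscale pixel values, and a frame is compared against the last kept frame; both choices, as well as the function names and the threshold, are assumptions made for illustration rather than details given in the text.

```python
import math


def mse(a, b):
    """Mean squared error between two equally sized frames."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def mae(a, b):
    """Mean absolute error between two equally sized frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def psnr(a, b, peak=255.0):
    """Peak signal-to-noise ratio in dB; higher means more similar."""
    err = mse(a, b)
    return math.inf if err == 0 else 10.0 * math.log10(peak ** 2 / err)


def filter_frames(frames, diff_fn, threshold):
    """Return the indices of the frames that are kept.

    A frame is kept only when its difference from the last kept frame
    exceeds the threshold; otherwise it is treated as redundant.
    """
    if not frames:
        return []
    kept = [0]
    for i in range(1, len(frames)):
        if diff_fn(frames[kept[-1]], frames[i]) > threshold:
            kept.append(i)
    return kept


# Hypothetical 4-pixel frames: frames 1 and 3 barely differ from their predecessors.
frames = [
    [10, 10, 10, 10],
    [10, 11, 10, 10],
    [200, 200, 10, 10],
    [200, 201, 10, 10],
]
kept = filter_frames(frames, mse, threshold=5.0)
```

MSE and MAE grow with the difference between frames, whereas PSNR and SSIM grow with similarity, so the comparison direction in `filter_frames` would need to be inverted for the latter two. This is one reason each measure needs its own threshold interval.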
S3, based on an optimized configuration search algorithm, obtaining an optimal cascading configuration scheme which meets the precision requirement and has the lowest cost in the video data set, and combining scene feature combination to construct a training set with a label;
Specifically, since the subsequent neural network training is supervised learning, it presupposes high-quality labeled data. An optimized search is therefore used to reduce the time complexity of searching the cascade space from exponential, O(m^n), to O(m·n), so that the optimal cascade schemes for a large number of scenes are obtained efficiently and high-quality labeled training data are generated for the subsequent training of the neural-network-based cascade scheme mapper.
S3.1, dividing videos in the video data set into preset lengths in sequence to obtain a fragment set;
Specifically, the video data is divided in sequence into a set of segments of length 16 s, each referred to as a Window. Each Window is further divided into smaller segments of length 4 s, each referred to as a Segment.
S3.2, carrying out filtering strategy and cascading configuration analysis on each fragment in the fragment set to obtain a corresponding combination scheme;
Specifically, for the first Segment in each Window, a comprehensive analysis over all filtering strategies and cascade configurations is carried out to obtain the optimal combination scheme, and the top K combinations of filtering strategy and cascade configuration are recorded (namely, the K combinations of filtering strategy and two-stage cascade configuration that meet the precision requirement at the lowest calculation cost).
For each remaining Segment in the same Window, the comprehensive analysis is not repeated; the optimal cascade configuration and filtering strategy are selected only from the top K combinations, which greatly reduces the combination search space and yields the optimal filtering strategy and two-stage cascade configuration for that Segment.
If no combination satisfying the precision requirement can be found among the top K combinations passed on from the previous Segment, this indicates that the scene has changed substantially; in that case a full re-analysis is required and the top K combinations are updated.
And S3.3, generating video data with labels according to scene feature combination and combination schemes corresponding to the video data set, and constructing and obtaining a training set.
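The Window/Segment search of steps S3.1 to S3.3 can be sketched as below. This is a simplified illustration under stated assumptions: the combination names, the cost table, the per-Segment accuracy values and the helper names (`split_into_windows`, `full_search`, `label_window`) are all hypothetical, and the accuracy of a combination on a Segment is assumed to be available through a callable.

```python
def split_into_windows(duration_s, window_s=16, segment_s=4):
    """Split [0, duration_s) into Windows, each a list of (start, end) Segments."""
    windows = []
    for w_start in range(0, duration_s, window_s):
        w_end = min(w_start + window_s, duration_s)
        windows.append([(s, min(s + segment_s, w_end))
                        for s in range(w_start, w_end, segment_s)])
    return windows


def full_search(segment, combos, cost, accuracy_of, target, k):
    """Evaluate every combination; return the K cheapest that meet the target."""
    ok = [c for c in combos if accuracy_of(segment, c) >= target]
    return sorted(ok, key=lambda c: cost[c])[:k]


def label_window(segments, combos, cost, accuracy_of, target, k=2):
    """Label each Segment of one Window with its selected combination."""
    labels, top_k = [], []
    for idx, seg in enumerate(segments):
        # Later Segments first try only the top-K list inherited so far.
        candidates = ([c for c in top_k if accuracy_of(seg, c) >= target]
                      if idx > 0 else [])
        if not candidates:
            # First Segment, or the scene changed: redo the full analysis.
            top_k = full_search(seg, combos, cost, accuracy_of, target, k)
            candidates = top_k
        # top_k is sorted by cost, so the first candidate is the cheapest.
        labels.append(candidates[0] if candidates else None)
    return labels


# Hypothetical combinations, costs and measured accuracies for one Window.
combos = ["light", "medium", "heavy"]
cost = {"light": 1, "medium": 3, "heavy": 10}
segments = [
    {"light": 0.92, "medium": 0.95, "heavy": 0.99},
    {"light": 0.85, "medium": 0.93, "heavy": 0.99},
    {"light": 0.60, "medium": 0.70, "heavy": 0.97},  # scene change
    {"light": 0.80, "medium": 0.85, "heavy": 0.98},
]
labels = label_window(segments, combos, cost,
                      lambda seg, c: seg[c], target=0.9, k=2)
windows = split_into_windows(32)
```

Pairing each label with the scene feature combination of the same Segment would then give one labeled training sample. Note that restricting later Segments to the top-K list trades a small risk of missing a cheaper combination for a much smaller search.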
S4, training the cascade scheme mapper based on the training set to obtain a trained cascade scheme mapper;
S4.1, based on the training set, training the cascade scheme mapper with the scene feature combination as input and the combination scheme as output;
In addition, a test set may also be constructed for verification.
And S4.2, drawing an accuracy rate change curve in the training process, and debugging the cascade scheme mapper until the accuracy rate reaches a preset value, so as to obtain the trained cascade scheme mapper.
Specifically, the debugging operations are: changing the learning rate, observing the convergence curves of the objective function value and the verification set accuracy, and selecting a suitable learning rate from them; changing the number of hidden layers and the number of hidden units in each layer, observing how the accuracy changes with each modification, and selecting a suitable number of layers and units; and changing the number of training epochs while observing the accuracy curves of the training set and the verification set, and recording the most suitable number of training epochs.
S5, acquiring a video to be detected, searching the optimal configuration based on the trained cascading scheme mapper, and completing the target detection task.
S5.1, building and running a trained cascading scheme mapper on NCNN;
Specifically, NCNN is a high-performance neural network forward-inference framework highly optimized for mobile devices, and it can complete video target detection tasks efficiently. This reduces the computational overhead of running deep-neural-network-based target detection models over large-scale video data sets on mobile devices with limited computing resources.
S5.2, inputting a video to be detected online, and carrying out feature extraction processing on the video to be detected to obtain scene features to be detected;
S5.3, outputting an optimal combination according to the scene characteristics to be detected, wherein the optimal combination comprises cascade configuration and a filtering strategy;
and S5.4, filtering the video to be detected according to the optimal combination, and completing the video target detection task by combining with the cascade configuration.
Specifically, video data is input and the features that describe scene changes are calculated. Since the selected features are key visual features, they can be calculated very quickly. The feature values are input into the cascade scheme mapper, which outputs the optimal combination, namely the optimal cascade configuration and the optimal filtering strategy. The corresponding frame filtering strategy is then executed according to the selected optimal combination: frames with a low degree of difference are filtered out directly, reusing the result of the previous frame and avoiding a model invocation. The optimal configuration is then obtained from the two-stage cascade configuration: the lightweight configuration is considered first, and the heavyweight configuration is considered only when the precision does not reach the standard. Finally, the video target detection task is completed using the parameter values of the optimal configuration: the frame rate, the resolution and the target detection model.
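The online behaviour described above (reuse results for filtered frames, try the lightweight configuration first, fall back to the heavyweight one) can be sketched as follows. The detectors are stand-in callables, and using the minimum detection confidence as the test for whether "the precision reaches the standard" is an assumption; the text does not say how precision is judged online.

```python
def detect_with_cascade(frame, light_detector, heavy_detector, min_confidence):
    """Run the lightweight detector first; fall back to the heavyweight one.

    Each detector returns a list of (label, confidence) pairs. An empty
    result or any low-confidence detection triggers the fallback.
    """
    boxes = light_detector(frame)
    if boxes and min(conf for _, conf in boxes) >= min_confidence:
        return boxes, "light"
    return heavy_detector(frame), "heavy"


def run_online(frames, is_redundant, light_detector, heavy_detector,
               min_confidence=0.6):
    """Process frames in order, reusing the last result for redundant frames."""
    results = []
    last_boxes = None
    for i, frame in enumerate(frames):
        if last_boxes is not None and is_redundant(frames[i - 1], frame):
            # Filtered frame: reuse the previous result, no model is invoked.
            results.append((last_boxes, "reused"))
            continue
        last_boxes, stage = detect_with_cascade(
            frame, light_detector, heavy_detector, min_confidence)
        results.append((last_boxes, stage))
    return results


# Hypothetical stand-ins: a frame is a single brightness value.
def light(frame):
    return [("car", 0.9)] if frame < 100 else [("car", 0.4)]


def heavy(frame):
    return [("car", 0.95)]


def redundant(prev, cur):
    return abs(prev - cur) < 5


results = run_online([10, 12, 150, 152], redundant, light, heavy)
stages = [stage for _, stage in results]
```

In this toy run the second and fourth frames are close to their predecessors and reuse earlier detections, while the third frame forces the heavyweight detector because the lightweight one is not confident enough.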
The invention optimizes the mainstream video target detection flow based on the neural network. The method can adapt to scene change, periodically adjust an optimal cascading combination scheme for completing target detection, and ensure balance of precision and calculation cost. The method mainly comprises the steps of analyzing scene changes, calculating scene characteristics, inputting the characteristics to a cascade combination mapper trained based on a neural network, directly outputting an optimal cascade combination, and efficiently selecting the cascade combination with the lowest cost reaching the precision requirement from a huge cascade combination space. The frame filtering model automatically selects a proper filtering strategy and a threshold value according to the scene, and can dynamically adjust filtering decisions according to the scene feature type, the query precision and the time-varying correlation among video contents. In addition, the method for measuring the inter-frame difference has high calculation speed, and the frame filtering model does not bring high cost. The cascade configuration will also be dynamically adjusted according to the scene.
In the offline stage, applying an optimized search algorithm greatly reduces the cascade combination search space and yields high-quality labeled data efficiently; these data are then used to train a neural-network-based cascade combination mapper. The search is fast because of the optimized search scheme. This step is performed offline, which prevents the cost of analyzing the optimal cascade combination from cancelling out the benefit of switching cascade combinations. In addition, the neural network learns the relation between scenes and optimal cascade combinations, so the method can adapt to most real-life scenes. In the online stage, the optimal cascade combination for video target detection is found quickly from the input scene feature information, which improves prediction precision, significantly reduces the resource consumption of video detection, and achieves low cost, high efficiency and high accuracy.
As shown in fig. 2, a search system for optimal cascade configuration of video target detection includes:
the feature calculation module is used for acquiring a video data set and carrying out feature calculation on the video data set to obtain scene feature combinations;
The filtering module is used for performing frame filtering operation on the video data set based on the inter-frame similarity to obtain a video data set with redundant frames filtered;
the searching module is used for acquiring an optimal cascading configuration scheme which meets the precision requirement and has the lowest cost in the video data set based on an optimized configuration searching algorithm, and constructing a training set with a label by combining scene characteristic combination;
The training module trains the cascading scheme mapper based on the training set to obtain a trained cascading scheme mapper;
the detection module is used for acquiring the video to be detected and searching the optimal configuration based on the trained cascading scheme mapper to complete the target detection task.
As a further preferred embodiment, the system further comprises:
And the pre-calculation module is used for counting the running cost of all the configuration schemes.
The content in the method embodiment is applicable to the system embodiment, the functions specifically realized by the system embodiment are the same as those of the method embodiment, and the achieved beneficial effects are the same as those of the method embodiment.
A searching device for optimal cascade configuration of video target detection comprises:
at least one processor;
at least one memory for storing at least one program;
The at least one program, when executed by the at least one processor, causes the at least one processor to implement a search method for optimal cascading configuration of video object detection as described above.
The content in the method embodiment is applicable to the embodiment of the device, and the functions specifically realized by the embodiment of the device are the same as those of the method embodiment, and the obtained beneficial effects are the same as those of the method embodiment.
A storage medium having stored therein instructions executable by a processor, characterized by: the processor-executable instructions, when executed by the processor, are for implementing a search method for video object detection optimal cascade configuration as described above.
The content in the method embodiment is applicable to the storage medium embodiment, and functions specifically implemented by the storage medium embodiment are the same as those of the method embodiment, and the achieved beneficial effects are the same as those of the method embodiment.
While the preferred embodiment of the present application has been described in detail, the application is not limited to the embodiment, and various equivalent modifications and substitutions can be made by those skilled in the art without departing from the spirit of the application, and these equivalent modifications and substitutions are intended to be included in the scope of the present application as defined in the appended claims.
Claims (7)
1. The searching method of the optimal cascade configuration for video target detection is characterized by comprising the following steps:
acquiring a video data set and performing feature calculation on the video data set to obtain scene feature combinations;
Performing frame filtering operation on the video data set based on the inter-frame similarity to obtain a video data set with redundant frames filtered;
Based on an optimized configuration search algorithm, an optimal cascading configuration scheme which meets the precision requirement and has the lowest cost in the video data set filtered with redundant frames is obtained, and a training set with labels is constructed by combining scene feature combinations;
Training the cascade scheme mapper by using a neural network based on the training set to obtain a trained cascade scheme mapper;
Acquiring a video to be detected, searching the optimal configuration based on the trained cascading scheme mapper, and completing a target detection task;
The step of obtaining the optimal cascading configuration scheme with minimum cost for filtering the video data set of the redundant frames based on the optimized configuration search algorithm and combining scene feature combination to construct a training set with labels specifically comprises the following steps:
Sequentially dividing videos in the video data set into preset lengths to obtain a fragment set;
Performing filtering strategy and cascading configuration analysis on each fragment in the fragment set to obtain a corresponding optimal cascading configuration scheme, wherein the optimal cascading configuration scheme comprises filtering strategy and cascading configuration;
generating tagged video data according to a scene feature combination corresponding to the video data set and an optimal cascading configuration scheme, and constructing a tagged training set;
The step of obtaining the video to be detected and searching the optimal configuration based on the trained cascading scheme mapper to complete the target detection task specifically comprises the following steps:
Building and running a trained cascading scheme mapper on NCNN;
inputting a video to be detected on line, and carrying out feature extraction processing on the video to be detected to obtain scene features to be detected;
Outputting optimal configuration according to the scene characteristics to be tested, wherein the optimal configuration comprises cascade configuration and filtering strategy;
and filtering the video to be detected according to the optimal configuration, and completing the video target detection task by combining the cascade configuration in the optimal configuration.
2. The method for searching for an optimal cascade configuration for video object detection according to claim 1, wherein before the step of obtaining a video data set and performing feature computation on the video data set to obtain a scene feature combination, further comprises:
And counting the running cost of all the configuration schemes.
3. The method for searching for an optimal cascade configuration for video object detection according to claim 2, wherein the steps of obtaining a video data set and performing feature computation on the video data set to obtain a scene feature combination specifically comprise:
acquiring a video data set, wherein the video data set comprises a detection data set, a network video and a shooting video;
performing scene analysis on the video data set, calculating the characteristics of each frame of picture in the video data set, preprocessing, and extracting to obtain scene characteristic combinations;
The scene feature combination includes a detection target number, a detection target speed, a detection target displacement, a scene offset, and a CNN feature.
4. The method for searching for an optimal cascade configuration for video object detection according to claim 3, wherein the preprocessing comprises normalization and extreme outlier removal by a 0-1 normalization method.
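As a hedged illustration of the preprocessing in claim 4, the following Python sketch removes extreme outliers and then rescales a feature to the range 0 to 1. The claim does not say how an extreme outlier is defined or in which order the two operations run; the interquartile-range fence with factor `k=3.0`, the order shown, and the function names are all assumptions made for this example.

```python
from statistics import quantiles
from typing import List, Sequence


def remove_extreme_outliers(values: Sequence[float], k: float = 3.0) -> List[float]:
    """Drop values outside [Q1 - k*IQR, Q3 + k*IQR] (assumed outlier rule)."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


def min_max_normalize(values: Sequence[float]) -> List[float]:
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    low, high = min(values), max(values)
    if high == low:
        # A constant feature carries no information; map it to 0.
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


# Example: per-frame target counts with one corrupted reading.
target_counts = list(range(1, 21)) + [1000]
cleaned = remove_extreme_outliers(target_counts)
normalized = min_max_normalize(cleaned)
```

Removing the outlier first matters here: if 1000 were kept, min-max scaling would compress all other values into a narrow band near 0.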
5. The method for searching for an optimal cascade configuration for video object detection according to claim 4, wherein the step of performing a frame filtering operation on the video data set based on the inter-frame similarity to obtain a video data set with redundant frames filtered out specifically comprises:
acquiring inter-frame difference algorithms;
comparing the inter-frame difference algorithms in terms of calculation cost, scene adaptability and filtering threshold interval;
and selecting an inter-frame difference algorithm to calculate the inter-frame similarity, and filtering the video frames to obtain a filtered video data set.
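The frame filtering of claim 5 could be sketched as below, purely as an example. The claim does not name a specific inter-frame difference algorithm, so mean absolute pixel difference is used as one common choice; comparing each frame against the last kept frame (rather than the immediately preceding one) is likewise an assumption, and `mean_abs_difference` and `filter_redundant_frames` are hypothetical names.

```python
from typing import Callable, List, Sequence

Frame = Sequence[float]  # hypothetical flattened grayscale frame


def mean_abs_difference(a: Frame, b: Frame) -> float:
    """Mean absolute pixel difference between two equally sized frames."""
    if len(a) != len(b):
        raise ValueError("frames must have the same number of pixels")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def filter_redundant_frames(
    frames: Sequence[Frame],
    threshold: float,
    difference: Callable[[Frame, Frame], float] = mean_abs_difference,
) -> List[Frame]:
    """Keep a frame only if it differs from the last kept frame by more than threshold."""
    kept: List[Frame] = []
    for frame in frames:
        if not kept or difference(kept[-1], frame) > threshold:
            kept.append(frame)
    return kept
```

Because the difference measure is a parameter, several candidate algorithms can be run through the same filter and compared on cost and on how many frames each one retains.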
6. The method for searching for an optimal cascade configuration for video object detection according to claim 5, wherein the step of training the cascade scheme mapper using a neural network based on the training set to obtain the trained cascade scheme mapper specifically comprises:
based on the training set, training the cascade scheme mapper with the scene feature combination as input and the optimal cascade scheme as output;
plotting the accuracy curve during training, tuning the cascade scheme mapper, and obtaining the trained cascade scheme mapper once the accuracy is judged to reach a preset value.
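A minimal sketch of the training loop in claim 6 follows, under stated assumptions. The patent does not disclose the network architecture, so a single-layer softmax classifier trained by stochastic gradient descent stands in for the neural network; `SchemeMapper`, `train_mapper`, the learning rate and the toy samples are all hypothetical. The returned `curve` holds the per-epoch accuracy that would be plotted, and training stops once the preset accuracy is reached.

```python
import math
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], int]  # (scene feature combination, scheme index)


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


class SchemeMapper:
    """Stand-in model: single-layer softmax classifier over cascade schemes."""

    def __init__(self, n_features: int, n_schemes: int) -> None:
        # One weight row per scheme; the last entry of each row is the bias.
        self.weights = [[0.0] * (n_features + 1) for _ in range(n_schemes)]

    def probabilities(self, features: Sequence[float]) -> List[float]:
        x = list(features) + [1.0]
        return softmax([sum(w * v for w, v in zip(row, x)) for row in self.weights])

    def predict(self, features: Sequence[float]) -> int:
        probs = self.probabilities(features)
        return probs.index(max(probs))

    def train_epoch(self, samples: Sequence[Sample], lr: float) -> None:
        for features, label in samples:
            x = list(features) + [1.0]
            probs = self.probabilities(features)
            for c, row in enumerate(self.weights):
                # Gradient of cross-entropy loss with respect to logit c.
                grad = probs[c] - (1.0 if c == label else 0.0)
                for i, v in enumerate(x):
                    row[i] -= lr * grad * v


def accuracy(mapper: SchemeMapper, samples: Sequence[Sample]) -> float:
    return sum(mapper.predict(f) == y for f, y in samples) / len(samples)


def train_mapper(samples: Sequence[Sample], n_schemes: int, target: float = 0.95,
                 max_epochs: int = 500, lr: float = 0.5):
    """Train until accuracy reaches `target`; return the mapper and accuracy curve."""
    mapper = SchemeMapper(len(samples[0][0]), n_schemes)
    curve: List[float] = []
    for _ in range(max_epochs):
        mapper.train_epoch(samples, lr)
        curve.append(accuracy(mapper, samples))
        if curve[-1] >= target:
            break
    return mapper, curve


# Toy labeled training set: one normalized scene feature, two candidate schemes.
toy_samples = [([0.0], 0), ([0.1], 0), ([0.9], 1), ([1.0], 1)]
toy_mapper, toy_curve = train_mapper(toy_samples, n_schemes=2)
```

In practice accuracy would be measured on held-out data rather than on the training samples; the sketch reuses the training set only to stay short.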
7. A system for searching for an optimal cascade configuration for video object detection, characterized in that it is configured to perform the method for searching for an optimal cascade configuration for video object detection according to claim 1, and comprises:
a feature calculation module, configured to acquire a video data set and perform feature calculation on the video data set to obtain a scene feature combination;
a filtering module, configured to perform a frame filtering operation on the video data set based on inter-frame similarity to obtain a video data set with redundant frames filtered out;
a searching module, configured to acquire, based on an optimized configuration search algorithm, the optimal cascade configuration scheme that meets the precision requirement at the lowest cost on the video data set with redundant frames filtered out, and to construct a labeled training set in combination with the scene feature combination;
a training module, configured to train a cascade scheme mapper using a neural network based on the training set to obtain a trained cascade scheme mapper;
a detection module, configured to acquire a video to be detected and search for the optimal configuration based on the trained cascade scheme mapper to complete the target detection task.
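To make the searching module's goal concrete, the sketch below picks the lowest-cost configuration that meets a precision requirement. It is not the patent's optimized configuration search algorithm, whose details are not given in this section; it is a plain cost-ordered search that assumes running costs were measured beforehand, and `search_cheapest_config`, `costs` and `evaluate_accuracy` are hypothetical names.

```python
from typing import Callable, Dict, Optional


def search_cheapest_config(
    costs: Dict[str, float],
    evaluate_accuracy: Callable[[str], float],
    required_accuracy: float,
) -> Optional[str]:
    """Return the cheapest configuration whose accuracy meets the requirement.

    Configurations are tried in ascending cost order, so the first one that
    passes is the cheapest acceptable one and costlier ones are never evaluated.
    """
    for name in sorted(costs, key=costs.get):
        if evaluate_accuracy(name) >= required_accuracy:
            return name
    return None  # no configuration meets the requirement


# Hypothetical pre-measured running costs and accuracies on a filtered clip.
example_costs = {"small": 1.0, "small+medium": 2.5, "small+large": 6.0}
example_accuracy = {"small": 0.71, "small+medium": 0.88, "small+large": 0.93}
best = search_cheapest_config(example_costs, example_accuracy.get, 0.85)
```

Pairing `best` with the clip's scene feature combination would give one labeled sample for the training set.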
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210147889.0A CN114550036B (en) | 2022-02-17 | 2022-02-17 | Searching method and system for optimal cascade configuration of video target detection |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210147889.0A CN114550036B (en) | 2022-02-17 | 2022-02-17 | Searching method and system for optimal cascade configuration of video target detection |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114550036A (en) | 2022-05-27 |
CN114550036B (en) | 2024-08-16 |
Family ID: 81675162
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210147889.0A Active CN114550036B (en) | 2022-02-17 | 2022-02-17 | Searching method and system for optimal cascade configuration of video target detection |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114550036B (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110991658A (en) * | 2019-11-28 | 2020-04-10 | 重庆紫光华山智安科技有限公司 | Model training method and device, electronic equipment and computer readable storage medium |
CN112418055A (en) * | 2020-11-18 | 2021-02-26 | 东方通信股份有限公司 | Scheduling method based on video analysis and personnel trajectory tracking method |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114072809A (en) * | 2019-09-18 | 2022-02-18 | 谷歌有限责任公司 | Small and fast video processing network via neural architectural search |
CN113869521A (en) * | 2020-06-30 | 2021-12-31 | 华为技术有限公司 | Method, device, computing equipment and storage medium for constructing prediction model |
CN113283426B (en) * | 2021-04-30 | 2024-07-26 | 南京大学 | Embedded target detection model generation method based on multi-target neural network search |
Also Published As
Publication number | Publication date |
---|---|
CN114550036A (en) | 2022-05-27 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN109711228B (en) | Image processing method and device for realizing image recognition and electronic equipment | |
CN111968150B (en) | Weak surveillance video target segmentation method based on full convolution neural network | |
CN113361645B (en) | Target detection model construction method and system based on meta learning and knowledge memory | |
CN113128478B (en) | Model training method, pedestrian analysis method, device, equipment and storage medium | |
Xu et al. | Mh-detr: Video moment and highlight detection with cross-modal transformer | |
CN113569755B (en) | Time sequence action positioning method, system, equipment and medium based on dual relation network | |
CN114663798B (en) | Single-step video content identification method based on reinforcement learning | |
CN110781818B (en) | Video classification method, model training method, device and equipment | |
CN113037783A (en) | Abnormal behavior detection method and system | |
CN115587335A (en) | Training method of abnormal value detection model, abnormal value detection method and system | |
CN114492601A (en) | Resource classification model training method and device, electronic equipment and storage medium | |
CN113936175A (en) | Method and system for identifying events in video | |
Karim et al. | Real-time weakly supervised video anomaly detection | |
CN110147724B (en) | Method, apparatus, device, and medium for detecting text region in video | |
CN116599683A (en) | Malicious traffic detection method, system, device and storage medium | |
CN114550036B (en) | Searching method and system for optimal cascade configuration of video target detection | |
CN110348509B (en) | Method, device and equipment for adjusting data augmentation parameters and storage medium | |
CN117218382A (en) | Unmanned system large-span shuttle multi-camera track tracking and identifying method | |
CN117576648A (en) | Automatic driving scene mining method and device, electronic equipment and storage medium | |
Li et al. | Siamese global location-aware network for visual object tracking | |
CN115019342B (en) | Endangered animal target detection method based on class relation reasoning | |
CN112487927B (en) | Method and system for realizing indoor scene recognition based on object associated attention | |
CN111476131B (en) | Video processing method and device | |
Ai et al. | Analysis of deep learning object detection methods | |
CN113033397A (en) | Target tracking method, device, equipment, medium and program product |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||