CN118470645A - Visual detection-based intelligent pen test monitoring system and method - Google Patents
- Publication number
- CN118470645A (application number CN202410743027.3A)
- Authority
- CN
- China
- Prior art keywords
- examinee
- examination
- algorithm
- check
- behaviors
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/52—Surveillance or monitoring of activities, e.g. for recognising suspicious objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/269—Analysis of motion using gradient-based methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/30—Determination of transform parameters for the alignment of images, i.e. image registration
- G06T7/33—Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/52—Surveillance or monitoring of activities, e.g. for recognising suspicious objects
- G06V20/53—Recognition of crowd images, e.g. recognition of crowd congestion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/161—Detection; Localisation; Normalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/172—Classification, e.g. identification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
Abstract
The invention belongs to the technical field of computer vision and intelligent monitoring, and particularly relates to a visual detection-based intelligent pen test monitoring system and method. The technical problems solved by the invention mainly comprise efficient verification of examinee identity, automatic recording of examination seat information, and real-time detection of cheating behaviors. The sign-in module performs examinee identity verification using a face recognition algorithm; the examinee position recording module automatically records each examinee's examination seat from the check-in position and the three-dimensional position information captured by the 3D depth camera; the visual anti-cheating module combines an instance segmentation algorithm with a video understanding model to detect prohibited objects, abnormal actions and out-of-bounds behaviors in real time; and the examination report generation module automatically generates an analysis report from the check-in and cheating-detection data. By introducing advanced computer vision technology, the invention realizes intelligent monitoring of the whole examination process, improves the efficiency and fairness of examination management, and provides strong technical support for educational examinations.
Description
Technical Field
The invention relates to a visual detection-based intelligent pen test monitoring system and method, and belongs to the technical field of artificial-intelligence visual detection.
Background
With the rapid development of information technology, unmanned and intelligent examination monitoring systems are increasingly becoming an important means of modernizing examination management. Traditional invigilation relies mainly on manual inspection and video monitoring, which suffers from high labor cost, low efficiency and strong subjectivity. In addition, key links such as examinee identity verification, examination-room order maintenance and cheating-behavior identification are difficult to enforce thoroughly. Therefore, an intelligent examination monitoring system based on artificial intelligence technology is needed to realize full-process, blind-spot-free and highly efficient supervision of the examination process through cutting-edge algorithms such as computer vision and deep learning.
Disclosure of Invention
The invention aims to provide an intelligent examination monitoring system and method based on deep learning. By arranging a panoramic camera and a 3D depth camera in the examination room and applying deep learning algorithms such as face recognition, instance segmentation, pose estimation and video understanding, the system realizes examinee identity verification, examination seat recording and real-time detection of cheating behaviors, finally generates an examination monitoring report automatically, and effectively improves the informatization and intelligence levels of examination management.
1) A visual detection-based intelligent pen test monitoring system comprises an examination room panoramic camera, an examination room 3D depth camera, a monitoring server, storage equipment and a processing system. Wherein:
the examination room panoramic cameras are arranged at a plurality of positions in the examination room and are used for monitoring the examination room environment in an omnibearing manner without dead angles;
the examination room 3D depth camera is arranged at the entrance of the examination room and is used for collecting face data of a test person for identity verification and assisting examination room environment illegal object examination;
The monitoring server is connected with the panoramic camera and the 3D depth camera and is responsible for receiving and transmitting real-time video stream data;
the storage device is used for persistence of instruction set programs, deep learning training models and related parameters required by the operation of the storage system;
The processing system is connected with the monitoring server and the storage device, and is internally provided with a check-in module, an examinee position recording module, a visual anti-cheating module and an examination report generating module.
The sign-in module invokes the FaceNet face recognition algorithm and performs identity verification on the examinee face data acquired by the 3D depth camera.
And the examinee position recording module is used for matching and associating the examinee with the corresponding examination position according to the check-in position of the examinee, the three-dimensional position of the examinee captured by the panoramic camera and the seating condition of the seat.
The visual anti-cheating module utilizes the Mask R-CNN instance segmentation algorithm and the Transformer-based video understanding model ViViT to control the panoramic cameras to scan the examination hall, detecting and recording in real time cheating behaviors such as carrying prohibited items, abnormal actions and out-of-bounds movement.
The examination report generation module automatically generates a statistical analysis type examination monitoring report based on the check-in record and the cheating detection data.
Further, the visual anti-cheating module may be further refined to:
detecting abnormal face-count situations in the examination room by using the FaceBoxes face detection algorithm;
detecting whether suspicious looking-around behaviors exist by using the EfficientPose head pose estimation algorithm;
judging whether the view of a camera is blocked by using a foreground detection algorithm based on background modeling;
and detecting whether the examinee's body exhibits obvious out-of-bounds displacement by adopting an optical flow estimation algorithm.
In the examinee check-in stage, the check-in module first acquires the examinee data collected by the 3D depth camera, detects whether the examinee carries prohibited items by utilizing the Mask R-CNN instance segmentation algorithm, and simultaneously counts the total number of examinees who have completed check-in. The check-in statistics are then cross-checked against the head-count information captured by the panoramic cameras to ensure that the number of examinees is accurate.
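The cross check described above amounts to a tally comparison. A minimal illustrative sketch follows; the function name and return shape are assumptions, not taken from the patent:

```python
def cross_check_headcount(checked_in_ids, panoramic_face_count):
    """Cross-check the sign-in tally against the head count captured by the
    panoramic cameras; a nonzero difference flags an anomaly for review.
    Illustrative sketch -- names and return shape are assumptions."""
    n_checked_in = len(set(checked_in_ids))  # de-duplicate repeated sign-ins
    diff = panoramic_face_count - n_checked_in
    return diff == 0, diff
```

For example, three checked-in examinees against four detected faces yields `(False, 1)`, prompting a recount.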
The invention simultaneously provides an examination monitoring method based on the system, which mainly comprises the following steps:
(1) Acquiring face data of an examinee entering an examination room through a 3D depth camera, and verifying identity information of the examinee by using a face recognition algorithm;
(2) Recording examination position information corresponding to an examinee according to the check-in position of the examinee, the three-dimensional position of the examinee captured by the panoramic camera and the seating condition of the seat;
(3) Controlling the panoramic camera to continuously scan the examination room by using an image instance segmentation algorithm and a video understanding model, and detecting cheating behaviors such as prohibited objects, abnormal actions and out-of-bounds behaviors in real time;
(4) Based on the check-in records of the examinees and the cheating behavior detection data, comprehensive and fine examination monitoring analysis reports are automatically generated.
In detecting cheating behavior, the method further refines the following steps:
Detecting abnormal situations of the number of faces in the examination room by using a face detection algorithm;
Detecting whether the examinee frequently looks around or suspiciously turns the body by using a head pose estimation algorithm;
Judging whether the field of view of the monitoring camera is deliberately blocked or not by using a foreground detection algorithm;
And detecting whether the examinee's body exhibits obvious out-of-seat, out-of-bounds displacement by adopting an optical flow estimation algorithm.
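The head-pose criterion in the steps above can be sketched as a simple threshold on per-frame yaw angles from a pose estimator such as EfficientPose. The 35-degree limit and 3-event count below are illustrative assumptions, not values specified by the patent:

```python
def looking_around_flag(yaw_angles_deg, yaw_limit=35.0, min_events=3):
    """Flag frequent looking-around from a sequence of per-frame head yaw
    angles (degrees). Counts frames where |yaw| exceeds the limit; a count
    at or above min_events marks the behavior as suspicious."""
    events = sum(1 for yaw in yaw_angles_deg if abs(yaw) > yaw_limit)
    return events >= min_events, events
```

A real deployment would smooth the angle sequence over time before thresholding to suppress momentary head turns.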
In addition, the technical details of the examinee position recording module are as follows:
The feature point detection and descriptor extraction unit performs feature point detection and descriptor extraction on the image acquired by each panoramic camera using the ORB algorithm, in preparation for subsequent image registration;
the image registration unit performs spatial alignment of the multi-view images acquired by different panoramic cameras using the random sample consensus (RANSAC) algorithm and pre-calibrated camera intrinsic parameters;
the three-dimensional reconstruction unit inputs the registered multi-view images into MVSNet, a deep-learning-based multi-view three-dimensional reconstruction network, to perform realistic three-dimensional reconstruction of the examination-room environment;
the three-dimensional positioning unit first acquires the two-dimensional check-in position coordinates recorded during examinee check-in, then projects them into the coordinate system of the three-dimensionally reconstructed examination-room environment to obtain the corresponding three-dimensional position coordinates of the examinee, and finally matches and associates the examinee's three-dimensional position with the examination seat.
In the feature point detection and descriptor extraction unit, ORB is a fast binary feature description algorithm that is more efficient than traditional algorithms such as SIFT and SURF. The RANSAC algorithm of the image registration unit is a robust parameter estimation algorithm that can effectively remove wrongly matched feature point pairs. MVSNet of the three-dimensional reconstruction unit is a deep-learning-based multi-view three-dimensional reconstruction framework that directly predicts depth maps from multi-view images through a convolutional neural network and performs point-cloud stitching; its reconstruction quality is superior to traditional algorithms such as SfM and PMVS.
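As a toy illustration of the robust estimation the registration unit relies on, the sketch below runs RANSAC on a set of putative point matches, here for a pure 2-D translation rather than the full homography or pose the patent's unit would estimate; the function name, tolerance and iteration count are assumptions:

```python
import numpy as np

def ransac_translation(src, dst, n_iters=200, tol=2.0, seed=0):
    """Estimate a 2-D translation between matched feature points while
    rejecting wrong correspondences. src/dst are (N, 2) arrays of putative
    matches (e.g. from ORB descriptor matching), possibly with outliers."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(src), dtype=bool)
    for _ in range(n_iters):
        i = rng.integers(len(src))               # minimal sample: one match
        t = dst[i] - src[i]                      # translation hypothesis
        resid = np.linalg.norm(dst - (src + t), axis=1)
        inliers = resid < tol
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # refit on the consensus set for the final estimate
    t = (dst[best_inliers] - src[best_inliers]).mean(axis=0)
    return t, best_inliers
```

The same hypothesize-score-refit loop generalizes to homography or PnP estimation with larger minimal samples.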
Drawings
FIG. 1 is a schematic structural diagram of a visual detection-based intelligent pen test monitoring system;
FIG. 2 is a flowchart of a visual detection-based intelligent pen test monitoring method;
FIG. 3 is a flow chart of a visual anti-cheating module.
Detailed Description
In order to make the above objects, features and advantages of the present application more comprehensible, embodiments accompanied with figures are described in detail below. It should be noted that, without conflict, the embodiments of the present application and features in the embodiments may be combined with each other.
Fig. 1 is a schematic diagram of the overall architecture of a visual detection-based intelligent pen test monitoring system according to the present invention. The system mainly comprises the following parts:
The examination room panoramic cameras 1 adopt a plurality of geometrically calibrated high-definition cameras mounted in different areas of the examination room; their views are combined and stitched to obtain a 360-degree panoramic picture covering the whole examination room. Camera placement must avoid blind spots between cameras while minimizing occlusion of the examinees' lines of sight.
The examination room 3D depth camera 2 adopts an accurately calibrated structured-light depth camera arranged at a suitable position at the entrance of the examination room. The 3D depth camera acquires color-image and depth-image data of examinees entering the room for subsequent identity verification and prohibited-item detection.
The monitoring server 3 is the data processing center of the whole system. It is connected to the panoramic cameras 1 and the 3D depth camera 2 through a wired network and is responsible for receiving, storing and forwarding video stream data in real time. The monitoring server 3 has a built-in video decoding module that decodes the raw camera video format into standard RGB and RGB-D frame sequences, providing a data interface for subsequent image processing. For the security of video monitoring, the monitoring server 3 transmits data using an encrypted communication protocol such as SSL/TLS and stores data on a dedicated hardware-encrypted disk to prevent unauthorized access.
The storage device 4 adopts a high-capacity disk array for durable storage of programs, models and data during system operation. The program part mainly comprises basic software such as the operating system, middleware and database management system, together with business modules such as check-in and anti-cheating. The model part contains the model parameter files required by the various deep learning algorithms, such as face recognition and human pose estimation. The data part comprises structured data such as the examinee information base, the examination-room seat library and historical examination records, as well as unstructured data such as video frames and feature vectors collected in real time.
The processing system 5 is the core of the whole intelligent monitoring system, responsible for coordinating and scheduling the software and hardware modules to realize end-to-end processing from raw video to a structured examination report. The processing system 5 adopts a high-performance graphics workstation equipped with a multi-core CPU and GPU, providing strong parallel computing capability. The system architecture uses a multi-process, multi-threaded design and can execute real-time analysis tasks on multiple video streams simultaneously. The main software modules are as follows:
5.1 The check-in module 51 calls the user information base before the examination starts to acquire the names and face sample data of registered examinees. For each examinee entering the examination room, the check-in module 51 acquires the examinee's facial image from the 3D depth camera 2, invokes the pre-trained FaceNet convolutional neural network model to extract a 128-dimensional facial feature vector, and judges through a similarity threshold whether the person is a registered examinee. If verification passes, the examinee's check-in state is updated, and information such as check-in time and position is recorded.
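The similarity-threshold decision above can be sketched as a nearest-neighbor search over enrolled embeddings. This is a FaceNet-style comparison on L2-normalised vectors; the threshold value is illustrative, not FaceNet's published margin, and the toy test uses 3-D vectors in place of the 128-D embeddings:

```python
import numpy as np

def verify_identity(probe_emb, enrolled_embs, threshold=0.9):
    """Match a probe face embedding against enrolled examinee embeddings by
    squared Euclidean distance on L2-normalised vectors.
    Returns (matched_id or None, best_distance)."""
    probe = probe_emb / np.linalg.norm(probe_emb)
    best_id, best_d = None, float("inf")
    for exam_id, emb in enrolled_embs.items():
        emb = emb / np.linalg.norm(emb)
        d = float(np.sum((probe - emb) ** 2))
        if d < best_d:
            best_id, best_d = exam_id, d
    return (best_id if best_d < threshold else None), best_d
```

Returning `None` when the best distance still exceeds the threshold is what rejects unregistered persons at the door.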
5.2 The examinee position recording module 52 continuously locates the spatial position of each examinee after seating. The module combines data acquired by the panoramic cameras 1 and the depth camera to obtain the pixel coordinates and depth value of each examinee. For each panoramic frame, the position recording module first extracts image feature points and descriptors with the ORB operator, then estimates the relative poses between cameras by matching feature points across camera viewpoints and combining the RANSAC and PnP algorithms. On this basis, the examinee's pixel position is extracted and back-projected into the three-dimensional space of the examination room to obtain the examinee's XYZ coordinates. The module also accesses the examination-room seat library to acquire the three-dimensional reference coordinates of each seat. By matching each examinee's real-time coordinates with the seat reference coordinates, the module continually records and updates the "examinee-seat" mapping table.
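The back-projection and seat association steps can be sketched with the pinhole camera model. The intrinsic values and the 0.8 m association gate below are illustrative assumptions:

```python
import numpy as np

def pixel_to_xyz(u, v, depth, fx, fy, cx, cy):
    """Back-project pixel (u, v) with depth (metres) into camera-frame XYZ
    using the pinhole model and pre-calibrated intrinsics."""
    return np.array([(u - cx) * depth / fx, (v - cy) * depth / fy, depth])

def nearest_seat(xyz, seat_coords, max_dist=0.8):
    """Associate an examinee's 3-D position with the closest seat reference
    coordinate from the seat library; None if no seat is within max_dist."""
    ids = list(seat_coords)
    dists = [float(np.linalg.norm(xyz - seat_coords[s])) for s in ids]
    i = int(np.argmin(dists))
    return ids[i] if dists[i] <= max_dist else None
```

Rejecting positions farther than the gate from every seat is what later lets the anti-cheating module treat a no-match as potential out-of-seat movement.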
5.3 The visual anti-cheating module 53 is the core functional module of the whole system and adopts various cutting-edge deep learning algorithms to identify and record abnormal behaviors during the examination. The module receives the video frame sequence from the panoramic cameras, first detects prohibited items such as mobile phones and printed materials in each frame based on the Mask R-CNN instance segmentation algorithm, and filters false detections through prior knowledge such as area and aspect ratio. It then uses the high-precision ViViT video classification model to perform semantic understanding of examinee behavior over consecutive frames, identifying typical cheating actions such as passing notes, whispering to neighbors and looking around. Meanwhile, the visual anti-cheating module 53 periodically calls the examinee position recording module 52 to acquire the latest examinee spatial coordinates and judges whether obvious out-of-bounds behavior has occurred by comparison with the seat reference coordinates. To further improve the accuracy of cheating identification, the module is also assisted by algorithms such as FaceBoxes face detection, EfficientPose human pose estimation and optical-flow motion estimation, analyzing the abnormality of examinee behavior through multi-modal fusion. Once suspected cheating behavior is detected, the module immediately records the relevant video clips, extracts key-frame screenshots, labels the cheating type, and simultaneously sends real-time alarm information to the invigilators.
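The area and aspect-ratio filtering applied after instance segmentation can be sketched as below. The detection record format and all threshold values are assumptions for illustration, not figures from the patent:

```python
def filter_detections(dets, min_area=400, max_aspect=4.0, min_score=0.5):
    """Suppress implausible prohibited-item detections using area,
    aspect-ratio and score priors. Each det is a dict with keys
    'box' (x1, y1, x2, y2), 'score' and 'label'."""
    kept = []
    for d in dets:
        x1, y1, x2, y2 = d["box"]
        w, h = x2 - x1, y2 - y1
        if w <= 0 or h <= 0 or d["score"] < min_score:
            continue
        aspect = max(w / h, h / w)              # orientation-independent ratio
        if w * h >= min_area and aspect <= max_aspect:
            kept.append(d)
    return kept
```

Such priors cheaply remove speckle-sized or extremely elongated masks before the more expensive video-level analysis runs.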
5.4 The examination report generating module 54 automatically generates an invigilation report after the examination ends. The module first aggregates data from the check-in module and the anti-cheating module, including examinee check-in information, seat correspondences and cheating-behavior records. Based on these statistics, the module generates a standard electronic invigilation report according to a given template, covering overall examination-room statistics, statistics of cheating-behavior types and frequencies, a list of key suspicious examinees, detailed cheating-evidence attachments, and the like. Through the invigilation report, examination staff can comprehensively and intuitively grasp examination-room order and examination discipline, providing an objective basis for subsequent handling of violations.
As shown in FIG. 2, the workflow of this embodiment is as follows. Before the examination starts, the invigilator powers on the intelligent monitoring devices in the examination room in advance, and video stream data is fed to the monitoring server 3 through the wired network. Each examinee entering the examination room has face data collected by the 3D depth camera 2 in turn; face verification is performed by the check-in module 51, and the check-in state and spatio-temporal information are recorded. After the examinee sits, the examinee position recording module 52 begins tracking and locating the seat and builds an accurate examinee seating map. During the examination, the visual anti-cheating module 53 performs real-time semantic understanding of the video stream from the panoramic cameras 1, promptly discovers and records suspected cheating behaviors through various abnormal-behavior detection algorithms, and reminds the invigilator to check. After the examination is completed, the examination report generating module 54 aggregates the structured information of each module and automatically generates a complete, standard electronic invigilation report. By deploying the intelligent monitoring system, the timeliness and accuracy of examination-site management can be remarkably improved, and examination cheating can be deterred to the maximum extent.
Fig. 3 shows the internal architecture of the visual anti-cheating module 53 in this embodiment. For each video frame to be detected, a semantic feature map is first extracted through a backbone feature extraction network and then fed into four abnormal-behavior detection branches:
(1) Item detection branch: the Mask R-CNN instance segmentation model labels each frame pixel by pixel, identifying prohibited-item regions such as mobile phones, paper notes and printed materials.
(2) Face detection and counting branch: the lightweight FaceBoxes algorithm detects faces, and the number of examinees in the current examination room is counted through a convolutional clustering module. If the number of detected faces is significantly greater than the number of examinees, an abnormal alarm is raised.
(3) Human pose detection branch: the EfficientPose human skeleton key-point detection model obtains two-dimensional coordinates of body parts such as the examinee's head, shoulders and arms. By analyzing the head and shoulder poses, it judges whether actions such as frequently turning around to look or talking have occurred.
(4) Examinee region tracking branch: estimates the motion displacement of the examinee's body region through optical flow and judges whether obvious out-of-bounds, out-of-seat behavior has occurred.
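The optical-flow branch's displacement test can be sketched by accumulating the mean flow inside the examinee's body-region mask. The flow fields would come from an optical-flow estimator run on consecutive frames; the pixel threshold here is an illustrative assumption:

```python
import numpy as np

def out_of_seat_flag(flow_history, region_mask, thresh_px=40.0):
    """Accumulate the mean optical-flow displacement inside a body-region
    mask over successive frames; a cumulative magnitude above thresh_px
    suggests out-of-seat, out-of-bounds movement.
    flow_history: list of (H, W, 2) per-frame flow fields."""
    total = np.zeros(2)
    for flow in flow_history:
        total += flow[region_mask].mean(axis=0)   # mean (dx, dy) in the region
    return float(np.linalg.norm(total)) > thresh_px, total
```

Accumulating over a window, rather than thresholding single frames, distinguishes sustained drift out of the seat from brief fidgeting.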
The parameters of the four branches are obtained through independent training. Finally, the abnormal feature maps of the branches are aggregated by a multi-path fusion module and input into the ViViT video understanding backbone network. As a powerful video classifier, ViViT can extract semantic information about examinee behavior along both the spatial and temporal dimensions. Through pre-training on a large-scale dataset, ViViT can accurately identify a variety of typical examination cheating actions. The system determines whether to trigger an abnormal alarm based on the ViViT confidence score and stores the relevant violation segments in the examination exception library.
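The final alarm decision over the fused branches can be sketched as a simple rule. The rule shape and both thresholds are illustrative assumptions; the patent specifies only that the ViViT confidence score drives the alarm:

```python
def alarm_decision(branch_scores, vivit_confidence,
                   conf_thresh=0.7, branch_thresh=0.95):
    """Trigger an alarm when the ViViT classifier is confident a cheating
    action occurred, or when any single detection branch is near-certain.
    branch_scores: dict of per-branch anomaly scores in [0, 1]."""
    if vivit_confidence >= conf_thresh:
        return True
    return any(s >= branch_thresh for s in branch_scores.values())
```

Keeping a per-branch escape hatch lets an unambiguous signal (e.g. a phone detected at high confidence) raise an alarm even when the video classifier is unsure.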
By adopting this multi-modal fusion architecture, examinee behavior can be analyzed from multiple angles, including objects, faces, human poses and spatio-temporal behavior, avoiding the false alarms and missed alarms of any single modality and thereby effectively improving the precision and recall of cheating identification.
Claims (9)
1. A visual detection-based intelligent pen test monitoring system, characterized in that the system comprises:
the examination room panoramic cameras are arranged at a plurality of positions in the examination room and are used for monitoring the environment of the examination room;
The examination room 3D depth camera is arranged at the entrance of the examination room and is used for carrying out identity verification of an examinee and assisting in environmental security check;
the monitoring server is used for being connected with the panoramic camera and the 3D depth camera to obtain a real-time video stream;
the storage device is used for storing the instruction set, the training model and related parameters;
A processing system for interfacing with the monitoring server and a storage device, the processing system configured to execute the set of instructions to implement:
The sign-in module is used for carrying out identity verification on the examinee by using the FaceNet face recognition algorithm on data acquired by the 3D depth camera;
The examinee position recording module is used for recording the examination position corresponding to the examinee according to the check-in position, the 3D position of the examinee captured by the panoramic camera and the seating condition;
The visual anti-cheating module is used for controlling the panoramic camera to scan the examination hall by utilizing the Mask R-CNN instance segmentation algorithm and the Transformer-based video understanding model ViViT, detecting and recording in real time cheating behaviors such as prohibited objects, illegal actions and out-of-bounds behaviors;
And the examination report generation module is used for generating an examination monitoring analysis report based on the check-in record and the detection data.
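The report generation module described in claim 1 can be sketched as a join of check-in records with detection events into a per-examinee summary. All names here (`generate_report`, the record shapes) are hypothetical illustrations, not the patented implementation.

```python
# Hypothetical sketch of the examination-report module: join check-in
# records with detection events into a per-examinee monitoring summary.
from collections import defaultdict

def generate_report(check_ins, detections):
    """check_ins: {examinee_id: seat}; detections: [(examinee_id, event), ...]."""
    events = defaultdict(list)
    for examinee_id, event in detections:
        events[examinee_id].append(event)
    return {
        eid: {"seat": seat, "events": events.get(eid, [])}
        for eid, seat in check_ins.items()
    }

report = generate_report({"A01": "row1-seat3"}, [("A01", "out-of-range")])
assert report["A01"]["events"] == ["out-of-range"]
```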
2. The system of claim 1, wherein the visual anti-cheating module further comprises:
detecting abnormalities in the number of examinee faces using the FaceBoxes face detection algorithm;
detecting whether an examinee is looking around using the EfficientPose head pose estimation algorithm;
judging whether a camera is occluded using a background-modelling foreground detection algorithm;
and detecting out-of-range displacement behaviors using an optical flow estimation algorithm.
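The camera-occlusion check in claim 2 can be illustrated with a simplified stand-in for background-modelling foreground detection: keep a reference background frame and flag the camera as occluded when almost the entire frame differs from it. The function name and thresholds are assumptions for illustration only.

```python
import numpy as np

# Simplified stand-in for the background-modelling foreground detector:
# compare each frame with a reference background and flag the camera as
# occluded when nearly the whole frame registers as foreground.
def is_occluded(background, frame, diff_thresh=30, ratio_thresh=0.9):
    foreground = np.abs(frame.astype(int) - background.astype(int)) > diff_thresh
    return foreground.mean() > ratio_thresh

bg = np.full((120, 160), 100, dtype=np.uint8)
normal = bg.copy(); normal[:20, :20] = 200   # small moving object: not occlusion
blocked = np.zeros_like(bg)                  # lens covered: whole frame changes
assert not is_occluded(bg, normal)
assert is_occluded(bg, blocked)
```

A production system would use an adaptive model (e.g. a mixture-of-Gaussians background subtractor) rather than a single static reference frame.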
3. The system of claim 1 or 2, wherein the check-in module further comprises:
acquiring data collected by the 3D depth camera, detecting prohibited objects using the Mask R-CNN instance segmentation algorithm and counting the number of checked-in examinees;
and checking the check-in data against the data captured by the panoramic cameras to confirm the total number of examinees.
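The cross-check in claim 3 amounts to reconciling two independent counts: the depth-camera check-in count and the head count from the panoramic cameras. A minimal hypothetical sketch, with `reconcile_counts` and its return shape being assumptions:

```python
# Hypothetical cross-check between the depth-camera check-in count and the
# panoramic-camera head count; a mismatch is flagged for the invigilator
# instead of silently trusting either sensor.
def reconcile_counts(checked_in, panoramic_count):
    if checked_in == panoramic_count:
        return {"confirmed": checked_in, "mismatch": False}
    return {"confirmed": min(checked_in, panoramic_count), "mismatch": True}

assert reconcile_counts(30, 30) == {"confirmed": 30, "mismatch": False}
assert reconcile_counts(30, 31)["mismatch"] is True
```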
4. A visual detection-based intelligent pen test monitoring method, comprising the following steps:
collecting data of examinees entering the examination room with the 3D depth camera, and verifying examinee identity using a face recognition algorithm;
recording each examinee's examination seat according to the check-in position, the 3D position captured by the panoramic cameras, and the seating situation;
controlling the panoramic cameras to scan the examination room using an instance segmentation algorithm and a video understanding model, and detecting cheating behaviors such as prohibited objects, prohibited actions and out-of-range behaviors;
and generating an examination monitoring analysis report based on the check-in records and the detection data.
5. The method of claim 4, wherein detecting the cheating behavior further comprises the steps of:
detecting abnormalities in the number of examinee faces using a face detection algorithm;
detecting whether an examinee is looking around using a head pose estimation algorithm;
judging whether a camera is occluded using a foreground detection algorithm;
and detecting out-of-range displacement behaviors using an optical flow estimation algorithm.
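The out-of-range displacement check in claim 5 can be illustrated with a simplified stand-in for optical flow: track the centroid of the examinee's pixel region between frames and flag movement whose magnitude exceeds a seat-region threshold. Real optical flow (e.g. dense Farnebäck flow) estimates per-pixel motion; this centroid-shift version and its thresholds are assumptions chosen to keep the sketch self-contained.

```python
import numpy as np

# Simplified stand-in for the optical-flow displacement check: track the
# centroid of bright (examinee) pixels between frames and flag motion that
# exceeds the allowed in-seat displacement.
def centroid(frame, thresh=128):
    ys, xs = np.nonzero(frame > thresh)
    return np.array([ys.mean(), xs.mean()])

def out_of_range(prev, curr, max_shift=10.0):
    return np.linalg.norm(centroid(curr) - centroid(prev)) > max_shift

a = np.zeros((100, 100)); a[40:50, 40:50] = 255
b = np.zeros((100, 100)); b[40:50, 42:52] = 255   # small in-seat shift
c = np.zeros((100, 100)); c[40:50, 80:90] = 255   # large out-of-range move
assert not out_of_range(a, b)
assert out_of_range(a, c)
```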
6. The system of claim 1, wherein the examinee position recording module comprises:
a feature point detection and descriptor extraction unit, which performs feature point detection and descriptor extraction on the image acquired by each panoramic camera using the ORB algorithm;
an image registration unit, which performs multi-view registration of the images acquired by the different panoramic cameras using the RANSAC algorithm and known camera intrinsic parameters;
a three-dimensional reconstruction unit, which performs three-dimensional reconstruction of the examination room environment from the registered multi-view images based on the deep-learning-based MVSNet three-dimensional reconstruction algorithm;
and a three-dimensional positioning unit, which locates each examinee's three-dimensional coordinates in the reconstructed environment in combination with the examinee's check-in position, and matches and records the examinee's three-dimensional coordinates against the check-in position.
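The matching step of the three-dimensional positioning unit can be sketched as nearest-seat assignment: each reconstructed 3D examinee position is paired with the closest registered check-in seat coordinate. `match_seats` and the coordinate shapes are hypothetical names for illustration.

```python
import math

# Hypothetical matching step: assign each reconstructed 3D examinee
# position to the nearest registered check-in seat coordinate.
def match_seats(positions, seats):
    """positions: {examinee_id: (x, y, z)}; seats: {seat_id: (x, y, z)}."""
    return {
        eid: min(seats, key=lambda s: math.dist(pos, seats[s]))
        for eid, pos in positions.items()
    }

seats = {"S1": (0.0, 0.0, 0.0), "S2": (2.0, 0.0, 0.0)}
assert match_seats({"A01": (1.8, 0.1, 0.0)}, seats) == {"A01": "S2"}
```

A mismatch between an examinee's reconstructed position and their checked-in seat would then be surfaced as an anomaly for the invigilator.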
7. The system of claim 6, wherein the ORB algorithm in the feature point detection and descriptor extraction unit is the Oriented FAST and Rotated BRIEF algorithm.
8. The system of claim 6, wherein the RANSAC algorithm in the image registration unit is a random sample consensus algorithm.
9. The system of claim 6, wherein the MVSNet in the three-dimensional reconstruction unit is a deep-learning-based multi-view 3D reconstruction network.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410743027.3A CN118470645A (en) | 2024-06-11 | 2024-06-11 | Visual detection-based intelligent pen test monitoring system and method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN118470645A true CN118470645A (en) | 2024-08-09 |
Family
ID=92157780
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202410743027.3A Pending CN118470645A (en) | 2024-06-11 | 2024-06-11 | Visual detection-based intelligent pen test monitoring system and method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN118470645A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118646850A (en) * | 2024-08-15 | 2024-09-13 | 广州市简筱网络科技有限公司 | Examination room monitoring system and method based on artificial intelligence |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||