CN111209848B - A real-time fall detection method based on deep learning - Google Patents
- Publication number
- CN111209848B CN111209848B CN202010006573.0A CN202010006573A CN111209848B CN 111209848 B CN111209848 B CN 111209848B CN 202010006573 A CN202010006573 A CN 202010006573A CN 111209848 B CN111209848 B CN 111209848B
- Authority
- CN
- China
- Prior art keywords
- network
- human body
- stage
- detection
- confidence
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/103—Static body considered as a whole, e.g. static pedestrian or occupant recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/46—Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
- G06V10/462—Salient features, e.g. scale invariant feature transforms [SIFT]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/07—Target detection
Abstract
Description
Technical Field
The invention relates to a real-time fall detection method based on deep learning. By combining the OpenPose human pose estimation algorithm with the SVDD classification algorithm, a more accurate and efficient real-time fall detection algorithm is designed. The invention belongs to the technical fields of deep learning, human pose estimation, and image processing.
Background
In recent years, as China's population ages at an accelerating rate, the safety and health of the elderly have become topics of widespread social concern. According to a World Health Organization report, about 28-35% of people aged 65 and above fall each year, so it is essential to detect falls accurately and quickly and to notify the guardians or medical staff of the elderly. Many nursing homes are currently short of care staff and cannot detect threats to residents' physical safety in time, leaving major gaps in the supervision of the elderly.
In recent years, researchers have proposed a variety of automatic fall detection methods, falling into three categories: methods based on wearable devices, methods based on environmental sensors, and methods based on machine vision.
Wearable fall detection systems are not limited by space, but they require the monitored person to wear a device for long periods. Such devices are often cumbersome to install, users grow tired of wearing them, and elderly people whose memory has declined may simply forget to put them on. Fall detection based on environmental sensors requires nothing to be worn, covers a large detection area, and, once installed, can serve multiple people over the long term, making it suitable for homes, nursing homes, and similar settings. Its drawbacks are higher equipment requirements and cost than the other two approaches, as well as unstable accuracy.
With the development of artificial intelligence, pattern recognition, and related technologies, more and more machine-vision-based fall detection methods have been proposed, and surveillance systems have been widely deployed as an important technology in public safety. However, most current surveillance systems stop at manual monitoring of captured video and after-the-fact review of recordings, which consumes considerable manpower and material resources. A so-called intelligent surveillance system requires the computer to analyze and process the captured images in real time, to detect, identify, and track targets, and to analyze their behavior; when a target behaves abnormally, the system raises an alarm, saves the video data, and performs other follow-up operations. Combining intelligent surveillance with fall detection makes it possible to notify a guardian or call for help the moment an elderly person falls, greatly shortening the time from fall to treatment, reducing the risk of secondary injury, and lowering subsequent treatment costs. This is of great significance for improving the quality of life of the elderly and reducing the waste of public resources.
Summary of the Invention
The technical problem to be solved by the present invention is to improve the speed and accuracy of fall detection, addressing the high false detection rate and poor real-time performance of existing methods based on traditional image processing.
To solve the above technical problem, the present invention provides a real-time fall detection method based on deep learning.
The main steps of the fall detection algorithm are as follows:
Step 1: collect human body images with a camera or similar device.
Step 2: recognize human skeleton keypoints.
Step 2.1: extract features with the VGG-19 convolutional neural network and predict a heat map for each keypoint; the network iterates over multiple stages, with each stage's output serving as the next stage's input and refining the previous stage's results.
Step 2.2: add vector encodings for the predicted keypoints, combine the keypoints in the image, and connect the different body parts belonging to the same person.
Step 2.3: fine-tune the model on an image dataset via transfer learning so that it applies better to the target scenario.
Step 3: run the SSD-MobileNet object detection algorithm on the detected keypoint regions and remove the non-human parts.
Step 3.1: first pre-train the MobileNet network on the COCO dataset to generate a pre-trained model.
Step 3.2: use the data collected in step 2.3 to generate TFRecord files for transfer learning.
Step 3.3: feed the data generated in step 3.2 and the fused features of the pre-trained model from step 3.1 into the SSD network, so that the model converges after a short training period and detects well.
Step 4: classify the collected human skeleton keypoints with the SVDD classification algorithm.
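Viewed end to end, steps 1-4 form a per-frame pipeline. The sketch below shows that control flow in Python, with stub functions standing in for the three models; all function names, shapes, and return values are illustrative assumptions, not code from the patent:

```python
import numpy as np

def detect_keypoints(frame):
    """Stub for the OpenPose-style keypoint detector (step 2).
    Returns a (num_people, 18, 2) array of body-part coordinates."""
    return np.zeros((1, 18, 2))

def filter_human_regions(frame, keypoints):
    """Stub for the SSD-MobileNet person detector (step 3):
    keep only keypoint groups that fall inside a detected person box."""
    return keypoints

def is_fall(person_keypoints):
    """Stub for the SVDD one-class classifier (step 4):
    True means the posture lies outside the learned normal domain."""
    return False

def process_frame(frame):
    kps = detect_keypoints(frame)           # step 2: skeleton keypoints
    kps = filter_human_regions(frame, kps)  # step 3: drop non-human regions
    return [is_fall(p) for p in kps]        # step 4: fall / not fall per person

frame = np.zeros((480, 640, 3), dtype=np.uint8)  # step 1: one camera frame
print(process_frame(frame))  # [False] with the all-stub models
```

In a real deployment each stub would wrap the corresponding trained network, and a `True` entry in the result would trigger the alarm logic.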
Beneficial Effects
The advantages and innovations of this method are mainly the following:
(1) A fully automatic algorithmic pipeline requiring no, or minimal, human intervention, overcoming the manpower and material costs of traditional video surveillance.
(2) A human pose estimation pipeline combining the OpenPose and SSD-MobileNet algorithms. For keypoints detected by the OpenPose algorithm, the SSD-MobileNet object detector performs human target detection, filtering out non-human regions and reducing the algorithm's false detection rate.
(3) To address the imbalance of fall data samples, the SVDD algorithm is used for classification, with its parameters optimized by grid search and related methods. A normal domain is trained from positive samples, and fall behavior is detected as a departure from that domain.
(4) Combining (2) and (3) yields a robust, accurate real-time fall detection framework that draws on algorithms from object detection, image processing, and other fields, and is innovative in how these algorithms are applied.
Brief Description of the Drawings
The above and/or additional aspects and advantages of the present invention will become apparent and easy to understand from the following description of embodiments in conjunction with the accompanying drawings, in which:
Figure 1 is the overall fall detection flow chart of an embodiment of the present invention.
Figure 2 is the network structure of the feature extraction part of the neural network in an embodiment of the present invention.
Figure 3 shows human keypoint detection results in an embodiment of the present invention.
Figure 4 shows SVDD classification results in an embodiment of the present invention.
Figure 5 shows human fall detection results in an embodiment of the present invention.
Detailed Description
Embodiments of the present invention are described in detail below; examples of these embodiments are shown in the accompanying drawings, in which identical or similar reference numerals throughout denote identical or similar elements or elements with identical or similar functions. The embodiments described below with reference to the drawings are exemplary: they serve only to explain the present invention and shall not be construed as limiting it.
As shown in Figure 1, the present invention proposes a real-time fall detection algorithm based on deep learning; the specific steps are as follows:
Step 1: collect human body images with a camera or similar device.
Step 2: recognize human skeleton keypoints.
Step 2.1: extract features by convolution and predict a heat map for each keypoint; through multi-stage learning, each stage refines the results of the previous one.
As shown in Figure 2, this method uses the VGG-19 network for feature extraction. The feature maps are initialized from the first 10 layers of VGG-19 and fine-tuned, generating a set of feature maps $F$ that are input to the first stage of each branch. In the first stage, the network produces a set of confidence-map predictions $S^1 = \rho^1(F)$ and a set of part affinity fields $L^1 = \phi^1(F)$, where $\rho^1$ and $\phi^1$ are the CNNs used for stage-1 inference. In each subsequent stage, the predictions of both branches from the previous stage are concatenated with the original features $F$ and used to produce refined predictions:
$$S^t = \rho^t(F, S^{t-1}, L^{t-1}), \qquad L^t = \phi^t(F, S^{t-1}, L^{t-1}), \qquad t \ge 2,$$
where $\rho^t$ and $\phi^t$ are the predictors at stage $t$.
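The stage-wise refinement described above can be illustrated with a minimal numeric sketch. The `rho` and `phi` functions below are toy stand-ins for the two branch CNNs, and the feature shape and 6-stage loop are assumptions for the example, not the patent's network:

```python
import numpy as np

rng = np.random.default_rng(0)
F = rng.standard_normal((46, 46, 128))  # backbone features from the VGG-19 front layers

def rho(x):   # toy stand-in for the confidence-map branch CNN
    return x.mean(axis=-1, keepdims=True)

def phi(x):   # toy stand-in for the PAF branch CNN
    return x.max(axis=-1, keepdims=True)

# Stage 1: S^1 = rho^1(F), L^1 = phi^1(F)
S, L = rho(F), phi(F)

# Stages t >= 2: refine from the concatenation of F with the previous predictions
for t in range(2, 7):
    prev = np.concatenate([F, S, L], axis=-1)
    S, L = rho(prev), phi(prev)

print(S.shape, L.shape)  # (46, 46, 1) (46, 46, 1)
```

The point is only the data flow: each stage sees the original features plus both of the previous stage's outputs.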
Body parts may be confused with one another in the early stages of the network, but with later iterations the keypoint predictions become increasingly accurate. The two branches of the network have two loss functions, one per branch; an L2 loss between the predicted and ground-truth maps and fields prevents overfitting, and the loss is computed with spatial weighting.
The per-stage L2 losses are
$$f_S^t = \sum_{j=1}^{J} \sum_{\mathbf{p}} W(\mathbf{p}) \left\lVert S_j^t(\mathbf{p}) - S_j^*(\mathbf{p}) \right\rVert_2^2, \qquad f_L^t = \sum_{c=1}^{C} \sum_{\mathbf{p}} W(\mathbf{p}) \left\lVert L_c^t(\mathbf{p}) - L_c^*(\mathbf{p}) \right\rVert_2^2,$$
for stages $t = 1, \ldots, T$, and the overall loss is $f = \sum_{t=1}^{T} \left( f_S^t + f_L^t \right)$, where $W(\mathbf{p})$ is a binary mask that is zero at pixels lacking annotations. An individual confidence map is generated for each person, and the network predicts the set of confidence levels.
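As a concrete illustration of a masked, spatially weighted L2 branch loss, the following sketch computes one stage's loss on dummy arrays (the shapes, mask, and values are assumptions chosen for the example):

```python
import numpy as np

def weighted_l2_loss(pred, target, W):
    """One branch's stage loss: sum_p W(p) * ||pred(p) - target(p)||^2.
    W is a binary mask that zeroes out pixels with missing annotations."""
    return float((W[..., None] * (pred - target) ** 2).sum())

pred = np.ones((4, 4, 3))          # predicted maps for 3 keypoint channels
target = np.zeros((4, 4, 3))       # ground-truth maps
W = np.ones((4, 4)); W[0, 0] = 0   # mask out one unlabeled pixel
total = weighted_l2_loss(pred, target, W)
print(total)  # 45.0: (16 - 1) unmasked pixels * 3 channels * squared error 1.0
```

The overall training objective would then sum such terms over both branches and all stages.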
During network training, for a given keypoint class j and a given person k, the confidence label map has a single peak; the confidence label at each pixel is a Gaussian of that pixel's distance to the ground-truth point. When there are multiple people, non-maximum suppression (NMS) removes low-scoring detections from the network's output.
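The single-peak Gaussian confidence label described above can be generated as follows; the grid size and sigma are illustrative assumptions:

```python
import numpy as np

def confidence_map(shape, center, sigma=2.0):
    """Ideal confidence map for one keypoint of one person: each pixel p
    gets exp(-||p - x*||^2 / sigma^2), a single peak at the true point x*."""
    ys, xs = np.mgrid[0:shape[0], 0:shape[1]]
    d2 = (xs - center[0]) ** 2 + (ys - center[1]) ** 2
    return np.exp(-d2 / sigma ** 2)

S = confidence_map((46, 46), center=(20, 15))
peak = np.unravel_index(S.argmax(), S.shape)
print(peak, S.max())  # peak at (row 15, col 20) with value 1.0
```

With several people in the frame, one such map per person is built and the per-pixel maximum is typically taken before training.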
Step 2.2: add vector encodings for the predicted keypoints and connect the different parts of the same person. Given a set of detected body parts, a confidence measure is computed for each candidate association between a pair of parts. One way to detect associations is to place an extra midpoint between each candidate pair of parts and check its incidence among the part detections, but such midpoints can yield false associations when bodies overlap. To overcome this limitation, the method uses part affinity fields (PAFs), which preserve both the position and the orientation of a limb's support region. A PAF is a 2D vector field for each limb: at every pixel in the region belonging to a particular limb, the 2D vector encodes the direction from one part of the limb toward the other. Each limb type has a corresponding PAF connecting its two associated body parts. Non-maximum suppression (NMS) on the detection confidence maps yields a discrete set of part candidate locations. Because the image may contain several people, as well as false positives, each part may have several candidates, and together these define a large set of possible limbs. Each candidate limb is scored by a line integral over the corresponding PAF. Finding the best overall parsing corresponds to a K-dimensional matching problem known to be NP-hard; the Hungarian algorithm is used to obtain the best matching.
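The PAF line-integral score of a candidate limb can be sketched numerically. The synthetic field below points along +x everywhere, and the sampling count and coordinates are assumptions for the example:

```python
import numpy as np

def paf_score(paf, p1, p2, n_samples=10):
    """Score a candidate limb between part candidates p1 and p2 by sampling
    the PAF along the segment and projecting each vector onto the limb
    direction; aligned fields score near 1, perpendicular ones near 0."""
    p1, p2 = np.asarray(p1, float), np.asarray(p2, float)
    v = p2 - p1
    norm = np.linalg.norm(v)
    if norm == 0:
        return 0.0
    v /= norm
    total = 0.0
    for u in np.linspace(0.0, 1.0, n_samples):
        x, y = (p1 + u * (p2 - p1)).round().astype(int)
        total += paf[y, x] @ v        # dot product with the limb direction
    return total / n_samples

paf = np.zeros((32, 32, 2))
paf[..., 0] = 1.0                     # field pointing along +x everywhere
print(paf_score(paf, (2, 10), (25, 10)))   # 1.0: limb aligned with the field
print(paf_score(paf, (10, 2), (10, 25)))   # 0.0: limb perpendicular to it
```

These scores become the edge weights of the matching problem that the Hungarian algorithm then solves.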
Step 2.3: to adapt the model to the application scenario of this method, 1000 RGB images of various human actions were collected, covering daily behaviors and falls, and the model was trained on them with the TensorFlow framework. The loss kept decreasing as iterations accumulated; with the iteration count set to 5000, the final loss value was 0.0925. Testing the model on the test set gave an accuracy of 91.2%, demonstrating that the algorithm meets the usage requirements. The result of this step is shown in Figure 3.
Step 3: run the SSD-MobileNet object detection model on the detected human keypoint regions and remove the non-human parts.
Step 3.1: first pre-train the MobileNet network on the COCO dataset, which covers 80 target classes in 12 broad categories including people, animals, vehicles, and household appliances, to obtain pre-trained parameters.
Step 3.2: using the self-built human body dataset from step 2.3, annotate the human samples with the LabelImg tool. Each annotated sample yields a corresponding XML file; xml_to_csv.py is then called to generate a CSV table, and generate_tfrecord.py converts the CSV file into the TFRecord format that the TensorFlow framework can read.
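The XML-to-CSV half of this conversion can be sketched with the standard library. The annotation below is a hypothetical Pascal-VOC-style file of the kind LabelImg writes, and `xml_to_rows` is an illustrative stand-in for the patent's xml_to_csv.py, not its actual code:

```python
import xml.etree.ElementTree as ET
import csv, io

# Hypothetical LabelImg-style annotation for one image.
XML = """<annotation>
  <filename>fall_001.jpg</filename>
  <object><name>person</name>
    <bndbox><xmin>48</xmin><ymin>30</ymin><xmax>200</xmax><ymax>320</ymax></bndbox>
  </object>
</annotation>"""

def xml_to_rows(xml_text):
    """Flatten one annotation file into CSV rows: one row per labeled box."""
    root = ET.fromstring(xml_text)
    fname = root.findtext("filename")
    rows = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        rows.append([fname, obj.findtext("name"),
                     int(box.findtext("xmin")), int(box.findtext("ymin")),
                     int(box.findtext("xmax")), int(box.findtext("ymax"))])
    return rows

buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["filename", "class", "xmin", "ymin", "xmax", "ymax"])
writer.writerows(xml_to_rows(XML))
print(buf.getvalue())
```

A second script would then read such CSV rows and serialize them as TFRecord examples for training.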
Step 3.3: feed the data generated in step 3.2 and the fused features of the pre-trained model from step 3.1 into the SSD network. The iteration count was set to 10000, and training was stopped when the loss reached 1.8; the total loss was 1.8 and the mean average precision was 78.4%. After this step, redundant keypoints are removed and only the correct human skeleton keypoints remain.
Step 4: classify the collected human skeleton keypoints with the SVDD classification algorithm.
Although the joint-point trajectories are obtained, the data imbalance problem remains. In daily life, because falls take many forms, collecting enough real fall samples demands considerable manpower and resources and is difficult, which leaves the samples imbalanced; how to detect falls from imbalanced data has therefore become a fundamental problem. This method adopts the SVDD anomaly detection approach, optimizing its parameters by grid search and related methods: a normal domain is trained from positive samples, and fall behavior is detected as a departure from that domain. SVDD was originally used mainly for anomaly detection and image classification. Unlike the support vector machine (SVM) classifier, SVDD is a one-class SVM algorithm. The main idea is to construct a normal-domain hypersphere from the positive training samples; samples outside the hypersphere are anomalies. Given a positive sample set $X$ containing $N$ positive samples $x_i$ ($i = 1, 2, \ldots, N$), a penalty factor $C$ and slack variables $\xi_i$ are introduced to limit the influence of abnormal data points being included in the normal domain. With hypersphere center $a$ and radius $R$, requiring the positive samples to be enclosed by the sphere gives the optimization problem
$$\min_{R,\,a,\,\xi} \; R^2 + C \sum_{i=1}^{N} \xi_i \quad \text{s.t.} \quad \lVert x_i - a \rVert^2 \le R^2 + \xi_i, \qquad \xi_i \ge 0.$$
This is a typical quadratic programming problem, solved by introducing Lagrange multipliers and optimizing the resulting function. In general, the data are not spherically distributed even after abnormal points are removed, so a kernel function $K$ is introduced to turn the nonlinear problem in the low-dimensional space into a linear problem in a high-dimensional space. Figure 4 shows a two-dimensional SVDD anomaly detection plot: red points are normal data, blue crosses are anomalies, and the decision boundary is supported by red points. The boundary points are the support vectors; they determine the shape and size of the decision boundary, and hence the accuracy of the model.
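A drastically simplified stand-in conveys the "normal domain" idea numerically. Real SVDD solves the quadratic program above (with kernels and support vectors); the toy model below merely takes the sample mean as the sphere center and a distance percentile as the radius, and all data here are synthetic:

```python
import numpy as np

class SimpleSVDD:
    """Toy spherical 'normal domain' learned from positive samples only.
    Not real SVDD: center = mean, radius = q-th percentile of training
    distances (q loosely plays the role of the penalty factor C)."""
    def fit(self, X, q=95):
        self.center = X.mean(axis=0)
        d = np.linalg.norm(X - self.center, axis=1)
        self.radius = np.percentile(d, q)
        return self

    def predict(self, X):
        d = np.linalg.norm(X - self.center, axis=1)
        return d <= self.radius   # True: inside normal domain; False: fall candidate

rng = np.random.default_rng(1)
normal = rng.normal(0.0, 1.0, size=(500, 2))   # plentiful daily-behavior features
falls = rng.normal(8.0, 1.0, size=(5, 2))      # rare fall features, far from normal
model = SimpleSVDD().fit(normal)
print(model.predict(falls))  # all False: outside the learned normal domain
```

This mirrors why SVDD suits the imbalanced setting: only the plentiful positive (normal) class is needed for training, and falls are flagged purely by falling outside the learned boundary.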
The result of this step is shown in Figure 5. Experiments demonstrate the accuracy and reliability of this method.
Claims (1)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010006573.0A CN111209848B (en) | 2020-01-03 | 2020-01-03 | A real-time fall detection method based on deep learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010006573.0A CN111209848B (en) | 2020-01-03 | 2020-01-03 | A real-time fall detection method based on deep learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111209848A CN111209848A (en) | 2020-05-29 |
CN111209848B true CN111209848B (en) | 2023-07-21 |
Family
ID=70786558
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010006573.0A Active CN111209848B (en) | 2020-01-03 | 2020-01-03 | A real-time fall detection method based on deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111209848B (en) |
Families Citing this family (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111814588B (en) * | 2020-06-18 | 2023-08-01 | 浙江大华技术股份有限公司 | Behavior detection method, related equipment and device |
CN111899470B (en) * | 2020-08-26 | 2022-07-22 | 歌尔科技有限公司 | Human body falling detection method, device, equipment and storage medium |
CN112215185B (en) * | 2020-10-21 | 2022-08-05 | 成都信息工程大学 | A system and method for detecting falling behavior from surveillance video |
CN112270381B (en) * | 2020-11-16 | 2022-06-03 | 电子科技大学 | People flow detection method based on deep learning |
CN112633327B (en) * | 2020-12-02 | 2023-06-30 | 西安电子科技大学 | Staged metal surface defect detection method, system, medium, equipment and application |
CN112766091B (en) * | 2021-01-05 | 2023-09-29 | 中科院成都信息技术股份有限公司 | Video unsafe behavior recognition system and method based on human skeleton key points |
CN112861686B (en) * | 2021-02-01 | 2022-08-30 | 内蒙古大学 | SVM-based image target detection method |
CN113177468B (en) * | 2021-04-27 | 2023-10-27 | 北京百度网讯科技有限公司 | Human behavior detection method and device, electronic equipment and storage medium |
CN113368487A (en) * | 2021-06-10 | 2021-09-10 | 福州大学 | OpenPose-based 3D private fitness system and working method thereof |
CN113724853A (en) * | 2021-10-08 | 2021-11-30 | 联想新视界(南昌)人工智能工研院有限公司 | Intelligent medical system based on deep learning |
CN114220162A (en) * | 2021-11-17 | 2022-03-22 | 深圳职业技术学院 | Pig gesture recognition method and device |
CN114913597B (en) * | 2022-05-06 | 2024-08-02 | 山东光汇控股有限公司 | Fall detection method and system based on OpenPose and lightweight neural network |
CN115471826B (en) * | 2022-08-23 | 2024-03-26 | 中国航空油料集团有限公司 | Method and device for judging safe driving behavior of aviation fueller and safe operation and maintenance system |
Family Cites Families (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102722721A (en) * | 2012-05-25 | 2012-10-10 | 山东大学 | Human falling detection method based on machine vision |
CN107220604A (en) * | 2017-05-18 | 2017-09-29 | 清华大学深圳研究生院 | A kind of fall detection method based on video |
CN107506706A (en) * | 2017-08-14 | 2017-12-22 | 南京邮电大学 | A kind of tumble detection method for human body based on three-dimensional camera |
CN108182410A (en) * | 2017-12-28 | 2018-06-19 | 南通大学 | A kind of joint objective zone location and the tumble recognizer of depth characteristic study |
CN108805032B (en) * | 2018-05-17 | 2021-06-15 | 郑州大学 | A fall detection method based on deep convolutional network |
CN108960056B (en) * | 2018-05-30 | 2022-06-03 | 西南交通大学 | A Fall Detection Method Based on Attitude Analysis and Support Vector Data Description |
CN108875614A (en) * | 2018-06-07 | 2018-11-23 | 南京邮电大学 | It is a kind of that detection method is fallen down based on deep learning image procossing |
CN108961675A (en) * | 2018-06-14 | 2018-12-07 | 江南大学 | Fall detection method based on convolutional neural networks |
CN109492612B (en) * | 2018-11-28 | 2024-07-02 | 平安科技(深圳)有限公司 | Fall detection method and device based on bone points |
CN109800802A (en) * | 2019-01-10 | 2019-05-24 | 深圳绿米联创科技有限公司 | Visual sensor and object detecting method and device applied to visual sensor |
CN110555368B (en) * | 2019-06-28 | 2022-05-03 | 西安理工大学 | Fall behavior recognition method based on three-dimensional convolutional neural network |
CN110490080B (en) * | 2019-07-22 | 2023-05-09 | 毕昇云(武汉)信息技术有限公司 | An image-based human fall detection method |
CN110633736A (en) * | 2019-08-27 | 2019-12-31 | 电子科技大学 | A human fall detection method based on multi-source heterogeneous data fusion |
- 2020-01-03: CN application CN202010006573.0A granted as patent CN111209848B (status: Active)
Non-Patent Citations (2)
Title |
---|
Doppler Radar Fall Activity Detection Using the Wavelet Transform; Bo Yu Su et al.; IEEE Transactions on Biomedical Engineering; full text *
Fall detection algorithm based on automatic feature extraction (基于特征自动提取的跌倒检测算法); Hu Shuangjie et al.; Chinese Journal of Sensors and Actuators (传感技术学报); full text *
Also Published As
Publication number | Publication date |
---|---|
CN111209848A (en) | 2020-05-29 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111209848B (en) | A real-time fall detection method based on deep learning | |
CN110781838B (en) | Multi-mode track prediction method for pedestrians in complex scene | |
Chackravarthy et al. | Intelligent crime anomaly detection in smart cities using deep learning | |
CN106022229B (en) | Abnormal Behavior Recognition Method Based on Video Motion Information Feature Extraction and Adaptive Enhancement Algorithm Error Backpropagation Network | |
CN108549841A (en) | A kind of recognition methods of the Falls Among Old People behavior based on deep learning | |
CN112163564B (en) | Tumble prejudging method based on human body key point behavior identification and LSTM (least Square TM) | |
CN113378676A (en) | Method for detecting figure interaction in image based on multi-feature fusion | |
CN108509938A (en) | A kind of fall detection method based on video monitoring | |
CN112434723B (en) | Day/night image classification and object detection method based on attention network | |
CN105139029A (en) | Activity recognition method and activity recognition device for persons serving sentences | |
CN107609477A (en) | It is a kind of that detection method is fallen down with what Intelligent bracelet was combined based on deep learning | |
CN107967941A (en) | A kind of unmanned plane health monitoring method and system based on intelligent vision reconstruct | |
Du et al. | Convolutional neural network-based data anomaly detection considering class imbalance with limited data | |
CN113297972A (en) | Transformer substation equipment defect intelligent analysis method based on data fusion deep learning | |
CN111881802A (en) | Traffic police gesture recognition method based on double-branch space-time graph convolutional network | |
CN115731579A (en) | Individual recognition method of terrestrial animals based on cross-attention Transformer network | |
CN118762332A (en) | Work site safety detection methods, safety detection systems and safety fences | |
Gorodnichev et al. | Research and Development of a System for Determining Abnormal Human Behavior by Video Image Based on Deepstream Technology | |
CN117037264A (en) | Prison personnel abnormal behavior identification method based on target and key point detection | |
CN114638441B (en) | An ocean current monitoring and early warning system based on satellite remote sensing images | |
CN115393802A (en) | A method for identifying uncommon intrusion targets in railway scenes based on small sample learning | |
Rezaee et al. | Intelligent detection of the falls in the elderly using fuzzy inference system and video-based motion estimation method | |
CN111178134B (en) | A Fall Detection Method Based on Deep Learning and Network Compression | |
CN110826459A (en) | Migratable campus violent behavior video identification method based on attitude estimation | |
CN114913585A (en) | A fall detection method for the elderly at home based on facial expressions |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||