Open access

Deep Learning-based Smart Predictive Evaluation for Interactive Multimedia-enabled Smart Healthcare

Authors:

Zhihan Lv,

Zengchen Yu,

Shuxuan Xie,

Atif Alamri

ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), Volume 18, Issue 1s

Article No.: 43, Pages 1 - 20

https://doi.org/10.1145/3468506

Published: 25 January 2022

Abstract

This study aims to better safeguard people's health, further improve the level of medical care, and strengthen the confidentiality of people's private information. Given the wide application of deep learning algorithms, the convolutional neural network (CNN) is modified to build an interactive smart healthcare prediction and evaluation model (SHPE model). The model is optimized and its data processing is standardized. The constructed model is then simulated to analyze its performance. The results show that the accuracy of the constructed system reaches 82.4%, which is at least 2.4% higher than other advanced CNN algorithms and 3.3% higher than other classical machine learning algorithms. The comparison shows that the accuracy, precision, recall, and F1 of the constructed model are the highest. Further error analysis shows that the constructed model has the smallest error, 23.34 pixels. Therefore, the built SHPE model achieves higher prediction accuracy and smaller error while ensuring safety performance, which provides an experimental reference for the later prediction and evaluation of smart healthcare treatment.

1 Introduction

With the rapid development of science and technology, artificial intelligence (AI), big data, and related technologies are used ever more widely in all walks of life, driving them towards smart technologies and making people's lives more and more convenient. Emerging information technologies such as AI, cloud computing, and the Internet offer new ideas for medical difficulties such as “medical resources imbalance”, “high healthcare costs”, and “high medical misdiagnosis rate”. Thus, the healthcare industry is moving towards digitalization [1, 2]. Traditional manual diagnosis can achieve reliable diagnostic accuracy, but the accuracy and efficiency of medical diagnosis are closely related to people's lives and health and therefore still cannot be ignored [3]. As a result, technology for the smart evaluation of people's lives and health in the medical field has become the focus of more and more researchers.

As population aging becomes more serious, society as a whole has a great demand for sound medical policies. Analysis of misdiagnosis in the medical field shows that errors caused by human negligence are more common. With the rapid development and widespread application of information technologies (such as the Internet of Things (IoT), mobile Internet, and AI), together with breakthroughs and innovations in the theoretical basis of machine learning and deep learning, these emerging technologies have been increasingly applied in the medical field [4, 5]. Among them, deep learning has been applied intensively, for example in medical image segmentation, medical knowledge mapping, smart wearable medicine, disease risk prediction, and medical question answering systems. Computers are used to record related data and information in electronic form, replacing handwritten medical records. Standardized management of nurses’ clinical diagnosis and treatment behaviors can increase the retention time of patients’ medical records, improve the efficiency of doctors’ visits, and reduce the mistakes doctors make during diagnosis and treatment because of differences in personal knowledge, ability, and experience [6]. The intelligent tracking algorithm for computer-assisted minimally invasive surgical tools uses surgical tool tracking technology to analyze video images, determine the position and spatial posture of the surgical tool, and provide precise, real-time navigation for the surgeon or surgical robot, thereby improving the smoothness and safety of the surgery [7]. Therefore, applying AI technology in the medical field can broaden the knowledge of clinicians and reduce negligence, thereby helping doctors improve the efficiency of medical work and the quality of diagnosis, identify abnormal conditions in the diagnosis and treatment process, and provide a basis for business process optimization [8].

In summary, this study aims to better safeguard people's health, further improve the level of medical care, and strengthen the confidentiality of people's private information. An interactive SHPE model is built using a deep learning algorithm, and its performance characteristics are analyzed by simulation. The work is therefore of significance for future disease prediction and evaluation in the medical field.

2 Recent Related Work

2.1 Development Status of Deep Learning

As one of the most promising AI tools, deep learning has been applied more and more widely and has been studied by many researchers. To evaluate the application value of deep CNN (DCNN) in the diagnosis of chest tuberculosis (TB), Lakhani et al. [2017] ran experiments on four de-identified data sets compliant with the Health Insurance Portability and Accountability Act (HIPAA) and found that DCNN could accurately classify TB on chest radiographs [9]. Huang et al. [2019] proposed a deep learning framework for millimeter wave (mmWave) massive multiple-input multiple-output (MIMO); treating each selected pre-coder as a mapping relationship in a deep neural network (DNN), the framework trained the DNN to optimize the hybrid pre-coding of mmWave-based massive MIMO. Simulation showed that the DNN-based method minimized the bit error rate and improved the spectral efficiency of mmWave-based massive MIMO, while clearly reducing computational complexity and performing better than traditional schemes [10]. Van et al. [2019] proposed a new deep learning-based detector, DeepIM, which recovers data in the orthogonal frequency division multiplexing with index modulation (OFDM-IM) system using a DNN with fully connected layers, after preprocessing the received signals and channel vectors based on domain knowledge before they enter the network; the results showed that DeepIM could obtain a near-optimal bit error rate at a lower running time [11]. Wang et al. [2020] proposed a new lightweight automatic modulation classification (Light AMC) method based on deep learning, which introduced a scale factor into the CNN for each neuron and then enhanced the sparsity of the scale factors through compressed sensing; simulation analysis revealed that the proposed Light AMC method could effectively reduce the model size and accelerate computation with minimal performance loss [12].

2.2 Development Status of Smart Healthcare

Although people's living standards have improved, some thorny problems remain in the medical and health service market in China, such as the high rate of misdiagnosis. With the popularization of AI in various fields, the medical field has gradually become more intelligent, which has attracted the attention of many researchers. Zhang et al. [2018] proposed a smart IoT architecture for the smart hospital based on the narrow band IoT (NB-IoT), introduced edge computing to meet the delay requirements of the medical process, and discussed the challenges and future development direction of smart hospital construction [13]. Verma et al. [2018] analyzed the big data generated by IoT equipment in the field of medicine in the cloud instead of relying on the limited storage and computing resources of a handheld device; they proposed a cloud-based IoT health monitoring and diagnosis framework that could predict the severity of the underlying disease, and, using the diagnosis scheme with various state-of-the-art classification algorithms, found that its accuracy, sensitivity, specificity, and F1 were superior to those of the baseline method in disease prediction [14]. Abdel-Basset et al. [2019] proposed a new framework for the detection and observation of patients with type 2 diabetes based on computer-assisted diagnosis and the IoT; the constructed healthcare system aimed to obtain better diagnostic accuracy from cryptic data, and the experiment demonstrated the overall effectiveness of the algorithm [15]. Akyildiz et al. [2020] proposed a system that tightly couples sensing, driving, and computing, built on submillimeter implantable bio-electronic devices and biological nanotechnology, intended to provide reliable and sensitive disease detection and infection recovery; its feasibility was verified through simulation analysis [16].

To sum up, the review of research in deep learning and medical treatment shows that deep learning algorithms are being applied more and more extensively and that medical care is also moving towards smart development, but there are few studies on applying deep learning algorithms in the medical field to assess and analyze people's physical health. Therefore, in this study, a variety of algorithm schemes were designed based on neural networks, and the models were trained and analyzed with real medical data to determine and save the optimal model. In addition, the model was deployed in the system so that the algorithm could be promoted and applied.

3 Methods

3.1 Demand Analysis on Smart Healthcare System

The rapid development of social economy and cultural level has a direct impact on people's health consumption needs. The unbalanced development of economy and consumption power among regions and residents in China has led to the diversification of people's health needs. Taking the basic medical needs of the people as the starting point, this study provides multi-level and diversified medical services for different service targets, and actively creates related conditions to meet the medical service needs of different groups of people. This is not only a response for the call for health development in China, but also can bring reliable economic benefits [17, 18]. The feasibility analysis of a smart healthcare system is shown in Figure 1 below.

Fig. 1.

The feasibility analysis of the smart healthcare system mainly includes algorithm feasibility, system feasibility, and economic feasibility. In terms of algorithm feasibility, the constructed system adopts widely used deep learning, machine learning, and data mining algorithms, and is trained and analyzed with current data preprocessing technology. The best-performing model is saved and deployed in the system so that the algorithms can be applied and promoted. In terms of system feasibility, the WeChat applet platform serves as the system development platform; it is compatible with the various versions of the Android and iPhone (iOS) operating systems, which makes multi-platform use convenient. In terms of economic feasibility, the development cost of the system lies mainly in leasing and maintaining the server, and the developers are full-time engineering graduate students, so the cost is relatively small. With the rapid development of information technology, many scholars in China and other countries have applied deep learning, machine learning, and data mining algorithms to the field of smart healthcare. Thus, applying AI technology in the medical field can be considered a current research trend.

There are many target groups for the smart healthcare system, including professional doctors, users with healthcare demands, busy users, users with minor illnesses, and users with demands for privacy protection. Among medical staff, the system mainly targets doctors in medical institutions in remote areas and doctors over 55 years of age, since the overall population of rural doctors is aging. For them, the smart healthcare system can supplement daily diagnosis by providing help or tips during the process, so that they can serve patients more accurately and conveniently. Non-professional users who are very concerned about subtle physical discomforts or symptoms may incur excessive costs at the hospital, or obtain inaccurate results from online searches, when such discomforts or symptoms appear. In the smart healthcare system, users can describe their physical symptoms or select symptom keywords to obtain reminders or help through an online diagnosis function. Busy users easily ignore minor physical symptoms, or may give up going to the hospital for financial or work reasons when experiencing subtle symptoms. The death rate of cardiovascular diseases has continued to rise in recent years, and prevention is far more important than treatment. Therefore, using a smart healthcare system for pre-diagnosis helps to identify the tendency of a disease before it is examined in a hospital. Because medical resources are unevenly distributed, the bed utilization rate of many top three hospitals is as high as 99% [19], so that people with minor diseases in large cities cannot have them treated in a timely, effective, and economical way. Therefore, the smart healthcare system can include a disease encyclopedia function to facilitate retrieval of the profile, symptoms, etiology, treatment, and nursing information of a disease. In addition, young people who have symptoms in intimate areas can use the smart healthcare system for diagnosis and treatment without registering any personal information [20–22]. In this study, the CNN in the deep learning algorithm is improved according to the feasibility of the system to construct a smart healthcare system, and the system is divided into different functional modules according to the demand analysis of the target population.

3.2 Deep Learning

Deep learning is also called deep structure learning or cascade learning, and it is a part of machine learning algorithms. As an algorithm that can independently extract the data features, deep learning can be applied to the medical field to promote fast development of smart detection [23]. Deep learning algorithms can automatically learn the multi-level features in the system from the original data, requiring no participation of human experts in related fields. Thus, it greatly saves the manpower, material resources, and time costs, and can complete the classification task with the learnt key features [24]. Deep learning algorithms are closely related to the information processing and communication modes in the biological nervous system. It contains many architectures, such as multilayer artificial neural networks (ANN), CNN, and recurrent neural networks (RNN). The power of deep learning algorithms is that the output of a certain layer in the network can be undertaken as the expression of data, so that the features are extracted through the neural network [25]. The architecture and optimization of the deep learning algorithm are shown in Figure 2 below.

Fig. 2.

In the deep learning algorithm, forward propagation passes signals from the input layer to the output layer: the neurons in each hidden layer form a linear combination of their inputs, and the result is passed through a nonlinear activation function. The equation is as follows:

\begin{equation} f(l) = m({V^T}l + s) \end{equation}

(1)

In the above equation, m refers to the activation function, l is the input, V is the weight, and s is the bias. The linear calculation of the neurons in the hidden layer forms the argument of the activation function. After the nonlinear calculation of the activation function, the result is passed to the neurons in the next hidden layer. The essence of training the neural network in the deep learning algorithm is to enable the network to better fit the data distribution and form a decision boundary [26]. The parameters have to be optimized to find the best decision boundary, which requires measuring the difference between the real sample value k and the predicted value f(l) of the sample (namely, the loss function). The loss function can be calculated with Equation (2) below:

\begin{equation} c = k - f(l) \end{equation}

(2)

In Equation (2), c refers to the loss. The goal of training the neural network is to make the total loss over all training data as small as possible. The CNN, composed of neurons with weights, is derived from the fully connected neural network (FCNN), and its main characteristics are sparse connections and weight sharing. Sparse connection means that neurons are connected only locally, exploiting the spatial local correlation among image pixels; weight sharing means that each feature map, generated by convolving one kernel over the input image, uses the same kernel parameters at every position [27]. Structurally, the CNN includes convolutional layers, pooling layers, and fully connected layers. The parameters of a convolutional layer consist of a set of convolution kernels (convolution filters), and the layer generates feature maps. The \(p\)th feature map is often denoted \({m^p}\); if its convolution kernel is composed of the parameters \({V^p}\) and \({s_p}\), then the following equation is obtained:

\begin{equation} m_{ij}^p = f({({V^p} * l)_{ij}} + {s_p}) \end{equation}

(3)

In Equation (3), f(.) refers to an activation function, l represents an input feature map, and the parameter V of the hidden layer can be represented as a four-dimensional tensor. The main function of the convolutional layer is to perform convolution calculations, and the features of abstract data are extracted through continuous adjustment of the parameters. During the convolution calculation, the kernel scans all areas of the input in turn; at each position, the input features under the kernel are multiplied element-wise by the kernel, summed, and the deviation is added, as given below:

\begin{equation} \begin{array}{c} {Z_{h + 1}}(i,j) = \left[ {{Z_h} \otimes {v_{h + 1}}} \right](i,j) + s = \sum\limits_{a = 1}^{{K_h}} {\sum\limits_{l = 1}^f {\sum\limits_{k = 1}^f {\left[ {Z_a^h\left( {{b_0}i + l,{b_0}j + k} \right)v_a^{h + 1}(l,k)} \right]} } } + s\\ (i,j) \in \left\{ {0,1,\ldots ,{H_{h + 1}}} \right\} \end{array} \end{equation}

(4)

\begin{equation} {H_{h + 1}} = \frac{{{H_h} + 2k - f}}{{{b_0}}} + 1 \end{equation}

(5)

In the above equations, s refers to the deviation (bias), and \({Z_h}\) and \({Z_{h + 1}}\) represent the convolutional input and output of the (h+1)th layer, respectively, which are also called feature maps. \({H_{h + 1}}\) refers to the size of \({Z_{h + 1}}\), and the length and width of the feature map are assumed to be the same. Z(i, j) refers to the pixels of the corresponding feature map, and \({K_h}\) is the number of channels of the feature map. In Equation (5), f, \({b_0}\), and k are the convolutional layer parameters, representing the size of the convolution kernel, the convolution stride, and the amount of padding, respectively. Because of the convolutional layer, the CNN is well suited to abstract data such as images and audio. Pooling layers are placed between successive convolutional layers; their main function is to gradually reduce the size of the feature map, so that the parameters and calculations in the network are gradually reduced, which helps prevent overfitting. If the size of the input feature map is W, the filter size is C, the step size is B, and the amount of zero padding added to the boundary is K, then the size of the output feature map can be calculated with Equation (6) below:

\begin{equation} (W - C + 2K)/B + 1 \end{equation}

(6)
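As a quick check of Equation (6), the following minimal Python sketch computes the output size for a few layer settings. The function name `output_size` and the example numbers are illustrative assumptions, and floor division is assumed for the case where the stride does not divide the span evenly, which Equation (6) itself does not specify.

```python
def output_size(input_size: int, kernel_size: int, padding: int = 0, stride: int = 1) -> int:
    """Output feature-map size following Equation (6): (W - C + 2K) / B + 1."""
    if stride <= 0:
        raise ValueError("stride must be positive")
    span = input_size - kernel_size + 2 * padding
    if span < 0:
        raise ValueError("kernel is larger than the padded input")
    # Assumption: round down when the stride does not divide the span exactly.
    return span // stride + 1


# Illustrative settings (not taken from the paper's experiments).
print(output_size(32, 5))             # 5x5 kernel, no padding, stride 1 -> 28
print(output_size(28, 2, stride=2))   # 2x2 pooling window, stride 2 -> 14
print(output_size(32, 3, padding=1))  # 3x3 kernel, padding 1 keeps the size -> 32
```

The same function applies to Equation (5), with f, k, and \({b_0}\) playing the roles of C, K, and B.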

The fully connected layer is connected in the same way as the ANN. At present, with the rapid development of the CNN, more variants have been proposed, such as the visual geometry group network (VGGNet) [28], GoogleNet [29], AlexNet [30], the interleaved group convolution network (IGCNet) [31], and the gated CNN (GCNN) [32].

3.3 Design on Interactive Smart Healthcare Prediction and Evaluation Model

When the interactive SHPE model is designed, the medical data should be collected and standardized with the smart health system firstly. There is a large amount of redundant information in medical data, so it has to be filtered and removed by data preprocessing technology, so that the useful data can be filtered out for use and analysis. Next, the medical data is stored in the database in Chinese encoding format. How to effectively segment and convert the complex Chinese medical data into computer-processable quantitative data and ultimately retain its original semantic information will be the core of the data preprocessing technology in the SHPE system. From the perspective of the demand of the smart healthcare system, there are many target groups, so the system can be designed with different functional modules. Finally, when the pre-processed data is applied to train the model, the main task is to explore the convergence and select the optimal parameter of model training.

3.4 Construction of Interactive Smart Healthcare Prediction and Evaluation Model based on the Deep Learning

People's physical health is affected by many diseases, and the ideal goal is to predict disease risk before a disease is diagnosed, so as to discover its potential risks and trends and take effective preventive and intervention measures. Among disease prediction and evaluation models, the AI-based disease risk prediction models studied by many researchers can generally be applied only to specific medical data because of the particularity of medical data. In contrast, the model constructed in this study can predict multiple diseases based on deep learning. CNN shows strong generalization ability and versatility, so it is selected as the basis of the model. The process for the interactive SHPE model based on deep learning is shown in Figure 3 below.

Fig. 3.

The medical data used in this study mainly comes from the Medicare data set and Healthdata set. Then, the medical data is processed to sort out the chaotic medical data and convert it into a format that can be recognized and processed by a computer or machine learning model, including Chinese word segmentation, loading a dictionary, and dictionary supplementation. The data conversion mainly refers to the symptom data conversion and diagnosis data conversion. The data is summarized and collected finally. During the data preprocessing, the Min-Max method is adopted to normalize the characteristics data, as below:

\begin{equation} l' = \frac{{l - \min (l)}}{{\max (l) - \min (l)}} \end{equation}

(7)
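The Min-Max normalization of Equation (7) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' preprocessing code: the function name, the sample values, and the handling of a constant feature (mapped to 0) are assumptions.

```python
def min_max_normalize(values):
    """Rescale one feature column to [0, 1] following Equation (7)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant feature carries no information, so map it to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical feature column (for example, patient ages).
ages = [20, 35, 50, 80]
print(min_max_normalize(ages))  # [0.0, 0.25, 0.5, 1.0]
```

Each feature is normalized independently, so features measured on very different scales contribute comparably during training.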

The specific process for data preprocessing is given below in Figure 4.

Fig. 4.

The CNN is the main deep learning algorithm used in the design of the SHPE model. This is because an FCNN cannot quickly capture and filter the key features, which leads to excessive training effort and slower speed. As one of the most popular models currently in use, the CNN extracts key features through the calculation of convolution kernels. It is used mostly in computer vision and image processing, where the combination of convolution kernels and maximum pooling, trained with the back propagation algorithm, extracts the key contour features of an image. Therefore, CNN is adopted in this study to extract the key features of a variety of diseases.

Firstly, the processed data is inputted into the CNN through the convolution kernel for convolution operations, as below:

\begin{equation} W = conv2(V,L,'valid') + s \end{equation}

(8)

In Equation (8), V is the weight of the convolution kernel, L is the input value, s refers to the bias value, and 'valid' refers to the padding method. After the convolution operation, the output dimension differs from that of the input, and a new feature matrix is generated. The matrix size can be calculated with the equation below:

\begin{equation} v' = \frac{{v + 2k - p}}{b} + 1 \end{equation}

(9)

In Equation (9), v refers to the size of the input matrix, k refers to the amount of padding, p refers to the size of the convolution kernel, and b refers to the step size of the convolution kernel. The features are extracted automatically. To give the convolutional layers better feature selection and dimensionality reduction capabilities, a pooling layer is added after the convolution, so that the features from the maximum pooling can be utilized for further feature screening.

\begin{equation} W = \max \left( {\begin{array}{c@{\quad}c} {{l_1}}&{{l_2}}\\ {{l_3}}&{{l_4}} \end{array}} \right) \end{equation}

(10)
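The convolution of Equation (8) and the maximum pooling of Equation (10) can be illustrated with a small pure-Python sketch. It is not the authors' implementation: the helper names `conv2_valid` and `max_pool` and the toy 5 × 5 input are assumptions, and the kernel is flipped before sliding on the assumption that `conv2` denotes a MATLAB-style true convolution.

```python
def conv2_valid(kernel, image, bias=0.0):
    """2-D 'valid' convolution plus bias, in the spirit of Equation (8)."""
    kh, kw = len(kernel), len(kernel[0])
    ih, iw = len(image), len(image[0])
    # Assumption: true convolution, so the kernel is rotated by 180 degrees first.
    flipped = [row[::-1] for row in kernel[::-1]]
    out = []
    for i in range(ih - kh + 1):          # 'valid': the kernel never leaves the input
        row = []
        for j in range(iw - kw + 1):
            acc = sum(flipped[a][b] * image[i + a][j + b]
                      for a in range(kh) for b in range(kw))
            row.append(acc + bias)
        out.append(row)
    return out


def max_pool(feature_map, size=2, stride=2):
    """Maximum pooling: keep the largest value in each window, as in Equation (10)."""
    h, w = len(feature_map), len(feature_map[0])
    return [[max(feature_map[i + a][j + b] for a in range(size) for b in range(size))
             for j in range(0, w - size + 1, stride)]
            for i in range(0, h - size + 1, stride)]


# Toy 5x5 input with entries 0..24 and a 2x2 all-ones kernel (illustrative values).
image = [[5 * r + c for c in range(5)] for r in range(5)]
kernel = [[1, 1], [1, 1]]
features = conv2_valid(kernel, image)     # 4x4 feature map
print(max_pool(features))                 # [[36.0, 44.0], [76.0, 84.0]]
```

With a 5 × 5 input, a 2 × 2 kernel, no padding, and stride 1, the feature map is 4 × 4, as the size formula of Equation (6) gives; pooling with a 2 × 2 window and stride 2 then halves each side.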

In this study, a pooling layer with a step size of 2 and a size of 2 × 2 is adopted to extract the key features of medical data. In addition, the parameters of the convolution kernels in the experimental CNN are optimized and updated through the back propagation algorithm, which propagates the error back through the pooling and convolutional layers. The recursive equations are written as below:

\begin{equation} \frac{{\partial E}}{{\partial {v_{ij}}}} = \frac{{\partial E}}{{\partial {w_{ij}}}}\frac{{\partial {w_{ij}}}}{{\partial {v_{ij}}}} = {\delta _{ij}}\frac{{\partial {w_{ij}}}}{{\partial {v_{ij}}}} \end{equation}

(11)

\begin{equation} {\delta ^{h - 1}} = conv2(rot180({V^h}),{\delta ^h},'full')\gamma ({w^{h - 1}}) \end{equation}

(12)

In Equations (11) and (12) above, h refers to the index of the convolutional layer, and \({V^h}\) refers to the parameter weight of that layer; Equation (12) is recursive in h. \(\delta\), v, and w are in matrix form, rot180() refers to the operation of rotating the input matrix by 180°, and 'full' refers to the full convolution calculation.

When the extracted features do not conform to the actual situation, the algorithm of the SHPE system is further improved and the operations are repeated. The improvements mainly reduce the convolution parameters (saving memory) and increase sparsity (alleviating convolution redundancy). After the reduction, both the number of convolution parameters and the memory requirement become 1/2 of those of the ordinary convolution method. Finally, the Softmax function is commonly used as the classifier in deep learning, and this study does the same. The probabilities of all markers in the Softmax classifier sum to 1, and the index with the largest probability value is selected as the predicted marker. The Softmax function can be expressed by the following equations:

\begin{equation} {Z_i} = \sum\limits_p {{q_p}{v_{pi}}} \end{equation}

(13)

\begin{equation} {k_i} = \frac{{\exp ({Z_i})}}{{\sum\nolimits_{j = 1}^8 {\exp ({Z_j})} }} \end{equation}

(14)

\begin{equation} \hat i = \arg \max {k_i} = \arg \max {Z_i} \end{equation}

(15)

In the CNN, \({q_p}\) refers to the neuron node after activation in the penultimate layer, \({v_{pi}}\) refers to the weight matrix connecting the penultimate layer and the Softmax layer, \({Z_i}\) is the input of the Softmax layer, and \({k_i}\) refers to the probability of each category. The final predicted disease category \(\hat i\) can be determined by selecting the largest \({k_i}\).
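A minimal Python sketch of the Softmax classification step is given below. The weights and activations are made-up numbers for three categories (the equations above sum over eight), the function names are assumptions, and subtracting the largest score before exponentiation is a standard numerical-stability device that does not change the probabilities.

```python
import math


def softmax(scores):
    """Turn scores Z_i into probabilities k_i that sum to 1."""
    shift = max(scores)  # stability shift; cancels out in the ratio
    exps = [math.exp(z - shift) for z in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(activations, weights):
    """Weighted sum Z_i = sum_p q_p * v_pi, then Softmax, then argmax."""
    n_classes = len(weights[0])
    z = [sum(q * row[i] for q, row in zip(activations, weights))
         for i in range(n_classes)]
    probs = softmax(z)
    best = max(range(n_classes), key=probs.__getitem__)
    return best, probs


# Hypothetical penultimate-layer activations and weights (2 neurons, 3 categories).
q = [1.0, 2.0]
v = [[0.5, 0.1, -0.3],
     [0.2, 0.6, 0.0]]
category, probabilities = predict(q, v)
print(category)  # 1
```

Because the exponential is monotonic, the category with the largest probability is also the one with the largest score, which is why the arg max can be taken over either quantity.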

3.5 Disease Prediction and Tracking by Interactive Smart Healthcare Model

In the process of disease prediction, the system can speed up the feature search by narrowing the search range, so that the target is found quickly within a limited area. In this study, an inter-frame tracking algorithm based on spatio-temporal context learning is adopted. It models the spatiotemporal relationship between the target to be tracked and its local context area within a Bayesian framework, so as to obtain the statistical correlation between the low-level features of the target and its surrounding area. First, during a surgery, the tool tip is related to the surrounding local background, and a spatial context model is learned by solving a deconvolution problem. Second, the spatial context model is used to update the spatiotemporal context model for the next frame. Third, the confidence map of the next frame is computed by convolution with the spatiotemporal context model, so that the tip position can be found quickly as the maximum of the confidence map. The confidence map estimates the likelihood of the target location and is given as follows:

\begin{equation} c(l) = K(l|o) \end{equation}

(16)

In the above Equation (16), l refers to the location of the target, and o refers to the target object in the disease detection scene.

For disease prediction and tracking, the tip position in the current frame, denoted as l*, must be known, and the context feature set can be defined by the following equation:

\begin{equation} {L^c} = \left\{ {c(z) = \left( {I(z),z} \right)|z \in {\Omega _c}({l^*})} \right\} \end{equation}

(17)

In the above equation, I(z) refers to the pixel value at the position z in the image, and \({\Omega _c}({l^*})\) refers to the neighborhood of the position l*. After the joint probability \(K( {l,c( z )|o} )\) is marginalized, the likelihood equation of the target position can be written as follows based on Equation (16):

\begin{equation} c(l) = \sum\limits_{c(z) \in {L^c}} {K\left( {l,c\left( z \right)|o} \right)} = \sum\limits_{c(z) \in {L^c}} {K\left( {l|c\left( z \right),o} \right)K\left( {c\left( z \right)|o} \right)} \end{equation}

(18)

In Equation (18), the conditional probability \(K( {l|c( z ),o} )\) models the spatial relationship between the target location and its context information, which avoids ambiguity when the image measurements differ. \(K( {c( z )|o} )\) models the prior probability of the local context content. \(K( {l|c( z ),o} )\) has to be obtained first; that is, the relationship between the target location and its spatial context has to be learned. The conditional probability can be modeled by the following equation:

\begin{equation} K\left( {l|c\left( z \right),o} \right) = {s^{bc}}(l - z) \end{equation}

(19)

In the above equation, \({s^{bc}}(l - z)\) is a function of the relative distance and direction between the target position l and its local context position z, which encodes the spatial relationship between the target and its spatial context. Because \({s^{bc}}(l - z)\) depends on direction, no ambiguity arises from symmetry when the spatial context information of the tip position and the surrounding background is considered. After \(K( {l|c( z ),o} )\) is obtained, the context prior probability \(K( {c( z )|o} )\) is modeled simply as follows:

\begin{equation} K\left( {c\left( z \right)|o} \right) = I(z){\omega _\sigma }(z - {l^*}) \end{equation}

(20)

In Equation (20), I(z) refers to the gray value at z, which describes the appearance of the context at z, and \({\omega _\sigma }(\cdot)\) represents the weighting function, which can be expressed as in the below equation:

\begin{equation} {\omega _\sigma }(z) = n{e^{ - \frac{{|z|}}{{{\sigma ^2}}}}} \end{equation}

(21)

In Equation (21), n represents a normalization constant with a value between 0 and 1, which constrains \(K( {c( z )|o} )\) in Equation (18) to satisfy the definition of a probability, and \(\sigma\) refers to the scale parameter. This weighting function is inspired by the attention mechanism of the biological vision system, in which a certain image area is focused on. Simply speaking, the shorter the distance between a point and the target, the more attention the point receives. The range of this attention is determined by \(\sigma\). Next, the confidence map of the target position can be calculated as follows:

\begin{equation} c(l) = K(l|o) = s{e^{ - {{\left| {\frac{{l - {l^*}}}{\alpha }} \right|}^\beta }}} \end{equation}

(22)
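As an illustration of Equations (20) to (22), the following minimal NumPy sketch builds the weighting window, the context prior, and the target confidence map on a pixel grid. It is a sketch under stated assumptions rather than the authors' implementation: the function names are hypothetical, the constants n and s are chosen here so that each map sums to 1, and the parameter values in the usage lines are arbitrary.

```python
import numpy as np

def _distance_grid(shape, center):
    """Euclidean distance of every pixel from the target position l*."""
    rows, cols = np.mgrid[0:shape[0], 0:shape[1]]
    return np.hypot(rows - center[0], cols - center[1])

def weight_window(shape, center, sigma):
    """Eq. (21) as printed: w(z - l*) = n * exp(-|z - l*| / sigma^2).
    Assumption: n is chosen so that the weights sum to 1."""
    w = np.exp(-_distance_grid(shape, center) / sigma**2)
    return w / w.sum()

def context_prior(image, center, sigma):
    """Eq. (20): K(c(z)|o) = I(z) * w_sigma(z - l*)."""
    return image * weight_window(image.shape, center, sigma)

def confidence_map(shape, center, alpha, beta=1.0):
    """Eq. (22): c(l) = s * exp(-|(l - l*) / alpha|^beta).
    Assumption: s is chosen so that the map sums to 1."""
    c = np.exp(-(_distance_grid(shape, center) / alpha) ** beta)
    return c / c.sum()

# Usage with arbitrary illustrative values.
image = np.ones((9, 9))  # hypothetical gray-level patch
prior = context_prior(image, center=(4, 4), sigma=2.0)
conf = confidence_map(image.shape, center=(4, 4), alpha=2.25, beta=1.0)
```

Both maps peak at the assumed target position (4, 4) and decay with distance, which is the behaviour the weighting function is meant to capture.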

In Equation (22), s is a normalization constant, \(\alpha\) is a scale parameter, and \(\beta\) is the shape parameter. If \(\beta\) is too large, the confidence map is smoothed excessively, so the information near the center of the target is lost and the positioning suffers from ambiguity. If \(\beta\) is too small, the confidence map is sharpened near the center of the target, so too little information is available for modelling the spatial context of the target, which may cause over-fitting and recognition errors. Experiments show strong robustness when \(\beta\) = 1. Moreover, the confidence map is obtained by calculating the likelihood of any point l in the context area based on the given target position l*. According to the equation of the confidence map and the context prior probability, the below equation can be obtained based on the convolution and fast Fourier transform (FFT) operations:

\begin{equation} c(l) = s{e^{ - {{\left| {\frac{{l - {l^*}}}{\alpha }} \right|}^\beta }}} = \sum\limits_{z \in {\Omega _c}({l^*})} {{q^{bc}}(l - z)I(z){\omega _\sigma }(z - {l^*})} = {q^{bc}}(l) \otimes \left( {I(l){\omega _\sigma }(l - {l^*})} \right) \end{equation}

(23)

\(\otimes\) refers to the convolution operation. Applying the FFT, Equation (23) can be transformed into the below equation in the frequency domain:

\begin{equation} F\left(s{e^{ - {{\left| {\frac{{l - {l^*}}}{\alpha }} \right|}^\beta }}}\right) = F\left( {{q^{bc}}(l)} \right) \circ F\left( {I(l){\omega _\sigma }(l - {l^*})} \right) \end{equation}

(24)

In the above equation, F refers to the FFT, and \(\circ\) refers to element-wise multiplication. Then, \({q^{bc}}(l)\) can be obtained by using the inverse FFT \({F^{ - 1}}\), as follows:

\begin{equation} {q^{bc}}(l) = {F^{ - 1}}\left( {\frac{{F\left(s{e^{ - {{\left| {\frac{{l - {l^*}}}{\alpha }} \right|}^\beta }}}\right)}}{{F\left( {I(l){\omega _\sigma }(l - {l^*})} \right)}}} \right) \end{equation}

(25)
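Equation (25) amounts to a deconvolution in the frequency domain, which the following NumPy sketch illustrates on synthetic data. The function names, the small constant eps that guards against division by zero, and the synthetic arrays are all assumptions made for illustration and are not taken from the study. Note that an FFT-based product corresponds to circular convolution.

```python
import numpy as np

def learn_spatial_context(prior, conf, eps=1e-8):
    """Eq. (25): q^{bc} = F^{-1}( F(conf) / F(prior) ).
    eps is an added safeguard against division by zero (assumption)."""
    ratio = np.fft.fft2(conf) / (np.fft.fft2(prior) + eps)
    return np.real(np.fft.ifft2(ratio))

def apply_spatial_context(q, prior):
    """Eq. (24) read in reverse: conf = F^{-1}( F(q) o F(prior) )."""
    return np.real(np.fft.ifft2(np.fft.fft2(q) * np.fft.fft2(prior)))

# Synthetic check: learn q from a prior and a confidence map, then verify
# that applying q to the same prior reproduces the confidence map.
rng = np.random.default_rng(0)
prior = 0.01 * rng.random((8, 8))
prior[4, 4] += 1.0  # dominant peak keeps F(prior) away from zero
rows, cols = np.mgrid[0:8, 0:8]
conf = np.exp(-np.hypot(rows - 4, cols - 4) / 2.0)

q = learn_spatial_context(prior, conf)
recon = apply_spatial_context(q, prior)
```

In this sketch, recon matches conf up to numerical error, which confirms that the learned model is consistent with Equations (23) and (24).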

Finally, the spatial context model \(q_t^{bc}(l)\) of the current frame can be learned, which can update the spatio-temporal context model \(Q_{t + 1}^{btc}\) of the next frame to find the position of the target in the next frame. The equation is as follows:

\begin{equation} Q_{t + 1}^{btc} = (1 - \rho )Q_t^{btc} + \rho q_t^{bc}(l) \end{equation}

(26)

In the above equation, \(\rho\) refers to the learning rate. After the spatio-temporal context model of the next frame is updated, the confidence map of the next frame can be obtained based on Equations (23) and (24), as follows:

\begin{equation} {c_{t + 1}}(l) = {F^{ - 1}}\left( {F\left( {Q_{t + 1}^{btc}(l)} \right) \circ F\left( {{I_{t + 1}}(l){\omega _{{\sigma _t}}}(l - {l^*})} \right)} \right) \end{equation}

(27)

Then, the target position \(l_{t + 1}^*\) of the next frame can be obtained by calculating the maximum value of confidence map \({c_{t + 1}}(l)\) for the next frame, as below:

\begin{equation} l_{t + 1}^* = \mathop {\arg \max }\limits_{l \in {\Omega _c}(l_t^*)} {c_{t + 1}}(l) \end{equation}

(28)
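The update and localization steps in Equations (26) to (28) can be sketched as follows. This is an illustrative NumPy sketch, not the authors' code: the function names are hypothetical, the learning rate of 0.25 is an arbitrary value, and the arrays are assumed to already cover only the context region \({\Omega _c}(l_t^*)\), so the maximum is searched over the whole array.

```python
import numpy as np

def update_model(Q_prev, q_curr, rho):
    """Eq. (26): blend the previous spatio-temporal model with the
    spatial context model learned in the current frame."""
    return (1.0 - rho) * Q_prev + rho * q_curr

def locate_target(Q_next, prior_next):
    """Eqs. (27)-(28): confidence map of the next frame and the position
    of its maximum. Assumption: the arrays span the context region only."""
    conf = np.real(np.fft.ifft2(np.fft.fft2(Q_next) * np.fft.fft2(prior_next)))
    row, col = np.unravel_index(np.argmax(conf), conf.shape)
    return (int(row), int(col)), conf

# Synthetic check: a model that is a scaled unit impulse at the origin leaves
# the prior unshifted under circular convolution, so the located target is
# the peak of the prior.
Q_prev = np.zeros((8, 8))
q_curr = np.zeros((8, 8))
q_curr[0, 0] = 1.0
Q_next = update_model(Q_prev, q_curr, rho=0.25)

prior_next = np.zeros((8, 8))
prior_next[3, 5] = 1.0
position, conf_next = locate_target(Q_next, prior_next)
```

A small learning rate makes the model change slowly across frames, while a large one makes it follow the most recent frame more closely.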

With the above steps, the disease region can be located and tracked across frames in the medical system, so that the disease type can also be predicted well.

3.6 Simulation

The performance of the proposed interactive SHPE model based on deep learning algorithms is simulated on the MATLAB platform, to verify that the model can be realized well and can better serve the public. A total of 120,000 medical examination records are selected from the physical examination center of a tertiary hospital to form a physical examination dataset. Firstly, data desensitization technology is used to remove personal data. Secondly, the physical examination samples with many missing feature items are deleted. After this screening, more than 90,000 of the 120,000 records are retained as the experimental dataset; 80% of them are included in the training set, and the remaining 20% are included in the test set. All the following results are based on the test set.

The below parameters are set to ensure that the CNN framework can achieve the desired prediction results: the number of training epochs, the learning rate, the batch size, and the convolution kernel size are set to 20, 0.002, 128, and 1 × 3, respectively; the activation function and optimizer are Tanh and Adam, respectively; and the dropout rate in the CNN framework is set to 0.5.

To verify the performance of the proposed system, it is compared with representative advanced CNNs (AlexNet, GoogleNet, VGGNet, IGCNet, and GCNN) and with some classic machine learning algorithms, including decision tree (DT) [33], Naive Bayes (NB) [34], MLP [35], DNN [36], LSTM, and RNN [37]. The specific configurations of the experimental environment in the simulation are shown in Table 1.

Table 1. The Specific Configurations of the Experimental Environment in the Simulation

Category	Item	Configuration
Software	Operating system	Linux 64-bit
Software	Python version	Python 3.6.1
Software	Simulation platform	MATLAB
Software	Development platform	PyCharm
Software	Open source package	TensorFlow
Hardware	CPU	Intel Core i7-7700 @ 4.2 GHz 8
Hardware	RAM	Kingston DDR4 2400 MHz 16 GB
Hardware	GPU	Nvidia GeForce 1060 8 GB

Note: CPU refers to central processing unit; RAM refers to random access memory; and GPU represents graphic processing unit.
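The data split and training settings described in Section 3.6 can be written down as the following sketch. The dictionary keys, the function name, the random shuffling, and the seed are assumptions made for illustration; the text only states the 80%/20% ratio and the parameter values, not how the split was drawn.

```python
import random

# Hyperparameters as reported in Section 3.6; the key names are illustrative.
HPARAMS = {
    "epochs": 20,
    "learning_rate": 0.002,
    "batch_size": 128,
    "kernel_size": (1, 3),
    "activation": "tanh",
    "optimizer": "adam",
    "dropout": 0.5,
}

def split_dataset(records, train_fraction=0.8, seed=0):
    """Shuffle the records and split them into training and test subsets.
    Assumption: a seeded random shuffle; the source does not specify this."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Usage on 1,000 placeholder record IDs (not the real examination data).
train_ids, test_ids = split_dataset(range(1000))
```

Fixing the seed makes the split reproducible, so all compared models can be evaluated on the same test records.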

4 Results and Discussion

4.1 Performance Comparative Analysis of the System Model and Advanced Convolutional Neural Networks

The constructed system model is compared with the advanced CNNs to analyze its performance from the perspective of accuracy, precision, recall, F1 value, and error, as shown in Figures 5–9.

Fig. 5.

In Figures 5–8, the system model is compared with the advanced CNNs from the perspectives of accuracy, precision, recall, and F1, respectively. Figure 5 illustrates that the accuracy of the proposed algorithm reaches 82.4%, which is at least 2.4% higher than that of the other advanced CNNs (IGCNet, VGGNet, GoogleNet, and GCNN). Figures 6–8 reveal that the precision, recall, and F1 of the proposed algorithm are the highest; as the harmonic mean of precision and recall, the F1 value lies between the two. Further analysis from the perspective of error discloses that the error of the proposed algorithm is the smallest (23.34 pixels), while the errors of the other advanced CNNs are all above 30 pixels (as shown in Figure 9). Therefore, the constructed interactive SHPE model based on deep learning has better precision and smaller error than the other advanced CNN algorithms, so its performance is significantly better.
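For reference, the four classification metrics compared in Figures 5–8 can be computed from confusion counts as in the sketch below. The function name and the binary confusion counts are hypothetical and serve only to show the formulas; how the study aggregates counts across labels is not stated in this section.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall, and F1 from confusion counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

# Usage with made-up counts (not results from the study).
accuracy, precision, recall, f1 = classification_metrics(tp=80, fp=20, fn=10, tn=90)
```

Because F1 is a harmonic mean, it always lies between precision and recall and is pulled toward the smaller of the two.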

Fig. 6.

Fig. 7.

Fig. 8.

Fig. 9.

4.2 Performance Comparative Analysis of the System Model and Classic Machine Learning Algorithms

The constructed system model is compared with the classic machine learning algorithms to analyze its performance from the perspective of accuracy, precision, recall, F1 value, and error, as shown in Figures 10–14.

Fig. 10.

In Figures 10–13, the system model is compared with the classic machine learning algorithms from the perspectives of accuracy, precision, recall, and F1, respectively. Figure 10 illustrates that the accuracy of the proposed algorithm reaches 82.4%, which is at least 3.3% higher than that of the other classic machine learning algorithms. Figures 11–13 reveal that the precision, recall, and F1 of the proposed algorithm are the highest, each at least 5% higher than that of the other algorithms. Further analysis from the perspective of error discloses that the error of the proposed algorithm is the smallest (23.34 pixels), while the errors of the other classic machine learning algorithms are all above 45 pixels, and even reach 150 pixels for NB (as shown in Figure 14). Therefore, the constructed interactive SHPE model based on deep learning shows better precision and smaller error.

Fig. 11.

Fig. 12.

Fig. 13.

Fig. 14.

To analyze a variety of diseases, a CNN algorithm under deep learning is adopted to build an interactive SHPE system model, and its performance is compared with that of the advanced CNN algorithms and the classic machine learning algorithms [38–40]. The results disclose that the accuracy, precision, recall, and F1 of the constructed model are significantly better than those of both groups of algorithms. In addition, the error is significantly lower. Therefore, the method proposed in this study is able to achieve better multi-label classification prediction results.

5 Conclusions

In conclusion, this study shows that the built interactive SHPE model based on deep learning can ensure safety performance while significantly improving the prediction accuracy with low error, providing an experimental reference for the later prediction and evaluation of smart healthcare. However, this study has some limitations. The current smart healthcare diagnostic system is only a preliminary attempt and has to be improved in many aspects, such as prediction accuracy, symptom correlation, and the exploration of data preprocessing methods. It is hoped that this platform will alleviate the pressure on the current healthcare environment and ease various current healthcare difficulties in China. Subsequent studies will pay further attention to more details to further improve the prediction accuracy of the system model [41, 42].

Footnote

It is a datatype.

References

[1]

M. Jamshidi, A. Lalbakhsh, J. Talla, Z. Peroutka, F. Hadjilooei, P. Lalbakhsh, and A. Sabet. 2020. Artificial intelligence and COVID-19: Deep learning approaches for diagnosis and treatment. IEEE Access 8 (2020), 109581–109595.
