Article

BagStacking: An Integrated Ensemble Learning Approach for Freezing of Gait Detection in Parkinson’s Disease

Software and Information Systems Engineering, Ben-Gurion University, David Ben-Gurion Blvd. 1, Beer Sheva 84105, Israel
* Author to whom correspondence should be addressed.
Information 2024, 15(12), 822; https://doi.org/10.3390/info15120822
Submission received: 21 November 2024 / Revised: 16 December 2024 / Accepted: 17 December 2024 / Published: 23 December 2024
(This article belongs to the Special Issue Application of Machine Learning in Human Activity Recognition)

Abstract

This study introduces BagStacking, an innovative ensemble learning framework designed to enhance the detection of freezing of gait (FOG) in Parkinson’s disease (PD) using accelerometer data. By synergistically combining bagging’s variance reduction with stacking’s sophisticated blending mechanisms, BagStacking achieves superior predictive performance. Evaluated on a comprehensive PD dataset provided by the Michael J. Fox Foundation, BagStacking attained a mean average precision (MAP) of 0.306, surpassing standalone LightGBM and traditional stacking methods. Furthermore, BagStacking demonstrated superior area under the curve (AUC) metrics across key FOG event classes. Specifically, it achieved AUCs of 0.88 for start hesitation, 0.90 for turning, and 0.84 for walking events, outperforming multistrategy ensemble, regular stacking, and LightGBM baselines. Additionally, BagStacking exhibited reduced runtime compared to other ensemble approaches, making it suitable for real-time clinical monitoring. These results underscore BagStacking’s effectiveness in addressing the variability inherent in FOG detection, thereby contributing to improved patient care in PD.

1. Introduction

Parkinson’s disease (PD) is a neurodegenerative disorder that affects millions of people worldwide. One of its most debilitating symptoms is freezing of gait (FOG), a phenomenon in which a patient’s feet feel as if they are “glued” to the ground, preventing forward movement despite the intention to walk [1]. FOG episodes can significantly impair a patient’s quality of life, increasing the risk of falls and restricting independence. Despite the prevalence and impact of FOG, its causes remain poorly understood, and its detection and prediction are challenging tasks [2].
Machine learning has emerged as a promising tool for FOG detection, with several studies demonstrating the potential of various algorithms to identify FOG episodes from wearable sensor data [1,2]. However, the performance of these models can be influenced by the inherent variability in the data, arising from differences in disease progression, manifestation of symptoms, and individual gait patterns. To address this, ensemble learning methods, which combine multiple models to improve prediction performance, have been proposed [3].
Ensemble methods such as bagging and stacking have been successfully applied in various domains, including medical applications [3,4,5,6]. Bagging improves the stability of the model and reduces variance by training the models on bootstrap samples and averaging the predictions. On the other hand, stacking trains a meta-learner to find optimal combinations of diverse base models, leveraging model diversity for more accurate predictions [7,8].
In this paper, we propose a novel ensemble method, BagStacking, that integrates the principles of bagging and stacking. The motivation behind this approach is to gain the variance reduction benefit from bagging’s bootstrap sampling while also learning sophisticated blending via stacking. The BagStacking method trains a set of base models on bootstrap samples from the training data, providing a diverse set of predictors. A meta-learner is then trained on the base model outputs and true labels to find an optimal aggregation scheme.
Our main contributions are as follows:
  • Introduction of BagStacking Method: This study presents BagStacking, a novel ensemble learning approach specifically tailored to detect FOG in Parkinson’s disease patients using accelerometer data. The method combines bagging’s variance reduction and stacking’s blending capability, creating a unique and innovative solution for handling the high variability in FOG data.
  • Theoretical Performance Analysis: We establish a theoretical foundation for BagStacking, showing that the method can outperform traditional ensemble techniques such as standalone bagging and stacking.
  • Empirical Validation on Real-World Data: Using a robust dataset on FOG episodes, we validate BagStacking’s performance where it achieves higher accuracy (MAP score) and efficiency than established machine learning methods like LightGBM and standard stacking. BagStacking not only increases FOG detection accuracy but also improves runtime, making it suitable for real-time applications in clinical settings.
  • Open Source Contribution to the Community: We provide an open-source implementation of BagStacking within the Scikit-learn API framework at https://github.com/SeffiCohen/BagStacking (accessed on 20 November 2024).

2. Related Work

The automatic detection of FOG events from wearable sensor data has been an active research area over the past decade. A variety of machine learning techniques have been applied to this problem, seeking to accurately identify FOG episodes in continuous sensor measurements. Early work by Moore et al. [9] and Bachlin et al. [10] used threshold-based methods on features derived from accelerometer and gyroscope data to detect FOG. While simple and fast, these methods were prone to misclassification errors. To address this, more complex classifiers were explored, including support vector machines [11], random forests [12], and neural networks [13]. Feature engineering and selection were found to be key factors affecting performance. Time and frequency domain features capturing gait rhythmicity and variance were often extracted. More recent work has increasingly utilized deep learning for FOG detection [14,15,16]. Convolutional and recurrent neural networks can learn predictive features directly from raw sensor data. However, large labeled datasets are needed for training such models. A persistent challenge in FOG detection is variability—differences in symptoms and individual gait can affect model robustness. To tackle this, ensemble methods have been proposed [17,18]. Fahira et al. [19] used a random forest ensemble, showing improved recognition over single decision trees.
Stacking has proven to be an effective technique in which a meta-learner combines the predictions of multiple diverse base models. Chaurasia et al. [17] employed stacking of SVM and KNN classifiers and found that model diversity improved overall accuracy.
Our proposed BagStacking approach aims to enhance diversity by using bagging to train base models. Our experiments show that integrating bagging and stacking results in more robust FOG detection, which is a well-motivated approach based on the literature.

3. Method

Ensemble learning methods leverage multiple models to improve prediction performance over single models. Techniques like bagging, boosting, and stacking have shown success by combining simple base learners. Our proposed BagStacking ensemble integrates both bagging and stacking approaches. The intuition is to gain variance reduction from bagging’s bootstrap sampling while also learning sophisticated blending via stacking.
Given a labeled training set $S = \{(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)\}$ with $N$ examples, our goal is to learn an ensemble model $H(x)$ that can accurately predict the label $y$ for a new input $x$. The method trains a set of base models on bootstrap samples from the training data. These provide a diverse set of predictors. A meta-learner is then trained on the base-model outputs and true labels to find an optimal aggregation scheme. The entire procedure is described in full detail below, illustrated in Figure 1, and sketched in code after the list.
  • Bootstrap Sampling: Randomly sample with replacement to obtain $D = \{d_1, d_2, \ldots, d_k\}$, a collection of subsets of the training data, where $d_i = \{(x_{i1}, y_{i1}), (x_{i2}, y_{i2}), \ldots, (x_{in}, y_{in})\}$.
  • Base Models Training: Train the base models $M = \{m_1, m_2, \ldots, m_k\}$, one on each bootstrap set, using cross-validation. Any supervised model, such as a linear model, SVM, decision tree, GBM, or neural network, can be utilized.
  • Base Models Predictions: Apply the base models to the original cross-validated training set to obtain the predicted label vectors $P = \{p_1, p_2, \ldots, p_k\}$, where $p_i = m_i(S)$.
  • Meta-Learner Training: Train a meta-learner $M'$, which can be any supervised model, on the base-model outputs: it takes the predicted label vectors $P$ as inputs and the true labels as targets.
  • Ensemble Prediction: Apply base models to a new instance. Feed outputs to meta-learner for final prediction.
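To make these steps concrete, below is a minimal sketch of the procedure in the Scikit-learn style of our open-source implementation. It is illustrative rather than the released code: the LightGBM base learners match Section 4, while the logistic-regression meta-learner, k = 5, the binary-classification setting, and the NumPy-array inputs are simplifying assumptions; for brevity, the meta-learner is trained on in-sample rather than cross-validated base predictions.

```python
import numpy as np
from lightgbm import LGBMClassifier
from sklearn.linear_model import LogisticRegression

def bagstacking_fit(X, y, k=5, random_state=0):
    """Steps 1-4: bootstrap sampling, base-model training,
    base predictions, and meta-learner training."""
    rng = np.random.default_rng(random_state)
    n = len(X)
    base_models = []
    for _ in range(k):
        idx = rng.integers(0, n, size=n)               # bootstrap sample with replacement
        base_models.append(LGBMClassifier().fit(X[idx], y[idx]))
    # Predicted label vectors P: each base model applied to the original training set
    P = np.column_stack([m.predict_proba(X)[:, 1] for m in base_models])
    meta = LogisticRegression().fit(P, y)              # learns the aggregation scheme
    return base_models, meta

def bagstacking_predict(base_models, meta, X_new):
    """Step 5: feed base-model outputs on new instances to the meta-learner."""
    P_new = np.column_stack([m.predict_proba(X_new)[:, 1] for m in base_models])
    return meta.predict_proba(P_new)[:, 1]
```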

3.1. BagStacking Theoretical Foundation

Theorem 1. 
Let H be a hypothesis space of the base models, and let M be a hypothesis space for meta-learners. Given a loss function L and a labeled training set $S = \{(x_1, y_1), \ldots, (x_N, y_N)\}$, the BagStacking method is expected to produce equal or better generalization performance, as measured by the expected loss $E[L]$, than any individual method based solely on bagging or stacking, under the condition that the base models and the meta-model are appropriately regularized.

3.1.1. Assumptions

  • Base models are independently trained on different bootstrapped samples from the dataset S.
  • The meta-learner is trained on the outputs of these base models.
  • All models are trained to minimize the empirical loss on their respective training sets, subject to regularization constraints.

3.1.2. Step 1: Bagging’s Variance Reduction

Bagging operates by averaging the predictions of k base models, each trained on a different subset of the data. The ensemble prediction for an input x can be formulated as
$$\hat{y}_{\text{bagging}} = \frac{1}{k} \sum_{i=1}^{k} p_i$$
The expected prediction variance of this ensemble is
$$\mathrm{Var}(\hat{y}_{\text{bagging}}) = \frac{1}{k^2} \sum_{i=1}^{k} \mathrm{Var}(p_i) < \mathrm{Var}(p_i), \quad \forall i$$
Thus, under the independence assumption stated above, which eliminates the covariance terms from the variance of the average, bagging reduces the prediction variance compared to any individual base model.
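A quick numerical illustration of this step, under the stated independence assumption; the choice of k = 8 predictors with unit variance is arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
k = 8
preds = rng.normal(0.0, 1.0, size=(100_000, k))  # k independent base predictions, Var = 1
print(preds[:, 0].var())                          # ~1.0: variance of a single base model
print(preds.mean(axis=1).var())                   # ~1/8: variance of the bagged average
```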

3.1.3. Step 2: Stacking’s Model Diversity

Stacking aims to find a meta-model M that best combines the base models. The ensemble prediction is
$$\hat{y}_{\text{stacking}} = M(P)$$
Given that the meta-model is trained to minimize the loss, it can capture complex relationships and correlations among base model outputs, thereby enhancing predictive performance.

3.1.4. Step 3: BagStacking’s Combined Strengths

BagStacking leverages both the variance reduction of bagging and the model diversity of stacking. The ensemble prediction for BagStacking is
$$\hat{y}_{\text{bagstacking}} = M(P)$$
Here, $P = \{p_1, p_2, \ldots, p_k\}$ is the set of predictions from the base models, each trained on a different bootstrapped subset of the training data.

3.1.5. Step 4: Expected Generalization Performance

Given that BagStacking incorporates the strengths of both bagging (variance reduction) and stacking (model diversity), we can assert
$$E[L(\hat{y}_{\text{bagstacking}}, y)] \leq \min\left\{ E[L(\hat{y}_{\text{bagging}}, y)],\ E[L(\hat{y}_{\text{stacking}}, y)] \right\}$$
The theorem establishes that the BagStacking method should offer equal or better generalization performance than either bagging or stacking alone, contingent on appropriate regularization. This theoretical insight aligns well with the empirical results, thus providing a rigorous foundation for the effectiveness of the BagStacking approach.
The overall training complexity is $O(N \cdot M)$ for $M$ base learners on a dataset of $N$ examples. For prediction, the cost is $O(M)$ to generate the $M$ base outputs. Compared to simple model averaging, BagStacking can learn complex non-linear combinations via the meta-learner. Unlike stacked generalization, which uses a single hold-out set, BagStacking leverages bagging for variance reduction.

4. Experiments

In this section, we discuss the experimental setup, results, and detailed analysis of our BagStacking method, which uses LightGBM as the base estimator. We compare our approach with a standalone LightGBM model, a multistrategy ensemble learning method [20] that uses diverse ensemble settings including bagging, GBM, and random forest as a strong ensemble method, and a classical stacking method with analogous settings, which serve as the baseline for our study.

4.1. Dataset

The dataset employed in this study is an essential contribution, generously supported by the Michael J. Fox Foundation for Parkinson’s Research [21]. The data series include subjects who completed a FOG-provoking protocol, recorded at 128 Hz or 100 Hz on three axes: vertical (V), mediolateral (ML), and anteroposterior (AP). The total dataset comprises over 20 million samples, covering a diverse cohort of PD patients at different stages of disease progression. Specifically, the training set comprises 920 subjects with 20,588,374 samples in total, and the test set includes 250 subjects. The dataset’s richness lies in its comprehensive coverage of different FOG-provoking scenarios, including start hesitation, turning, and walking. This diversity allows for the development of models capable of generalizing across various manifestations of FOG. The distribution of FOG events across the dataset is imbalanced, with FOG episodes constituting approximately 5% of the total samples, reflecting their real-world prevalence in PD patients. Table 1 provides a summary of the dataset statistics.
These datasets offer a robust and nuanced platform to understand, model, and detect the complex phenomenon of FOG, fostering advancements in the mitigation of this crippling symptom of Parkinson’s disease.

4.2. Preprocessing and Feature Engineering

The preprocessing pipeline transforms raw accelerometer signals into meaningful features suitable for downstream machine learning tasks. We employ a dual-window segmentation strategy, utilize both time-domain and frequency-domain feature extraction techniques, and incorporate nonlinear and domain-specific measures known to be effective in freezing of gait (FOG) detection.
Raw Signal Example: As a starting point, Figure 2 shows a short segment of raw accelerometer data from the vertical (AccV), mediolateral (AccML), and anteroposterior (AccAP) axes, captured over a 5-s window. These signals depict the variability and dynamics inherent in the sensor readings and serve as the input to our preprocessing pipeline.
Segmentation and Windowing: We employed the Seglearn package to segment the accelerometer time series data into overlapping windows. Two window sizes were chosen: a 5-s window to capture short-term patterns and a 50-s window to capture longer-term gait dynamics, each with a step size of 1 s. This dual-window approach ensures that both rapid changes and broader context in the gait pattern are captured, ultimately providing a richer feature representation.
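The following is a minimal NumPy sketch of this dual-window segmentation; it sidesteps the Seglearn API to stay self-contained, and the 128 Hz rate, the one-minute stand-in signal, and the function name are illustrative assumptions.

```python
import numpy as np

def sliding_windows(signal, fs, window_s, step_s):
    """Segment a (n_samples, n_axes) signal into overlapping windows."""
    width, step = int(window_s * fs), int(step_s * fs)
    starts = range(0, len(signal) - width + 1, step)
    return np.stack([signal[s:s + width] for s in starts])

fs = 128                                  # Hz; the dataset also contains 100 Hz series
acc = np.random.randn(60 * fs, 3)         # stand-in for one minute of AccV/AccML/AccAP
short = sliding_windows(acc, fs, window_s=5, step_s=1)   # short-term patterns
long_ = sliding_windows(acc, fs, window_s=50, step_s=1)  # longer-term gait dynamics
print(short.shape, long_.shape)           # (56, 640, 3) (11, 6400, 3)
```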
Feature Transformations: Within each window, we extracted a comprehensive set of features that span multiple domains. Time-domain features include statistical metrics such as mean, median, standard deviation, variance, skewness, and kurtosis for each acceleration axis. In the frequency domain, we applied the fast Fourier transform (FFT) to obtain spectral energy, dominant frequencies, and power spectral density (PSD) measures, including PSD mean and median. Figure 3 illustrates examples of these transformations, highlighting how the raw signals evolve into more abstract, yet informative, descriptors.
In addition to standard time-frequency features, we integrated nonlinear measures such as sample entropy to capture the complexity and irregularity of the accelerometer signals. Sample entropy is a nonlinear metric that assesses the likelihood that similar patterns of observations remain similar over time, providing insight into the unpredictability characteristic of FOG episodes. Domain-specific features, such as the freeze index (FI) and PSD ratios, previously shown to be effective in FOG detection [9], were also included. Finally, correlation coefficients between acceleration axes were computed to capture inter-axis coordination.
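As an example of a domain-specific feature, the sketch below computes the freeze index for a single-axis window. The 3–8 Hz freeze band and 0.5–3 Hz locomotor band are the values commonly used in the literature following Moore et al. [9], not parameters quoted in this paper.

```python
import numpy as np
from scipy.signal import welch
from scipy.integrate import trapezoid

def freeze_index(window, fs):
    """Ratio of spectral power in the freeze band (~3-8 Hz)
    to power in the locomotor band (~0.5-3 Hz)."""
    f, psd = welch(window, fs=fs, nperseg=min(len(window), 256))
    freeze_band = (f >= 3) & (f < 8)
    loco_band = (f >= 0.5) & (f < 3)
    return trapezoid(psd[freeze_band], f[freeze_band]) / trapezoid(psd[loco_band], f[loco_band])
```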
Feature Utility and Future Directions: The combination of time-domain, frequency-domain, wavelet-based, nonlinear, and domain-specific features enhances the discriminative power of our dataset, thereby improving the accuracy of detecting gait events and identifying FOG episodes. In particular, the inclusion of sample entropy provides a nuanced characterization of gait dynamics. While entropy is already an effective nonlinear feature, future work may explore additional nonlinear measures to further refine model performance in FOG detection tasks.

4.3. Experimental Setup

The experimental framework was designed using a dataset obtained from a Kaggle competition [22]. The dataset consists of 3D accelerometer data from the lower back, tracking 169 subjects who have experienced FOG episodes. The dataset was strategically divided into training and validation sets to facilitate model development and meticulous performance evaluation. The training–validation split comprised around 20 million samples for each set, with no overlapping subject between the two sets.
Within the BagStacking methodology, an ensemble of LightGBM models, each trained on distinctive bootstrap samples from the training data, was employed. The base model predictions were subsequently fed into a meta-model that was optimized to render the final predictions.

Selection of Comparison Methods

To evaluate the effectiveness of the proposed BagStacking method, we selected a diverse set of machine learning approaches that represent both single-model and ensemble techniques commonly used in FOG detection and related domains. The selected comparison methods are as follows:
  • Standalone LightGBM Model: LightGBM is a gradient boosting framework that has demonstrated high performance and efficiency in various classification tasks, including those involving large-scale datasets. By comparing BagStacking against a standalone LightGBM model, we aim to assess the added value of our ensemble approach over a powerful single-model baseline.
  • Multistrategy Ensemble Learning Method [20]: This method employs diverse ensemble strategies, including bagging, gradient boosting machines (GBM), and random forests, to enhance predictive performance. By incorporating multiple ensemble techniques, the multistrategy ensemble serves as a robust benchmark to evaluate how BagStacking compares to other ensemble methodologies that leverage different mechanisms for improving model accuracy and robustness.
  • Classical Stacking Method: Stacking is a widely recognized ensemble technique that combines multiple base models through a meta-learner to achieve superior performance. By including a classical stacking approach with analogous settings, we aim to demonstrate the efficacy of BagStacking’s integrated bagging and stacking strategy against traditional stacking methods that do not incorporate bagging.
These methods were chosen to provide a comprehensive comparison across different modeling paradigms—single-model, ensemble-based, and meta-ensemble approaches. This selection allows us to thoroughly evaluate the advantages of BagStacking in terms of accuracy and computational efficiency within the context of FOG detection in Parkinson’s disease.
The default parameters were utilized for the comparison techniques, followed by rigorous hyperparameter tuning to enhance their performance. The use of identical training and validation sets ensured an unbiased and equitable comparison.

4.4. Evaluation Metric

To evaluate the performance, we use the mean average precision (MAP) as the evaluation metric and the area under the curve (AUC) metric to offer a more comprehensive evaluation of model performance. We compute class-specific AUCs for start hesitation, turning, and walking. We calculate the MAP specifically for each of the three event classes. For each event class, we compute the average precision (AP) based on the predicted confidence scores and ground truth labels. We take into account only the portions of the data series that are annotated with “Valid” labels set to true. It is important to note that there were cases during the video annotation process where it was difficult for the annotator to decide if there was an akinetic FOG or if the subject stopped voluntarily. Therefore, only event annotations where the series is marked as “True” should be considered as unambiguous. Finally, we obtain the MAP by averaging the APs for all three event classes as follows:
$$\text{MAP} = \frac{1}{3} \left( AP_{\text{class}_1} + AP_{\text{class}_2} + AP_{\text{class}_3} \right)$$
where $AP_{\text{class}_i}$ denotes the average precision for the $i$-th event class. While MAP assesses the ranking quality of the predictions, the AUC metric provides a complementary measure of discriminative ability across all three FOG event classes.
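A sketch of this computation using scikit-learn’s average_precision_score follows; the array layout and the boolean mask encoding the “Valid” filtering are hypothetical.

```python
import numpy as np
from sklearn.metrics import average_precision_score

def mean_average_precision(y_true, y_score, valid_mask):
    """y_true, y_score: (n_samples, 3) arrays for the three FOG event classes;
    valid_mask: keeps only samples with unambiguous ('Valid' = True) annotations."""
    aps = [average_precision_score(y_true[valid_mask, c], y_score[valid_mask, c])
           for c in range(3)]
    return float(np.mean(aps))
```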

5. Results

Table 2 illustrates the experimental outcomes, presenting the MAP scores and runtime for the BagStacking method alongside the comparison techniques on the validation set. Additionally, Table 3 and Figure 4 present the AUC metrics for each method across the three FOG event classes.
The findings show that BagStacking’s MAP score of 0.306 surpassed all comparison methods, attesting to its effectiveness in discerning FOG events from accelerometer data. Notably, BagStacking achieved a 7% relative improvement over regular stacking and a 30% improvement over the standalone LightGBM model. These relative improvements were calculated using the formula
$$\text{Relative Improvement (\%)} = \frac{\text{MAP}_{\text{BagStacking}} - \text{MAP}_{\text{Comparison}}}{\text{MAP}_{\text{Comparison}}} \times 100$$
For instance, the improvement over regular stacking is calculated as
$$\frac{0.306 - 0.286}{0.286} \times 100 \approx 7\%$$
Similarly, the improvement over LightGBM is
$$\frac{0.306 - 0.234}{0.234} \times 100 \approx 30\%$$

5.1. AUC Performance

In addition to the MAP scores, we evaluated the models using the area under the receiver operating characteristic curve (AUC) for each FOG event class. Table 3 summarizes the AUC results, and Figure 4 provides a visual comparison. BagStacking outperformed all other methods across all three event classes, achieving AUCs of 0.88 for start hesitation, 0.90 for turning, and 0.84 for walking events. These results indicate that BagStacking not only improves precision but also enhances the overall discriminative ability of the model.

5.2. Runtime Analysis

While the BagStacking method required more computational time than the single LightGBM model due to the training of multiple base models and the meta-learner, it was more efficient than regular stacking and the multistrategy ensemble. This efficiency can be attributed to the parallelizable nature of training base models on bootstrap samples and the relatively low complexity of the meta-learner.

5.2.1. Identified Runtime Bottlenecks

A detailed examination of the BagStacking workflow reveals that the primary runtime bottleneck lies in the training of the base models. This is primarily due to the large volume of repetitive data processed during training. Each base model in the ensemble is trained on a distinct bootstrap sample, which, despite being a subset of the original dataset, still involves significant computational resources given the dataset’s size.

5.2.2. Impact of Bagging on Computational Efficiency

Bagging plays a crucial role in mitigating the computational load by reducing the amount of data each base model needs to process. By sampling with replacement, bagging not only increases the diversity among the base models but also ensures that each model trains on a smaller, more manageable subset of the data. This reduction in data volume per base model leads to a dramatic decrease in training time compared to training a single model on the entire dataset.

5.2.3. Meta-Learner Optimization

The optimization of the meta-learner introduces additional computational overhead; however, this is relatively minor compared to the base model training. The meta-learner operates on the predictions generated by the base models, which are significantly smaller in size, thereby keeping the optimization process efficient. Furthermore, the simplicity of the meta-learner architecture contributes to keeping its runtime impact minimal.

5.2.4. Scalability and Parallelization

One of the strengths of the BagStacking method is its scalability through parallelization. The training of base models can be distributed across multiple processors or machines, effectively reducing the wall-clock time required for model training. This parallelizable nature allows BagStacking to handle large datasets efficiently, making it suitable for real-time applications where computational resources can be scaled as needed.
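A sketch of how this parallelism might be realized with joblib (the backend scikit-learn itself uses) is shown below; the helper names, k, and n_jobs are illustrative, not the configuration used in our experiments.

```python
import numpy as np
from joblib import Parallel, delayed
from lightgbm import LGBMClassifier

def fit_one(X, y, seed):
    rng = np.random.default_rng(seed)
    idx = rng.integers(0, len(X), size=len(X))  # independent bootstrap sample
    return LGBMClassifier().fit(X[idx], y[idx])

def fit_base_models_parallel(X, y, k=5, n_jobs=-1):
    # Bootstrap fits share no state, so the k trainings map cleanly onto workers.
    return Parallel(n_jobs=n_jobs)(delayed(fit_one)(X, y, s) for s in range(k))
```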

5.2.5. Comparison with Other Methods

Compared to Regular Stacking, which often involves training diverse models on the same dataset without data reduction, BagStacking benefits from both data subset training and model diversity, leading to improved computational efficiency. The multistrategy ensemble, encompassing a broader range of ensemble techniques, inherently requires more computational resources, which BagStacking optimizes through its structured approach to bagging and stacking.

5.2.6. Potential Optimization Strategies

To further alleviate runtime bottlenecks, future work could explore advanced optimization techniques, such as
  • Dynamic Data Sampling: Adjusting the size of bootstrap samples based on model performance and computational constraints.
  • Efficient Meta-Learner Architectures: Employing lightweight meta-learners that require fewer computational resources without compromising performance.
  • Incremental Training: Updating base models incrementally as new data arrives, reducing the need for retraining from scratch.
In conclusion, while BagStacking introduces additional computational steps compared to single-model approaches, its design effectively manages runtime through data reduction in base model training and the efficient optimization of the meta-learner. These considerations make BagStacking a viable and efficient method for real-time FOG detection applications in Parkinson’s disease patients.

6. Discussion

The BagStacking approach presented in this study improves FOG detection performance by effectively combining the variance reduction of bagging with the blending strengths of stacking. Consistent with previous work showing that ensemble strategies can outperform single classifiers in complex clinical domains [7,8], our framework addresses the inherent variability in FOG detection data more efficiently than traditional methods. By training base models on bootstrap samples, BagStacking enhances model diversity and leverages the repeated time-series nature of FOG-related accelerometer data to improve runtime efficiency. The meta-learner then integrates the outputs of these diverse base models to achieve superior predictive accuracy.
When compared with standard ensemble methods such as multistrategy ensembles [20] or simple stacking, the AUC improvements across all three event classes (start hesitation, turning, and walking) underscore BagStacking’s robustness. Specifically, achieving AUCs of 0.88 for start hesitation, 0.90 for turning, and 0.84 for walking represents a substantial enhancement in distinguishing subtle and transient FOG events from non-FOG segments. Earlier studies have highlighted the challenges of reliably detecting these brief, heterogeneous episodes [1,17]. The current results indicate that BagStacking’s integrated, variance-focused approach addresses these difficulties more effectively than previously applied machine learning strategies.

6.1. Runtime Efficiency and Practical Implications

In addition to improved accuracy, BagStacking demonstrates practical advantages in runtime efficiency. Although advanced classifiers often incur high computational costs, BagStacking mitigates this issue by training the base models on bootstrap samples, which reduces the amount of data each base model processes, and by keeping the structure of the meta-learner relatively simple. This design enables faster processing than more complex ensemble strategies.

6.2. Limitations and Future Work

Despite these promising outcomes, certain limitations provide avenues for further investigation. The imbalanced nature of the dataset, with relatively rare FOG events, remains a challenge that can potentially be alleviated by data augmentation or synthetic sample generation [2]. Additionally, while BagStacking currently employs traditional machine learning models, integrating deep learning architectures such as convolutional neural networks or Transformer models could improve the performance. While this study focuses on FOG detection in Parkinson’s disease, the underlying principles of BagStacking may be readily extended to other time-series sensor data and high-frequency sampling applications. Ensemble methods have proven adaptable across various domains and tasks, suggesting that BagStacking could also benefit related healthcare analytics challenges or other fields characterized by complex, streaming sensor data. This generalizability, combined with the demonstrated effectiveness of BagStacking, encourages broader adoption and ongoing refinement.

7. Conclusions

This research introduced and evaluated BagStacking, an innovative ensemble learning technique engineered to augment the detection of freezing of gait (FOG) in Parkinson’s disease (PD). By leveraging both bagging and stacking, this novel approach demonstrated superiority in both MAP scores and AUC metrics compared to conventional models. Specifically, BagStacking achieved a MAP score of 0.306, representing a 30% improvement over standalone LightGBM and surpassing traditional stacking by 7%. Furthermore, BagStacking excelled in discriminating between FOG and non-FOG events across start hesitation, turning, and walking classes, attaining AUCs of 0.88, 0.90, and 0.84, respectively.
The strategic use of a lower back sensor to track acceleration contributed to the precision and reliability of the method in real-world applications. BagStacking’s success, embodied in both its conceptual innovation and empirical validation through robust AUC metrics, marks a promising pathway in PD patient care. The method not only enhances detection accuracy but also maintains computational efficiency, making it suitable for real-time clinical monitoring.
The insights gleaned from this study invigorate further exploration and refinement, steering toward a new horizon in machine learning and medical diagnosis. Future work will focus on integrating deep learning architectures into the BagStacking framework and addressing data imbalance through advanced techniques, thereby further enhancing the robustness and applicability of FOG detection systems. Overall, BagStacking stands as a significant advancement, enabling more accurate and efficient FOG detection and offering potential pathways for future enhancements in the intersection of machine learning and healthcare.

Author Contributions

Conceptualization, S.C.; methodology, S.C.; validation, S.C.; formal analysis, S.C.; investigation, S.C.; data curation, S.C.; writing—original draft preparation, S.C.; writing—review and editing, S.C. and N.C.-I.; visualization, S.C.; supervision, L.R. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The dataset employed in this study is an essential contribution, generously supported by the Michael J. Fox Foundation for Parkinson’s Research [21] and available online at https://www.kaggle.com/competitions/tlvmc-parkinsons-freezing-gait-prediction/data (accessed on 20 November 2024).

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
FOG   Freezing of gait
PD    Parkinson’s disease
MAP   Mean average precision
PSD   Power spectral density
FFT   Fast Fourier transform
FI    Freeze index
V     Vertical
ML    Mediolateral
AP    Anteroposterior
AUC   Area under the curve
KNN   K-nearest neighbors
GBM   Gradient boosting machines
IoT   Internet of Things

References

  1. Naghavi, N.; Miller, A.; Wade, E. Towards Real-Time Prediction of Freezing of Gait in Patients with Parkinson’s Disease: Addressing the Class Imbalance Problem. Sensors 2019, 19, 3898. [Google Scholar] [CrossRef] [PubMed]
  2. Shah, S.Y.; Iqbal, Z.; Rahim, A. Constrained Optimization-Based Extreme Learning Machines with Bagging for Freezing of Gait Detection. Big Data Cogn. Comput. 2018, 2, 31. [Google Scholar] [CrossRef]
  3. Pintelas, P.; Livieris, I.E. Special Issue on Ensemble Learning and Applications. Algorithms 2020, 13, 140. [Google Scholar] [CrossRef]
  4. Bose, R.; Dutta, A.; Bhattacharyya, M.; Ghosh, S.; Chakraborty, C.; Chakraborty, C.; Bhattacharyya, P. An ensemble machine learning model based on multiple filtering and supervised attribute clustering algorithm for classifying cancer samples. PeerJ Comput. Sci. 2021, 7, e671. [Google Scholar] [CrossRef] [PubMed]
  5. Cohen, S.; Katz, O.; Presil, D.; Arbili, O.; Rokach, L. Ensemble Learning For Alcoholism Classification Using EEG Signals. IEEE Sens. J. 2023, 23, 17714–17724. [Google Scholar] [CrossRef]
  6. Cohen, S.; Lior, E.; Bocher, M.; Rokach, L. Improving severity classification of Hebrew PET-CT pathology reports using test-time augmentation. J. Biomed. Inform. 2024, 149, 104577. [Google Scholar] [CrossRef] [PubMed]
  7. Hosni, A.; Chawki, Y.; Idri, A.; Bakkoury, Z.; Al Achhab, M.; González, J.R. A systematic mapping study for ensemble classification methods in cardiovascular disease. Artif. Intell. Rev. 2021, 54, 2827–2861. [Google Scholar] [CrossRef]
  8. Cohen, S.; Dagan, N.; Cohen-Inger, N.; Ofer, D.; Rokach, L. ICU survival prediction incorporating test-time augmentation to improve the accuracy of ensemble-based models. IEEE Access 2021, 9, 91584–91592. [Google Scholar] [CrossRef]
  9. Moore, S.T.; MacDougall, H.G.; Ondo, W.G. Ambulatory monitoring of freezing of gait in Parkinson’s disease. J. Neurosci. Methods 2008, 167, 340–348. [Google Scholar] [CrossRef]
  10. Bachlin, M.; Plotnik, M.; Roggen, D.; Inbar, N.; Sagiv, N.; Giladi, N.; Hausdorff, J.M.; Troster, G. Wearable assistant for Parkinson’s disease patients with the freezing of gait syndrome. IEEE Trans. Inf. Technol. Biomed. 2010, 14, 436–446. [Google Scholar] [CrossRef]
  11. Tahafchi, P.; Molina, R.; Roper, J.A.; Sowalsky, K.; Hass, C.J.; Gunduz, A.; Okun, M.S.; Judy, J.W. Freezing-of-Gait detection using temporal, spatial, and physiological features with a support-vector-machine classifier. In Proceedings of the 2017 39th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Jeju Island, Republic of Korea, 11–15 July 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 2867–2870. [Google Scholar]
  12. Abujrida, H.; Agu, E.O.; Pahlavan, K. Machine learning-based motor assessment of Parkinson’s disease using postural sway, gait and lifestyle features on crowdsourced smartphone data. Biomed. Phys. Eng. Express 2020, 6, 035005. [Google Scholar] [CrossRef] [PubMed]
  13. Shi, B.; Tay, A.; Au, W.L.; Tan, D.M.L.; Chia, N.S.Y.; Yen, S.-C. Detection of freezing of gait using convolutional neural networks and data from lower limb motion sensors. IEEE Trans. Biomed. Eng. 2022, 69, 2256–2267. [Google Scholar] [CrossRef] [PubMed]
  14. Habib, Z.; Mughal, M.A.; Khan, M.A.; Shabaz, M. WiFOG: Integrating deep learning and hybrid feature selection for accurate freezing of gait detection. Alex. Eng. J. 2024, 86, 481–493. [Google Scholar] [CrossRef]
  15. Hou, Y.; Ji, J.; Zhu, Y.; Dell, T.; Liu, X. Flexible gel-free multi-modal wireless sensors with edge deep learning for detecting and alerting freezing of gait symptom. IEEE Trans. Biomed. Circuits Syst. 2023, 17, 1010–1021. [Google Scholar] [CrossRef] [PubMed]
  16. Mo, W.T.; Chan, J.H. Freezing of Gait Prediction Using Deep Learning. In Proceedings of the 13th International Conference on Advances in Information Technology, Bangkok, Thailand, 5–7 July 2023; pp. 1–6. [Google Scholar]
  17. Chaurasia, V.; Chaurasia, A. Detection of Parkinson’s Disease by Using Machine Learning Stacking and Ensemble Method. Biomed. Mater. Devices 2023, 1, 966–978. [Google Scholar] [CrossRef]
  18. Mazilu, S.; Calatroni, A.; Gazit, E.; Roggen, D.; Hausdorff, J.M.; Tröster, G. Feature learning for detection and prediction of freezing of gait in Parkinson’s disease. In Proceedings of the 9th International Conference on Machine Learning and Data Mining in Pattern Recognition (MLDM 2013), New York, NY, USA, 19–25 July 2013; Springer: Berlin/Heidelberg, Germany, 2013; pp. 144–158. [Google Scholar]
  19. Fahira, N.R.; Lawi, A.; Aqsha, M. Early detection model of Parkinson’s Disease using Random Forest Method on voice frequency data. J. Nat. Sci. Math. Res. 2023, 9, 29–37. [Google Scholar] [CrossRef]
  20. Webb, G.I.; Zheng, Z. Multistrategy ensemble learning: Reducing error by combining ensemble learning techniques. IEEE Trans. Knowl. Data Eng. 2004, 16, 980–991. [Google Scholar] [CrossRef]
  21. Howard, A.; Salomon, A.; Gazit, E.; Hausdorff, J.; Kirsch, L.; Maggie; Ginis, P.; Holbrook, R.; Karim, Y.F. Parkinson’s Freezing of Gait Prediction. Kaggle. 2023. Available online: https://kaggle.com/competitions/tlvmc-parkinsons-freezing-gait-prediction (accessed on 20 November 2024).
  22. Salomon, A.; Gazit, E.; Ginis, P.; Urazalinov, B.; Takoi, H.; Yamaguchi, T.; Goda, S.; Lander, D.; Lacombe, J.; Sinha, A.K.; et al. A machine learning contest enhances automated freezing of gait detection and reveals time-of-day effects. Nat. Commun. 2024, 15, 4853. [Google Scholar] [CrossRef] [PubMed]
Figure 1. BagStacking method overview: D—bootstrap sampling of the training set S; M—training the base models; P—applying the base models to the original training set; M′—training the meta-learner on the base-model predictions; $\hat{y}_{\text{bagstacking}}$—applying the base models to a new instance and feeding their outputs to the meta-learner for the final prediction.
Figure 2. Raw accelerometer data for the vertical (AccV), mediolateral (AccML), and anteroposterior (AccAP) axes over a 5-s window.
Figure 3. Examples of feature transformations: time-domain features (mean, standard deviation), frequency-domain features (PSD mean, PSD median), and wavelet-domain features (wavelet coefficient means at levels 0 and 1) for the first five windows.
Figure 4. AUC comparison of different methods across FOG event classes. BagStacking consistently outperforms other methods in the start hesitation, turning, and walking event classes.
Table 1. Dataset statistics.

                            Training Set     Test Set
Number of Subjects          920              250
Total Samples               20,588,374       5,000,000
Sampling Rate               100 Hz/128 Hz    100 Hz/128 Hz
Number of FOG Events        1029             270
FOG Event Duration (avg)    3.5 s            3.6 s
Non-FOG Samples             19,558,374       4,850,000
Table 2. MAP scores and runtime of the BagStacking method and the comparison methods on the validation set.

Method                    MAP Score    Runtime (s)
BagStacking               0.306        3828
Multistrategy Ensemble    0.249        7716
LightGBM                  0.234        1518
Regular Stacking          0.286        8350
Table 3. AUC results of the BagStacking method and the comparison methods across different FOG event classes on the validation set.

Method                    Start Hesitation AUC    Turning AUC    Walking AUC
BagStacking               0.88                    0.90           0.84
Multistrategy Ensemble    0.77                    0.88           0.73
Regular Stacking          0.83                    0.87           0.72
LightGBM                  0.80                    0.87           0.70