Abstract
We aim to construct a probabilistic classifier to predict a latent, time-dependent Boolean label given an observed vector of measurements. Our training data consists of sequences of observations paired with a label for precisely one observation in each sequence. As an initial approach, we learn a baseline supervised classifier by training on the labeled observations alone, ignoring the unlabeled observations in each sequence. We then leverage this first classifier and the sequential structure of our data to build a second training set as follows: (1) we apply the first classifier to each unlabeled observation, and then (2) we filter the resulting estimates to incorporate information from the labeled observations, creating a much larger training set. We describe a Bayesian filtering framework that can be used to perform step 2 and show that a second classifier trained on this larger, filtered dataset can outperform the initial classifier.
At Adobe, our motivating application entails predicting customer segment membership from readily available proprietary features. We administer surveys to collect label data for our subscribers and then generate feature data for these customers at regular intervals around the survey time. While we can train a supervised classifier using paired feature and label data from the survey time alone, the availability of nearby feature data and the relative expense of polling motivate this semi-supervised approach. We perform an ablation study comparing both a baseline classifier and a likelihood-based augmentation approach to our proposed method and show that our method delivers the greatest improvement in predictive performance for an in-house classifier.
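To make the two-step procedure concrete, here is a minimal sketch in Python. The logistic-regression base model, the two-state forward-backward smoother with stickiness parameter `alpha_stay`, the sequence container (features, labeled index, label), and the confidence threshold are all illustrative assumptions standing in for our in-house classifier and for the filtering framework developed in the paper.

```python
# Minimal sketch of the two-stage pipeline described in the abstract.
# The base model, container format, and parameter names are assumptions
# for illustration, not the paper's in-house classifier or Algorithm 1.
import numpy as np
from sklearn.linear_model import LogisticRegression

def forward_backward(p_obs, labeled_t, labeled_y, alpha_stay=0.9):
    """Smooth per-step classifier probabilities p_obs (shape [T]) over a
    two-state (Boolean) latent chain, clamping the known label labeled_y
    at index labeled_t. Returns P(y_t = 1 | all evidence) for each t."""
    T = len(p_obs)
    trans = np.array([[alpha_stay, 1 - alpha_stay],
                      [1 - alpha_stay, alpha_stay]])
    # Discriminative emission proxy: classifier output as state evidence.
    emit = np.stack([1 - p_obs, p_obs], axis=1)   # shape [T, 2]
    emit[labeled_t] = np.eye(2)[labeled_y]        # clamp the survey label
    fwd = np.zeros((T, 2))
    fwd[0] = emit[0] * 0.5                        # uniform initial prior
    for t in range(1, T):                         # forward (filtering) pass
        fwd[t] = emit[t] * (fwd[t - 1] @ trans)
        fwd[t] /= fwd[t].sum()
    bwd = np.ones((T, 2))
    for t in range(T - 2, -1, -1):                # backward (smoothing) pass
        bwd[t] = trans @ (emit[t + 1] * bwd[t + 1])
        bwd[t] /= bwd[t].sum()
    post = fwd * bwd
    return post[:, 1] / post.sum(axis=1)

def augment_and_retrain(sequences, threshold=0.9):
    """`sequences` is assumed to be a list of (X_seq [T, d], t_star, y_star)."""
    # Stage 1: baseline classifier on the labeled observations only.
    X_lab = np.vstack([X[t] for X, t, _ in sequences])
    y_lab = np.array([y for _, _, y in sequences])
    base = LogisticRegression().fit(X_lab, y_lab)
    # Stage 2: pseudo-label every observation, filter, keep confident steps.
    X_aug, y_aug = [], []
    for X, t_star, y_star in sequences:
        p = base.predict_proba(X)[:, 1]
        q = forward_backward(p, t_star, y_star)
        keep = (q > threshold) | (q < 1 - threshold)
        X_aug.append(X[keep])
        y_aug.append((q[keep] > 0.5).astype(int))
    return LogisticRegression().fit(np.vstack(X_aug), np.concatenate(y_aug))
```

In this sketch the clamped survey label anchors the smoother, so pseudo-labels near the survey time inherit its certainty, while distant or ambiguous steps fall below the confidence threshold and are discarded.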
Notes
1. Given a set \(\{\alpha _0^k\}_{k\in K}\) of candidate values for \(\alpha _0\) and a set \(\{\alpha _1^\ell \}_{\ell \in L}\) for \(\alpha _1\), we select parameters via an exhaustive grid search as follows. For each \((k,\ell )\in K\times L\), we apply Algorithm 1 with \(\alpha _0^k\) and \(\alpha _1^\ell \) to the training set, train a classifier on the resulting filtered dataset, and then evaluate this classifier’s predictive performance on the validation set (using AUC). Upon completion, we select the parameter values \(\alpha _0^k\) and \(\alpha _1^\ell \) that yield the most performant classifier.
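A minimal sketch of this grid search follows, assuming stand-in callables `filter_training_set` (for Algorithm 1) and `train_classifier` (for the in-house model); neither name comes from the paper.

```python
# Exhaustive grid search over (alpha_0, alpha_1), selecting by validation AUC.
# `filter_training_set` and `train_classifier` are hypothetical stand-ins.
from itertools import product
from sklearn.metrics import roc_auc_score

def select_alphas(alpha0_grid, alpha1_grid, train_seqs, X_val, y_val,
                  filter_training_set, train_classifier):
    best = (None, None, -float("inf"))
    for a0, a1 in product(alpha0_grid, alpha1_grid):
        X_f, y_f = filter_training_set(train_seqs, a0, a1)  # Algorithm 1
        clf = train_classifier(X_f, y_f)
        auc = roc_auc_score(y_val, clf.predict_proba(X_val)[:, 1])
        if auc > best[2]:
            best = (a0, a1, auc)
    return best  # (alpha_0, alpha_1, validation AUC)
```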
Acknowledgements
The author would like to thank his manager Binjie Lai, her manager Xiang Wu, and his coworkers at Adobe, especially Eunyee Koh, for performing an internal review. The author is also grateful to the anonymous reviewers for their thoughtful feedback and to his former advisor Matthew T. Harrison for inspiring this discriminative filtering approach.
Copyright information
© 2021 Springer Nature Switzerland AG
Cite this paper
Burkhart, M.C. (2021). Discriminative Bayesian Filtering for the Semi-supervised Augmentation of Sequential Observation Data. In: Paszynski, M., Kranzlmüller, D., Krzhizhanovskaya, V.V., Dongarra, J.J., Sloot, P.M.A. (eds.) Computational Science – ICCS 2021. Lecture Notes in Computer Science, vol. 12743. Springer, Cham. https://doi.org/10.1007/978-3-030-77964-1_22
DOI: https://doi.org/10.1007/978-3-030-77964-1_22
Print ISBN: 978-3-030-77963-4
Online ISBN: 978-3-030-77964-1