
Online Chinese restaurant process

Authors:

Chien-Liang Liu,

Tsung-Hsun Tsai,

Chia-Hoang Lee

KDD '14: Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining

Pages 591 - 600

https://doi.org/10.1145/2623330.2623636

Published: 24 August 2014

Abstract

Processing large volumes of streaming data in near real time is becoming increasingly important as the Internet, sensor networks, and network traffic grow. Online machine learning is a typical means of handling streaming data, since it allows the classification model to learn from one instance at a time. Although many online learning methods have been developed since the Perceptron algorithm, existing online methods assume that the number of classes is known before classification begins. This assumption is unrealistic for large-scale or streaming data sets. This work proposes an online Chinese restaurant process (CRP) algorithm, which is both online and nonparametric, to tackle this problem. It introduces a relaxing function as part of the prior and updates the parameters with a likelihood function defined in terms of the consistency between the true label and the predicted result. Two Gibbs sampling algorithms are presented to perform posterior inference. In the experiments, the online CRP is applied to three massive data sets and compared with several online learning and batch learning algorithms. One of the data sets is obtained from Wikipedia and comprises approximately two million documents. The experimental results show that the proposed online CRP performs well and efficiently on massive data sets. Finally, this work proposes two methods to update the hyperparameter $\alpha$ of the online CRP. The first is based on the posterior distribution of $\alpha$, and the second exploits a property of online learning, namely adapting to change, to adjust $\alpha$ dynamically.
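For readers unfamiliar with the prior that the abstract builds on, the following is a minimal sketch of the standard (batch, unsupervised) Chinese restaurant process, not the online CRP proposed in the paper. Under the standard CRP with concentration parameter $\alpha$, a new customer joins an existing table $k$ with probability $n_k / (n + \alpha)$ and opens a new table with probability $\alpha / (n + \alpha)$, where $n_k$ is the number of customers at table $k$ and $n$ is the total seated so far. The function names `crp_seating_probs` and `sample_crp` are illustrative assumptions; the paper's relaxing function, label-consistency likelihood, and Gibbs samplers are not described in the abstract and are not reproduced here.

```python
import random
from typing import List, Optional


def crp_seating_probs(table_counts: List[int], alpha: float) -> List[float]:
    """Standard CRP seating probabilities (illustrative helper, not from the paper).

    Returns one probability per existing table, followed by the
    probability of opening a new table as the last entry.
    Assumes alpha > 0.
    """
    n = sum(table_counts)
    denom = n + alpha
    return [count / denom for count in table_counts] + [alpha / denom]


def sample_crp(num_customers: int, alpha: float,
               seed: Optional[int] = None) -> List[int]:
    """Seat customers one at a time and return each customer's table index."""
    rng = random.Random(seed)
    table_counts: List[int] = []
    assignments: List[int] = []
    for _ in range(num_customers):
        probs = crp_seating_probs(table_counts, alpha)
        # The last index means "open a new table".
        table = rng.choices(range(len(probs)), weights=probs)[0]
        if table == len(table_counts):
            table_counts.append(1)
        else:
            table_counts[table] += 1
        assignments.append(table)
    return assignments


if __name__ == "__main__":
    # Two tables with 3 and 1 customers, alpha = 1: denominator is 5.
    print(crp_seating_probs([3, 1], alpha=1.0))
    print(sample_crp(10, alpha=1.0, seed=0))
```

Because the number of tables is not fixed in advance and grows with the data, a larger $\alpha$ makes new tables (new classes, in the classification setting the abstract describes) more likely, which is why the choice and updating of $\alpha$ matters.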

Supplementary Material

MP4 File (p591-sidebyside.mp4), 306.32 MB

