
Corrective feedback and persistent learning for information extraction

Published: 01 October 2006

Abstract

To successfully embed statistical machine learning models in real-world applications, two post-deployment capabilities must be provided: (1) the ability to solicit user corrections and (2) the ability to update the model from these corrections. We refer to the former capability as corrective feedback and the latter as persistent learning. While these capabilities have a natural implementation for simple classification tasks such as spam filtering, we argue that a more careful design is required for structured classification tasks. One example of a structured classification task is information extraction, in which raw text is analyzed to automatically populate a database. In this work, we augment a probabilistic information extraction system with corrective feedback and persistent learning components to assist the user in building, correcting, and updating the extraction model. We describe methods of guiding the user to incorrect predictions, suggesting the most informative fields to correct, and incorporating corrections into the inference algorithm. We also present an active learning framework that minimizes not only how many examples a user must label, but also how difficult each example is to label. We empirically validate each of the technical components in simulation and quantify the user effort saved. We conclude that more efficient corrective feedback mechanisms lead to more effective persistent learning.
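The abstract's idea of "incorporating corrections into the inference algorithm" can be sketched with a toy constrained Viterbi decoder: a user-corrected label is clamped during decoding, and re-inference lets the correction propagate to neighboring predictions. The states, transition scores, emission scores, and tokens below are invented for illustration only; the paper's actual system uses a trained linear-chain conditional random field.

```python
# Minimal sketch of constrained Viterbi decoding. All scores are hypothetical.

NEG_INF = float("-inf")
STATES = ["NAME", "OTHER"]

# Toy log transition scores: staying in the same state is cheap.
TRANS = {
    ("NAME", "NAME"): 0.0,
    ("NAME", "OTHER"): -1.0,
    ("OTHER", "NAME"): -1.0,
    ("OTHER", "OTHER"): 0.0,
}

def emit(token, state):
    """Toy emission score: capitalized tokens weakly prefer NAME."""
    if token[0].isupper():
        return -0.4 if state == "NAME" else -1.1
    return -2.0 if state == "NAME" else -0.2

def viterbi(tokens, constraints=None):
    """Decode the best label sequence; `constraints` maps a token position
    to a user-corrected label, which is clamped during decoding."""
    constraints = constraints or {}
    delta = [{} for _ in tokens]  # delta[i][s]: best log-score ending in s at i
    back = [{} for _ in tokens]   # back-pointers for path recovery
    for s in STATES:
        allowed = constraints.get(0, s) == s
        delta[0][s] = emit(tokens[0], s) if allowed else NEG_INF
    for i in range(1, len(tokens)):
        for s in STATES:
            best_prev = max(STATES, key=lambda p: delta[i - 1][p] + TRANS[(p, s)])
            score = delta[i - 1][best_prev] + TRANS[(best_prev, s)] + emit(tokens[i], s)
            # Clamp: a corrected position admits only the corrected label.
            delta[i][s] = score if constraints.get(i, s) == s else NEG_INF
            back[i][s] = best_prev
    last = max(STATES, key=lambda s: delta[-1][s])
    path = [last]
    for i in range(len(tokens) - 1, 0, -1):
        path.append(back[i][path[-1]])
    return list(reversed(path))

# Unconstrained decoding labels both capitalized tokens NAME; clamping the
# first token to OTHER flips its neighbor as well.
print(viterbi(["New", "York"]))                # ['NAME', 'NAME']
print(viterbi(["New", "York"], {0: "OTHER"}))  # ['OTHER', 'OTHER']
```

Because the correction is folded into inference rather than applied as a post-hoc overwrite, a single corrected field can repair several predictions at once, which is one way a corrective feedback mechanism reduces total user effort.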




Published In

Artificial Intelligence  Volume 170, Issue 14-15
October, 2006
43 pages

Publisher

Elsevier Science Publishers Ltd.

United Kingdom


Author Tags

  1. Active learning
  2. Graphical models
  3. Information extraction



Cited By

  • (2020) Using long short‐term memory neural networks to analyze SEC 13D filings. International Journal of Intelligent Systems in Accounting and Finance Management 26(4), 153–163. doi:10.1002/isaf.1464
  • (2019) AnchorViz. ACM Transactions on Interactive Intelligent Systems 10(1), 1–38. doi:10.1145/3241379
  • (2017) A Probabilistically Integrated System for Crowd-Assisted Text Labeling and Extraction. Journal of Data and Information Quality 8(2), 1–23. doi:10.1145/3012003
  • (2014) Effective balancing error and user effort in interactive handwriting recognition. Pattern Recognition Letters 37, 135–142. doi:10.1016/j.patrec.2013.03.010
  • (2014) Combining human analysis and machine data mining to obtain credible data relations. Information Sciences 288, 254–278. doi:10.1016/j.ins.2014.08.014
  • (2014) Eliciting good teaching from humans for machine learners. Artificial Intelligence 217, 198–215. doi:10.1016/j.artint.2014.08.005
  • (2012) End-user interactions with intelligent and autonomous systems. CHI '12 Extended Abstracts on Human Factors in Computing Systems, 2755–2758. doi:10.1145/2212776.2212713
  • (2012) Continuous user feedback learning for data capture from business documents. Proceedings of the 7th International Conference on Hybrid Artificial Intelligent Systems, Part II, 538–549. doi:10.1007/978-3-642-28931-6_51
  • (2011) Video annotation and tracking with active learning. Proceedings of the 25th International Conference on Neural Information Processing Systems, 28–36. doi:10.5555/2986459.2986463
  • (2011) Exploring the corporate ecosystem with a semi-supervised entity graph. Proceedings of the 20th ACM International Conference on Information and Knowledge Management, 1857–1866. doi:10.1145/2063576.2063844
