Stable classification of text genres

Published: 01 June 2011

Abstract

Every text has at least one topic and at least one genre. Evidence for a text's topic and genre comes, in part, from its lexical and syntactic features, which are used in both Automatic Topic Classification and Automatic Genre Classification (AGC). Because an ideal AGC system should be stable in the face of changes in topic distribution, we assess five previously published AGC methods with respect to both performance on the same topic-genre distribution on which they were trained and stability of that performance across changes in topic-genre distribution. Our experiments lead us to conclude that (1) stability in the face of changing topical distributions should be added to the evaluation criteria for new approaches to AGC, and (2) Part-of-Speech features should be considered individually when developing a high-performing, stable AGC system for a particular, possibly changing corpus.

Published In

Computational Linguistics  Volume 37, Issue 2
June 2011
154 pages
ISSN:0891-2017
EISSN:1530-9312

Publisher

MIT Press

Cambridge, MA, United States

Qualifiers

  • Article

Cited By

  • (2022) Register identification from the unrestricted open Web using the Corpus of Online Registers of English. Language Resources and Evaluation 57:3 (1045-1079). DOI: 10.1007/s10579-022-09624-1. Online publication date: 26-Oct-2022
  • (2022) Cover-based multiple book genre recognition using an improved multimodal network. International Journal on Document Analysis and Recognition 26:1 (65-88). DOI: 10.1007/s10032-022-00413-8. Online publication date: 20-Sep-2022
  • (2021) Text categorization based on a new classification by thresholds. Progress in Artificial Intelligence 10:4 (433-447). DOI: 10.1007/s13748-021-00247-1. Online publication date: 1-Dec-2021
  • (2021) Exploring the role of lexis and grammar for the stable identification of register in an unrestricted corpus of web documents. Language Resources and Evaluation 55:3 (757-788). DOI: 10.1007/s10579-020-09519-z. Online publication date: 1-Sep-2021
  • (2019) Citation Needed: A Taxonomy and Algorithmic Assessment of Wikipedia's Verifiability. The World Wide Web Conference (1567-1578). DOI: 10.1145/3308558.3313618. Online publication date: 13-May-2019
  • (2019) Open-Set Web Genre Identification Using Distributional Features and Nearest Neighbors Distance Ratio. Advances in Information Retrieval (3-11). DOI: 10.1007/978-3-030-15719-7_1. Online publication date: 14-Apr-2019
  • (2016) Finding News Citations for Wikipedia. Proceedings of the 25th ACM International on Conference on Information and Knowledge Management (337-346). DOI: 10.1145/2983323.2983808. Online publication date: 24-Oct-2016
  • (2016) Crowdsourcing for web genre annotation. Language Resources and Evaluation 50:3 (603-641). DOI: 10.1007/s10579-015-9331-6. Online publication date: 1-Sep-2016
  • (2015) Topic identification techniques applied to dynamic language model adaptation for automatic speech recognition. Expert Systems with Applications: An International Journal 42:1 (101-112). DOI: 10.1016/j.eswa.2014.07.035. Online publication date: 1-Jan-2015
  • (2013) Closing the loop. Proceedings of the 76th ASIS&T Annual Meeting: Beyond the Cloud: Rethinking Information Boundaries (1-10). DOI: 10.5555/2655780.2655796. Online publication date: 1-Nov-2013
