Discovering Compatible Top-K Theme Patterns from Text Based on Users’ Preferences

  • Conference paper
Intelligence and Security Informatics (PAISI 2009)

Part of the book series: Lecture Notes in Computer Science (LNSC, volume 5477)

Abstract

Discovering a representative set of theme patterns from a large amount of text, in order to interpret its meaning, has long concerned researchers in both data mining and information retrieval. Recent studies of theme pattern mining have paid close attention to the problem of discovering a set of compatible top-k theme patterns with both high interestingness and low redundancy. Since different users have different preferences regarding interestingness and redundancy, how to measure the attributes of users’ preferences, and thereby discover “preferred compatible top-k theme patterns” (PCTTP), is a pressing problem in text mining. In this paper, a novel strategy for discovering PCTTP based on users’ preferences is proposed. Firstly, an evaluation function for the preferred compatibility between every two theme patterns is presented. The preferred compatibilities are then stored in a data structure called a theme compatibility graph, and a problem called MWSP, defined on the compatibility graph, is proposed to formulate the discovery of PCTTP. Secondly, since MWSP is proved to be NP-hard, a greedy algorithm, DPCTG, is designed to approximate the optimal solution of MWSP. Thirdly, a quality evaluation model is introduced to measure the compatibility of the discovered theme patterns. Empirical studies on four different text subsets from DBLP indicate that a high-quality set of PCTTP can be obtained.
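
The abstract does not specify DPCTG or the evaluation function, so the following is only an illustrative sketch of the general idea: greedy selection of k patterns over a weighted compatibility graph. It assumes, purely for illustration, that the goal is a set of k patterns whose pairwise compatibility weights have a large sum. The function names, the dictionary representation of the graph, and the example weights are all hypothetical and are not taken from the paper.

```python
from itertools import combinations


def pair_weight(weights, a, b):
    """Look up the (assumed symmetric) compatibility of patterns a and b."""
    return weights.get((a, b), weights.get((b, a), 0.0))


def greedy_top_k(weights, k):
    """Greedily pick k patterns with a large total pairwise compatibility.

    weights: dict mapping (pattern, pattern) -> compatibility score.
    This is a generic greedy heuristic, not the paper's DPCTG algorithm.
    """
    nodes = sorted({n for pair in weights for n in pair})
    if not 2 <= k <= len(nodes):
        raise ValueError("k must be between 2 and the number of patterns")
    # Seed the selection with the single most compatible pair.
    chosen = list(max(weights, key=weights.get))
    while len(chosen) < k:
        remaining = [n for n in nodes if n not in chosen]
        # Add the pattern whose summed compatibility with the chosen set is largest.
        chosen.append(
            max(remaining, key=lambda n: sum(pair_weight(weights, n, c) for c in chosen))
        )
    return chosen


def total_weight(weights, chosen):
    """Sum of pairwise compatibilities inside the chosen set."""
    return sum(pair_weight(weights, a, b) for a, b in combinations(chosen, 2))


if __name__ == "__main__":
    # Hypothetical compatibility scores between four theme patterns.
    example = {
        ("p1", "p2"): 0.9, ("p1", "p3"): 0.2, ("p2", "p3"): 0.8,
        ("p1", "p4"): 0.7, ("p2", "p4"): 0.1, ("p3", "p4"): 0.3,
    }
    picked = greedy_top_k(example, 3)
    print(picked, total_weight(example, picked))
```

A greedy heuristic of this kind runs in polynomial time but carries no optimality guarantee in general, which is the usual trade-off when the underlying selection problem is NP-hard.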

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Tong, Y., Ma, S., Yu, D., Zhang, Y., Zhao, L., Xu, K. (2009). Discovering Compatible Top-K Theme Patterns from Text Based on Users’ Preferences. In: Chen, H., Yang, C.C., Chau, M., Li, SH. (eds) Intelligence and Security Informatics. PAISI 2009. Lecture Notes in Computer Science, vol 5477. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01393-5_14

  • DOI: https://doi.org/10.1007/978-3-642-01393-5_14

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-01392-8

  • Online ISBN: 978-3-642-01393-5

  • eBook Packages: Computer Science, Computer Science (R0)
