DOI: 10.1145/3575882.3575940
research-article

A scientific expertise classification model based on experts’ self-claims using the semantic and the TF-IDF approach

Published: 27 February 2023

Abstract

Understanding the structure of a scientific domain and extracting specific information from it is difficult and requires a great deal of human effort. In previous studies, most data sets used to identify the scientific expertise of academics were obtained from the metadata and contents of the papers they wrote. Machine learning tools should therefore be used to represent accurately how knowledge has been arranged and presented up to this point. In this research, we compare semantic analysis approaches (Latent Dirichlet Allocation, LDA, and knowledge graphs, KG) with non-explainable variables (TF-IDF) for identifying categories of scientific expertise. The dataset consists of scientific expertise self-claims written organically by academics, a data source that has not been widely studied. The TF-IDF approach yields better classification accuracy because it considers only the importance (relevance) of each word. However, it does not give meaning to the independent variables. Its result is also supported by the dataset, whose entries have a single part of speech. The semantic analysis approach, by contrast, provides meaning and relations that form topics or cluster graphs, although with lower accuracy.
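To illustrate what "level of word importance" means for TF-IDF, here is a minimal sketch using only the Python standard library. It is not the authors' pipeline: the example self-claims, the `tf_idf` helper, and the plain `tf * log(N / df)` weighting are assumptions chosen for illustration, and the paper's actual preprocessing and classifier are not described in the abstract.

```python
import math
from collections import Counter

# Hypothetical self-claim strings; not taken from the paper's dataset.
claims = [
    "machine learning text mining",
    "marine biology coral ecology",
    "text mining information retrieval",
]

def tf_idf(docs):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

weights = tf_idf(claims)
for claim, w in zip(claims, weights):
    # Show the highest-weighted term of each claim.
    top = max(w, key=w.get)
    print(f"{claim!r}: top term = {top!r}")
```

Terms that occur in only one claim (such as "machine" or "coral") receive a higher weight than terms shared across claims (such as "text" or "mining"). The weights separate documents well but carry no meaning of their own, which is the trade-off the abstract describes against the semantic approaches.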



Published In

IC3INA '22: Proceedings of the 2022 International Conference on Computer, Control, Informatics and Its Applications
November 2022
415 pages
ISBN: 9781450397902
DOI: 10.1145/3575882
© 2022 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

Publisher

Association for Computing Machinery

New York, NY, United States



Author Tags

  1. Knowledge Graphs
  2. Latent Dirichlet Allocation
  3. Scientific Expertise Classification
  4. Semantic Analysis
  5. TF-IDF

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

IC3INA 2022

