research-article

Skill requirements in job advertisements: A comparison of skill-categorization methods based on wage regressions

Published: 01 March 2023

Highlights

We study methods to extract skill requirements from online job advertisements.
We propose the explained wage variation as a performance metric for this task.
We compare dictionary-based, word-counting methods to unsupervised topic modeling.
We run wage regressions using the skills identified by each method.
LDA outperforms word counting and the other topic-modeling methods.
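To make the dictionary-based, word-counting approach from the highlights concrete, here is a minimal sketch. The skill categories, the keywords and the `count_skills` helper are hypothetical illustrations, not the dictionaries or code used in the paper.

```python
import re

# Hypothetical skill dictionary (category -> keywords); NOT one of the
# three dictionaries evaluated in the paper.
SKILL_DICTIONARY = {
    "cognitive": {"analysis", "research", "statistics"},
    "social": {"communication", "teamwork", "negotiation"},
    "software": {"python", "sql", "excel"},
}


def count_skills(ad_text, dictionary=SKILL_DICTIONARY):
    """Count how many tokens of a job ad fall into each skill category."""
    # Lowercase the text and keep alphabetic tokens only.
    tokens = re.findall(r"[a-z]+", ad_text.lower())
    return {
        category: sum(token in keywords for token in tokens)
        for category, keywords in dictionary.items()
    }


# Made-up job advertisement text for illustration.
ad = ("Analyst with strong communication skills; Python and SQL required, "
      "research experience a plus.")
counts = count_skills(ad)
print(counts)
```

The resulting per-category counts (or indicators derived from them) are the kind of skill variables that could then enter a wage regression.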

Abstract

In this paper, we compare different methods to extract skill demand from the text of job descriptions. We propose the fraction of wage variation explained by the extracted skills as a novel performance metric for comparing methods. Using this metric, we compare the word-counting method with three different dictionaries against three unsupervised topic-modeling techniques: LDA, PLSA and BERTopic. We apply these methods to a U.K. job board dataset of 1,158,926 job advertisements from 35 industries collected in 2018. We find that each of the dictionary-based methods explains about 20% of the wage variation across jobs. The topic-modeling techniques perform better: PLSA explains 36.5% of the wage variation and BERTopic 32.6%. The best-performing method is LDA, which explains 48.3% of the wage variation. Its disadvantage, however, is that the extracted skills are difficult to interpret.
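The proposed metric can be illustrated with a small sketch. It assumes that "fraction of wage variation explained" is read as the R² of an ordinary least squares regression of (log) wages on skill variables. The abstract does not give the regression specification, so the function `ols_r_squared`, the toy data and the absence of control variables are all assumptions made for illustration only.

```python
def ols_r_squared(X, y):
    """R^2 of an OLS regression of y on the columns of X plus an intercept."""
    n = len(y)
    rows = [[1.0] + [float(v) for v in row] for row in X]  # prepend intercept
    k = len(rows[0])
    # Augmented normal equations [X'X | X'y].
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    beta = [A[i][k] / A[i][i] for i in range(k)]
    fitted = [sum(b * x for b, x in zip(beta, r)) for r in rows]
    y_mean = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Toy, made-up data: two skill indicators per ad and a log wage per ad.
skills = [[1, 0], [0, 1], [1, 1], [0, 0], [1, 0], [0, 1]]
log_wage = [10.4, 10.1, 10.6, 9.9, 10.2, 10.3]
r2 = ols_r_squared(skills, log_wage)
print(f"Share of wage variation explained: {r2:.3f}")
```

Under this reading, each extraction method yields its own set of skill variables, and methods are ranked by the R² their variables achieve on the same wage data.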

Cited By

  • (2024) Data science for job market analysis. Expert Systems with Applications: An International Journal 251:C. DOI: 10.1016/j.eswa.2024.124101. Online publication date: 24-Jul-2024
  • (2024) Predicting determinants influencing user satisfaction with mental health app. Expert Systems with Applications: An International Journal 249:PB. DOI: 10.1016/j.eswa.2024.123647. Online publication date: 1-Sep-2024

Information

Published In

Information Processing and Management: an International Journal, Volume 60, Issue 2
Mar 2023
1443 pages

Publisher

Pergamon Press, Inc.
United States

Author Tags

1. Text analytics
2. Topic modeling
3. Skill extraction
4. Job advertisements
5. Wage regressions

Qualifiers

• Research-article