arXiv:2410.01356 (cs)

[Submitted on 2 Oct 2024]

Title: Assisted Data Annotation for Business Process Information Extraction from Textual Documents

Authors: Julian Neuberger, Han van der Aa, Lars Ackermann, Daniel Buschek, Jannic Herrmann, Stefan Jablonski

Abstract: Machine-learning-based generation of process models from natural-language process descriptions offers a solution for the time-intensive and expensive process discovery phase. Many organizations have to carry out this phase before they can utilize business process management and its benefits. Yet research in this direction is severely restrained by an apparent lack of large, high-quality datasets. This lack of data can be attributed to, among other things, an absence of proper tool assistance for dataset creation, resulting in high workloads and inferior data quality. We explore two assistance features to support dataset creation: a recommendation system for identifying process information in the text, and a visualization of the current state of already identified process information as a graphical business process model. A controlled user study with 31 participants shows that assisting dataset creators with recommendations lowers all aspects of workload, up to $-51.0\%$, and significantly improves annotation quality, up to $+38.9\%$. We make all data and code available to encourage further research on additional novel assistance strategies.

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2410.01356 [cs.CL]
	(or arXiv:2410.01356v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2410.01356

Submission history

From: Julian Neuberger
[v1] Wed, 2 Oct 2024 09:14:39 UTC (650 KB)
