Abstract
Workflow formalisations often focus on representing a process, with the primary objective of supporting its execution. However, there are scenarios where what needs to be represented is the effect of the process on the data artefacts involved, for example when reasoning over the corresponding data policies. This can be achieved by annotating the workflow with the semantic relations that occur between these data artefacts. However, producing such annotations manually is difficult and time-consuming. In this paper we introduce a recommendation-based method to support users in this task. Our approach is centred on an incremental association rule mining technique that compensates for the cold start problem caused by the lack of a training set of annotated workflows. We discuss the implementation of a tool relying on this approach, and how its application to an existing repository of workflows effectively enables the generation of such annotations.
Notes
- 1. My experiment: http://www.myexperiment.org/.
- 2. W3C PROV: https://www.w3.org/TR/prov-overview/.
- 3. OPMW: http://www.opmw.org/.
- 4. PWO: http://purl.org/spar/pwo.
- 5. Wings: http://www.wings-workflows.org/.
- 6. My experiments: http://www.myexperiment.org/.
- 7.
- 8.
- 9. Datanode: http://purl.org/datanode/ns/.
- 10. In this paper we use the terminology of the SCUFL2 specification. However, the basic structure is a common one. In the W3C PROV-O model this concept maps to the class Activity, in PWO to Step, and in OPMW to WorkflowExecutionProcess, to mention just a few examples.
- 11. “LipidMaps Query” workflow from My experiment: http://www.myexperiment.org/workflows/1052.html.
- 12. Dinowolf: http://github.com/enridaga/dinowolf.
- 13. SCUFL2 Specification: https://taverna.incubator.apache.org/documentation/scufl2/.
- 14. Apache Taverna: https://taverna.incubator.apache.org/.
- 15. Apache Lucene: https://lucene.apache.org/core/.
- 16. DBPedia Spotlight: http://spotlight.dbpedia.org/.
- 17. DBPedia: http://dbpedia.org/.
- 18. My Experiments: http://www.myexperiments.org.
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Daga, E., d’Aquin, M., Gangemi, A., Motta, E. (2016). An Incremental Learning Method to Support the Annotation of Workflows with Data-to-Data Relations. In: Blomqvist, E., Ciancarini, P., Poggi, F., Vitali, F. (eds) Knowledge Engineering and Knowledge Management. EKAW 2016. Lecture Notes in Computer Science, vol 10024. Springer, Cham. https://doi.org/10.1007/978-3-319-49004-5_9
DOI: https://doi.org/10.1007/978-3-319-49004-5_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-49003-8
Online ISBN: 978-3-319-49004-5
eBook Packages: Computer Science, Computer Science (R0)