Abstract
Data generated from various sources can be erroneous or incomplete, which can directly affect business analysis. ETL (Extraction-Transformation-Loading) is a well-known process that extracts data from different sources, transforms it into the required format, and finally loads it into a target data warehouse (DW). ETL plays an important role in the data warehouse environment. Configuring an ETL process is one of the key factors with a direct impact on the cost, time, and effort of establishing a successful data warehouse. Conceptual modeling of ETL gives a high-level view of the system activities. It offers advantages such as early identification of system errors, cost minimization, and scope and risk assessment. Some research has been done on modeling the ETL process at the conceptual level by applying UML, BPMN, and the Semantic Web. In this paper, we propose a new approach to conceptual modeling of the ETL process using a new standard, the Systems Modeling Language (SysML). SysML extends UML features with much clearer semantics from a systems engineering point of view. We demonstrate the usefulness of our approach through a use case scenario.
Copyright information
© 2017 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Biswas, N., Chattopadhyay, S., Mahapatra, G., Chatterjee, S., Mondal, K.C. (2017). SysML Based Conceptual ETL Process Modeling. In: Mandal, J., Dutta, P., Mukhopadhyay, S. (eds) Computational Intelligence, Communications, and Business Analytics. CICBA 2017. Communications in Computer and Information Science, vol 776. Springer, Singapore. https://doi.org/10.1007/978-981-10-6430-2_19
DOI: https://doi.org/10.1007/978-981-10-6430-2_19
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-6429-6
Online ISBN: 978-981-10-6430-2
eBook Packages: Computer Science, Computer Science (R0)