DOI: 10.1145/2948992.2949007
Short paper

Ontology Based Rewriting Data Cleaning Operations

Published: 20 July 2016

Abstract

Growing volumes of data bring with them redundant, inconsistent, and/or complementary repositories that may differ in their data models and/or schemas. Current data cleaning techniques developed to tackle data quality problems are suitable only for scenarios where all repositories share the same model and schema. Recently, an ontology-based methodology was proposed to overcome this limitation. This paper briefly describes that methodology and applies it to a real scenario in the health domain with data quality problems.


    Published In

    C3S2E '16: Proceedings of the Ninth International C* Conference on Computer Science & Software Engineering
    July 2016
    152 pages
    ISBN:9781450340755
    DOI:10.1145/2948992

    In-Cooperation

    • BytePress
    • ISEP

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Data Cleaning
    2. Ontology
    3. Rewriting Process
    4. Schema
    5. Vocabulary

    Qualifiers

    • Short-paper
    • Research
    • Refereed limited

    Conference

    C3S2E '16

    Acceptance Rates

    Overall Acceptance Rate 12 of 42 submissions, 29%
