DOI: 10.1145/3274856.3274879

Design and Implementation of a CSV Validation System

Published: 01 November 2018

Abstract

Error checking is essential in daily activities; as the saying goes, "to err is human". Errors become especially problematic when they are hidden: in critical systems, a hidden error can lead to tragedy or disaster. There are two possible remedies: preventing errors, or detecting and correcting them. For file systems, detection and correction is the more effective of the two. This work takes the latter approach and develops a scalable tabular technique for detecting errors in a file system. We apply the technique to CSV files by developing an application that detects and corrects errors in a CSV file. Scalability is one of the strengths that differentiates this work from related work: it provides an easy way to expand the set of errors that can be detected and corrected. The work contributes a technique that improves confidence in the well-known legacy CSV file format, which matters given the role of CSV files in transmitting and processing data, especially in data migration. The results indicate a promising, scalable approach to error detection and correction for file formats such as CSV.
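
The abstract does not give the details of the tabular technique, so the following is only a minimal sketch of what a table-driven detect-and-correct pass over a CSV file might look like, not the authors' implementation. The rule table RULES, the two example rules, and the validate function are all hypothetical names chosen for illustration. Each table row pairs a detector with a corrector, so widening the set of handled errors means adding a row rather than changing the checking loop.

```python
import csv
import io

# Hypothetical detectors and correctors. Each takes a parsed record and the
# expected column count (taken from the header in this sketch).
def has_padding(row, width):
    return any(cell != cell.strip() for cell in row)

def strip_padding(row, width):
    return [cell.strip() for cell in row]

def is_short(row, width):
    return len(row) < width

def pad_row(row, width):
    return row + [""] * (width - len(row))

# Hypothetical rule table: (error name, detector, corrector).
RULES = [
    ("stray whitespace", has_padding, strip_padding),
    ("missing fields", is_short, pad_row),
]

def validate(text, rules=RULES):
    """Return (corrected records, report of (record number, error name))."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return [], []
    width = len(rows[0])  # assumption: the first record is the header
    report, fixed = [], []
    for record_no, row in enumerate(rows, start=1):
        for name, detect, correct in rules:
            if detect(row, width):
                report.append((record_no, name))
                row = correct(row, width)
        fixed.append(row)
    return fixed, report

fixed, report = validate("id,name,city\n1, Ada ,London\n2,Bob\n")
for record_no, name in report:
    print(f"record {record_no}: {name}")
```

Under this design, supporting a new kind of error is a matter of appending one more (name, detector, corrector) tuple to the table, which is one plausible reading of the scalability the abstract describes.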

Published In

ACM Other conferences
ICAIT'2018: Proceedings of the 3rd International Conference on Applications in Information Technology
November 2018
171 pages
ISBN: 9781450365161
DOI: 10.1145/3274856
Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

In-Cooperation

  • University of Aizu

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. CSV
  2. Correction
  3. Data
  4. Detection
  5. Error
  6. Information

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ICAIT'2018

Acceptance Rates

ICAIT'2018 Paper Acceptance Rate: 33 of 56 submissions, 59%
Overall Acceptance Rate: 122 of 207 submissions, 59%

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 17
  • Downloads (Last 6 weeks): 1
Reflects downloads up to 12 Dec 2024

Cited By

  • (2023) RoughSet based Feature Selection for Prediction of Breast Cancer. Wireless Personal Communications 130:3 (2197-2214). DOI: 10.1007/s11277-023-10378-4. Online publication date: 30-Mar-2023
  • (2022) An LDA–SVM Machine Learning Model for Breast Cancer Classification. BioMedInformatics 2:3 (345-358). DOI: 10.3390/biomedinformatics2030022. Online publication date: 26-Jun-2022
  • (2022) Online Examination System with Measures for Prevention of Cheating along with Rapid Assessment and Automatic Grading. 2022 5th International Conference on Advances in Science and Technology (ICAST) (28-34). DOI: 10.1109/ICAST55766.2022.10039552. Online publication date: 2-Dec-2022
  • (2022) Formalization of Converting Processes and it Validation in Spatial Data Infrastructure. Smart Technologies in Urban Engineering (3-13). DOI: 10.1007/978-3-031-20141-7_1. Online publication date: 29-Nov-2022
  • (2021) The Role of Linear Discriminant Analysis for Accurate Prediction of Breast Cancer. 2021 IEEE 14th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC) (340-344). DOI: 10.1109/MCSoC51149.2021.00057. Online publication date: Dec-2021
