DOI: 10.1145/1414004.1414071
research-article

Issues and effort in integrating data from heterogeneous software repositories and corporate databases

Published: 09 October 2008

Abstract

Software repositories and corporate databases capture different fragments of a project's history. Software cockpits integrate the data from these repositories and databases to provide a holistic view of the project and the capability to drill down into and analyze details. By incorporating existing data, a cockpit can be used effectively from the first day it is introduced. In this paper, we describe our findings from integrating several repositories and databases for a large, distributed project. We highlight common issues in data integration, report on the resulting effort for the development of software cockpits, and share the lessons learned from this data integration project.

Published In

ESEM '08: Proceedings of the Second ACM-IEEE International Symposium on Empirical Software Engineering and Measurement
October 2008
374 pages
ISBN: 9781595939715
DOI: 10.1145/1414004
  • General Chair: Dieter Rombach
  • Program Chairs: Sebastian Elbaum, Jürgen Münch
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. data integration
  2. software cockpit
  3. software repository

Conference

ESEM '08
Acceptance Rates

Overall Acceptance Rate 130 of 594 submissions, 22%

Cited By

  • (2020) Applying AI in Practice: Key Challenges and Lessons Learned. Machine Learning and Knowledge Extraction, pp. 451-471. DOI: 10.1007/978-3-030-57321-8_25. Online publication date: 18-Aug-2020
  • (2018) Building Defect Prediction Models in Practice. Computer Systems and Software Engineering, pp. 324-350. DOI: 10.4018/978-1-5225-3923-0.ch014. Online publication date: 2018
  • (2014) Building Defect Prediction Models in Practice. Handbook of Research on Emerging Advancements and Technologies in Software Engineering, pp. 540-565. DOI: 10.4018/978-1-4666-6026-7.ch024. Online publication date: 2014
  • (2014) 1.2.1 The Architecture and Design of a Corporate Engineering Data Repository. INCOSE International Symposium, 22:1, pp. 31-48. DOI: 10.1002/j.2334-5837.2012.tb01320.x. Online publication date: 4-Nov-2014
  • (2013) Noise in Bug Report Data and the Impact on Defect Prediction Results. Proceedings of the 2013 Joint Conference of the 23nd International Workshop on Software Measurement (IWSM) and the 8th International Conference on Software Process and Product Measurement, pp. 173-180. DOI: 10.1109/IWSM-Mensura.2013.33. Online publication date: 23-Oct-2013
  • (2011) A Framework for Defect Prediction in Specific Software Project Contexts. Software Engineering Techniques, pp. 261-274. DOI: 10.1007/978-3-642-22386-0_20. Online publication date: 2011
  • (2010) Concept, Implementation and Evaluation of a Web-Based Software Cockpit. Proceedings of the 2010 36th EUROMICRO Conference on Software Engineering and Advanced Applications, pp. 385-392. DOI: 10.1109/SEAA.2010.15. Online publication date: 1-Sep-2010
  • (2010) Software Engineering – Processes and Tools. Hagenberg Research, pp. 157-235. DOI: 10.1007/978-3-642-02127-5_5. Online publication date: 2010
  • (2009) Experiences and Results from Establishing a Software Cockpit at BMD Systemhaus. Proceedings of the 2009 35th Euromicro Conference on Software Engineering and Advanced Applications, pp. 188-194. DOI: 10.1109/SEAA.2009.77. Online publication date: 27-Aug-2009
  • (2009) What Software Repositories Should Be Mined for Defect Predictors? Proceedings of the 2009 35th Euromicro Conference on Software Engineering and Advanced Applications, pp. 181-187. DOI: 10.1109/SEAA.2009.65. Online publication date: 27-Aug-2009
