DOI: 10.1145/2484838.2484865
research-article

Sharing confidential data for algorithm development by multiple imputation

Published: 29 July 2013

Abstract

Real-life data sets are crucial for algorithm and application development, which often requires insight into the specific properties of the data. Such data are frequently not released, however, because they are proprietary and confidential. We propose to address this problem with the statistical technique of multiple imputation, used here as a method for generating realistic synthetic data sets. We also show how the generated records can be combined into networked data using clustering techniques.
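To make the idea of synthetic data generation concrete, here is a minimal sketch in Python. It is not the authors' procedure, which the abstract does not detail. It assumes a toy confidential data set of two numeric variables, a normal model for x, and a linear regression model for y given x. Parameter uncertainty is approximated by refitting on a bootstrap resample for each synthetic copy, a simplification of drawing parameters from a posterior. The names fit_line, synthesize, and confidential are hypothetical.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b, residual_sd)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    resid = [y - (a + b * x) for x, y in zip(xs, ys)]
    sd = (sum(r * r for r in resid) / (len(xs) - 2)) ** 0.5
    return a, b, sd


def synthesize(records, m=5, seed=0):
    """Return m synthetic copies of `records`, a list of (x, y) pairs."""
    rng = random.Random(seed)
    n = len(records)
    copies = []
    for _ in range(m):
        # Refit on a bootstrap resample so the model parameters differ
        # between copies (an assumed stand-in for a posterior draw).
        boot = [rng.choice(records) for _ in range(n)]
        bx, by = zip(*boot)
        a, b, sd = fit_line(bx, by)
        mu_x, sd_x = statistics.fmean(bx), statistics.stdev(bx)
        synthetic = []
        for _ in range(n):
            x = rng.gauss(mu_x, sd_x)      # draw x from its fitted marginal
            y = rng.gauss(a + b * x, sd)   # draw y given x from the fitted model
            synthetic.append((x, y))
        copies.append(synthetic)
    return copies


# Toy stand-in for a confidential data set: y is roughly 2*x plus noise.
data_rng = random.Random(1)
confidential = [
    (x, 2.0 * x + data_rng.gauss(0, 1))
    for x in (data_rng.gauss(10, 3) for _ in range(200))
]
copies = synthesize(confidential, m=5)
```

Each synthetic copy contains only model-generated records, so no original record is released, while relationships such as the slope of y on x are approximately preserved. Real applications would model many variables in sequence and handle categorical data, which this sketch does not attempt.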


Cited By

  • (2014) Fraud Indicators Applied to Legal Entities: An Empirical Ranking Approach. Database and Expert Systems Applications, pages 106-115. DOI: 10.1007/978-3-319-10085-2_9. Online publication date: 2014

Published In

SSDBM '13: Proceedings of the 25th International Conference on Scientific and Statistical Database Management
July 2013
401 pages
ISBN: 9781450319218
DOI: 10.1145/2484838
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article

Conference

SSDBM '13

Acceptance Rates

Overall Acceptance Rate 56 of 146 submissions, 38%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (last 12 months): 1
  • Downloads (last 6 weeks): 0
Reflects downloads up to 11 Dec 2024
