research-article

DataPrism: Exposing Disconnect between Data and Systems

Authors:

Sainyam Galhotra,

Raoni Lourenço,

Juliana Freire,

Alexandra Meliou,

Divesh Srivastava

SIGMOD '22: Proceedings of the 2022 International Conference on Management of Data

Pages 217 - 231

https://doi.org/10.1145/3514221.3517864

Published: 11 June 2022

Abstract

As data is a central component of many modern systems, the cause of a system malfunction may reside in the data and, specifically, in particular properties of the data. For example, a health-monitoring system designed under the assumption that weight is reported in pounds will malfunction when it encounters weight reported in kilograms. Like software debugging, which aims to find bugs in the source code or runtime conditions, our goal is to debug data: to identify potential sources of disconnect between the assumptions about some data and the systems that operate on that data. We propose DataPrism, a framework to identify data properties (profiles) that are the root causes of performance degradation or failure of a data-driven system. Such identification is necessary to repair data and resolve the disconnect between data and systems. Our technique is based on causal reasoning through interventions: when a system malfunctions for a dataset, DataPrism alters the data profiles and observes changes in the system's behavior due to the alteration. Unlike statistical observational analysis, which reports mere correlations, DataPrism reports causally verified root causes of the system malfunction, expressed in terms of data profiles. We empirically evaluate DataPrism on seven real-world and several synthetic data-driven systems that fail on certain datasets for a diverse set of reasons. In all cases, DataPrism identifies the root causes precisely while requiring orders of magnitude fewer interventions than prior techniques.
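To make the idea of intervention-based data debugging concrete, the following is a minimal sketch of the weight-unit example from the abstract. It is not DataPrism's algorithm: it naively tries one intervention per violated profile, whereas the paper reports needing far fewer interventions than such approaches. The toy system, the profile names, the threshold, and every function name (`malfunction_score`, `find_root_causes`, `PROFILES`) are assumptions made for illustration only.

```python
from statistics import mean

# Hypothetical system under test: it assumes weight is reported in pounds
# and mishandles any row whose weight is implausibly low for that unit.
def malfunction_score(rows):
    """Fraction of rows the toy system mishandles."""
    bad = sum(1 for r in rows if r["weight"] < 80)
    return bad / len(rows)

# Hypothetical profiles. Each maps a name to a pair:
#   (check that the dataset violates the profile, intervention that repairs it)
PROFILES = {
    "weight_in_lbs": (
        lambda rows: mean(r["weight"] for r in rows) < 100,  # looks like kilograms
        lambda rows: [{**r, "weight": r["weight"] * 2.20462} for r in rows],
    ),
    "name_is_uppercase": (
        lambda rows: any(r["name"] != r["name"].upper() for r in rows),
        lambda rows: [{**r, "name": r["name"].upper()} for r in rows],
    ),
}

def find_root_causes(rows, threshold=0.1):
    """Report profiles whose repair brings the malfunction below the threshold."""
    baseline = malfunction_score(rows)
    causes = []
    for name, (violated, repair) in PROFILES.items():
        if not violated(rows):
            continue  # profile already holds, so it cannot explain the failure
        after = malfunction_score(repair(rows))  # intervene, then re-run the system
        if baseline > threshold and after <= threshold:
            causes.append(name)
    return causes

# A failing dataset: weights are in kilograms and names are lowercase.
failing = [
    {"name": "ann", "age": 34, "weight": 61.0},
    {"name": "bob", "age": 51, "weight": 82.5},
    {"name": "cy", "age": 45, "weight": 70.2},
]
print(find_root_causes(failing))
```

Both profiles are violated by the failing dataset, so a purely observational analysis would associate both with the malfunction. Only the unit conversion removes the malfunction when applied, which is why the sketch reports `weight_in_lbs` alone.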

Published In

SIGMOD '22: Proceedings of the 2022 International Conference on Management of Data

June 2022, 2597 pages

ISBN: 9781450392495

DOI: 10.1145/3514221

General Chair: Zachary Ives, University of Pennsylvania (USA)

Program Chairs: Angela Bonifati, Lyon 1 University (France); Amr El Abbadi, University of California, Santa Barbara (USA)

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGMOD: ACM Special Interest Group on Management of Data

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

SIGMOD/PODS '22: International Conference on Management of Data

Sponsor: SIGMOD

June 12 - 17, 2022

Philadelphia, PA, USA

Acceptance Rates

Overall Acceptance Rate 785 of 4,003 submissions, 20%

Cited By

Behme L, Galhotra S, Beedkar K, Markl V. (2024). Fainder: A Fast and Accurate Index for Distribution-Aware Dataset Search. Proceedings of the VLDB Endowment 17:11 (3269-3282). DOI: 10.14778/3681954.3681999. Online publication date: 30-Aug-2024.
https://dl.acm.org/doi/10.14778/3681954.3681999

An S, Cao Y. (2024). Counterfactual Explanation at Will, with Zero Privacy Leakage. Proceedings of the ACM on Management of Data 2:3 (1-29). DOI: 10.1145/3654933. Online publication date: 30-May-2024.
https://dl.acm.org/doi/10.1145/3654933

Ahmed S, Gao H, Rajan H, Roychoudhury A, Paiva A, Abreu R, Storey M. (2024). Inferring Data Preconditions from Deep Learning Models for Trustworthy Prediction in Deployment. Proceedings of the IEEE/ACM 46th International Conference on Software Engineering (1-13). DOI: 10.1145/3597503.3623333. Online publication date: 20-May-2024.
https://dl.acm.org/doi/10.1145/3597503.3623333

Vu A, Tsigkanos C, Quiané-Ruiz J, Markl V, Kehrer T. (2023). On Irregularity Localization for Scientific Data Analysis Workflows. Computational Science – ICCS 2023 (336-351). DOI: 10.1007/978-3-031-35995-8_24. Online publication date: 3-Jul-2023.
https://dl.acm.org/doi/10.1007/978-3-031-35995-8_24

Vu A, Kehrer T, Tsigkanos C. (2022). Outcome-Preserving Input Reduction for Scientific Data Analysis Workflows. Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering (1-5). DOI: 10.1145/3551349.3559558. Online publication date: 10-Oct-2022.
https://dl.acm.org/doi/10.1145/3551349.3559558
