research-article

Positioning Paradata: A Conceptual Frame for AI Processual Documentation in Archives and Recordkeeping Contexts

Published: 16 November 2023

Abstract

The emergence of sophisticated Artificial Intelligence (AI) and machine learning tools poses a challenge to archives and records professionals, who are accustomed to understanding and documenting the activities of human agents rather than the often-opaque processes of sophisticated AI functioning. Preliminary work has proposed the term paradata to describe the unique documentation needs that emerge for archivists using AI tools to process records in their collections. For the purposes of archivists working with AI, paradata is conceptualized here as information recorded and preserved about records’ processing with AI tools; it is a category of data that is defined both by its relationship with other datasets and by the documentary purpose it serves. This article surveys relevant literature across three contexts to scope the relevant scholarship that archivists may draw upon to develop appropriate AI documentation practices. From the statistical social sciences and the visual heritage fields, the article discusses existing definitions of paradata and its ambiguous, often contextually dependent relationship with existing metadata categories. Approaching the problem from a sociotechnical perspective, literature on Explainable Artificial Intelligence (XAI) insists pointedly that explainability be attuned to specific users’ stated needs—needs that archivists may better articulate using the framework of paradata. Most importantly, the article situates AI as a challenge to accountability, transparency, and impartiality in archives by introducing an unfamiliar non-human agency, one that pushes the limits of existing archival practice and demands the development of new concepts and vocabularies to shape future technological and methodological developments in archives.

      Published In

      Journal on Computing and Cultural Heritage   Volume 16, Issue 4
      December 2023
      473 pages
      ISSN:1556-4673
      EISSN:1556-4711
      DOI:10.1145/3615351

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 16 November 2023
      Online AM: 28 April 2023
      Accepted: 14 April 2023
      Revised: 14 March 2023
      Received: 21 December 2022
      Published in JOCCH Volume 16, Issue 4

      Author Tags

      1. Paradata
      2. archives
      3. Explainable Artificial Intelligence
      4. XAI
      5. processual documentation
      6. metadata
      7. records management
      8. records
      9. accountability

      Qualifiers

      • Research-article

      Funding Sources

      • International Research on Permanent Authentic Records in Electronic Systems (InterPARES) Trust AI
      • Social Sciences and Humanities Research Council of Canada (SSHRC)
