In our globalized world, governments, societies, and economies rely heavily on data. Data drives decision-making, automation, and innovation across public and private sectors. This trend is intensifying with the ongoing advancements in artificial intelligence (AI), which frequently requires vast amounts of data for training purposes. However, the distributed nature of data—spread across various systems and governed by different actors with their own regulations—poses significant challenges. For instance, achieving a circular economy requires data to measure, optimize, and enhance material reuse. Yet, the relevant data for consumer product components is often fragmented across international supply chain partners. Similarly, in the public sector, data about a single social problem is often fragmented and dispersed across multiple agencies and multiple levels of government with no formal sharing mechanisms. Therefore, effective data sharing among different actors is critical to realize the potential of data in order to address complex problems.

Given this context, the goal of this special issue is to advance current theoretical and empirical knowledge on data sharing in a data economy. Inspired by the papers included in this special issue, this preface explores the tensions between globalization on the one hand and the distributed nature of data on the other. We first examine the new challenges for data sharing at the global ecosystem level. Next, we examine value creation in globalized data ecosystems. We then discuss novel artifacts needed to facilitate data sharing. We illustrate each section with an example of data sharing for circularity and sustainability in globalized supply chains. Finally, we summarize the contributions of the papers in this special issue, highlighting emerging patterns and insights.

Data governance at a global ecosystem level

Data sharing is increasingly required in today’s globalized world, where data are distributed among multiple systems and controlled by diverse actors within their own legal frameworks. Sharing can take various forms: businesses may share data openly with the public (Zuiderwijk & Janssen, 2014), collaborate with government agencies (Rukanova et al., 2023), or create joint data pools accessible to outsiders. Public organizations can also share data among themselves or exchange data with citizens. All of these data-related collaborations present important challenges but also promise benefits that go beyond a single organization or a single jurisdiction. More often, however, what is needed are data ecosystems in which businesses can exchange and trade data, for instance, with supply chain partners (Oliveira et al., 2019). Data ecosystems are sociotechnical networks that enable collaborative data sharing between independent actors (Möller et al., 2024; Oliveira et al., 2019). Given the multi-actor nature of data ecosystems, research should go beyond the micro-level of individual data-sharing decisions to examine the entire ecosystem of data-sharing parties. Focusing on the data ecosystem as a whole will provide additional insights and complement what we have learned from individual-level studies.

Beyond human decisions, today’s data economy involves non-human agency too. For example, the extensive use of data in AI applications has shown that control over data is distributed across systems. In addition, the quality and availability of data will significantly influence the results of the emergent AI systems, which require data sharing and effective communication among multiple actors. Furthermore, tracing how data is shared and recombined is difficult. Data itself may even generate new data artifacts autonomously (Aaltonen et al., 2023). Such distribution of agency to non-human actors requires new conceptualizations of the agency that already exists and will be needed in data ecosystems.

The trends of distributed data, globalization, and non-human agency create tensions that call for governance. Governance structures and rules for making decisions about data are not only necessary in the new economy but also essential for generating benefits for the majority of individuals. However, data ecosystems span various institutional settings with different practices, legal frameworks, and technical disparities (Aaen et al., 2021). This diversity creates tensions, such as balancing privacy with data sharing for services, and sharing data for the common good versus monetizing it for private gain. The multifaceted nature of data ecosystems leads to contradictions, making it essential for governance to be collectively negotiated by stakeholders with varying degrees of influence (Abraham et al., 2023; Paparova et al., 2023). Finding optimal ways to govern data sharing in these globalized ecosystems remains a significant challenge.

Illustrative example: Circularity and sustainability in a globalized supply chain

As a running illustrative example, we focus on data sharing for circularity and sustainability goals in supply chains. Typically, supply chains are globalized, comprising autonomous actors across the globe. Each actor holds data that is essential for monitoring systems-level outcomes, such as progress on societal goals of circularity and sustainability. Hence, globalization and the distributed nature of data imply that data sharing is essential.

Since supply chains are global and data is widely distributed, new approaches are needed to meet the needs of various stakeholders, including businesses, societal actors, governments, auditing firms, and financial institutions. Only systems-level solutions can address these challenges comprehensively, as neglecting one part of the supply chain can negate the efforts in all other parts (e.g., cars being dumped or clothes being burnt).

However, realizing these systems-level solutions for data sharing is highly challenging. It requires interactions between local and global contexts, each with its own regulatory conditions. Research is needed to explore how to create collective governance structures that facilitate the development of cross-sectoral data ecosystems. This includes gaining insights into how social, privacy-related, technical, and legal tensions can be conceptualized and resolved through strategies and governance models. Moreover, collective governance structures may be needed for developing data analytic solutions for government purposes. These might require pooling financial and human resources to advance progress across organizational boundaries within a country and, in some cases, across multiple nations.

Value creation in globalized data ecosystems

Creating value with data is a challenging process that includes discovering and shaping data artifacts (Günther et al., 2022). Issues of value creation should be considered at both the ecosystem and individual levels of analysis. On the individual level, local businesses with limited resources or digital capabilities increasingly find themselves part of a globalized data ecosystem. This raises critical questions about trust in larger ecosystem players, confidentiality, and competitive dynamics. Studies are needed to help small businesses position themselves against large data-sharing platforms, promoting inclusive practices that do not exacerbate existing information asymmetries. Similarly, in the public sector, individual agencies face the challenge of generating public value not only from their own data but also from data collected by other government agencies and other levels of government.

On the ecosystem level, economic value interacts with values such as transparency and sustainability. Data sharing can drive innovation, optimize processes, and create new opportunities. Shared data can also facilitate decision making and coordination across the supply chain, fostering collaborations that lead to new products and services (Stahl et al., 2023). Besides supporting regulatory compliance, transparency can also build trust among ecosystem participants and with external stakeholders, including consumers and regulators. At the ecosystem level, governments play a dual role: they are users and sharers of data, but they also set the rules, regulating how private organizations use data and which characteristics (e.g., quality and transparency) matter in which situations. In the private sector, risks are also present, for instance, as competitors may use shared data to reverse engineer critical manufacturing processes that create a competitive edge.

Illustrative example: Value creation with circularity and sustainability

For circularity and sustainability, data may be shared in the form of a digital product passport that maintains data on the material ingredients of a product. This shared data may add value for business strategy-making and operational planning, besides enabling accountability for their sustainability actions. Digital product passports and digital infrastructures that allow for transparency can also improve financial service delivery for green loans, helping to avoid greenwashing. Furthermore, governments may use data-sharing systems to promote goals such as circularity by steering and monitoring systemic transitions.
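To make the idea of a digital product passport more tangible, the following is a minimal sketch of how such a record might be represented and checked. It is an illustration under our own assumptions, not a rendering of any existing passport standard: the class names `Material` and `ProductPassport`, the `recycled` flag, and the example product are all hypothetical. The fingerprint only makes later changes to the record detectable; it does not by itself establish who issued the passport.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Material:
    """One material ingredient of a product (hypothetical schema)."""
    name: str
    mass_kg: float
    recycled: bool  # assumption: True if sourced from secondary (recycled) material


@dataclass(frozen=True)
class ProductPassport:
    """Illustrative passport holding a product's material composition."""
    product_id: str
    materials: tuple[Material, ...]

    def recycled_share(self) -> float:
        """Mass fraction of the product that comes from recycled material."""
        total = sum(m.mass_kg for m in self.materials)
        recycled = sum(m.mass_kg for m in self.materials if m.recycled)
        return recycled / total if total else 0.0

    def fingerprint(self) -> str:
        """SHA-256 digest of a canonical JSON form, so edits become detectable."""
        canonical = json.dumps(asdict(self), sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Usage with made-up values: 3 kg recycled steel and 1 kg virgin plastic.
passport = ProductPassport(
    product_id="chair-001",
    materials=(Material("steel", 3.0, True), Material("plastic", 1.0, False)),
)
print(passport.recycled_share())  # 0.75
print(passport.fingerprint())     # 64-character hex digest
```

A supply chain partner or auditor who holds the fingerprint can recompute it from the shared record and notice any later modification, while an indicator such as the recycled share can feed into reporting or green-loan assessments of the kind mentioned above.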

Sociotechnical artifacts for globalized data-sharing

Centralized digital solutions, such as classical data platforms or data warehouses, are unlikely to be effective in a globalized, cross-industry setting. Data sovereignty is increasingly emphasized as a prerequisite for data sharing (Abraham et al., 2023). Consequently, the paradigm is shifting towards solutions where data remains at the premises of data owners (Agahari et al., 2022). Data transformations and computations occur at those decentralized locations, for instance, by using so-called data stations. Even AI and machine learning (ML) are becoming decentralized through federated learning models, where only model parameters, rather than the underlying data, are shared (Verbraeken et al., 2020). Agreements can be enforced through smart contracts, that is, algorithms that are controlled and executed in a distributed way. Thus, the once envisioned centralized data spaces for organizing data sharing have given way to decentralized models (Otto & Jarke, 2019).

Despite these decentralizing trends, platform-like systems likely remain relevant (Alt, 2021). For instance, in the data spaces vision, metadata is shared between partners via a shared platform (Beverungen et al., 2022). In the federated learning paradigm, global model parameters still require centralized processing and updating (Verbraeken et al., 2020). Platforms will also be necessary for (micro-)compensations of data processing, especially when one party conducts local data computations that yield valuable results for another party. Consequently, new platform models may emerge, either within or across globalized industrial sectors, warranting further study.
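The interplay between local computation and central aggregation in federated learning can be sketched in a few lines. The example below is a simplified illustration of federated averaging for a one-variable linear model, written under our own assumptions: the names `Client`, `local_update`, `federated_average`, and `train`, the learning rate, and the toy data are hypothetical and not taken from the cited work. Raw data stays inside each `Client`; only parameters and sample counts reach the central step.

```python
from dataclasses import dataclass


@dataclass
class Client:
    """A data owner; `xs` and `ys` are never sent to the coordinator."""
    xs: list[float]
    ys: list[float]

    def local_update(self, w: float, b: float,
                     lr: float = 0.05, epochs: int = 20) -> tuple[float, float, int]:
        """Refine y = w*x + b on local data; return parameters and sample count."""
        n = len(self.xs)
        for _ in range(epochs):
            grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(self.xs, self.ys)) / n
            grad_b = sum(2 * (w * x + b - y) for x, y in zip(self.xs, self.ys)) / n
            w -= lr * grad_w
            b -= lr * grad_b
        return w, b, n


def federated_average(updates: list[tuple[float, float, int]]) -> tuple[float, float]:
    """Central step: average client parameters, weighted by sample count."""
    total = sum(n for _, _, n in updates)
    w = sum(w_i * n for w_i, _, n in updates) / total
    b = sum(b_i * n for _, b_i, n in updates) / total
    return w, b


def train(clients: list[Client], rounds: int = 50) -> tuple[float, float]:
    """Alternate between local updates and central aggregation."""
    w, b = 0.0, 0.0
    for _ in range(rounds):
        updates = [c.local_update(w, b) for c in clients]
        w, b = federated_average(updates)
    return w, b


# Toy data: both clients observe points on the line y = 2x + 1.
clients = [Client([0.0, 1.0, 2.0], [1.0, 3.0, 5.0]), Client([3.0, 4.0], [7.0, 9.0])]
w, b = train(clients)
print(round(w, 3), round(b, 3))
```

The coordinator in `train` is the platform-like element: it never sees the clients' observations, yet the model cannot be updated without it, which is the kind of hybrid arrangement between decentralized data and centralized processing discussed above.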

Research opportunities abound in understanding how new technologies can maintain the distributed organization of data while enabling collective and coordinated data sharing. This includes exploring how decentralized solutions can integrate with necessary centralized elements to optimize data sharing in globalized contexts, which could lead to new complex hybrid approaches to data sharing.

Illustrative example: Sociotechnical artifacts for circularity and sustainability monitoring

Achieving data sharing across industries requires data interoperability, including dataspace interoperability and global standards. Addressing these issues at the level of individual supply chains may not be sufficient, as multiple supply chains intersect within material loops. While some advocate for the potential of generative AI to resolve discrepancies between ontological domains, others call for increased standardization at a semantic level (Alt, 2022). Despite these complexities, regulatory requirements, such as the Corporate Sustainability Reporting Directive (CSRD) in the EU, demand fast action.

Design studies that develop prototypes and conduct pilots can be helpful in determining whether design principles can be applied at scale. These studies might also include failed projects that illustrate the paradoxes, tensions, and barriers to achieving systemic solutions. Additionally, research is needed on the efforts of global organizations to establish and align standards and architectures for business-to-business (B2B), government-to-government (G2G), and business-to-government (B2G) data sharing. It is crucial to understand how global standards and initiatives around Digital Product Passports (DPPs) align with regional and national developments.

Special issue articles

This special issue combines a set of papers that contribute to the literature on data sharing, providing a welcoming home to future work in this area. The contributions address three important facets of data sharing: data sovereignty, data privacy and traceability, and value extraction.

Data sovereignty

Abbas et al. (2024) contribute to the data economy discourse by developing a multifaceted framework on data sovereignty based on expert interviews. The framework illustrates how different facets of data sovereignty are impacted by contextual conditions such as data type, data sharing setting, and organizational size.

Data privacy and traceability

Heeß et al. (2024) examine the growing data privacy concerns in increasingly complex and interconnected global supply chains. They develop design principles for digital product passports to ensure traceability while safeguarding data privacy: holistic data capture, data privacy, decentralized data administration, forgery-proof data, automated passport processing, and interoperability.

Value extraction from data

Holstein et al. (2023) show that extracting value from datasets requires more than just processing vast amounts of data. Rather, value extraction involves an in-depth understanding of the dimensions of datasets, documentation, and associated metadata. The paper identifies data enrichment and benchmark dataset matching as key design principles. Here, data semanticization reflects the need for systems to enable understandability of datasets through capabilities to enrich available data resources with domain-specific metadata.

Overarching themes and directions

Within this collection of papers, several central themes emerge that reflect the current trends and challenges in data sharing. Unmistakably, distribution and decentralization are key trends, providing the context for each of the papers in this issue. Decentralization introduces new challenges, such as ensuring that data remains valuable across different use contexts (see Holstein et al., 2023). This is especially the case in a globalized context, where data is shared across organizational and legal boundaries.

The global data sharing context requires new conceptualizations of terms such as data governance, sovereignty, transparency, and contextualization. Among these, data sovereignty is a notably strong theme in our collection. Abbas et al. (2024) challenge the conventional concept of data sovereignty by exploring new facets that become increasingly relevant in the global data sharing landscape. Heeß et al. (2024) deepen the understanding of conflicts and tensions between competing values of sovereignty (implying control) and transparency (needed to make data valuable).

Another important concept in our collection is data governance. In a decentralized frame of data sharing, data governance goes beyond controlling data for sovereignty (see Abbas et al., 2024). Data governance also requires metadata for enriching and contextualizing data artifacts (Holstein et al., 2023).

Finally, the global data sharing context creates a need for sociotechnical artifacts. These artifacts include infrastructures or platform-like systems that enable decentralized data sharing, such as the digital product passports discussed by Heeß et al. (2024). Additionally, standards or metadata formats could facilitate data sharing by supporting the transfer of data between origins and use contexts (see Holstein et al., 2023).

Conclusion

Data sharing has the potential to drive global change and help address global issues that affect society. For governments and businesses to steer collaboratively and produce value for nations across the world, data should be shared for monitoring, optimizing, and driving the necessary societal transformation. Our illustrative example of data sharing for circularity and sustainability stands next to other important application areas, such as data sharing for a healthy society or data sharing to fight poverty and hunger. It is clear that to achieve data sharing in globalized data ecosystems, several advances are needed. Specifically, understanding the ecosystem-level, sociotechnical, and multi-value aspects of data sharing requires a transdisciplinary approach, in which governments, businesses, and academics collaborate and share knowledge and resources.

As next steps, we call for scholars to further explore the globalized data sharing context. Future research should uncover new ways to create value with data on a globalized scale while preventing new ways to create harm. To do so, a mix of empirical case work and conceptual studies is needed to study both the emerging and possible realities of globalized data sharing. Scholars should be prepared to redefine and reconceptualize notions that were long taken for granted, such as privacy, transparency, accountability, and traceability. Equally, they should carefully develop emerging notions such as data sovereignty, and empirically study the requirements and challenges of implementing data sovereignty as well as the data strategies needed to overcome those challenges. It is also important to acknowledge that data sharing happens in different contexts and in diverse relationships among specific actors, including networks of peers, hierarchies, and formal relationships with regulators in certain industries. To develop a holistic perspective of data sharing in a globalized world, a more profound and subtle understanding of each of these relationships is needed. Given the major societal challenges we face today, collective efforts and collaboration among businesses, governments, and wider society will be key. This requires research on how to collaborate, both at the organizational and institutional level, including the related governance, and at the level of data sharing itself. Such collaboration would affect business-to-business relationships, the international government-to-government context, and how governments monitor, control, and facilitate global supply chains.

Finally, we call for work on designing new sociotechnical artifacts. Now that regulations and basic technologies, such as data platforms, are emerging to address the long-standing objections to data sharing, scholars can turn their attention to other artifacts that facilitate data sharing, and empirically study the advantages and disadvantages of different artifacts as well as their implementation challenges.