The Future of Health Is Preventive — If We Get Data Governance Right

Unlocking the Potential of the European Health Data Space

9 min read · Apr 3, 2025

By Stefaan Verhulst (Co-Founder of The GovLab and The Data Tank)

Introduction: The Shift from Treatment to Prevention

After a long gestation period of three years, the European Health Data Space (EHDS) is now coming into effect across the European Union, potentially ushering in a new era of health data access, interoperability, and innovation. As this ambitious initiative enters the implementation phase, it brings with it the opportunity to fundamentally reshape how health systems across Europe operate. More generally, the EHDS contains important lessons (and some cautions) for the rest of the world, suggesting how a fragmented, reactive model of healthcare may transition to one that is more integrated, proactive, and prevention-oriented.

For too long, health systems–in the EU and around the world–have been built around treating diseases rather than preventing them. Now, we have an opportunity to change that paradigm. Data, and especially the advent of AI, give us the tools to predict and intervene before illness takes hold. Data offers the potential for a system that prioritizes prevention–one where individuals receive personalized guidance to stay healthy, policymakers access real-time evidence to address risks before they escalate, and epidemics are predicted weeks in advance, enabling proactive, rapid, and highly effective responses.

But to make AI-powered preventive health care a reality, and to make the EHDS a success, we need a new data governance approach, one that would include two key components:

  • The ability to reuse data collected for other purposes (e.g., mobility, retail sales, workplace trends) to improve health outcomes.
  • The ability to integrate different data sources, including clinical records and electronic health records (EHRs) but also environmental, social, and economic data, to build a complete picture of health risks.

In what follows, we outline some critical aspects of this new governance framework, including responsible data access and reuse (so-called secondary use), moving beyond traditional consent models to a social license for reuse, data stewardship, and the need to prioritize high-impact applications. We conclude with some specific recommendations for the EHDS, built from the preceding general discussion about the role of AI and data in preventive health.

Examples of AI-Powered Preventive Health

Table 1 and the Appendix include several examples of how AI can be used to predict and preemptively respond to a range of diseases and health conditions. They highlight the breadth of data sources, from mobility patterns to consumer behavior and climate data, and the diverse health risks that can be addressed through AI-driven analysis. These use cases illustrate how non-traditional data, when responsibly accessed and combined with clinical or public health information, can generate predictive insights and lead to meaningful policy interventions. In addition, data and AI can be combined to target interventions more precisely, allocate resources more efficiently, and tailor support to individual and community needs. These benefits are evident, for instance, in the case of epidemic modeling or climate-related health risks, where the ability to act early and at scale can significantly improve public health outcomes.

The European Health Data Space (EHDS) is well-positioned to enable many of these innovations by creating a shared legal and technical infrastructure for data sharing across the EU. Notably, while its initial focus is on primary use of EHRs, the EHDS creates a framework that could eventually incorporate other forms of data essential for prevention — including environmental, occupational, and behavioral data. We take up the vital importance of this cross-sectoral sharing of data in the next section.

Table 1. AI and Data for Preventive Health: Use Cases

Toward a Governance Framework for Responsible Reuse

Until recently, the factors inhibiting the kinds of benefits shown in Table 1 were primarily technical. This is now less the case. Today, the limits increasingly relate to policy, governance, and, more broadly, the social context within which data is collected, stored, used, and reused. That’s why we need a new enabling governance framework.

Based on the above discussion, two broad observations stand out related to governance in this space. First, the ability to reuse non-traditional data — such as consumer behavior, environmental conditions, workplace dynamics, and mobility patterns — is critical for health prediction and prevention. Second, preventive health applications often depend on combining multiple sources of data, including information that is both health and non-health related. This integration and responsible reuse allow AI systems to detect patterns, anticipate risks, and inform timely interventions.

For the EHDS to unlock the full potential of data-powered AI for health, it needs a governance framework that encompasses these two observations and, more generally, supports responsible and scalable data reuse while safeguarding public trust in similar ways across all European member states. Such a framework must balance innovation with strong commitments to privacy, transparency, and accountability. The absence of such a common framework for data access and reuse has been recognized by the European Partnership for Personalized Medicine as hampering precision health in the European Union.

Three components of this framework are particularly important:

1. Moving Beyond Traditional Consent

Today’s health data governance models rely heavily on long-standing practices of individual consent (or opt-out, as specified by the EHDS Regulation). Such practices emphasize individual autonomy and risk minimization. These principles are important to ensure responsible data handling, but they are not always conducive to responsible data reuse. As I have argued elsewhere, to facilitate data reuse, for preventive health and other applications, we need to update and complement our traditional models of consent. Three practices are particularly important:

  • Establish a social license: A social license refers to the broader (formal and informal) public acceptance of data reuse based on trust, transparency, and shared communal values. It requires ongoing public engagement to ensure that data practices align with societal expectations and deliver clear public benefit.
  • Implement privacy-enhancing technologies (PETs): PETs such as federated learning, synthetic data, and differential privacy allow data to be analyzed without exposing individual-level information. These tools go beyond traditional anonymization techniques and help unlock insights while minimizing privacy risks. Adapting them to evolving needs, despite their resource demands, may require ongoing capacity building across stakeholders (including nurturing data stewards, see below).
  • Create independent oversight bodies: Independent oversight bodies — including ethical review boards or data stewardship councils — can evaluate requests for data reuse and ensure they serve a legitimate public interest. They offer a layer of accountability that builds trust and prevents misuse.
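To make the idea of a PET more concrete, the sketch below illustrates one of the techniques named above, differential privacy, using the textbook Laplace mechanism for a counting query. It is a minimal illustration under stated assumptions and not a description of how the EHDS or any health data hub implements privacy protection: the function names (`laplace_noise`, `private_count`), the records, and the choice of `epsilon` are all hypothetical.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a Laplace distribution centred on 0."""
    # The difference of two independent Exponential(1) draws follows
    # a Laplace(0, 1) distribution; multiplying by `scale` widens it.
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def private_count(records, predicate, epsilon, rng=None):
    """Return a noisy count of the records that satisfy `predicate`."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    true_count = sum(1 for record in records if predicate(record))
    # Adding or removing one person changes a count by at most 1
    # (sensitivity 1), so the Laplace scale is 1 / epsilon.
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical, made-up records: (age, has_asthma)
records = [(34, True), (51, False), (29, True), (62, True), (45, False)]
noisy = private_count(records, lambda r: r[1], epsilon=0.5,
                      rng=random.Random(7))
print(f"Noisy asthma count: {noisy:.2f}")
```

A smaller `epsilon` adds more noise and gives stronger protection at the cost of accuracy. Choosing that trade-off is a governance decision as much as a technical one, which is one reason capacity building matters.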

2. Establishing Data Stewards

Data stewards play an important role in ensuring that data access is systematic, sustainable, and responsible. They provide a layered accountability framework and manage data quality, define and help enforce access protocols, and provide a bridge between data providers and users. The role of data stewards has been growing and becoming increasingly institutionalized in recent years. But in order to unlock the full benefits of this role for preventive health, we need more definition and structure.

In the context of EHDS, we need to formalize the role of data stewards in two types of organizations:

  • National health data hubs: Data stewards can help national platforms (e.g., France’s Health Data Hub) facilitate secure and ethical access to sensitive health data. Their oversight ensures that data is used and reused in the public interest while respecting privacy and legal safeguards.
  • Public-private partnerships: Data stewards also enable responsible collaboration between governments, researchers, and companies by establishing clear rules for data sharing and reuse. They help ensure that innovation aligns with public accountability.

3. Prioritizing High-Impact Use Cases

Finally, in order to enable the reuse of data for preventive health, we need to prioritize. Not all data-sharing initiatives should be pursued equally; just because we can reuse data does not mean we should. Instead, we must identify and prioritize use cases that have the highest societal impact, in the process helping to maximize the efficiency of scarce public resources.

We have pioneered an approach called the 100 Questions methodology to help prioritize high-impact data use and reuse. That approach, which is applicable across domains, is described more fully here. In the specific case of preventive health, it would entail asking the following critical questions in order to identify a specific reuse opportunity as worthy of pursuit:

  • Preventive health benefit: Does the proposed reuse have the potential to prevent diseases or public health crises?
  • Data feasibility and representativeness: Are the necessary data sources representative, available, accessible, and capable of being meaningfully integrated?
  • Responsible reuse: Can the reuse be carried out in a way that safeguards individual rights, prevents bias, and avoids unintended harms?

Only if these three questions can be answered in the affirmative should a data reuse initiative move forward. Without such a filter, we risk pursuing projects that may sound promising but ultimately represent a misdirection or misuse of limited public resources. Prioritization is essential — not all use cases are equally valuable, and focusing on those with the greatest impact ensures both effectiveness and legitimacy.
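The all-three-must-pass rule can be written down as a simple screening step. The sketch below is only an illustration of that logic, not part of the 100 Questions methodology itself: the `ReuseProposal` type, its field names, and the example proposal are hypothetical, and in practice each answer would come from deliberation rather than a boolean flag.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReuseProposal:
    """A hypothetical record of how a proposal fared on each question."""
    name: str
    preventive_benefit: bool
    data_feasible_and_representative: bool
    responsible_reuse: bool


def unmet_criteria(proposal):
    """List the questions that were not answered in the affirmative."""
    checks = {
        "Preventive health benefit": proposal.preventive_benefit,
        "Data feasibility and representativeness":
            proposal.data_feasible_and_representative,
        "Responsible reuse": proposal.responsible_reuse,
    }
    return [label for label, passed in checks.items() if not passed]


def should_proceed(proposal):
    """A proposal moves forward only if every question is a yes."""
    return not unmet_criteria(proposal)


# Made-up example: promising idea, but the data are not representative.
proposal = ReuseProposal(
    name="Heat-stress alerts from mobility data",
    preventive_benefit=True,
    data_feasible_and_representative=False,
    responsible_reuse=True,
)
print(should_proceed(proposal))
print(unmet_criteria(proposal))
```

Returning the unmet criteria, rather than a bare yes or no, lets an oversight body tell applicants what would have to change before a proposal is reconsidered.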

Yet even the most promising efforts can be derailed by practical hurdles such as sluggish data pipelines, fragmented regulations, and inconsistent metadata. Introducing workflow “buffers” can help navigate these issues, ensuring EHDS-driven initiatives remain both effective and ethically sound.

Conclusion: Making the EHDS Work for Preventive Health

The EHDS is a crucial initiative that aims to standardize secure health data sharing across the EU. It lays the foundation for better access to EHRs, AI-driven health innovation, and cross-border research. As the initiative becomes operational, Europe has a singular opportunity to reimagine the future of public health.

However, in order for the EHDS to fully support preventive health efforts, additional steps must be taken. We conclude with four recommendations, built from the preceding discussion:

  • First, the EHDS should expand beyond its current focus on EHRs. To be truly transformative, it must find ways — through federation, interoperability, or legal pathways — to incorporate non-traditional data sources, such as environmental exposure data, consumer behavior, and urban mobility trends.
  • Second, the EHDS should strengthen governance, and in particular institutionalize a system of professionalized data stewards, both at the national level and in municipalities. Data stewards should be supported through clear mandates, capacity-building programs, and sustainable funding mechanisms.
  • Third, the EHDS must develop tools and methods within its governance architecture to identify high priority use cases. Whether through formal advisory panels, public consultations, or approaches like the 100 Questions methodology, data reuse efforts should be aligned with the highest public value and best possible use of public resources.
  • Fourth, a culture of trust and transparency must be at the heart of the EHDS, so as to build a social license for data reuse. This will require constant public engagement, community outreach, and attention to norms and values as well as formal laws and policies. Europe must lead not only in building infrastructure, but in earning public confidence through ethical use.

If these steps are taken, the EHDS will not only support data-driven research and clinical innovation across the EU. It will also enable a long-overdue shift toward preventive health, offering a model for the EU and for the wider world.

Thanks to Hannah Chafetz, Adam Zable, and Roy Saurabh for input to earlier versions.

Appendix: A few examples of the use of non-traditional data for preventive health

Respiratory Disease Prevention

Chronic Disease Prevention via Consumer Data

Mental Health and Burnout Detection

Climate-Driven Health Risk Prediction

Pandemic Prediction with Telecom Data

About the author

Stefaan G. Verhulst is Co-Founder and Chief Research and Development Officer of The GovLab, as well as Director of its Data Program. He is also an Editor-in-Chief of Data & Policy.

***

This is the blog for Data & Policy (cambridge.org/dap), a peer-reviewed open access journal published by Cambridge University Press in association with the Data for Policy Conference and Community Interest Company.


Blog for Data & Policy, an open access journal at CUP (cambridge.org/dap). Eds: Zeynep Engin (Turing), Jon Crowcroft (Cambridge) and Stefaan Verhulst (GovLab)