At a time when the NHS faces unprecedented operational pressure, the accuracy and currency of clinical codesets rarely attract attention. Yet they quietly shape everything from population health analytics to medicines optimisation, risk stratification, research and national reporting.
Clinical practice does not stand still. Medicines are withdrawn and introduced, terminologies are updated, new therapies emerge, and guidelines change. Without a disciplined approach to codeset maintenance, organisations can quickly end up working with definitions and rules that no longer reflect clinical reality.
This creates four systemic risks:
- patient safety risk - outdated codesets may fail to identify patients who need monitoring, review, or recall. Equally, they can flag patients unnecessarily, creating noise and additional workload
- operational risk - commissioners and providers may make decisions based on incomplete or inaccurate data, weakening planning, resource allocation, and performance management
- reputational risk - national reporting, audit outputs, and benchmarking can become unreliable, eroding confidence in data quality
- research integrity risk - research outputs become unreliable, and the insights derived from them risk misleading rather than informing
A codeset may be FAIR (findable, accessible, interoperable and reusable), yet still be clinically unsound.
All too often, codesets are reused without recognising the need for ongoing review - assuming that openness implies accuracy, validation, and alignment with the latest clinical standards. They are also frequently applied beyond their original purpose, with code groups designed for one clinical or analytical context repurposed for another without sufficient scrutiny. The result is not only outdated definitions, but misaligned ones: codes that are technically valid yet conceptually inappropriate for the question being asked.
This creates a subtle but significant vulnerability. Outdated or misapplied definitions can be propagated across multiple studies, analytics platforms and supplier products, embedding inconsistencies at scale. Without clear governance for maintaining and versioning shared codesets, the benefits of FAIR risk being undermined by the unintended spread of definitions that no longer reflect current clinical practice.
Codesets are the invisible infrastructure that underpins safe care, effective commissioning, trustworthy data, and credible research. When they fall out of date, the consequences ripple across the system. When they are maintained well, they become a strategic asset.
Recognising the importance of this infrastructure, PRIMIS has developed a set of guiding principles for high-quality clinical codesets. These principles set out clear expectations for designing, governing, maintaining and reusing codesets, emphasising that they are living assets rather than static lists. By promoting consistent stewardship, they aim to ensure that codesets continue to support safe care, reliable data and credible research - helping to mitigate the risks associated with outdated or misapplied definitions.
Guiding principles for high-quality clinical codesets
To uphold the highest standards of clinical integrity and digital reliability, every codeset should embody the following principles:
1. Aligned with current clinical practice
A codeset should reflect the latest recommended clinical guidance, safety alerts, and evidence. It should evolve in step with modern practice, ensuring that the logic used to identify patients remains clinically relevant and safe.
2. Built with a clear and explicit purpose
Every codeset should have a clearly defined intent. Its scope, clinical question, and intended use should be unambiguous, enabling users to understand exactly what it identifies and, equally important, what it does not identify or should not be used for.
3. Supported by transparent metadata
A high‑quality codeset includes detailed metadata documenting its inclusions, exclusions, assumptions, and rationale. This transparency enables reproducibility, auditability, and informed reuse.
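As an illustration only, the metadata described above could be held in a small structured record and checked for gaps before a codeset is shared. The sketch below is in Python. The class name, field names and the `missing_metadata` helper are hypothetical, chosen to mirror the items listed in this principle. They are not a PRIMIS schema or any published standard.

```python
from dataclasses import dataclass, fields
from typing import List, Tuple


@dataclass(frozen=True)
class CodesetMetadata:
    """Hypothetical metadata record; field names are illustrative assumptions."""
    name: str
    purpose: str
    inclusions: Tuple[str, ...]
    exclusions: Tuple[str, ...]
    assumptions: Tuple[str, ...]
    rationale: str


def missing_metadata(meta: CodesetMetadata) -> List[str]:
    """Return the names of fields that are empty.

    An empty field is treated as undocumented. If there are genuinely no
    exclusions, that should be recorded as an explicit statement rather
    than left blank.
    """
    return [f.name for f in fields(meta) if not getattr(meta, f.name)]


# Placeholder values for illustration; they do not describe a real codeset.
example = CodesetMetadata(
    name="Example codeset",
    purpose="Identify patients for an example monitoring review",
    inclusions=("Description of included concepts",),
    exclusions=(),
    assumptions=("Codes are recorded in primary care records",),
    rationale="Documented reasoning for the inclusion choices",
)

print(missing_metadata(example))
```

Here the check reports `exclusions` as undocumented, which is the kind of gap a reviewer would want to resolve before the codeset is reused elsewhere.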
4. Grounded in an understanding of source data
Those developing the codeset must understand how data items are recorded at source - including coding behaviour, system constraints, and known data quality issues - to ensure the logic is both realistic and robust.
5. Clinically curated and expertly overseen
A good codeset is shaped and validated by a clinician with deep knowledge of the relevant terminology and supported by health informatics expertise. This partnership ensures both clinical accuracy and technical feasibility.
6. Maintained through a defined review plan
A codeset is not static. It should be governed by a clear, scheduled review process that tracks changes in clinical terminologies, clinical guidelines and emerging evidence. Regular updates help protect against drift and maintain trust.
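One way to make such a review plan concrete is to record the date of the last review and flag a codeset when either the scheduled interval has elapsed or a terminology release has been published since that review. The Python sketch below assumes this simple rule. The function names, the interval and the dates are hypothetical and do not represent a recommended review frequency or an actual release schedule.

```python
from datetime import date, timedelta
from typing import Iterable


def next_review_due(last_reviewed: date, interval_days: int) -> date:
    """Date by which the next scheduled review should have taken place."""
    return last_reviewed + timedelta(days=interval_days)


def needs_review(
    last_reviewed: date,
    interval_days: int,
    today: date,
    terminology_releases: Iterable[date] = (),
) -> bool:
    """Flag a codeset for review under two assumed triggers.

    1. The scheduled interval since the last review has elapsed.
    2. A terminology release was published after the last review
       and on or before today.
    """
    overdue = today > next_review_due(last_reviewed, interval_days)
    new_release = any(
        last_reviewed < released <= today for released in terminology_releases
    )
    return overdue or new_release


# Illustrative dates only.
last = date(2024, 1, 1)
print(needs_review(last, 365, date(2024, 6, 1)))
print(needs_review(last, 365, date(2024, 6, 1), [date(2024, 4, 1)]))
```

In this example the first check is not yet due on schedule, while the second is flagged because a release postdates the last review. Changes to clinical guidelines or emerging evidence could be handled as further triggers in the same way.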
Kerry Oliver, PRIMIS