2017 SemEHRSurfacingSemanticDatafrom


Subject Headings: SemEHR System.

Notes

Cited By

Quotes

Abstract

Background

Deriving structured data from unstructured clinical notes in electronic health records (EHRs) requires natural language processing and clinical expertise, which is often costly and frequently a one-off investment. We implemented SemEHR, a semantic search system that reduces the expertise and effort required in this context. We aimed to use it to characterise and select patients for projects such as the UK Department of Health 100,000 Genomes Project.

Methods

Built upon the off-the-shelf toolkits Bio-YODIE and CogStack, SemEHR integrates heterogeneous EHR documents and identifies contextualised (negation, temporality, and experiencer) mentions of a wide range of biomedical concepts, including those from SNOMED CT, ICD-10, LOINC, and Drug Ontology. Text mining and semantics techniques are incorporated to derive a longitudinal patient panorama, combining structured profiles and unstructured records, available through semantic search interfaces.

Findings

We deployed SemEHR in various UK hospital EHRs, including the South London and Maudsley NHS Foundation Trust, where 46 million concept mentions were identified from 18 million documents. In a liver disease study, SemEHR identified 94 of 100 hepatitis C positive manually annotated patients. In an HIV study, SemEHR identified 21 of 23 true positives in a 1000-patient cohort. At King's College Hospital, SemEHR is being used to recruit patients into the 100,000 Genomes Project, where ontological associations are integrated to match recruitment criteria and populate complex phenotype models. A preliminary evaluation suggests that the tool is able to validate previously submitted cases and is very fast in searching phenotypes.
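
The two study results above can be read as recall (sensitivity) figures. The short calculation below works them out from the counts given in the abstract; treating them as recall is an interpretation of the wording, and the function and variable names are illustrative only.

```python
def recall(found: int, total_positive: int) -> float:
    """Fraction of known positive patients that were retrieved."""
    return found / total_positive


# Counts taken from the Findings paragraph.
hep_c_recall = recall(94, 100)   # hepatitis C: 94 of 100 annotated positives
hiv_recall = recall(21, 23)      # HIV: 21 of 23 true positives

# Average concept mentions per document at the reported deployment.
mentions_per_document = 46_000_000 / 18_000_000

print(f"Hepatitis C recall: {hep_c_recall:.1%}")
print(f"HIV recall: {hiv_recall:.1%}")
print(f"Mentions per document: {mentions_per_document:.2f}")
```

The abstract does not report how many patients were retrieved incorrectly, so precision cannot be derived from these counts.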

Interpretation

Using SemEHR, a query such as "find patients with a family history of hepatitis C", which previously might have required the user to have natural language processing expertise, becomes a simple search, for which SemEHR retrieves a relevant patient cohort, populates patient-level summaries, and provides a link to each mention in the original source. Results and feedback from the multiple studies have proven its efficiency: previously weeks or months of work can be done within minutes in some cases.
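
To illustrate how such a query could reduce to a filter over contextualised mentions, here is a minimal sketch. It is not SemEHR's actual schema or API, which the abstract does not describe: the `Mention` record, its field names, the `find_cohort` helper, and the sample data are all hypothetical. The sketch assumes that "family history" corresponds to a non-negated mention whose experiencer is someone other than the patient.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    """One contextualised concept mention (hypothetical schema)."""
    patient_id: str
    doc_id: str
    concept: str        # concept label; a real system would use an identifier
    negated: bool       # negation context
    temporality: str    # temporality context, e.g. "recent" or "historical"
    experiencer: str    # "patient" or "other" (e.g. a family member)
    start: int          # character offsets of the mention in the source document
    end: int


def find_cohort(mentions, concept, experiencer="patient", negated=False):
    """Group matching mentions by patient, keeping links back to the source text."""
    cohort = defaultdict(list)
    for m in mentions:
        if (m.concept == concept
                and m.experiencer == experiencer
                and m.negated == negated):
            cohort[m.patient_id].append(m)
    return dict(cohort)


# Made-up sample annotations, for illustration only.
mentions = [
    Mention("p1", "d1", "hepatitis C", False, "historical", "other", 120, 131),
    Mention("p2", "d7", "hepatitis C", True, "recent", "patient", 40, 51),
    Mention("p3", "d9", "hepatitis C", False, "recent", "patient", 15, 26),
]

# "Find patients with a family history of hepatitis C".
family_history = find_cohort(mentions, "hepatitis C", experiencer="other")
for patient_id, hits in family_history.items():
    for m in hits:
        print(patient_id, m.doc_id, m.start, m.end)
```

Keeping the document identifier and character offsets on each mention is what would let a search result link back to the original text, as the paragraph above describes.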

Funding

Medical Research Council (MC_PC_14089); NIHR Biomedical Research Centre for Mental Health, Biomedical Research Unit for Dementia; the European Union's Horizon 2020 (No 644753 KConnect); Wellcome Trust Seed Award in Science (109823/Z/15/Z); National Institute for Health Research University College London Hospital's Biomedical Research Centre; Arthritis Research UK; British Heart Foundation; Cancer Research UK; Chief Scientist Office; Economic and Social Research Council; Engineering and Physical Sciences Research Council; National Institute for Social Care and Health Research; Wellcome Trust (grant MR/K006584/1).

References


2017 SemEHRSurfacingSemanticDatafrom

Author: Angus Roberts, Honghan Wu, Giulia Toti, Katherine I Morley, Amos Folarin, Richard Jackson, Ismail Kartoglu, Asha Agrawal, Clive Stringer, Darren Gale, Matthew Broadbent, Robert Stewart, Zina Ibrahim, Genevieve M Gorrell, Richard J B Dobson
Title: SemEHR: Surfacing Semantic Data from Clinical Notes in Electronic Health Records for Tailored Care, Trial Recruitment, and Clinical Research
DOI: 10.1016/S0140-6736(17)33032-5
Year: 2017