Matching Patient Records to Clinical Trials Using Ontologies

This paper describes a large case study that explores the applicability of ontologies and semantic technology to problems in the medical domain. We investigate whether ontologies can be used to automate common clinical tasks that are currently labor-intensive and error-prone, and we focus our case study on improving cohort selection for clinical trials. An obstacle to automating such tasks is the need to bridge the semantic gulf between raw patient data, such as laboratory tests or specific medications, and the way a clinician interprets this data. Our key insight is that matching patients to clinical trials can be formulated as a problem of semantic retrieval. We describe the technical challenges in building a realistic case study, which include scalability, the integration of large terminologies, and the handling of noisy, inconsistent data. Our solution is based on the SNOMED CT ontology and scales to one year of patient records (approx. 240,000 patients).

By: Chintan Patel; James Cimino; Julian Dolby; Achille Fokoue; Aditya Kalyanpur; Aaron Kershenbaum; Li Ma; Edith Schonberg; Kavitha Srinivas

Published in: IBM Research Report RC24265, 2007


This report has been submitted for publication outside of IBM and will probably be copyrighted if accepted for publication. It has been issued as a Research Report for early dissemination of its contents. In view of the transfer of copyright to the outside publisher, its distribution outside of IBM prior to publication should be limited to peer communications and specific requests. After outside publication, requests should be filled only by reprints or legally obtained copies of the article (e.g., payment of royalties).

