An interoperable similarity-based cohort identification method using the OMOP common data model version 5.0

NU author(s): Dr Anando Sen


Full text for this publication is not currently held within this repository. Alternative links are provided below where available.


Cohort identification for clinical studies tends to be laborious, time-consuming, and expensive. Developing automated or semi-automated methods for cohort identification is one of the “holy grails” in the field of biomedical informatics. We propose a high-throughput similarity-based cohort identification algorithm by applying numerical abstractions on electronic health records (EHR) data. We implement this algorithm using the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM), which enables sites using this standardized EHR data representation to avail this algorithm with minimum effort for local implementation. We validate its performance for a retrospective cohort identification task on six clinical trials conducted at the Columbia University Medical Center. Our algorithm achieves an average area under the curve (AUC) of 0.966 and an average Precision at 5 of 0.983. This interoperable method promises to achieve efficient cohort identification in EHR databases. We discuss suitable applications of our method and its limitations and propose warranted future work.

Publication metadata

Author(s): Chakrabarti S, Sen A, Huser V, Hruby GW, Rusanov A, Albers DJ, Weng C

Publication type: Article

Publication status: Published

Journal: Journal of Healthcare Informatics Research

Year: 2017

Volume: 1

Issue: 1

Pages: 1-18

Online publication date: 08/06/2017

Acceptance date: 19/05/2017

ISSN (print): 2509-4971

ISSN (electronic): 2509-498X

Publisher: Springer Nature


DOI: 10.1007/s41666-017-0005-6

