Enriching representation learning using 53 million patient notes through human phenotype ontology embedding

NU author(s): Dr David Lewis-Smith


Full text for this publication is not currently held within this repository. Alternative links are provided below where available.


© 2023

The Human Phenotype Ontology (HPO) is a dictionary of >15,000 clinical phenotypic terms with defined semantic relationships, developed to standardize phenotypic analysis. Over the last decade, the HPO has been used to accelerate the implementation of precision medicine into clinical practice. In addition, recent research in representation learning, specifically in graph embedding, has led to notable progress in automated prediction via learned features. Here, we present a novel approach to phenotype representation by incorporating phenotypic frequencies based on 53 million full-text health care notes from >1.5 million individuals. We demonstrate the efficacy of our proposed phenotype embedding technique by comparing our work to existing phenotypic similarity-measuring methods. Using phenotype frequencies in our embedding technique, we are able to identify phenotypic similarities that surpass current computational models. Furthermore, our embedding technique exhibits a high degree of agreement with domain experts' judgment. By transforming complex and multidimensional phenotypes from the HPO format into vectors, our proposed method enables efficient representation of these phenotypes for downstream tasks that require deep phenotyping. This is demonstrated in a patient similarity analysis and can further be applied to disease trajectory and risk prediction.
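To make the idea of comparing patients through phenotype vectors concrete, the following is a minimal sketch and not the authors' method. It assumes each phenotype term already has an embedding vector and an observed frequency, pools a patient's terms into one vector with rarer terms weighted more heavily (weight `-log(frequency)`, an assumption chosen for illustration), and compares patients by cosine similarity. The term names, vectors, frequencies, and the helper names `patient_vector` and `cosine_similarity` are all hypothetical toy values.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def patient_vector(terms, embeddings, frequencies):
    """Pool a patient's term embeddings into one vector.

    Each term is weighted by -log(frequency), so rarer terms contribute more.
    This weighting is an illustrative assumption, not taken from the paper.
    """
    dim = len(next(iter(embeddings.values())))
    pooled = [0.0] * dim
    total_weight = 0.0
    for term in terms:
        weight = -math.log(frequencies[term])
        total_weight += weight
        for i, x in enumerate(embeddings[term]):
            pooled[i] += weight * x
    if total_weight == 0.0:
        return pooled  # every term had frequency 1.0, so nothing is informative
    return [x / total_weight for x in pooled]


# Hypothetical toy inputs: placeholder term names, 2-d vectors, made-up frequencies.
embeddings = {"term_a": [1.0, 0.0], "term_b": [0.0, 1.0], "term_c": [1.0, 1.0]}
frequencies = {"term_a": 0.5, "term_b": 0.5, "term_c": 0.25}

patient_1 = patient_vector(["term_a", "term_b"], embeddings, frequencies)
patient_2 = patient_vector(["term_c"], embeddings, frequencies)
similarity = cosine_similarity(patient_1, patient_2)
```

In practice the embeddings would come from a trained model over the HPO graph and the frequencies from a clinical corpus; the pooling and similarity steps shown here are only one simple way such vectors could feed a patient similarity analysis.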

Publication metadata

Author(s): Daniali M, Galer PD, Lewis-Smith D, Parthasarathy S, Kim E, Salvucci DD, Miller JM, Haag S, Helbig I

Publication type: Article

Publication status: Published

Journal: Artificial Intelligence in Medicine

Year: 2023

Volume: 139

Print publication date: 01/05/2023

Online publication date: 28/02/2023

Acceptance date: 23/02/2023

ISSN (print): 0933-3657

ISSN (electronic): 1873-2860

Publisher: Elsevier B.V.


DOI: 10.1016/j.artmed.2023.102523

