Open Access

Concerns of Using Large Language Models in Health Care Research and Practice: Umbrella Review

NU author(s): Dr Pauline Addis, Megan Fairweather, Professor Dawn Craig, Hannah O'Keefe


Licence

This work is licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0).


Abstract

© 2026 JMIR Publications Inc. All rights reserved.

Background: Large language models (LLMs), such as ChatGPT (OpenAI), are rapidly evolving, and their applications in health care are increasing. There is a growing demand for automation of routine tasks and a drive to use LLMs or similar tools to support research.

Objective: This umbrella review examines the concerns of health care professionals and researchers about the use of LLMs in health care research and practice. We aimed to identify the common issues raised and their implications for patient care, policy, and practice.

Methods: A protocol was registered on PROSPERO (CRD420250640997). Searches were conducted in 7 databases (Ovid MEDLINE, Ovid Embase, Scopus, Web of Science, JBI Database of Systematic Reviews and Implementation Reports, Cochrane Database of Systematic Reviews, and Epistemonikos) in February 2025 and updated in February 2026. Screening was conducted in 2 stages, with independent screening by 2 reviewers. Studies published in the English language after January 2017 with at least one outcome expressing concerns about LLM or generative artificial intelligence use in health care research were included. The included studies were appraised for risk of bias using AMSTAR-2 (A Measurement Tool to Assess Systematic Reviews) and for certainty of the evidence using GRADE (Grading of Recommendations Assessment, Development, and Evaluation). Data were extracted using a piloted form and narratively synthesized following the SWiM (Synthesis Without Meta-analysis) guidelines and the PRIOR (Preferred Reporting Items for Overviews of Reviews) checklist.

Results: The search retrieved 448 systematic reviews, of which 42 met the inclusion criteria. In total, 12 distinct populations were identified, including researchers and clinicians in various medical specialties. The included reviews were assessed to be of very poor quality, and the level of overlap between primary studies could not be determined. Of the included reviews, 15 focused on ChatGPT, a further 15 on 2 or more LLMs, and 12 on generic artificial intelligence. The narrative synthesis identified 3 main themes, in order from most to least frequently discussed: (1) technical capability; (2) ethical, legal, and societal; and (3) costs.

Conclusions: To our knowledge, this is the first umbrella review to address concerns about LLMs in health care research and practice. Thematic analyses provided insight into the complexity of different perspectives, and the whole-population approach revealed common narratives. However, the poor quality of the included studies and the potential overlap of results are substantial limitations. Data quality is at the heart of these concerns, and action to address them must ensure that health care professionals and researchers have the resources required to overcome these apprehensions. Ethical, legal, and societal implications of artificial intelligence use were also commonly raised. As technology accelerates and demands on health care increase, we must adapt and embrace change with equity, diversity, inclusion, and safety at the core.


Publication metadata

Author(s): Yarar F, Addis P, Fairweather M, Craig D, O'Keefe H

Publication type: Review

Publication status: Published

Journal: Journal of Medical Internet Research

Year: 2026

Volume: 28

Online publication date: 15/05/2026

Acceptance date: 09/04/2026

ISSN (print): 1439-4456

ISSN (electronic): 1438-8871

Publisher: JMIR Publications Inc.

URL: https://doi.org/10.2196/87804

DOI: 10.2196/87804

PubMed id: 42144967

Data Access Statement: All data analyzed during this study are available as supplemental materials.

