Family Medicine & Primary Care Review
eISSN: 2449-8580
ISSN: 1734-3402
1/2025
vol. 27
Abstract
Original paper

Investigating the use of a large language model in general practice

Wesley Chorney 1, Sing Hui Ling 2
  1. School of Medicine, University College Cork, Cork, Ireland
  2. School of Medicine, University of Limerick, Limerick, Ireland
Family Medicine & Primary Care Review 2025; 27(1): 19–24
Online publish date: 2025/03/26
Background
Large language models have demonstrated strong performance on many tasks. In particular, they have been shown to pass many medical knowledge tests. However, the majority of these studies do not make use of fine-tuning.

Objectives
To evaluate the suitability of fine-tuned large language models in the context of general practice by assessing their performance on the Applied Knowledge Test (AKT) of the Royal College of General Practitioners.

Material and methods
We evaluate the performance of ChatGPT 3.5 in three distinct cases using publicly available practice questions from the Royal College of General Practitioners. In the baseline case, each question is input as-is and the answer recorded. In the second case, an engineered prompt precedes each question. Finally, the model is fine-tuned on a subset of the questions and evaluated on the remainder.
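For illustration, a minimal sketch of how the three conditions might be scripted with the OpenAI Python client (v1.x) is shown below. The client calls are standard, but the prompts, question text, and fine-tuned model identifier are placeholder assumptions, not the authors' actual setup.

# Minimal sketch of the three evaluation conditions (assumed setup, not the
# authors' exact code). Requires the openai package and an OPENAI_API_KEY
# environment variable.
from openai import OpenAI

client = OpenAI()

def ask(question, system_prompt=None, model="gpt-3.5-turbo"):
    """Send one practice question to the model and return its answer text."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": question})
    response = client.chat.completions.create(model=model, messages=messages)
    return response.choices[0].message.content

question = "A 45-year-old presents with ... What is the most appropriate next step? A) ... B) ... C) ..."  # placeholder question

# Case 1 (baseline): the question is input as-is.
baseline_answer = ask(question)

# Case 2 (prompt engineering): an instructional prompt precedes the question.
engineered_answer = ask(
    question,
    system_prompt="You are a GP sitting the RCGP Applied Knowledge Test. Answer with a single option letter.",
)

# Case 3 (fine-tuning): a model fine-tuned on a held-out subset of the practice
# questions is queried in the same way (the identifier below is hypothetical).
finetuned_answer = ask(question, model="ft:gpt-3.5-turbo:example-org::abc123")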

Results
The fine-tuned model outperforms both the baseline (p = 0.005) and the prompt-engineering (p = 0.010) cases. Furthermore, the model achieves a passing mark on the AKT, with a mean score of 72.03%.

Conclusions
With further development, fine-tuned large language models could potentially be used by general practitioners to support areas of their practice. Care must be taken to ensure that the models conform to stringent standards to avoid misinforming patients or misguiding care.

Keywords:

artificial intelligence, general practice, medicine
