How generative artificial intelligence can facilitate the teaching of clinical reasoning: a scoping review

Authors

DOI:

https://doi.org/10.35699/2237-5864.2025.58339

Keywords:

artificial intelligence, clinical reasoning, generative artificial intelligence, health education, scoping review

Abstract

This review aims to map and summarize the current state of research to identify the applicability of chatbots in teaching clinical reasoning during medical training, considering the best available evidence. A systematic and comprehensive search was conducted in PubMed/MEDLINE, Web of Science, and Google Scholar databases between August 2023 and August 2024. Original studies describing educational applications aligned with evidence-based strategies for teaching clinical reasoning (self-explanation, structured reflection, case practice, and feedback) were included. The selection was complemented by snowballing and expert consultation. Twenty-one publications were included. All studies explored the use of ChatGPT (OpenAI); three (14%) also analyzed Bard (Google), two (9.5%) investigated Bing (Microsoft), and one (5%) explored other artificial intelligence tools. Our findings suggest that chatbots can support the development of clinical reasoning skills through effective educational strategies. Chatbot responses can help students build understanding, promote deliberate reflection, encourage feedback when practicing with written cases, and adapt content to the learner’s stage. Few studies raised concerns about risks and ethical issues. This review demonstrated that chatbots hold great potential to enhance the development of clinical reasoning during medical education. However, it is essential to address inherent limitations, such as the risks of hallucinations and inaccurate explanations, to maximize the technology’s educational potential.

Author Biographies

  • Guilherme Freitas Bernardo Ferreira, Universidade Professor Edson Antônio Velano

    Guilherme Freitas Bernardo Ferreira is a neurologist with a master’s degree in Health Education/Clinical Reasoning Education. He served as a professor at Unifenas-BH from 2021 to 2025.

  • Alexandre Sampaio Moura, Faculdade Santa Casa

    Alexandre Sampaio Moura is an infectious diseases physician working as a full professor at the Graduate Program in Medicine and Biomedicine at Faculdade Santa Casa, Belo Horizonte, Brazil. Prof. Moura conducts research in medical education, with a particular interest in clinical reasoning and competence-based assessment.

  • Lígia Maria Cayres Ribeiro, University Medical Center Groningen

    Lígia Cayres Ribeiro is an internal medicine physician with a PhD in Clinical Reasoning. She is a researcher at the University Medical Center Groningen, where she investigates how technology can enhance evidence-informed educational practices.

  • Maria Aparecida Turci, Universidade Professor Edson Antônio Velano

    Maria Aparecida Turci is a public health professional working as a full professor at the Graduate Program in Medicine at Professor Edson Antônio Velano University, Belo Horizonte, Brazil. Prof. Turci conducts research in public health and health professions education.

  • Sílvia Mamede, University Medical Center Groningen

    Sílvia Mamede is a guest professor at the Wenckebach Institute (WIOO), Lifelong Learning, Education and Assessment Research Network (LEARN), University Medical Center Groningen, Netherlands. She conducts research on clinical reasoning and diagnostic error in medicine; educational strategies for the teaching of clinical reasoning; reflection and experiential learning in medical education and clinical practice.

Published

2026-02-06

Issue

Section

Special section: AI in teaching-learning processes

How to Cite

FERREIRA, Guilherme Freitas Bernardo; MOURA, Alexandre Sampaio; RIBEIRO, Lígia Maria Cayres; TURCI, Maria Aparecida; MAMEDE, Sílvia. How generative artificial intelligence can facilitate the teaching of clinical reasoning: a scoping review. Revista Docência do Ensino Superior, Belo Horizonte, v. 15, p. 1–27, 2026. DOI: 10.35699/2237-5864.2025.58339. Available at: https://periodicos.ufmg.br/index.php/rdes/article/view/58339. Accessed on: 17 Apr. 2026.