Learn
With a growing body of research, publications, and applications related to large language models (LLMs) in healthcare, there is an increasing need for evaluation frameworks and tools. The publications featured below provide key recommendations, frameworks, and tools for evaluating LLMs in healthcare. They were discussed at the BrainX Community Live event in January 2025.
Bedi S, Liu Y, Orr-Ewing L, et al. Testing and Evaluation of Health Care Applications of Large Language Models: A Systematic Review. JAMA. Published online October 15, 2024. doi:10.1001/jama.2024.21700
Tam, T.Y.C., Sivarajkumar, S., Kapoor, S. et al. A framework for human evaluation of large language models in healthcare derived from literature review. npj Digit. Med. 7, 258 (2024). https://doi.org/10.1038/s41746-024-01258-7
Connect
The BrainX Community Live event in January 2025 featured Suhana Bedi (Stanford University), Dr. Ashish Atreja (UC Davis; VALIDAI), and Dr. Yanshan Wang (University of Pittsburgh). The session was moderated by Dr. Piyush Mathur (Cleveland Clinic; BrainX). The panel reviewed key aspects of human evaluation of LLMs in healthcare, including evaluation metrics, evaluation methods (who should perform the evaluations and how), and frameworks. It also explored future directions, such as new frameworks, tools, and the possibility of using LLMs as judges.
Datasets
MEDEC Dataset (Medical Error Detection and Correction in Clinical Notes)
It includes 3,848 clinical texts from the MS and University of Washington hospital collections covering five types of errors (Diagnosis, Management, Treatment, Pharmacotherapy, and Causal Organism).
BiomedParseData was created by combining 45 biomedical image segmentation datasets and using GPT-4 to generate the canonical semantic label for each segmented object. GPT-4 was also used to create a unifying biomedical object ontology for image analysis and to harmonise natural language descriptions with this ontology. The ontology encompasses three main categories (histology, organ, and abnormality), 15 meta-object types, and 82 specific object types.
The resulting BiomedParseData contains 3.4 million distinct image-mask-label triples, spanning nine imaging modalities and 25 anatomic sites, representing a large-scale and diverse dataset for semantic-based biomedical image analysis. The images include CT scans, MRI scans, chest X-rays, ultrasound images, skin lesion photos, endoscopy images, pathology whole-slide images, and eye OCT scans.
Podcast
In this episode, we feature Dr. Amol M. Joshi. Dr. Joshi is the Thomas H. Davis Professor in Business at Wake Forest University (WFU). He holds a joint faculty appointment as an Associate Professor of Strategic Management in the WFU School of Business and as an Associate Professor of Innovation & Commercialisation in the WFU School of Medicine.
Dr. Joshi is an inventor who helps others reinvent themselves. His passion is guiding aspiring entrepreneurs, executives, and students of all levels to pursue their business dreams. He has trained startup founders, corporate managers, and industry leaders in the US, Austria, Denmark, and Vietnam.
Dr. Joshi is a recognised expert on innovation whose interdisciplinary research examines how inventors create and commercialise new technologies and how they launch and fund new ventures. During a 13-year prior career as an engineer and entrepreneur in venture capital-funded startups in Silicon Valley, he was a co-inventor on two patents for AI-based voice assistant products. He co-founded BeVocal, a speech recognition software startup acquired by Nuance Communications, and served as VP of Sales & Marketing from 1999 to 2003.
Dr. Joshi has advised federal agencies, including NASA, NSF, NIH, and the US Department of Energy, on national innovation policies and R&D grant programs for small businesses. He earned a PhD in Business Administration from UNC Chapel Hill, an MBA and an MS in Engineering Sciences from Dartmouth College, and a BS in Electrical Engineering from Georgia Tech with highest honours.
Conferences
Additional BXC-featured publication
General
Join and follow the BrainX community!
Webpage: https://brainxai.org/
Newsletter: https://brainxai.substack.com/subscribe
LinkedIn: https://www.linkedin.com/groups/13599549/
YouTube: https://www.youtube.com/channel/UCua5EiLL6I29hpNrJsdv1rg