[Paving the Way for Global Health Equity with Generative AI and Learning Health Systems (LHS)]
(https://elhsi.com/myllm/finetuned)
The NAM leaders have reviewed the progress of Learning Health Systems (LHS) and provided a trust framework for their future development. More importantly, their perspective recognizes that generative AI will transform healthcare in unprecedented ways. I believe that ML-enabled LHS will fulfill the promise of the LHS vision. In 2022, we published the first ML-enabled LHS unit using synthetic patient data, initiating the subfield of simulated ML-LHS. Today, generative AI (GenAI) has significantly simplified the creation of ML-LHS units.
As large language models (LLMs) consume all publicly available data, they may soon exhaust the existing supply. According to a recent Nature news article, estimates suggest that by 2028, developers will be working with datasets equivalent to the total text available on the Internet. When that point is reached, synthetic data will likely fill the gap. In fact, synthetic data is already being used to train LLMs.
After successfully developing ML models from real EHR data using the same pipeline we created for synthetic patient data, I hypothesized that synthetic patient data would accelerate ML/AI development in the future. This hypothesis has now been validated: synthetic patient data is proving to be a key enabler for fine-tuning open-source LLMs like Llama for healthcare applications.
For instance, clinical teams are currently facing a significant bottleneck: the lack of preclinically validated LLMs for use in clinical studies. This challenge prevents doctors from initiating LLM clinical research. To address this, we recently demonstrated that synthetic patient data can be used to fine-tune open-source Llama3.1-8B models, improving diagnostic prediction and providing preclinical validation data for disease prediction. With these pre-validated, fine-tuned LLMs, any doctor can now explore the benefits of GenAI in clinical care.
Our breakthrough in creating high-accuracy, fine-tuned LLMs using synthetic data will undoubtedly accelerate the democratization of GenAI in healthcare—a core mission of the ELHS Institute.
Keep reading to explore the latest papers below, and enjoy my conversations with ChatGPT about using synthetic data to accelerate GenAI development.
Warm regards,
AJ
AJ Chen, PhD | ELHS Institute | https://elhsi.org
~
(Recent papers, news, and events showcasing the progress of GenAI and LHS)
McGinnis, J. M., H. V. Fineberg, and V. J. Dzau. 2024. Shared commitments for health and health care: A trust framework from the learning health system. NAM Perspectives. Commentary, National Academy of Medicine, Washington, DC. https://doi.org/10.31478/202412c.
[2024] The potential for practical application of generative AI in health and health care sets the stage for accelerated real-time learning, evidence generation, and expedited application for continuous improvement in outcomes. This perspective presents the shared commitments as a trust framework for health and health care services, clearly expressing the primary priorities for all health and health care stakeholders so that they can serve as drivers of organizational culture and synergy.
-
Hartman V, Zhang X, Poddar R, et al. Developing and Evaluating Large Language Model–Generated Emergency Medicine Handoff Notes. JAMA Netw Open. 2024;7(12):e2448723. doi:10.1001/jamanetworkopen.2024.48723
[2024/12] Can a large language model (LLM) generate emergency medicine (EM)-to-inpatient (IP) handoff notes that are useful and safe for EM care? In this cohort study of 1600 EM patient medical records using a novel evaluation framework, the LLM-generated EM-to-IP handoff notes had a mean usefulness of 4.04 out of 5 (compared with 4.36 for physician-written) and a mean patient safety of 4.06 out of 5 (compared with 4.50 for physician-written), with no critical patient safety risks. These findings suggest the value of a manual, patient safety–focused clinical evaluation of LLMs and the potential of LLM-generated handoff notes to create a new standard of care in EM.
-
Klang E, Apakama D, Abbott EE, et al. A strategy for cost-effective large language model use at health system-scale. NPJ Digit Med. 2024 Nov 18;7(1):320. doi: 10.1038/s41746-024-01315-1.
[2024/11] We assessed ten LLMs of different capacities and sizes utilizing real-world patient data. We conducted >300,000 experiments of various task sizes and configurations, measuring accuracy in question-answering and the ability to properly format outputs. Performance deteriorated as the number of questions and notes increased. High-capacity models, like Llama-3-70b, had low failure rates and high accuracies. GPT-4-turbo-128k was similarly resilient across task burdens, but performance deteriorated after 50 tasks at large prompt sizes. After addressing mitigable failures, these two models can concatenate up to 50 simultaneous tasks effectively, with validation on a public medical question-answering dataset. An economic analysis demonstrated up to a 17-fold cost reduction at 50 tasks using concatenation. These results identify the limits of LLMs for effective utilization and highlight avenues for cost-efficiency at the enterprise scale.
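To make the concatenation strategy concrete, here is a minimal illustrative sketch, not the authors' code, of batching many yes/no questions about one clinical note into a single prompt and parsing a structured reply; the call_llm helper is a hypothetical stand-in for whatever chat-completion client is used.
```python
# Illustrative sketch: one API call answers many questions about one note.
import json

def build_batched_prompt(note: str, questions: list[str]) -> str:
    """Concatenate many questions about a single note into one prompt."""
    numbered = "\n".join(f"{i + 1}. {q}" for i, q in enumerate(questions))
    return (
        "You are reviewing a clinical note.\n\n"
        f"NOTE:\n{note}\n\n"
        "Answer every question below with 'yes' or 'no'. "
        "Return ONLY a JSON object mapping question numbers to answers.\n\n"
        f"QUESTIONS:\n{numbered}"
    )

def parse_answers(raw_reply: str) -> dict[int, str]:
    """Parse the model's JSON reply; an empty dict marks a formatting failure."""
    try:
        data = json.loads(raw_reply)
        return {int(k): str(v).lower() for k, v in data.items()}
    except (json.JSONDecodeError, ValueError):
        return {}

# Usage (call_llm is hypothetical):
# reply = call_llm(build_batched_prompt(note_text, question_list))
# answers = parse_answers(reply)
```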
-
Perlis R, Schweitzer K. The Low-Cost, Battery-Powered AI-Enabled Ultrasound Device That Could Improve Global Obstetric Care. JAMA. Published online December 13, 2024. doi:10.1001/jama.2024.23859
[2024/12] After a decade spent studying prenatal care in Zambia, Stringer and his fellow researchers have worked to develop an AI-enabled alternative in the form of a low-cost, battery-powered ultrasonography probe that doesn’t require trained sonographers. In a clinical trial, the tool performed similarly to standard ultrasound for measuring gestational age, offering “immediate implications for obstetrical care in low-resource settings”.
-
Naved BA, Luo Y. Contrasting rule and machine learning based digital self triage systems in the USA. npj Digit. Med. 7, 381 (2024).
[2024/12] Patient smart access and self-triage systems have been in development for decades. As of now, no LLM for processing self-reported patient data has been published by health systems, yet many expert systems and computational models have been released to millions of users. This review is the first to summarize progress in the field, including an analysis of the exact self-triage solutions available on the websites of 647 health systems in the USA.
-
Williams, M., Karim, W., Gelman, J. et al. Ethical data acquisition for LLMs and AI algorithms in healthcare. npj Digit. Med. 7, 377 (2024).
[2024/12] Artificial intelligence (AI) algorithms will become increasingly integrated into our healthcare systems in the coming decades. These algorithms require large volumes of data for development and fine-tuning. Patient data is typically acquired for AI algorithms through an opt-out system in the United States, while others support an opt-in model. We argue that ethical principles around autonomy, patient ownership of data, and privacy should be prioritized in the data acquisition paradigm.
~
(AJ in conversation with ChatGPT to understand how machines view human healthcare)
I see a bottleneck that most clinical teams face because they cannot use affordable, smaller open-source LLMs, such as Llama 3.1-8B, in clinical applications. For many diseases, these LLMs do not have the high accuracy required for clinical predictions, such as diagnostic predictions. Do you believe this is a real bottleneck for doctors? How can this bottleneck be removed?
ChatGPT
Yes, this is a real bottleneck for clinical teams, and it has significant implications for the democratization of generative AI (GenAI) in healthcare. Open-source LLMs like Llama3.1-8B offer affordability and flexibility, but their limited out-of-the-box accuracy for clinical tasks, such as diagnostic prediction, makes them unsuitable for many use cases where high accuracy is critical. Here’s an analysis of the bottleneck and potential solutions:
Why This is a Real Bottleneck
Strategies to Remove This Bottleneck
1. Fine-Tuning Open-Source LLMs for Clinical Use
2. Developing a User-Friendly GenAI Platform
3. Democratizing Access Through Pre-Validated Models
4. Collaboration and Training for Clinical Teams
5. Integrating Feedback Loops
Conclusion
The bottleneck arises not from the lack of technology but from the gap in enabling clinical teams to adapt affordable open-source LLMs for high-stakes applications. By providing fine-tuned, pre-validated models and empowering teams with training and accessible tools, this bottleneck can be removed. This would not only enhance diagnostic prediction accuracy but also accelerate the democratization of GenAI in healthcare, ensuring equitable access to advanced tools across all settings.
I believe synthetic data is critical because it can be used to fine-tune LLMs. This step can produce fine-tuned models that are acceptable for clinical validation, thus enabling clinical teams to start using these fine-tuned models in clinical research. This will accelerate the development of fine-tuned LLMs using real patient data. What are your thoughts on this?
ChatGPT
You are absolutely correct—synthetic data plays a pivotal role in fine-tuning LLMs for clinical applications, acting as a bridge to overcome current barriers in clinical research and model validation. This approach not only accelerates the deployment of fine-tuned LLMs but also ensures compliance with data privacy standards while enabling clinical teams to gain confidence in these models before transitioning to real patient data.
Why Synthetic Data is Critical
How Synthetic Data Can Accelerate Development
1. Fine-Tuning LLMs with Synthetic Data: Use synthetic datasets for initial fine-tuning, tailoring models to specific tasks (e.g., diagnostic prediction, risk stratification). Employ domain-specific synthetic data generators (e.g., generative adversarial networks or simulation-based approaches) to produce high-fidelity clinical data. (A minimal fine-tuning sketch follows this list.)
2. Preparing Models for Clinical Validation: LLMs fine-tuned on synthetic data can reach a performance threshold suitable for validation with real-world patient data in clinical settings. This ensures that clinical teams start with a model that already demonstrates robust performance, reducing the burden of initial testing.
3. Enabling Clinical Research: Once fine-tuned models achieve acceptable accuracy levels with synthetic data, clinical teams can use these models in real-world research under strict governance. This ensures that real patient data is only introduced when models are mature and trustworthy.
4. Enhancing Feedback Loops: Synthetic data allows models to evolve through continuous feedback from both clinical research and synthetic test cases. This iterative process strengthens models before full-scale deployment.
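As a concrete illustration of step 1 above, here is a minimal sketch, based on my own assumptions rather than any published pipeline, of LoRA fine-tuning an open-source model on synthetic symptom–diagnosis pairs with Hugging Face transformers and peft; the model name, data path, and hyperparameters are placeholders.
```python
# Minimal LoRA fine-tuning sketch on synthetic symptom-diagnosis pairs.
# BASE_MODEL and DATA_PATH are placeholders; real runs need a GPU and a license.
import json
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)
from peft import LoraConfig, get_peft_model

BASE_MODEL = "meta-llama/Llama-3.1-8B"          # placeholder model name
DATA_PATH = "synthetic_diagnosis_pairs.jsonl"   # placeholder data path

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(BASE_MODEL)
model = get_peft_model(model, LoraConfig(r=16, lora_alpha=32,
                                         target_modules=["q_proj", "v_proj"],
                                         task_type="CAUSAL_LM"))

def to_text(example):
    # each synthetic record is assumed to look like
    # {"symptoms": "...", "diagnosis": "..."}
    return {"text": f"Patient presentation: {example['symptoms']}\n"
                    f"Most likely diagnosis: {example['diagnosis']}"}

records = [json.loads(line) for line in open(DATA_PATH)]
dataset = Dataset.from_list(records).map(to_text)
tokenized = dataset.map(lambda ex: tokenizer(ex["text"], truncation=True,
                                             max_length=512),
                        remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="llama-synthetic-dx",
                           num_train_epochs=3,
                           per_device_train_batch_size=4,
                           learning_rate=2e-4),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```
In this sketch, LoRA keeps the trainable update small enough to run on a single GPU, which is the point for clinical teams working with affordable open-source models.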
Steps to Implement This Strategy
Benefits of This Approach
This approach aligns perfectly with your vision of democratizing GenAI in healthcare. Would you like to discuss specific tools for generating synthetic data or strategies for clinical collaboration?
Yes, please list common tools and strategies for generating synthetic patient data to fine-tune disease prediction LLMs.
ChatGPT
Here’s a comprehensive list of tools and strategies commonly used for generating synthetic patient data, particularly for fine-tuning LLMs in disease prediction:
Common Tools for Generating Synthetic Patient Data
1. Open-Source Synthetic Data Generators
2. AI-Driven Generative Models
3. Federated Data Synthesis
4. Commercial Tools
Strategies for Generating Synthetic Data for Disease Prediction
1. Tailored Dataset Generation
2. Augmentation of Limited Real Datasets
3. Longitudinal Data Simulation
4. Multi-Modal Data Synthesis
5. Real-World Validation
Integrating Synthetic Data in Fine-Tuning Pipelines
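As one hedged example of such integration, the sketch below converts Synthea-style synthetic EHR exports into prompt/completion pairs for disease-prediction fine-tuning; the CSV file names and column names are assumptions about the export format and should be adjusted to whatever generator is actually used.
```python
# Sketch: turn synthetic EHR exports into prompt/completion training pairs.
# File and column names are assumptions about a Synthea-style export.
import json
import pandas as pd

patients = pd.read_csv("patients.csv")      # assumed columns: Id, BIRTHDATE, GENDER
conditions = pd.read_csv("conditions.csv")  # assumed columns: PATIENT, START, DESCRIPTION

with open("synthetic_diagnosis_pairs.jsonl", "w") as out:
    for pid, group in conditions.sort_values("START").groupby("PATIENT"):
        history = group["DESCRIPTION"].tolist()
        if len(history) < 2:
            continue  # need at least one prior condition and one target condition
        demo = patients.loc[patients["Id"] == pid].iloc[0]
        prompt = (f"Patient: {demo['GENDER']}, born {demo['BIRTHDATE']}. "
                  f"History: {'; '.join(history[:-1])}. "
                  "What new condition is most likely next?")
        out.write(json.dumps({"symptoms": prompt,
                              "diagnosis": history[-1]}) + "\n")
```
The resulting JSONL file matches the symptoms/diagnosis format assumed in the fine-tuning sketch shown earlier in this conversation.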
~
The new copilot platform makes it easy for doctors and patients to use GenAI:
Meta AI helps you learn, create and connect in new ways. With our latest features you can now talk with recognizable voices and share images with Meta AI to learn more about your surroundings. Ask Meta AI anything.
~
ELHS Institute Relevant Resources:
Stanford University RAISE Health Initiative
~ the end ~
Democratizing Generative AI in Healthcare to Help Achieve Global Health Equity
info@elhsi.org
Palo Alto, California, USA