(Paving the Way for Global Health Equity with Generative AI and Learning Health Systems)
Dear Friends,
Generative AI is having a profound, and sometimes unsettling, impact on our society. It may eliminate or diminish many jobs across different industries, while at the same time creating new roles. In healthcare, GenAI may require the creation of new positions in care delivery in order to function optimally and improve care quality and patient outcomes.
Harvard professors have proposed one such role: the Algorithmic Consultant (see the npj Digital Medicine paper below). They envision this role as similar to that of clinical pharmacists in hospitals today. An algorithmic consultant would have two primary responsibilities: providing point-of-care consultations with physicians who seek input from AI systems on specific clinical scenarios, and overseeing and managing the hospital's broader ecosystem of algorithms.
Since I have been studying the convergence of GenAI and the Learning Health System (LHS) vision proposed by the U.S. National Academy of Medicine (NAM), I believe a much larger and deeper structural change is needed in healthcare before GenAI’s full potential can be realized. One such change would be the creation of a new AI Specialist role, analogous to pathologists or radiologists. This specialist would generate AI reports that physicians can use as additional information to support differential diagnoses and personalized treatment planning. The AI Specialist role could be added to existing Medical Informatics departments or placed in a new department dedicated to predictive care.
These AI reports could be applied in multiple clinical settings. Beyond the obvious outpatient and inpatient points of care, they could also be used for automated, daily background monitoring of clinical decisions in diagnosis and treatment. Every specialty department could implement daily monitoring across all conditions within its scope, reminding physicians to consider overlooked disease causes or confirm necessary diagnostic tests. This, in turn, would improve clinical training for young doctors and raise the overall quality of care. Such broad monitoring was recommended in the 2015 NAM report Improving Diagnosis in Health Care—yet few, if any, hospitals have been able to implement it, largely because the resource demands were prohibitive.
In my 2024 review of GenAI democratization in healthcare, I highlight these hospital-wide monitoring tasks as an example of GenAI's unique potential roles in care delivery. The review also proposes LHS units as a structural guarantee, a concept I first published in Scientific Reports in 2022.
Below, I discuss with my copilot, ChatGPT, the emerging roles of GenAI in learning health systems. I hope you enjoy these machine-human conversations.
Warm regards,
AJ
AJ Chen, PhD
Founder and PI, ELHS Institute
Silicon Valley, USA
🔗 ELHS Newsletters: https://elhsi.org/Newsletters
🔗 ELHS GenAI Platform: https://elhsi.com
~
From Page Mill
(Recent papers, news, and events showcasing the progress of GenAI and LHS)
Marwaha, J.S., Yuan, W., Poddar, M. et al. The algorithmic consultant: a new era of clinical AI calls for a new workforce of physician-algorithm specialists. npj Digit. Med. 8, 552 (2025).
[2025/8] As complex AI systems become more common in clinical decision-making, a new type of physician-data science specialist is needed to bridge the gap between these AI tools and practicing clinicians. Analogous to how clinical pharmacists currently guide appropriate medication use and govern a hospital’s formulary, this specialist would offer point-of-care guidance on AI tool selection and interpretation, and manage a hospital’s AI systems. This role aims to enable safe and effective clinical AI by collaborating with patient-facing providers to ensure appropriate model application in the real world.
Li, R., Wang, X., Berlowitz, D. et al. CARE-AD: a multi-agent large language model framework for Alzheimer’s disease prediction using longitudinal clinical notes. npj Digit. Med. 8, 541 (2025).
[2025/8] In this study, we present CARE-AD (Collaborative Analysis and Risk Evaluation for Alzheimer’s Disease), a multi-agent LLM-based framework for forecasting Alzheimer’s disease (AD) onset by analyzing longitudinal electronic health record (EHR) notes. CARE-AD assigns specialized LLM agents to extract signs and symptoms relevant to AD and conduct domain-specific evaluations—emulating a collaborative diagnostic process. In a retrospective evaluation, CARE-AD achieved higher accuracy (0.53 vs. 0.26–0.45) than baseline single-model approaches in predicting AD risk 10 years prior to the first recorded diagnosis code. These findings highlight the feasibility of using multi-agent LLM systems to support early risk assessment for AD and motivate further research on their integration into clinical decision support workflows.
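The multi-agent pattern CARE-AD describes can be illustrated with a minimal sketch: specialized "agents" each score one clinical domain from longitudinal notes, and a coordinator aggregates their scores into an overall risk estimate. The keyword heuristic, agent names, and aggregation rule below are hypothetical stand-ins for the paper's LLM-based agents, shown only to make the collaborative structure concrete.

```python
# Sketch of a multi-agent risk-evaluation loop (illustrative, not CARE-AD's code).
from dataclasses import dataclass

@dataclass
class DomainAgent:
    name: str
    keywords: tuple  # hypothetical proxy for an LLM agent's domain-specific extraction

    def assess(self, notes: list[str]) -> float:
        """Return a 0-1 score: fraction of notes mentioning a domain-relevant sign."""
        hits = sum(any(k in note.lower() for k in self.keywords) for note in notes)
        return hits / len(notes) if notes else 0.0

def coordinator(agents: list[DomainAgent], notes: list[str]) -> dict:
    """Aggregate per-domain scores into an overall risk (simple mean here)."""
    scores = {a.name: a.assess(notes) for a in agents}
    scores["overall"] = sum(scores.values()) / len(agents)
    return scores

agents = [
    DomainAgent("memory", ("forgetful", "memory loss")),
    DomainAgent("language", ("word-finding", "aphasia")),
    DomainAgent("function", ("needs assistance", "adl decline")),
]
notes = [
    "Patient reports being forgetful; word-finding difficulty noted.",
    "Follow-up: memory loss worsening, needs assistance with finances.",
]
print(coordinator(agents, notes))
```

In the actual framework each agent is an LLM reasoning over the note text, and the coordinator emulates a case-conference discussion rather than taking a mean.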
Khasentino, J., Belyaeva, A., Liu, X. et al. A personal health large language model for sleep and fitness coaching. Nat Med (2025).
[2025/8] Here we introduce the Personal Health Large Language Model (PH-LLM), designed for applications in sleep and fitness. PH-LLM is a version of the Gemini LLM that was finetuned for text understanding and reasoning when applied to aggregated daily-resolution numerical sensor data. In a comprehensive evaluation involving 857 real-world case studies, PH-LLM performed similarly to human experts for fitness-related tasks and improved over the base Gemini model in providing personalized sleep insights. Finally, PH-LLM effectively predicted self-reported sleep quality using a multimodal encoding of wearable sensor data, further demonstrating its ability to effectively contextualize wearable modalities.
Wang, Z., Cao, L., Danek, B. et al. Accelerating clinical evidence synthesis with large language models. npj Digit. Med. 8, 509 (2025).
[2025/8] Here, we propose a generative artificial intelligence (AI) pipeline named TrialMind to streamline study search, study screening, and data extraction tasks in SR. We chose published SRs to build TrialReviewBench, which contains 100 SRs and 2,220 clinical studies. For study search, it achieves high recall rates (ours 0.711–0.834 vs. human baseline 0.138–0.232). For study screening, TrialMind outperforms previous document-ranking methods by a 1.5–2.6-fold change. For data extraction, it outperforms GPT-4's accuracy by 16–32%. In a pilot study, human-AI collaboration with TrialMind improved recall by 71.4% and reduced screening time by 44.2%, while in data extraction, accuracy increased by 23.5% with a 63.4% time reduction. Medical experts preferred TrialMind's synthesized evidence over GPT-4's in 62.5%–100% of cases. These findings show the promise of accelerating clinical evidence synthesis driven by human-AI collaboration.
Luo, MJ., Bi, S., Pang, J. et al. A large language model digital patient system enhances ophthalmology history taking skills. npj Digit. Med. 8, 502 (2025).
[2025/8] To address this, we developed a large language model-based digital patient (LLMDP) system that transforms de‑identified electronic health records into voice‑enabled virtual patients capable of free‑text dialog and adaptive feedback, based on our previously established open-source retrieval-augmented framework. In a single‑center randomized controlled trial (ClinicalTrials.gov: NCT06229379; N = 84), students trained with LLMDP achieved a 10.50-point increase in medical history-taking assessment scores (95% CI: 4.66–16.33, p < 0.001) compared to those using traditional methods.
Wu, J., Wang, Y., Zhong, Z. et al. Vision-language foundation model for 3D medical imaging. npj Artif. Intell. 1, 17 (2025).
[2025/8] Recent advances in AI, especially vision-language foundation models (VLFMs), show promise in automating radiology report generation from complex 3D medical imaging data. Our review analyzes 23 studies on VLFMs, focusing on model architectures, capabilities, training datasets, and evaluation metrics. We discuss AI’s evolution in radiology, emphasizing the need for diverse datasets and standardized metrics, as challenges remain in producing consistent, high-quality reports.
You JG, Dbouk RH, Landman A, et al. Ambient Documentation Technology in Clinician Experience of Documentation Burden and Burnout. JAMA Netw Open. 2025;8(8):e2528056. doi:10.1001/jamanetworkopen.2025.28056
[2025/8] This study examined the prevalence of ambient documentation technology (ADT) use and its association with clinicians' experience of documentation burden and burnout before and after use. A total of 128 of 264 MGB survey respondents (48.5%) had self-reported using ADT for at least 50% of their patient visits, while 27 of 62 Emory respondents (43.5%) self-reported usage for most or all of their visits. The proportion of MGB clinicians reporting burnout (Professional Fulfillment Index overall burnout score >1.33) decreased from 50.6% to 29.4% (χ2 = 42.4; P < .001) at 42 days and from 52.6% to 30.7% (χ2 = 32.7; P < .001) at 84 days. The proportion of Emory clinicians reporting a positive impact of their documentation practice on well-being (defined as a score of 3-4 for positive or very positive) improved from 1.6% to 32.3% (χ2 = 19.0; P < .001). These findings suggest that ADT may enhance clinicians' documentation-related experience and reduce burnout.
Bedi S, Jiang Y, Chung P, et al. Fidelity of Medical Reasoning in Large Language Models. JAMA Netw Open. 2025;8(8):e2526021. doi:10.1001/jamanetworkopen.2025.26021
[2025/8] We evaluated 6 models spanning different architectures and capabilities: DeepSeek-R1 (model 1) and o3-mini (model 2), both reasoning models; Claude-3.5 Sonnet (model 3); Gemini-2.0-Flash (model 4); GPT-4o (model 5); and Llama-3.3-70B (model 6). When forced to reason beyond familiar answer patterns, all models demonstrate declines in accuracy, challenging claims of artificial intelligence's readiness for autonomous clinical deployment.
Giglio B, Albeloushi A, Alhaj AK, et al. Artificial Intelligence–Augmented Human Instruction and Surgical Simulation Performance: A Randomized Clinical Trial. JAMA Surg. August 06, 2025. doi:10.1001/jamasurg.2025.2564
[2025/8] In this randomized clinical trial of 88 medical students, trainees achieved significantly higher performance scores when tutored by a human educator providing personalized feedback based on artificial intelligence error data than by an intelligent tutor alone. Providing human educators with artificial intelligence performance data to tailor feedback improves learning outcomes in surgical simulation training.
Sahhar M, Nassar JE, Rouffiac A, et al. Enhanced Surgical Instructions Using Large Language Models. JAMA Surg. August 13, 2025. doi:10.1001/jamasurg.2025.2825
[2025/8] Clear preoperative and postoperative instructions are essential, yet some patients struggle to comprehend them, increasing their risk for complications. These risks can be mitigated through simplified materials, such as the Easy-to-Read library (Elsevier), the only known large-scale effort to offer professionally simplified surgical instructions through electronic medical records (EMRs). This study evaluates the hypothesis that artificial intelligence can simplify surgical instructions as effectively as Easy-to-Read versions, with respect to readability, fidelity, and safety metrics.
Jiang Y. et al. MedAgentBench: A Virtual EHR Environment to Benchmark Medical LLM Agents. NEJM AI 2025;2(9). DOI: 10.1056/AIdbp2500144
[2025/8] MedAgentBench encompasses 300 patient-specific clinically derived tasks from 10 categories written by human physicians, realistic profiles of 100 patients with over 700,000 data elements, a Fast Healthcare Interoperability Resources–compliant interactive environment, and an accompanying codebase. The environment uses standard application programming interfaces and communication infrastructure used in modern electronic health record (EHR) systems so that it can be easily migrated into live EHR systems.
~
In the Eyes of GPTs
(AJ in conversation with ChatGPT to understand how machines view human healthcare)
As LLMs like ChatGPT become more accurate in predicting diseases, physicians interacting with these models directly, without specialist support, may not achieve optimal results. It is clear that clinical AI should be treated as a new specialty—similar to pathology or radiology—where a complex technology provides critical insights for diagnosing and treating diseases. Do you consider Clinical GenAI a new specialty in medicine?
ChatGPT
At this point, most medical systems and professional societies do not officially recognize “clinical GenAI” as a specialty, but the analogy you raise is important.
👉 So: Not yet a specialty, but conceptually on a path that mirrors the emergence of radiology and pathology.
Good analysis. If Clinical GenAI evolves into a medical specialty, what new professional roles should be created in health care delivery?
ChatGPT
If Clinical GenAI evolves into a medical specialty, health systems would need to create new professional roles that parallel radiology/pathology (where specialized physicians and allied staff form an ecosystem). Likely roles include:
1. Clinical GenAI Specialist (Physician)
2. AI Diagnostic Interpreter / Consultant
3. Clinical GenAI Data Scientist
4. AI Quality & Safety Officer
5. GenAI Clinical Educator
6. AI Workflow Engineer
7. Patient-facing AI Navigator
8. GenAI Regulatory & Ethics Specialist
Analogy to Radiology/Pathology
Thank you for the comprehensive list of roles. Before Clinical GenAI is officially accepted as a specialty, it will likely be tested through different roles. A recent npj Digital Medicine perspective proposes an "algorithmic consultant" role. I have designed a GenAI-enabled navigator role for monitoring and improving care quality, which won a prize in a global challenge competition. Currently, our collaborative clinical study is evaluating GenAI in the role of providing AI diagnostic reports to support physicians in making more complete diagnoses and treatment plans. What do you think of these roles at the current stage?
ChatGPT
I think your framing is exactly right: before “Clinical GenAI” can mature into a recognized specialty, the field will advance through role experimentation. The three roles you mention—algorithmic consultant, GenAI-enabled navigator, and AI diagnostic report provider—each test different aspects of feasibility, safety, and value.
1. Algorithmic Consultant (as proposed in Nature)
2. GenAI-enabled Navigator (your design)
3. AI Diagnostic Report Provider (your clinical study)
Synthesis: Why these roles matter now
~
🚀 Healthcare GenAI Toolbox
Empowering physicians to accelerate the responsible use of GenAI in healthcare through three growth stages: 🚀 1. Quick Start – Access GenAI agents preclinically validated with synthetic data to enhance routine clinical decision-making. 🛠️ 2. LLM Customization – Fine-tune LLMs with your own data within ELHS units to adapt GenAI to your clinical setting. 🌍 3. Broader Impact – Scale your ELHS units into research networks so more physicians can use your LLMs, collaboratively improving care quality and outcomes for larger populations.
Creating a positive future by making innovation available to all and enabling advancements that can enhance our lives and spur economic growth. Open-source AI provides tools for more people to build awesome things and expand our horizons.
~
🔹 ELHS Institute Relevant Resources 🔹
▶️ ELHS Videos
👉 For Clinical AI technology support, contact us at support@elhsi.org 📩
~ the end ~
Democratizing GenAI and LHS to Advance Global Health Equity
info@elhsi.org
Palo Alto, California, USA