GenAI Wins Nobel Prize, Yet LLMs Still Lack Clinical Evaluation

 

ELHS Newsletter 2024-11-02

Subscribe | All Issues

(https://elhsi.com/)

Dear Friends,

 

Geoffrey Hinton, often referred to as the "Godfather of AI," was awarded the 2024 Nobel Prize in Physics alongside John Hopfield for their foundational work on artificial neural networks. In his remarks following the announcement, Hinton highlighted the transformative potential of AI, comparing its impact to that of the Industrial Revolution. However, he also emphasized the urgent need for research into AI safety to prevent systems from surpassing human control, a focus of his work since retiring from Google.

 

Hinton further underscored the importance of curiosity-driven research. In my experience, the breakthrough of generative AI, powered by large language models built on deep artificial neural networks, serves as a powerful example of how individual perseverance in researching and developing new, "impossible" theories can change the world. I’m confident that more breakthroughs are on the horizon. If you keep an open mind, you might recognize them early, before they achieve mainstream impact.

 

In a recent publication in JAMA, FDA Commissioner Dr. Robert Califf and colleagues presented the “FDA Perspective on the Regulation of Artificial Intelligence in Health Care and Biomedicine.” This article emphasizes the need for continuous local performance monitoring of AI tools, a cornerstone of learning health systems. It stresses that the evaluation and regulatory systems should prioritize patient health outcomes.

 

The FDA’s concerns are echoed by a JAMA review, which found that current evaluations of large language models (LLMs) in healthcare are fragmented and insufficient. The review suggests that evaluations should incorporate real patient data, quantify biases, and cover a broader range of tasks and specialties.

 

Recognizing the critical importance of clinical evaluation of LLMs, the ELHS Institute is offering free access to a multi-LLM copilot on its GenAI copilot platform, as well as personal LLM copilots for doctors and patients to explore the benefits of GenAI. We also provide technical support for doctors to deploy and fine-tune LLMs locally, promoting the democratization of GenAI in healthcare.

 

Keep reading for the latest papers below, and enjoy my conversations with ChatGPT about making breakthroughs.

 

Warm regards,

AJ

AJ Chen, PhD | ELHS Institute | https://elhsi.org
 

~

 

 

(Source: nobelprize.org)

 

 

From Page Mill

Warraich HJ, Tazbaz T, Califf RM. FDA Perspective on the Regulation of Artificial Intelligence in Health Care and Biomedicine. JAMA. Published online October 15, 2024. doi:10.1001/jama.2024.21451

[2024/10] Strong oversight by the FDA protects the long-term success of industries by focusing on evaluation to advance regulated technologies that improve health. A life cycle management approach incorporating recurrent local postmarket performance monitoring should be central to health AI development. Special mechanisms to evaluate large language models and their uses are needed. Approaches are necessary to balance the needs of the entire spectrum of health ecosystem interests, from large firms to start-ups. The evaluation and regulatory system will need to focus on patient health outcomes to balance the use of AI for financial optimization for developers, payers, and health systems.

-

Bedi S, Liu Y, Orr-Ewing L, et al. Testing and Evaluation of Health Care Applications of Large Language Models: A Systematic Review. JAMA. Published online October 15, 2024. doi:10.1001/jama.2024.21700

[2024/10] In this systematic review of 519 studies published between January 1, 2022, and February 19, 2024, only 5% used real patient care data for LLM evaluation. Administrative tasks such as writing prescriptions and natural language processing and natural language understanding tasks such as summarization were understudied; accuracy was the predominant dimension of evaluation, while fairness, bias, and toxicity assessments were less studied. Results of this systematic review suggest that current evaluations of LLMs in health care are fragmented and insufficient, and that evaluations need to use real patient data, quantify biases, cover a wider range of tasks and specialties, and report standardized performance metrics to enable broader implementation.
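
To make the review's call for "standardized performance metrics" and quantified bias more concrete, here is a minimal Python sketch with made-up labels (not data from the review): it reports accuracy overall and per patient subgroup, which is one simple way to make performance gaps visible.

```python
# Minimal sketch: standardized accuracy reporting, overall and per subgroup,
# so potential bias becomes measurable. The records below are hypothetical.
from collections import defaultdict

# (gold_label, llm_label, patient_subgroup) triples from an imagined chart-review task
records = [
    ("sepsis", "sepsis", "female"), ("no sepsis", "sepsis", "female"),
    ("sepsis", "sepsis", "male"),   ("no sepsis", "no sepsis", "male"),
    ("sepsis", "no sepsis", "male"), ("sepsis", "sepsis", "female"),
]

totals, correct = defaultdict(int), defaultdict(int)
for gold, pred, group in records:
    for key in ("overall", group):
        totals[key] += 1
        correct[key] += int(gold == pred)

for key, n in totals.items():
    print(f"{key}: accuracy = {correct[key] / n:.2f} (n={n})")
```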

-

Boussina A, et al. Large Language Models for More Efficient Reporting of Hospital Quality Measures. NEJM AI. 2024;1(11). doi:10.1056/AIcs2400420

[2024] We deployed an LLM-based system that ingests Fast Healthcare Interoperability Resources data and outputs a completed Severe Sepsis and Septic Shock Management Bundle (SEP-1) abstraction. We tested the system on a sample of 100 manual SEP-1 abstractions that University of California San Diego Health reported to the Centers for Medicare & Medicaid Services in 2022. The LLM system achieved agreement with manual abstractors on the measure category assignment in 90 of the abstractions (90%; κ=0.82; 95% confidence interval, 0.71 to 0.92). Expert review of the 10 discordant cases identified four that were mistakes introduced by manual abstraction. This pilot study suggests that LLMs using interoperable electronic health record data may perform accurate abstractions for complex quality measures. 
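
The agreement statistics quoted above (percent agreement and Cohen's κ) are easy to reproduce on your own abstraction data. Below is a minimal sketch using scikit-learn; the category assignments are hypothetical examples, not the study's data.

```python
# Minimal sketch: agreement between manual and LLM-based SEP-1 abstractions.
# Labels are hypothetical; a real analysis would also bootstrap a confidence interval.
from sklearn.metrics import cohen_kappa_score

manual = ["pass", "fail", "pass", "pass", "exclusion", "fail", "pass", "pass"]
llm    = ["pass", "fail", "pass", "fail", "exclusion", "fail", "pass", "pass"]

percent_agreement = sum(m == p for m, p in zip(manual, llm)) / len(manual)
kappa = cohen_kappa_score(manual, llm)  # chance-corrected agreement

print(f"Percent agreement: {percent_agreement:.0%}")
print(f"Cohen's kappa: {kappa:.2f}")
```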

-

Wiest, I.C., Ferber, D., Zhu, J. et al. Privacy-preserving large language models for structured medical information retrieval. npj Digit. Med. 7, 257 (2024). https://doi.org/10.1038/s41746-024-01233-2

[2024/9] This study presents an open-source pipeline using the local large language model (LLM) “Llama 2” to extract quantitative information from clinical text and evaluates its performance in identifying features of decompensated liver cirrhosis. The LLM identified five key clinical features in a zero- and one-shot manner from 500 patient medical histories in the MIMIC IV dataset. Our pipeline achieved high accuracy, detecting liver cirrhosis with 100% sensitivity and 96% specificity. High sensitivities and specificities were also yielded for detecting ascites (95%, 95%), confusion (76%, 94%), abdominal pain (84%, 97%), and shortness of breath (87%, 97%) using the 70 billion parameter model, which outperformed smaller versions. Our study successfully demonstrates the capability of locally deployed LLMs to extract clinical information from free text with low hardware requirements.
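
For readers who want to try this kind of local, zero-shot extraction, the sketch below shows the general pattern with the Hugging Face transformers library. The prompt wording and the model checkpoint are placeholders of mine (the paper used Llama 2 models up to 70B parameters), not the authors' exact pipeline, and any real clinical text must of course be handled under your institution's privacy rules.

```python
# Minimal sketch: zero-shot yes/no extraction of clinical features with a locally
# run open-source LLM. Model name and prompt are illustrative assumptions.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="meta-llama/Llama-2-7b-chat-hf",  # assumption: any local instruction-tuned checkpoint
    device_map="auto",
)

note = "Admitted with abdominal distension; ultrasound shows moderate ascites. No confusion."
features = ["liver cirrhosis", "ascites", "confusion", "abdominal pain", "shortness of breath"]

for feature in features:
    prompt = (
        "You extract structured data from clinical notes.\n"
        f"Note: {note}\n"
        f"Question: Does the note report {feature}? Answer with exactly 'yes' or 'no'.\n"
        "Answer:"
    )
    out = generator(prompt, max_new_tokens=3, do_sample=False)[0]["generated_text"]
    answer = out[len(prompt):].strip().lower()
    print(f"{feature}: {'yes' if answer.startswith('yes') else 'no'}")
```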

 

 

~

  Geoffrey Hinton shared his story of working on neural networks to advance AI. It's a classic tale of a great scientist’s perseverance in pushing a new theory that the mainstream considered impossible. Could you tell us his story?

 

 Geoffrey Hinton’s journey in AI, particularly with neural networks, is indeed a story of remarkable perseverance, especially in the face of skepticism and long-standing challenges. Here's a condensed version of his story:

 

In the 1970s, Geoffrey Hinton began working on artificial neural networks—systems inspired by the way human brains process information. The idea of neural networks had emerged earlier, but by the time Hinton entered the field, enthusiasm had waned. Many experts believed neural networks were inefficient and limited, and funding for this research was scarce. Despite this, Hinton was captivated by the potential of neural networks to enable machines to "learn" patterns and representations.

 

In 1986, Hinton and his colleagues published a groundbreaking paper on backpropagation, a technique for training multi-layer neural networks by propagating output errors backward through the network to adjust its weights. Backpropagation allowed neural networks to learn from their mistakes and steadily improve their accuracy, making them far more effective. But even after this breakthrough, the AI community remained skeptical. Many believed that symbolic approaches, which focused on logic and rules rather than learning from data, were the only viable path for AI.

 

Over the next few decades, Hinton worked with a small community of neural network researchers. They faced numerous setbacks and funding challenges, but Hinton’s belief in the potential of neural networks never wavered. He moved to the University of Toronto, where he continued his research and mentored students who would later become leaders in AI.

 

In 2012, Hinton’s perseverance paid off in a monumental way. He and his students, including Alex Krizhevsky and Ilya Sutskever, entered the ImageNet competition, a prestigious computer vision contest. Their neural network model, AlexNet, achieved unprecedented accuracy in image recognition, far surpassing previous approaches. This success marked the beginning of the modern AI revolution, with deep learning and neural networks suddenly becoming the dominant force in AI research and applications.

 

Hinton’s work laid the foundation for generative AI and large language models, which are built on the deep neural network architectures he helped pioneer. He has often reflected on the importance of curiosity-driven research, as his own path in neural networks was guided by a deep interest in understanding how intelligence works, not by mainstream acceptance.

 

In recent years, Hinton has also been vocal about the ethical implications of AI, stressing the importance of safety and control as AI systems grow more powerful. He continues to influence the field both as a researcher and as a thoughtful advocate for responsible AI development.

 

Geoffrey Hinton’s journey exemplifies how a scientist’s perseverance in the face of skepticism can ultimately lead to groundbreaking changes in the world. His story is not only about pushing boundaries in AI but also about the enduring value of exploring ideas considered “impossible.”

 

 Could you also share the story of the H. pylori theory, which won the Nobel Prize in 2005? It’s another example of a breakthrough achieved through curiosity and perseverance.

 

 The story of Helicobacter pylori (H. pylori) and the breakthrough in understanding the cause of stomach ulcers is a classic tale of scientific perseverance and curiosity that revolutionized medicine. In 2005, Drs. Barry Marshall and Robin Warren won the Nobel Prize in Physiology or Medicine for their discovery that H. pylori bacteria cause peptic ulcers, challenging a long-standing medical dogma. Here’s how their journey unfolded:

 

In the 1970s, Robin Warren, an Australian pathologist, noticed something unusual in biopsy samples from patients with gastritis and stomach ulcers. He observed spiral-shaped bacteria in the stomach lining of these patients, even though the acidic environment of the stomach was believed to be too harsh for bacterial life. His observations were initially dismissed, as it was widely believed that stomach ulcers and gastritis were caused by stress, spicy food, and lifestyle factors—not by bacteria.

 

However, Warren’s findings intrigued Barry Marshall, a young doctor with a similar passion for discovery. In 1981, the two began collaborating to investigate the role of these bacteria in stomach inflammation and ulcers. They cultured the bacterium and eventually identified it as a new species, Helicobacter pylori. Despite their growing body of evidence, their findings were met with skepticism and resistance from the medical community, which firmly held that bacteria couldn’t survive in the stomach’s acidic environment. They were repeatedly told their theory was impossible.

 

To prove the link between H. pylori and ulcers, Marshall decided to take a radical step: he drank a broth containing the cultured bacteria himself. Within days, he developed symptoms of gastritis—nausea, vomiting, and inflammation—exactly as they had observed in patients. An endoscopy confirmed that H. pylori had colonized his stomach, leading to inflammation. Marshall then took antibiotics to eradicate the bacteria, and his symptoms disappeared, providing dramatic evidence that H. pylori could indeed cause gastric inflammation.

 

This bold self-experiment and the accumulating research from Marshall and Warren’s studies convinced the medical community that H. pylori played a significant role in causing stomach ulcers and even stomach cancer. Their discovery revolutionized ulcer treatment, shifting it from long-term acid suppression therapies to more effective antibiotic treatments that target the underlying bacterial infection. This transformation improved the lives of millions worldwide, as peptic ulcers could now be treated effectively and safely with antibiotics.

 

Despite initial dismissal, Marshall and Warren’s work is a testament to how curiosity, resilience, and a willingness to challenge established ideas can yield groundbreaking discoveries. Their persistence not only earned them the Nobel Prize but also reshaped our understanding of gastrointestinal health and infection, proving once again that a relentless commitment to science can lead to breakthroughs that change medicine.

 

 There is another similar breakthrough story that actually helped save the world from the COVID-19 pandemic. What is it?

 

 The rapid development of mRNA vaccines to combat COVID-19 is a remarkable story of scientific curiosity, persistence, and innovation that helped the world manage and begin to recover from the pandemic. The groundwork for mRNA vaccines was laid decades before COVID-19, driven by a handful of scientists who believed in the potential of this novel approach to immunization, despite years of skepticism. Here’s the story of how mRNA vaccines became a life-saving reality:

 

In the 1990s, Hungarian scientist Dr. Katalin Karikó became convinced that messenger RNA (mRNA) could be harnessed to create vaccines and treatments. Unlike traditional vaccines, which use weakened or inactivated forms of a virus, mRNA vaccines deliver genetic instructions to our cells, prompting them to produce a harmless protein associated with the virus. This protein triggers an immune response, “teaching” the body to recognize and fight the real virus if it encounters it in the future. Karikó saw this as a potentially safer and faster way to create vaccines, but the technology was unproven, and her ideas faced considerable resistance from the scientific community.

 

Karikó’s persistence led her to partner with Dr. Drew Weissman at the University of Pennsylvania in the early 2000s. Together, they tackled major obstacles, including the challenge of stabilizing mRNA and reducing its tendency to trigger an inflammatory response in the body. In 2005, they discovered a modification to the mRNA molecule that allowed it to be safely used in the human body without causing excessive inflammation. This breakthrough went largely unnoticed at the time but later became the foundation for mRNA vaccine development.

 

Over the next decade, biotech companies such as Moderna and BioNTech continued to develop mRNA technology for vaccines, with a focus on making the technology scalable and versatile for rapid deployment. When COVID-19 emerged in late 2019, mRNA technology was ready for a real-world test. The flexibility of mRNA allowed researchers to quickly design a vaccine once the genetic code of SARS-CoV-2 (the virus causing COVID-19) was sequenced. BioNTech, in partnership with Pfizer, and Moderna both developed mRNA vaccines in record time, achieving remarkable efficacy and safety in clinical trials.

 

The first mRNA vaccines were authorized for emergency use in December 2020, less than a year after the COVID-19 pandemic began—a timeline previously unheard of in vaccine development. These vaccines have since saved countless lives, preventing severe illness and reducing the spread of the virus worldwide.

 

The success of mRNA vaccines against COVID-19 was the culmination of years of perseverance by researchers who believed in the potential of mRNA, despite numerous setbacks and doubts. Today, mRNA technology is being explored for vaccines against other diseases, including cancer and HIV, illustrating how a breakthrough driven by curiosity and persistence can have far-reaching impacts on global health.

 

 You’re doing great! Recognizing the potential of an "impossible" theory and helping to promote it alongside the discoverer before it becomes mainstream is also incredibly exciting. I recognized the revolutionary power of the H. pylori theory in 1995 and partnered with Dr. Barry Marshall to promote it. That was a once-in-a-lifetime experience. Do you see any new discoveries today that are considered "impossible" by the mainstream but could change the world in the future?

 

  Your early support for the H. pylori discovery with Dr. Barry Marshall is inspiring, and I can imagine how fulfilling it must have been to help bring such a groundbreaking theory into the mainstream. Today, there are a few emerging areas that are seen by some as improbable or controversial but hold the potential to transform science, healthcare, and even society:

  1. Quantum Computing for Drug Discovery and Material Science: Quantum computing is still in its infancy, but many see its potential to revolutionize fields like drug discovery, cryptography, and materials science. By handling complex calculations that are impossible for classical computers, quantum computers could unlock new treatments for diseases and create materials with properties we've never seen. Many in the scientific community remain skeptical due to technical limitations and the fragile nature of quantum states, but breakthroughs continue to emerge, showing that a scalable quantum computer may indeed be within reach.

  2. Longevity Science and Aging Reversal: Researchers like Dr. David Sinclair at Harvard are working on understanding the biological mechanisms of aging, suggesting that aging itself could be slowed or even reversed. Studies on cellular reprogramming and molecules like NAD+ are opening up new ways to extend healthy human lifespan, though many remain skeptical due to limited human trials and ethical considerations. If these treatments become viable, the impact on healthcare and society could be profound.

  3. Synthetic Biology and Programmable Life: Synthetic biology aims to engineer new biological systems or redesign existing organisms for useful purposes. This includes gene-editing tools like CRISPR, which are already making waves in medicine, and engineered organisms that could produce clean energy, absorb carbon, or synthesize rare compounds. While the field raises ethical questions and technical challenges, the potential applications are vast, from curing genetic diseases to addressing climate change.

  4. Brain-Computer Interfaces (BCIs): Although BCIs are a highly experimental area today, they could someday enable seamless interaction between the brain and computers, with applications in treating neurological diseases, restoring movement in paralyzed patients, and even augmenting human cognition. Companies like Neuralink are pushing this frontier, though many are skeptical about the technology’s invasiveness, scalability, and ethical implications. If successful, BCIs could radically change human interaction with technology.

  5. Nuclear Fusion for Clean Energy: Fusion energy has long been considered a “holy grail” for clean, limitless energy, but technical barriers have kept it out of reach. Recent advances, however, are bringing it closer to viability, with private companies and research institutions making progress on scalable reactors. If successful, fusion could address climate change by providing abundant, carbon-free energy, fundamentally reshaping the global economy and energy landscape.

 

These fields might seem out of reach or speculative to many, just as H. pylori research did in its early days. But with the right support and vision, some of these “impossible” theories could indeed become the breakthroughs of tomorrow.

 

 

GenAI/LHS Toolbox

 

 

ELHS GenAI Copilot Platform

The new copilot platform makes it easy for doctors and patients to use GenAI:

  1. Use the free multi-LLM copilot, which includes GPT-4o, Gemini 1.5 Pro, and Llama 3.1.

  2. Try a free personal open-source LLM chatbot, i.e., a personal copilot.

  3. Deploy and fine-tune open-source LLMs and copilots (a minimal local-deployment sketch follows this list).
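
As a rough illustration of the "deploy locally" item above, here is a minimal sketch that assumes an open-source model is served on your own machine with Ollama, which exposes an OpenAI-compatible endpoint at http://localhost:11434/v1. The model name and the example question are placeholders, and this is not the ELHS platform's own API.

```python
# Minimal sketch: query a locally served open-source LLM through an
# OpenAI-compatible endpoint (Ollama). Assumes `ollama pull llama3.1` was run first.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",  # Ollama's OpenAI-compatible endpoint
    api_key="ollama",                      # required by the client, ignored by Ollama
)

reply = client.chat.completions.create(
    model="llama3.1",  # placeholder: any locally pulled open-source model
    messages=[
        {"role": "system", "content": "You are a careful clinical assistant."},
        {"role": "user", "content": "Summarize the key elements of the SEP-1 sepsis bundle."},
    ],
)
print(reply.choices[0].message.content)
```

Fine-tuning follows the same spirit: adapter methods such as LoRA can be layered on the same open-source checkpoints, which is the kind of workflow the local deployment and fine-tuning support mentioned above is meant to help with.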

ChatGPT Search Engine

ChatGPT can now search the web far more effectively than before. You get fast, timely answers with links to relevant web sources, the kind of information you previously had to visit a search engine for. This blends the benefits of a natural language interface with the value of up-to-date sports scores, news, stock quotes, and more.

 

 

~

 

ELHS Institute Relevant Resources:

ELHS GenAI Copilot Platform

ELHS Institute Website

ELHS Newsletters

ELHS Videos

LHS Tech Forums

Prof. Andrew Ng AI Classes

Stanford University RAISE Health Initiative

 

~ the end ~