FOD#38: AI May Be Ushering Us Into a New Era of Health Monitoring

Last Monday, a nurse suggested we try a wireless monitor to track my vitals and those of my unborn baby.

“We call this device ‘Monica, the monitor’! It’s either a dream to work with or a total nightmare,” the nurse told me.

On that day, “Monica” (actually the Novii Wireless Patch System) performed exceptionally well. I was able to move freely, without the encumbrance of wires, while giving birth to my daughter. This technology harnesses passive signal acquisition to differentiate between fetal and maternal heart signals and to detect uterine contractions. Data is wirelessly transmitted to a monitoring unit for real-time observation. This system enhances accuracy and reduces false alarms, offering much-needed mobility during labor.

I thought that writing and theorizing about technologies is one thing, but experiencing their remarkable capabilities firsthand is quite another, especially when a device functions flawlessly. A question arose: What can foundation models add to wearables? Right after my experience with “Monica,” a recent paper from Google Research and MIT researchers caught my attention. Titled ‘Health-LLM: Large Language Models for Health Prediction via Wearable Sensor Data’ and authored by Kim et al., the paper explores how LLMs can interpret data from wearable sensors for health prediction. Intriguingly, these models are fed data not from medical records or doctor’s notes, but from wearable devices like Fitbits, which track daily steps, heart rate, sleep patterns, and more, akin to ‘Monica.’

The research evaluated eight cutting-edge LLMs (Med-Alpaca, PMC-Llama, Asclepius, ClinicalCamel, Flan-T5, Palmyra-Med, GPT-3.5, and GPT-4) across six publicly available health datasets. The authors ran experiments on thirteen health prediction tasks related to mental health, activity, metabolism, sleep, and cardiac assessments.

The team experimented with various methods, including zero-shot and few-shot prompting (giving the model no examples or only a handful), instruction fine-tuning (tailoring the model to specific tasks), and parameter-efficient fine-tuning to reduce computational cost.

Particularly fascinating is the effectiveness of context enhancement in prompts, which involves adding user context, health knowledge, and temporal information. This approach yielded up to a 23.8% improvement in performance.
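To make context enhancement a bit more concrete, here is a minimal sketch of how such a prompt might be assembled. It is not the paper's code: the `WearableDay` fields, the `build_prompt` helper, the sample readings, and the wording of the prompt are all hypothetical, chosen only to show the three kinds of added context (user context, health knowledge, temporal information) next to a plain zero-shot prompt.

```python
from dataclasses import dataclass


@dataclass
class WearableDay:
    """One day of wearable readings (field names are hypothetical)."""
    date: str            # ISO date, e.g. "2024-01-01"
    steps: int
    resting_hr: int      # beats per minute
    sleep_minutes: int


def build_prompt(question, days, user_context="", health_knowledge=""):
    """Assemble a zero-shot prompt, optionally enriched with extra context."""
    parts = []
    if user_context:
        parts.append(f"User context: {user_context}")
    if health_knowledge:
        parts.append(f"Health knowledge: {health_knowledge}")
    # Temporal information: list readings in date order so trends stay visible.
    parts.append("Readings (oldest first):")
    for day in sorted(days, key=lambda d: d.date):
        parts.append(
            f"- {day.date}: {day.steps} steps, resting HR {day.resting_hr} bpm, "
            f"sleep {day.sleep_minutes} min"
        )
    parts.append(f"Question: {question}")
    return "\n".join(parts)


# Made-up readings, purely for illustration.
days = [
    WearableDay("2024-01-02", 4200, 66, 380),
    WearableDay("2024-01-01", 9100, 61, 455),
]
question = "Rate the user's sleep quality from 1 to 5."

plain = build_prompt(question, days)
enhanced = build_prompt(
    question,
    days,
    user_context="34-year-old who works night shifts",
    health_knowledge="Adults are generally advised to sleep at least 7 hours.",
)
print(enhanced)
```

The only difference between `plain` and `enhanced` is the extra context lines at the top; either string could then be sent to whichever model is being evaluated.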

Healthcare is an exceedingly sensitive field, but the potential benefits of generative AI for humans are immense, especially with the power of foundation models. Health-LLM explores the future where wearables are not just passive trackers but proactive health guardians.

Another recent groundbreaking paper in healthcare comes from Stanford and Stability AI researchers, titled CheXagent: Towards a Foundation Model for Chest X-Ray Interpretation. The most fascinating aspect of this paper is the development of CheXagent, an advanced foundation model specifically designed for interpreting chest X-rays. This model uniquely combines a clinical LLM, a specialized vision encoder, and a vision-language bridging network, demonstrating exceptional performance in interpreting complex medical images. Its ability to outperform existing models in accuracy and fairness evaluations marks a significant advancement in medical imaging AI technology. It can save so much time! And possibly lives.

(The newborn girl — Reason Leeloo Joy — sends her regards. We took a week off last week but are now back on track, exploring the AI world to understand how she and her four brothers will live in it and navigate it.)

News from The Usual Suspects ©

Sam Altman and OpenAI

OpenAI released two new embedding models (text-embedding-3-small and text-embedding-3-large) and updated versions of GPT-4 Turbo, GPT-3.5 Turbo, and a text moderation model. Embedding models represent content as numerical vectors, which supports machine learning tasks like clustering and retrieval. The new ones are also more efficient and cost-effective.
Meanwhile, Sam Altman is in discussions with potential backers, including wealthy Middle Eastern investors and chip fabricators such as TSMC, to launch a new chip venture. This move aims to meet OpenAI’s growing semiconductor needs and reduce dependence on Nvidia. The venture’s structure is unclear: it might be a separate entity or a subsidiary of OpenAI.
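For readers who have not worked with embeddings, here is a small sketch of why vectors help with retrieval. It does not call OpenAI's API: the three-dimensional vectors below are made-up stand-ins for real embeddings (which have hundreds or thousands of dimensions), and the `retrieve` helper is hypothetical. The idea is simply that texts with similar meaning get vectors pointing in similar directions, so ranking by cosine similarity surfaces the closest matches.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieve(query_vector, docs, top_k=2):
    """Return the top_k document names most similar to the query vector."""
    ranked = sorted(
        docs,
        key=lambda name: cosine_similarity(query_vector, docs[name]),
        reverse=True,
    )
    return ranked[:top_k]


# Toy vectors standing in for real embeddings (values are invented).
documents = {
    "fetal heart rate monitoring": [0.9, 0.1, 0.0],
    "data center power demand": [0.0, 0.2, 0.9],
    "wearable sleep tracking": [0.7, 0.6, 0.1],
}
query = [0.8, 0.3, 0.0]  # pretend this encodes a health-monitoring question

print(retrieve(query, documents))
```

With these toy values, the two health-related entries rank above the data center one.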

Blackstone steps in

Another big player is investing heavily in the AI revolution. Blackstone is building a $25 billion network of power-intensive data centers across America. Following its $10 billion acquisition of QTS, a major data center operator, Blackstone is developing massive facilities to meet tech giants' growing digital and AI demands. These projects, consuming electricity equivalent to millions of homes, are reshaping communities and sparking debates over resource use and local benefits. Despite challenges, including strained power supplies and public backlash, Blackstone views this venture as potentially one of its best investments, illustrating the increasing significance and complexity of data infrastructure in the AI era.

Elon Musk, xAI and Tesla

Elon Musk has been making headlines recently, seeking a $6 billion investment for xAI from global investors in the Middle East, Hong Kong, Japan, and Korea. If successful, xAI’s valuation could reach $20 billion, surpassing Anthropic’s $18.4 billion but falling behind OpenAI’s $100 billion. However, Musk’s recent threat to move AI projects out of Tesla unless he secures 25% control has stirred dissatisfaction among current investors and might affect talks with potential new backers. Meanwhile, Tesla is planning a $500 million investment in a “Dojo” supercomputer at its Buffalo, New York facility, underscoring the company’s commitment to advancing AI technology.

Google and Hugging Face

The recently announced partnership between Hugging Face and Google Cloud aims to make AI more accessible. It focuses on shared initiatives in open science and open source, leveraging both Hugging Face’s open models and Google Cloud’s technology. The goal is to facilitate the development of AI technologies for a wider range of users and applications.
Meanwhile, Google Bard has climbed to second place on the Chatbot Arena Leaderboard hosted on Hugging Face, overtaking GPT-4 and now trailing only GPT-4 Turbo in the community-driven LLM rankings.

The freshest research papers, categorized for your convenience

Model Compression and Efficiency

SLICEGPT: A technique for efficiently compressing large language models by removing parameters while retaining performance →read the paper
DeepSeek-Coder: Focuses on developing high-performing, multi-language code generation models with an extensive parameter range →read the paper
SPACTOR-T5: Introduces an efficient pre-training method for T5 models, reducing computational requirements →read the paper
MEDUSA: A framework for accelerating large language model inference using multiple decoding heads →read the paper

LLM Capabilities and Evaluation

From GPT-4 to Gemini and Beyond: Evaluates MLLMs for generalizability, trustworthiness, and causality across multiple modalities →read the paper
MaLA-500: Develops a multilingual LLM supporting over 500 languages, enhancing language model accessibility →read the paper
Spotting LLMs with Binoculars: Introduces a method for zero-shot detection of text generated by large language models →read the paper

Multimodal and Specialized Models

Rethinking Patch Dependence for Masked Autoencoders: Examines the decoding mechanism in masked autoencoders for improved image processing →read the paper
MM-LLMs: A comprehensive survey on the advancements and capabilities of multimodal large language models →read the paper
CMMMU: Establishes a benchmark for evaluating large multimodal models in the Chinese context →read the paper
SpatialVLM: Enhances vision-language models with advanced spatial reasoning capabilities →read the paper

AI Training and Data Generation Techniques

Learning Universal Predictors: Explores training neural networks for universal prediction strategies, approaching Solomonoff Induction →read the paper
Unitxt: A Python library for flexible and reproducible data preparation in generative NLP →read the paper
GENIE: A method for generating high-quality, content-grounded synthetic data using large language models →read the paper
MambaByte: Investigates a token-free language model that learns directly from raw bytes →read the paper
Meta-Prompting: Enhances language models with a task-agnostic scaffolding technique for better performance →read the paper
WARM: An approach for aligning large language models with human preferences in reinforcement learning →read the paper

Language Models and Role-Playing

Small Language Model Meets with Reinforced Vision Vocabulary: Presents a compact model integrating enhanced vision vocabulary for efficient visual information encoding →read the paper
Large Language Models are Superpositions of All Characters: Develops a method for role-playing dialogues using large language models →read the paper
Orion-14B: Introduces a collection of multilingual large language models for conversational applications →read the paper

In other newsletters

Great dive into Apple’s “Update on apps distributed in the European Union” from Hardcore Software
Fun read from Interconnects about model merging: “When what seems like pure LLM black magic is supported by the literature”
Is This The Year Apple Awakens in AI? Madrona investors’ opinion.
Andrew Ng describes his experience at the World Economic Forum in Davos. It’s about AI, but in Ng’s signature humanistic style.
