New Study Reveals AI Model’s Ability to Utilize Apple Watch Behavior Data for Enhanced Health Predictions
Jul 11, 2025

AI-summarised brief · reviewed before publication

A recent scientific study has shed light on the potential of using behavioral information derived from Apple Watch sensor data to make health predictions. The research, which drew on data from the Apple Heart and Movement Study, suggests that incorporating behavioral metrics such as physical activity, cardiovascular fitness, and mobility can lead to more accurate health predictions than relying on raw sensor data alone.
For years, Apple has collaborated with medical researchers on various projects, including studies on menstrual cycles, pickleball, hearing loss, and sleep tracking. The iPhone maker has also analyzed the training and cardio exercises of marathon runners as part of the multi-year Heart and Movement Study, which uses the Apple Watch. That study is part of a broader initiative aimed at promoting healthy movement and improving cardiovascular health.
The new study, titled "Beyond Sensor Data: Foundation Models of Behavioral Data from Wearables Improve Health Predictions," highlights the effectiveness of behavioral data in detecting both static and transient health states. A static health state covers information such as whether someone is a smoker, has hypertension, or is on beta blockers. Pregnancy, on the other hand, is a transient health state. Raw sensor data is typically collected on a time scale of seconds, whereas transient health states can last for months.
The researchers created a wearable health behavior foundation model (WBM) trained on behavioral data from wearables, using data from 162,000 participants and more than 15 billion hourly measurements from the Apple Heart and Movement Study. Rather than working on raw sensor streams directly, the WBM focuses on 27 interpretable HealthKit quantities that are calculated from lower-level sensors using validated methods. These metrics include exercise time, standing time, blood oxygen, heart rate measurements, and more.
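To make the gap between second-level sensor samples and hourly behavioral measurements concrete, here is a minimal sketch of rolling timestamped readings up into hourly values. It is only an illustration: the function name `hourly_means`, the sample readings, and the use of a plain average are all assumptions, and it is not one of the validated methods behind the HealthKit quantities the study describes.

```python
from collections import defaultdict
from statistics import mean


def hourly_means(samples):
    """Average (timestamp_seconds, value) samples within each hour.

    Returns a dict mapping the hour index (timestamp // 3600) to the
    mean of the values recorded in that hour.
    """
    buckets = defaultdict(list)
    for t_seconds, value in samples:
        buckets[t_seconds // 3600].append(value)
    return {hour: mean(values) for hour, values in sorted(buckets.items())}


# Hypothetical heart-rate readings: (seconds since start, beats per minute).
readings = [(0, 60.0), (1800, 70.0), (3600, 80.0), (5400, 90.0), (7200, 100.0)]
hourly = hourly_means(readings)
print(hourly)  # {0: 65.0, 1: 85.0, 2: 100.0}
```

Summaries like these trade fine-grained physiological detail for a view that spans days or months, which is the time scale on which the article says transient health states unfold.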
The study suggests that the WBM outperforms traditional detection methods that rely on data streams from sensors. The model excels in behavior-driven tasks like sleep prediction and improves further when combined with representations of raw sensor data. The research paper notes that the WBM was tested on 57 health-related tasks and outperformed a traditional PPG (photoplethysmography) model in most of them.
Specifically, the WBM outperforms PPG in predicting static health states such as beta blocker use, as it more reliably detects heart rate reductions during the day. It also outperformed PPG in predicting transient health states such as pregnancy, although it did not predict diabetes better than PPG. The study notes that low-level sensor data outperforms behavioral data in tasks where physiological information is sufficient.
The researchers also explored a hybrid PPG+WBM model, which significantly improved predictive performance. The WBM captures behavior patterns derived from raw sensor data, which can carry significant information about an individual's health. PPG, on the other hand, can recognize immediate physiological changes. The two complement each other, but only where physiological information alone isn't enough and behavior is a meaningful predictor.
Ultimately, the study concludes that across most tasks, combining the embeddings of the WBM and the PPG model produces the most accurate models. The combination achieves the best age prediction performance of all models considered, clearly outperforming even the best individual models.
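As a rough illustration of what combining embeddings from two models can look like, the sketch below joins a behavioral embedding and a PPG embedding into one feature vector and classifies it. Everything here is an assumption made for illustration: the article does not say how the embeddings are combined or which classifier is used, so simple concatenation and a nearest-centroid rule stand in, and the vectors and labels are made up.

```python
def combine(wbm_embedding, ppg_embedding):
    """Concatenate two embedding vectors into one feature vector."""
    return list(wbm_embedding) + list(ppg_embedding)


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict(vector, centroids):
    """Return the label whose centroid is closest to the vector."""
    return min(centroids, key=lambda lbl: squared_distance(vector, centroids[lbl]))


# Made-up training data: 2-D behavioral embeddings joined with 1-D PPG embeddings.
train = {
    "positive": [combine([1.0, 0.0], [0.9]), combine([0.8, 0.2], [1.1])],
    "negative": [combine([0.0, 1.0], [0.1]), combine([0.2, 0.8], [-0.1])],
}
centroids = {lbl: centroid(vectors) for lbl, vectors in train.items()}

query = combine([0.9, 0.1], [1.0])
label = predict(query, centroids)
print(label)  # positive
```

The point of the sketch is only that a downstream classifier sees both sources of information at once; which source matters more depends on the task, as the study's results suggest.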