
Real-Time Patient Vital Monitoring and Anomaly Detection Using Streaming Analytics

  • Writer: Arturo Arriaga
  • Jan 26, 2025


Programming In Scala for Big Data Systems


The goal of this project is to design and implement a streaming analytics pipeline that monitors patients’ vital signs in real time. The pipeline detects abnormal heart rate, blood pressure, and oxygen saturation readings and triggers alerts for potential medical intervention. By using Apache Beam, I aim to show how big data technology can improve patient monitoring and healthcare outcomes.


In healthcare, continuous monitoring of a patient’s vital signs (for example, heart rate, blood pressure, and oxygen saturation) is critical for early detection of problems that could signal deteriorating health. Analyzing streaming data from multiple patients in real time is challenging and requires systems that are both efficient and scalable. The purpose of this project is to develop a streaming application that analyzes real-time patient vital sign data and detects anomalies using sliding windows. Alerts are generated when a patient’s vital signs deviate from predefined thresholds or exhibit abnormal trends.



Data Visualization using Matplotlib


Data Set

The data for this project is simulated patient vital sign measurements generated over time to mimic real-life scenarios for both healthy and unhealthy patients, with patterns designed to reflect anomalies such as elevated or decreasing vitals and gradual recovery trends.


We track the following data points:

  • patientId

  • timestamp

  • heartRate

  • bloodPressureSystolic

  • bloodPressureDiastolic

  • oxygenSaturation
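As a minimal sketch, the fields above could be modeled as a Scala case class. The field names come from the list; the types, the epoch-millisecond timestamp, and the comma-separated input format handled by the hypothetical `VitalRecordCsv.fromCsv` helper are assumptions made for illustration.

```scala
import scala.util.Try

// Field names follow the tracked data points; the Scala types are assumptions.
final case class VitalRecord(
  patientId: String,
  timestamp: Long,             // assumed: epoch milliseconds
  heartRate: Int,              // beats per minute
  bloodPressureSystolic: Int,  // mmHg
  bloodPressureDiastolic: Int, // mmHg
  oxygenSaturation: Double     // percent
)

object VitalRecordCsv {
  // Parses one comma-separated line (a hypothetical wire format) into a record.
  // Returns None when the field count is wrong or a numeric field is malformed.
  def fromCsv(line: String): Option[VitalRecord] =
    line.split(",", -1).map(_.trim) match {
      case Array(id, ts, hr, sys, dia, spo2) if id.nonEmpty =>
        Try(VitalRecord(id, ts.toLong, hr.toInt, sys.toInt, dia.toInt, spo2.toDouble)).toOption
      case _ => None
    }
}
```

Returning an `Option` keeps malformed lines out of the pipeline without throwing, so a later stage can count or log them.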



Samples from our dataset

Processing Pipeline


System Design

Collection: We gather patient vital signs from connected health monitoring devices. In this project the data source is mock data that simulates health sensors transmitting patient information (heart rate, blood pressure, and oxygen saturation). The idea is that these devices send patient data to a Kafka topic. Each record holds a patient’s current vital signs and is timestamped to simulate real-time measurement.
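The mock sensor feed could look roughly like the sketch below. It only generates records in memory; publishing them to the Kafka topic is left out. The `MockVitals` object, the baseline values, and the jitter ranges are illustrative assumptions, not the project's actual generator and not clinical reference data.

```scala
import scala.util.Random

// Same record shape as the tracked data points; the Scala types are assumptions.
final case class VitalRecord(patientId: String, timestamp: Long, heartRate: Int,
  bloodPressureSystolic: Int, bloodPressureDiastolic: Int, oxygenSaturation: Double)

object MockVitals {
  // Generates `count` readings for one patient, spaced `stepMillis` apart.
  // A fixed seed makes the simulated stream reproducible.
  def generate(patientId: String, startMillis: Long, count: Int,
               stepMillis: Long = 1000L, seed: Long = 42L): Seq[VitalRecord] = {
    val rng = new Random(seed)
    (0 until count).map { i =>
      VitalRecord(
        patientId,
        startMillis + i * stepMillis,
        heartRate = 70 + rng.nextInt(11),              // 70 to 80 bpm
        bloodPressureSystolic = 115 + rng.nextInt(11), // 115 to 125 mmHg
        bloodPressureDiastolic = 77 + rng.nextInt(7),  // 77 to 83 mmHg
        oxygenSaturation = 97.0 + rng.nextDouble() * 2.0 // 97.0 up to (not including) 99.0
      )
    }
  }
}
```

An unhealthy patient could be simulated the same way by shifting the baselines or adding a drift term per step.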

 

Preparation: We then need to prepare our data. This involves cleaning, validating, and transforming the raw data into a format suitable for analytics. We also ensure data integrity (e.g., checking for null values or invalid ranges like heart rate > 250 bpm).
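One way to sketch the integrity checks is a function that returns the list of failed checks for a record. Only the 250 bpm heart-rate ceiling comes from the text above; every other bound, and the `VitalsValidation` object itself, is an assumption chosen for illustration.

```scala
// Same record shape as the tracked data points; the Scala types are assumptions.
final case class VitalRecord(patientId: String, timestamp: Long, heartRate: Int,
  bloodPressureSystolic: Int, bloodPressureDiastolic: Int, oxygenSaturation: Double)

object VitalsValidation {
  // Returns a message for every failed check; an empty list means the record is usable.
  // Only the 250 bpm ceiling is taken from the article; the other bounds are assumed.
  def problems(r: VitalRecord): List[String] =
    List(
      (r.patientId == null || r.patientId.trim.isEmpty, "missing patientId"),
      (r.timestamp <= 0L, "non-positive timestamp"),
      (r.heartRate <= 0 || r.heartRate > 250, "heartRate out of range"),
      (r.bloodPressureSystolic <= 0 || r.bloodPressureSystolic > 300, "systolic out of range"),
      (r.bloodPressureDiastolic <= 0 || r.bloodPressureDiastolic >= r.bloodPressureSystolic,
        "diastolic out of range"),
      (r.oxygenSaturation.isNaN || r.oxygenSaturation < 0.0 || r.oxygenSaturation > 100.0,
        "oxygenSaturation out of range")
    ).collect { case (true, message) => message }

  def isValid(r: VitalRecord): Boolean = problems(r).isEmpty

  // Splits a batch into (valid, rejected) so rejected records can be logged or counted.
  def split(records: Seq[VitalRecord]): (Seq[VitalRecord], Seq[VitalRecord]) =
    records.partition(isValid)
}
```

Collecting every failed check, rather than stopping at the first, makes it easier to see which sensor field is misbehaving.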

 

Computation: We apply analytics to detect critical health patterns in real time. For this we will use Scala processing with Spark or direct computation logic. We’ll compute aggregate statistics such as average heart rate or oxygen saturation per patient over a time window and detect anomalies (e.g., oxygen saturation < 90%, extreme fluctuations in blood pressure). We can then aggregate results (e.g., maximum heart rate by patient) for further insights.
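The windowed computation can be sketched with plain Scala collections standing in for the streaming engine's windowing. This is the "direct computation logic" variant under stated assumptions: window starts are aligned to multiples of the slide, the 90% oxygen saturation threshold comes from the text, and the 40 mmHg systolic swing limit, the `VitalsAnalytics` object, and the `WindowStats` and `Alert` types are hypothetical.

```scala
// Same record shape as the tracked data points; the Scala types are assumptions.
final case class VitalRecord(patientId: String, timestamp: Long, heartRate: Int,
  bloodPressureSystolic: Int, bloodPressureDiastolic: Int, oxygenSaturation: Double)

// Per-patient statistics for one window [windowStart, windowStart + size).
final case class WindowStats(patientId: String, windowStart: Long, avgHeartRate: Double,
  maxHeartRate: Int, avgOxygenSaturation: Double, systolicRange: Int)

final case class Alert(patientId: String, windowStart: Long, reason: String)

object VitalsAnalytics {
  // Start times of every sliding window that contains timestamp t.
  // Window starts are aligned to multiples of slideMillis.
  def windowStarts(t: Long, sizeMillis: Long, slideMillis: Long): List[Long] = {
    val latest = java.lang.Math.floorDiv(t, slideMillis) * slideMillis
    Iterator.iterate(latest)(_ - slideMillis).takeWhile(_ > t - sizeMillis).toList
  }

  // Assigns each reading to its windows, groups by (patient, window), and aggregates.
  def aggregate(records: Seq[VitalRecord], sizeMillis: Long, slideMillis: Long): Seq[WindowStats] =
    records
      .flatMap(r => windowStarts(r.timestamp, sizeMillis, slideMillis).map(s => ((r.patientId, s), r)))
      .groupBy(_._1)
      .toSeq
      .map { case ((patientId, start), keyed) =>
        val rs = keyed.map(_._2)
        val systolic = rs.map(_.bloodPressureSystolic)
        WindowStats(
          patientId, start,
          avgHeartRate = rs.map(_.heartRate).sum.toDouble / rs.size,
          maxHeartRate = rs.map(_.heartRate).max,
          avgOxygenSaturation = rs.map(_.oxygenSaturation).sum / rs.size,
          systolicRange = systolic.max - systolic.min
        )
      }
      .sortBy(w => (w.patientId, w.windowStart))

  // Static-threshold checks on each window; the systolic swing limit is an assumed value.
  def detect(stats: Seq[WindowStats], minSpo2: Double = 90.0,
             maxSystolicSwing: Int = 40): Seq[Alert] =
    stats.flatMap { w =>
      List(
        (w.avgOxygenSaturation < minSpo2, "average oxygen saturation below threshold"),
        (w.systolicRange > maxSystolicSwing, "large systolic blood pressure swing")
      ).collect { case (true, reason) => Alert(w.patientId, w.windowStart, reason) }
    }
}
```

Because the windows overlap, a single low reading contributes to several windows, so a sustained drop produces alerts in consecutive windows rather than one isolated alert. Calling `VitalsAnalytics.detect(VitalsAnalytics.aggregate(records, 2000L, 1000L))` would evaluate two-second windows that advance every second.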



 

Presentation: We visualize the computed results and notify healthcare providers of critical health events. For this we’ll use Matplotlib to graph data trends, such as heart rate or blood pressure over time. We’ll store aggregated data summaries in files or databases for long-term analysis.
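The file-storage half of this step could be sketched as a small CSV writer on the Scala side, producing a file that a plotting script can read later. The `PatientSummary` type, the column layout, and the `SummaryWriter` object are assumptions for illustration; patient IDs containing commas are not escaped here.

```scala
import java.io.{File, PrintWriter}

// Hypothetical per-patient summary row; the columns are assumptions.
final case class PatientSummary(patientId: String, avgHeartRate: Double,
  maxHeartRate: Int, avgOxygenSaturation: Double)

object SummaryWriter {
  val Header: String = "patientId,avgHeartRate,maxHeartRate,avgOxygenSaturation"

  // Formats one row; toString on Double is locale-independent (e.g. 75.0).
  def toCsvLine(s: PatientSummary): String =
    Seq(s.patientId, s.avgHeartRate.toString, s.maxHeartRate.toString,
      s.avgOxygenSaturation.toString).mkString(",")

  // Writes the header followed by one line per summary, overwriting any existing file.
  def write(path: String, rows: Seq[PatientSummary]): Unit = {
    val out = new PrintWriter(new File(path), "UTF-8")
    try {
      out.println(Header)
      rows.foreach(r => out.println(toCsvLine(r)))
    } finally out.close()
  }
}
```

A header row with stable column names keeps the plotting code independent of the Scala types.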


Pipeline Overview



Conclusions

The objective of this project was to design and implement a scalable streaming analytics pipeline for monitoring real-time vital signs (heart rate, blood pressure, and oxygen saturation) of patients. The pipeline finds anomalies in these vital signs, triggers alerts for potential medical intervention, and computes aggregated metrics such as average heart rate. This project shows how big data technologies can enhance patient monitoring and healthcare outcomes.


Potential Options for Scaling to a Distributed System


Advanced Anomaly Detection: Implement machine learning models for detecting anomalies based on historical trends and multivariate analysis rather than static thresholds.
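As a stepping stone between static thresholds and a trained model, a rolling z-score compares each reading with the patient's own recent history. The sketch below is a simple statistical baseline, not a machine learning model; the `ZScoreBaseline` object, the history length, and the cutoff `k` are assumptions.

```scala
object ZScoreBaseline {
  // Returns the indices of values that deviate from the mean of the preceding
  // `history` values by more than `k` sample standard deviations.
  // A perfectly constant history (standard deviation 0) never flags anything.
  def outliers(values: Seq[Double], history: Int, k: Double = 3.0): Seq[Int] = {
    require(history >= 2, "history must contain at least two readings")
    values.indices.drop(history).filter { i =>
      val past = values.slice(i - history, i)
      val mean = past.sum / history
      val variance = past.map(v => (v - mean) * (v - mean)).sum / (history - 1)
      val sd = math.sqrt(variance)
      sd > 0 && math.abs(values(i) - mean) > k * sd
    }
  }
}
```

For a heart-rate series such as 70, 72, 71, 73, 72, 110, 72 with a history of five readings, only the jump to 110 (index 5) is flagged; the following reading is not, because the spike has inflated the spread of its history window.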

 

Interactive Dashboards: Build real-time dashboards with tools like Apache Superset or Grafana for live monitoring of vital signs and anomalies.

 

Integration with Real Data: Extend the pipeline to process real-world patient data from IoT devices while maintaining compliance with data privacy regulations like HIPAA.


 
 
 



