Photoplethysmography

Publish Date Title Authors PDF Code Abstract
2023-07-24 Remote Bio-Sensing: Open Source Benchmark Framework for Fair Evaluation of rPPG Dae Yeol Kim et.al. 2307.12644v1 link Remote Photoplethysmography (rPPG) is a technology that utilizes the light absorption properties of hemoglobin, captured via camera, to analyze and measure blood volume pulse (BVP). By analyzing the measured BVP, various physiological signals such as heart rate, stress levels, and blood pressure can be derived, enabling applications such as the early prediction of cardiovascular diseases. rPPG is a rapidly evolving field as it allows the measurement of vital signals using camera-equipped devices without the need for additional devices such as blood pressure monitors or pulse oximeters, and without the assistance of medical experts. Despite extensive efforts and advances in this field, serious challenges remain, including issues related to skin color, camera characteristics, ambient lighting, and other sources of noise, which degrade performance accuracy. We argue that fair and evaluable benchmarking is urgently required to overcome these challenges and make any meaningful progress from both academic and commercial perspectives. In most existing work, models are trained, tested, and validated only on limited datasets. Worse still, some studies lack available code or reproducibility, making it difficult to fairly evaluate and compare performance. Therefore, the purpose of this study is to provide a benchmarking framework to evaluate various rPPG techniques across a wide range of datasets for fair evaluation and comparison, including both conventional non-deep neural network (non-DNN) and deep neural network (DNN) methods. GitHub URL: https://github.com/remotebiosensing/rppg.
2023-07-18 Robust peak detection for photoplethysmography signal analysis Márton Á. Goda et.al. 2307.10398v1 null Efficient and accurate evaluation of long-term photoplethysmography (PPG) recordings is essential for both clinical assessments and consumer products. In 2021, the top open-source peak detectors were benchmarked on the Multi-Ethnic Study of Atherosclerosis (MESA) database consisting of polysomnography (PSG) recordings and continuous sleep PPG data, where the Automatic Beat Detector (Aboy) had the best accuracy. This work presents Aboy++, an improved version of the original Aboy beat detector. The algorithm was evaluated on 100 adult PPG recordings from the MESA database, which contains more than 4.25 million reference beats. Aboy++ achieved an F1-score of 85.5%, compared to 80.99% for the original Aboy peak detector. On average, Aboy++ processed a 1-hour-long recording in less than 2 seconds, compared to 115 seconds (i.e., over 57 times longer) for the open-source implementation of the original Aboy peak detector. This study demonstrated the importance of developing robust algorithms like Aboy++ to improve PPG data analysis and clinical outcomes. Overall, Aboy++ is a reliable tool for evaluating long-term wearable PPG measurements in clinical and consumer contexts.
2023-07-17 Quality Assessment of Photoplethysmography Signals For Cardiovascular Biomarkers Monitoring Using Wearable Devices Felipe M. Dias et.al. 2307.08766v1 null Photoplethysmography (PPG) is a non-invasive technology that measures changes in blood volume in the microvascular bed of tissue. It is commonly used in medical devices such as pulse oximeters and wrist-worn heart rate monitors to monitor cardiovascular hemodynamics. PPG allows for the assessment of parameters (e.g., heart rate, pulse waveform, and peripheral perfusion) that can indicate conditions such as vasoconstriction or vasodilation, and provides information about microvascular blood flow, making it a valuable tool for monitoring cardiovascular health. However, PPG is subject to a number of sources of variation that can impact its accuracy and reliability, especially when using a wearable device for continuous monitoring, such as motion artifacts, skin pigmentation, and vasomotion. In this study, we extracted 27 statistical features from the PPG signal for training machine-learning models based on gradient boosting (XGBoost and CatBoost) and Random Forest (RF) algorithms to assess the quality of PPG signals that were labeled as good or poor quality. We used the PPG time series from a publicly available dataset and evaluated the algorithm's performance using Sensitivity (Se), Positive Predictive Value (PPV), and F1-score (F1) metrics. Our model achieved Se, PPV, and F1-score of 94.4, 95.6, and 95.0 for XGBoost, 94.7, 95.9, and 95.3 for CatBoost, and 93.7, 91.3, and 92.5 for RF, respectively. Our findings are comparable to the state of the art reported in the literature but use a much simpler model, indicating that ML models are promising for developing remote, non-invasive, and continuous measurement devices.
2023-07-14 An Embedded Auto-Calibrated Offset Current Compensation Technique for PPG/fNIRS System Sadan Saquib Khan et.al. 2307.07414v1 null Usually, the photodiode current proportional to the oxygenated blood in photoplethysmography (PPG) and functional near-infrared spectroscopy (fNIRS) based recording systems is small compared to the offset current. The offset current is the combination of the dark current of the photodiode, the current due to ambient light, and the current due to the light reflected from fat and skull. The relatively large value of the offset current limits the amplification of the signal current and affects the overall performance of PPG/fNIRS recording systems. In this paper, we present a mixed-signal auto-calibrated offset current compensation technique for PPG and fNIRS recording systems. The system auto-calibrates the offset current, compensates using a dual discrete loop technique, and amplifies the signal current. Thanks to the amplification, the system provides better sensitivity. A prototype of the system is built and tested for PPG signal recording. The prototype is developed for a 3.3 V single supply. The results show that the proposed system is able to effectively compensate for the offset current.
2023-07-12 Personalized Anomaly Detection in PPG Data using Representation Learning and Biometric Identification Ramin Ghorbani et.al. 2307.06380v1 null Photoplethysmography (PPG) signals, typically acquired from wearable devices, hold significant potential for continuous fitness-health monitoring. In particular, heart conditions that manifest in rare and subtle deviating heart patterns may be interesting. However, robust and reliable anomaly detection within these data remains a challenge due to the scarcity of labeled data and high inter-subject variability. This paper introduces a two-stage framework leveraging representation learning and personalization to improve anomaly detection performance in PPG data. The proposed framework first employs representation learning to transform the original PPG signals into a more discriminative and compact representation. We then apply three different unsupervised anomaly detection methods for movement detection and biometric identification. We validate our approach using two different datasets in both generalized and personalized scenarios. The results show that representation learning significantly improves anomaly detection performance while reducing the high inter-subject variability. Personalized models further enhance anomaly detection performance, underscoring the role of personalization in PPG-based fitness-health monitoring systems. The results from biometric identification show that it's easier to distinguish a new user from one intended authorized user than from a group of users. Overall, this study provides evidence of the effectiveness of representation learning and personalization for anomaly detection in PPG data.
2023-07-07 A Self-Supervised Algorithm for Denoising Photoplethysmography Signals for Heart Rate Estimation from Wearables Pranay Jain et.al. 2307.05339v1 null Smart watches and other wearable devices are equipped with photoplethysmography (PPG) sensors for monitoring heart rate and other aspects of cardiovascular health. However, PPG signals collected from such devices are susceptible to corruption from noise and motion artifacts, which cause errors in heart rate estimation. Typical denoising approaches filter or reconstruct the signal in ways that eliminate much of the morphological information, even from the clean parts of the signal that would be useful to preserve. In this work, we develop an algorithm for denoising PPG signals that reconstructs the corrupted parts of the signal, while preserving the clean parts of the PPG signal. Our novel framework relies on self-supervised training, where we leverage a large database of clean PPG signals to train a denoising autoencoder. As we show, our reconstructed signals provide better estimates of heart rate from PPG signals than the leading heart rate estimation methods. Further experiments show significant improvement in Heart Rate Variability (HRV) estimation from PPG signals using our algorithm. We conclude that our algorithm denoises PPG signals in a way that can improve downstream analysis of many different health metrics from wearable devices.
2023-07-06 Learned Kernels for Interpretable and Efficient PPG Signal Quality Assessment and Artifact Segmentation Sully F. Chen et.al. 2307.05385v1 null Photoplethysmography (PPG) provides a low-cost, non-invasive method to continuously monitor various cardiovascular parameters. PPG signals are generated by wearable devices and frequently contain large artifacts caused by external factors, such as motion of the human subject. In order to ensure robust and accurate extraction of physiological parameters, corrupted areas of the signal need to be identified and handled appropriately. Previous methodology relied either on handcrafted feature detectors or signal metrics which yield sub-optimal performance, or relied on machine learning techniques such as deep neural networks (DNN) which lack interpretability and are computationally and memory intensive. In this work, we present a novel method to learn a small set of interpretable convolutional kernels that has performance similar to -- and often better than -- the state-of-the-art DNN approach with several orders of magnitude fewer parameters. This work allows for efficient, robust, and interpretable signal quality assessment and artifact segmentation on low-power devices.
2023-06-19 ApSense: Data-driven Algorithm in PPG-based Sleep Apnea Sensing Tanut Choksatchawathi et.al. 2306.10863v1 null In this paper, we utilized obstructive sleep apnea and cardiovascular disease-related photoplethysmography (PPG) features in constructing the input to deep learning (DL). The features are pulse wave amplitude (PWA), beat-to-beat or RR interval, a derivative of PWA, a derivative of RR interval, systolic phase duration, diastolic phase duration, and pulse area. Then, we develop DL architectures to evaluate the proposed features' usefulness. Eventually, we demonstrate that in human-machine settings where the medical staff only needs to label 20% of the PPG recording length, our proposed features with the developed DL architectures achieve 79.95% and 73.81% recognition accuracy in MESA and HeartBEAT datasets. This simplifies the labelling task of the medical staff during the sleep test yet provides accurate apnea event recognition.
2023-06-16 Camera PPG waveforms at the forehead A. C. den Brinker et.al. 2306.09879v1 null In order to obtain insights into the feasibility of replacing ECG-guided triggering in magnetic resonance imaging (MRI) by a system based on video photoplethysmography (PPG), PPG and ECG data were collected from volunteers in an MRI scanner. PPG waveforms obtained using remote camera PPG directed at the forehead are studied in qualitative and quantitative sense over a number of volunteers. The data analysis considers variations in PPG waveforms across volunteers, modelling of the waveforms in Fourier series, dependencies of waveforms and features on the interbeat interval (IBI) and breath-holding, and models for ECG-blind estimation of R-peak position. The main findings are that the PPG waveform depends on the volunteer and that its shape changes with IBI and does not depend on breath-holding in the given scenario. Low-order harmonic models provide accurate approximations to the PPG waveform, where for higher IBI the waveform shows more temporal details. Accurate predictions (20 ms std) of the delays between markers in ECG and PPG appear feasible from a single PPG feature.
2023-06-13 BeliefPPG: Uncertainty-aware Heart Rate Estimation from PPG signals via Belief Propagation Valentin Bieri et.al. 2306.07730v2 link We present a novel learning-based method that achieves state-of-the-art performance on several heart rate estimation benchmarks extracted from photoplethysmography signals (PPG). We consider the evolution of the heart rate in the context of a discrete-time stochastic process that we represent as a hidden Markov model. We derive a distribution over possible heart rate values for a given PPG signal window through a trained neural network. Using belief propagation, we incorporate the statistical distribution of heart rate changes to refine these estimates in a temporal context. From this, we obtain a quantized probability distribution over the range of possible heart rate values that captures a meaningful and well-calibrated estimate of the inherent predictive uncertainty. We show the robustness of our method on eight public datasets with three different cross-validation experiments.
2023-06-04 rPPG-MAE: Self-supervised Pre-training with Masked Autoencoders for Remote Physiological Measurement Xin Liu et.al. 2306.02301v1 null Remote photoplethysmography (rPPG) is an important technique for perceiving human vital signs and has received extensive attention. For a long time, researchers have focused on supervised methods that rely on large amounts of labeled data. These methods are limited by the requirement for large amounts of data and the difficulty of acquiring ground truth physiological signals. To address these issues, several self-supervised methods based on contrastive learning have been proposed. However, they focus on contrastive learning between samples, which neglects the inherent self-similar prior in physiological signals, and seem to have a limited ability to cope with noise. In this paper, a linear self-supervised reconstruction task was designed for extracting the inherent self-similar prior in physiological signals. Besides, a specific noise-insensitive strategy was explored for reducing the interference of motion and illumination. The proposed framework in this paper, namely rPPG-MAE, demonstrates excellent performance even on the challenging VIPL-HR dataset. We also evaluate the proposed method on two public datasets, namely PURE and UBFC-rPPG. The results show that our method not only outperforms existing self-supervised methods but also exceeds the state-of-the-art (SOTA) supervised methods. One important observation is that the quality of the dataset seems more important than the size in self-supervised pre-training of rPPG. The source code is released at https://github.com/linuxsino/rPPG-MAE.
2023-06-01 Privacy-Preserving Remote Heart Rate Estimation from Facial Videos Divij Gupta et.al. 2306.01141v1 null Remote Photoplethysmography (rPPG) is the process of estimating PPG from facial videos. While this approach benefits from contactless interaction, it is reliant on videos of faces, which often constitutes an important privacy concern. Recent research has revealed that deep learning techniques are vulnerable to attacks, which can result in significant data breaches making deep rPPG estimation even more sensitive. To address this issue, we propose a data perturbation method that involves extraction of certain areas of the face with less identity-related information, followed by pixel shuffling and blurring. Our experiments on two rPPG datasets (PURE and UBFC) show that our approach reduces the accuracy of facial recognition algorithms by over 60%, with minimal impact on rPPG extraction. We also test our method on three facial recognition datasets (LFW, CALFW, and AgeDB), where our approach reduced performance by nearly 50%. Our findings demonstrate the potential of our approach as an effective privacy-preserving solution for rPPG estimation.
2023-05-25 Mask Attack Detection Using Vascular-weighted Motion-robust rPPG Signals Chenglin Yao et.al. 2305.15940v1 null Detecting 3D mask attacks to a face recognition system is challenging. Although genuine faces and 3D face masks show significantly different remote photoplethysmography (rPPG) signals, rPPG-based face anti-spoofing methods often suffer from performance degradation due to unstable face alignment in the video sequence and weak rPPG signals. To enhance the rPPG signal in a motion-robust way, a landmark-anchored face stitching method is proposed to align the faces robustly and precisely at the pixel-wise level by using both SIFT keypoints and facial landmarks. To better encode the rPPG signal, a weighted spatial-temporal representation is proposed, which emphasizes the face regions with rich blood vessels. In addition, characteristics of rPPG signals in different color spaces are jointly utilized. To improve the generalization capability, a lightweight EfficientNet with a Gated Recurrent Unit (GRU) is designed to extract both spatial and temporal features from the rPPG spatial-temporal representation for classification. The proposed method is compared with the state-of-the-art methods on five benchmark datasets under both intra-dataset and cross-dataset evaluations. The proposed method shows a significant and consistent improvement in performance over other state-of-the-art rPPG-based methods for face spoofing detection.
2023-05-24 Promoting Generalization in Cross-Dataset Remote Photoplethysmography Nathan Vance et.al. 2305.15199v1 null Remote Photoplethysmography (rPPG), or the remote monitoring of a subject's heart rate using a camera, has seen a shift from handcrafted techniques to deep learning models. While current solutions offer substantial performance gains, we show that these models tend to learn a bias to pulse wave features inherent to the training dataset. We develop augmentations to mitigate this learned bias by expanding both the range and variability of heart rates that the model sees while training, resulting in improved model convergence when training and cross-dataset generalization at test time. Through a 3-way cross dataset analysis we demonstrate a reduction in mean absolute error from over 13 beats per minute to below 3 beats per minute. We compare our method with other recent rPPG systems, finding similar performance under a variety of evaluation parameters.
2023-05-23 Amplitude-Independent Machine Learning for PPG through Visibility Graphs and Transfer Learning Yuyang Miao et.al. 2305.14062v1 null Photoplethysmography (PPG) signals are omnipresent in wearable devices, as they measure blood volume variations using LED technology. These signals provide insight into the body's circulatory system and can be employed to extract various bio-features, such as heart rate and vascular ageing. Although several algorithms have been proposed for this purpose, many exhibit limitations, including heavy reliance on human calibration, high signal quality requirements, and a lack of generalization. In this paper, we introduce a PPG signal processing framework that integrates graph theory and computer vision algorithms, which is invariant to affine transformations, offers rapid computation speed, and exhibits robust generalization across tasks and datasets.
2023-05-21 Your smartphone could act as a pulse-oximeter and as a single-lead ECG Ahsan Mehmood et.al. 2305.12583v1 null In the post-COVID-19 era, every new wave of the pandemic causes increased concern among the masses to learn more about their state of well-being. Therefore, it is the need of the hour to come up with ubiquitous, low-cost, non-invasive tools for rapid and continuous monitoring of body vitals that reflect the status of one's overall health. Against this backdrop, this work proposes a deep learning approach to turn a smartphone, the popular hand-held personal gadget, into a diagnostic tool to measure and monitor the three most important body vitals, i.e., pulse rate (PR), blood oxygen saturation level (aka SpO2), and respiratory rate (RR). Furthermore, we propose another method that could extract a single-lead electrocardiograph (ECG) of the subject. The proposed methods include the following core steps: the subject records a short video of his/her fingertip by placing his/her finger on the rear camera of the smartphone; the recorded video is pre-processed to extract the filtered and/or detrended video-photoplethysmography (vPPG) signal; and this signal is then fed to custom-built convolutional neural networks (CNN), which output the vitals (PR, SpO2, and RR) as well as a single-lead ECG of the subject. To be precise, the contribution of this paper is two-fold: 1) estimation of the three body vitals (PR, SpO2, RR) from the vPPG data using custom-built CNNs, a vision transformer, and, most importantly, the CLIP model; 2) a novel discrete cosine transform+feedforward neural network-based method that translates the recorded video-PPG signal to a single-lead ECG signal. The proposed method is anticipated to find its application in several use-case scenarios, e.g., remote healthcare, mobile health, fitness, sports, etc.
2023-05-09 Predicting Cardiovascular Disease Risk using Photoplethysmography and Deep Learning Wei-Hung Weng et.al. 2305.05648v1 null Cardiovascular diseases (CVDs) are responsible for a large proportion of premature deaths in low- and middle-income countries. Early CVD detection and intervention is critical in these populations, yet many existing CVD risk scores require a physical examination or lab measurements, which can be challenging in such health systems due to limited accessibility. Here we investigated the potential to use photoplethysmography (PPG), a sensing technology available on most smartphones that can potentially enable large-scale screening at low cost, for CVD risk prediction. We developed a deep learning PPG-based CVD risk score (DLS) to predict the probability of having major adverse cardiovascular events (MACE: non-fatal myocardial infarction, stroke, and cardiovascular death) within ten years, given only age, sex, smoking status and PPG as predictors. We compared the DLS with the office-based refit-WHO score, which adopts the shared predictors from WHO and Globorisk scores (age, sex, smoking status, height, weight and systolic blood pressure) but is refitted on the UK Biobank (UKB) cohort. In the UKB cohort, the DLS's C-statistic (71.1%, 95% CI 69.9-72.4) was non-inferior to the office-based refit-WHO score (70.9%, 95% CI 69.7-72.2; non-inferiority margin of 2.5%, p<0.01). The calibration of the DLS was satisfactory, with a 1.8% mean absolute calibration error. Adding DLS features to the office-based score increased the C-statistic by 1.0% (95% CI 0.6-1.4). The DLS predicts ten-year MACE risk comparably to the office-based refit-WHO score. It provides a proof-of-concept and suggests the potential of PPG-based strategies for community-based primary prevention in resource-limited regions.
2023-04-28 Non-Contact Heart Rate Measurement from Deteriorated Videos Nhi Nguyen et.al. 2304.14789v1 null Remote photoplethysmography (rPPG) offers a state-of-the-art, non-contact methodology for estimating human pulse by analyzing facial videos. Despite its potential, rPPG methods can be susceptible to various artifacts, such as noise, occlusions, and other obstructions caused by sunglasses, masks, or even involuntary facial contact, such as individuals inadvertently touching their faces. In this study, we apply image processing transformations to intentionally degrade video quality, mimicking these challenging conditions, and subsequently evaluate the performance of both non-learning and learning-based rPPG methods on the deteriorated data. Our results reveal a significant decrease in accuracy in the presence of these artifacts, prompting us to propose the application of restoration techniques, such as denoising and inpainting, to improve heart-rate estimation outcomes. By addressing these challenging conditions and occlusion artifacts, our approach aims to make rPPG methods more robust and adaptable to real-world situations. To assess the effectiveness of our proposed methods, we undertake comprehensive experiments on three publicly available datasets, encompassing a wide range of scenarios and artifact types. Our findings underscore the potential to construct a robust rPPG system by employing an optimal combination of restoration algorithms and rPPG techniques. Moreover, our study contributes to the advancement of privacy-conscious rPPG methodologies, thereby bolstering the overall utility and impact of this innovative technology in the field of remote heart-rate estimation under realistic and diverse conditions.
2023-04-21 Heart Rate Extraction from Abdominal Audio Signals Jake Stuchbury-Wass et.al. 2304.11020v1 null Abdominal sounds (ABS) have been traditionally used for assessing gastrointestinal (GI) disorders. However, the assessment requires a trained medical professional to perform multiple abdominal auscultation sessions, which is resource-intense and may fail to provide an accurate picture of patients' continuous GI wellbeing. This has generated a technological interest in developing wearables for continuous capture of ABS, which enables a fuller picture of patient's GI status to be obtained at reduced cost. This paper seeks to evaluate the feasibility of extracting heart rate (HR) from such ABS monitoring devices. The collection of HR directly from these devices would enable gathering vital signs alongside GI data without the need for additional wearable devices, providing further cost benefits and improving general usability. We utilised a dataset containing 104 hours of ABS audio, collected from the abdomen using an e-stethoscope, and electrocardiogram as ground truth. Our evaluation shows for the first time that we can successfully extract HR from audio collected from a wearable on the abdomen. As heart sounds collected from the abdomen suffer from significant noise from GI and respiratory tracts, we leverage wavelet denoising for improved heart beat detection. The mean absolute error of the algorithm for average HR is 3.4 BPM with mean directional error of -1.2 BPM over the whole dataset. A comparison to photoplethysmography-based wearable HR sensors shows that our approach exhibits comparable accuracy to consumer wrist-worn wearables for average and instantaneous heart rate.
2023-04-21 IoT-Based Solution for Paraplegic Sufferer to Send Signals to Physician via Internet L. Srinivasan et.al. 2304.10840v1 null We come across hospitals and non-profit organizations that care for people with paralysis who have had all or part of their body incapacitated by a paralytic attack. Due to a lack of motor coordination, these persons are typically unable to communicate their requirements because they cannot speak clearly or use sign language. In such a case, we suggest a system that enables a disabled person to move any part of the body that is capable of moving in order to display a text on the LCD. This method also addresses the circumstance in which the patient cannot be attended to in person, by sending an SMS message using GSM. The suggested system operates by detecting the tilt direction of the user's body part. As a result, patients can communicate with physicians, therapists, or their loved ones at home or work over the web. Patient-specific data, such as heart rate, must be continuously reported in health centers. The suggested method tracks the patient's pulse rate and other comparable data. For instance, photoplethysmography is used to assess heart rate. The decoded periodic data is transmitted continually via a microcontroller coupled to a transmitting module. The physician's cabin contains a receiver device that obtains and deciphers the data and continuously displays it on graphical interfaces viewable on a laptop. As a result, the physician can monitor and handle multiple patients at once.
2023-04-14 PPG Signals for Hypertension Diagnosis: A Novel Method using Deep Learning Models Graham Frederick et.al. 2304.06952v1 null Hypertension is a medical condition characterized by high blood pressure, and classifying it into its various stages is crucial to managing the disease. In this project, a novel method is proposed for classifying stages of hypertension using Photoplethysmography (PPG) signals and deep learning models, namely AvgPool_VGG-16. The PPG signal is a non-invasive method of measuring blood pressure through the use of light sensors that measure the changes in blood volume in the microvasculature of tissues. PPG images from the publicly available blood pressure classification dataset were used to train the model. Multiclass classification for various PPG stages was performed. The results show that the proposed method achieves high accuracy in classifying hypertension stages, demonstrating the potential of PPG signals and deep learning models in hypertension diagnosis and management.
2023-04-05 Deep Learning Systems for Advanced Driving Assistance Francesco Rundo et.al. 2304.06041v1 null Next-generation cars embed intelligent assessment of driving safety through innovative solutions often based on the use of artificial intelligence. Driving safety monitoring can be carried out using several methodologies widely treated in the scientific literature. In this context, the author proposes an innovative approach that uses an ad-hoc bio-sensing system suitable for reconstructing the physio-based attentional status of the car driver. To reconstruct the car driver's physiological status, the author proposes the use of a bio-sensing probe consisting of coupled LEDs in the near-infrared (NiR) spectrum with a photodetector. This probe, placed over the monitored subject, allows detection of a physiological signal called PhotoPlethysmoGraphy (PPG). The PPG signal formation is regulated by the change in oxygenated and non-oxygenated hemoglobin concentration in the monitored subject's bloodstream, which is directly connected to cardiac activity, in turn regulated by the Autonomic Nervous System (ANS) that characterizes the subject's attention level. The car driver drowsiness monitoring designed in this way will be combined with further driving safety assessment based on correlated intelligent driving scenario understanding.
2023-03-23 Efficient and Direct Inference of Heart Rate Variability using Both Signal Processing and Machine Learning Yuntong Zhang et.al. 2303.13637v1 null Heart Rate Variability (HRV) measures the variation of the time between consecutive heartbeats and is a major indicator of physical and mental health. Recent research has demonstrated that photoplethysmography (PPG) sensors can be used to infer HRV. However, many prior studies had high errors because they employed only signal processing or only machine learning (ML), because they inferred HRV indirectly, or because large training datasets were lacking. Many prior studies may also require large ML models. The low accuracy and large model sizes limit their applications to small embedded devices and potential future use in healthcare. To address the above issues, we first collected a large dataset of PPG signals and HRV ground truth. With this dataset, we developed HRV models that combine signal processing and ML to directly infer HRV. Evaluation results show that our method had errors between 3.5% and 25.7% and outperformed signal-processing-only and ML-only methods. We also explored different ML models, which showed that Decision Trees and Multi-layer Perceptrons have 13.0% and 9.1% errors on average, with models of at most hundreds of KB and inference time of less than 1 ms. Hence, they are more suitable for small embedded devices and potentially enable the future use of PPG-based HRV monitoring in healthcare.
2023-03-23 PPG-based Heart Rate Estimation with Efficient Sensor Sampling and Learning Models Yuntong Zhang et.al. 2303.13636v1 null Recent studies showed that Photoplethysmography (PPG) sensors embedded in wearable devices can estimate heart rate (HR) with high accuracy. However, despite prior research efforts, applying PPG-sensor-based HR estimation to embedded devices still faces challenges due to energy-intensive high-frequency PPG sampling and resource-intensive machine-learning models. In this work, we aim to explore HR estimation techniques that are more suitable for lower-power and resource-constrained embedded devices. More specifically, we seek to design techniques that could provide high-accuracy HR estimation with low-frequency PPG sampling, small model size, and fast inference time. First, we show that by combining signal processing and ML, it is possible to reduce the PPG sampling frequency from 125 Hz to only 25 Hz while providing higher HR estimation accuracy. This combination also helps to reduce the ML model feature size, leading to smaller models. Additionally, we present a comprehensive analysis of different ML models and feature sizes to compare their accuracy, model size, and inference time. The models explored include Decision Tree (DT), Random Forest (RF), K-nearest neighbor (KNN), Support vector machines (SVM), and Multi-layer perceptron (MLP). Experiments were conducted using both a widely utilized dataset and our self-collected dataset. The experimental results show that our method, which combines signal processing and ML, had only 5% error for HR estimation using low-frequency PPG data. Moreover, our analysis showed that DT models with 10 to 20 input features usually have good accuracy while being several orders of magnitude smaller in model size and faster in inference time.
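To illustrate why a 25 Hz sampling rate can still carry the heart rate, the sketch below picks the strongest spectral peak in a plausible HR band from a synthetic pulse sampled at 125 Hz and from the same signal decimated to 25 Hz. Everything here is an assumption for illustration: the synthetic waveform, the 40 to 200 bpm search band, and the names `estimate_hr_bpm` and `make_ppg` are hypothetical, and this spectral-peak baseline is not the paper's combined signal-processing and ML pipeline.

```python
import math

def estimate_hr_bpm(samples, fs_hz, lo_bpm=40, hi_bpm=200):
    """Return the integer bpm in [lo_bpm, hi_bpm] with the largest spectral power."""
    mean = sum(samples) / len(samples)
    centered = [s - mean for s in samples]
    best_bpm, best_power = lo_bpm, -1.0
    for bpm in range(lo_bpm, hi_bpm + 1):
        # Angular step per sample for this candidate heart rate.
        w = 2.0 * math.pi * (bpm / 60.0) / fs_hz
        re = sum(x * math.cos(w * i) for i, x in enumerate(centered))
        im = sum(x * math.sin(w * i) for i, x in enumerate(centered))
        power = re * re + im * im
        if power > best_power:
            best_bpm, best_power = bpm, power
    return best_bpm

def make_ppg(fs_hz, seconds, hr_bpm):
    """Synthetic pulse: fundamental plus a weaker second harmonic (assumed shape)."""
    f = hr_bpm / 60.0
    return [
        math.sin(2 * math.pi * f * i / fs_hz)
        + 0.3 * math.sin(4 * math.pi * f * i / fs_hz)
        for i in range(int(fs_hz * seconds))
    ]

ppg_125 = make_ppg(125, 10, 72)   # 10 s at 125 Hz, true HR 72 bpm
ppg_25 = ppg_125[::5]             # keep every 5th sample -> 25 Hz
hr_125 = estimate_hr_bpm(ppg_125, 125)
hr_25 = estimate_hr_bpm(ppg_25, 25)
print(hr_125, hr_25)
```

Plain decimation without an anti-aliasing filter is acceptable here only because the synthetic signal has no content above 12.5 Hz; real sensor data would normally be low-pass filtered first.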
2023-03-21 Motion Matters: Neural Motion Transfer for Better Camera Physiological Sensing Akshay Paruchuri et.al. 2303.12059v2 link Machine learning models for camera-based physiological measurement can have weak generalization due to a lack of representative training data. Body motion is one of the most significant sources of noise when attempting to recover the subtle cardiac pulse from a video. We explore motion transfer as a form of data augmentation to introduce motion variation while preserving physiological changes. We adapt a neural video synthesis approach to augment videos for the task of remote photoplethysmography (PPG) and study the effects of motion augmentation with respect to 1) the magnitude and 2) the type of motion. After training on motion-augmented versions of publicly available datasets, the presented inter-dataset results on five benchmark datasets show improvements of up to 75% over existing state-of-the-art results. Our findings illustrate the utility of motion transfer as a data augmentation technique for improving the generalization of models for camera-based physiological sensing. We release our code and pre-trained models for using motion transfer as a data augmentation technique on our project page: https://motion-matters.github.io/
2023-03-17 HDformer: A Higher Dimensional Transformer for Diabetes Detection Utilizing Long Range Vascular Signals Ella Lan et.al. 2303.11340v1 null Diabetes mellitus is a worldwide concern, and early detection can help to prevent serious complications. Low-cost, non-invasive detection methods, which feed cardiovascular signals into deep learning models, have emerged. However, limited accuracy constrains their clinical usage. In this paper, we present a new Transformer-based architecture, Higher Dimensional Transformer (HDformer), which takes long-range photoplethysmography (PPG) signals to detect diabetes. The long-range PPG contains broader and deeper signal contextual information compared to the less-than-one-minute PPG signals commonly utilized in existing research. To increase the capability and efficiency of processing long-range data, we propose a new attention module, Time Square Attention (TSA), which reduces the number of tokens by more than 10x while retaining the local/global dependencies. It converts the 1-dimensional inputs into 2-dimensional representations and groups adjacent points into a single 2D token, using the 2D Transformer models as the backbone of the encoder. It generates the dynamic patch sizes into a gated mixture-of-experts (MoE) network as decoder, which optimizes the learning on different attention areas. Extensive experiments show that HDformer achieves state-of-the-art performance (sensitivity 98.4, accuracy 97.3, specificity 92.8, and AUC 0.929) on the standard MIMIC-III dataset, surpassing existing studies. This work is the first to apply a Transformer to long-range, non-invasive PPG signals for diabetes detection, achieving a more scalable and convenient solution compared to traditional invasive approaches. The proposed HDformer can also be scaled to analyze general long-range biomedical waveforms. A wearable prototype finger-ring is designed as a proof of concept.
2023-03-16 Full-Body Cardiovascular Sensing with Remote Photoplethysmography Lu Niu et.al. 2303.09638v1 null Remote photoplethysmography (rPPG) allows for noncontact monitoring of blood volume changes from a camera by detecting minor fluctuations in reflected light. Prior applications of rPPG focused on face videos. In this paper we explored the feasibility of rPPG from non-face body regions such as the arms, legs, and hands. We collected a new dataset titled Multi-Site Physiological Monitoring (MSPM), which will be released with this paper. The dataset consists of 90 frames per second video of exposed arms, legs, and face, along with 10 synchronized PPG recordings. We performed baseline heart rate estimation experiments from non-face regions with several state-of-the-art rPPG approaches, including chrominance-based (CHROM), plane-orthogonal-to-skin (POS) and RemotePulseNet (RPNet). To our knowledge, this is the first evaluation of the fidelity of rPPG signals simultaneously obtained from multiple regions of a human body. Our experiments showed that skin pixels from arms, legs, and hands are all potential sources of the blood volume pulse. The best-performing approach, POS, achieved a mean absolute error peaking at 7.11 beats per minute from non-facial body parts compared to 1.38 beats per minute from the face. Additionally, we performed experiments on pulse transit time (PTT) from both the contact PPG and rPPG signals. We found that remote PTT is possible with moderately high frame rate video when distal locations on the body are visible. These findings and the supporting dataset should facilitate new research on non-face rPPG and monitoring blood flow dynamics over the whole body with a camera.
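The pulse transit time (PTT) experiments mentioned above rely on measuring how much later the pulse arrives at a distal site than at a proximal one. One common way to estimate such a delay is to find the lag that maximizes the cross-correlation of the two waveforms. The sketch below assumes that approach and uses synthetic sinusoidal pulses; only the 90 frames-per-second rate comes from the entry, while the function name `transit_lag_samples`, the 1.2 Hz pulse rate, and the 9-sample delay are made up, and the paper may compute PTT differently.

```python
import math

def transit_lag_samples(proximal, distal, max_lag):
    """Lag (in samples) by which `distal` trails `proximal`, via cross-correlation."""
    if len(proximal) != len(distal) or max_lag >= len(proximal):
        raise ValueError("signals must match in length and exceed max_lag")
    # Use the same number of samples for every lag so scores are comparable.
    n = len(proximal) - max_lag
    mp = sum(proximal) / len(proximal)
    md = sum(distal) / len(distal)
    best_lag, best_score = 0, -math.inf
    for lag in range(max_lag + 1):
        score = sum((proximal[i] - mp) * (distal[i + lag] - md) for i in range(n))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

FS_HZ = 90         # frame rate stated for the dataset
PULSE_HZ = 1.2     # assumed pulse rate (72 bpm), one period = 75 samples
TRUE_DELAY = 9     # assumed delay in samples (0.1 s), for the synthetic test only
N = 780            # 750 compared samples (10 whole periods) + 30 samples of lag headroom

proximal = [math.sin(2 * math.pi * PULSE_HZ * i / FS_HZ) for i in range(N)]
distal = [math.sin(2 * math.pi * PULSE_HZ * (i - TRUE_DELAY) / FS_HZ) for i in range(N)]

lag = transit_lag_samples(proximal, distal, max_lag=30)
ptt_ms = 1000.0 * lag / FS_HZ
print(lag, ptt_ms)
```

The search range must stay below one pulse period, otherwise a periodic signal produces ambiguous peaks. The time resolution is one frame (about 11 ms at 90 fps), which is one reason a higher frame rate helps remote PTT.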
2023-03-16 Image Enhancement for Remote Photoplethysmography in a Low-Light Environment Lin Xi et.al. 2303.09336v1 link With the improvement of sensor technology and significant algorithmic advances, the accuracy of remote heart rate monitoring technology has improved significantly. Despite these advances, the performance of rPPG algorithms can degrade during long-term, high-intensity continuous work in the evening or in poorly lit environments. One of the main challenges is that lost facial details and low contrast cause detection and tracking to fail. Insufficient lighting during video capture also degrades the quality of the physiological signal. In this paper, we collect a large-scale dataset designed for remote heart rate estimation, recorded under various illumination conditions, to evaluate the performance of rPPG algorithms (Green, ICA, and POS). We also propose a low-light enhancement solution for remote heart rate estimation under low-light conditions. Using the collected dataset, we found that 1) the face detection algorithm cannot detect faces in video captured in low-light conditions; 2) a decrease in the amplitude of the pulsatile signal lets noise dominate; and 3) the chrominance-based method suffers because its assumption about skin tone does not hold, and the Green and ICA methods are less affected than POS in dark environments. The proposed solution for the rPPG process is effective at detecting the pulsatile signal and improving its signal-to-noise ratio and precision.
2023-03-14 Non-Contrastive Unsupervised Learning of Physiological Signals from Video Jeremy Speth et.al. 2303.07944v1 link Subtle periodic signals such as blood volume pulse and respiration can be extracted from RGB video, enabling remote health monitoring at low cost. Advancements in remote pulse estimation -- or remote photoplethysmography (rPPG) -- are currently driven by deep learning solutions. However, modern approaches are trained and evaluated on benchmark datasets with associated ground truth from contact-PPG sensors. We present the first non-contrastive unsupervised learning framework for signal regression to break free from the constraints of labelled video data. With minimal assumptions of periodicity and finite bandwidth, our approach is capable of discovering the blood volume pulse directly from unlabelled videos. We find that encouraging sparse power spectra within normal physiological bandlimits and variance over batches of power spectra is sufficient for learning visual features of periodic signals. We perform the first experiments utilizing unlabelled video data not specifically created for rPPG to train robust pulse rate estimators. Given the limited inductive biases and impressive empirical results, the approach is theoretically capable of discovering other periodic signals from video, enabling multiple physiological measurements without the need for ground truth signals. Codes to fully reproduce the experiments are made available along with the paper.
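The two spectral ideas named in the entry above (power confined to physiological band limits, and a sparse spectrum inside that band) can be expressed as simple ratios of spectral power. The sketch below is one plausible formulation written for illustration: the band limits of 0.66 to 3.0 Hz, the 0.1 Hz peak half-width, and the function names are assumptions, and the paper's actual loss definitions and its batch-variance term are not reproduced here.

```python
import math

def power_spectrum(signal, fs_hz, freqs_hz):
    """Power of `signal` at each frequency in `freqs_hz` (direct DFT evaluation)."""
    mean = sum(signal) / len(signal)
    x = [s - mean for s in signal]
    out = []
    for f in freqs_hz:
        w = 2.0 * math.pi * f / fs_hz
        re = sum(v * math.cos(w * i) for i, v in enumerate(x))
        im = sum(v * math.sin(w * i) for i, v in enumerate(x))
        out.append(re * re + im * im)
    return out

def bandwidth_penalty(freqs, power, lo, hi):
    """Fraction of total power lying outside [lo, hi]."""
    outside = sum(p for f, p in zip(freqs, power) if f < lo or f > hi)
    return outside / sum(power)

def sparsity_penalty(freqs, power, lo, hi, half_width):
    """Fraction of in-band power lying farther than half_width from the in-band peak."""
    in_band = [(f, p) for f, p in zip(freqs, power) if lo <= f <= hi]
    peak_f = max(in_band, key=lambda fp: fp[1])[0]
    away = sum(p for f, p in in_band if abs(f - peak_f) > half_width)
    return away / sum(p for _, p in in_band)

FS_HZ, N = 30, 300                      # 10 s of signal at an assumed 30 fps
LO_HZ, HI_HZ = 0.66, 3.0                # assumed physiological band limits
freqs = [k / 10 for k in range(1, 61)]  # 0.1 Hz grid up to 6 Hz

def tone(f_hz, amp=1.0):
    return [amp * math.sin(2 * math.pi * f_hz * i / FS_HZ) for i in range(N)]

clean = tone(1.2)
out_of_band = [a + b for a, b in zip(tone(1.2), tone(5.0, 0.5))]
two_peaks = [a + b for a, b in zip(tone(1.2), tone(2.0, 0.5))]

bw_clean = bandwidth_penalty(freqs, power_spectrum(clean, FS_HZ, freqs), LO_HZ, HI_HZ)
bw_noisy = bandwidth_penalty(freqs, power_spectrum(out_of_band, FS_HZ, freqs), LO_HZ, HI_HZ)
sp_two = sparsity_penalty(freqs, power_spectrum(two_peaks, FS_HZ, freqs), LO_HZ, HI_HZ, 0.1)
print(bw_clean, bw_noisy, sp_two)
```

A single in-band tone scores near zero on both penalties, while out-of-band or multi-peak signals are penalized. In a training setting these ratios would be computed with differentiable tensor operations rather than plain Python lists.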
2023-03-14 ForDigitStress: A multi-modal stress dataset employing a digital job interview scenario Alexander Heimerl et.al. 2303.07742v1 null We present a multi-modal stress dataset that uses digital job interviews to induce stress. The dataset provides multi-modal data from 40 participants, including audio, video (motion capturing, facial recognition, eye tracking), and physiological information (photoplethysmography, electrodermal activity). In addition, the dataset contains time-continuous annotations for stress and the emotions that occurred (e.g. shame, anger, anxiety, surprise). To establish a baseline, five different machine learning classifiers (Support Vector Machine, K-Nearest Neighbors, Random Forest, Long-Short-Term Memory Network) have been trained and evaluated on the proposed dataset for a binary stress classification task. The best-performing classifier achieved an accuracy of 88.3% and an F1-score of 87.5%.
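The accuracy and F1-score reported for the binary stress classification task follow the standard definitions based on true/false positives and negatives. The sketch below shows that arithmetic on made-up toy labels; the labels and the function name `binary_metrics` are hypothetical and have no connection to the dataset's actual predictions.

```python
def binary_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall, f1) for 0/1 labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return accuracy, precision, recall, f1

# Toy labels (1 = stressed, 0 = not stressed); invented for illustration.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
accuracy, precision, recall, f1 = binary_metrics(y_true, y_pred)
print(accuracy, precision, recall, f1)
```

Because F1 ignores true negatives, it can diverge from accuracy on imbalanced data, which is why both numbers are usually reported together.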