arxiv-daily: latest papers around wearable devices

Automated deployment @ 2023-07-28 16:19:05 Asia/Shanghai

Contributions are welcome! Add your topics and keywords to topic.yml. You can also view historical data in the storage.

huawei

huawei watch

Publish Date	Title	Authors	PDF	Code	Abstract
2023-07-27	How to Train Your YouTube Recommender	Alexander Liu et.al.	2307.14551v1	null	YouTube provides features for users to indicate disinterest when presented with unwanted recommendations, such as the "Not interested" and "Don't recommend channel" buttons. These buttons are purported to allow the user to correct "mistakes" made by the recommendation system. Yet, relatively little is known about the empirical efficacy of these buttons. Neither is much known about users' awareness of and confidence in them. To address these gaps, we simulated YouTube users with sock puppet agents. Each agent first executed a "stain phase", where it watched many videos of one assigned topic; then it executed a "scrub phase", where it tried to remove recommendations of the assigned topic. Each agent repeatedly applied a single scrubbing strategy, which included disliking previously-watched videos or deleting them from watch history, as well as clicking the "Not interested" or "Don't recommend channel" button on newly-recommended videos. Overall, we found that the stain phase significantly increased the fraction of the recommended videos on the user's homepage dedicated to the assigned topic. For the scrub phase, using the "Not interested" button worked best, significantly reducing such recommendations in all topics tested, on average removing 88% of them. Neither the stain phase nor the scrub phase, however, had much effect on videopage recommendations (those given to users while they watch a video). We also ran a survey ($N$ = 300) asking adult YouTube users in the US whether they were aware of and used these buttons before, as well as how effective they found these buttons to be. We found that 44% of participants were not aware that the "Not interested" button existed. However, those who were aware of this button often used it to remove unwanted recommendations (82.8%) and found it to be modestly effective (3.42 out of 5).
2023-07-25	Insights into Cognitive Engagement: Comparing the Effectiveness of Game-Based and Video-Based Learning	Shayla Sharmin et.al.	2307.13637v1	null	The analysis of brain signals holds considerable importance in enhancing our comprehension of diverse learning techniques and cognitive mechanisms. Game-based learning is increasingly being recognized for its interactive and engaging educational approach. A pilot study of twelve participants divided into experimental and control groups was conducted to understand its effects on cognitive processes. Both groups were provided with the same contents regarding the basic structure of the graph. The participants in the experimental group engaged in a quiz-based game, while those in the control group watched a pre-recorded video. Functional Near-Infrared Spectroscopy (fNIRS) was employed to acquire cerebral signals, and a series of pre and post-tests were administered. The findings of our study indicate that the group engaged in the game activity displayed elevated levels of oxygenated hemoglobin compared to the group involved in watching videos. Conversely, the deoxygenated hemoglobin levels remained relatively consistent across both groups throughout the learning process. The aforementioned findings suggest that the use of game-based learning has a substantial influence on cognitive processes. Furthermore, it is evident that both the game and video groups exhibited higher neural activity in the Lateral Prefrontal cortex (PFC). The oxygenated hemoglobin ratio demonstrates that the game group had 2.33 times more neural processing in the Lateral PFC than the video group. This data is further supported by the knowledge gain analysis, which indicates that the game-based approach resulted in a 47.74% higher knowledge gain than the video group, as calculated from the difference in pre-and post-test scores.
2023-07-25	A Pairwise Dataset for GUI Conversion and Retrieval between Android Phones and Tablets	Han Hu et.al.	2307.13225v1	null	With the popularity of smartphones and tablets, users have become accustomed to using different devices for different tasks, such as using their phones to play games and tablets to watch movies. To conquer the market, one app is often available on both smartphones and tablets. However, although one app has similar graphic user interfaces (GUIs) and functionalities on phone and tablet, current app developers typically start from scratch when developing a tablet-compatible version of their app, which drives up development costs and wastes existing design resources. Researchers are attempting to employ deep learning in automated GUIs development to enhance developers' productivity. Deep learning models rely heavily on high-quality datasets. There are currently several publicly accessible GUI page datasets for phones, but none for pairwise GUIs between phones and tablets. This poses a significant barrier to the employment of deep learning in automated GUI development. In this paper, we collect and make public the Papt dataset, which is a pairwise dataset for GUI conversion and retrieval between Android phones and tablets. The dataset contains 10,035 phone-tablet GUI page pairs from 5,593 phone-tablet app pairs. We illustrate the approaches of collecting pairwise data and statistical analysis of this dataset. We also illustrate the advantages of our dataset compared to other current datasets. Through preliminary experiments on this dataset, we analyse the present challenges of utilising deep learning in automated GUI development and find that our dataset can assist the application of some deep learning models to tasks involving automatic GUI development.
2023-07-24	A Connection between One-Step Regularization and Critic Regularization in Reinforcement Learning	Benjamin Eysenbach et.al.	2307.12968v1	link	As with any machine learning problem with limited data, effective offline RL algorithms require careful regularization to avoid overfitting. One-step methods perform regularization by doing just a single step of policy improvement, while critic regularization methods do many steps of policy improvement with a regularized objective. These methods appear distinct. One-step methods, such as advantage-weighted regression and conditional behavioral cloning, truncate policy iteration after just one step. This "early stopping" makes one-step RL simple and stable, but can limit its asymptotic performance. Critic regularization typically requires more compute but has appealing lower-bound guarantees. In this paper, we draw a close connection between these methods: applying a multi-step critic regularization method with a regularization coefficient of 1 yields the same policy as one-step RL. While practical implementations violate our assumptions and critic regularization is typically applied with smaller regularization coefficients, our experiments nevertheless show that our analysis makes accurate, testable predictions about practical offline RL methods (CQL and one-step RL) with commonly-used hyperparameters. Our results do not imply that every problem can be solved with a single step of policy improvement, but rather that one-step RL might be competitive with critic regularization on RL problems that demand strong regularization.
2023-07-24	Rechargeable Li/Cl$_2$ battery down to -80 °C	Peng Liang et.al.	2307.12947v1	null	Low temperature rechargeable batteries are important to life in cold climates, polar/deep-sea expeditions and space explorations. Here, we report ~ 3.5 - 4 V rechargeable lithium/chlorine (Li/Cl2) batteries operating down to -80 °C, employing Li metal negative electrode, a novel CO2 activated porous carbon (KJCO2) as the positive electrode, and a high ionic conductivity (~ 5 to 20 mS cm-1 from -80 °C to 25 °C) electrolyte comprised of 1 M aluminum chloride (AlCl3), 0.95 M lithium chloride (LiCl), and 0.05 M lithium bis(fluorosulfonyl)imide (LiFSI) in low melting point (-104.5 °C) thionyl chloride (SOCl2). Between room-temperature and -80 °C, the Li/Cl2 battery delivered up to ~ 30,000 - 4,500 mAh g-1 first discharge capacity and a 1,200 - 5,000 mAh g-1 reversible capacity (discharge voltages in ~ 3.5 to 3.1 V) over up to 130 charge-discharge cycles. Mass spectrometry and X-ray photoelectron spectroscopy (XPS) probed Cl2 trapped in the porous carbon upon LiCl electro-oxidation during charging. At lower temperature down to -80 °C, SCl2/S2Cl2 and Cl2 generated by electro-oxidation in the charging step were trapped in porous KJCO2 carbon, allowing for reversible reduction to afford a high discharge voltage plateau near ~ 4 V with up to ~ 1000 mAh g-1 capacity for SCl2/S2Cl2 reduction and up to ~ 4000 mAh g-1 capacity at ~ 3.1 V plateau for Cl2 reduction. Towards practical use, we made CR2032 Li/Cl2 battery cells to drive digital watches at -40 °C and light emitting diode at -80 °C, opening Li/Cl2 secondary batteries for ultra-cold conditions.
2023-07-24	Less is More: Focus Attention for Efficient DETR	Dehua Zheng et.al.	2307.12612v1	link	DETR-like models have significantly boosted the performance of detectors and even outperformed classical convolutional models. However, treating all tokens equally without discrimination brings a redundant computational burden in the traditional encoder structure. The recent sparsification strategies exploit a subset of informative tokens to reduce attention complexity while maintaining performance through the sparse encoder. But these methods tend to rely on unreliable model statistics. Moreover, simply reducing the token population hinders the detection performance to a large extent, limiting the application of these sparse models. We propose Focus-DETR, which focuses attention on more informative tokens for a better trade-off between computation efficiency and model accuracy. Specifically, we reconstruct the encoder with dual attention, which includes a token scoring mechanism that considers both localization and category semantic information of the objects from multi-scale feature maps. We efficiently abandon the background queries and enhance the semantic interaction of the fine-grained object queries based on the scores. Compared with the state-of-the-art sparse DETR-like detectors under the same setting, our Focus-DETR gets comparable complexity while achieving 50.4AP (+2.2) on COCO. The code is available at https://github.com/huawei-noah/noah-research/tree/master/Focus-DETR and https://gitee.com/mindspore/models/tree/master/research/cv/Focus-DETR.
2023-07-24	Multi-Shooting Differential Dynamic Programming for Hybrid Systems using Analytical Derivatives	Shubham Singh et.al.	2307.12606v1	null	Differential Dynamic Programming (DDP) is a popular technique used to generate motion for dynamic-legged robots in the recent past. However, in most cases, only the first-order partial derivatives of the underlying dynamics are used, resulting in the iLQR approach. Neglecting the second-order terms often slows down the convergence rate compared to full DDP. Multi-Shooting is another popular technique to improve robustness, especially if the dynamics are highly non-linear. In this work, we consider Multi-Shooting DDP for trajectory optimization of a bounding gait for a simplified quadruped model. As the main contribution, we develop Second-Order analytical partial derivatives of the rigid-body contact dynamics, extending our previous results for fixed/floating base models with multi-DoF joints. Finally, we show the benefits of a novel Quasi-Newton method for approximating second-order derivatives of the dynamics, leading to order-of-magnitude speedups in the convergence compared to the full DDP method.
2023-07-24	Automated Mapping of Adaptive App GUIs from Phones to TVs	Han Hu et.al.	2307.12522v1	null	With the increasing interconnection of smart devices, users often desire to adopt the same app on quite different devices for identical tasks, such as watching the same movies on both their smartphones and TV. However, the significant differences in screen size, aspect ratio, and interaction styles make it challenging to adapt Graphical User Interfaces (GUIs) across these devices. Although there are millions of apps available on Google Play, only a few thousand are designed to support smart TV displays. Existing techniques to map a mobile app GUI to a TV either adopt a responsive design, which struggles to bridge the substantial gap between phone and TV or use mirror apps for improved video display, which requires hardware support and extra engineering efforts. Instead of developing another app for supporting TVs, we propose a semi-automated approach to generate corresponding adaptive TV GUIs, given the phone GUIs as the input. Based on our empirical study of GUI pairs for TV and phone in existing apps, we synthesize a list of rules for grouping and classifying phone GUIs, converting them to TV GUIs, and generating dynamic TV layouts and source code for the TV display. Our tool is not only beneficial to developers but also to GUI designers, who can further customize the generated GUIs for their TV app development. An evaluation and user study demonstrate the accuracy of our generated GUIs and the usefulness of our tool.
2023-07-23	LiveRetro: Visual Analytics for Strategic Retrospect in Livestream E-Commerce	Yuchen Wu et.al.	2307.12213v1	null	Livestream e-commerce integrates live streaming and online shopping, allowing viewers to make purchases while watching. However, effective marketing strategies remain a challenge due to limited empirical research and subjective biases from the absence of quantitative data. Current tools fail to capture the interdependence between live performances and feedback. This study identified computational features, formulated design requirements, and developed LiveRetro, an interactive visual analytics system. It enables comprehensive retrospective analysis of livestream e-commerce for streamers, viewers, and merchandise. LiveRetro employs enhanced visualization and time-series forecasting models to align performance features and feedback, identifying influences at channel, merchandise, feature, and segment levels. Through case studies and expert interviews, the system provides deep insights into the relationship between live performance and streaming statistics, enabling efficient strategic analysis from multiple perspectives.
2023-07-21	Large Language Model-based System to Provide Immediate Feedback to Students in Flipped Classroom Preparation Learning	Shintaro Uchiyama et.al.	2307.11388v1	null	This paper proposes a system that uses large language models to provide immediate feedback to students in flipped classroom preparation learning. This study aimed to solve challenges in the flipped classroom model, such as ensuring that students are emotionally engaged and motivated to learn. Students often have questions about the content of lecture videos in the preparation of flipped classrooms, but it is difficult for teachers to answer them immediately. The proposed system was developed using the ChatGPT API on a video-watching support system for preparation learning that is being used in real practice. Answers from ChatGPT often do not align with the context of the student's question. Therefore, this paper also proposes a method to align the answer with the context. This paper also proposes a method to collect the teacher's answers to the students' questions and use them as additional guides for the students. This paper discusses the design and implementation of the proposed system.
2023-07-21	OpenGDA: Graph Domain Adaptation Benchmark for Cross-network Learning	Boshen Shi et.al.	2307.11341v1	link	Graph domain adaptation models are widely adopted in cross-network learning tasks, with the aim of transferring labeling or structural knowledge. Currently, there mainly exist two limitations in evaluating graph domain adaptation models. On one side, they are primarily tested for the specific cross-network node classification task, leaving tasks at edge-level and graph-level largely under-explored. Moreover, they are primarily tested in limited scenarios, such as social networks or citation networks, lacking validation of model's capability in richer scenarios. As comprehensively assessing models could enhance model practicality in real-world applications, we propose a benchmark, known as OpenGDA. It provides abundant pre-processed and unified datasets for different types of tasks (node, edge, graph). They originate from diverse scenarios, covering web information systems, urban systems and natural systems. Furthermore, it integrates state-of-the-art models with standardized and end-to-end pipelines. Overall, OpenGDA provides a user-friendly, scalable and reproducible benchmark for evaluating graph domain adaptation models. The benchmark experiments highlight the challenges of applying GDA models to real-world applications with consistent good performance, and potentially provide insights to future research. As an emerging project, OpenGDA will be regularly updated with new datasets and models. It could be accessed from https://github.com/Skyorca/OpenGDA.
2023-07-21	Fused Spectatorship: Designing Bodily Experiences Where Spectators Become Players	Rakesh Patibanda et.al.	2307.11297v1	null	Spectating digital games can be exciting. However, due to its vicarious nature, spectators often wish to engage in the gameplay beyond just watching and cheering. To blur the boundaries between spectators and players, we propose a novel approach called "Fused Spectatorship", where spectators watch their hands play games by loaning bodily control to a computational Electrical Muscle Stimulation (EMS) system. To showcase this concept, we designed three games where spectators loan control over both their hands to the EMS system and watch them play these competitive and collaborative games. A study with 12 participants suggested that participants could not distinguish if they were watching their hands play, or if they were playing the games themselves. We used our results to articulate four spectator experience themes and four fused spectator types, the behaviours they elicited and offer one design consideration to support each of these behaviours. We also discuss the ethical design considerations of our approach to help game designers create future fused spectatorship experiences.
2023-07-20	Underwater 3D positioning on smart devices	Tuochao Chen et.al.	2307.11263v1	null	The emergence of water-proof mobile and wearable devices (e.g., Garmin Descent and Apple Watch Ultra) designed for underwater activities like professional scuba diving, opens up opportunities for underwater networking and localization capabilities on these devices. Here, we present the first underwater acoustic positioning system for smart devices. Unlike conventional systems that use floating buoys as anchors at known locations, we design a system where a dive leader can compute the relative positions of all other divers, without any external infrastructure. Our intuition is that in a well-connected network of devices, if we compute the pairwise distances, we can determine the shape of the network topology. By incorporating orientation information about a single diver who is in the visual range of the leader device, we can then estimate the positions of all the remaining divers, even if they are not within sight. We address various practical problems including detecting erroneous distance estimates, addressing rotational and flipping ambiguities as well as designing a distributed timestamp protocol that scales linearly with the number of devices. Our evaluations show that our distributed system running on underwater deployments of 4-5 commodity smart devices can perform pairwise ranging and localization with median errors of 0.5-0.9 m and 0.9-1.6 m
2023-07-20	Kick Back & Relax: Learning to Reconstruct the World by Watching SlowTV	Jaime Spencer et.al.	2307.10713v1	link	Self-supervised monocular depth estimation (SS-MDE) has the potential to scale to vast quantities of data. Unfortunately, existing approaches limit themselves to the automotive domain, resulting in models incapable of generalizing to complex environments such as natural or indoor settings. To address this, we propose a large-scale SlowTV dataset curated from YouTube, containing an order of magnitude more data than existing automotive datasets. SlowTV contains 1.7M images from a rich diversity of environments, such as worldwide seasonal hiking, scenic driving and scuba diving. Using this dataset, we train an SS-MDE model that provides zero-shot generalization to a large collection of indoor/outdoor datasets. The resulting model outperforms all existing SSL approaches and closes the gap on supervised SoTA, despite using a more efficient architecture. We additionally introduce a collection of best-practices to further maximize performance and zero-shot generalization. This includes 1) aspect ratio augmentation, 2) camera intrinsic estimation, 3) support frame randomization and 4) flexible motion estimation. Code is available at https://github.com/jspenmar/slowtv_monodepth.
2023-07-19	Watch out Venomous Snake Species: A Solution to SnakeCLEF2023	Feiran Hu et.al.	2307.09748v1	link	The SnakeCLEF2023 competition aims at the development of advanced algorithms for snake species identification through the analysis of images and accompanying metadata. This paper presents a method leveraging both images and metadata. Modern CNN models and strong data augmentation are utilized to learn better representations of images. To relieve the challenge of long-tailed distribution, seesaw loss is utilized in our method. We also design a light model to calculate prior probabilities using metadata features extracted from CLIP in the post-processing stage. Besides, we attach more importance to venomous species by assigning venomous species labels to some examples that the model is uncertain about. Our method achieves a 91.31% score on the final metric, which combines F1 and other metrics, on the private leaderboard, placing 1st among the participants. The code is available at https://github.com/xiaoxsparraw/CLEF2023.
2023-07-18	GroupLane: End-to-End 3D Lane Detection with Channel-wise Grouping	Zhuoling Li et.al.	2307.09472v1	null	Efficiency is quite important for 3D lane detection due to practical deployment demand. In this work, we propose a simple, fast, and end-to-end detector that still maintains high detection precision. Specifically, we devise a set of fully convolutional heads based on row-wise classification. In contrast to previous counterparts, ours supports recognizing both vertical and horizontal lanes. Besides, our method is the first one to perform row-wise classification in bird-eye-view. In the heads, we split feature into multiple groups and every group of feature corresponds to a lane instance. During training, the predictions are associated with lane labels using the proposed single-win one-to-one matching to compute loss, and no post-processing operation is demanded for inference. In this way, our proposed fully convolutional detector, GroupLane, realizes end-to-end detection like DETR. Evaluated on 3 real world 3D lane benchmarks, OpenLane, Once-3DLanes, and OpenLane-Huawei, GroupLane adopting ConvNext-Base as the backbone outperforms the published state-of-the-art PersFormer by 13.6% F1 score in the OpenLane validation set. Besides, GroupLane with ResNet18 still surpasses PersFormer by 4.9% F1 score, while the inference speed is nearly 7x faster and the FLOPs is only 13.3% of it.
2023-07-17	Multi-Task Cross-Modality Attention-Fusion for 2D Object Detection	Huawei Sun et.al.	2307.08339v1	null	Accurate and robust object detection is critical for autonomous driving. Image-based detectors face difficulties caused by low visibility in adverse weather conditions. Thus, radar-camera fusion is of particular interest but presents challenges in optimally fusing heterogeneous data sources. To approach this issue, we propose two new radar preprocessing techniques to better align radar and camera data. In addition, we introduce a Multi-Task Cross-Modality Attention-Fusion Network (MCAF-Net) for object detection, which includes two new fusion blocks. These allow for exploiting information from the feature maps more comprehensively. The proposed algorithm jointly detects objects and segments free space, which guides the model to focus on the more relevant part of the scene, namely, the occupied space. Our approach outperforms current state-of-the-art radar-camera fusion-based object detectors in the nuScenes dataset and achieves more robust results in adverse weather conditions and nighttime scenarios.
2023-07-13	Probing the Galactic Halo with RR Lyrae Stars -- V. Chemistry, Kinematics, and Dynamically Tagged Groups	Jonathan Cabrera Garcia et.al.	2307.09572v1	null	We employ a sample of 135,873 RR Lyrae stars (RRLs) with precise photometric-metallicity and distance estimates from the newly calibrated $P$--$\phi_{31}$--$R_{21}$--[Fe/H] and $Gaia$ $G$-band $P$--$R_{21}$--[Fe/H] absolute magnitude-metallicity relations of Li et al., combined with available proper motions from $Gaia$ EDR3, and 6955 systemic radial velocities from $Gaia$ DR3 and other sources, in order to explore the chemistry and kinematics of the halo of the Milky Way (MW). This sample is ideally suited for characterization of the inner- and outer-halo populations of the stellar halo, free from the bias associated with spectroscopically selected probes, and for estimation of their relative contributions as a function of Galactocentric distance. The results of a Gaussian Mixture-Model analysis of these contributions are broadly consistent with other observational studies of the halo, and with expectations from recent MW simulation studies. We apply the HDBSCAN clustering method to the specific energies and cylindrical actions ($E$, $J_{r}$, $J_{\phi}$, $J_{z}$), identifying 97 Dynamically Tagged Groups (DTGs) of RRLs, and explore their associations with recognized substructures of the MW. The precise photometric-distance determinations ($\delta\, d/d < 5\%$), and the resulting high-quality determination of dynamical parameters, yield highly statistically significant (low) dispersions of [Fe/H] for the stellar members of the DTGs compared to random draws from the full sample, indicating that they share common star-formation and chemical histories, influenced by their birth environments.
2023-07-13	Watch Your Pose: Unsupervised Domain Adaption with Pose based Triplet Selection for Gait Recognition	Gavriel Habib et.al.	2307.06751v1	null	Gait Recognition is a computer vision task aiming to identify people by their walking patterns. Existing methods show impressive results on individual datasets but lack the ability to generalize to unseen scenarios. Unsupervised Domain Adaptation (UDA) tries to adapt a model, pre-trained in a supervised manner on a source domain, to an unlabelled target domain. UDA for Gait Recognition is still in its infancy and existing works proposed solutions to limited scenarios. In this paper, we reveal a fundamental phenomenon in adaptation of gait recognition models, in which the target domain is biased to pose-based features rather than identity features, causing a significant performance drop in the identification task. We suggest Gait Orientation-based method for Unsupervised Domain Adaptation (GOUDA) to reduce this bias. To this end, we present a novel Triplet Selection algorithm with a curriculum learning framework, aiming to adapt the embedding space by pushing away samples of similar poses and bringing closer samples of different poses. We provide extensive experiments on four widely-used gait datasets, CASIA-B, OU-MVLP, GREW, and Gait3D, and on three backbones, GaitSet, GaitPart, and GaitGL, showing the superiority of our proposed method over prior works.
2023-07-11	Efficient 3D Articulated Human Generation with Layered Surface Volumes	Yinghao Xu et.al.	2307.05462v1	null	Access to high-quality and diverse 3D articulated digital human assets is crucial in various applications, ranging from virtual reality to social platforms. Generative approaches, such as 3D generative adversarial networks (GANs), are rapidly replacing laborious manual content creation tools. However, existing 3D GAN frameworks typically rely on scene representations that leverage either template meshes, which are fast but offer limited quality, or volumes, which offer high capacity but are slow to render, thereby limiting the 3D fidelity in GAN settings. In this work, we introduce layered surface volumes (LSVs) as a new 3D object representation for articulated digital humans. LSVs represent a human body using multiple textured mesh layers around a conventional template. These layers are rendered using alpha compositing with fast differentiable rasterization, and they can be interpreted as a volumetric representation that allocates its capacity to a manifold of finite thickness around the template. Unlike conventional single-layer templates that struggle with representing fine off-surface details like hair or accessories, our surface volumes naturally capture such details. LSVs can be articulated, and they exhibit exceptional efficiency in GAN settings, where a 2D generator learns to synthesize the RGBA textures for the individual layers. Trained on unstructured, single-view 2D image datasets, our LSV-GAN generates high-quality and view-consistent 3D articulated digital humans without the need for view-inconsistent 2D upsampling networks.
2023-07-10	Active Learning for Video Classification with Frame Level Queries	Debanjan Goswami et.al.	2307.05587v1	null	Deep learning algorithms have pushed the boundaries of computer vision research and have depicted commendable performance in a variety of applications. However, training a robust deep neural network necessitates a large amount of labeled training data, acquiring which involves significant time and human effort. This problem is even more serious for an application like video classification, where a human annotator has to watch an entire video end-to-end to furnish a label. Active learning algorithms automatically identify the most informative samples from large amounts of unlabeled data; this tremendously reduces the human annotation effort in inducing a machine learning model, as only the few samples that are identified by the algorithm, need to be labeled manually. In this paper, we propose a novel active learning framework for video classification, with the goal of further reducing the labeling onus on the human annotators. Our framework identifies a batch of exemplar videos, together with a set of informative frames for each video; the human annotator needs to merely review the frames and provide a label for each video. This involves much less manual work than watching the complete video to come up with a label. We formulate a criterion based on uncertainty and diversity to identify the informative videos and exploit representative sampling techniques to extract a set of exemplar frames from each video. To the best of our knowledge, this is the first research effort to develop an active learning framework for video classification, where the annotators need to inspect only a few frames to produce a label, rather than watching the end-to-end video.
2023-07-07	A Self-Supervised Algorithm for Denoising Photoplethysmography Signals for Heart Rate Estimation from Wearables	Pranay Jain et.al.	2307.05339v1	null	Smart watches and other wearable devices are equipped with photoplethysmography (PPG) sensors for monitoring heart rate and other aspects of cardiovascular health. However, PPG signals collected from such devices are susceptible to corruption from noise and motion artifacts, which cause errors in heart rate estimation. Typical denoising approaches filter or reconstruct the signal in ways that eliminate much of the morphological information, even from the clean parts of the signal that would be useful to preserve. In this work, we develop an algorithm for denoising PPG signals that reconstructs the corrupted parts of the signal, while preserving the clean parts of the PPG signal. Our novel framework relies on self-supervised training, where we leverage a large database of clean PPG signals to train a denoising autoencoder. As we show, our reconstructed signals provide better estimates of heart rate from PPG signals than the leading heart rate estimation methods. Further experiments show significant improvement in Heart Rate Variability (HRV) estimation from PPG signals using our algorithm. We conclude that our algorithm denoises PPG signals in a way that can improve downstream analysis of many different health metrics from wearable devices.
2023-07-07	What makes a successful rebuttal in computer science conferences? : A perspective on social interaction	Junjie Huang et.al.	2307.03371v2	null	With an exponential increase in submissions to top-tier Computer Science (CS) conferences, more and more conferences have introduced a rebuttal stage to the conference peer review process. The rebuttal stage can be modeled as social interactions between authors and reviewers. A successful rebuttal often results in an increased review score after the rebuttal stage. In this paper, we conduct an empirical study to determine the factors contributing to a successful rebuttal using over 3,000 papers and 13,000 reviews from ICLR2022, one of the most prestigious computer science conferences. First, we observe a significant difference in review scores before and after the rebuttal stage, which is crucial for paper acceptance. Furthermore, we investigate factors from the reviewer's perspective using signed social network analysis. A notable finding is the increase in balanced network structure after the rebuttal stage. Subsequently, we evaluate several quantifiable author rebuttal strategies and their effects on review scores. These strategies can help authors in improving their review scores. Finally, we used machine learning models to predict rebuttal success and validated the impact of potential factors analyzed in this paper. Our experiments demonstrate that the utilization of all features proposed in this study can aid in predicting the success of the rebuttal. In summary, this work presents a study on the impact factors of successful rebuttals from both reviewers' and authors' perspectives and lays the foundation for analyzing rebuttals with social network analysis.
2023-07-06	Machine Learning Classification of Repeating FRBs from FRB121102	Bjorn Jasper R. Raquel et.al.	2307.02811v2	null	Fast Radio Bursts (FRBs) are mysterious bursts in the millisecond timescale at radio wavelengths. Currently, there is little understanding about the classification of repeating FRBs, based on difference in physics, which is of great importance in understanding their origin. Recent works from the literature focus on using specific parameters to classify FRBs to draw inferences on the possible physical mechanisms or properties of these FRB subtypes. In this study, we use publicly available 1652 repeating FRBs from FRB121102 detected with the Five-hundred-meter Aperture Spherical Telescope (FAST), and studied them with an unsupervised machine learning model. By fine-tuning the hyperparameters of the model, we found that there is an indication for four clusters from the bursts of FRB121102 instead of the two clusters ("Classical" and "Atypical") suggested in the literature. Wherein, the "Atypical" cluster can be further classified into three sub-clusters with distinct characteristics. Our findings show that the clustering result we obtained is more comprehensive not only because our study produced results which are consistent with those in the literature but also because our work uses more physical parameters to create these clusters. Overall, our methods and analyses produced a more holistic approach in clustering the repeating FRBs of FRB121102.
2023-07-03	MWPRanker: An Expression Similarity Based Math Word Problem Retriever	Mayank Goel et.al.	2307.01240v1	null	Math Word Problems (MWPs) in online assessments help test the ability of the learner to make critical inferences by interpreting the linguistic information in them. To test the mathematical reasoning capabilities of the learners, sometimes the problem is rephrased or the thematic setting of the original MWP is changed. Since manual identification of MWPs with similar problem models is cumbersome, we propose a tool in this work for MWP retrieval. We propose a hybrid approach to retrieve similar MWPs with the same problem model. In our work, the problem model refers to the sequence of operations to be performed to arrive at the solution. We demonstrate that our tool is useful for the mentioned tasks and better than semantic similarity-based approaches, which fail to capture the arithmetic and logical sequence of the MWPs. A demo of the tool can be found at https://www.youtube.com/watch?v=gSQWP3chFIs
2023-06-30	Collapse of Straight Soft Growing Inflated Beam Robots Under Their Own Weight	Ciera McFarland et.al.	2307.00089v1	null	Soft, growing inflated beam robots, also known as everting vine robots, have previously been shown to navigate confined spaces with ease. Less is known about their ability to navigate three-dimensional open spaces where they have the potential to collapse under their own weight as they attempt to move through a space. Previous work has studied collapse of inflated beams and vine robots due to purely transverse or purely axial external loads. Here, we extend previous models to predict the length at which straight vine robots will collapse under their own weight at arbitrary launch angle relative to gravity, inflated diameter, and internal pressure. Our model successfully predicts the general trends of collapse behavior of straight vine robots. We find that collapse length increases non-linearly with the robot's launch angle magnitude, linearly with the robot's diameter, and with the square root of the robot's internal pressure. We also demonstrate the use of our model to determine the robot parameters required to grow a vine robot across a gap in the floor. This work forms the foundation of an approach for modeling the collapse of vine robots and inflated beams in arbitrary shapes.
2023-06-30	INDCOR White Paper 0: Interactive Digital Narratives (IDNs) -- A Solution to the Challenge of Representing Complex Issues	Hartmut Koenitz et.al.	2306.17498v1	null	Citizens everywhere have the right to be well-informed. Yet, with the high complexity of many contemporary issues, such as global warming and migration, our means of information need to mutually adapt. Narrative has always been at the core of information exchange - regardless of whether our ancestors sat around a fire and exchanged stories, or whether we read an article in a newspaper, or watched a TV news broadcast. Yet, the narrative formats of the newspaper article, the news broadcast, the documentary, and the textbook are severely limited when it comes to representing highly complex topics which may include several competing - and sometimes equally valid - perspectives. Such complexity contributes to a high level of uncertainty due to a multitude of factors affecting an outcome. Fortunately, with Interactive Digital Narrative (IDN), there is a novel media format which can address these challenges. IDNs can present several different perspectives in the same work, and give audiences the ability to explore them at will through decision-making. After experiencing the consequences of their decisions, the audience can replay to revisit and change these decisions in order to consider their alternatives. IDN works enable deep personalization and the inclusion of live data. These capabilities make IDN a 21st century democratic medium, empowering citizens through the understanding of complex issues. In this white paper, we discuss the challenge of representing complexity, describe the advantages offered by IDNs, and point out opportunities and strategies for deployment.
2023-06-30	24 New Light Curves and Updated Ephemeris using EXOTIC for WASP-12b	Avinash S. Nediyedath et.al.	2306.17473v2	null	NASA citizen scientists from all over the world have used EXOplanet Transit Interpretation Code (EXOTIC) to reduce 71 sets of time-series images of WASP-12 taken by the 6-inch telescope operated by the Centre of Astrophysics
2023-06-30	Leveraging Watch-time Feedback for Short-Video Recommendations: A Causal Labeling Framework	Yang Zhang et.al.	2306.17426v1	null	With the proliferation of short video applications, the significance of short video recommendations has vastly increased. Unlike other recommendation scenarios, short video recommendation systems heavily rely on feedback from watch time. Existing approaches simply treat watch time as a direct label, failing to effectively harness its extensive semantics and introducing bias, thereby limiting the potential for modeling user interests based on watch time. To overcome this challenge, we propose a framework named Debiased Multiple-semantics-extracting Labeling (DML). DML constructs labels that encompass various semantics by utilizing quantiles derived from the distribution of watch time, prioritizing relative order rather than absolute label values. This approach facilitates easier model learning while aligning with the ranking objective of recommendations. Furthermore, we introduce a method inspired by causal adjustment to refine label definitions, thereby reducing the impact of bias on the label and directly mitigating bias at the label level. We substantiate the effectiveness of our DML framework through both online and offline experiments. Extensive results demonstrate that our DML could effectively leverage watch time to discover users' real interests, enhancing their engagement in our application.
2023-06-29	13 New Light Curves and Updated Mid-Transit Time and Period for Hot Jupiter WASP-104 b with EXOTIC	Heather B. Hewitt et.al.	2306.17251v1	null	Using the EXOplanet Transit Interpretation Code (EXOTIC), we reduced 52 sets of images of WASP-104 b, a Hot Jupiter-class exoplanet orbiting WASP-104, in order to obtain an updated mid-transit time (ephemeris) and orbital period for the planet. We performed this reduction on images taken with a 6-inch telescope of the Center for Astrophysics

huawei band

Publish Date	Title	Authors	PDF	Code	Abstract
2023-07-27	Unpinned Dirac-Fermions in Carbon-Phosphorous-Arsenic Based Ternary Monolayer	Amrendra Kumar et.al.	2307.15001v1	null	We predict energetically and dynamically stable ternary Carbon-Phosphorous-Arsenic (CPAs2) monolayers in buckled geometric structure by employing density functional theory based calculations. We consider three different symmetric configurations, namely, inversion (i), mirror (m) and rotational (r). The low-energy dispersions in electronic band structure and density of states (DOS) around the Fermi level contain two contrasting features: (a) parabolic dispersion around highly symmetric Gamma point with a step function in DOS due to nearly-free-particle-like Schroedinger-Fermions and (b) linear dispersion around highly symmetric K point with linear DOS due to massless Dirac-Fermions for i-CPAs2 monolayer. The step function in DOS is a consequence of two-dimensionality of the system in which the motion of nearly-free-particles is confined. However, a closer look at (b) reveals that the ternary monolayers possess distinct characters, namely (i) massless-gapless, (ii) slightly massive-gapped and (iii) unpinned massless-gapless Dirac-Fermions for i, m and r-CPAs2 configurations respectively. Thus, the nature of states around the Fermi level depends crucially on the symmetry of systems. In addition, we probe the influence of mechanical strain on the properties of CPAs2 monolayer. The results indicate that the characteristic dispersions of (a) and (b) move in opposite directions in energy which leads to a metal-to-semimetal transition in i and r-CPAs2 configurations, for a few percentages of tensile strain. On the other hand, a strain induced metal-to-semiconductor transition is observed in m-CPAs2 configuration with a tunable energy band gap. Interestingly, unlike graphene, the Dirac cones can be unpinned from highly symmetric K (and K') point, but they are restricted to move along the edges (K-M'-K') of first Brillouin zone due to C2 symmetry in i and r-CPAs2 configurations.
2023-07-27	Learning locally dominant force balances in active particle systems	Dominik Sturm et.al.	2307.14970v1	null	We use a combination of unsupervised clustering and sparsity-promoting inference algorithms to learn locally dominant force balances that explain macroscopic pattern formation in self-organized active particle systems. The self-organized emergence of macroscopic patterns from microscopic interactions between self-propelled particles can be widely observed in nature. Although hydrodynamic theories help us better understand the physical basis of this phenomenon, identifying a sufficient set of local interactions that shape, regulate, and sustain self-organized structures in active particle systems remains challenging. We investigate a classic hydrodynamic model of self-propelled particles that produces a wide variety of patterns, like asters and moving density bands. Our data-driven analysis shows that propagating bands are formed by local alignment interactions driven by density gradients, while steady-state asters are shaped by a mechanism of splay-induced negative compressibility arising from strong particle interactions. Our method also reveals analogous physical principles of pattern formation in a system where the speed of the particle is influenced by local density. This demonstrates the ability of our method to reveal physical commonalities across models. The physical mechanisms inferred from the data are in excellent agreement with analytical scaling arguments and experimental observations.
2023-07-27	A full-resolution training framework for Sentinel-2 image fusion	Matteo Ciotola et.al.	2307.14864v1	null	This work presents a new unsupervised framework for training deep learning models for super-resolution of Sentinel-2 images by fusion of its 10-m and 20-m bands. The proposed scheme avoids the resolution downgrade process needed to generate training data in the supervised case. On the other hand, a proper loss that accounts for cycle-consistency between the network prediction and the input components to be fused is proposed. Despite its unsupervised nature, in our preliminary experiments the proposed scheme has shown promising results in comparison to the supervised approach. Besides, by construction of the proposed loss, the resulting trained network can be ascribed to the class of multi-resolution analysis methods.
2023-07-27	Modeling Interference for the Coexistence of 6G Networks and Passive Sensing Systems	Paolo Testolina et.al.	2307.14848v1	null	Future wireless networks and sensing systems will benefit from access to large chunks of spectrum above 100 GHz, to achieve terabit-per-second data rates in 6th Generation (6G) cellular systems and improve accuracy and reach of Earth exploration and sensing and radio astronomy applications. These are extremely sensitive to interference from artificial signals, thus the spectrum above 100~GHz features several bands which are protected from active transmissions under current spectrum regulations. To provide more agile access to the spectrum for both services, active and passive users will have to coexist without harming passive sensing operations. In this paper, we provide the first, fundamental analysis of Radio Frequency Interference (RFI) that large-scale terrestrial deployments introduce in different satellite sensing systems now orbiting the Earth. We develop a geometry-based analysis and extend it into a data-driven model which accounts for realistic propagation, building obstruction, ground reflection, for network topology with up to $10^5$ nodes in more than $85$ km$^2$. We show that the presence of harmful RFI depends on several factors, including network load, density and topology, satellite orientation, and building density. The results and methodology provide the foundation for the development of coexistence solutions and spectrum policy towards 6G.
2023-07-27	Eigenenergy braids in 2D photonic crystals	Janet Zhong et.al.	2307.14845v1	null	We consider non-Hermitian energy band theory in two-dimensional systems, and study eigenenergy braids on slices in the two-dimensional Brillouin zone. We show the consequences of reciprocity and geometric symmetry on such eigenenergy braids. The point-gap topology of the energy bands can be found from the projection of the eigenenergy braid onto the complex energy plane. We show that the conjugacy class transitions in the eigenenergy braid results in the changes in the number of bands in a complete point-gap loop. This transition occurs at exceptional points. We numerically demonstrate these concepts using two-dimensional reciprocal and nonreciprocal photonic crystals.
2023-07-27	Triaxial projected shell model approach for negative parity states in even-even nuclei	Nazira Nazir et.al.	2307.14827v1	null	The triaxial projected shell model (TPSM) approach is generalized to investigate the negative parity band structures in even-even systems. In the earlier version of the TPSM approach, the quasiparticle excitations were restricted to one major oscillator shell and it was possible to study only $+$ive parity states in even-even systems. In the present extension, the excited quasiparticles are allowed to occupy two major oscillator shells, which makes it possible to generate the $-$ive parity states. As a major application of this development, the extended approach is applied to elucidate the $-$ive parity high-spin band structures in $^{102-112}$Ru. It is shown that TPSM approach provides a reasonable description of the observed properties.
2023-07-27	Interfacial Resonance States-Induced Negative Tunneling Magneto-resistance in Orthogonally-Magnetized CoFeB/MgO/CoFeB	Puyang Huang et.al.	2307.14807v1	null	Magnetic tunneling junctions (MTJs) are essential for non-volatile magneto-resistive random access memory (MRAM) applications. Here, we report the observation of a large negative tunneling magneto-resistance (TMR) in the CoFeB/MgO/CoFeB system with an orthogonally-magnetized configuration. Through the thickness modulation of the MgO barrier, the negative TMR component can be enhanced up to 20% under a negative voltage bias. Moreover, the tunnel anisotropic magneto-resistance measurements unveil that the negative TMR component likely arises from the interfacial resonance states (IRS) in the minority band of the bottom ferromagnetic layer. Complementary first principle calculations further quantify the IRS location and strength with respect to the Fermi level position. Our work not only confirms the vital role of IRS in the electrical transport of MTJ, but also provides valuable insights for the design of new-generation voltage-controlled MRAM and related spintronic applications.
2023-07-27	Broadband parametric amplification for multiplexed SiMOS quantum dot signals	Victor Elhomsy et.al.	2307.14717v1	null	Spins in semiconductor quantum dots hold great promise as building blocks of quantum processors. Trapping them in SiMOS transistor-like devices eases future industrial scale fabrication. Among the potentially scalable readout solutions, gate-based dispersive radiofrequency reflectometry only requires the already existing transistor gates to readout a quantum dot state, relieving the need for additional elements. In this effort towards scalability, traveling-wave superconducting parametric amplifiers significantly enhance the readout signal-to-noise ratio (SNR) by reducing the noise below typical cryogenic low-noise amplifiers, while offering a broad amplification band, essential to multiplex the readout of multiple resonators. In this work, we demonstrate a 3GHz gate-based reflectometry readout of electron charge states trapped in quantum dots formed in SiMOS multi-gate devices, with SNR enhanced thanks to a Josephson traveling-wave parametric amplifier (JTWPA). The broad, tunable 2GHz amplification bandwidth combined with more than 10dB ON/OFF SNR improvement of the JTWPA enables frequency and time division multiplexed readout of interdot transitions, and noise performance near the quantum limit. In addition, owing to a design without superconducting loops and with a metallic ground plane, the JTWPA is flux insensitive and shows stable performances up to a magnetic field of 1.2T at the quantum dot device, compatible with standard SiMOS spin qubit experiments.
2023-07-27	Two dimensional lattice with an imaginary magnetic field	Tomoki Ozawa et.al.	2307.14635v1	null	We explore gauge-independent properties of two-dimensional non-Hermitian lattice systems with an imaginary magnetic field. We find that the energy spectrum under the open boundary conditions is an example of such gauge-independent properties. We discuss how to obtain the asymptotic continuum energy spectrum upon increasing length of one side using the framework of the non-Bloch band theory. We also find an analog of the Aharonov-Bohm effect; the net change of the norm of the wavefunction upon adiabatically forming a closed path is determined by the imaginary magnetic flux enclosed by the path.
2023-07-27	Demonstrating Quantum Computation for Quasiparticle Band Structures	Takahiro Ohgoe et.al.	2307.14607v1	null	Understanding and predicting the properties of solid-state materials from first-principles has been a great challenge for decades. Owing to the recent advances in quantum technologies, quantum computations offer a promising way to achieve this goal. Here, we demonstrate the first-principles calculation of a quasiparticle band structure on actual quantum computers. This is achieved by hybrid quantum-classical algorithms in conjunction with qubit-reduction and error-mitigation techniques. Our demonstration will pave the way to practical applications of quantum computers.
2023-07-27	White-light superflare and long-term activity of the nearby M7 type binary EI~Cnc observed with GWAC system	Hua-Li Li et.al.	2307.14594v1	null	Stellar white-light flares are believed to play an essential role in the physical and chemical properties of the atmosphere of the surrounding exoplanets. Here we report an optical monitoring campaign on the nearby flaring system EI~Cnc carried out by the Ground-based Wide Angle Cameras (GWAC) and its dedicated follow-up telescope. A superflare, coming from the brighter component EI~CncA, was detected and observed, in which four components are required to properly model the complex decay light curve. The lower limit of flare energy in the $R-$band is estimated to be $3.3\times10^{32}$ ergs. 27 flares are additionally detected from the GWAC archive data with a total duration of 290 hours. The inferred cumulative flare frequency distribution follows a quite shallow power-law function with a slope of $\beta=-0.50\pm 0.03$ over the energy range between $10^{30}$ and $10^{33}$ erg, which reinforces the trend that stars cooler than M4 show enhanced superflare activity. The flares identified in EI~Cnc enable us to extend the $\tau-E$ relationship previously established in the white-light superflares of solar-type stars down to an energy as low as $\sim10^{30}$erg (i.e., by three orders): $\tau\propto E^{0.42\pm0.02}$, which suggests a common flare mechanism for stars with a type from M to solar-like, and implies an invariant of $B^{1/3}\upsilon_{\rm A}$ in the white-light flares.
2023-07-26	Dust enrichment and grain growth in a smooth disk around the DG Tau protostar revealed by ALMA triple bands frequency observations	Satoshi Ohashi et.al.	2307.14526v1	null	Characterizing the physical properties of dust grains in a protoplanetary disk is critical to comprehending the planet formation process. Our study presents ALMA high-resolution observations of the young protoplanetary disk around DG Tau at a 1.3 mm dust continuum. The observations, with a spatial resolution of $\approx 0.04''$, or $\approx5$ au, revealed a geometrically thin and smooth disk without substantial substructures, suggesting that the disk retains the initial conditions of the planet formation. To further analyze the distributions of dust surface density, temperature, and grain size, we conducted a multi-band analysis with several dust models, incorporating ALMA archival data of the 0.87 mm and 3.1 mm dust polarization. The results showed that the Toomre $Q$ parameter is $\lesssim2$ at a 20 au radius, assuming a dust-to-gas mass ratio of 0.01. This implies that a higher dust-to-gas mass ratio is necessary to stabilize the disk. The grain sizes depend on the dust models, and for the DSHARP compact dust, they were found to be smaller than $\sim400$ $\mu$m in the inner region ($r\lesssim20$ au), while exceeding 3 mm in the outer part. Radiative transfer calculations show that the dust scale height is lower than at least one-third of the gas scale height. These distributions of dust enrichment, grain sizes, and weak turbulence strength may have significant implications for the formation of planetesimals through mechanisms such as streaming instability. We also discuss the CO snowline effect and collisional fragmentation in dust coagulation for the origin of the dust size distribution.
2023-07-26	Competition between fractional quantum Hall liquid and electron solid phases in the Landau levels of multilayer graphene	Rakesh K. Dora et.al.	2307.14519v1	null	We study the competition between the electron liquid and solid phases, such as Wigner crystal and bubbles, in partially filled Landau levels (LLs) of multilayer graphene. Graphene systems offer a versatile platform for controlling band dispersion by varying the number of its stacked layers. The band dispersion determines the LL wave functions, and consequently, the LL-projected Coulomb interaction in graphene and its multilayers is different from that in conventional semiconductors like GaAs. As a result, the energies of the liquid and solid phases are different in the different LLs of multilayer graphene, leading to a new phase diagram for the stability of these phases, which we work out. The phase diagram of competing solid and liquid phases in the LLs of monolayer graphene has been studied previously. Here, we primarily consider $AB{-}$ or Bernal$-$stacked bilayer graphene (BLG) and $ABC{-}$stacked trilayer graphene (TLG) and focus on the Laughlin fractions. We determine the cohesive energy of the solid phase using the Hartree-Fock approximation while the energy of the Laughlin liquid is computed analytically via the plasma sum rules. We find that at the Laughlin fillings, the electron liquid phase has the lowest energy among the phases considered in the $\mathcal{N}{=}0, 1, 2$ LLs of BLG, as well as in the $\mathcal{N}{=}3, 4$ LLs of TLG, while in the $\mathcal{N}{>}2$ LLs of BLG and $\mathcal{N}{>}4$ LLs of TLG, the solid phases are more favorable. We also discuss the effect of impurities on the above-mentioned phase diagram.
2023-07-26	CEERS MIRI Imaging: Data Reduction and Quality Assessment	Guang Yang et.al.	2307.14509v1	null	The Cosmic Evolution Early Release Science Survey (CEERS), targeting the Extended Groth Strip extragalactic field, is one of the JWST Director's Discretionary Early Release Science programs. To date, all observations have been executed and include NIRCam/MIRI imaging and NIRSpec/NIRCam spectroscopic exposures. Here, we discuss the MIRI imaging, which includes eight pointings, four of which provide deep imaging with the bluer bands (F560W, F770W) and four with contiguous wavelength coverage in F1000W, F1280W, F1500W, and F1800W, where two of these also include coverage in F770W and F2100W. We present a summary of the data, the data quality, and data reduction. The data reduction is based on the JWST Calibration Pipeline combined with custom modifications and additional steps designed to enhance the output quality, including improvements in astrometry and the removal of detector artifacts. We estimate the image depth of the reduced mosaics, and show that these generally agree with expectations from the Exposure Time Calculator. We compare the MIRI F560W and F770W flux densities for bright sources to measurements from Spitzer/IRAC Ch3 (5.8 $\mu$m) and Ch4 (8.0 $\mu$m), and we find that they agree with systematic differences of $<0.1$ mag. For the redder MIRI bands, we assess their quality by studying the spectral energy distributions (SEDs) of Galactic stars. The SEDs are consistent with the expected Rayleigh-Jeans law with a deviation $\sim 0.03$ mag, indicating that the MIRI colors are reliable. We also discuss all publicly released data products (images and source catalogs), which are available on the CEERS website (https://ceers.github.io/).
2023-07-26	AutoSourceID-Classifier. Star-Galaxy Classification using a Convolutional Neural Network with Spatial Information	F. Stoppa et.al.	2307.14456v1	null	Aims. Traditional star-galaxy classification techniques often rely on feature estimation from catalogues, a process susceptible to introducing inaccuracies, thereby potentially jeopardizing the classification's reliability. Certain galaxies, especially those not manifesting as extended sources, can be misclassified when their shape parameters and flux solely drive the inference. We aim to create a robust and accurate classification network for identifying stars and galaxies directly from astronomical images. By leveraging convolutional neural networks (CNN) and additional information about the source position, we aim to accurately classify all stars and galaxies within a survey, particularly those with a signal-to-noise ratio (S/N) near the detection limit. Methods. The AutoSourceID-Classifier (ASID-C) algorithm developed here uses 32x32 pixel single filter band source cutouts generated by the previously developed ASID-L code. ASID-C utilizes CNNs to distinguish these cutouts into stars or galaxies, leveraging their strong feature-learning capabilities. Subsequently, we employ a modified Platt Scaling calibration for the output of the CNN. This technique ensures that the derived probabilities are effectively calibrated, delivering precise and reliable results. Results. We show that ASID-C, trained on MeerLICHT telescope images and using the Dark Energy Camera Legacy Survey (DECaLS) morphological classification, outperforms similar codes like SourceExtractor. ASID-C opens up new possibilities for accurate celestial object classification, especially for sources with a S/N near the detection limit. Potential applications of ASID-C, like real-time star-galaxy classification and transient's host identification, promise significant contributions to astronomical research.
2023-07-26	Tunable Magnon-Photon Coupling by Magnon Band Gap in a Layered Hybrid Perovskite Antiferromagnet	Yi Li et.al.	2307.14447v1	null	Tunability of coherent coupling between fundamental excitations is an important prerequisite for expanding their functionality in hybrid quantum systems. In hybrid magnonics, the dipolar interaction between magnon and photon usually persists and cannot be switched off. Here, we demonstrate this capability by coupling a superconducting resonator to a layered hybrid perovskite antiferromagnet, which exhibits a magnon band gap due to its intrinsic Dzyaloshinskii-Moriya interaction. The pronounced temperature sensitivity of the magnon band gap location allows us to set the photon mode within the gap and to disable magnon-photon hybridization. When the resonator mode falls into the magnon band gap, the resonator damping rate increases due to the nonzero coupling to the detuned magnon mode. This phenomenon can be used to quantify the magnon band gap using an analytical model. Our work brings new opportunities in controlling coherent information processing with quantum properties in complex magnetic materials.
2023-07-26	JWST Photometry of Globular Cluster Populations in Abell 2744 at $z=0.3$	William E. Harris et.al.	2307.14412v1	null	JWST imaging of the rich galaxy cluster Abell 2744 at $z=0.308$ has been used by the UNCOVER team (Bezanson et al. 2022) to construct mosaic images in the NIRCAM filters. The exceptionally deep images in the ($F115W$, $F150W$, $F200W$) bands reveal a large population of unresolved pointlike sources across the field, the vast majority of which are globular clusters (GCs). To the limits of our photometry, more than 10,000 such objects were measured, most of which are in the halos of the five largest A2744 galaxies but which also include GCs around some satellite galaxies and throughout the IntraCluster Medium. Their luminosity function follows a lognormal shape, with the data reaching to within one magnitude of the classic GCLF turnover point. The colour index ($F115W-F200W$) in particular covers a range of $0.5$ mag, clearly resolving the expected internal spread of GC metallicities. The estimated GC masses are systematically higher than in present-day galaxies, consistent with a large, normal GC population seen at a $3.5~$Gyr earlier stage of dynamical evolution. Lastly, the spatial distribution of the bluer (more metal-poor) GCs resembles the gravitational lensing map of the cluster, consistent with recent theoretical suggestions.
2023-07-26	Non-Hermitian tearing by dissipation	Qian Du et.al.	2307.14340v1	null	In the paper, we study the non-Hermitian system under dissipation in which the energy band shows an imaginary line gap and energy eigenstates are bound to a specific region. To describe these phenomena, we propose the concept of "non-Hermitian tearing", in which the degree of tearing we defined reveals a continuous phase transition at the exceptional point. The non-Hermitian tearing manifests in two forms -- bulk state separation and boundary state decoupling. For a deeper understanding of non-Hermitian tearing, we give the effective $2\times 2$ Hamiltonian in the k-space by reducing the $N\times N$ Hamiltonian in the real space. In addition, we also explore the non-Hermitian tearing in the one-dimensional Su-Schrieffer-Heeger model and the Qi-Wu-Zhang model. Our results provide a theoretical approach for studying non-Hermitian tearing in more complex systems.
2023-07-26	Non-chiral one-dimensional states propagating inside AB/BA domain walls in bilayer graphene	V. V. Enaldiev et.al.	2307.14293v1	null	Boundaries between structural twins of bilayer graphene (so-called AB/BA domain walls) are often discussed in terms of the formation of topologically protected valley-polarised chiral states. Here, we show that, depending on the width of the AB/BA boundary, the latter can also support non-chiral one-dimensional (1D) states that are confined to the domain wall at low energies and take the form of quasi-bound states at higher energies, where the 1D bands cross into the two-dimensional spectral continuum. We present the results of modeling of electronic properties of AB/BA domain walls with and without magnetic field as a function of their width and interlayer bias.
2023-07-26	Topological Insulators	Yoichi Ando et.al.	2307.14196v1	null	Topological insulators are characterized by an insulating bulk and a conducting surface; the latter is a necessary consequence of the nontrivial topology of the wavefunctions forming the valence band. This chapter gives a historical overview of the discovery of topological insulators and a concise description of the $Z_2$ topology which defines them. The concept of topological insulators has been extended to various other topologies, giving rise to the recognition of further topological states of matter such as topological crystalline insulators and higher-order topological insulators. Representative materials of topological insulators, their synthesis techniques, and the ways for the experimental confirmation of the topological nature are introduced. Among the interesting phenomena derived from topological insulators, topological superconductivity, Majorana zero modes, and quantum anomalous Hall effect are briefly discussed.
2023-07-26	Non-Hermitian Chiral Edge Modes With Complex Fermi Velocity	Fei Yang et.al.	2307.14144v1	null	Recently, much attention has been paid to uncovering the influence of dissipation on a quantum system, particularly on how the non-Hermitian (NH) terms modify the band topology of topological materials and reshape the profile of the wavefunctions of a system (or the NH skin effect). In this paper, a specific NH skin effect that is induced by local dissipation is studied for chiral edge modes, in which the NH term corresponds to the local imaginary Fermi velocity of the chiral edge modes. By solving the NH Schr\"{o}dinger equation of the non-Hermitian chiral edge modes (nhCEs) with complex Fermi velocity, we uncovered the remarkable complex spectra and the wavefunctions of the nhCEs. We find that the complex spectrum of these modes is a straight line in the topological materials, and its chirality can separate the modes with positive energy from those with negative energy, in which they are localized at different positions. We also studied the nhCEs at the boundary of 2-dimensional (2D) topological materials, the 2D $p$-wave superconductor and the Qi-Wu-Zhang model, in which the general law of nhCEs was verified. We expect that our findings will pave the way for researching the transport properties of the chiral edge modes in the non-equilibrium context.
2023-07-26	Still alive and kicking: A significant outburst in changing-look AGN Mrk 1018	R. Brogan et.al.	2307.14139v1	null	Changing-look active galactic nuclei (CL-AGN) have been observed to change optical spectral type. Mrk 1018 is unique: first classified as a type 1.9 Seyfert galaxy, it transitioned to a type 1 before returning to its initial classification after approximately 30 years. We present a high-cadence monitoring programme that caught a major outburst in 2020. Due to sunblock, only the decline could be observed. We studied X-ray, UV, optical, and IR before and after the outburst to investigate the responses of the AGN structures. We derived a u'-band light curve of the AGN contribution alone. The flux increased by a factor of the order of 13. We confirmed this in other optical bands and determined the shape and speed of the decline in each waveband. The shapes of H beta and H alpha were analysed before and after the event. Two XMM-Newton observations from before and after the outburst were also exploited. The outburst is asymmetric, with a swifter rise than decline. The decline is best fit by a linear function, ruling out a tidal disruption event. The optical spectrum shows no change approximately 8 months before and 17 months after. The UV flux increased slightly after the outburst but the X-ray primary flux is unchanged. However, the 6.4 keV Iron line has doubled in strength. IR data taken 13 days after the observed optical peak show an increased emission level. Calculating the distance of the broad-line region and inner edge of the torus from the supermassive black hole can explain the multi-wavelength response to the outburst, in particular: i) the unchanged H beta and H alpha lines, ii) the unchanged primary X-ray spectral components, iii) the rapid and extended infrared response, as well as iv) the enhanced emission of the reflected 6.4 keV line. The outburst was due to a dramatic and short-lasting change in the intrinsic accretion rate. We discuss different models as potential causes.
2023-07-26	Erbium emitters in commercially fabricated nanophotonic silicon waveguides	Stephan Rinner et.al.	2307.14017v1	null	Quantum memories integrated into nanophotonic silicon devices are a promising platform for large quantum networks and scalable photonic quantum computers. In this context, erbium dopants are particularly attractive, as they combine optical transitions in the telecommunications frequency band with the potential for second-long coherence time. Here we show that these emitters can be reliably integrated into commercially fabricated low-loss waveguides. We investigate several integration procedures and obtain ensembles of many emitters with an inhomogeneous broadening of < 2 GHz and a homogeneous linewidth of < 30 kHz. We further observe the splitting of the electronic spin states in a magnetic field up to 9 T that freezes paramagnetic impurities. Our findings are an important step towards long-lived quantum memories that can be fabricated on a wafer-scale using CMOS technology.
2023-07-26	Sub-second periodic radio oscillations in a microquasar	Pengfu Tian et.al.	2307.14015v1	null	Powerful relativistic jets are one of the ubiquitous features of accreting black holes in all scales. GRS 1915+105 is a well-known fast-spinning black-hole X-ray binary with a relativistic jet, termed as a ``microquasar'', as indicated by its superluminal motion of radio emission. It exhibits persistent x-ray activity over the last 30 years, with quasi-periodic oscillations of $\sim 1-10$ Hz and 34 and 67 Hz in the x-ray band. These oscillations likely originate in the inner accretion disk, but other origins have been considered. Radio observations found variable light curves with quasi-periodic flares or oscillations with periods of $\sim 20-50$ minutes. Here we report two instances of $\sim$5 Hz transient periodic oscillation features from the source detected in the 1.05-1.45 GHz radio band that occurred in January 2021 and June 2022, respectively. Circular polarization was also observed during the oscillation phase.
2023-07-26	Normal state quantum geometry and superconducting domes in (111) oxide interfaces	Florian Simon et.al.	2307.13993v1	null	We theoretically investigate the influence of the normal state quantum geometry on the superconducting phase in (111) oriented oxide interfaces and discuss some of its implications in the case of the $\text{LaAlO}_3/\text{SrTiO}_3$ (LAO/STO) heterostructure. Based on a tight-binding representation of this interface, we introduce a low-energy model for which we compute the quantum geometry of the lowest band. The quantum metric exhibits a high peak around the $\Gamma$ point, owing to the closeness of the band to a degeneracy point, while the Berry curvature is negligible. We then compute the conventional and geometric contributions to the superfluid weight. The conventional part increases linearly with the chemical potential $\mu$, a generic behaviour for Schr\"odinger-like bands. The geometric part shows a dome upon varying $\mu$, and we argue that this is a generic behaviour when the quantum metric is peaked at the zero-filling point (where the filling starts). Both contributions can be of the same order when we include disorder effects, yielding a dome-shaped superfluid weight as a function of the chemical potential. Experimentally, a dome-shaped superconducting temperature is observed when the gate voltage $V_g$ is changed. We suggest that this effect stems from the variation of the chemical potential with $V_g$ and that it mirrors the evolution of the conventional part of the superfluid weight up to optimal doping. Furthermore, we propose that a second superconducting dome could be found at larger values of $V_g$ , as a result of the dominant contribution of the geometric superfluid weight, which would also matter in saturating the overdoped regime of the observed dome. Such features would underscore the impact of the normal state quantum geometry on the superconducting state.
2023-07-26	Inter-orbital Cooper pairing at finite energies in Rashba surface states	Philipp Rüßmann et.al.	2307.13990v1	null	Multi-band effects in hybrid structures provide a rich playground for unconventional superconductivity. We combine two complementary approaches based on density-functional theory (DFT) and effective low-energy model theory in order to investigate the proximity effect in a Rashba surface state in contact to an $s$-wave superconductor. We discuss these synergistic approaches and combine the effective model and DFT analysis at the example of a Au/Al heterostructure. This allows us to predict finite-energy superconducting pairing due to the interplay of the Rashba surface state of Au, and hybridization with the electronic structure of superconducting Al. We investigate the nature of the induced superconducting pairing and quantify its mixed singlet-triplet character. Our findings demonstrate general recipes to explore real material systems that exhibit inter-orbital pairing away from the Fermi energy.
2023-07-26	Towards a cosmic ray composition measurement with the IceAct telescopes at the IceCube Neutrino Observatory	Larissa Paul et.al.	2307.13965v1	null	The IceCube Neutrino Observatory is equipped with the unique possibility to measure cosmic ray induced air showers simultaneously by their particle footprint on the surface with the IceTop detector and by the high-energy muonic shower component at a depth of more than 1.5 km. Since 2019 additionally two Imaging Air Cherenkov Telescopes, called IceAct, measure the electromagnetic component of air showers in the atmosphere above the IceCube detector. This opens the possibility to measure air shower parameters in three independent detectors and allows to improve mass composition studies with the IceCube data. One IceAct camera consists of 61 SiPM pixels in a hexagonal grid. Each pixel has a field of view of 1.5 degree resulting in an approximately 12-degree field of view per camera. A single telescope tube has a diameter of 50 cm, is built robust enough to withstand the harsh Antarctic conditions, and is able to detect cosmic ray particles with energies above approximately 10 TeV. A Graph Neural Network (GNN) is trained to determine the air shower properties from IceAct data. The composition analysis is then performed using Random Forest Regression (RF). Since all three detectors have a different energy threshold, we train several RFs with different inputs, combining the different detectors and taking advantage of the lower energy threshold of the IceAct telescopes. This will result in composition measurements for different detector combinations and enables cross-checks of the results in overlapping energy bands. We present the method, parameters for data selection, and the status of this analysis.
2023-07-26	Detection of a strong ~2.5 Hz modulation in the Newly Discovered Millisecond Pulsar MAXI J1816-195	P. P. Li et.al.	2307.13955v1	null	MAXI J1816-195 is a newly discovered accreting millisecond X-ray pulsar that went into outburst in June 2022. Through timing analysis with NICER and NuSTAR observations, we find a transient modulation at ~2.5 Hz during the decay period of MAXI J1816-195. The modulation is strongly correlated with a spectral hardening, and its fractional rms amplitude increases with energy. These results suggest that the modulation is likely to be produced in an unstable corona. In addition, the presence of the modulation during thermonuclear bursts indicates that it may originate from a disk-corona where the optical depth is likely the main factor affecting the modulation, rather than temperature. Moreover, we find significant reflection features in the spectra observed simultaneously by NICER and NuSTAR, including a relativistically broadened Fe-K line around 6-7 keV, and a Compton hump in the 10-30 keV energy band. The radius of the inner disc is constrained to be Rin = (1.04-1.23) RISCO based on reflection modeling of the broadband spectra. Assuming that the inner disc is truncated at the magnetosphere radius, we estimate that the magnetic field strength is < 4.67 * 10e8 G.
2023-07-25	Fast Fabrication of WS2/Bi2Se3 Heterostructures for High Performance Photodetection	Fan Li et.al.	2307.13852v1	null	Two-dimensional (2D) material heterostructures have attracted considerable attention owing to their interesting and novel physical properties, which expand the possibilities for future optoelectronic, photovoltaic, and nanoelectronic applications. A portable, fast, and deterministic transfer technique is highly needed for the fabrication of heterostructures. Herein, we report a fast half wet poly(dimethylsiloxane) (PDMS) transfer process utilizing the change of adhesion energy with the help of micron-sized water droplets. Using this method, a vertical stacking of the WS2/Bi2Se3 heterostructure with a straddling band configuration is successfully assembled on a fluorophlogopite substrate. Thanks to the complementary band gaps and high efficiency of interfacial charge transfer, the photodetector based on the heterostructure exhibits a superior responsivity of 109.9 A/W for a visible incident light at 473 nm and 26.7 A/W for a 1064 nm near-infrared illumination. Such high photoresponsivity of the heterostructure demonstrates that our transfer method is not only time-efficient but also ensures high quality of the heterointerface. Our study may open new pathways to the fast and massive fabrication of various vertical 2D heterostructures for applications in twistronics/valleytronics and other band engineering devices.
2023-07-25	Magnetotransport and Berry phase Tuning in Gd-doped Bi2Se3 Topological Insulator Single Crystals	Lei Chen et.al.	2307.13847v1	null	The Berry phase is an important concept in solids, correlated to the band topology, axion electrodynamics and potential applications of topological materials. Here, we investigate the magnetotransport and Berry phase of rare earth element Gd doped Bi2Se3 (Gd_Bi2Se3) topological insulator at low temperatures and high magnetic fields. Gd_Bi2Se3 single crystals show Shubnikov-de Haas (SdH) oscillations with nontrivial Berry phase while Bi2Se3 single crystals show zero Berry phase in SdH oscillations. The temperature dependent magnetization curves can be well fitted with the Curie-Weiss law in 3-300 K region, indicating no magnetic ordering in Gd_Bi2Se3 crystals. Moreover, Gd doping has limited influence on the quantum oscillation parameters (e.g., frequency of oscillations, the area of the Fermi surface, effective electron mass, Fermi wave vectors etc.), but has an impact on the Hall mobility, carrier density, and band topology. Our results demonstrate that Gd doping can tune the Berry phase of topological insulators effectively, which may pave a way for the future realization of many predicted exotic transport phenomena of topological origin.

all search terms

Publish Date	Title	Authors	PDF	Code	Abstract
2023-07-27	PointOdyssey: A Large-Scale Synthetic Dataset for Long-Term Point Tracking	Yang Zheng et.al.	2307.15055v1	null	We introduce PointOdyssey, a large-scale synthetic dataset, and data generation framework, for the training and evaluation of long-term fine-grained tracking algorithms. Our goal is to advance the state-of-the-art by placing emphasis on long videos with naturalistic motion. Toward the goal of naturalism, we animate deformable characters using real-world motion capture data, we build 3D scenes to match the motion capture environments, and we render camera viewpoints using trajectories mined via structure-from-motion on real videos. We create combinatorial diversity by randomizing character appearance, motion profiles, materials, lighting, 3D assets, and atmospheric effects. Our dataset currently includes 104 videos, averaging 2,000 frames long, with orders of magnitude more correspondence annotations than prior work. We show that existing methods can be trained from scratch in our dataset and outperform the published variants. Finally, we introduce modifications to the PIPs point tracking method, greatly widening its temporal receptive field, which improves its performance on PointOdyssey as well as on two real-world benchmarks. Our data and code are publicly available at: https://pointodyssey.com
2023-07-27	MapNeRF: Incorporating Map Priors into Neural Radiance Fields for Driving View Simulation	Chenming Wu et.al.	2307.14981v1	null	Simulating camera sensors is a crucial task in autonomous driving. Although neural radiance fields are exceptional at synthesizing photorealistic views in driving simulations, they still fail in generating extrapolated views. This paper proposes to incorporate map priors into neural radiance fields to synthesize out-of-trajectory driving views with semantic road consistency. The key insight is that map information can be utilized as a prior to guide the training of the radiance fields with uncertainty. Specifically, we utilize the coarse ground surface as uncertain information to supervise the density field and warp depth with uncertainty from unknown camera poses to ensure multi-view consistency. Experimental results demonstrate that our approach can produce semantic consistency in deviated views for vehicle camera simulation.
2023-07-27	GET3D--: Learning GET3D from Unconstrained Image Collections	Fanghua Yu et.al.	2307.14918v1	null	The demand for efficient 3D model generation techniques has grown exponentially, as manual creation of 3D models is time-consuming and requires specialized expertise. While generative models have shown potential in creating 3D textured shapes from 2D images, their applicability in 3D industries is limited due to the lack of a well-defined camera distribution in real-world scenarios, resulting in low-quality shapes. To overcome this limitation, we propose GET3D--, the first method that directly generates textured 3D shapes from 2D images with unknown pose and scale. GET3D-- comprises a 3D shape generator and a learnable camera sampler that captures the 6D external changes on the camera. In addition, we propose a novel training schedule to stably optimize both the shape generator and camera sampler in a unified framework. By controlling external variations using the learnable camera sampler, our method can generate aligned shapes with clear textures. Extensive experiments demonstrate the efficacy of GET3D--, which precisely fits the 6D camera pose distribution and generates high-quality shapes on both synthetic and realistic unconstrained datasets.
2023-07-27	Weakly Supervised Multi-Modal 3D Human Body Pose Estimation for Autonomous Driving	Peter Bauer et.al.	2307.14889v1	null	Accurate 3D human pose estimation (3D HPE) is crucial for enabling autonomous vehicles (AVs) to make informed decisions and respond proactively in critical road scenarios. Promising results of 3D HPE have been gained in several domains such as human-computer interaction, robotics, sports and medical analytics, often based on data collected in well-controlled laboratory environments. Nevertheless, the transfer of 3D HPE methods to AVs has received limited research attention, due to the challenges posed by obtaining accurate 3D pose annotations and the limited suitability of data from other domains. We present a simple yet efficient weakly supervised approach for 3D HPE in the AV context by employing a high-level sensor fusion between camera and LiDAR data. The weakly supervised setting enables training on the target datasets without any 2D/3D keypoint labels by using an off-the-shelf 2D joint extractor and pseudo labels generated from LiDAR to image projections. Our approach outperforms state-of-the-art results by up to $\sim$ 13% on the Waymo Open Dataset in the weakly supervised setting and achieves state-of-the-art results in the supervised setting.
2023-07-27	Learning Full-Head 3D GANs from a Single-View Portrait Dataset	Yiqian Wu et.al.	2307.14770v1	null	3D-aware face generators are commonly trained on 2D real-life face image datasets. Nevertheless, existing facial recognition methods often struggle to extract face data captured from various camera angles. Furthermore, in-the-wild images with diverse body poses introduce a high-dimensional challenge for 3D-aware generators, making it difficult to utilize data that contains complete neck and shoulder regions. Consequently, these face image datasets often contain only near-frontal face data, which poses challenges for 3D-aware face generators to construct full-head 3D portraits. To this end, we first create the dataset $360^{\circ}$-Portrait-HQ ($360^{\circ}$PHQ), which consists of high-quality single-view real portraits annotated with a variety of camera parameters (the yaw angles span the entire $360^{\circ}$ range) and body poses. We then propose 3DPortraitGAN, the first 3D-aware full-head portrait generator that learns a canonical 3D avatar distribution from the body-pose-various $360^{\circ}$PHQ dataset with body pose self-learning. Our model can generate view-consistent portrait images from all camera angles ($360^{\circ}$) with a full-head 3D representation. We incorporate a mesh-guided deformation field into volumetric rendering to produce deformed results to generate portrait images that conform to the body pose distribution of the dataset using our canonical generator. We integrate two pose predictors into our framework to predict more accurate body poses to address the issue of inaccurately estimated body poses in our dataset. Our experiments show that the proposed framework can generate view-consistent, realistic portrait images with complete geometry from all camera angles and accurately predict portrait body pose.
2023-07-27	High Dynamic Range Imaging via Visual Attention Modules	Ali Reza Omrani et.al.	2307.14705v1	link	Thanks to High Dynamic Range (HDR) imaging methods, the scope of photography has seen profound changes recently. To be more specific, such methods try to reconstruct the lost luminosity of the real world caused by the limitation of regular cameras from the Low Dynamic Range (LDR) images. Additionally, although the State-Of-The-Art methods in this topic perform well, they mainly concentrate on combining different exposures and pay less attention to extracting the informative parts of the images. Thus, this paper aims to introduce a new model capable of incorporating information from the most visible areas of each image extracted by a visual attention module (VAM), which is a result of a segmentation strategy. In particular, the model, based on a deep learning architecture, utilizes the extracted areas to produce the final HDR image. The results demonstrate that our method outperformed most of the State-Of-The-Art algorithms.
2023-07-27	FS-Depth: Focal-and-Scale Depth Estimation from a Single Image in Unseen Indoor Scene	Chengrui Wei et.al.	2307.14624v1	null	It has long been an ill-posed problem to predict absolute depth maps from single images in real (unseen) indoor scenes. We observe that it is essentially due to not only the scale-ambiguous problem but also the focal-ambiguous problem that decreases the generalization ability of monocular depth estimation. That is, images may be captured by cameras of different focal lengths in scenes of different scales. In this paper, we develop a focal-and-scale depth estimation model to well learn absolute depth maps from single images in unseen indoor scenes. First, a relative depth estimation network is adopted to learn relative depths from single images with diverse scales/semantics. Second, multi-scale features are generated by mapping a single focal length value to focal length features and concatenating them with intermediate features of different scales in relative depth estimation. Finally, relative depths and multi-scale features are jointly fed into an absolute depth estimation network. In addition, a new pipeline is developed to augment the diversity of focal lengths of public datasets, which are often captured with cameras of the same or similar focal lengths. Our model is trained on augmented NYUDv2 and tested on three unseen datasets. Our model considerably improves the generalization ability of depth estimation by 41%/13% (RMSE) with/without data augmentation compared with five recent SOTAs and well alleviates the deformation problem in 3D reconstruction. Notably, our model well maintains the accuracy of depth estimation on original NYUDv2.
2023-07-27	White-light superflare and long-term activity of the nearby M7 type binary EI Cnc observed with GWAC system	Hua-Li Li et.al.	2307.14594v1	null	Stellar white-light flares are believed to play an essential role in the physical and chemical properties of the atmosphere of the surrounding exoplanets. Here we report an optical monitoring campaign on the nearby flaring system EI Cnc carried out by the Ground-based Wide Angle Cameras (GWAC) and its dedicated follow-up telescope. A superflare, coming from the brighter component EI CncA, was detected and observed, in which four components are required to properly model the complex decay light curve. The lower limit of flare energy in the $R-$band is estimated to be $3.3\times10^{32}$ ergs. 27 flares are additionally detected from the GWAC archive data with a total duration of 290 hours. The inferred cumulative flare frequency distribution follows a quite shallow power-law function with a slope of $\beta=-0.50\pm 0.03$ over the energy range between $10^{30}$ and $10^{33}$ erg, which reinforces the trend that stars cooler than M4 show enhanced superflare activity. The flares identified in EI Cnc enable us to extend the $\tau-E$ relationship previously established in the white-light superflares of solar-type stars down to an energy as low as $\sim10^{30}$ erg (i.e., by three orders): $\tau\propto E^{0.42\pm0.02}$, which suggests a common flare mechanism for stars with a type from M to solar-like, and implies an invariant of $B^{1/3}\upsilon_{\rm A}$ in the white-light flares.
2023-07-27	MCPA: Multi-scale Cross Perceptron Attention Network for 2D Medical Image Segmentation	Liang Xu et.al.	2307.14588v1	link	The UNet architecture, based on Convolutional Neural Networks (CNN), has demonstrated its remarkable performance in medical image analysis. However, it faces challenges in capturing long-range dependencies due to the limited receptive fields and inherent bias of convolutional operations. Recently, numerous transformer-based techniques have been incorporated into the UNet architecture to overcome this limitation by effectively capturing global feature correlations. However, the integration of the Transformer modules may result in the loss of local contextual information during the global feature fusion process. To overcome these challenges, we propose a 2D medical image segmentation model called Multi-scale Cross Perceptron Attention Network (MCPA). The MCPA consists of three main components: an encoder, a decoder, and a Cross Perceptron. The Cross Perceptron first captures the local correlations using multiple Multi-scale Cross Perceptron modules, facilitating the fusion of features across scales. The resulting multi-scale feature vectors are then spatially unfolded, concatenated, and fed through a Global Perceptron module to model global dependencies. Furthermore, we introduce a Progressive Dual-branch Structure to address the semantic segmentation of the image involving finer tissue structures. This structure gradually shifts the segmentation focus of MCPA network training from large-scale structural features to more sophisticated pixel-level features. We evaluate our proposed MCPA model on several publicly available medical image datasets from different tasks and devices, including the open large-scale dataset of CT (Synapse), MRI (ACDC), fundus camera (DRIVE, CHASE_DB1, HRF), and OCTA (ROSE). The experimental results show that our MCPA model achieves state-of-the-art performance. The code is available at https://github.com/simonustc/MCPA-for-2D-Medical-Image-Segmentation.
2023-07-27	A Memory-Augmented Multi-Task Collaborative Framework for Unsupervised Traffic Accident Detection in Driving Videos	Rongqin Liang et.al.	2307.14575v1	null	Identifying traffic accidents in driving videos is crucial to ensuring the safety of autonomous driving and driver assistance systems. To address the potential danger caused by the long-tailed distribution of driving events, existing traffic accident detection (TAD) methods mainly rely on unsupervised learning. However, TAD is still challenging due to the rapid movement of cameras and dynamic scenes in driving scenarios. Existing unsupervised TAD methods mainly rely on a single pretext task, i.e., an appearance-based or future object localization task, to detect accidents. However, appearance-based approaches are easily disturbed by the rapid movement of the camera and changes in illumination, which significantly reduce the performance of traffic accident detection. Methods based on future object localization may fail to capture appearance changes in video frames, making it difficult to detect ego-involved accidents (e.g., out of control of the ego-vehicle). In this paper, we propose a novel memory-augmented multi-task collaborative framework (MAMTCF) for unsupervised traffic accident detection in driving videos. Different from previous approaches, our method can more accurately detect both ego-involved and non-ego accidents by simultaneously modeling appearance changes and object motions in video frames through the collaboration of optical flow reconstruction and future object localization tasks. Further, we introduce a memory-augmented motion representation mechanism to fully explore the interrelation between different types of motion representations and exploit the high-level features of normal traffic patterns stored in memory to augment motion representations, thus enlarging the difference from anomalies. Experimental results on recently published large-scale dataset demonstrate that our method achieves better performance compared to previous state-of-the-art approaches.
2023-07-26	Patterns of Vehicle Lights: Addressing Complexities in Curation and Annotation of Camera-Based Vehicle Light Datasets and Metrics	Ross Greer et.al.	2307.14521v1	null	This paper explores the representation of vehicle lights in computer vision and its implications for various tasks in the field of autonomous driving. Different specifications for representing vehicle lights, including bounding boxes, center points, corner points, and segmentation masks, are discussed in terms of their strengths and weaknesses. Three important tasks in autonomous driving that can benefit from vehicle light detection are identified: nighttime vehicle detection, 3D vehicle orientation estimation, and dynamic trajectory cues. Each task may require a different representation of the light. The challenges of collecting and annotating large datasets for training data-driven models are also addressed, leading to introduction of the LISA Vehicle Lights Dataset and associated Light Visibility Model, which provides light annotations specifically designed for downstream applications in vehicle detection, intent and trajectory prediction, and safe path planning. A comparison of existing vehicle light datasets is provided, highlighting the unique features and limitations of each dataset. Overall, this paper provides insights into the representation of vehicle lights and the importance of accurate annotations for training effective detection models in autonomous driving applications. Our dataset and model are made available at https://cvrr.ucsd.edu/vehicle-lights-dataset
2023-07-26	Technical note: ShinyAnimalCV: open-source cloud-based web application for object detection, segmentation, and three-dimensional visualization of animals using computer vision	Jin Wang et.al.	2307.14487v1	null	Computer vision (CV), a non-intrusive and cost-effective technology, has furthered the development of precision livestock farming by enabling optimized decision-making through timely and individualized animal care. The availability of affordable two- and three-dimensional camera sensors, combined with various machine learning and deep learning algorithms, has provided a valuable opportunity to improve livestock production systems. However, despite the availability of various CV tools in the public domain, applying these tools to animal data can be challenging, often requiring users to have programming and data analysis skills, as well as access to computing resources. Moreover, the rapid expansion of precision livestock farming is creating a growing need to educate and train animal science students in CV. This presents educators with the challenge of efficiently demonstrating the complex algorithms involved in CV. Thus, the objective of this study was to develop ShinyAnimalCV, an open-source cloud-based web application. This application provides a user-friendly interface for performing CV tasks, including object segmentation, detection, three-dimensional surface visualization, and extraction of two- and three-dimensional morphological features. Nine pre-trained CV models using top-view animal data are included in the application. ShinyAnimalCV has been deployed online using cloud computing platforms. The source code of ShinyAnimalCV is available on GitHub, along with detailed documentation on training CV models using custom data and deploying ShinyAnimalCV locally to allow users to fully leverage the capabilities of the application. ShinyAnimalCV can contribute to CV research and teaching in the animal science community.
2023-07-26	AutoSourceID-Classifier. Star-Galaxy Classification using a Convolutional Neural Network with Spatial Information	F. Stoppa et.al.	2307.14456v1	null	Aims. Traditional star-galaxy classification techniques often rely on feature estimation from catalogues, a process susceptible to introducing inaccuracies, thereby potentially jeopardizing the classification's reliability. Certain galaxies, especially those not manifesting as extended sources, can be misclassified when their shape parameters and flux solely drive the inference. We aim to create a robust and accurate classification network for identifying stars and galaxies directly from astronomical images. By leveraging convolutional neural networks (CNN) and additional information about the source position, we aim to accurately classify all stars and galaxies within a survey, particularly those with a signal-to-noise ratio (S/N) near the detection limit. Methods. The AutoSourceID-Classifier (ASID-C) algorithm developed here uses 32x32 pixel single filter band source cutouts generated by the previously developed ASID-L code. ASID-C utilizes CNNs to distinguish these cutouts into stars or galaxies, leveraging their strong feature-learning capabilities. Subsequently, we employ a modified Platt Scaling calibration for the output of the CNN. This technique ensures that the derived probabilities are effectively calibrated, delivering precise and reliable results. Results. We show that ASID-C, trained on MeerLICHT telescope images and using the Dark Energy Camera Legacy Survey (DECaLS) morphological classification, outperforms similar codes like SourceExtractor. ASID-C opens up new possibilities for accurate celestial object classification, especially for sources with a S/N near the detection limit. Potential applications of ASID-C, like real-time star-galaxy classification and transient's host identification, promise significant contributions to astronomical research.
2023-07-26	US & MR Image-Fusion Based on Skin Co-Registration	Martina Paccini et.al.	2307.14288v1	null	The study and development of innovative solutions for the advanced visualisation, representation and analysis of medical images offer different research directions. Current practice in medical imaging consists in combining real-time US with imaging modalities that allow internal anatomy acquisitions, such as CT, MRI, PET or similar. Application of image-fusion approaches can be found in tracking surgical tools and/or needles, in real-time during interventions. Thus, this work proposes a fusion imaging system for the registration of CT and MRI images with real-time US acquisition leveraging a 3D camera sensor. The main focus of the work is the portability of the system and its applicability to different anatomical districts.
2023-07-26	Probing reflection from aerosols with the near-infrared dayside spectrum of WASP-80b	Bob Jacobs et.al.	2307.14399v1	null	The presence of aerosols is intimately linked to the global energy budget and composition of planet atmospheres. Their ability to reflect incoming light prevents energy from being deposited into the atmosphere, and they shape spectra of exoplanets. We observed one near-infrared secondary eclipse of WASP-80b with the Wide Field Camera 3 aboard the Hubble Space Telescope to provide constraints on the presence and properties of atmospheric aerosols. We detect a broadband eclipse depth of $34\pm10$ ppm for WASP-80b, making this the lowest equilibrium temperature planet for which a secondary eclipse has been detected so far with WFC3. We detect a higher planetary flux than expected from thermal emission alone at $1.6\sigma$ that hints toward the presence of reflecting aerosols on this planet's dayside. We paired the WFC3 data with Spitzer data and explored multiple atmospheric models with and without aerosols to interpret this spectrum. Albeit consistent with a clear dayside atmosphere, we found a slight preference for near-solar metallicities and for dayside clouds over hazes. We exclude soot haze formation rates higher than $10^{-10.7}$ g cm$^{-2}$s$^{-1}$ and tholin formation rates higher than $10^{-12.0}$ g cm$^{-2}$s$^{-1}$ at $3\sigma$. We applied the same atmospheric models to a previously published WFC3/Spitzer transmission spectrum for this planet and find weak haze formation. A single soot haze formation rate best fits both the dayside and the transmission spectra simultaneously. However, we emphasize that no models provide satisfactory fits in terms of chi-square of both spectra simultaneously, indicating longitudinal dissimilarity in the atmosphere's aerosol composition.
2023-07-26	DisguisOR: Holistic Face Anonymization for the Operating Room	Lennart Bastian et.al.	2307.14241v1	link	Purpose: Recent advances in Surgical Data Science (SDS) have contributed to an increase in video recordings from hospital environments. While methods such as surgical workflow recognition show potential in increasing the quality of patient care, the quantity of video data has surpassed the scale at which images can be manually anonymized. Existing automated 2D anonymization methods under-perform in Operating Rooms (OR), due to occlusions and obstructions. We propose to anonymize multi-view OR recordings using 3D data from multiple camera streams. Methods: RGB and depth images from multiple cameras are fused into a 3D point cloud representation of the scene. We then detect each individual's face in 3D by regressing a parametric human mesh model onto detected 3D human keypoints and aligning the face mesh with the fused 3D point cloud. The mesh model is rendered into every acquired camera view, replacing each individual's face. Results: Our method shows promise in locating faces at a higher rate than existing approaches. DisguisOR produces geometrically consistent anonymizations for each camera view, enabling more realistic anonymization that is less detrimental to downstream tasks. Conclusion: Frequent obstructions and crowding in operating rooms leaves significant room for improvement for off-the-shelf anonymization methods. DisguisOR addresses privacy on a scene level and has the potential to facilitate further research in SDS.
2023-07-26	The nature of the X-ray sources in dwarf galaxies in nearby clusters from the KIWICS	Şeyda Şen et.al.	2307.14230v1	null	We present a deep search for and analysis of X-ray sources in a sample of dwarf galaxies ($M_r$ < -15.5 mag) located within twelve galaxy clusters from the Kapteyn IAC WEAVE INT Cluster Survey (KIWICS) of photometric observations in the $\textit{r}$ and $\textit{g}$ using the Wide Field Camera (WFC) at the 2.5-m Isaac Newton telescope (INT). We first investigated the optical data, identified 2720 dwarf galaxies in all fields and determined their characteristics; namely, their colors, effective radii, and stellar masses. We then searched the $\textit{Chandra}$ data archive for X-ray counterparts of optically detected dwarf galaxies. We found a total of 20 X-ray emitting dwarf galaxies, with X-ray flux ranging from 1.7$\times10^{-15}$ to 4.1$\times10^{-14}$ erg cm$^{-2}$ s$^{-1}$ and X-ray luminosities varying from 2$\times10^{39}$ to 5.4$\times10^{41}$ erg s$^{-1}$. Our results indicate that the X-ray luminosity of the sources in our sample is larger than the Eddington luminosity limit for a typical neutron star, even at the lowest observed levels. This leads us to conclude that the sources emitting X-rays in our sample are likely black holes. Additionally, we have employed a scaling relation between black hole and stellar mass to estimate the masses of the black holes in our sample, and have determined a range of black hole masses from 4.6$\times10^{4}$ to 1.5$\times10^{6}$ M$_\odot$. Finally, we find a trend between X-ray to optical flux ratio and X-ray flux. We discuss the implications of our findings and highlight the importance of X-ray observations in studying the properties of dwarf galaxies.
2023-07-26	Tackling Scattering and Reflective Flare in Mobile Camera Systems: A Raw Image Dataset for Enhanced Flare Removal	Fengbo Lan et.al.	2307.14180v1	null	The increasing prevalence of mobile devices has led to significant advancements in mobile camera systems and improved image quality. Nonetheless, mobile photography still grapples with challenging issues such as scattering and reflective flare. The absence of a comprehensive real image dataset tailored for mobile phones hinders the development of effective flare mitigation techniques. To address this issue, we present a novel raw image dataset specifically designed for mobile camera systems, focusing on flare removal. Capitalizing on the distinct properties of raw images, this dataset serves as a solid foundation for developing advanced flare removal algorithms. It encompasses a wide variety of real-world scenarios captured with diverse mobile devices and camera settings. The dataset comprises over 2,000 high-quality full-resolution raw image pairs for scattering flare and 1,100 for reflective flare, which can be further segmented into up to 30,000 and 2,200 paired patches, respectively, ensuring broad adaptability across various imaging conditions. Experimental results demonstrate that networks trained with synthesized data struggle to cope with complex lighting settings present in this real image dataset. We also show that processing data through a mobile phone's internal ISP compromises image quality while using raw image data presents significant advantages for addressing the flare removal problem. Our dataset is expected to enable an array of new research in flare removal and contribute to substantial improvements in mobile image quality, benefiting mobile photographers and end-users alike.
2023-07-26	Memory-Efficient Graph Convolutional Networks for Object Classification and Detection with Event Cameras	Kamil Jeziorek et.al.	2307.14124v1	null	Recent advances in event camera research emphasize processing data in its original sparse form, which allows the use of its unique features such as high temporal resolution, high dynamic range, low latency, and resistance to image blur. One promising approach for analyzing event data is through graph convolutional networks (GCNs). However, current research in this domain primarily focuses on optimizing computational costs, neglecting the associated memory costs. In this paper, we consider both factors together in order to achieve satisfying results and relatively low model complexity. For this purpose, we performed a comparative analysis of different graph convolution operations, considering factors such as execution time, the number of trainable model parameters, data format requirements, and training outcomes. Our results show a 450-fold reduction in the number of parameters for the feature extraction module and a 4.5-fold reduction in the size of the data representation while maintaining a classification accuracy of 52.3%, which is 6.3% higher compared to the operation used in state-of-the-art approaches. To further evaluate performance, we implemented the object detection architecture and evaluated its performance on the N-Caltech101 dataset. The results showed an accuracy of 53.7 % mAP@0.5 and reached an execution rate of 82 graphs per second.
2023-07-26	Learning heterogeneous delays in a layer of spiking neurons for fast motion detection	Antoine Grimaldi et.al.	2307.14077v1	null	The precise timing of spikes emitted by neurons plays a crucial role in shaping the response of efferent biological neurons. This temporal dimension of neural activity holds significant importance in understanding information processing in neurobiology, especially for the performance of neuromorphic hardware, such as event-based cameras. Nonetheless, many artificial neural models disregard this critical temporal dimension of neural activity. In this study, we present a model designed to efficiently detect temporal spiking motifs using a layer of spiking neurons equipped with heterogeneous synaptic delays. Our model capitalizes on the diverse synaptic delays present on the dendritic tree, enabling specific arrangements of temporally precise synaptic inputs to synchronize upon reaching the basal dendritic tree. We formalize this process as a time-invariant logistic regression, which can be trained using labeled data. To demonstrate its practical efficacy, we apply the model to naturalistic videos transformed into event streams, simulating the output of the biological retina or event-based cameras. To evaluate the robustness of the model in detecting visual motion, we conduct experiments by selectively pruning weights and demonstrate that the model remains efficient even under significantly reduced workloads. In conclusion, by providing a comprehensive, event-driven computational building block, the incorporation of heterogeneous delays has the potential to greatly improve the performance of future spiking neural network algorithms, particularly in the context of neuromorphic chips.
2023-07-26	Three-year performance of the IceAct telescopes at the IceCube Neutrino Observatory	Lars Heuermann et.al.	2307.13969v1	null	IceAct is an array of compact Imaging Air Cherenkov Telescopes at the ice surface as part of the IceCube Neutrino Observatory. The telescopes, featuring a camera of 61 silicon photomultipliers and fresnel-lens-based optics, are optimized to be operated in harsh environmental conditions, such as at the South Pole. Since 2019, the first two telescopes have been operating in a stereoscopic configuration in the center of IceCube's surface detector IceTop. With an energy threshold of about 10 TeV and a wide field-of-view, the IceAct telescopes show promising capabilities of improving current cosmic-ray composition studies: measuring the Cherenkov light emissions in the atmosphere adds new information about the shower development not accessible with the current detectors. First simulations indicate that the added information of a single telescope leads, e.g., to an improved discrimination between flux contributions from different primary particle species in the sensitive energy range. We review the performance and detector operations of the telescopes during the past 3 years (2020-2022) and give an outlook on the future of IceAct.
2023-07-26	Towards a cosmic ray composition measurement with the IceAct telescopes at the IceCube Neutrino Observatory	Larissa Paul et.al.	2307.13965v1	null	The IceCube Neutrino Observatory is equipped with the unique possibility to measure cosmic ray induced air showers simultaneously by their particle footprint on the surface with the IceTop detector and by the high-energy muonic shower component at a depth of more than 1.5 km. Since 2019 additionally two Imaging Air Cherenkov Telescopes, called IceAct, measure the electromagnetic component of air showers in the atmosphere above the IceCube detector. This opens the possibility to measure air shower parameters in three independent detectors and allows to improve mass composition studies with the IceCube data. One IceAct camera consists of 61 SiPM pixels in a hexagonal grid. Each pixel has a field of view of 1.5 degree resulting in an approximately 12-degree field of view per camera. A single telescope tube has a diameter of 50 cm, is built robust enough to withstand the harsh Antarctic conditions, and is able to detect cosmic ray particles with energies above approximately 10 TeV. A Graph Neural Network (GNN) is trained to determine the air shower properties from IceAct data. The composition analysis is then performed using Random Forest Regression (RF). Since all three detectors have a different energy threshold, we train several RFs with different inputs, combining the different detectors and taking advantage of the lower energy threshold of the IceAct telescopes. This will result in composition measurements for different detector combinations and enables cross-checks of the results in overlapping energy bands. We present the method, parameters for data selection, and the status of this analysis.
2023-07-25	Decisive Data using Multi-Modality Optical Sensors for Advanced Vehicular Systems	Muhammad Ali Farooq et.al.	2307.13600v1	null	Optical sensors have played a pivotal role in acquiring real-world data for critical applications. This data, when integrated with advanced machine learning algorithms, provides meaningful information, thus enhancing human vision. This paper focuses on various optical technologies for the design and development of state-of-the-art out-cabin forward vision systems and in-cabin driver monitoring systems. The focused optical sensors include Longwave Thermal Imaging (LWIR) cameras, Near Infrared (NIR), Neuromorphic/event cameras, Visible CMOS cameras and Depth cameras. Further, the paper discusses different potential applications which can be employed using the unique strengths of each of these optical modalities in a real-time environment.
2023-07-25	HeightFormer: Explicit Height Modeling without Extra Data for Camera-only 3D Object Detection in Bird's Eye View	Yiming Wu et.al.	2307.13510v1	null	Vision-based Bird's Eye View (BEV) representation is an emerging perception formulation for autonomous driving. The core challenge is to construct BEV space with multi-camera features, which is a one-to-many ill-posed problem. Diving into all previous BEV representation generation methods, we found that most of them fall into two types: modeling depths in image views or modeling heights in the BEV space, mostly in an implicit way. In this work, we propose to explicitly model heights in the BEV space, which needs no extra data like LiDAR and can fit arbitrary camera rigs and types compared to modeling depths. Theoretically, we give proof of the equivalence between height-based methods and depth-based methods. Considering the equivalence and some advantages of modeling heights, we propose HeightFormer, which models heights and uncertainties in a self-recursive way. Without any extra data, the proposed HeightFormer could estimate heights in BEV accurately. Benchmark results show that the performance of HeightFormer achieves SOTA compared with those camera-only methods.
2023-07-25	Prior Based Online Lane Graph Extraction from Single Onboard Camera Image	Yigit Baran Can et.al.	2307.13344v1	null	The local road network information is essential for autonomous navigation. This information is commonly obtained from offline HD-Maps in terms of lane graphs. However, the local road network at a given moment can be drastically different from the one given in the offline maps due to construction works, accidents, etc. Moreover, the autonomous vehicle might be at a location not covered in the offline HD-Map. Thus, online estimation of the lane graph is crucial for widespread and reliable autonomous navigation. In this work, we tackle online Bird's-Eye-View lane graph extraction from a single onboard camera image. We propose to use prior information to increase the quality of the estimations. The prior is extracted from the dataset through a transformer-based Wasserstein Autoencoder. The autoencoder is then used to enhance the initial lane graph estimates. This is done through optimization of the latent space vector. The optimization encourages the lane graph estimation to be logical by discouraging it from diverging from the prior distribution. We test the method on two benchmark datasets, NuScenes and Argoverse. The results show that the proposed method significantly improves the performance compared to state-of-the-art methods.
2023-07-25	A Visual Quality Assessment Method for Raster Images in Scanned Document	Justin Yang et.al.	2307.13241v1	null	Image quality assessment (IQA) is an active research area in the field of image processing. Most prior works focus on visual quality of natural images captured by cameras. In this paper, we explore visual quality of scanned documents, focusing on raster image areas. Different from many existing works which aim to estimate a visual quality score, we propose a machine learning based classification method to determine whether the visual quality of a scanned raster image at a given resolution setting is acceptable. We conduct a psychophysical study to determine the acceptability at different image resolutions based on human subject ratings and use them as the ground truth to train our machine learning model. However, this dataset is unbalanced as most images were rated as visually acceptable. To address the data imbalance problem, we introduce several noise models to simulate the degradation of image quality during the scanning process. Our results show that by including augmented data in training, we can significantly improve the performance of the classifier to determine whether the visual quality of raster images in a scanned document is acceptable or not for a given resolution setting.
2023-07-24	Why Don't You Clean Your Glasses? Perception Attacks with Dynamic Optical Perturbations	Yi Han et.al.	2307.13131v1	null	Camera-based autonomous systems that emulate human perception are increasingly being integrated into safety-critical platforms. Consequently, an established body of literature has emerged that explores adversarial attacks targeting the underlying machine learning models. Adapting adversarial attacks to the physical world is desirable for the attacker, as this removes the need to compromise digital systems. However, the real world poses challenges related to the "survivability" of adversarial manipulations given environmental noise in perception pipelines and the dynamicity of autonomous systems. In this paper, we take a sensor-first approach. We present EvilEye, a man-in-the-middle perception attack that leverages transparent displays to generate dynamic physical adversarial examples. EvilEye exploits the camera's optics to induce misclassifications under a variety of illumination conditions. To generate dynamic perturbations, we formalize the projection of a digital attack into the physical domain by modeling the transformation function of the captured image through the optical pipeline. Our extensive experiments show that EvilEye's generated adversarial perturbations are much more robust across varying environmental light conditions relative to existing physical perturbation frameworks, achieving a high attack success rate (ASR) while bypassing state-of-the-art physical adversarial detection frameworks. We demonstrate that the dynamic nature of EvilEye enables attackers to adapt adversarial examples across a variety of objects with a significantly higher ASR compared to state-of-the-art physical world attack frameworks. Finally, we discuss mitigation strategies against the EvilEye attack.
2023-07-24	Automatic Infant Respiration Estimation from Video: A Deep Flow-based Algorithm and a Novel Public Benchmark	Sai Kumar Reddy Manne et.al.	2307.13110v1	link	Respiration is a critical vital sign for infants, and continuous respiratory monitoring is particularly important for newborns. However, neonates are sensitive and contact-based sensors present challenges in comfort, hygiene, and skin health, especially for preterm babies. As a step toward fully automatic, continuous, and contactless respiratory monitoring, we develop a deep-learning method for estimating respiratory rate and waveform from plain video footage in natural settings. Our automated infant respiration flow-based network (AIRFlowNet) combines video-extracted optical flow input and spatiotemporal convolutional processing tuned to the infant domain. We support our model with the first public annotated infant respiration dataset with 125 videos (AIR-125), drawn from eight infant subjects, set in varied pose, lighting, and camera conditions. We include manual respiration annotations and optimize AIRFlowNet training on them using a novel spectral bandpass loss function. When trained and tested on the AIR-125 infant data, our method significantly outperforms other state-of-the-art methods in respiratory rate estimation, achieving a mean absolute error of $\sim$2.9 breaths per minute, compared to $\sim$4.7--6.2 for other public models designed for adult subjects and more uniform environments.
2023-07-24	Freeform three-mirror anastigmatic large-aperture telescope and receiver optics for CMB-S4	Patricio A. Gallardo et.al.	2307.12931v1	null	CMB-S4, the next-generation ground-based cosmic microwave background (CMB) observatory, will provide detailed maps of the CMB at millimeter wavelengths to dramatically advance our understanding of the origin and evolution of the universe. CMB-S4 will deploy large and small aperture telescopes with hundreds of thousands of detectors to observe the CMB at arcminute and degree resolutions at millimeter wavelengths. Inflationary science benefits from a deep delensing survey at arcminute resolutions capable of observing a large field of view at millimeter wavelengths. This kind of survey acts as a complement to a degree angular resolution survey. The delensing survey requires a nearly uniform distribution of cameras per frequency band across the focal plane. We present a large-throughput, large-aperture (5-meter diameter) freeform three-mirror anastigmatic telescope and an array of 85 cameras for CMB observations at arcminute resolutions, which meets the needs of the delensing survey of CMB-S4. A detailed prescription of this three-mirror telescope and cameras is provided, with a series of numerical calculations that indicate expected optical performance and mechanical tolerance.
2023-07-24	Trust-aware Safe Control for Autonomous Navigation: Estimation of System-to-human Trust for Trust-adaptive Control Barrier Functions	Saad Ejaz et.al.	2307.12815v1	null	A trust-aware safe control system for autonomous navigation in the presence of humans, specifically pedestrians, is presented. The system combines model predictive control (MPC) with control barrier functions (CBFs) and trust estimation to ensure safe and reliable navigation in complex environments. Pedestrian trust values are computed based on features, extracted from camera sensor images, such as mutual eye contact and smartphone usage. These trust values are integrated into the MPC controller's CBF constraints, allowing the autonomous vehicle to make informed decisions considering pedestrian behavior. Simulations conducted in the CARLA driving simulator demonstrate the feasibility and effectiveness of the proposed system, showcasing more conservative behaviour around inattentive pedestrians and vice versa. The results highlight the practicality of the system in real-world applications, providing a promising approach to enhance the safety and reliability of autonomous navigation systems, especially self-driving vehicles.

smart watch

Publish Date	Title	Authors	PDF	Code	Abstract
2023-07-27	Cavity-Mediated Molecular Entanglement and Generation of Non-Classical States of Light	Davis M. Welakuh et.al.	2307.15047v1	null	The generation and control of entanglement in a quantum mechanical system is a critical element of nearly all quantum applications. Molecular systems are a promising candidate, with numerous degrees of freedom able to be targeted. However, knowledge of inter-system entanglement mechanisms in such systems is limited. In this work, we demonstrate the generation of entanglement between vibrational degrees of freedom in molecules via strong coupling to a cavity mode driven by a weak coherent field. In a bi-molecular system, we show entanglement can not only be generated between the cavity and molecular system, but also between molecules. This process also results in the generation of non-classical states of light, providing potential pathways for harnessing entanglement in molecular systems.
2023-07-27	Smart Contract Migration: Security Analysis and Recommendations from Ethereum to Arbitrum	Xueyan Tang et.al.	2307.14773v1	null	This research aims to explore the security risks posed by compatibility and protocol differences in smart contract migration, using the migration of smart contracts from Ethereum to Arbitrum as a case study. Through literature review, online data collection, expert participation, and analysis of smart contract vulnerability cases, this paper conducts an in-depth research of the differences between Ethereum and Arbitrum in areas such as Messaging, Block Properties, Contract Address Alias, and Gas Fees. The research findings indicate the presence of certain security issues during the migration process from Ethereum to Arbitrum, such as abnormal operation of the sequencer resulting in outdated off-chain data retrieval, time-based logical errors, failed permission checks, DOS attacks, and gas loss due to L1-to-L2 transaction failures. To address these security issues, this paper proposes corresponding solutions and recommendations to ensure the security and meet the requirements of the migration process. Additionally, this research emphasizes the continued attention and support for the security issues of smart contract migration through the case of smart contract migration from Ethereum to Arbitrum. It is worth noting that this research is the first in-depth research of smart contract security migration from Ethereum to Arbitrum.
2023-07-27	How to Train Your YouTube Recommender	Alexander Liu et.al.	2307.14551v1	null	YouTube provides features for users to indicate disinterest when presented with unwanted recommendations, such as the "Not interested" and "Don't recommend channel" buttons. These buttons are purported to allow the user to correct "mistakes" made by the recommendation system. Yet, relatively little is known about the empirical efficacy of these buttons. Neither is much known about users' awareness of and confidence in them. To address these gaps, we simulated YouTube users with sock puppet agents. Each agent first executed a "stain phase", where it watched many videos of one assigned topic; then it executed a "scrub phase", where it tried to remove recommendations of the assigned topic. Each agent repeatedly applied a single scrubbing strategy, which included disliking previously-watched videos or deleting them from watch history, as well as clicking the "not interested" or "don't recommend channel" button on newly-recommended videos. Overall, we found that the stain phase significantly increased the fraction of the recommended videos on the user's homepage dedicated to the assigned topic. For the scrub phase, using the "Not interested" button worked best, significantly reducing such recommendations in all topics tested, on average removing 88% of them. Neither the stain phase nor the scrub phase, however, had much effect on videopage recommendations (those given to users while they watch a video). We also ran a survey (N = 300) asking adult YouTube users in the US whether they were aware of and used these buttons before, as well as how effective they found these buttons to be. We found that 44% of participants were not aware that the "Not interested" button existed. However, those who were aware of this button often used it to remove unwanted recommendations (82.8%) and found it to be modestly effective (3.42 out of 5).
2023-07-26	Event-based Vision for Early Prediction of Manipulation Actions	Daniel Deniz et.al.	2307.14332v1	null	Neuromorphic visual sensors are artificial retinas that output sequences of asynchronous events when brightness changes occur in the scene. These sensors offer many advantages including very high temporal resolution, no motion blur and smart data compression ideal for real-time processing. In this study, we introduce an event-based dataset on fine-grained manipulation actions and perform an experimental study on the use of transformers for action prediction with events. There is enormous interest in the fields of cognitive robotics and human-robot interaction on understanding and predicting human actions as early as possible. Early prediction allows anticipating complex stages for planning, enabling effective and real-time interaction. Our Transformer network uses events to predict manipulation actions as they occur, using online inference. The model succeeds at predicting actions early on, building up confidence over time and achieving state-of-the-art classification. Moreover, the attention-based transformer architecture allows us to study the role of the spatio-temporal patterns selected by the model. Our experiments show that the Transformer network captures action dynamic features outperforming video-based approaches and succeeding with scenarios where the differences between actions lie in very subtle cues. Finally, we release the new event dataset, which is the first in the literature for manipulation action recognition. Code will be available at https://github.com/DaniDeniz/EventVisionTransformer.
2023-07-26	Simulation of Open Quantum Systems via Low-Depth Convex Unitary Evolutions	Joseph Peetz et.al.	2307.14325v2	null	Simulating physical systems on quantum devices is one of the most promising applications of quantum technology. Current quantum approaches to simulating open quantum systems are still practically challenging on NISQ-era devices, because they typically require ancilla qubits and extensive controlled sequences. In this work, we propose a hybrid quantum-classical approach for simulating a class of open system dynamics called random-unitary channels. These channels naturally decompose into a series of convex unitary evolutions, which can then be efficiently sampled and run as independent circuits. The method does not require deep ancilla frameworks and thus can be implemented with lower noise costs. We implement simulations of open quantum systems up to dozens of qubits and with large channel rank.
2023-07-26	Large-scale Fully-Unsupervised Re-Identification	Gabriel Bertocco et.al.	2307.14278v1	null	Fully-unsupervised Person and Vehicle Re-Identification have received increasing attention due to their broad applicability in surveillance, forensics, event understanding, and smart cities, without requiring any manual annotation. However, most of the prior art has been evaluated in datasets that have just a couple thousand samples. Such small-data setups often allow the use of costly techniques in time and memory footprints, such as Re-Ranking, to improve clustering results. Moreover, some previous work even pre-selects the best clustering hyper-parameters for each dataset, which is unrealistic in a large-scale fully-unsupervised scenario. In this context, this work tackles a more realistic scenario and proposes two strategies to learn from large-scale unlabeled data. The first strategy performs a local neighborhood sampling to reduce the dataset size in each iteration without violating neighborhood relationships. A second strategy leverages a novel Re-Ranking technique, which has a lower time upper bound complexity and reduces the memory complexity from O(n^2) to O(kn) with k << n. To avoid the pre-selection of specific hyper-parameter values for the clustering algorithm, we also present a novel scheduling algorithm that adjusts the density parameter during training, to leverage the diversity of samples and keep the learning robust to noisy labeling. Finally, due to the complementary knowledge learned by different models, we also introduce a co-training strategy that relies upon the permutation of predicted pseudo-labels, among the backbones, with no need for any hyper-parameters or weighting optimization. The proposed methodology outperforms the state-of-the-art methods in well-known benchmarks and in the challenging large-scale Veri-Wild dataset, with a faster and memory-efficient Re-Ranking strategy, and a large-scale, noisy-robust, and ensemble-based learning approach.
2023-07-25	Integration of Digital Twin and Federated Learning for Securing Vehicular Internet of Things	Deepti Gupta et.al.	2307.13794v1	null	In the present era of advanced technology, the Internet of Things (IoT) plays a crucial role in enabling smart connected environments. This includes various domains such as smart homes, smart healthcare, smart cities, smart vehicles, and many others. With ubiquitous smart connected devices and systems, a large amount of data associated with them is at a prime risk from malicious entities (e.g., users, devices, applications) in these systems. Innovative technologies, including cloud computing, Machine Learning (ML), and data analytics, support the development of anomaly detection models for the Vehicular Internet of Things (V-IoT), which encompasses collaborative automatic driving and enhanced transportation systems. However, traditional centralized anomaly detection models fail to provide better services for connected vehicles due to issues such as high latency, privacy leakage, performance overhead, and model drift. Recently, Federated Learning (FL) has gained significant recognition for its ability to address data privacy concerns in the IoT domain. Digital Twin (DT) proves beneficial in addressing uncertain crises and data security issues by creating a virtual replica that simulates various factors, including traffic trajectories, city policies, and vehicle utilization. However, the effectiveness of a V-IoT DT system heavily relies on the collection of long-term and high-quality data to make appropriate decisions. This paper introduces a Hierarchical Federated Learning (HFL) based anomaly detection model for V-IoT, aiming to enhance the accuracy of the model. Our proposed model integrates both DT and HFL approaches to create a comprehensive system for detecting malicious activities using an anomaly detection model. Additionally, real-world V-IoT use case scenarios are presented to demonstrate the application of the proposed model.
2023-07-25	ChildGAN: Large Scale Synthetic Child Facial Data Using Domain Adaptation in StyleGAN	Muhammad Ali Farooq et.al.	2307.13746v1	null	In this research work, we proposed a novel ChildGAN, a pair of GAN networks for generating synthetic boys and girls facial data derived from StyleGAN2. ChildGAN is built by performing smooth domain transfer using transfer learning. It provides photo-realistic, high-quality data samples. A large-scale dataset is rendered with a variety of smart facial transformations: facial expressions, age progression, eye blink effects, head pose, skin and hair color variations, and variable lighting conditions. The dataset comprises more than 300k distinct data samples. Further, the uniqueness and characteristics of the rendered facial features are validated by running different computer vision application tests which include CNN-based child gender classifier, face localization and facial landmarks detection test, identity similarity evaluation using ArcFace, and lastly running eye detection and eye aspect ratio tests. The results demonstrate that synthetic child facial data of high quality offers an alternative to the cost and complexity of collecting a large-scale dataset from real children.
2023-07-25	Insights into Cognitive Engagement: Comparing the Effectiveness of Game-Based and Video-Based Learning	Shayla Sharmin et.al.	2307.13637v1	null	The analysis of brain signals holds considerable importance in enhancing our comprehension of diverse learning techniques and cognitive mechanisms. Game-based learning is increasingly being recognized for its interactive and engaging educational approach. A pilot study of twelve participants divided into experimental and control groups was conducted to understand its effects on cognitive processes. Both groups were provided with the same contents regarding the basic structure of the graph. The participants in the experimental group engaged in a quiz-based game, while those in the control group watched a pre-recorded video. Functional Near-Infrared Spectroscopy (fNIRS) was employed to acquire cerebral signals, and a series of pre and post-tests were administered. The findings of our study indicate that the group engaged in the game activity displayed elevated levels of oxygenated hemoglobin compared to the group involved in watching videos. Conversely, the deoxygenated hemoglobin levels remained relatively consistent across both groups throughout the learning process. The aforementioned findings suggest that the use of game-based learning has a substantial influence on cognitive processes. Furthermore, it is evident that both the game and video groups exhibited higher neural activity in the Lateral Prefrontal cortex (PFC). The oxygenated hemoglobin ratio demonstrates that the game group had 2.33 times more neural processing in the Lateral PFC than the video group. This data is further supported by the knowledge gain analysis, which indicates that the game-based approach resulted in a 47.74% higher knowledge gain than the video group, as calculated from the difference in pre-and post-test scores.
2023-07-25	On-Device Speaker Anonymization of Acoustic Embeddings for ASR based on Flexible Location Gradient Reversal Layer	Md Asif Jalal et.al.	2307.13343v1	null	Smart devices serviced by large-scale AI models necessitate user data transfer to the cloud for inference. For speech applications, this means transferring private user information, e.g., speaker identity. Our paper proposes a privacy-enhancing framework that targets speaker identity anonymization while preserving speech recognition accuracy for our downstream task, Automatic Speech Recognition (ASR). The proposed framework attaches flexible gradient reversal based speaker adversarial layers to target layers within an ASR model, where speaker adversarial training anonymizes acoustic embeddings generated by the targeted layers to remove speaker identity. We propose on-device deployment by execution of initial layers of the ASR model, and transmitting anonymized embeddings to the cloud, where the rest of the model is executed while preserving privacy. Experimental results show that our method efficiently reduces speaker recognition relative accuracy by 33%, and improves ASR performance by achieving 6.2% relative Word Error Rate (WER) reduction.
2023-07-25	A Pairwise Dataset for GUI Conversion and Retrieval between Android Phones and Tablets	Han Hu et.al.	2307.13225v1	null	With the popularity of smartphones and tablets, users have become accustomed to using different devices for different tasks, such as using their phones to play games and tablets to watch movies. To conquer the market, one app is often available on both smartphones and tablets. However, although one app has similar graphic user interfaces (GUIs) and functionalities on phone and tablet, current app developers typically start from scratch when developing a tablet-compatible version of their app, which drives up development costs and wastes existing design resources. Researchers are attempting to employ deep learning in automated GUIs development to enhance developers' productivity. Deep learning models rely heavily on high-quality datasets. There are currently several publicly accessible GUI page datasets for phones, but none for pairwise GUIs between phones and tablets. This poses a significant barrier to the employment of deep learning in automated GUI development. In this paper, we collect and make public the Papt dataset, which is a pairwise dataset for GUI conversion and retrieval between Android phones and tablets. The dataset contains 10,035 phone-tablet GUI page pairs from 5,593 phone-tablet app pairs. We illustrate the approaches of collecting pairwise data and statistical analysis of this dataset. We also illustrate the advantages of our dataset compared to other current datasets. Through preliminary experiments on this dataset, we analyse the present challenges of utilising deep learning in automated GUI development and find that our dataset can assist the application of some deep learning models to tasks involving automatic GUI development.
2023-07-24	Evaluating the reliability of automatically generated pedestrian and bicycle crash surrogates	Agnimitra Sengupta et.al.	2307.13178v1	null	Vulnerable road users (VRUs), such as pedestrians and bicyclists, are at a higher risk of being involved in crashes with motor vehicles, and crashes involving VRUs also are more likely to result in severe injuries or fatalities. Signalized intersections are a major safety concern for VRUs due to their complex and dynamic nature, highlighting the need to understand how these road users interact with motor vehicles and deploy evidence-based countermeasures to improve safety performance. Crashes involving VRUs are relatively infrequent, making it difficult to understand the underlying contributing factors. An alternative is to identify and use conflicts between VRUs and motorized vehicles as a surrogate for safety performance. Automatically detecting these conflicts using video-based systems is a crucial step in developing smart infrastructure to enhance VRU safety. The Pennsylvania Department of Transportation conducted a study using a video-based event monitoring system to assess VRU and motor vehicle interactions at fifteen signalized intersections across Pennsylvania to improve VRU safety performance. This research builds on that study to assess the reliability of automatically generated surrogates in predicting confirmed conflicts using advanced data-driven models. The surrogate data used for analysis include automatically collectable variables such as vehicular and VRU speeds, movements, post-encroachment time, in addition to manually collected variables like signal states, lighting, and weather conditions. The findings highlight the varying importance of specific surrogates in predicting true conflicts, some being more informative than others. The findings can assist transportation agencies to collect the right types of data to help prioritize infrastructure investments, such as bike lanes and crosswalks, and evaluate their effectiveness.
2023-07-24	Schema-Driven Actionable Insight Generation and Smart Recommendation	Allmin Susaiyah et.al.	2307.13176v1	null	In natural language generation (NLG), insight mining is seen as a data-to-text task, where data is mined for interesting patterns and verbalised into 'insight' statements. An 'over-generate and rank' paradigm is intuitively used to generate such insights. The multidimensionality and subjectivity of this process make it challenging. This paper introduces a schema-driven method to generate actionable insights from data to drive growth and change. It also introduces a technique to rank the insights to align with user interests based on their feedback. We show preliminary qualitative results of the insights generated using our technique and demonstrate its ability to adapt to feedback.
2023-07-24	A Connection between One-Step Regularization and Critic Regularization in Reinforcement Learning	Benjamin Eysenbach et.al.	2307.12968v1	link	As with any machine learning problem with limited data, effective offline RL algorithms require careful regularization to avoid overfitting. One-step methods perform regularization by doing just a single step of policy improvement, while critic regularization methods do many steps of policy improvement with a regularized objective. These methods appear distinct. One-step methods, such as advantage-weighted regression and conditional behavioral cloning, truncate policy iteration after just one step. This "early stopping" makes one-step RL simple and stable, but can limit its asymptotic performance. Critic regularization typically requires more compute but has appealing lower-bound guarantees. In this paper, we draw a close connection between these methods: applying a multi-step critic regularization method with a regularization coefficient of 1 yields the same policy as one-step RL. While practical implementations violate our assumptions and critic regularization is typically applied with smaller regularization coefficients, our experiments nevertheless show that our analysis makes accurate, testable predictions about practical offline RL methods (CQL and one-step RL) with commonly-used hyperparameters. Our results do not imply that every problem can be solved with a single step of policy improvement, but rather that one-step RL might be competitive with critic regularization on RL problems that demand strong regularization.
2023-07-24	Rechargeable Li/Cl$_2$ battery down to -80 °C	Peng Liang et.al.	2307.12947v1	null	Low temperature rechargeable batteries are important to life in cold climates, polar/deep-sea expeditions and space explorations. Here, we report ~ 3.5 - 4 V rechargeable lithium/chlorine (Li/Cl2) batteries operating down to -80 °C, employing Li metal negative electrode, a novel CO2 activated porous carbon (KJCO2) as the positive electrode, and a high ionic conductivity (~ 5 to 20 mS cm-1 from -80 °C to 25 °C) electrolyte comprised of 1 M aluminum chloride (AlCl3), 0.95 M lithium chloride (LiCl), and 0.05 M lithium bis(fluorosulfonyl)imide (LiFSI) in low melting point (-104.5 °C) thionyl chloride (SOCl2). Between room-temperature and -80 °C, the Li/Cl2 battery delivered up to ~ 30,000 - 4,500 mAh g-1 first discharge capacity and a 1,200 - 5,000 mAh g-1 reversible capacity (discharge voltages in ~ 3.5 to 3.1 V) over up to 130 charge-discharge cycles. Mass spectrometry and X-ray photoelectron spectroscopy (XPS) probed Cl2 trapped in the porous carbon upon LiCl electro-oxidation during charging. At lower temperature down to -80 °C, SCl2/S2Cl2 and Cl2 generated by electro-oxidation in the charging step were trapped in porous KJCO2 carbon, allowing for reversible reduction to afford a high discharge voltage plateau near ~ 4 V with up to ~ 1000 mAh g-1 capacity for SCl2/S2Cl2 reduction and up to ~ 4000 mAh g-1 capacity at ~ 3.1 V plateau for Cl2 reduction. Towards practical use, we made CR2032 Li/Cl2 battery cells to drive digital watches at -40 °C and light emitting diode at -80 °C, opening Li/Cl2 secondary batteries for ultra-cold conditions.
2023-07-24	Economic Analysis of Smart Roadside Infrastructure Sensors for Connected and Automated Mobility	Laurent Kloeker et.al.	2307.12893v1	null	Smart roadside infrastructure sensors in the form of intelligent transportation system stations (ITS-Ss) are increasingly deployed worldwide at relevant traffic nodes. The resulting digital twins of the real environment are suitable for developing and validating connected and automated driving functions and for increasing the operational safety of intelligent vehicles by providing ITS-S real-time data. However, ITS-Ss are very costly to establish and operate. The choice of sensor technology also has an impact on the overall costs as well as on the data quality. So far, there is only insufficient knowledge about the concrete expenses that arise with the construction of different ITS-S setups. Within this work, multiple modular infrastructure sensor setups are investigated with the help of a life cycle cost analysis (LCCA). Their economic efficiency, different user requirements and sensor data qualities are considered. Based on the static cost model, a Monte Carlo simulation is performed, to generate a range of possible project costs and to quantify the financial risks of implementing ITS-S projects of different scales. Due to its modularity, the calculation model is suitable for diverse applications and outputs a distinctive evaluation of the underlying cost-benefit ratio of investigated setups.
2023-07-24	Multi-Shooting Differential Dynamic Programming for Hybrid Systems using Analytical Derivatives	Shubham Singh et.al.	2307.12606v1	null	Differential Dynamic Programming (DDP) is a popular technique used to generate motion for dynamic-legged robots in the recent past. However, in most cases, only the first-order partial derivatives of the underlying dynamics are used, resulting in the iLQR approach. Neglecting the second-order terms often slows down the convergence rate compared to full DDP. Multi-Shooting is another popular technique to improve robustness, especially if the dynamics are highly non-linear. In this work, we consider Multi-Shooting DDP for trajectory optimization of a bounding gait for a simplified quadruped model. As the main contribution, we develop Second-Order analytical partial derivatives of the rigid-body contact dynamics, extending our previous results for fixed/floating base models with multi-DoF joints. Finally, we show the benefits of a novel Quasi-Newton method for approximating second-order derivatives of the dynamics, leading to order-of-magnitude speedups in the convergence compared to the full DDP method.
2023-07-24	Automated Mapping of Adaptive App GUIs from Phones to TVs	Han Hu et.al.	2307.12522v1	null	With the increasing interconnection of smart devices, users often desire to adopt the same app on quite different devices for identical tasks, such as watching the same movies on both their smartphones and TV. However, the significant differences in screen size, aspect ratio, and interaction styles make it challenging to adapt Graphical User Interfaces (GUIs) across these devices. Although there are millions of apps available on Google Play, only a few thousand are designed to support smart TV displays. Existing techniques to map a mobile app GUI to a TV either adopt a responsive design, which struggles to bridge the substantial gap between phone and TV or use mirror apps for improved video display, which requires hardware support and extra engineering efforts. Instead of developing another app for supporting TVs, we propose a semi-automated approach to generate corresponding adaptive TV GUIs, given the phone GUIs as the input. Based on our empirical study of GUI pairs for TV and phone in existing apps, we synthesize a list of rules for grouping and classifying phone GUIs, converting them to TV GUIs, and generating dynamic TV layouts and source code for the TV display. Our tool is not only beneficial to developers but also to GUI designers, who can further customize the generated GUIs for their TV app development. An evaluation and user study demonstrate the accuracy of our generated GUIs and the usefulness of our tool.
2023-07-23	An Efficient Authentication Protocol for Smart Grid Communication Based on On-Chip-Error-Correcting Physical Unclonable Function	Masoud Kaveh et.al.	2307.12374v1	null	Security has become a main concern for the smart grid to move from research and development to industry. The concept of security has usually referred to resistance to threats by an active or passive attacker. However, since smart meters (SMs) are often placed in unprotected areas, physical security has become one of the important security goals in the smart grid. Physical unclonable functions (PUFs) have been largely utilized for ensuring physical security in recent years, though their reliability has remained a major problem to be practically used in cryptographic applications. Although fuzzy extractors have been considered as a solution to solve the reliability problem of PUFs, they put a considerable computational cost to the resource-constrained SMs. To that end, we first propose an on-chip-error-correcting (OCEC) PUF that efficiently generates stable digits for the authentication process. Afterward, we introduce a lightweight authentication protocol between the SMs and neighborhood gateway (NG) based on the proposed PUF. The provable security analysis shows that not only the proposed protocol can stand secure in the Canetti-Krawczyk (CK) adversary model but also provides additional security features. Also, the performance evaluation demonstrates the significant improvement of the proposed scheme in comparison with the state-of-the-art.
2023-07-23	Semantic Communication-Empowered Traffic Management using Vehicle Count Prediction	Sachin Kadam et.al.	2307.12254v1	null	Vehicle count prediction is an important aspect of smart city traffic management. Most major roads are monitored by cameras with computing and transmitting capabilities. These cameras provide data to the central traffic controller (CTC), which is in charge of traffic control management. In this paper, we propose a joint CNN-LSTM-based semantic communication (SemCom) model in which the semantic encoder of a camera extracts the relevant semantics from raw images. The encoded semantics are then sent to the CTC by the transmitter in the form of symbols. The semantic decoder of the CTC predicts the vehicle count on each road based on the sequence of received symbols and develops a traffic management strategy accordingly. An optimization problem to improve the quality of experience (QoE) is introduced and numerically solved, taking into account constraints such as vehicle user safety, transmit power of camera devices, vehicle count prediction accuracy, and semantic entropy. Using numerical results, we show that the proposed SemCom model reduces overhead by 54.42% when compared to source encoder/decoder methods. Also, we demonstrate through simulations that the proposed model outperforms state-of-the-art models in terms of mean absolute error (MAE) and QoE.
2023-07-23	LiveRetro: Visual Analytics for Strategic Retrospect in Livestream E-Commerce	Yuchen Wu et.al.	2307.12213v1	null	Livestream e-commerce integrates live streaming and online shopping, allowing viewers to make purchases while watching. However, effective marketing strategies remain a challenge due to limited empirical research and subjective biases from the absence of quantitative data. Current tools fail to capture the interdependence between live performances and feedback. This study identified computational features, formulated design requirements, and developed LiveRetro, an interactive visual analytics system. It enables comprehensive retrospective analysis of livestream e-commerce for streamers, viewers, and merchandise. LiveRetro employs enhanced visualization and time-series forecasting models to align performance features and feedback, identifying influences at channel, merchandise, feature, and segment levels. Through case studies and expert interviews, the system provides deep insights into the relationship between live performance and streaming statistics, enabling efficient strategic analysis from multiple perspectives.
2023-07-23	Content Censorship in the InterPlanetary File System	Srivatsan Sridhar et.al.	2307.12212v1	null	The InterPlanetary File System (IPFS) is currently the largest decentralized storage solution in operation, with thousands of active participants and millions of daily content transfers. IPFS is used as remote data storage for numerous blockchain-based smart contracts, Non-Fungible Tokens (NFT), and decentralized applications. We present a content censorship attack that can be executed with minimal effort and cost, and that prevents the retrieval of any chosen content in the IPFS network. The attack exploits a conceptual issue in a core component of IPFS, the Kademlia Distributed Hash Table (DHT), which is used to resolve content IDs to peer addresses. We provide efficient detection and mitigation mechanisms for this vulnerability. Our mechanisms achieve a 99.6% detection rate and mitigate 100% of the detected attacks with minimal signaling and computational overhead. We followed responsible disclosure procedures, and our countermeasures are scheduled for deployment in the future versions of IPFS.
2023-07-22	AI on the Road: A Comprehensive Analysis of Traffic Accidents and Accident Detection System in Smart Cities	Victor Adewopo et.al.	2307.12128v1	null	Accident detection and traffic analysis is a critical component of smart city and autonomous transportation systems that can reduce accident frequency, severity and improve overall traffic management. This paper presents a comprehensive analysis of traffic accidents in different regions across the United States using data from the National Highway Traffic Safety Administration (NHTSA) Crash Report Sampling System (CRSS). To address the challenges of accident detection and traffic analysis, this paper proposes a framework that uses traffic surveillance cameras and action recognition systems to detect and respond to traffic accidents spontaneously. Integrating the proposed framework with emergency services will harness the power of traffic cameras and machine learning algorithms to create an efficient solution for responding to traffic accidents and reducing human errors. Advanced intelligence technologies, such as the proposed accident detection systems in smart cities, will improve traffic management and traffic accident severity. Overall, this study provides valuable insights into traffic accidents in the US and presents a practical solution to enhance the safety and efficiency of transportation systems.
2023-07-22	Blockchain-based Cloud Data Deduplication Scheme with Fair Incentives	Mallikarjun Reddy Dorsala et.al.	2307.12052v1	null	With the rapid development of cloud computing, vast amounts of duplicated data are being uploaded to the cloud, wasting storage resources. Deduplication (dedup) is an efficient solution to save storage costs of cloud storage providers (CSPs) by storing only one copy of the uploaded data. However, cloud users do not benefit directly from dedup and may be reluctant to dedup their data. To motivate the cloud users towards dedup, CSPs offer incentives on storage fees. The problems with the existing dedup schemes are that they do not consider: (1) correctness - the incentive offered to a cloud user should be computed correctly without any prejudice. (2) fairness - the cloud user receives the file link and access rights of the uploaded data if and only if the CSP receives the storage fee. Meeting these requirements without a trusted party is non-trivial, and most of the existing dedup schemes do not apply. Another drawback is that most of the existing schemes emphasize incentives to cloud users but failed to provide a reliable incentive mechanism. As public Blockchain networks emulate the properties of trusted parties, in this paper, we propose a new Blockchain-based dedup scheme to meet the above requirements. In our scheme, a smart contract computes the incentives on storage fee, and the fairness rules are encoded into the smart contract for facilitating fair payments between the CSPs and cloud users. We prove the correctness and fairness of the proposed scheme. We also design a new incentive mechanism and show that the scheme is individually rational and incentive compatible. Furthermore, we conduct experiments by implementing the designed smart contract on Ethereum local Blockchain network and list the transactional and financial costs of interacting with the designed smart contract.
2023-07-22	Two-stream Multi-level Dynamic Point Transformer for Two-person Interaction Recognition	Yao Liu et.al.	2307.11973v1	null	As a fundamental aspect of human life, two-person interactions contain meaningful information about people's activities, relationships, and social settings. Human action recognition serves as the foundation for many smart applications, with a strong focus on personal privacy. However, recognizing two-person interactions poses more challenges due to increased body occlusion and overlap compared to single-person actions. In this paper, we propose a point cloud-based network named Two-stream Multi-level Dynamic Point Transformer for two-person interaction recognition. Our model addresses the challenge of recognizing two-person interactions by incorporating local-region spatial information, appearance information, and motion information. To achieve this, we introduce a designed frame selection method named Interval Frame Sampling (IFS), which efficiently samples frames from videos, capturing more discriminative information in a relatively short processing time. Subsequently, a frame features learning module and a two-stream multi-level feature aggregation module extract global and partial features from the sampled frames, effectively representing the local-region spatial information, appearance information, and motion information related to the interactions. Finally, we apply a transformer to perform self-attention on the learned features for the final classification. Extensive experiments are conducted on two large-scale datasets, the interaction subsets of NTU RGB+D 60 and NTU RGB+D 120. The results show that our network outperforms state-of-the-art approaches across all standard evaluation settings.
2023-07-21	Building3D: An Urban-Scale Dataset and Benchmarks for Learning Roof Structures from Point Clouds	Ruisheng Wang et.al.	2307.11914v1	null	Urban modeling from LiDAR point clouds is an important topic in computer vision, computer graphics, photogrammetry and remote sensing. 3D city models have found a wide range of applications in smart cities, autonomous navigation, urban planning and mapping etc. However, existing datasets for 3D modeling mainly focus on common objects such as furniture or cars. Lack of building datasets has become a major obstacle for applying deep learning technology to specific domains such as urban modeling. In this paper, we present an urban-scale dataset consisting of more than 160 thousand buildings along with corresponding point clouds, mesh and wire-frame models, covering 16 cities in Estonia over about 998 km². We extensively evaluate performance of state-of-the-art algorithms including handcrafted and deep feature based methods. Experimental results indicate that Building3D has challenges of high intra-class variance, data imbalance and large-scale noise. Building3D is the first and largest urban-scale building modeling benchmark, allowing a comparison of supervised and self-supervised learning methods. We believe that Building3D will facilitate future research on urban modeling, aerial path planning, mesh simplification, and semantic/part segmentation etc.
2023-07-21	Selecting the motion ground truth for loose-fitting wearables: benchmarking optical MoCap methods	Lala Shakti Swarup Ray et.al.	2307.11881v2	link	To help smart wearable researchers choose the optimal ground truth methods for motion capturing (MoCap) for all types of loose garments, we present a benchmark, DrapeMoCapBench (DMCB), specifically designed to evaluate the performance of optical marker-based and marker-less MoCap. High-cost marker-based MoCap systems are well-known as precise golden standards. However, a less well-known caveat is that they require skin-tight fitting markers on bony areas to ensure the specified precision, making them questionable for loose garments. On the other hand, marker-less MoCap methods powered by computer vision models have matured over the years, which have meager costs as smartphone cameras would suffice. To this end, DMCB uses large real-world recorded MoCap datasets to perform parallel 3D physics simulations with a wide range of diversities: six levels of drape from skin-tight to extremely draped garments, three levels of motions and six body type - gender combinations to benchmark state-of-the-art optical marker-based and marker-less MoCap methods to identify the best-performing method in different scenarios. In assessing the performance of marker-based and low-cost marker-less MoCap for casual loose garments both approaches exhibit significant performance loss (>10cm), but for everyday activities involving basic and fast motions, marker-less MoCap slightly outperforms marker-based MoCap, making it a favorable and cost-effective choice for wearable studies.
2023-07-21	Smart Machine Vision for Universal Spatial Mode Reconstruction	José D. Huerta-Morales et.al.	2307.11841v1	link	Structured light beams, in particular those carrying orbital angular momentum (OAM), have gained a lot of attention due to their potential for enlarging the transmission capabilities of communication systems. However, the use of OAM-carrying light in communications faces two major problems, namely distortions introduced during propagation in disordered media, such as the atmosphere or optical fibers, and the large divergence that high-order OAM modes experience. While the use of non-orthogonal modes may offer a way to circumvent the divergence of high-order OAM fields, artificial intelligence (AI) algorithms have shown promise for solving the mode-distortion issue. Unfortunately, current AI-based algorithms make use of large-amount data-handling protocols that generally lead to large processing time and high power consumption. Here we show that a low-power, low-cost image sensor can itself act as an artificial neural network that simultaneously detects and reconstructs distorted OAM-carrying beams. We demonstrate the capabilities of our device by reconstructing (with a 95% efficiency) individual Vortex, Laguerre-Gaussian (LG) and Bessel modes, as well as hybrid (non-orthogonal) coherent superpositions of such modes. Our work provides a potentially useful basis for the development of low-power-consumption, light-based communication devices.
2023-07-21	Zero-touch realization of Pervasive Artificial Intelligence-as-a-service in 6G networks	Emna Baccour et.al.	2307.11468v1	null	The vision of the upcoming 6G technologies, characterized by ultra-dense network, low latency, and fast data rate is to support Pervasive AI (PAI) using zero-touch solutions enabling self-X (e.g., self-configuration, self-monitoring, and self-healing) services. However, the research on 6G is still in its infancy, and only the first steps have been taken to conceptualize its design, investigate its implementation, and plan for use cases. Toward this end, academia and industry communities have gradually shifted from theoretical studies of AI distribution to real-world deployment and standardization. Still, designing an end-to-end framework that systematizes the AI distribution by allowing easier access to the service using a third-party application assisted by a zero-touch service provisioning has not been well explored. In this context, we introduce a novel platform architecture to deploy a zero-touch PAI-as-a-Service (PAIaaS) in 6G networks supported by a blockchain-based smart system. This platform aims to standardize the pervasive AI at all levels of the architecture and unify the interfaces in order to facilitate the service deployment across application and infrastructure domains, relieve users' worries about cost, security, and resource allocation, and at the same time, respect the stringent 6G performance requirements. As a proof of concept, we present a Federated Learning-as-a-service use case where we evaluate the ability of our proposed system to self-optimize and self-adapt to the dynamics of 6G networks in addition to minimizing the users' perceived costs.
2023-07-21	Large Language Model-based System to Provide Immediate Feedback to Students in Flipped Classroom Preparation Learning	Shintaro Uchiyama et.al.	2307.11388v1	null	This paper proposes a system that uses large language models to provide immediate feedback to students in flipped classroom preparation learning. This study aimed to solve challenges in the flipped classroom model, such as ensuring that students are emotionally engaged and motivated to learn. Students often have questions about the content of lecture videos in the preparation of flipped classrooms, but it is difficult for teachers to answer them immediately. The proposed system was developed using the ChatGPT API on a video-watching support system for preparation learning that is being used in real practice. Answers from ChatGPT often do not align with the context of the student's question. Therefore, this paper also proposes a method to align the answer with the context. This paper also proposes a method to collect the teacher's answers to the students' questions and use them as additional guides for the students. This paper discusses the design and implementation of the proposed system.

apple

apple watch

Publish Date	Title	Authors	PDF	Code	Abstract
2023-07-27	Asymptotic approach to singular solutions for the CR Yamabe equation, and a conjecture by H. Brezis and L. A. Peletier in the Heisenberg group	Giampiero Palatucci et.al.	2307.14933v1	null	We investigate some effects of the lack of compactness in the critical Sobolev embedding by proving that a famous conjecture of Brezis & Peletier (Essays in honor of Ennio De Giorgi -- Progr. Differ. Equ. Appl. 1989) does still hold in the Heisenberg framework: optimal functions for a natural subcritical approximations of the Sobolev quotient concentrate energy at one point which can be localized via the Green function associated to the involved domain and in clear accordance with the underlying sub-Riemannian geometry -- and consequently a new suitable definition of domains geometrical regular near their characteristic set is given. In order to achieve the aforementioned result, we need to combine proper estimates and tools to attack the related CR Yamabe equation (Jerison & Lee, J. Diff. Geom. 1987) with novel feasible ingredients in PDEs and Calculus of Variations which also aim to constitute general independent results in the Heisenberg framework, as e.g. a fine asymptotic control of the optimal functions via the Jerison & Lee extremals realizing the equality in the critical Sobolev inequality (J. Amer. Math. Soc. 1988).
2023-07-27	Pre-Schwarzian and Schwarzian norm estimates for harmonic functions with fixed analytic part	Md Firoz Ali et.al.	2307.14793v1	null	In the present article, we discuss about the estimate of the pre-Schwarzian and Schwarzian norms for locally univalent harmonic functions $f=h+\overline{g}$ in the unit disk $\mathbb{D}:={z\in\mathbb{C}:\,
2023-07-27	Quinpi: Integrating stiff hyperbolic systems with implicit high order finite volume schemes	Gabriella Puppo et.al.	2307.14685v1	null	Many interesting physical problems described by systems of hyperbolic conservation laws are stiff, and thus impose a very small time-step because of the restrictive CFL stability condition. In this case, one can exploit the superior stability properties of implicit time integration, which allows the time-step to be chosen only from accuracy requirements, and thus avoid the use of small time-steps. We discuss an efficient framework to devise high order implicit schemes for stiff hyperbolic systems without tailoring it to a specific problem. The nonlinearity of high order schemes, due to space- and time-limiting procedures which control nonphysical oscillations, makes the implicit time integration difficult, e.g. because the discrete system is nonlinear also on linear problems. This nonlinearity of the scheme is circumvented as proposed in (Puppo et al., Comm. Appl. Math. & Comput., 2023) for scalar conservation laws, where a first order implicit predictor is computed to freeze the nonlinear coefficients of the essentially non-oscillatory space reconstruction, and also to achieve limiting in time. In addition, we propose a novel conservative flux-based a-posteriori time-limiting procedure using numerical entropy indicators to detect troubled cells. The numerical tests involve classical and artificially devised stiff problems using the Euler system of gas dynamics.
2023-07-27	How to Train Your YouTube Recommender	Alexander Liu et.al.	2307.14551v1	null	YouTube provides features for users to indicate disinterest when presented with unwanted recommendations, such as the "Not interested" and "Don't recommend channel" buttons. These buttons are purported to allow the user to correct "mistakes" made by the recommendation system. Yet, relatively little is known about the empirical efficacy of these buttons. Neither is much known about users' awareness of and confidence in them. To address these gaps, we simulated YouTube users with sock puppet agents. Each agent first executed a "stain phase", where it watched many videos of one assigned topic; then it executed a "scrub phase", where it tried to remove recommendations of the assigned topic. Each agent repeatedly applied a single scrubbing strategy, which included disliking previously-watched videos or deleting them from watch history, as well as clicking the "not interested" or "don't recommend channel" button on newly-recommended videos. Overall, we found that the stain phase significantly increased the fraction of the recommended videos on the user's homepage dedicated to the assigned topic. For the scrub phase, using the "Not interested" button worked best, significantly reducing such recommendations in all topics tested, on average removing 88% of them. Neither the stain phase nor the scrub phase, however, had much effect on videopage recommendations (those given to users while they watch a video). We also ran a survey ($N$ = 300) asking adult YouTube users in the US whether they were aware of and used these buttons before, as well as how effective they found these buttons to be. We found that 44% of participants were not aware that the "Not interested" button existed. However, those who were aware of this button often used it to remove unwanted recommendations (82.8%) and found it to be modestly effective (3.42 out of 5).
2023-07-26	Measuring 3D tree imbalance of plant models using graph-theoretical approaches	Sophie J. Kersting et.al.	2307.14537v1	null	Imbalance in the 3D structure of plants can be an important indicator of insufficient light or nutrient supply, as well as excessive wind, (formerly present) physical barriers, neighbor or storm damage. It can also be a simple means to detect certain illnesses, since some diseases like the apple proliferation disease, an infection with the barley yellow dwarf virus or plant canker can cause abnormal growth, like "witches' brooms" or burls, resulting in a deviating 3D plant architecture. However, quantifying imbalance of plant growth is not an easy task, and it requires a mathematically sound 3D model of plants to which imbalance indices can be applied. Current models of plants are often based on stacked cylinders or voxel matrices and do not allow for measuring the degree of 3D imbalance in the branching structure of the whole plant. On the other hand, various imbalance indices are readily available for so-called graph-theoretical trees and are frequently used in areas like phylogenetics and computer science. While only some basic ideas of these indices can be transferred to the 3D setting, graph-theoretical trees are a logical foundation for 3D plant models that allow for elegant and natural imbalance measures. In this manuscript, our aim is thus threefold: We first present a new graph-theoretical 3D model of plants and discuss desirable properties of imbalance measures in the 3D setting. We then introduce and analyze eight different imbalance indices and their properties. Thirdly, we illustrate all our findings using a data set of 63 bush beans. Moreover, we implemented all our indices in the publicly available R software package treeDbalance accompanying this manuscript.
2023-07-26	Generating functions of non-backtracking walks on weighted digraphs: radius of convergence and Ihara's theorem	Vanni Noferini et.al.	2307.14200v1	null	It is known that the generating function associated with the enumeration of non-backtracking walks on finite graphs is a rational matrix-valued function of the parameter; such function is also closely related to graph-theoretical results such as Ihara's theorem and the zeta function on graphs. In [P. Grindrod, D. J. Higham, V. Noferini, The deformed graph Laplacian and its application to network centrality analysis, SIAM J. Matrix Anal. Appl. 39(1), 310--341, 2018], the radius of convergence of the generating function was studied for simple (i.e., undirected, unweighted and with no loops) graphs, and shown to depend on the number of cycles in the graph. In this paper, we use technologies from the theory of polynomial and rational matrices to greatly extend these results by studying the radius of convergence of the corresponding generating function for general, possibly directed and/or weighted, graphs. We give an analogous characterization of the radius of convergence for directed unweighted graphs, showing that it depends on the number of cycles in the undirectization of the graph. For weighted graphs, we provide for the first time an exact formula for the radius of convergence, improving a previous result that exhibited a lower bound. Finally, we consider also backtracking-downweighted walks on unweighted digraphs, and we prove a version of Ihara's theorem in that case.
2023-07-26	An Antithetic Multilevel Monte Carlo-Milstein Scheme for Stochastic Partial Differential Equations	Abdul-Lateef Haji-Al et.al.	2307.14169v1	null	We present a novel multilevel Monte Carlo approach for estimating quantities of interest for stochastic partial differential equations (SPDEs). Drawing inspiration from [Giles and Szpruch: Antithetic multilevel Monte Carlo estimation for multi-dimensional SDEs without Lévy area simulation, Annals of Appl. Prob., 2014], we extend the antithetic Milstein scheme for finite-dimensional stochastic differential equations to Hilbert space-valued SPDEs. Our method has the advantages of both Euler and Milstein discretizations, as it is easy to implement and does not involve intractable Lévy area terms. Moreover, the antithetic correction in our method leads to the same variance decay in a MLMC algorithm as the standard Milstein method, resulting in significantly lower computational complexity than a corresponding MLMC Euler scheme. Our approach is applicable to a broader range of non-linear diffusion coefficients and does not require any commutative properties. The key component of our MLMC algorithm is a truncated Milstein-type time stepping scheme for SPDEs, which accelerates the rate of variance decay in the MLMC method when combined with an antithetic coupling on the fine scales. We combine the truncated Milstein scheme with appropriate spatial discretizations and noise approximations on all scales to obtain a fully discrete scheme and show that the antithetic coupling does not introduce an additional bias.
2023-07-26	Energy Spectrum Analysis on a Red Blood Cell Model	Tetsuya Yamamoto et.al.	2307.14029v1	null	It is important to understand the dynamics of red blood cells (RBCs) in blood flow. This requires the formulation of coarse-grained RBC models that reproduce the hydrodynamic properties of blood accurately. One of the models that successfully reproduce the rheology and morphology of blood has been proposed by Fedosov et al. [D. A. Fedosov, B. Caswell, and G. E. Karniadakis, Comput. Methods Appl. Mech. Eng., Vol. 199, 1937-1948 (2010)]. The proposed RBC model contains several parameters whose values are determined either by various experiments or physical requirements. In this study, we developed a new method of determining the parameter values precisely from the fluctuations of the RBC membrane. Specifically, we studied the relationship between the spectra of the fluctuations and model parameters. Characteristic peaks were observed in the spectra, whose peak frequencies were dependent on the parameter values. In addition, we investigated the spectra of the radius of gyration. We identified the peaks originating from the spring potential and the volume-conserving potential appearing in the spectra. These results lead to the precise experimental determination of the parameters used in the RBC model.
2023-07-25	Fast Fabrication of WS2/Bi2Se3 Heterostructures for High Performance Photodetection	Fan Li et.al.	2307.13852v1	null	Two-dimensional (2D) material heterostructures have attracted considerable attention owing to their interesting and novel physical properties, which expand the possibilities for future optoelectronic, photovoltaic, and nanoelectronic applications. A portable, fast, and deterministic transfer technique is highly needed for the fabrication of heterostructures. Herein, we report a fast half wet poly(dimethylsiloxane) (PDMS) transfer process utilizing the change of adhesion energy with the help of micron-sized water droplets. Using this method, a vertical stacking of the WS2/Bi2Se3 heterostructure with a straddling band configuration is successfully assembled on a fluorophlogopite substrate. Thanks to the complementary band gaps and high efficiency of interfacial charge transfer, the photodetector based on the heterostructure exhibits a superior responsivity of 109.9 A/W for a visible incident light at 473 nm and 26.7 A/W for a 1064 nm near-infrared illumination. Such high photoresponsivity of the heterostructure demonstrates that our transfer method not only owns time efficiency but also ensures high quality of the heterointerface. Our study may open new pathways to the fast and massive fabrication of various vertical 2D heterostructures for applications in twistronics/valleytronics and other band engineering devices.
2023-07-25	Insights into Cognitive Engagement: Comparing the Effectiveness of Game-Based and Video-Based Learning	Shayla Sharmin et.al.	2307.13637v1	null	The analysis of brain signals holds considerable importance in enhancing our comprehension of diverse learning techniques and cognitive mechanisms. Game-based learning is increasingly being recognized for its interactive and engaging educational approach. A pilot study of twelve participants divided into experimental and control groups was conducted to understand its effects on cognitive processes. Both groups were provided with the same contents regarding the basic structure of the graph. The participants in the experimental group engaged in a quiz-based game, while those in the control group watched a pre-recorded video. Functional Near-Infrared Spectroscopy (fNIRS) was employed to acquire cerebral signals, and a series of pre and post-tests were administered. The findings of our study indicate that the group engaged in the game activity displayed elevated levels of oxygenated hemoglobin compared to the group involved in watching videos. Conversely, the deoxygenated hemoglobin levels remained relatively consistent across both groups throughout the learning process. The aforementioned findings suggest that the use of game-based learning has a substantial influence on cognitive processes. Furthermore, it is evident that both the game and video groups exhibited higher neural activity in the Lateral Prefrontal cortex (PFC). The oxygenated hemoglobin ratio demonstrates that the game group had 2.33 times more neural processing in the Lateral PFC than the video group. This data is further supported by the knowledge gain analysis, which indicates that the game-based approach resulted in a 47.74% higher knowledge gain than the video group, as calculated from the difference in pre-and post-test scores.
2023-07-25	Isogeometric analysis of insoluble surfactant spreading on a thin film	David Medina et.al.	2307.13605v1	null	In this paper we tackle the problem of surfactant spreading on a thin liquid film in the framework of isogeometric analysis. We consider a mathematical model that describes this phenomenon as an initial boundary value problem (IBVP) that includes two coupled fourth order partial differential equations (PDEs), one for the film height and one for the surfactant concentration. In order to solve this problem numerically, it is customary to transform it into a mixed problem that includes at most second order PDEs. However, the higher-order continuity of the approximation functions in Isogeometric Analysis (IGA) allows us to deal with the weak form of the fourth order PDEs directly, without the need of resorting to mixed methods. We demonstrate numerically that the IGA solution is able to reproduce results obtained before with mixed approaches. Complex phenomena such as Marangoni-driven fingering instabilities triggered by perturbations are easily captured.
2023-07-25	Local density of states above a disk -- geometrical vs. thermal boundary conditions	Svend-Age Biehs et.al.	2307.13438v1	null	We analytically calculate the contribution to the local density of states due to thermal sources in a disk-like patch within the framework of fluctuational electrodynamics. We further introduce a wavevector cutoff method to approximate this contribution. We compare the results obtained with the source and cutoff method with the numerical exact LDOS above a metal disk attained by SCUFF-EM calculations. By this comparison we highlight the difference and resemblance of thermal and geometrical boundary conditions which are both relevant for near-field scanning microscope measurements. Finally, we give an outlook to general lateral temperature profiles and compare it with surface profiles.
2023-07-25	A Pairwise Dataset for GUI Conversion and Retrieval between Android Phones and Tablets	Han Hu et.al.	2307.13225v1	null	With the popularity of smartphones and tablets, users have become accustomed to using different devices for different tasks, such as using their phones to play games and tablets to watch movies. To conquer the market, one app is often available on both smartphones and tablets. However, although one app has similar graphic user interfaces (GUIs) and functionalities on phone and tablet, current app developers typically start from scratch when developing a tablet-compatible version of their app, which drives up development costs and wastes existing design resources. Researchers are attempting to employ deep learning in automated GUIs development to enhance developers' productivity. Deep learning models rely heavily on high-quality datasets. There are currently several publicly accessible GUI page datasets for phones, but none for pairwise GUIs between phones and tablets. This poses a significant barrier to the employment of deep learning in automated GUI development. In this paper, we collect and make public the Papt dataset, which is a pairwise dataset for GUI conversion and retrieval between Android phones and tablets. The dataset contains 10,035 phone-tablet GUI page pairs from 5,593 phone-tablet app pairs. We illustrate the approaches of collecting pairwise data and statistical analysis of this dataset. We also illustrate the advantages of our dataset compared to other current datasets. Through preliminary experiments on this dataset, we analyse the present challenges of utilising deep learning in automated GUI development and find that our dataset can assist the application of some deep learning models to tasks involving automatic GUI development.
2023-07-24	A Connection between One-Step Regularization and Critic Regularization in Reinforcement Learning	Benjamin Eysenbach et.al.	2307.12968v1	link	As with any machine learning problem with limited data, effective offline RL algorithms require careful regularization to avoid overfitting. One-step methods perform regularization by doing just a single step of policy improvement, while critic regularization methods do many steps of policy improvement with a regularized objective. These methods appear distinct. One-step methods, such as advantage-weighted regression and conditional behavioral cloning, truncate policy iteration after just one step. This "early stopping" makes one-step RL simple and stable, but can limit its asymptotic performance. Critic regularization typically requires more compute but has appealing lower-bound guarantees. In this paper, we draw a close connection between these methods: applying a multi-step critic regularization method with a regularization coefficient of 1 yields the same policy as one-step RL. While practical implementations violate our assumptions and critic regularization is typically applied with smaller regularization coefficients, our experiments nevertheless show that our analysis makes accurate, testable predictions about practical offline RL methods (CQL and one-step RL) with commonly-used hyperparameters. Our results do not imply that every problem can be solved with a single step of policy improvement, but rather that one-step RL might be competitive with critic regularization on RL problems that demand strong regularization.
2023-07-24	Rechargeable Li/Cl$_2$ battery down to -80 °C	Peng Liang et.al.	2307.12947v1	null	Low temperature rechargeable batteries are important to life in cold climates, polar/deep-sea expeditions and space explorations. Here, we report ~ 3.5 - 4 V rechargeable lithium/chlorine (Li/Cl2) batteries operating down to -80 °C, employing Li metal negative electrode, a novel CO2 activated porous carbon (KJCO2) as the positive electrode, and a high ionic conductivity (~ 5 to 20 mS cm-1 from -80 °C to 25 °C) electrolyte comprised of 1 M aluminum chloride (AlCl3), 0.95 M lithium chloride (LiCl), and 0.05 M lithium bis(fluorosulfonyl)imide (LiFSI) in low melting point (-104.5 °C) thionyl chloride (SOCl2). Between room-temperature and -80 °C, the Li/Cl2 battery delivered up to ~ 30,000 - 4,500 mAh g-1 first discharge capacity and a 1,200 - 5,000 mAh g-1 reversible capacity (discharge voltages in ~ 3.5 to 3.1 V) over up to 130 charge-discharge cycles. Mass spectrometry and X-ray photoelectron spectroscopy (XPS) probed Cl2 trapped in the porous carbon upon LiCl electro-oxidation during charging. At lower temperature down to -80 °C, SCl2/S2Cl2 and Cl2 generated by electro-oxidation in the charging step were trapped in porous KJCO2 carbon, allowing for reversible reduction to afford a high discharge voltage plateau near ~ 4 V with up to ~ 1000 mAh g-1 capacity for SCl2/S2Cl2 reduction and up to ~ 4000 mAh g-1 capacity at ~ 3.1 V plateau for Cl2 reduction. Towards practical use, we made CR2032 Li/Cl2 battery cells to drive digital watches at -40 °C and light emitting diode at -80 °C, opening Li/Cl2 secondary batteries for ultra-cold conditions.
2023-07-24	Scattered trinomials of $\mathbb{F}_{q^6}[X]$ in even characteristic	Daniele Bartoli et.al.	2307.12829v1	null	In recent years, several families of scattered polynomials have been investigated in the literature. However, most of them only exist in odd characteristic. In [B. Csajbók, G. Marino and F. Zullo: New maximum scattered linear sets of the projective line, Finite Fields Appl. 54 (2018), 133-150; G. Marino, M. Montanucci and F. Zullo: MRD-codes arising from the trinomial $x^q+x^{q^3}+cx^{q^5}\in\mathbb{F}_{q^6}[x]$, Linear Algebra Appl. 591 (2020), 99-114], the authors proved that the trinomial $f_c(X)=X^{q}+X^{q^{3}}+cX^{q^{5}}$ of $\mathbb{F}_{q^6}[X]$ is scattered under the assumptions that $q$ is odd and $c^2+c=1$. They also explicitly observed that this is false when $q$ is even. In this paper, we provide a different set of conditions on $c$ for which this trinomial is scattered in the case of even $q$. Using tools of algebraic geometry in positive characteristic, we show that when $q$ is even and sufficiently large, there are roughly $q^3$ elements $c \in \mathbb{F}_{q^6}$ such that $f_c(X)$ is scattered. Also, we prove that the corresponding MRD-codes and $\mathbb{F}_q$-linear sets of $\mathrm{PG}(1,q^6)$ are not equivalent to the previously known ones.
2023-07-24	Multi-Shooting Differential Dynamic Programming for Hybrid Systems using Analytical Derivatives	Shubham Singh et.al.	2307.12606v1	null	Differential Dynamic Programming (DDP) is a popular technique used to generate motion for dynamic-legged robots in the recent past. However, in most cases, only the first-order partial derivatives of the underlying dynamics are used, resulting in the iLQR approach. Neglecting the second-order terms often slows down the convergence rate compared to full DDP. Multi-Shooting is another popular technique to improve robustness, especially if the dynamics are highly non-linear. In this work, we consider Multi-Shooting DDP for trajectory optimization of a bounding gait for a simplified quadruped model. As the main contribution, we develop Second-Order analytical partial derivatives of the rigid-body contact dynamics, extending our previous results for fixed/floating base models with multi-DoF joints. Finally, we show the benefits of a novel Quasi-Newton method for approximating second-order derivatives of the dynamics, leading to order-of-magnitude speedups in the convergence compared to the full DDP method.
2023-07-24	Automated Mapping of Adaptive App GUIs from Phones to TVs	Han Hu et.al.	2307.12522v1	null	With the increasing interconnection of smart devices, users often desire to adopt the same app on quite different devices for identical tasks, such as watching the same movies on both their smartphones and TV. However, the significant differences in screen size, aspect ratio, and interaction styles make it challenging to adapt Graphical User Interfaces (GUIs) across these devices. Although there are millions of apps available on Google Play, only a few thousand are designed to support smart TV displays. Existing techniques to map a mobile app GUI to a TV either adopt a responsive design, which struggles to bridge the substantial gap between phone and TV or use mirror apps for improved video display, which requires hardware support and extra engineering efforts. Instead of developing another app for supporting TVs, we propose a semi-automated approach to generate corresponding adaptive TV GUIs, given the phone GUIs as the input. Based on our empirical study of GUI pairs for TV and phone in existing apps, we synthesize a list of rules for grouping and classifying phone GUIs, converting them to TV GUIs, and generating dynamic TV layouts and source code for the TV display. Our tool is not only beneficial to developers but also to GUI designers, who can further customize the generated GUIs for their TV app development. An evaluation and user study demonstrate the accuracy of our generated GUIs and the usefulness of our tool.
2023-07-23	LiveRetro: Visual Analytics for Strategic Retrospect in Livestream E-Commerce	Yuchen Wu et.al.	2307.12213v1	null	Livestream e-commerce integrates live streaming and online shopping, allowing viewers to make purchases while watching. However, effective marketing strategies remain a challenge due to limited empirical research and subjective biases from the absence of quantitative data. Current tools fail to capture the interdependence between live performances and feedback. This study identified computational features, formulated design requirements, and developed LiveRetro, an interactive visual analytics system. It enables comprehensive retrospective analysis of livestream e-commerce for streamers, viewers, and merchandise. LiveRetro employs enhanced visualization and time-series forecasting models to align performance features and feedback, identifying influences at channel, merchandise, feature, and segment levels. Through case studies and expert interviews, the system provides deep insights into the relationship between live performance and streaming statistics, enabling efficient strategic analysis from multiple perspectives.
2023-07-21	Large Language Model-based System to Provide Immediate Feedback to Students in Flipped Classroom Preparation Learning	Shintaro Uchiyama et.al.	2307.11388v1	null	This paper proposes a system that uses large language models to provide immediate feedback to students in flipped classroom preparation learning. This study aimed to solve challenges in the flipped classroom model, such as ensuring that students are emotionally engaged and motivated to learn. Students often have questions about the content of lecture videos in the preparation of flipped classrooms, but it is difficult for teachers to answer them immediately. The proposed system was developed using the ChatGPT API on a video-watching support system for preparation learning that is being used in real practice. Answers from ChatGPT often do not align with the context of the student's question. Therefore, this paper also proposes a method to align the answer with the context. This paper also proposes a method to collect the teacher's answers to the students' questions and use them as additional guides for the students. This paper discusses the design and implementation of the proposed system.
2023-07-21	Fused Spectatorship: Designing Bodily Experiences Where Spectators Become Players	Rakesh Patibanda et.al.	2307.11297v1	null	Spectating digital games can be exciting. However, due to its vicarious nature, spectators often wish to engage in the gameplay beyond just watching and cheering. To blur the boundaries between spectators and players, we propose a novel approach called "Fused Spectatorship", where spectators watch their hands play games by loaning bodily control to a computational Electrical Muscle Stimulation (EMS) system. To showcase this concept, we designed three games where spectators loan control over both their hands to the EMS system and watch them play these competitive and collaborative games. A study with 12 participants suggested that participants could not distinguish if they were watching their hands play, or if they were playing the games themselves. We used our results to articulate four spectator experience themes and four fused spectator types, the behaviours they elicited and offer one design consideration to support each of these behaviours. We also discuss the ethical design considerations of our approach to help game designers create future fused spectatorship experiences.
2023-07-20	Underwater 3D positioning on smart devices	Tuochao Chen et.al.	2307.11263v1	null	The emergence of water-proof mobile and wearable devices (e.g., Garmin Descent and Apple Watch Ultra) designed for underwater activities like professional scuba diving, opens up opportunities for underwater networking and localization capabilities on these devices. Here, we present the first underwater acoustic positioning system for smart devices. Unlike conventional systems that use floating buoys as anchors at known locations, we design a system where a dive leader can compute the relative positions of all other divers, without any external infrastructure. Our intuition is that in a well-connected network of devices, if we compute the pairwise distances, we can determine the shape of the network topology. By incorporating orientation information about a single diver who is in the visual range of the leader device, we can then estimate the positions of all the remaining divers, even if they are not within sight. We address various practical problems including detecting erroneous distance estimates, addressing rotational and flipping ambiguities as well as designing a distributed timestamp protocol that scales linearly with the number of devices. Our evaluations show that our distributed system running on underwater deployments of 4-5 commodity smart devices can perform pairwise ranging and localization with median errors of 0.5-0.9 m and 0.9-1.6 m.
2023-07-20	Hybrid FEM and peridynamic simulation of hydraulic fracture propagation in saturated porous media	Tao Ni et.al.	2307.10929v1	null	This paper presents a hybrid modeling approach for simulating hydraulic fracture propagation in saturated porous media: ordinary state-based peridynamics is used to describe the behavior of the solid phase, including the deformation and crack propagation, while FEM is used to describe the fluid flow and to evaluate the pore pressure. Classical Biot poroelasticity theory is adopted. The proposed approach is first verified by comparing its results with the exact solutions of two examples. Subsequently, a series of pressure- and fluid-driven crack propagation examples are solved and presented. The phenomenon of fluid pressure oscillation is observed in the fluid-driven crack propagation examples, which is consistent with previous experimental and numerical evidences. All the presented examples demonstrate the capability of the proposed approach in solving problems of hydraulic fracture propagation in saturated porous media.
2023-07-20	The Role of Entropy and Reconstruction in Multi-View Self-Supervised Learning	Borja Rodríguez-Gálvez et.al.	2307.10907v1	link	The mechanisms behind the success of multi-view self-supervised learning (MVSSL) are not yet fully understood. Contrastive MVSSL methods have been studied through the lens of InfoNCE, a lower bound of the Mutual Information (MI). However, the relation between other MVSSL methods and MI remains unclear. We consider a different lower bound on the MI consisting of an entropy and a reconstruction term (ER), and analyze the main MVSSL families through its lens. Through this ER bound, we show that clustering-based methods such as DeepCluster and SwAV maximize the MI. We also re-interpret the mechanisms of distillation-based approaches such as BYOL and DINO, showing that they explicitly maximize the reconstruction term and implicitly encourage a stable entropy, and we confirm this empirically. We show that replacing the objectives of common MVSSL methods with this ER bound achieves competitive performance, while making them stable when training with smaller batch sizes or smaller exponential moving average (EMA) coefficients. Github repo: https://github.com/apple/ml-entropy-reconstruction.
2023-07-20	Asymptotic expansion for branching killed Brownian motion with drift	Haojie Hou et.al.	2307.10754v1	null	Let $Z_t^{(0,\infty)}$ be the point process formed by the positions of all particles alive at time $t$ in a branching Brownian motion with drift and killed upon reaching 0. We study the asymptotic expansions of $Z_t^{(0,\infty)}(A)$ for $A= (a,b)$ and $A=(a,\infty)$ under the assumption that $\sum_{k=1}^\infty k(\log k)^{1+\lambda} p_k <\infty$ for large $\lambda$ in the regime of $\theta \in [0,\sqrt{2})$. These results extend and sharpen the results of Louidor and Saglietti [J. Stat. Phys, 2020] and Kesten [Stochastic Process. Appl., 1978].
2023-07-20	Kick Back & Relax: Learning to Reconstruct the World by Watching SlowTV	Jaime Spencer et.al.	2307.10713v1	link	Self-supervised monocular depth estimation (SS-MDE) has the potential to scale to vast quantities of data. Unfortunately, existing approaches limit themselves to the automotive domain, resulting in models incapable of generalizing to complex environments such as natural or indoor settings. To address this, we propose a large-scale SlowTV dataset curated from YouTube, containing an order of magnitude more data than existing automotive datasets. SlowTV contains 1.7M images from a rich diversity of environments, such as worldwide seasonal hiking, scenic driving and scuba diving. Using this dataset, we train an SS-MDE model that provides zero-shot generalization to a large collection of indoor/outdoor datasets. The resulting model outperforms all existing SSL approaches and closes the gap on supervised SoTA, despite using a more efficient architecture. We additionally introduce a collection of best-practices to further maximize performance and zero-shot generalization. This includes 1) aspect ratio augmentation, 2) camera intrinsic estimation, 3) support frame randomization and 4) flexible motion estimation. Code is available at https://github.com/jspenmar/slowtv_monodepth.
2023-07-20	Influence of phytohormones on seed germination of Solanum linnaeanum	Aram Akram Mohammed et.al.	2307.11109v1	null	The aim of this study was to determine the germination ability and seedling growth of the apple of Sodom by soaking in water, gibberellin (GA3), naphthylacetic acid (NAA), and salicylic acid (SA), separately. The findings showed that NAA at 50 mgL-1 produced superior germination (77.78%), germination speed (1.43 seeds/time interval), hypocotyl length (1.01 cm), hypocotyl diameter (1.13 mm), leaf number (2.66), and root number (17.25), followed by 50 and 100 mgL-1 GA3, particularly in germination percentage. The best root length (5.33 cm) was detected at 100 mgL-1 SA. In contrast, control seeds and water-soaked seeds showed inferior results. The seeds of the apple of Sodom can be germinated successfully as a result of treatment with NAA at 50 mgL-1, followed by GA3 at 50 and 100 mgL-1.
2023-07-19	Superfast and sub-wavelength orbital rotation of plasmonic particles in focused Gaussian beams	Lei-Ming Zhou et.al.	2307.10090v1	null	The use of nanophotonics for optical manipulation has continuously attracted interest in both fundamental research and practical applications, due to its significantly enhanced capabilities at the nanoscale. In this work, we showed that plasmonic particles can be trapped at off-axis location in Gaussian beams assisted by surface plasmon resonance. The off-axis displacement can be tuned at the sub-wavelength scale by the incident light beams. Based on these, we propose that a superfast orbital rotation of particles in continuous-wave laser beam can be realized in tightly focused circularly polarized Gaussian beams. The rotation has a tunable orbital radius at the sub-wavelength scale and a superfast rotation speed (more than 10^4 r/s in water under common laboratory conditions). Our work will aid in the development of optically driven nanomachines, and find applications in micro/nano-rheology, micro-fluid mechanics, and biological research at the nanoscale.
2023-07-19	DFT+μ: Density Functional Theory for Muon Site Determination	S. J. Blundell et.al.	2307.10076v1	null	The technique of muon spin rotation (μSR) has emerged in the last few decades as one of the most powerful methods of obtaining local magnetic information. To make the technique fully quantitative, it is necessary to have an accurate estimate of where inside the crystal structure the muon implants. This can be provided by density functional theory calculations using an approach that is termed DFT+μ, density functional theory with the implanted muon included. This article reviews this approach, describes some recent successes in particular μSR experiments, and suggests some avenues for future exploration.
2023-07-19	Epilegomena to the study of semiclassical orthogonal polynomials	K. Castillo et.al.	2307.10331v2	null	In his monograph [Classical and quantum orthogonal polynomials in one variable, Cambridge University Press, 2005 (paperback edition 2009)], Ismail conjectured that certain structure relations involving the Askey-Wilson operator characterize proper subsets of the set of all $\mathcal{D}_q$-classical orthogonal polynomials, here to be understood as the Askey-Wilson polynomials and their limit cases. In this paper we give two characterization theorems for $\mathcal{D}_q$-semiclassical (and classical) orthogonal polynomials in consonance with the pioneering works by Maroni [Ann. Mat. Pura. Appl. (1987)] and Bonan, Lubinsky, and Nevai [SIAM J. Math. Anal. 18 (1987)] for the standard derivative, re-establishing in this context the perfect "symmetry" between the standard derivative and the Askey-Wilson operator. As an application, we present a sequence of $\mathcal{D}_q$-semiclassical orthogonal polynomials of class two that disproves Ismail's conjectures. Further results are presented for Hahn's operator.

smart glass

Publish Date	Title	Authors	PDF	Code	Abstract
2023-07-27	Cavity-Mediated Molecular Entanglement and Generation of Non-Classical States of Light	Davis M. Welakuh et.al.	2307.15047v1	null	The generation and control of entanglement in a quantum mechanical system is a critical element of nearly all quantum applications. Molecular systems are a promising candidate, with numerous degrees of freedom able to be targeted. However, knowledge of inter-system entanglement mechanisms in such systems is limited. In this work, we demonstrate the generation of entanglement between vibrational degrees of freedom in molecules via strong coupling to a cavity mode driven by a weak coherent field. In a bi-molecular system, we show entanglement can not only be generated between the cavity and molecular system, but also between molecules. This process also results in the generation of non-classical states of light, providing potential pathways for harnessing entanglement in molecular systems.
2023-07-27	Vibrations and Heat Transfer in Glasses: the role played by Disorder	Anne Tanguy et.al.	2307.15038v1	null	Amorphous materials are also distinguished from crystals by their thermal properties. The structural disorder seems to be responsible both for a significant increase in heat capacity compared to crystals of the same composition, but also for a significant decrease in thermal conductivity. The temperature dependence of thermal conductivity, unusual for common interpretations of solid-state physics, gave rise to a lot of debates. We review in this article different interpretations of thermal conductivity in amorphous materials. We show finally that the temperature dependence of thermal conductivity in dielectric materials can be understood by relating it to the disorder-dependent harmonic vibrational eigenmodes.
2023-07-27	Smart Contract Migration: Security Analysis and Recommendations from Ethereum to Arbitrum	Xueyan Tang et.al.	2307.14773v1	null	This research aims to explore the security risks posed by compatibility and protocol differences in smart contract migration, using the migration of smart contracts from Ethereum to Arbitrum as a case study. Through literature review, online data collection, expert participation, and analysis of smart contract vulnerability cases, this paper conducts an in-depth research of the differences between Ethereum and Arbitrum in areas such as Messaging, Block Properties, Contract Address Alias, and Gas Fees. The research findings indicate the presence of certain security issues during the migration process from Ethereum to Arbitrum, such as abnormal operation of the sequencer resulting in outdated off-chain data retrieval, time-based logical errors, failed permission checks, DOS attacks, and gas loss due to L1-to-L2 transaction failures. To address these security issues, this paper proposes corresponding solutions and recommendations to ensure the security and meet the requirements of the migration process. Additionally, this research emphasizes the continued attention and support for the security issues of smart contract migration through the case of smart contract migration from Ethereum to Arbitrum. It is worth noting that this research is the first in-depth research of smart contract security migration from Ethereum to Arbitrum.
2023-07-27	Mathematical modelling and computational reduction of molten glass fluid flow in a furnace melting basin	Francesco Ballarin et.al.	2307.14700v1	null	In this work, we present the modelling and numerical simulation of a molten glass fluid flow in a furnace melting basin. We first derive a model for a molten glass fluid flow and present numerical simulations based on the Finite Element Method (FEM). We further discuss and validate the results obtained from the simulations by comparing them with experimental results. Finally, we also present a non-intrusive Proper Orthogonal Decomposition (POD) based on Artificial Neural Networks (ANN) to efficiently handle scenarios which require multiple simulations of the fluid flow upon changing parameters of relevant industrial interest. This approach lets us obtain solutions of a complex 3D model, with good accuracy with respect to the FEM solution, yet with negligible associated computational times.
2023-07-26	High Speed Precise Refractive Index Modification for Photonic Chips through Phase Aberrated Pulsed Lasers	Bangshan Sun et.al.	2307.14451v1	null	Integrated photonic chips have significant potential in telecommunications, classic computing, quantum systems, and topological photonics. Direct laser writing offers unique capability for creating three-dimensional photonic devices in an optical glass chip with quick prototyping. However, existing laser writing schemes cannot create index-modified structures in glass that precisely match the laser focal shape while also achieving high scanning speed and high refractive index contrast. Here, we introduce the theory of a refractive index modification scheme that combines the advantages of both traditional non-thermal and thermal regime fabrication methods. We also propose a model of waveguide formation that was verified through a thorough study on the effects of phase aberrations on the laser focus. The presented new photonic chip fabrication scheme uses a novel focal intensity distribution, where pulse energy is relocated to the bottom of a laser focus by manipulating primary and higher order spherical aberrations. The technique can produce index modifications with high scanning speed (20 mm/s or higher), high index contrast (16 x 10-3), and high precision to fabricate with arbitrary cross-sections. This method has potential to expand the capabilities of photonic chips in applications that require small-scale, high precision, or high contrast refractive index control.
2023-07-26	Event-based Vision for Early Prediction of Manipulation Actions	Daniel Deniz et.al.	2307.14332v1	null	Neuromorphic visual sensors are artificial retinas that output sequences of asynchronous events when brightness changes occur in the scene. These sensors offer many advantages including very high temporal resolution, no motion blur and smart data compression ideal for real-time processing. In this study, we introduce an event-based dataset on fine-grained manipulation actions and perform an experimental study on the use of transformers for action prediction with events. There is enormous interest in the fields of cognitive robotics and human-robot interaction on understanding and predicting human actions as early as possible. Early prediction allows anticipating complex stages for planning, enabling effective and real-time interaction. Our Transformer network uses events to predict manipulation actions as they occur, using online inference. The model succeeds at predicting actions early on, building up confidence over time and achieving state-of-the-art classification. Moreover, the attention-based transformer architecture allows us to study the role of the spatio-temporal patterns selected by the model. Our experiments show that the Transformer network captures action dynamic features outperforming video-based approaches and succeeding with scenarios where the differences between actions lie in very subtle cues. Finally, we release the new event dataset, which is the first in the literature for manipulation action recognition. Code will be available at https://github.com/DaniDeniz/EventVisionTransformer.
2023-07-26	Simulation of Open Quantum Systems via Low-Depth Convex Unitary Evolutions	Joseph Peetz et.al.	2307.14325v2	null	Simulating physical systems on quantum devices is one of the most promising applications of quantum technology. Current quantum approaches to simulating open quantum systems are still practically challenging on NISQ-era devices, because they typically require ancilla qubits and extensive controlled sequences. In this work, we propose a hybrid quantum-classical approach for simulating a class of open system dynamics called random-unitary channels. These channels naturally decompose into a series of convex unitary evolutions, which can then be efficiently sampled and run as independent circuits. The method does not require deep ancilla frameworks and thus can be implemented with lower noise costs. We implement simulations of open quantum systems up to dozens of qubits and with large channel rank.
2023-07-26	Large-scale Fully-Unsupervised Re-Identification	Gabriel Bertocco et.al.	2307.14278v1	null	Fully-unsupervised Person and Vehicle Re-Identification have received increasing attention due to their broad applicability in surveillance, forensics, event understanding, and smart cities, without requiring any manual annotation. However, most of the prior art has been evaluated in datasets that have just a couple thousand samples. Such small-data setups often allow the use of costly techniques in time and memory footprints, such as Re-Ranking, to improve clustering results. Moreover, some previous work even pre-selects the best clustering hyper-parameters for each dataset, which is unrealistic in a large-scale fully-unsupervised scenario. In this context, this work tackles a more realistic scenario and proposes two strategies to learn from large-scale unlabeled data. The first strategy performs a local neighborhood sampling to reduce the dataset size in each iteration without violating neighborhood relationships. A second strategy leverages a novel Re-Ranking technique, which has a lower time upper bound complexity and reduces the memory complexity from O(n^2) to O(kn) with k << n. To avoid the pre-selection of specific hyper-parameter values for the clustering algorithm, we also present a novel scheduling algorithm that adjusts the density parameter during training, to leverage the diversity of samples and keep the learning robust to noisy labeling. Finally, due to the complementary knowledge learned by different models, we also introduce a co-training strategy that relies upon the permutation of predicted pseudo-labels, among the backbones, with no need for any hyper-parameters or weighting optimization. The proposed methodology outperforms the state-of-the-art methods in well-known benchmarks and in the challenging large-scale Veri-Wild dataset, with a faster and memory-efficient Re-Ranking strategy, and a large-scale, noisy-robust, and ensemble-based learning approach.
2023-07-26	Giant conductance of PSS:PEDOT micro-surfaces induced by microbubble lithography	Anand Dev Ranjan et.al.	2307.14231v1	null	We provide direct evidence of the effects of interface engineering of various substrates by Microbubble lithography (MBL). We choose a model organic plastic (or polymer) poly(3,4-ethylenedioxythiophene) polystyrene sulfonate (PEDOT:PSS), with conductivity of 140 S/cm, as a representative organic system to showcase our technique. Thus, we fabricate permanent patterns of PEDOT:PSS on glass, followed by a flexible PDMS substrate, and observe conductivity enhancement of 5 times on the former (694 S/cm), and 20 times (2844 S/cm) on the latter, without the use of external doping agents or invasive chemical treatment. Probing the patterned interface, we observe that MBL is able to tune the conformational states of PEDOT:PSS from coils in the pristine form, to extended coils on glass, and almost linear structures in PDMS due to its more malleable liquid-like interface. This results in higher ordering and vanishing grain boundaries leading to the highest conductivity of PEDOT:PSS on PDMS substrates.
2023-07-25	Integration of Digital Twin and Federated Learning for Securing Vehicular Internet of Things	Deepti Gupta et.al.	2307.13794v1	null	In the present era of advanced technology, the Internet of Things (IoT) plays a crucial role in enabling smart connected environments. This includes various domains such as smart homes, smart healthcare, smart cities, smart vehicles, and many others. With ubiquitous smart connected devices and systems, a large amount of data associated with them is at a prime risk from malicious entities (e.g., users, devices, applications) in these systems. Innovative technologies, including cloud computing, Machine Learning (ML), and data analytics, support the development of anomaly detection models for the Vehicular Internet of Things (V-IoT), which encompasses collaborative automatic driving and enhanced transportation systems. However, traditional centralized anomaly detection models fail to provide better services for connected vehicles due to issues such as high latency, privacy leakage, performance overhead, and model drift. Recently, Federated Learning (FL) has gained significant recognition for its ability to address data privacy concerns in the IoT domain. Digital Twin (DT) proves beneficial in addressing uncertain crises and data security issues by creating a virtual replica that simulates various factors, including traffic trajectories, city policies, and vehicle utilization. However, the effectiveness of a V-IoT DT system heavily relies on the collection of long-term and high-quality data to make appropriate decisions. This paper introduces a Hierarchical Federated Learning (HFL) based anomaly detection model for V-IoT, aiming to enhance the accuracy of the model. Our proposed model integrates both DT and HFL approaches to create a comprehensive system for detecting malicious activities using an anomaly detection model. Additionally, real-world V-IoT use case scenarios are presented to demonstrate the application of the proposed model.
2023-07-25	ChildGAN: Large Scale Synthetic Child Facial Data Using Domain Adaptation in StyleGAN	Muhammad Ali Farooq et.al.	2307.13746v1	null	In this research work, we proposed a novel ChildGAN, a pair of GAN networks for generating synthetic boys and girls facial data derived from StyleGAN2. ChildGAN is built by performing smooth domain transfer using transfer learning. It provides photo-realistic, high-quality data samples. A large-scale dataset is rendered with a variety of smart facial transformations: facial expressions, age progression, eye blink effects, head pose, skin and hair color variations, and variable lighting conditions. The dataset comprises more than 300k distinct data samples. Further, the uniqueness and characteristics of the rendered facial features are validated by running different computer vision application tests which include CNN-based child gender classifier, face localization and facial landmarks detection test, identity similarity evaluation using ArcFace, and lastly running eye detection and eye aspect ratio tests. The results demonstrate that synthetic child facial data of high quality offers an alternative to the cost and complexity of collecting a large-scale dataset from real children.
2023-07-25	Moisture-Driven Mechanisms in Multi-Functional TPU-Based Syntactic Foams	Sabarinathan P Subramaniyan et.al.	2307.13673v1	null	Syntactic foams have found widespread usage in various applications including, marine, aerospace, automotive, pipe insulation, electrical cable sheathing, and shoe insoles. However, syntactic foams are often exposed to moisture when used in these applications that potentially alter their viscoelastic, thermal transport, and dielectric properties, which influences their long-term durability. Despite their significance, previous research has mainly focused on experimental studies concerning mechanical property changes resulting from filler loading and different matrix materials, overlooking the fundamental mechanisms resulting from moisture exposure. The current paper aims to bridge this gap in knowledge by elucidating the impact of long-term moisture exposure on TPU and TPU-based syntactic foam through multi-scale materials characterization approaches. Here, we choose a flexible syntactic foam manufactured using thermoplastic polyurethane elastomer (TPU) reinforced with glass microballoons (GMB) through selective laser sintering. Specifically, the research investigates the influence of moisture exposure time and the volume fraction of GMB on chemical and microphase morphological changes, along with their associated mechanisms. The study further examines how these microphase morphological changes manifest in viscoelastic, thermal transport, and dielectric properties.
2023-07-25	On-Device Speaker Anonymization of Acoustic Embeddings for ASR based on Flexible Location Gradient Reversal Layer	Md Asif Jalal et.al.	2307.13343v1	null	Smart devices serviced by large-scale AI models necessitate user data transfer to the cloud for inference. For speech applications, this means transferring private user information, e.g., speaker identity. Our paper proposes a privacy-enhancing framework that targets speaker identity anonymization while preserving speech recognition accuracy for our downstream task, Automatic Speech Recognition (ASR). The proposed framework attaches flexible gradient reversal based speaker adversarial layers to target layers within an ASR model, where speaker adversarial training anonymizes acoustic embeddings generated by the targeted layers to remove speaker identity. We propose on-device deployment by execution of initial layers of the ASR model, and transmitting anonymized embeddings to the cloud, where the rest of the model is executed while preserving privacy. Experimental results show that our method efficiently reduces speaker recognition relative accuracy by 33%, and improves ASR performance by achieving 6.2% relative Word Error Rate (WER) reduction.
2023-07-24	Evaluating the reliability of automatically generated pedestrian and bicycle crash surrogates	Agnimitra Sengupta et.al.	2307.13178v1	null	Vulnerable road users (VRUs), such as pedestrians and bicyclists, are at a higher risk of being involved in crashes with motor vehicles, and crashes involving VRUs also are more likely to result in severe injuries or fatalities. Signalized intersections are a major safety concern for VRUs due to their complex and dynamic nature, highlighting the need to understand how these road users interact with motor vehicles and deploy evidence-based countermeasures to improve safety performance. Crashes involving VRUs are relatively infrequent, making it difficult to understand the underlying contributing factors. An alternative is to identify and use conflicts between VRUs and motorized vehicles as a surrogate for safety performance. Automatically detecting these conflicts using video-based systems is a crucial step in developing smart infrastructure to enhance VRU safety. The Pennsylvania Department of Transportation conducted a study using a video-based event monitoring system to assess VRU and motor vehicle interactions at fifteen signalized intersections across Pennsylvania to improve VRU safety performance. This research builds on that study to assess the reliability of automatically generated surrogates in predicting confirmed conflicts using advanced data-driven models. The surrogate data used for analysis include automatically collectable variables such as vehicular and VRU speeds, movements, post-encroachment time, in addition to manually collected variables like signal states, lighting, and weather conditions. The findings highlight the varying importance of specific surrogates in predicting true conflicts, some being more informative than others. The findings can assist transportation agencies to collect the right types of data to help prioritize infrastructure investments, such as bike lanes and crosswalks, and evaluate their effectiveness.
2023-07-24	Schema-Driven Actionable Insight Generation and Smart Recommendation	Allmin Susaiyah et.al.	2307.13176v1	null	In natural language generation (NLG), insight mining is seen as a data-to-text task, where data is mined for interesting patterns and verbalised into 'insight' statements. An 'over-generate and rank' paradigm is intuitively used to generate such insights. The multidimensionality and subjectivity of this process make it challenging. This paper introduces a schema-driven method to generate actionable insights from data to drive growth and change. It also introduces a technique to rank the insights to align with user interests based on their feedback. We show preliminary qualitative results of the insights generated using our technique and demonstrate its ability to adapt to feedback.
2023-07-24	Why Don't You Clean Your Glasses? Perception Attacks with Dynamic Optical Perturbations	Yi Han et.al.	2307.13131v1	null	Camera-based autonomous systems that emulate human perception are increasingly being integrated into safety-critical platforms. Consequently, an established body of literature has emerged that explores adversarial attacks targeting the underlying machine learning models. Adapting adversarial attacks to the physical world is desirable for the attacker, as this removes the need to compromise digital systems. However, the real world poses challenges related to the "survivability" of adversarial manipulations given environmental noise in perception pipelines and the dynamicity of autonomous systems. In this paper, we take a sensor-first approach. We present EvilEye, a man-in-the-middle perception attack that leverages transparent displays to generate dynamic physical adversarial examples. EvilEye exploits the camera's optics to induce misclassifications under a variety of illumination conditions. To generate dynamic perturbations, we formalize the projection of a digital attack into the physical domain by modeling the transformation function of the captured image through the optical pipeline. Our extensive experiments show that EvilEye's generated adversarial perturbations are much more robust across varying environmental light conditions relative to existing physical perturbation frameworks, achieving a high attack success rate (ASR) while bypassing state-of-the-art physical adversarial detection frameworks. We demonstrate that the dynamic nature of EvilEye enables attackers to adapt adversarial examples across a variety of objects with a significantly higher ASR compared to state-of-the-art physical world attack frameworks. Finally, we discuss mitigation strategies against the EvilEye attack.
2023-07-24	Exploring Quantum Annealing Architectures: A Spin Glass Perspective	Gabriel Jaumà et.al.	2307.13065v1	null	We study the spin-glass transition in several Ising models of relevance for quantum annealers. We extract the spin-glass critical temperature by extrapolating the pseudo-critical properties obtained with Replica-Exchange Monte-Carlo for finite-size systems. We find a spin-glass phase for some random lattices (random-regular and small-world graphs) in good agreement with previous results. However, our results for the quasi-two-dimensional graphs implemented in the D-Wave annealers (Chimera, Zephyr, and Pegasus) indicate only a zero-temperature spin-glass state, as their pseudo-critical temperature drifts towards smaller values. This implies that the asymptotic runtime to find the low-energy configuration of those graphs is likely to be polynomial in system size. Nevertheless, this scaling may only be reached for very large system sizes -- much larger than existing annealers -- as we observe an abrupt increase in the computational cost of the simulations around the pseudo-critical temperatures. Thus, two-dimensional systems with local crossings can display enough complexity to make the search for low-energy configurations with classical methods unfeasible.
2023-07-24	Wide Field-of-View, Large-Area Long-wave Infrared Silicon Metalenses	Hung-I Lin et.al.	2307.12974v1	null	Long-wave infrared (LWIR, 8-12 $\mu m$ wavelengths) is a spectral band of vital importance to thermal imaging. Conventional LWIR optics made from single-crystalline Ge and chalcogenide glasses are bulky and fragile. The challenge is exacerbated for wide field-of-view (FOV) optics, which traditionally mandates multiple cascaded elements that severely add to complexity and cost. Here we designed and experimentally realized a LWIR metalens platform based on bulk Si wafers featuring 140$^\circ$ FOV. The metalenses, which have diameters exceeding 4 cm, were fabricated using a scalable wafer-level process involving photolithography and deep reactive ion etching. Using a metalens-integrated focal plane array, we further demonstrated wide-angle thermal imaging.
2023-07-24	Economic Analysis of Smart Roadside Infrastructure Sensors for Connected and Automated Mobility	Laurent Kloeker et.al.	2307.12893v1	null	Smart roadside infrastructure sensors in the form of intelligent transportation system stations (ITS-Ss) are increasingly deployed worldwide at relevant traffic nodes. The resulting digital twins of the real environment are suitable for developing and validating connected and automated driving functions and for increasing the operational safety of intelligent vehicles by providing ITS-S real-time data. However, ITS-Ss are very costly to establish and operate. The choice of sensor technology also has an impact on the overall costs as well as on the data quality. So far, there is only insufficient knowledge about the concrete expenses that arise with the construction of different ITS-S setups. Within this work, multiple modular infrastructure sensor setups are investigated with the help of a life cycle cost analysis (LCCA). Their economic efficiency, different user requirements and sensor data qualities are considered. Based on the static cost model, a Monte Carlo simulation is performed, to generate a range of possible project costs and to quantify the financial risks of implementing ITS-S projects of different scales. Due to its modularity, the calculation model is suitable for diverse applications and outputs a distinctive evaluation of the underlying cost-benefit ratio of investigated setups.
2023-07-24	A quantitative theoretical model of the boson peak based on stringlet excitations	Cunyuan Jiang et.al.	2307.12839v1	null	The boson peak (BP), a low-energy excess in the vibrational density of states over the phonon Debye contribution, is usually identified as one of the distinguishing features between ordered crystals and amorphous solid materials. Despite decades of efforts, its microscopic origin still remains a mystery and a consensus on its theoretical derivation has not yet been achieved. Recently, it has been proposed, and corroborated with simulations, that the BP might stem from intrinsic localized modes which involve string-like excitations ("stringlets") having a one-dimensional (1D) nature. In this work, we build on a theoretical framework originally proposed by Lund that describes the localized modes as 1D vibrating strings, but we specify the stringlet size distribution to be exponential, as observed in independent simulation studies. We show that a generalization of this framework provides an analytical prediction for the BP frequency $\omega_{BP}$ in the temperature regime well below the glass transition temperature in both 2D and 3D amorphous systems. The final result involves no free parameters and is in quantitative agreement with prior simulation observations. Additionally, this stringlet theory of the BP naturally reproduces the softening of the BP frequency upon heating and offers an analytical explanation for the experimentally observed scaling with the shear modulus in the glass state and changes in this scaling in cooled liquids. Finally, the theoretical analysis highlights the existence of a strong damping for the stringlet modes at finite temperature which leads to a large low-frequency contribution to the 3D vibrational density of states, as observed in both experiments and simulations.
2023-07-24	Residual stresses couple microscopic and macroscopic scales	Sebastian Steinhäuser et.al.	2307.12764v1	null	We show how residual stresses emerge in a visco-elastic material as a signature of its past flow history, through an interplay between flow-modified microscopic relaxation and macroscopic features of the flow. Long-lasting temporal-history dependence of the microscopic dynamics and nonlinear rheology are incorporated through the mode-coupling theory of the glass transition (MCT). The theory's integral constitutive equation (ICE) is coupled to continuum mechanics in a finite-element method (FEM) scheme that tracks the flow history through the Finger tensor. The method is suitable for a calculation of residual stresses from a "first-principles" starting point following well-understood approximations. As an example, we calculate within a schematic version of MCT the stress-induced optical birefringence pattern of an amorphous solid cast into the shape of a slab with a cylindrical obstacle and demonstrate how FEM-MCT can predict the dependence of material properties on the material's processing history.
2023-07-24	Automated Mapping of Adaptive App GUIs from Phones to TVs	Han Hu et.al.	2307.12522v1	null	With the increasing interconnection of smart devices, users often desire to adopt the same app on quite different devices for identical tasks, such as watching the same movies on both their smartphones and TV. However, the significant differences in screen size, aspect ratio, and interaction styles make it challenging to adapt Graphical User Interfaces (GUIs) across these devices. Although there are millions of apps available on Google Play, only a few thousand are designed to support smart TV displays. Existing techniques to map a mobile app GUI to a TV either adopt a responsive design, which struggles to bridge the substantial gap between phone and TV or use mirror apps for improved video display, which requires hardware support and extra engineering efforts. Instead of developing another app for supporting TVs, we propose a semi-automated approach to generate corresponding adaptive TV GUIs, given the phone GUIs as the input. Based on our empirical study of GUI pairs for TV and phone in existing apps, we synthesize a list of rules for grouping and classifying phone GUIs, converting them to TV GUIs, and generating dynamic TV layouts and source code for the TV display. Our tool is not only beneficial to developers but also to GUI designers, who can further customize the generated GUIs for their TV app development. An evaluation and user study demonstrate the accuracy of our generated GUIs and the usefulness of our tool.
2023-07-24	Evolution of Free Volume Elements in Amorphous Polymers Undergoing Uniaxial Deformation: a Molecular Dynamics Simulation Study	Brendan Wernisch et.al.	2307.12460v1	null	Amorphous polymers are considered promising materials for separations due to their excellent transport properties and low fabrication costs. The separation performance of a membrane material is characterized by its permeability (overall throughput of components), and selectivity (efficiency of separation). Both permeability and selectivity are controlled by the diffusion of different penetrants through the matrix, which is strongly influenced by the distribution and morphology of the free volume elements (FVEs). FVEs are void spaces in the polymer matrix that result from the inefficient packing of bulky and rigid groups on the polymer backbone. Thus, FVEs dictate the efficiency of membrane polymers, and it is imperative to understand how processing conditions such as high pressure influence their structure. In this paper, we apply uniaxial tensile deformation on three polymers, namely polystyrene (PS), polymethylpentene (PMP), and HAB-6FDA thermally rearranged polymer (TRP), at varying temperatures and strain rates. We calculate the stress strain curve, tensile modulus, and free volume element evolution at these conditions. We find that PMP and PS with low and moderate glass transition temperature, respectively, exhibit the most change in mechanical properties as a function of strain rate and temperature. The properties of TRP, however, do not vary as much. We also find that FVEs become larger with deformation, and the extent of this change is in line with the overall change of mechanical properties of the material.
2023-07-23	TransNet: Transparent Object Manipulation Through Category-Level Pose Estimation	Huijie Zhang et.al.	2307.12400v1	null	Transparent objects present multiple distinct challenges to visual perception systems. First, their lack of distinguishing visual features makes transparent objects harder to detect and localize than opaque objects. Even humans find certain transparent surfaces with little specular reflection or refraction, like glass doors, difficult to perceive. A second challenge is that depth sensors typically used for opaque object perception cannot obtain accurate depth measurements on transparent surfaces due to their unique reflective properties. Stemming from these challenges, we observe that transparent object instances within the same category, such as cups, look more similar to each other than to ordinary opaque objects of that same category. Given this observation, the present paper explores the possibility of category-level transparent object pose estimation rather than instance-level pose estimation. We propose TransNet, a two-stage pipeline that estimates category-level transparent object pose using localized depth completion and surface normal estimation. TransNet is evaluated in terms of pose estimation accuracy on a large-scale transparent object dataset and compared to a state-of-the-art category-level pose estimation approach. Results from this comparison demonstrate that TransNet achieves improved pose estimation accuracy on transparent objects. Moreover, we use TransNet to build an autonomous transparent object manipulation system for robotic pick-and-place and pouring tasks.
2023-07-23	An Efficient Authentication Protocol for Smart Grid Communication Based on On-Chip-Error-Correcting Physical Unclonable Function	Masoud Kaveh et.al.	2307.12374v1	null	Security has become a main concern for the smart grid to move from research and development to industry. The concept of security has usually referred to resistance to threats by an active or passive attacker. However, since smart meters (SMs) are often placed in unprotected areas, physical security has become one of the important security goals in the smart grid. Physical unclonable functions (PUFs) have been largely utilized for ensuring physical security in recent years, though their reliability has remained a major problem to be practically used in cryptographic applications. Although fuzzy extractors have been considered as a solution to solve the reliability problem of PUFs, they put a considerable computational cost to the resource-constrained SMs. To that end, we first propose an on-chip-error-correcting (OCEC) PUF that efficiently generates stable digits for the authentication process. Afterward, we introduce a lightweight authentication protocol between the SMs and neighborhood gateway (NG) based on the proposed PUF. The provable security analysis shows that not only the proposed protocol can stand secure in the Canetti-Krawczyk (CK) adversary model but also provides additional security features. Also, the performance evaluation demonstrates the significant improvement of the proposed scheme in comparison with the state-of-the-art.
2023-07-23	Efficient structural color from pigment-loaded nanostructures	Tianqi Sai et.al.	2307.12346v1	null	Color can originate from wavelength-dependence in the absorption of pigments or the scattering of nanostructures. While synthetic colors are dominated by the former, vivid structural colors found in nature have inspired much research on the latter. However, many of the most vibrant colors in nature involve the interactions of structure and pigment. Here, we demonstrate that pigment can be exploited to efficiently create bright structural color at wavelengths outside its absorption band. We created pigment-enhanced Bragg reflectors by sequentially spin-coating layers of poly-vinyl alcohol (PVA) and polystyrene (PS) loaded with $\beta$-carotene (BC). With only 10 double layers, we achieved a peak reflectance over $0.8$ at 550 nm and normal incidence. A pigment-free multilayer made of the same materials would require 25 double layers to achieve the same reflectance. Further, pigment loading suppressed the Bragg reflector's characteristic iridescence. Using numerical simulations, we further show that similar pigment loadings could significantly expand the gamut of non-iridescent colors addressable by photonic glasses.
2023-07-23	Semantic Communication-Empowered Traffic Management using Vehicle Count Prediction	Sachin Kadam et.al.	2307.12254v1	null	Vehicle count prediction is an important aspect of smart city traffic management. Most major roads are monitored by cameras with computing and transmitting capabilities. These cameras provide data to the central traffic controller (CTC), which is in charge of traffic control management. In this paper, we propose a joint CNN-LSTM-based semantic communication (SemCom) model in which the semantic encoder of a camera extracts the relevant semantics from raw images. The encoded semantics are then sent to the CTC by the transmitter in the form of symbols. The semantic decoder of the CTC predicts the vehicle count on each road based on the sequence of received symbols and develops a traffic management strategy accordingly. An optimization problem to improve the quality of experience (QoE) is introduced and numerically solved, taking into account constraints such as vehicle user safety, transmit power of camera devices, vehicle count prediction accuracy, and semantic entropy. Using numerical results, we show that the proposed SemCom model reduces overhead by $54.42\%$ when compared to source encoder/decoder methods. Also, we demonstrate through simulations that the proposed model outperforms state-of-the-art models in terms of mean absolute error (MAE) and QoE.
2023-07-23	Ru doping induced spin frustration and enhancement of the room-temperature anomalous Hall effect in La2/3Sr1/3MnO3 films	Enda Hua et.al.	2307.12253v1	null	In transition-metal-oxide heterostructures, the anomalous Hall effect (AHE) is a powerful tool for detecting the magnetic state and revealing intriguing interfacial magnetic orderings. However, achieving a larger AHE at room temperature in oxide heterostructures is still challenging due to the dilemma of mutually strong spin-orbit coupling and magnetic exchange interactions. Here, we exploit the Ru doping-enhanced AHE in LSMRO epitaxial films. As the B-site Ru doping level increases up to 20 percent, the anomalous Hall resistivity at room temperature can be enhanced from the nΩ·cm to the μΩ·cm scale. Ru doping leads to strong competition between ferromagnetic double-exchange interaction and antiferromagnetic super-exchange interaction. The resultant spin frustration and spin-glass state facilitate a strong skew-scattering process, thus significantly enhancing the extrinsic AHE. Our findings could pave a feasible approach for boosting the controllability and reliability of oxide-based spintronic devices.
2023-07-23	Content Censorship in the InterPlanetary File System	Srivatsan Sridhar et.al.	2307.12212v1	null	The InterPlanetary File System (IPFS) is currently the largest decentralized storage solution in operation, with thousands of active participants and millions of daily content transfers. IPFS is used as remote data storage for numerous blockchain-based smart contracts, Non-Fungible Tokens (NFT), and decentralized applications. We present a content censorship attack that can be executed with minimal effort and cost, and that prevents the retrieval of any chosen content in the IPFS network. The attack exploits a conceptual issue in a core component of IPFS, the Kademlia Distributed Hash Table (DHT), which is used to resolve content IDs to peer addresses. We provide efficient detection and mitigation mechanisms for this vulnerability. Our mechanisms achieve a 99.6\% detection rate and mitigate 100\% of the detected attacks with minimal signaling and computational overhead. We followed responsible disclosure procedures, and our countermeasures are scheduled for deployment in the future versions of IPFS.
2023-07-22	AI on the Road: A Comprehensive Analysis of Traffic Accidents and Accident Detection System in Smart Cities	Victor Adewopo et.al.	2307.12128v1	null	Accident detection and traffic analysis is a critical component of smart city and autonomous transportation systems that can reduce accident frequency, severity and improve overall traffic management. This paper presents a comprehensive analysis of traffic accidents in different regions across the United States using data from the National Highway Traffic Safety Administration (NHTSA) Crash Report Sampling System (CRSS). To address the challenges of accident detection and traffic analysis, this paper proposes a framework that uses traffic surveillance cameras and action recognition systems to detect and respond to traffic accidents spontaneously. Integrating the proposed framework with emergency services will harness the power of traffic cameras and machine learning algorithms to create an efficient solution for responding to traffic accidents and reducing human errors. Advanced intelligence technologies, such as the proposed accident detection systems in smart cities, will improve traffic management and traffic accident severity. Overall, this study provides valuable insights into traffic accidents in the US and presents a practical solution to enhance the safety and efficiency of transportation systems.

heart rate

Publish Date	Title	Authors	PDF	Code	Abstract
2023-07-27	News from the Swampland -- Constraining string theory with astrophysics and cosmology	Nils Schöneberg et.al.	2307.15060v1	null	Our current best guess for a unified theory of gravitation and quantum field theory (string theory) generically predicts a set of requirements for a consistently quantized theory, the Swampland criteria. Refined versions of these criteria have recently been shown to be in mild tension with cosmological observations. We summarize the status of the current impact of and constraints on the Swampland conjectures from cosmology, and subject a variety of dark energy quintessence models to recently released cosmological datasets. We find that instead of tightening the tension, the new data allows for slightly more freedom in the Swampland criteria. We further demonstrate that if there is no theoretical argument made to prevent interactions of the moduli fields with the electromagnetic sector, a novel fine-tuning argument arises from the extremely tight current constraints on such interactions. Finally, we conclude with a cautionary tale on model-independent reconstructions of the Swampland criteria from expansion rate data.
2023-07-27	Probing the large deviations for the Beta random walk in random medium	Alexander K. Hartmann et.al.	2307.15041v1	null	We consider a discrete-time random walk on a one-dimensional lattice with space and time-dependent random jump probabilities, known as the Beta random walk. We are interested in the probability that, for a given realization of the jump probabilities (a sample), a walker starting at the origin at time $t=0$ is at position beyond $\xi \sqrt{T/2}$ at time $T$. This probability fluctuates from sample to sample and we study the large-deviation rate function which characterizes the tails of its distribution at large time $T \gg 1$. It is argued that, up to a simple rescaling, this rate function is identical to the one recently obtained exactly by two of the authors for the continuum version of the model. That continuum model also appears in the macroscopic fluctuation theory of a class of lattice gases, e.g. in the so-called KMP model of heat transfer. An extensive numerical simulation of the Beta random walk, based on an importance sampling algorithm, is found in good agreement with the detailed analytical predictions. A first-order transition in the tilted measure, predicted to occur in the continuum model, is also observed in the numerics.
2023-07-27	Diverse Inpainting and Editing with GAN Inversion	Ahmet Burak Yildirim et.al.	2307.15033v1	null	Recent inversion methods have shown that real images can be inverted into StyleGAN's latent space and numerous edits can be achieved on those images thanks to the semantically rich feature representations of well-trained GAN models. However, extensive research has also shown that image inversion is challenging due to the trade-off between high-fidelity reconstruction and editability. In this paper, we tackle an even more difficult task, inverting erased images into GAN's latent space for realistic inpaintings and editings. Furthermore, by augmenting inverted latent codes with different latent samples, we achieve diverse inpaintings. Specifically, we propose to learn an encoder and mixing network to combine encoded features from erased images with StyleGAN's mapped features from random samples. To encourage the mixing network to utilize both inputs, we train the networks with generated data via a novel set-up. We also utilize higher-rate features to prevent color inconsistencies between the inpainted and unerased parts. We run extensive experiments and compare our method with state-of-the-art inversion and inpainting methods. Qualitative metrics and visual comparisons show significant improvements.
2023-07-27	Revealing the Impact of Beamforming in ISAC	Chongjun Ouyang et.al.	2307.15023v1	null	This letter proposes advanced beamforming design and analyzes its influence on the sensing and communications (S&C) performance for a multiple-antenna integrated S&C (ISAC) system with a single communication user and a single target. Novel closed-form beamformers are derived for three typical scenarios, including the sensing-centric design, communications-centric design, and Pareto optimal design. Regarding each scenario, the outage probability, ergodic communication rate (CR), and sensing rate (SR) are analyzed to derive the diversity orders and high signal-to-noise ratio slopes. Numerical results are provided to demonstrate that i) beamforming design can affect the high-SNR power offset and diversity order but does not influence the high-SNR slope; ii) ISAC exhibits larger high-SNR slopes and a more extensive SR-CR region than conventional frequency-division S&C (FDSAC) techniques.
2023-07-27	SuperCLUE: A Comprehensive Chinese Large Language Model Benchmark	Liang Xu et.al.	2307.15020v1	null	Large language models (LLMs) have shown the potential to be integrated into human daily lives. Therefore, user preference is the most critical criterion for assessing LLMs' performance in real-world scenarios. However, existing benchmarks mainly focus on measuring models' accuracy using multi-choice questions, which limits the understanding of their capabilities in real applications. We fill this gap by proposing a comprehensive Chinese benchmark SuperCLUE, named after another popular Chinese LLM benchmark CLUE. SuperCLUE encompasses three sub-tasks: actual users' queries and ratings derived from an LLM battle platform (CArena), open-ended questions with single and multiple-turn dialogues (OPEN), and closed-ended questions with the same stems as open-ended single-turn ones (CLOSE). Our study shows that accuracy on closed-ended questions is insufficient to reflect human preferences achieved on open-ended ones. At the same time, they can complement each other to predict actual user preferences. We also demonstrate that GPT-4 is a reliable judge to automatically evaluate human preferences on open-ended questions in a Chinese context. Our benchmark will be released at https://www.CLUEbenchmarks.com
2023-07-27	Decomposing and Routing Quantum Circuits Under Constraints for Neutral Atom Architectures	Natalia Nottingham et.al.	2307.14996v1	null	Quantum computing is in an era defined by rapidly evolving quantum hardware technologies, combined with persisting high gate error rates, large amounts of noise, and short coherence times. Overcoming these limitations requires systems-level approaches that account for the strengths and weaknesses of the underlying hardware technology. Yet few hardware-aware compiler techniques exist for neutral atom devices, with no prior work on compiling to the neutral atom native gate set. In particular, current neutral atom hardware does not support certain single-qubit rotations via local addressing, which often requires the circuit to be decomposed into a large number of gates, leading to long circuit durations and low overall fidelities. We propose the first compiler designed to overcome the challenges of limited local addressability in neutral atom quantum computers. We present algorithms to decompose circuits into the neutral atom native gate set, with emphasis on optimizing total pulse area of global gates, which dominate gate execution costs in several current architectures. Furthermore, we explore atom movement as an alternative to expensive gate decompositions, gaining immense speedup with routing, which remains a huge overhead for many quantum circuits. Our decomposition optimizations result in up to ~3.5x and ~2.9x speedup in time spent executing global gates and time spent executing single-qubit gates, respectively. When combined with our atom movement routing algorithms, our compiler achieves up to ~10x reduction in circuit duration, with over ~2x improvement in fidelity. We show that our compiler strategies can be adapted for a variety of hardware-level parameters as neutral atom technology continues to develop.
2023-07-27	Learning cross-layer dependence structure in multilayer networks	Jiaheng Li et.al.	2307.14982v1	null	Multilayer networks are a network data structure in which elements in a population of interest have multiple modes of interaction or relation, represented by multiple networks called layers. We propose a novel class of models for cross-layer dependence in multilayer networks, aiming to learn how interactions in one or more layers may influence interactions in other layers of the multilayer network, by developing a class of network separable models which separate the network formation process from the layer formation process. In our framework, we are able to extend existing single layer network models to a multilayer network model with cross-layer dependence. We establish non-asymptotic bounds on the error of estimators and demonstrate rates of convergence for both maximum likelihood estimators and maximum pseudolikelihood estimators in scenarios of increasing parameter dimension. We additionally establish non-asymptotic error bounds on the multivariate normal approximation and elaborate a method for model selection which controls the false discovery rate. We conduct simulation studies which demonstrate that our framework and method work well in realistic settings which might be encountered in applications. Lastly, we illustrate the utility of our method through an application to the Lazega lawyers network.
2023-07-27	A Stochastic Gradient Tracking Algorithm for Decentralized Optimization With Inexact Communication	Suhail M. Shah et.al.	2307.14942v1	null	Decentralized optimization is typically studied under the assumption of noise-free transmission. However, real-world scenarios often involve the presence of noise due to factors such as additive white Gaussian noise channels or probabilistic quantization of transmitted data. These sources of noise have the potential to degrade the performance of decentralized optimization algorithms if not effectively addressed. In this paper, we focus on the noisy communication setting and propose an algorithm that bridges the performance gap caused by communication noise while also mitigating other challenges like data heterogeneity. We establish theoretical results of the proposed algorithm that quantify the effect of communication noise and gradient noise on the performance of the algorithm. Notably, our algorithm achieves the optimal convergence rate for minimizing strongly convex, smooth functions in the context of inexact communication and stochastic gradients. Finally, we illustrate the superior performance of the proposed algorithm compared to its state-of-the-art counterparts on machine learning problems using MNIST and CIFAR-10 datasets.
2023-07-27	Simulated analogues II: a new methodology for non-parametric matching of models to observations	Rami Al-Belmpeisi et.al.	2307.14924v1	null	Star formation is a multi-scale problem, and only global simulations that account for the connection from the molecular cloud scale gas flow to the accreting protostar can reflect the observed complexity of protostellar systems. Star-forming regions are characterised by supersonic turbulence and as a result, it is not possible to simultaneously design models that account for the larger environment and in detail reproduce observed stellar systems. Instead, the stellar inventories can be matched statistically, and best matches found that approximate specific observations. Observationally, a combination of single-dish telescopes and interferometers is now able to resolve the nearest protostellar objects on all scales from the protostellar core to the inner 10 AU. We present a new non-parametric methodology which uses high-resolution simulations and post-processing methods to match simulations and observations using deep learning. Our goal is to perform a down-selection from large data sets of synthetic images to a ranked list of best-matching candidates with respect to the observation. This is particularly useful for binary and multiple stellar systems that form in turbulent environments. The objective is to accelerate the rate at which we can do such comparisons, remove biases from hand-picking matches, and contribute to identifying the underlying physical processes that drive the creation and evolution of observed protostellar systems.
2023-07-27	On the survivability of a population of gas giant planets on wide orbits	Ethan Carter et.al.	2307.14908v1	null	The existence of giant planets on wide orbits ($\stackrel{>}{_\sim}100$ AU) challenges planet formation theories; the core accretion scenario has difficulty in forming them, whereas the disc instability model forms an overabundance of them that is not seen in observations. We perform $N$-body simulations investigating the effect of close stellar encounters ($\leq 1200$ AU) on systems hosting wide-orbit giant planets and the extent to which such interactions may disrupt the initial wide-orbit planet population. We find that the effect of an interaction on the orbit of a planet is stronger for high-mass, low-velocity perturbers, as expected. We find that due to just a single encounter there is a $\sim 17\%$ chance that the wide-orbit giant planet is liberated in the field, a $\sim 10\%$ chance it is scattered significantly outwards, and a $\sim 6\%$ chance it is significantly scattered inwards. Moreover, there is a $\sim 21\%$ chance that its eccentricity is excited to $e>0.1$, making it more prone to disruption in subsequent encounters. The results strongly suggest that the effect of even a single stellar encounter is significant in disrupting the primordial wide-orbit giant planet population; in reality the effect will be even more prominent, as in a young star-forming region more such interactions are expected to occur. We conclude that the low occurrence rate of wide-orbit planets revealed by observational surveys does not exclude the possibility that such planetary systems are initially abundant, and therefore the disc-instability model may be a plausible scenario for their formation.
2023-07-27	Scaling Session-Based Transformer Recommendations using Optimized Negative Sampling and Loss Functions	Timo Wilm et.al.	2307.14906v1	link	This work introduces TRON, a scalable session-based Transformer Recommender using Optimized Negative-sampling. Motivated by the scalability and performance limitations of prevailing models such as SASRec and GRU4Rec+, TRON integrates top-k negative sampling and listwise loss functions to enhance its recommendation accuracy. Evaluations on relevant large-scale e-commerce datasets show that TRON improves upon the recommendation quality of current methods while maintaining training speeds similar to SASRec. A live A/B test yielded an 18.14% increase in click-through rate over SASRec, highlighting the potential of TRON in practical settings. For further research, we provide access to our source code at https://github.com/otto-de/TRON and an anonymized dataset at https://github.com/otto-de/recsys-dataset.
2023-07-27	Analyzing the Closed-Loop Performance of Detect-And-Avoid Systems	Ítalo Romani de Oliveira et.al.	2307.14894v1	null	Detect-And-Avoid (DAA) algorithms for unmanned air vehicles have industry standards called Minimum Operational Performance Standards (MOPS), establishing criteria to check whether they can ensure safe separation for all plausible operational conditions. However, these MOPS ensure performance for the avoidance maneuvers, which are open-loop, but not for the maneuvers that bring the air vehicles back to their intended courses, closing the control loop of the missions. In this paper, we analyze the closed-loop performance of existing DAA algorithms, by experimenting with large numbers of traffic configurations with 4 aircraft in a delimited airspace. We measure and analyze their rates of loss of separation and timeout events, the latter happening when a chain of maneuvers exceeds the maximum supply of energy in a vehicle. We also analyze the efficiency of the closed-loop logic, expressed as the excess fuel rate, and study the relationship between safety and efficiency in these scenarios. Our results show that the inefficiency caused by DAA algorithms is very significant in dense airspaces and that it offers ample space for improvement. Performing the simulations of the closed-loop mission management logic can be highly time consuming. Despite there being Neural Network approximations of DAA logic that alleviate the computational load, none of those that we found available can properly handle cases with more than one intruder in the ownship surveillance range. Therefore, in an attempt to overcome the high computational cost of analyzing the closed-loop performance of DAA algorithms, we study the correlation between inefficiency and safety of closed-loop scenarios, and some indicators from the corresponding open-loop scenarios, such as angle of deviation and number of deviations per flight time.
2023-07-27	Anisotropic multiband superconductivity in 2M-WS$_{2}$ probed by controlled disorder	Sunil Ghimire et.al.	2307.14891v1	null	The intrinsically superconducting Dirac semimetal 2M-WS$_{2}$ is a promising candidate to realize proximity-induced topological superconductivity in its protected surface states. A precise characterization of the bulk superconducting state is essential for understanding the nature of surface superconductivity in the system. Here, we perform a detailed experimental study of the temperature and nonmagnetic disorder dependence of the London penetration depth $\lambda$, the upper critical field $H$, and the superconducting transition temperature $T_c$ in 2M-WS$_{2}$. We observe a power-law dependence $\lambda(T) - \lambda(0) \propto T^{3}$ at temperatures below $0.35~T_c$, which is remarkably different from the expected exponential attenuation of a fully gapped isotropic $s$-wave superconductor. We then probe the effect of controlled nonmagnetic disorder induced by 2.5 MeV electron irradiation at various doses and find a significant $T_c$ suppression rate. Together with the observed increase of the slope $dH/dT
2023-07-27	Constraints on dark matter and astrophysics from tomographic $γ$-ray cross-correlations	Anya Paopiamsap et.al.	2307.14881v1	null	We study the cross-correlation between maps of the unresolved $\gamma$-ray background constructed from the 12-year data release of the Fermi Large-Area Telescope, and the overdensity of galaxies in the redshift range $z\lesssim0.4$ as measured by the 2MASS Photometric Redshift survey and the WISE-SuperCOSMOS photometric survey. A signal is detected at the $8-9\sigma$ level, which we interpret in terms of both astrophysical $\gamma$-ray sources, and WIMP dark matter decay and annihilation. The sensitivity achieved allows us to characterise the energy and redshift dependence of the signal, and we show that the latter is incompatible with a pure dark matter origin. We thus use our measurement to place an upper bound on the WIMP decay rate and the annihilation cross-section, finding constraints that are competitive with those found in other analyses. Our analysis is based on the extraction of clean model-independent observables that can then be used to constrain arbitrary astrophysical and particle physics models. In this sense we produce measurements of the $\gamma$-ray emissivity as a function of redshift and rest-frame energy $\epsilon$, and of a quantity $F(\epsilon)$ encapsulating all WIMP parameters relevant for dark matter decay or annihilation. We make these measurements, together with a full account of their statistical uncertainties, publicly available.
2023-07-27	One-step nonparametric instrumental regression using smoothing splines	Jad Beyhum et.al.	2307.14867v1	null	We extend nonparametric regression smoothing splines to a context where there is endogeneity and instrumental variables are available. Unlike popular existing estimators, the resulting estimator is one-step and relies on a unique regularization parameter. We derive uniform rates of the convergence for the estimator and its first derivative. We also address the issue of imposing monotonicity in estimation. Simulations confirm the good performances of our estimator compared to two-step procedures. Our method yields economically sensible results when used to estimate Engel curves.
2023-07-27	Modeling Interference for the Coexistence of 6G Networks and Passive Sensing Systems	Paolo Testolina et.al.	2307.14848v1	null	Future wireless networks and sensing systems will benefit from access to large chunks of spectrum above 100 GHz, to achieve terabit-per-second data rates in 6th Generation (6G) cellular systems and improve accuracy and reach of Earth exploration and sensing and radio astronomy applications. These are extremely sensitive to interference from artificial signals, thus the spectrum above 100~GHz features several bands which are protected from active transmissions under current spectrum regulations. To provide more agile access to the spectrum for both services, active and passive users will have to coexist without harming passive sensing operations. In this paper, we provide the first, fundamental analysis of Radio Frequency Interference (RFI) that large-scale terrestrial deployments introduce in different satellite sensing systems now orbiting the Earth. We develop a geometry-based analysis and extend it into a data-driven model which accounts for realistic propagation, building obstruction, ground reflection, for network topology with up to $10^5$ nodes in more than $85$ km$^2$. We show that the presence of harmful RFI depends on several factors, including network load, density and topology, satellite orientation, and building density. The results and methodology provide the foundation for the development of coexistence solutions and spectrum policy towards 6G.
2023-07-27	State preparation by shallow circuits using feed forward	Harry Buhrman et.al.	2307.14840v1	null	In order to achieve fault-tolerant quantum computation, we need to repeat the following sequence of four steps: First, perform 1 or 2 qubit quantum gates (in parallel if possible). Second, do a syndrome measurement on a subset of the qubits. Third, perform a fast classical computation to establish which errors have occurred (if any). Fourth, depending on the errors, we apply a correction step. Then the procedure repeats with the next sequence of gates. In order for these four steps to succeed, we need the error rate of the gates to be below a certain threshold. Unfortunately, the error rates of current quantum hardware are still too high. On the other hand, current quantum hardware platforms are designed with these four steps in mind. In this work we make use of this four-step scheme not to carry out fault-tolerant computations, but to enhance short, constant-depth, quantum circuits that perform 1 qubit gates and nearest-neighbor 2 qubit gates. To explore how this can be useful, we study a computational model which we call Local Alternating Quantum Classical Computations (LAQCC). In this model, qubits are placed in a grid allowing nearest neighbor interactions; the quantum circuits are of constant depth with intermediate measurements; a classical controller can perform log-depth computations on these intermediate measurement outcomes to control future quantum operations. This model fits naturally between quantum algorithms in the NISQ era and full fledged fault-tolerant quantum computation. We show that LAQCC circuits can create long-ranged interactions, which constant-depth quantum circuits cannot achieve, and use it to construct a range of useful multi-qubit gates. With these gates, we create three new state preparation protocols for a uniform superposition over an arbitrary number of states, W-states and Dicke states.
2023-07-27	Experimental validation of particle-in-cell/Monte Carlo collisions simulations in low-pressure neon capacitively coupled plasmas	Chan-Won Park et.al.	2307.14821v1	null	Plasma simulations are powerful tools for understanding fundamental plasma science phenomena and for process optimization in applications. To ensure their quantitative accuracy, they must be validated against experiments. In this work, such an experimental validation is performed for a 1d3v particle-in-cell simulation complemented with the Monte Carlo treatment of collision processes of a capacitively coupled radio frequency plasma driven at 13.56 MHz and operated in neon gas. In a geometrically symmetric reactor the electron density in the discharge center and the spatio-temporal distribution of the electron impact excitation rate from the ground into the Ne 2p$_1$ state are measured by a microwave cutoff probe and phase resolved optical emission spectroscopy, respectively. The measurements are conducted for electrode gaps between 50 mm and 90 mm, neutral gas pressures between 20 mTorr and 50 mTorr, and peak-to-peak values of the driving voltage waveform between 250 V and 650 V. Simulations are performed under identical discharge conditions. In the simulations, various combinations of surface coefficients characterising the interactions of electrons and heavy particles with the anodized aluminium electrode surfaces are adopted. We find that the simulations using a constant effective heavy particle induced secondary electron emission coefficient of 0.3 and a realistic electron-surface interaction model (which considers energy-dependent and material specific elastic and inelastic electron reflection, as well as the emission of true secondary electrons from the surface) yield results which are in good quantitative agreement with the experimental data.
2023-07-27	Fate of homogeneous $Z_2$-symmetric scalar condensates	Wen-Yuan Ai et.al.	2307.14811v1	null	Dark Matter, if represented by a $Z_2$-symmetric scalar field, can manifest as both particles and condensates. In this paper, we study the evolution of an oscillating homogeneous condensate of a $Z_2$-symmetric scalar field in a thermal plasma in an FLRW universe. We focus on the perturbative regime where the oscillation amplitude is sufficiently small so that parametric resonance is inefficient. This perturbative regime necessarily comprises the late stage of the condensate decay and determines its fate. The coupled coarse-grained equations of motion for the condensate, radiation, and spacetime are derived from first principles using nonequilibrium quantum field theory. We obtain analytical expressions for the relevant microscopic quantities that enter the equations of motion and solve the latter numerically. We find that there is always a nonvanishing relic abundance for a $Z_2$-symmetric condensate because its decay rate decreases faster than the Hubble parameter at late times due to either the amplitude-dependence or the temperature-dependence in the condensate decay rate. Consequently, accounting for the condensate contribution to the overall Dark Matter relic density is essential for $Z_2$ scalar singlet Dark Matter. Unlike normal thermal freeze-out for particles, the condensate relic density depends on the initial condition which we take as arbitrary in the present work provided that it falls within the perturbative regime.
2023-07-27	Automatic Parallelization of Software Network Functions	Francisco Pereira et.al.	2307.14791v1	null	Software network functions (NFs) trade off flexibility and ease of deployment for an increased challenge of performance. The traditional way to increase NF performance is by distributing traffic to multiple CPU cores, but this poses a significant challenge: how to parallelize an NF without breaking its semantics? We propose Maestro, a tool that analyzes a sequential implementation of an NF and automatically generates an enhanced parallel version that carefully configures the NIC's Receive Side Scaling mechanism to distribute traffic across cores, while preserving semantics. When possible, Maestro orchestrates a shared-nothing architecture, with each core operating independently without shared memory coordination, maximizing performance. Otherwise, Maestro choreographs a fine-grained read-write locking mechanism that optimizes operation for typical Internet traffic. We parallelized 8 software NFs and show that they generally scale up linearly until bottlenecked by PCIe when using small packets or by 100Gbps line-rate with typical Internet traffic. Maestro further outperforms modern hardware-based transactional memory mechanisms, even for challenging parallel-unfriendly workloads.
2023-07-27	A Variance-Reduced Aggregation Based Gradient Tracking method for Distributed Optimization over Directed Networks	Shengchao Zhao et.al.	2307.14776v1	null	This paper studies the distributed optimization problem over directed networks with noisy information-sharing. To resolve the imperfect communication issue over directed networks, a series of noise-robust variants of the Push-Pull/AB method have been developed. These methods improve the robustness of the Push-Pull method against the information-sharing noise by adding small factors on weight matrices and replacing the global gradient tracking with the cumulative gradient tracking. Based on the two techniques, we propose a new variant of the Push-Pull method by presenting a novel mechanism of inter-agent information aggregation, named variance-reduced aggregation (VRA). VRA helps us to relax some conditions on the objective function and networks. When the objective function is convex and the sharing-information noise is variance-unbounded, it can be shown that the proposed method converges to the optimal solution almost surely. When the objective function is strongly convex and the sharing-information noise is variance-bounded, the proposed method achieves the convergence rate of $\mathcal{O}\left(k^{-(1-\epsilon)}\right)$ in the mean square sense, where $\epsilon$ can be arbitrarily close to 0. Simulated experiments on ridge regression problems verify the effectiveness of the proposed method.
2023-07-27	Seasonal Variations of the Atmospheric Neutrino Flux measured in IceCube	Karolin Hymon et.al.	2307.14724v1	null	The IceCube Neutrino Observatory measures high energy atmospheric neutrinos with high statistics. These atmospheric neutrinos are produced in cosmic ray interactions in the atmosphere, mainly by the decay of pions and kaons. The rate of the measured neutrinos is affected by seasonal temperature variations in the stratosphere, which are expected to increase with the energy of the particle. In this contribution, seasonal energy spectra are obtained using a novel spectrum unfolding approach, the Dortmund Spectrum Estimation Algorithm (DSEA+), in which the energy distribution from 125 GeV to 10 TeV is estimated from measured quantities with machine learning algorithms. The seasonal spectral difference to the annual average flux will be discussed based on preliminary results from IceCube's atmospheric muon neutrino data.
2023-07-27	Unified Adversarial Patch for Visible-Infrared Cross-modal Attacks in the Physical World	Xingxing Wei et.al.	2307.14682v1	null	Physical adversarial attacks have put a severe threat to DNN-based object detectors. To enhance security, a combination of visible and infrared sensors is deployed in various scenarios, which has proven effective in disabling existing single-modal physical attacks. To further demonstrate the potential risks in such cases, we design a unified adversarial patch that can perform cross-modal physical attacks, achieving evasion in both modalities simultaneously with a single patch. Given the different imaging mechanisms of visible and infrared sensors, our work manipulates patches' shape features, which can be captured in different modalities when they undergo changes. To deal with challenges, we propose a novel boundary-limited shape optimization approach that aims to achieve compact and smooth shapes for the adversarial patch, making it easy to implement in the physical world. And a score-aware iterative evaluation method is also introduced to balance the fooling degree between visible and infrared detectors during optimization, which guides the adversarial patch to iteratively reduce the predicted scores of the multi-modal sensors. Furthermore, we propose an Affine-Transformation-based enhancement strategy that makes the learnable shape robust to various angles, thus mitigating the issue of shape deformation caused by different shooting angles in the real world. Our method is evaluated against several state-of-the-art object detectors, achieving an Attack Success Rate (ASR) of over 80%. We also demonstrate the effectiveness of our approach in physical-world scenarios under various settings, including different angles, distances, postures, and scenes for both visible and infrared sensors.
2023-07-27	Single Photon Superradiance and Subradiance as Collective Emission From Symmetric and Antisymmetric States	Nicola Piovella et.al.	2307.14667v1	null	Recent works have shown that collective single photon spontaneous emission from an ensemble of $N$ resonant two-level atoms is a rich field of study. Superradiance describes emission from a completely symmetric state of $N$ atoms, with a single excited atom prepared with a given phase, for instance imprinted by an external laser. Instead, subradiance is associated with the emission from the remaining $N-1$ asymmetric states, with a collective decay rate less than the single-atom value. Here, we discuss the properties of the orthonormal basis of symmetric and asymmetric states and the entanglement properties of superradiant and subradiant states.
2023-07-27	Speed Limits for Deep Learning	Inbar Seroussi et.al.	2307.14653v1	null	State-of-the-art neural networks require extreme computational power to train. It is therefore natural to wonder whether they are optimally trained. Here we apply a recent advancement in stochastic thermodynamics which allows bounding the speed at which one can go from the initial weight distribution to the final distribution of the fully trained network, based on the ratio of their Wasserstein-2 distance and the entropy production rate of the dynamical process connecting them. Considering both gradient-flow and Langevin training dynamics, we provide analytical expressions for these speed limits for linear and linearizable neural networks e.g. Neural Tangent Kernel (NTK). Remarkably, given some plausible scaling assumptions on the NTK spectra and spectral decomposition of the labels -- learning is optimal in a scaling sense. Our results are consistent with small-scale experiments with Convolutional Neural Networks (CNNs) and Fully Connected Neural networks (FCNs) on CIFAR-10, showing a short highly non-optimal regime followed by a longer optimal regime.
2023-07-27	Linear Convergence of Black-Box Variational Inference: Should We Stick the Landing?	Kyurae Kim et.al.	2307.14642v1	null	We prove that black-box variational inference (BBVI) with control variates, particularly the sticking-the-landing (STL) estimator, converges at a geometric (traditionally called "linear") rate under perfect variational family specification. In particular, we prove a quadratic bound on the gradient variance of the STL estimator, one which encompasses misspecified variational families. Combined with previous works on the quadratic variance condition, this directly implies convergence of BBVI with the use of projected stochastic gradient descent. We also improve existing analysis on the regular closed-form entropy gradient estimators, which enables comparison against the STL estimator and provides explicit non-asymptotic complexity guarantees for both.
2023-07-27	Final results of Borexino on CNO solar neutrinos	D. Basilico et.al.	2307.14636v1	null	We report the first measurement of CNO solar neutrinos by Borexino that uses the Correlated Integrated Directionality (CID) method, exploiting the sub-dominant Cherenkov light in the liquid scintillator detector. The directional information of the solar origin of the neutrinos is preserved by the fast Cherenkov photons from the neutrino scattered electrons, and is used to discriminate between signal and background. The directional information is independent from the spectral information on which the previous CNO solar neutrino measurements by Borexino were based. While the CNO spectral analysis could only be applied on the Phase-III dataset, the directional analysis can use the complete Borexino data taking period from 2007 to 2021. The absence of CNO neutrinos has been rejected with $>5\sigma$ credible level using the Bayesian statistics. The directional CNO measurement is obtained without an external constraint on the $^{210}$Bi contamination of the liquid scintillator, which was applied in the spectral analysis approach. The final and the most precise CNO measurement of Borexino is then obtained by combining the new CID-based CNO result with an improved spectral fit of the Phase-III dataset. Including the statistical and the systematic errors, the extracted CNO interaction rate is $R(\mathrm{CNO})=6.7^{+1.2}_{-0.8} \, \mathrm{cpd/100 \, tonnes}$. Taking into account the neutrino flavor conversion, the resulting CNO neutrino flux at Earth is $\Phi_\mathrm{CNO}=6.7 ^{+1.2}_{-0.8} \times 10^8 \, \mathrm{cm^{-2} s^{-1}}$, in agreement with the high metallicity Standard Solar Models. The results described in this work reinforce the role of the event directional information in large-scale liquid scintillator detectors and open up new avenues for the next-generation liquid scintillator or hybrid neutrino experiments.
2023-07-27	BubbleML: A Multi-Physics Dataset and Benchmarks for Machine Learning	Sheikh Md Shakeel Hassan et.al.	2307.14623v1	link	In the field of phase change phenomena, the lack of accessible and diverse datasets suitable for machine learning (ML) training poses a significant challenge. Existing experimental datasets are often restricted, with limited availability and sparse ground truth data, impeding our understanding of these complex multi-physics phenomena. To bridge this gap, we present the BubbleML Dataset (https://github.com/HPCForge/BubbleML) which leverages physics-driven simulations to provide accurate ground truth information for various boiling scenarios, encompassing nucleate pool boiling, flow boiling, and sub-cooled boiling. This extensive dataset covers a wide range of parameters, including varying gravity conditions, flow rates, sub-cooling levels, and wall superheat, comprising 51 simulations. BubbleML is validated against experimental observations and trends, establishing it as an invaluable resource for ML research. Furthermore, we showcase its potential to facilitate exploration of diverse downstream tasks by introducing two benchmarks: (a) optical flow analysis to capture bubble dynamics, and (b) operator networks for learning temperature dynamics. The BubbleML dataset and its benchmarks serve as a catalyst for advancements in ML-driven research on multi-physics phase change phenomena, enabling the development and comparison of state-of-the-art techniques and models.
2023-07-27	Nonlinear Convex Optimization: From Relaxed Proximal Point Algorithm to Prediction Correction Method	Sai Wang et.al.	2307.14615v1	null	Nonlinear convex problems arise in various areas of applied mathematics and engineering. Classical techniques such as the relaxed proximal point algorithm (PPA) and the prediction correction (PC) method were proposed for linearly constrained convex problems. However, these methods have not been investigated for nonlinear constraints. In this paper, we customize the varying proximal matrix to develop the relaxed PPA for nonlinear convex problems. We also extend the PC method to nonlinear convex problems. As both methods are an extension of the PPA-based contraction method, their sequence convergence can be directly established. Moreover, we theoretically demonstrate that both methods can achieve a convergence rate of $O(1/t)$. Numerical results once again support the theoretical analysis.
2023-07-27	$\mathrm{Sr}_{4}\mathrm{Al}_{2}\mathrm{O}_{7}$: A New Sacrificial Layer with High Water Dissolution Rate for the Synthesis of Freestanding Oxide Membranes	Leyan Nian et.al.	2307.14584v1	null	Freestanding perovskite oxide membranes have drawn great attention recently since they offer exceptional structural tunability and stacking ability, providing new opportunities in fundamental research and potential device applications in silicon-based semiconductor technology. Among different types of sacrificial layers, the $\mathrm{(Ca, Sr, Ba)}_{3}\mathrm{Al}_{2}\mathrm{O}_{6}$ compounds are most widely used since they can be dissolved in water and prepare high-quality perovskite oxide membranes with clean and sharp surfaces and interfaces. However, the typical transfer process takes a long time (up to hours) in obtaining millimeter-size freestanding membranes, let alone realize wafer-scale samples with high yield. Here, we introduce a new member of the $\mathrm{SrO}$-$\mathrm{Al}_{2}\mathrm{O}_{3}$ family, $\mathrm{Sr}_{4}\mathrm{Al}_{2}\mathrm{O}_{7}$, and demonstrate its high dissolution rate, about 10 times higher than that of $\mathrm{Sr}_{3}\mathrm{Al}_{2}\mathrm{O}_{6}$. The high dissolution rate of $\mathrm{Sr}_{4}\mathrm{Al}_{2}\mathrm{O}_{7}$ is most likely related to the more discrete Al-O networks and higher concentration of water-soluble Sr-O species in this compound. Our work significantly facilitates the preparation of freestanding membranes and sheds light on the integration of multifunctional perovskite oxides in practical electronic devices.

smart ring

Publish Date	Title	Authors	PDF	Code	Abstract
2023-07-27	Cavity-Mediated Molecular Entanglement and Generation of Non-Classical States of Light	Davis M. Welakuh et.al.	2307.15047v1	null	The generation and control of entanglement in a quantum mechanical system is a critical element of nearly all quantum applications. Molecular systems are a promising candidate, with numerous degrees of freedom able to be targeted. However, knowledge of inter-system entanglement mechanisms in such systems is limited. In this work, we demonstrate the generation of entanglement between vibrational degrees of freedom in molecules via strong coupling to a cavity mode driven by a weak coherent field. In a bi-molecular system, we show entanglement can not only be generated between the cavity and molecular system, but also between molecules. This process also results in the generation of non-classical states of light, providing potential pathways for harnessing entanglement in molecular systems.
2023-07-27	Demazure operators for double cosets	Ben Elias et.al.	2307.15021v1	null	For any Coxeter system, and any double coset for two standard parabolic subgroups, we introduce a Demazure operator. These operators form a basis for morphism spaces in a category we call the nilCoxeter category, and we also present this category by generators and relations. We prove a generalization to this context of Demazure's celebrated theorem on Frobenius extensions. This generalized theorem serves as a criterion for ensuring the proper behavior of singular Soergel bimodules.
2023-07-27	Rationality for arbitrary closure operations and the test ideal of full extended plus closure	Zhan Jiang et.al.	2307.14958v1	null	We extend the notion of F-rationality to other closure operations, inspired by the work of Smith, Epstein and Schwede, and Ma and Schwede, which describe F-rationality in terms of the canonical module and top local cohomology module. We give conditions for a closure operation cl on a Cohen-Macaulay complete local ring under which cl-rationality is equivalent to parameter ideals being cl-closed. We also demonstrate that full extended plus closure as defined by Heitmann and weak full extended plus closure as defined by the first named author have no big test elements.
2023-07-27	On integral decomposition of unipotent elements in integral group rings	Geoffrey Janssens et.al.	2307.14820v1	null	Jespers and Sun conjectured that if a finite group $G$ has the property ND, i.e. for any nilpotent element $n$ in the integral group ring $\mathbb{Z}G$ and any primitive central idempotent $e \in \mathbb{Q}G$ one still has $ne \in \mathbb{Z}G$, then at most one of the simple components of the group algebra $\mathbb{Q} G$ has reduced degree bigger than $1$. With the exception of one very special series of groups we are able to answer their conjecture, showing that it is true - up to exactly one exception. To do so we first describe groups with the so-called SN property which was introduced by Liu and Passman in their investigation of the Multiplicative Jordan Decomposition for integral group rings. We then study further objects connected to the property ND. This concerns on one hand a certain section of the unit group of $\mathbb{Z}G$ which measures how far $G$ is from having ND and about which Jespers and Sun posed two further questions which we answer. On the other hand we introduce two properties which appeared naturally in these investigations: one is purely representation-theoretic, while the other can be regarded as indicating that it might be hard to decide ND. Among others we show these two notions are equivalent for groups with SN.
2023-07-27	Smart Contract Migration: Security Analysis and Recommendations from Ethereum to Arbitrum	Xueyan Tang et.al.	2307.14773v1	null	This research aims to explore the security risks posed by compatibility and protocol differences in smart contract migration, using the migration of smart contracts from Ethereum to Arbitrum as a case study. Through literature review, online data collection, expert participation, and analysis of smart contract vulnerability cases, this paper conducts an in-depth research of the differences between Ethereum and Arbitrum in areas such as Messaging, Block Properties, Contract Address Alias, and Gas Fees. The research findings indicate the presence of certain security issues during the migration process from Ethereum to Arbitrum, such as abnormal operation of the sequencer resulting in outdated off-chain data retrieval, time-based logical errors, failed permission checks, DOS attacks, and gas loss due to L1-to-L2 transaction failures. To address these security issues, this paper proposes corresponding solutions and recommendations to ensure the security and meet the requirements of the migration process. Additionally, this research emphasizes the continued attention and support for the security issues of smart contract migration through the case of smart contract migration from Ethereum to Arbitrum. It is worth noting that this research is the first in-depth research of smart contract security migration from Ethereum to Arbitrum.
2023-07-27	Smooth modules over the N=1 Bondi-Metzner-Sachs superalgebra	Dong Liu et.al.	2307.14608v1	null	In this paper, we present a determinant formula for the contravariant form on Verma modules over the N=1 Bondi-Metzner-Sachs (BMS) superalgebra. This formula establishes a necessary and sufficient condition for the irreducibility of the Verma modules. We then introduce and characterize a class of simple smooth modules that generalize both Verma and Whittaker modules over the N=1 BMS superalgebra. We also utilize the Heisenberg-Clifford vertex superalgebra to construct a free field realization for the N=1 BMS superalgebra. This free field realization allows us to obtain a family of natural smooth modules over the N=1 BMS superalgebra, which includes Fock modules and certain Whittaker modules.
2023-07-27	Accelerating Polynomial Modular Multiplication with Crossbar-Based Compute-in-Memory	Mengyuan Li et.al.	2307.14557v1	null	Lattice-based cryptographic algorithms built on ring learning with error theory are gaining importance due to their potential for providing post-quantum security. However, these algorithms involve complex polynomial operations, such as polynomial modular multiplication (PMM), which is the most time-consuming part of these algorithms. Accelerating PMM is crucial to make lattice-based cryptographic algorithms widely adopted by more applications. This work introduces a novel high-throughput and compact PMM accelerator, X-Poly, based on the crossbar (XB)-type compute-in-memory (CIM). We identify the most appropriate PMM algorithm for XB-CIM. We then propose a novel bit-mapping technique to reduce the area and energy of the XB-CIM fabric, and conduct processing engine (PE)-level optimization to increase memory utilization and support different problem sizes with a fixed number of XB arrays. X-Poly design achieves 3.1X10^6 PMM operations/s throughput and offers 200X latency improvement compared to the CPU-based implementation. It also achieves 3.9X throughput per area improvement compared with the state-of-the-art CIM accelerators.
2023-07-26	On the dimension of cofinite modules	Majid Rahro Zargar et.al.	2307.14513v1	null	Let $I$ be an ideal of a commutative Noetherian complete local ring $R$. In the present paper, we establish the equality $\dim R/(I+\Ann_R M)=\dim M$ for all $I$-cofinite $R$-modules $M$.
2023-07-26	Characterizing Rickart and Baer ultragraph Leavitt path algebras	Mitchell Jubeir et.al.	2307.14431v1	null	We characterize ultragraph Leavitt path algebras that are Rickart, locally Rickart, graded Rickart, and graded Rickart $*$-rings. We also characterize ultragraph Leavitt path algebras that are Baer, locally Baer, graded Baer, Baer $*$-rings, and combinations of these. These characterizations build on and generalize the work of Hazrat and Vas on Leavitt path algebras over fields to ultragraph Leavitt path algebras over semi-simple commutative unital rings.
2023-07-26	Commuting Line Defects At $q^N=1$	Davide Gaiotto et.al.	2307.14429v1	null	We explain the physical origin of a curious property of algebras $\mathcal{A}_\mathfrak{q}$ which encode the rotation-equivariant fusion ring of half-BPS line defects in four-dimensional $\mathcal{N}=2$ supersymmetric quantum field theories. These algebras are a quantization of the algebras of holomorphic functions on the three-dimensional Coulomb branch of the SQFTs, with deformation parameter $\log \mathfrak{q}$. They are known to acquire a large center, canonically isomorphic to the undeformed algebra, whenever $\mathfrak{q}$ is a root of unity. We give a physical explanation of this fact. We also generalize the construction to characterize the action of this center in the $\mathcal{A}_\mathfrak{q}$-modules associated to three-dimensional $\mathcal{N}=2$ boundary conditions. Finally, we use dualities to relate this construction to a construction in the Kapustin-Witten twist of four-dimensional $\mathcal{N}=4$ gauge theory. These considerations give simple physical explanations of certain properties of quantized skein algebras and cluster varieties, and quantum groups, when the deformation parameter is a root of unity.
2023-07-26	Event-based Vision for Early Prediction of Manipulation Actions	Daniel Deniz et.al.	2307.14332v1	null	Neuromorphic visual sensors are artificial retinas that output sequences of asynchronous events when brightness changes occur in the scene. These sensors offer many advantages including very high temporal resolution, no motion blur and smart data compression ideal for real-time processing. In this study, we introduce an event-based dataset on fine-grained manipulation actions and perform an experimental study on the use of transformers for action prediction with events. There is enormous interest in the fields of cognitive robotics and human-robot interaction on understanding and predicting human actions as early as possible. Early prediction allows anticipating complex stages for planning, enabling effective and real-time interaction. Our Transformer network uses events to predict manipulation actions as they occur, using online inference. The model succeeds at predicting actions early on, building up confidence over time and achieving state-of-the-art classification. Moreover, the attention-based transformer architecture allows us to study the role of the spatio-temporal patterns selected by the model. Our experiments show that the Transformer network captures action dynamic features outperforming video-based approaches and succeeding with scenarios where the differences between actions lie in very subtle cues. Finally, we release the new event dataset, which is the first in the literature for manipulation action recognition. Code will be available at https://github.com/DaniDeniz/EventVisionTransformer.
2023-07-26	Simulation of Open Quantum Systems via Low-Depth Convex Unitary Evolutions	Joseph Peetz et.al.	2307.14325v2	null	Simulating physical systems on quantum devices is one of the most promising applications of quantum technology. Current quantum approaches to simulating open quantum systems are still practically challenging on NISQ-era devices, because they typically require ancilla qubits and extensive controlled sequences. In this work, we propose a hybrid quantum-classical approach for simulating a class of open system dynamics called random-unitary channels. These channels naturally decompose into a series of convex unitary evolutions, which can then be efficiently sampled and run as independent circuits. The method does not require deep ancilla frameworks and thus can be implemented with lower noise costs. We implement simulations of open quantum systems up to dozens of qubits and with large channel rank.
2023-07-26	Large-scale Fully-Unsupervised Re-Identification	Gabriel Bertocco et.al.	2307.14278v1	null	Fully-unsupervised Person and Vehicle Re-Identification have received increasing attention due to their broad applicability in surveillance, forensics, event understanding, and smart cities, without requiring any manual annotation. However, most of the prior art has been evaluated in datasets that have just a couple thousand samples. Such small-data setups often allow the use of costly techniques in time and memory footprints, such as Re-Ranking, to improve clustering results. Moreover, some previous work even pre-selects the best clustering hyper-parameters for each dataset, which is unrealistic in a large-scale fully-unsupervised scenario. In this context, this work tackles a more realistic scenario and proposes two strategies to learn from large-scale unlabeled data. The first strategy performs a local neighborhood sampling to reduce the dataset size in each iteration without violating neighborhood relationships. A second strategy leverages a novel Re-Ranking technique, which has a lower time upper bound complexity and reduces the memory complexity from O(n^2) to O(kn) with k << n. To avoid the pre-selection of specific hyper-parameter values for the clustering algorithm, we also present a novel scheduling algorithm that adjusts the density parameter during training, to leverage the diversity of samples and keep the learning robust to noisy labeling. Finally, due to the complementary knowledge learned by different models, we also introduce a co-training strategy that relies upon the permutation of predicted pseudo-labels, among the backbones, with no need for any hyper-parameters or weighting optimization. The proposed methodology outperforms the state-of-the-art methods in well-known benchmarks and in the challenging large-scale Veri-Wild dataset, with a faster and memory-efficient Re-Ranking strategy, and a large-scale, noisy-robust, and ensemble-based learning approach.
2023-07-26	Distributions of the Density and Kinetic Temperature of the Molecular Gas in the Central Region of NGC 613 using Hierarchical Bayesian Inference	Hiroyuki Kaneko et.al.	2307.14092v1	null	We present position-position-velocity (PPV) cubes of the physical and chemical properties of the molecular medium in the central 1.2 kpc region of the active galaxy NGC 613 at a PPV resolution of 0.$^{\prime\prime}$8$\times$0.$^{\prime\prime}$8$\times$10 km s$^{-1}$ (0.$^{\prime\prime}$8 = $\sim$68 pc). We used eight molecular lines obtained with ALMA. Non-LTE calculation with hierarchical Bayesian inference was used to construct PPV cubes of the gas kinetic temperature ($T_\mathrm{kin}$), molecular hydrogen volume density ($n_\mathrm{H_2}$), column densities ($N_\mathrm{H_2}$), and fractional abundances of four molecules ($^{12}$C$^{18}$O, HCN, HCO$^+$, and CS). The derived $n_\mathrm{H_2}$, $N_\mathrm{H_2}$, and $T_\mathrm{kin}$ ranged 10$^{3.21-3.85}$ cm$^{-3}$, 10$^{20.8-22.1}$ cm$^{-2}$, and 10$^{2.33-2.64}$ K, respectively. Our first application of the non-LTE method with the hierarchical Bayesian inference to external galaxies yielded compatible results compared with the previous studies of this galaxy, demonstrating the efficacy of this method for application to other galaxies. We examined the correlation between gas surface density $\Sigma_\mathrm{H_2}$ (converted from $N_\mathrm{H_2}$) and the star formation rate $\Sigma_\mathrm{SFR}$ obtained from the 110 GHz continuum flux map and found two distinct sequences in the $\Sigma_\mathrm{H_2}$-$\Sigma_\mathrm{SFR}$ diagram; the southwestern subregion of the star-forming ring exhibited a $\sim$0.5 dex higher star formation efficiency (SFE; $\Sigma_\mathrm{SFR}/\Sigma_\mathrm{H_2}$) than the eastern subregion. However, they exhibited no systematic difference in $n_\mathrm{H_2}$, which is often argued as a driver of SFE variation. We suggest that the deficiency of molecular gas in the southwestern subregion, where no significant gas supply is evident along the offset ridges in the bar, is responsible for the elevated SFE.
2023-07-26	The cycle class of the supersingular locus of principally polarized abelian varieties	Gerard van der Geer et.al.	2307.14393v1	null	We prove a formula for the cycle class of the supersingular locus in the Chow ring with rational coefficients of the moduli space of principally polarized abelian varieties in characteristic $p$. This formula determines this class as a monomial in the Chern classes of the Hodge bundle up to a factor that is a polynomial in $p$. This factor is known for $g\leq 3$. We determine the factor for $g=4$.
2023-07-26	Topology of light rings for extremal and non-extremal Kerr-Newman Taub-NUT black holes without $\mathbb{Z}_2$ symmetry	Shan-Ping Wu et.al.	2307.14003v1	null	Understanding the light ring, a kind of fundamental orbit, provides novel insight into astronomical phenomena such as the ringdown of binary mergers and the shadows of black holes. Recently, the topological approach has preliminarily demonstrated its potential advantages for studying the properties of light rings. However, black holes without $\mathbb{Z}_2$ symmetry and extremal spinning black holes remain to be tested. In this paper, we address these two issues. Due to the NUT charge, the Kerr-Newman Taub-NUT solution has no $\mathbb{Z}_2$ symmetry. By constructing the corresponding topology for the non-extremal spinning black holes, we find that the topological number remains unchanged. This indicates that $\mathbb{Z}_2$ symmetry has no influence on the topological number, while it indeed affects the locations of the light rings and shifts them off the equatorial plane. For the extremal spinning black holes, we find that the topology depends critically on the leading term of the vector's radial component at the zero point of its angular component on the black hole horizon. The findings show that there exists a topological phase transition, where the topological number changes, for the prograde light rings, while no phase transition occurs for the retrograde light rings. Our study uncovers some universal topological properties for the extremal and non-extremal spinning black holes with or without $\mathbb{Z}_2$ symmetry. It also sheds light on understanding the light rings in a more general black hole background.
2023-07-26	Schubert calculus in Lie groups	Haibao Duan et.al.	2307.13904v1	null	Let $G$ be a simple Lie group with a maximal torus $T$. Combining Schubert calculus in the flag manifold $G/T$ with the Serre spectral sequence of the fibration $G\rightarrow G/T$, we construct the integral cohomology ring $H^{\ast}(G)$ uniformly for all $G$.
2023-07-25	The Core of Bayesian Persuasion	Laura Doval et.al.	2307.13849v1	null	An analyst observes the frequency with which an agent takes actions, but not the frequency with which she takes actions conditional on a payoff relevant state. In this setting, we ask when the analyst can rationalize the agent's choices as the outcome of the agent learning something about the state before taking action. Our characterization marries the obedience approach in information design (Bergemann and Morris, 2016) and the belief approach in Bayesian persuasion (Kamenica and Gentzkow, 2011) relying on a theorem by Strassen (1965) and Hall's marriage theorem. We apply our results to ring-network games and to identify conditions under which a data set is consistent with a public information structure in first-order Bayesian persuasion games.
2023-07-25	The local bisection hypothesis for twisted groupoid C*-algebras	Becky Armstrong et.al.	2307.13814v1	null	In this note, we present criteria that are equivalent to a locally compact Hausdorff groupoid $G$ being effective. One of these conditions is that $G$ satisfies the "C*-algebraic local bisection hypothesis"; that is, that every normaliser in the reduced twisted groupoid C*-algebra is supported on an open bisection. The semigroup of normalisers plays a fundamental role in our proof, as does the semigroup of normalisers in cyclic group C*-algebras.
2023-07-25	Integration of Digital Twin and Federated Learning for Securing Vehicular Internet of Things	Deepti Gupta et.al.	2307.13794v1	null	In the present era of advanced technology, the Internet of Things (IoT) plays a crucial role in enabling smart connected environments. This includes various domains such as smart homes, smart healthcare, smart cities, smart vehicles, and many others. With ubiquitous smart connected devices and systems, a large amount of data associated with them is at prime risk from malicious entities (e.g., users, devices, applications) in these systems. Innovative technologies, including cloud computing, Machine Learning (ML), and data analytics, support the development of anomaly detection models for the Vehicular Internet of Things (V-IoT), which encompasses collaborative automatic driving and enhanced transportation systems. However, traditional centralized anomaly detection models fail to provide better services for connected vehicles due to issues such as high latency, privacy leakage, performance overhead, and model drift. Recently, Federated Learning (FL) has gained significant recognition for its ability to address data privacy concerns in the IoT domain. Digital Twin (DT) proves beneficial in addressing uncertain crises and data security issues by creating a virtual replica that simulates various factors, including traffic trajectories, city policies, and vehicle utilization. However, the effectiveness of a V-IoT DT system heavily relies on the collection of long-term and high-quality data to make appropriate decisions. This paper introduces a Hierarchical Federated Learning (HFL) based anomaly detection model for V-IoT, aiming to enhance the accuracy of the model. Our proposed model integrates both DT and HFL approaches to create a comprehensive system for detecting malicious activities using an anomaly detection model. Additionally, real-world V-IoT use case scenarios are presented to demonstrate the application of the proposed model.
2023-07-25	Keck Near-Infrared Detections of Mab and Perdita	Edward M. Molter et.al.	2307.13773v1	null	We report the first near-infrared detection of Uranus's tiny moon Mab, the presumed source of the blue and diffuse $\mu$ ring, using the NIRC2 instrument at Keck Observatory. The detection was permitted by an updated shift-and-stack procedure allowing us to integrate on Mab as it moved across the detector in 23 separate exposures taken over $\sim$2 hours, as well as the very low (0.02$^{\circ}$) phase angle at the time of observation. At this phase angle, Mab has an integrated I/F of 24 $\pm$ 3 km$^2$ at 1.6 $\mu$m and $\lesssim$37 km$^2$ at 2.1 $\mu$m. Comparing these values with Mab's visible reflectance as derived by HST reveals that Mab is spectrally blue; its (0.5 $\mu$m)/(1.6 $\mu$m) color is more consistent with Miranda's value than Puck's value. Mab is therefore more likely a $\sim$6-km radius body with a Miranda-like surface than a 12-km radius body with a Puck-like surface, in agreement with prior work based on infrared upper limits, but we caution that a Puck-like color is only ruled out at the 2$\sigma$ level. We also report the first infrared photometry of Perdita, finding an integrated I/F of 31 $\pm$ 3 km$^2$ at 1.6 $\mu$m.
2023-07-25	ChildGAN: Large Scale Synthetic Child Facial Data Using Domain Adaptation in StyleGAN	Muhammad Ali Farooq et.al.	2307.13746v1	null	In this research work, we proposed a novel ChildGAN, a pair of GAN networks for generating synthetic boys and girls facial data derived from StyleGAN2. ChildGAN is built by performing smooth domain transfer using transfer learning. It provides photo-realistic, high-quality data samples. A large-scale dataset is rendered with a variety of smart facial transformations: facial expressions, age progression, eye blink effects, head pose, skin and hair color variations, and variable lighting conditions. The dataset comprises more than 300k distinct data samples. Further, the uniqueness and characteristics of the rendered facial features are validated by running different computer vision application tests which include CNN-based child gender classifier, face localization and facial landmarks detection test, identity similarity evaluation using ArcFace, and lastly running eye detection and eye aspect ratio tests. The results demonstrate that synthetic child facial data of high quality offers an alternative to the cost and complexity of collecting a large-scale dataset from real children.
2023-07-25	The GMRT archive atomic gas survey -- II. Mass modelling and dark matter halo properties across late-type spirals	Prerana Biswas et.al.	2307.13738v1	null	Studying the kinematics and mass modelling of galaxies from HI 21 cm data provides valuable insights into the properties of both the baryonic components and the dark matter halo in nearby galaxies. Despite many observational studies, mass modelling of galaxies remains challenging due to different limitations. For example, most of the previous studies involving mass modelling are based on rotation curves derived from two-dimensional velocity fields from HI or H$\alpha$ spectroscopic observation which are often affected by beam smearing and projection effect. However, kinematic modelling done by fitting the "Tilted ring model" to three-dimensional data cube is not affected by these issues. In this study, we present and compare 3D kinematic modelling of a pilot sample of eleven galaxies from the GMRT archive atomic gas survey (GARCIA) using two different publicly available pipelines. We model the observed HI rotation curve using 3.6 $\mu$m infrared data and SDSS r-band data for stellar contribution, HI surface density profile for gas, and Navarro-Frenk-White (NFW) profile for dark matter halo; and employ the Markov Chain Monte Carlo (MCMC) optimization method for parameter estimation. Further, to validate our analysis, we revisit important scaling relations, e.g., the M$_{gas}$-M$_{star}$ relation, M$_{star}$-M$_{halo}$ relation, M$_{gas}$-M$_{halo}$ relation and Baryonic Tully-Fisher relation (BTFR). The scaling relations from our analysis are broadly consistent with those reported in the literature. A larger sample of galaxies from GARCIA in the near future will allow studying these scaling relations in greater detail.
2023-07-25	Strong generation & (co)ghost index for module categories	Pat Lank et.al.	2307.13675v1	null	This work is concerned with both strong generation and (co)ghost index in the module category of a commutative noetherian ring. A sufficiency criterion is established for such rings to admit strong generators in their module category, and as a consequence, it answers affirmatively a question of Iyengar and Takahashi. Moreover, it is shown that any noetherian quasi-excellent ring of finite Krull dimension admits strong generators in its module category. Lastly, a local-to-global principle is established for (co)ghost index in the module category, and explicit computations are made.
2023-07-25	A class of rotating metrics in the presence of a scalar field	Behrouz Mirza et.al.	2307.13588v1	null	We consider a class of three parameter static and axially symmetric metrics that reduce to the Janis-Newman-Winicour (JNW) and $ \gamma$-metrics in certain limits of the parameters. We obtain rotating form of the metrics that are asymptotically flat, stationary and axisymmetric. In certain values of the parameters, the solutions represent the rotating JNW metric, rotating $ \gamma$-metric and Bogush-Gal'tsov (BG) metric. The singularities of rotating metrics are investigated. Using the light-ring method, we obtain the quasi normal modes (QNMs) related to rotating metrics in the eikonal limit. Finally, we investigate the precession frequency of a test gyroscope in the presence of the rotating metrics.
2023-07-25	The homotopy category of monomorphisms between projective modules	Abdolnaser Bahlekeh et.al.	2307.13559v1	null	Let $(S, \n)$ be a commutative noetherian local ring and $\omega\in\n$ be non-zerodivisor. This paper deals with the behavior of the category $\mon(\omega, \cp)$ consisting of all monomorphisms between finitely generated projective $S$-modules with cokernels annihilated by $\omega$. We introduce a homotopy category $\HT\mon(\omega, \cp)$, which is shown to be triangulated. It is proved that this homotopy category embeds into the singularity category of the factor ring $R=S/{(\omega)}$. As an application, not only the existence of almost split sequences {ending at indecomposable non-projective objects of} $\mon(\omega, \cp)$ is proven, but also the Auslander-Reiten translation, $\tau_{\mon}(-)$, is completely recognized. Particularly, it will be observed that any non-projective object of $\mon(\omega, \cp)$ with local endomorphism ring is invariant under the square of the Auslander-Reiten translation.
2023-07-25	Perturbed Block Toeplitz matrices and the non-Hermitian skin effect in dimer systems of subwavelength resonators	Habib Ammari et.al.	2307.13551v1	null	The aim of this paper is fourfold: (i) to obtain explicit formulas for the eigenpairs of perturbed tridiagonal block Toeplitz matrices; (ii) to make use of such formulas in order to provide a mathematical justification of the non-Hermitian skin effect in dimer systems by proving the condensation of the system's bulk eigenmodes at one of the edges of the system; (iii) to show the topological origin of the non-Hermitian skin effect for dimer systems and (iv) to prove localisation of the interface modes between two dimer structures with non-Hermitian gauge potentials of opposite signs based on new estimates of the decay of the entries of the eigenvectors of block matrices with mirrored blocks.
2023-07-25	Earth-based Stellar Occultation Predictions for Jupiter, Saturn, Uranus, Neptune, Titan, and Triton: 2023-2050	Richard G. French et.al.	2307.13530v1	null	In support of studies of decadal-timescale evolution of outer solar system atmospheres and ring systems, we present detailed Earth-based stellar occultation predictions for Jupiter, Saturn, Uranus, Neptune, Titan, and Triton for 2023-2050, based on the Gaia DR3 star catalog and near-IR K-band photometry from the 2MASS catalog. We tabulate the number of observable events by year and magnitude interval, reflecting the highly variable frequency of high-SNR events depending on the target's path relative to the star-rich regions of the Milky Way. We identify regions on Earth where each event is potentially observable, and for atmospheric occultations, we determine the latitude of the ingress and egress events. For Saturn, Uranus, and Neptune, we also compute the predicted ring occultation event times. We present representative subsets of the predicted events and highlight particularly promising events. Jupiter occultations with K $\leq$7 occur at a cadence of about one per year, with bright events at higher frequency in 2031 and 2043. Saturn occultations are much rarer, with only two predicted events with K $\leq$5 in 2032 and 2047. Ten Uranus ring occultations are predicted with K$\leq$10 for the period 2023 to 2050. Neptune traverses star-poor regions of the sky until 2068, resulting in only 13 predicted occultations for K$\leq$12 between 2023 and 2050. Titan has several high-SNR events between 2029--2031, whereas Triton is limited to a total of 22 occultations with K$\leq$15 between 2023 and 2050. Details of all predicted events are included in the Supplementary Online Material.
2023-07-25	XMM-Newton and INTEGRAL observations of the bright GRB 230307A : vanishing of the local absorption and limits on the dust in the Magellanic Bridge	Sandro Mereghetti et.al.	2307.13514v1	null	230307A is the second brightest gamma ray burst detected in more than 50 years of observations and is located in the direction of the Magellanic Bridge. Despite its long duration, it is most likely the result of the compact merger of a binary ejected from a galaxy in the local universe (redshift z=0.065). Our XMM-Newton observation of its afterglow at 4.5 days shows a power-law spectrum with photon index $\Gamma =1.73 \pm0.10$, unabsorbed flux $F_{0.3-10\,\rm keV}=(8.8\pm0.5)\times 10^{-14}$ erg cm$^{-2}$ s$^{-1}$ and no absorption in excess of that produced in our Galaxy and in the Magellanic Bridge. We derive a limit of $N_{\rm H}^{\rm HOST} < 5\times 10^{20}$ cm$^{-2}$ on the absorption at the GRB redshift, which is a factor $\sim\,$5 below the value measured during the prompt phase. We searched for the presence of dust scattering rings with negative results and set an upper limit of the order of $A_V<0.05$ on the absorption from dust in the Magellanic Bridge.
2023-07-25	Good codes from twisted group algebras	Samir Assuena et.al.	2307.13507v1	null	In this paper, we shall give an explicit proof that constacyclic codes over finite commutative rings can be realized as ideals in some twisted group rings. Also, we shall study isometries between those codes and, finally, we shall study k-Galois LCD constacyclic codes over finite fields. In particular, we shall characterize constacyclic LCD codes with respect to Euclidean inner product in terms of their idempotent generators and the classical involution using the twisted group algebra structures and find some good LCD codes.

wearable device

Publish Date	Title	Authors	PDF	Code	Abstract
2023-07-27	Exponential speedups for quantum walks in random hierarchical graphs	Shankar Balasubramanian et.al.	2307.15062v1	null	There are few known exponential speedups for quantum algorithms and these tend to fall into even fewer families. One speedup that has mostly resisted generalization is the use of quantum walks to traverse the welded-tree graph, due to Childs, Cleve, Deotto, Farhi, Gutmann, and Spielman. We show how to generalize this to a large class of hierarchical graphs in which the vertices are grouped into ``supervertices'' which are arranged according to a $d$-dimensional lattice. Supervertices can have different sizes, and edges between supervertices correspond to random connections between their constituent vertices. The hitting times of quantum walks on these graphs are related to the localization properties of zero modes in certain disordered tight binding Hamiltonians. The speedups range from superpolynomial to exponential, depending on the underlying dimension and the random graph model. We also provide concrete realizations of these hierarchical graphs, and introduce a general method for constructing graphs with efficient quantum traversal times using graph sparsification.
2023-07-27	IPv6 Hitlists at Scale: Be Careful What You Wish For	Erik Rye et.al.	2307.15057v1	null	Today's network measurements rely heavily on Internet-wide scanning, employing tools like ZMap that are capable of quickly iterating over the entire IPv4 address space. Unfortunately, IPv6's vast address space poses an existential threat for Internet-wide scans and traditional network measurement techniques. To address this reality, efforts are underway to develop ``hitlists'' of known-active IPv6 addresses to reduce the search space for would-be scanners. As a result, there is an inexorable push for constructing as large and complete a hitlist as possible. This paper asks: what are the potential benefits and harms when IPv6 hitlists grow larger? To answer this question, we obtain the largest IPv6 active-address list to date: 7.9 billion addresses, 898 times larger than the current state-of-the-art hitlist. Although our list is not comprehensive, it is a significant step forward and provides a glimpse into the type of analyses possible with more complete hitlists. We compare our dataset to prior IPv6 hitlists and show both benefits and dangers. The benefits include improved insight into client devices (prior datasets consist primarily of routers), outage detection, IPv6 roll-out, previously unknown aliased networks, and address assignment strategies. The dangers, unfortunately, are severe: we expose widespread instances of addresses that permit user tracking and device geolocation, and a dearth of firewalls in home networks. We discuss ethics and security guidelines to ensure a safe path towards more complete hitlists.
2023-07-27	Autocalibrating Gaze Tracking: A Demonstration through Gaze Typing	Akanksha Saran et.al.	2307.15039v1	null	Miscalibration of gaze tracking devices and the resulting need for repeat calibration are a significant barrier to use. As devices miscalibrate, people tend to auto-correct by gazing at neighboring targets, which makes it difficult to detect miscalibration from eye signals. To address this problem, we provide a novel and simple insight for autocalibrating eye trackers during gaze typing: the eyes are used as both input (i.e. typing) and output (i.e. reading) signals, but auto-correction by users only occurs when eye gaze is functioning as input. Thus, output eye gaze signals during reading can help systems detect the miscalibration offset and enable autocalibration. To demonstrate the potential for this type of approach, we designed and built an auto-calibration system for gaze typing and ran a user study with 15 able-bodied participants. Results from our user study suggest that such an implicit approach to autocalibration can significantly improve typing speed and overall user experience for gaze typing interfaces. Insights from our work are applicable to a broad set of gaze tracking technologies and may help create more seamless user experiences in a variety of domains.
2023-07-27	Dissipation-enabled bosonic Hamiltonian learning via new information-propagation bounds	Tim Möbus et.al.	2307.15026v1	null	Reliable quantum technology requires knowledge of the dynamics governing the underlying system. This problem of characterizing and benchmarking quantum devices or experiments in continuous time is referred to as the Hamiltonian learning problem. In contrast to multi-qubit systems, learning guarantees for the dynamics of bosonic systems have hitherto remained mostly unexplored. For $m$-mode Hamiltonians given as polynomials in annihilation and creation operators with modes arranged on a lattice, we establish a simple moment criterion in terms of the particle number operator which ensures that learning strategies from the finite-dimensional setting extend to the bosonic setting, requiring only coherent states and heterodyne detection on the experimental side. We then propose an enhanced procedure based on added dissipation that even works if the Hamiltonian time evolution violates this moment criterion: With high success probability it learns all coefficients of the Hamiltonian to accuracy $\varepsilon$ using a total evolution time of $\mathcal{O}(\varepsilon^{-2}\log(m))$. Our protocol involves the experimentally reachable resources of projected coherent state preparation, dissipative regularization akin to recent quantum error correction schemes involving cat qubits stabilized by a nonlinear multi-photon driven dissipation process, and heterodyne measurements. As a crucial step in our analysis, we establish our moment criterion and a new Lieb-Robinson type bound for the evolution generated by an arbitrary bosonic Hamiltonian of bounded degree in the annihilation and creation operators combined with photon-driven dissipation. Our work demonstrates that a broad class of bosonic Hamiltonians can be efficiently learned from simple quantum experiments, and our bosonic Lieb-Robinson bound may independently serve as a versatile tool for studying evolutions on continuous variable systems.
2023-07-27	Samplable Anonymous Aggregation for Private Federated Data Analysis	Kunal Talwar et.al.	2307.15017v1	null	We revisit the problem of designing scalable protocols for private statistics and private federated learning when each device holds its private data. Our first contribution is to propose a simple primitive that allows for efficient implementation of several commonly used algorithms, and allows for privacy accounting that is close to that in the central setting without requiring the strong trust assumptions it entails. Second, we propose a system architecture that implements this primitive and perform a security analysis of the proposed system.
2023-07-27	FLiCR: A Fast and Lightweight LiDAR Point Cloud Compression Based on Lossy RI	Jin Heo et.al.	2307.15005v1	null	Light detection and ranging (LiDAR) sensors are becoming available on modern mobile devices and provide a 3D sensing capability. This new capability is beneficial for perceptions in various use cases, but it is challenging for resource-constrained mobile devices to use the perceptions in real-time because of their high computational complexity. In this context, edge computing can be used to enable LiDAR online perceptions, but offloading the perceptions on the edge server requires a low-latency, lightweight, and efficient compression due to the large volume of LiDAR point clouds data. This paper presents FLiCR, a fast and lightweight LiDAR point cloud compression method for enabling edge-assisted online perceptions. FLiCR is based on range images (RI) as an intermediate representation (IR), and dictionary coding for compressing RIs. FLiCR achieves its benefits by leveraging lossy RIs, and we show the efficiency of bytestream compression is largely improved with quantization and subsampling. In addition, we identify the limitation of current quality metrics for presenting the entropy of a point cloud, and introduce a new metric that reflects both point-wise and entropy-wise qualities for lossy IRs. The evaluation results show FLiCR is more suitable for edge-assisted real-time perceptions than the existing LiDAR compressions, and we demonstrate the effectiveness of our compression and metric with the evaluations on 3D object detection and LiDAR SLAM.
2023-07-27	Unpinned Dirac-Fermions in Carbon-Phosphorous-Arsenic Based Ternary Monolayer	Amrendra Kumar et.al.	2307.15001v1	null	We predict energetically and dynamically stable ternary Carbon-Phosphorous-Arsenic (CPAs2) monolayers in buckled geometric structure by employing density functional theory based calculations. We consider three different symmetric configurations, namely, inversion (i), mirror (m) and rotational (r). The low-energy dispersions in electronic band structure and density of states (DOS) around the Fermi level contain two contrasting features: (a) parabolic dispersion around highly symmetric Gamma point with a step function in DOS due to nearly-free-particle-like Schroedinger-Fermions and (b) linear dispersion around highly symmetric K point with linear DOS due to massless Dirac-Fermions for i-CPAs2 monolayer. The step function in DOS is a consequence of two-dimensionality of the system in which the motion of nearly-free-particles is confined. However, a closer look at (b) reveals that the ternary monolayers possess distinct characters, namely (i) massless-gapless, (ii) slightly massive-gapped and (iii) unpinned massless-gapless Dirac-Fermions for i, m and r-CPAs2 configurations respectively. Thus, the nature of states around the Fermi level depends crucially on the symmetry of systems. In addition, we probe the influence of mechanical strain on the properties of CPAs2 monolayer. The results indicate that the characteristic dispersions of (a) and (b) move in opposite directions in energy which leads to a metal-to-semimetal transition in i and r-CPAs2 configurations, for a few percentages of tensile strain. On the other hand, a strain induced metal-to-semiconductor transition is observed in m-CPAs2 configuration with a tunable energy band gap. Interestingly, unlike graphene, the Dirac cones can be unpinned from highly symmetric K (and K') point, but they are restricted to move along the edges (K-M'-K') of first Brillouin zone due to C2 symmetry in i and r-CPAs2 configurations.
2023-07-27	Decomposing and Routing Quantum Circuits Under Constraints for Neutral Atom Architectures	Natalia Nottingham et.al.	2307.14996v1	null	Quantum computing is in an era defined by rapidly evolving quantum hardware technologies, combined with persisting high gate error rates, large amounts of noise, and short coherence times. Overcoming these limitations requires systems-level approaches that account for the strengths and weaknesses of the underlying hardware technology. Yet few hardware-aware compiler techniques exist for neutral atom devices, with no prior work on compiling to the neutral atom native gate set. In particular, current neutral atom hardware does not support certain single-qubit rotations via local addressing, which often requires the circuit to be decomposed into a large number of gates, leading to long circuit durations and low overall fidelities. We propose the first compiler designed to overcome the challenges of limited local addressibility in neutral atom quantum computers. We present algorithms to decompose circuits into the neutral atom native gate set, with emphasis on optimizing total pulse area of global gates, which dominate gate execution costs in several current architectures. Furthermore, we explore atom movement as an alternative to expensive gate decompositions, gaining immense speedup with routing, which remains a huge overhead for many quantum circuits. Our decomposition optimizations result in up to ~3.5x and ~2.9x speedup in time spent executing global gates and time spent executing single-qubit gates, respectively. When combined with our atom movement routing algorithms, our compiler achieves up to ~10x reduction in circuit duration, with over ~2x improvement in fidelity. We show that our compiler strategies can be adapted for a variety of hardware-level parameters as neutral atom technology continues to develop.
2023-07-27	A chemical reaction network implementation of a Maxwell demon	Massimo Bilancioni et.al.	2307.14994v1	null	We study an autonomous model of a Maxwell demon that works by rectifying thermal fluctuations of chemical reactions. It constitutes the chemical analog of a recently studied electronic demon. We characterize its scaling behavior in the macroscopic limit, its performances, and the impact of potential internal delays. We obtain analytical expressions for all quantities of interest, namely, the generated reverse chemical current, the output power, the transduction efficiency, and the correlations between the numbers of molecules. Due to a bound on the nonequilibrium response of its chemical reaction network, we find that, contrary to the electronic case, there is no way for the Maxwell demon to generate a finite output in the macroscopic limit. Finally, we analyze the information thermodynamics of the Maxwell demon from a bipartite perspective. In the limit of a fast demon, the information flow is obtained, its pattern in the state space is discussed, and the behavior of the partial efficiencies related to the measurement and the feedback processes is examined.
2023-07-27	Super-resolution enabled widefield quantum diamond microscopy	Feng Xu et.al.	2307.14990v1	null	Widefield quantum diamond microscopy (WQDM) based on Kohler-illumination has been widely adopted in the field of quantum sensing; however, practical applications are still limited by issues such as unavoidable photodamage and unsatisfactory spatial resolution. Here, we design and develop a super-resolution enabled WQDM using a digital micromirror device (DMD)-based structured illumination microscopy. With the rapidly programmable illumination patterns, we first demonstrate how to mitigate phototoxicity when imaging nanodiamonds in cell samples. As a showcase, we have performed super-resolved quantum sensing measurements of two individual nanodiamonds not even distinguishable with conventional WQDM. The DMD-powered WQDM presents not only excellent compatibility with quantum sensing solutions, but also strong advantages in high imaging speed, high resolution, low phototoxicity, and enhanced signal-to-background ratio, making it a competent tool for applications in demanding fields such as biomedical science.
2023-07-27	Super-tetragonal Sr4Al2O7: a versatile sacrificial layer for high-integrity freestanding oxide membranes	Jinfeng Zhang et.al.	2307.14966v1	null	Identifying a suitable water-soluble sacrificial layer is crucial to fabricating large-scale freestanding oxide membranes, which stimulates intriguing functionalities and enables novel integrations with semiconductor technologies. In this work, we introduce a new water-soluble sacrificial layer, "super-tetragonal" Sr4Al2O7 (SAOT). Its unique atomic structure ensures a coherent growth of perovskite ABO3/SAOT heterostructures, effectively inhibiting crack formation in the water-released membranes. For various non-ferroelectric oxide membranes with lattice constants ranging from 3.85 to 4.04 A, the crack-free areas can span up to millimeter-scale. The high water-solubility of SAOT shortens the exfoliation duration to a few minutes only. Our findings highlight the SAOT as an effective and versatile sacrificial layer for freestanding oxide membranes with superior integrity, crystallinity, and functionalities, further promoting their potential for innovative device applications.
2023-07-27	Interlayer coupling driven high-temperature superconductivity in La$_3$Ni$_2$O$_7$ under pressure	Chen Lu et.al.	2307.14965v1	null	The newly discovered high-temperature superconductivity in La$_3$Ni$_2$O$_7$ under pressure has attracted a great deal of interest. The essential ingredient characterizing the electronic properties is the bilayer NiO$_2$ planes, in which the two layers couple with each other from the bonding of Ni-$3d$ orbital through the intermediate oxygen-atoms. In the strong coupling limit, an intralayer antiferromagnetic spin-exchange interaction $J_{\parallel}$ between $3d_{x^2-y^2}$ orbitals and an interlayer one $J_{\perp}$ between $3d_{z^2}$ orbitals are generated. Taking into account the Hund's rule at each site and integrating out the $3d_{z^2}$ spin degree of freedom, the system reduces to a single-orbital bilayer $t$-$J$ model of the $3d_{x^2-y^2}$. Based on the slave-boson approach, the self-consistent equation for the hopping and pairing order parameters is solved. Near the relevant $\frac{1}{4}$-filling regime (doping $\delta=0.3\sim 0.5$), the inter-layer coupling $J_{\perp}$ tunes the conventional single-layer $d$-wave superconducting state to the $s$-wave one. A strong $J_{\perp}$ could enhance the intra-layer superconducting order, leading to a dramatically increased $T_c$. Interestingly, there could exist a finite regime in which an $s+id$ state emerges.
2023-07-27	Probing an ultralight QCD axion with electromagnetic quadratic interaction	Hyungjin Kim et.al.	2307.14962v1	null	The axion-gluon coupling is the defining feature of the QCD axion. This feature induces additional and qualitatively different interactions of the axion with standard model particles -- quadratic couplings. Previously, hadronic quadratic couplings have been studied and experimental implications have been explored especially in the context of atomic spectroscopy and interferometry. We investigate additional quadratic couplings to the electromagnetic field and electron mass. These electromagnetic quadratic couplings are generated at the loop level from threshold corrections and are expected to be present in the absence of fine-tuning. While they are generally loop-suppressed compared to the hadronic ones, they open up new ways to search for the QCD axion, for instance via optical atomic clocks. Moreover, due to the velocity spread of the dark matter field, the quadratic nature of the coupling leads to low-frequency fluctuations in any detector setup. These distinctive low-frequency fluctuations offer a way to search for heavier axions. We provide an analytic expression for the power spectral density of this low-frequency background and briefly discuss experimental strategies for a low-frequency background search.
2023-07-27	Confronting axial-vector form factor from lattice QCD with MINERvA antineutrino-proton data	Oleksandr Tomalak et.al.	2307.14920v1	null	We compare recent MINERvA antineutrino-hydrogen charged-current measurements to phenomenological predictions of the axial-vector form factor based on fits to all available electron scattering and deuterium bubble-chamber data and to representative lattice-QCD (LQCD) determination by the PNDME Collaboration. While there is $1$--$2\sigma$ agreement in the cross section with MINERvA data for each bin in $Q^2$, we identify three regions with different relevance and opportunity for LQCD predictions. For $Q^2 \lesssim 0.2~\mathrm{GeV}^2$, the phenomenological extractions have large number of data points and LQCD is competitive, while MINERvA data have large errors. For $0.2~\mathrm{GeV}^2 \lesssim Q^2 \lesssim 1~\mathrm{GeV}^2$, LQCD is competitive with the MINERvA determination, and both give values larger than from phenomenological extraction. For $Q^2 > 1~\mathrm{GeV}^2$, the MINERvA data are the most precise. Our analysis indicates that with improving precision of MINERvA-like and LQCD data, the uncertainty in the nucleon axial-vector form factor will be steadily reduced.
2023-07-27	Contribution of hadronic light-by-light scattering to the hyperfine structure of muonium	V. I. Korobov et.al.	2307.14916v1	null	The contribution of hadronic scattering of light-by-light to the hyperfine structure of muonium is calculated using experimental data on the transition form factors of two photons into a hadron. The amplitudes of interaction between a muon and an electron with horizontal and vertical exchange are constructed. The contributions due to the exchange of pseudoscalar, axial vector, scalar and tensor mesons are taken into account.
2023-07-27	Anisotropic multiband superconductivity in 2M-WS$_{2}$ probed by controlled disorder	Sunil Ghimire et.al.	2307.14891v1	null	The intrinsically superconducting Dirac semimetal 2M-WS$_{2}$ is a promising candidate to realize proximity-induced topological superconductivity in its protected surface states. A precise characterization of the bulk superconducting state is essential for understanding the nature of surface superconductivity in the system. Here, we perform a detailed experimental study of the temperature and nonmagnetic disorder dependence of the London penetration depth $\lambda$, the upper critical field $H$, and the superconducting transition temperature $T_c$ in 2M-WS$_{2}$. We observe a power-law dependence $\lambda(T) - \lambda(0) \propto T^{3}$ at temperatures below $0.35~T_c$, which is remarkably different from the expected exponential attenuation of a fully gapped isotropic $s$-wave superconductor. We then probe the effect of controlled nonmagnetic disorder induced by 2.5 MeV electron irradiation at various doses and find a significant $T_c$ suppression rate. Together with the observed increase of the slope $dH/dT
2023-07-27	Spectrum of the hole excitation in spin-orbit Mott insulator Na$_2$IrO$_3$	Wei Wang et.al.	2307.14885v1	null	We study the motion of a hole with internal degrees of freedom, introduced to the zigzag magnetic ground state of Na$_2$IrO$_3$, by using the self-consistent Born approximation. We find that the low, intermediate, and high-energy spectra are primarily attributed to the singlet, triplet, and quintet hole contributions, respectively. The spectral functions exhibit distinct features such as the electron-like dispersion of low-energy states near the $\Gamma$ point, the maximum M-point intensity of mid-energy states, and the hole-like dispersion of high-energy states. These features are robust and almost insensitive to the exchange model and Hund's coupling, and are in qualitative agreement with the angular-resolved photoemission spectra observed in Na$_2$IrO$_3$. Our results reveal that the interference between internal degrees of freedom in different sublattices plays an important role in inducing the complex dispersions.
2023-07-27	Don't Shoot the Messenger: Localization Prevention of Satellite Internet Users	David Koisser et.al.	2307.14879v1	null	Satellite Internet plays an increasingly important role in geopolitical conflicts. This notion was affirmed in the Ukrainian conflict escalating at the beginning of 2022, with the large-scale deployment of the Starlink satellite Internet service which consequently demonstrated the strategic importance of a free flow of information. Aside from military use, many citizens publish sensitive information on social media platforms to influence the public narrative. However, the use of satellite communication has proven to be dangerous, as the signals can be monitored by other satellites and used to triangulate the source on the ground. Unfortunately, the targeted killings of journalists have shown this threat to be effective. While the increasing deployment of satellite Internet systems gives citizens an unprecedented mouthpiece in conflicts, protecting them against localization is an unaddressed problem. To address this threat, we present AnonSat, a novel scheme to protect satellite Internet users from triangulation. AnonSat works with cheap off-the-shelf devices, leveraging long-range wireless communication to span a local network among satellite base stations. This allows rerouting users' communication to other satellite base stations, some distance away from each user, thus, preventing their localization. AnonSat is designed for easy deployment and usability, which we demonstrate with a prototype implementation. Our large-scale network simulations using real-world data sets show the effectiveness of AnonSat in various practical settings.
2023-07-27	Search for Light Dark Matter with accelerator and direct detection experiments: comparison and complementarity of recent results	S. N. Gninenko et.al.	2307.14865v1	null	We discuss the most sensitive constraints on Light Dark Matter (LDM) from accelerator experiments NA64 and BaBar and compare them with recent results from direct searches at XENON1T, DAMIC-M, SuperCDMS, and DarkSide-50. We show that for the dark photon ($A'$) model with scalar LDM, NA64 gives more stringent bounds for $A'$ masses $m_{A'} \leq 0.15~GeV$ than direct searches. Moreover, for the case of Majorana LDM the damping DM velocity $v$ factor, $v^2 \sim O(10^{-6})$, for the elastic LDM electron(nucleon) cross section makes direct observation of Majorana LDM extremely challenging, while the absence of this suppression in the NA64 case gives an advantage to the experiment. A similar situation holds for pseudo-Dirac LDM. BaBar provides the most stringent bounds for $A'$ masses $m_{A'} \geq 0.35~GeV$. For scalar LDM the direct detection experiments give more stringent bounds at $m_{A'} \geq 0.35~GeV$ while for the Majorana and pseudo-Dirac LDM cases, the BaBar bounds are more stringent. The complementarity of the two approaches in searching for LDM is underlined.
2023-07-27	Comparative Evaluation of Digital and Analog Chest Radiographs to Identify Tuberculosis using Deep Learning Model	Subhankar Chattoraj et.al.	2307.14859v1	null	Purpose: Chest X-ray (CXR) is an essential tool and one of the most prescribed imaging modalities for detecting pulmonary abnormalities, with a yearly estimate of over 2 billion imaging studies performed worldwide. However, the accurate and timely diagnosis of TB remains an unmet goal. The prevalence of TB is highest in low-middle-income countries, and a portable, automated, and reliable solution is required. In this study, we compared the performance of DL-based devices on digital and analog CXR. The evaluated DL-based device can be used in resource-constrained settings. Methods: A total of 10,000 CXR DICOMs (.dcm) and printed photos of the films acquired with three different cellular phones - Samsung S8, iPhone 8, and iPhone XS along with their radiological report were retrospectively collected from various sites across India from April 2020 to March 2021. Results: 10,000 chest X-rays were utilized to evaluate the DL-based device in identifying radiological signs of TB. The AUC of qXR for detecting signs of tuberculosis on the original DICOMs dataset was 0.928 with a sensitivity of 0.841 at a specificity of 0.806. At an optimal threshold, the difference in the AUC of three cellular smartphones with the original DICOMs is 0.024 (2.55%), 0.048 (5.10%), and 0.038 (1.91%). The minimum difference demonstrates the robustness of the DL-based device in identifying radiological signs of TB in both digital and analog CXR.
2023-07-27	Thermal one-point functions: CFT's with fermions, large $d$ and large spin	Justin R. David et.al.	2307.14847v1	null	We apply the OPE inversion formula on thermal two-point functions of fermions to obtain thermal one-point function of fermion bi-linears appearing in the corresponding OPE. We primarily focus on the OPE channel which contains the stress tensor of the theory. We apply our formalism to the mean field theory of fermions and verify that the inversion formula reproduces the spectrum as well as their corresponding thermal one-point functions. We then examine the large $N$ critical Gross-Neveu model in $d=2k+1$ dimensions with $k$ even and at finite temperature. We show that stress tensor evaluated from the inversion formula agrees with that evaluated from the partition function at the critical point. We demonstrate the expectation values of 3 different classes of higher spin currents are all related to each other by numerical constants, spin and the thermal mass. We evaluate the ratio of the thermal expectation values of higher spin currents at the critical point to the Gaussian fixed point or the Stefan-Boltzmann result, both for the large $N$ critical $O(N)$ model and the Gross-Neveu model in odd dimensions. This ratio is always less than one and it approaches unity on increasing the spin with the dimension $d$ held fixed. The ratio however approaches zero when the dimension $d$ is increased with the spin held fixed.
2023-07-27	Semi-Grant-Free Orthogonal Multiple Access with Partial-Information for Short Packet Transmissions	Alberto Rech et.al.	2307.14846v1	null	Next-generation internet-of-things (IoT) networks require extremely low latency, complexity, and collision probability. We introduce the novel partial-information multiple access (PIMA) scheme, a semi-grant-free (GF) coordinated random access (RA) protocol for short packet transmission, with the aim of reducing the latency and packet loss of traditional multiple access schemes, as well as more recent preamble-based schemes. With PIMA, the base station (BS) acquires partial information on instantaneous traffic conditions in the partial information acquisition (PIA) sub-frame, estimating the number of active devices, i.e., having packets waiting for transmission in their queue. Based on this estimate, the BS chooses both the total number of slots to be allocated in the data transmission (DT) sub-frame and the respective user-to-slot assignment. Although collisions may still occur due to multiple users assigned to the same slot, they are drastically reduced with respect to the slotted ALOHA (SALOHA) scheme, while achieving lower latency than both time-division multiple-access (TDMA) and preamble-based protocols, due to the extremely reduced overhead of the PIA sub-frame. Finally, we analyze and assess the performance of PIMA under various activation statistics, proving the robustness of the proposed solution to the intensity of traffic, also with burst traffic.
2023-07-27	Frustration-Induced Superconductivity in the $t$-$t'$ Hubbard Model	Changkai Zhang et.al.	2307.14835v1	null	The two-dimensional (2D) Hubbard model is widely believed to capture key ingredients of high-$T_c$ superconductivity in cuprate materials. However, compelling evidence remains elusive. In particular, various magnetic orders may emerge as strong competitors of superconducting orders. Here, we study the ground state properties of the doped 2D $t$-$t'$ Hubbard model on a square lattice via the infinite Projected Entangled-Pair State (iPEPS) method with $\mathrm{U}(1)$ or $\mathrm{SU}(2)$ spin symmetry. The former is compatible with antiferromagnetic orders, while the latter forbids them. Therefore, we obtain by comparison a detailed understanding of the magnetic impact on superconductivity. Moreover, an additional $t'$ term accommodates the particle-hole asymmetry, which facilitates studies on the discrepancies between electron- and hole-doped systems. We demonstrate that (i) a positive $t'/t$ significantly amplifies the strength of superconducting orders; (ii) at sufficiently large doping levels, the $t$-$t'$ Hubbard model favors a uniform state with superconducting orders instead of stripe states with charge and spin modulations; and (iii) the enhancement of magnetic frustration, by increasing either the strength of NNN interactions or the charge doping, impairs stripe orders and helps stabilize superconductivity.
2023-07-27	Disturbance Preview for Nonlinear Model Predictive Trajectory Tracking of Underwater Vehicles in Wave Dominated Environments	Kyle L. Walker et.al.	2307.14834v1	null	Operating in the near-vicinity of marine energy devices poses significant challenges to the control of underwater vehicles, predominantly due to the presence of large magnitude wave disturbances causing hazardous state perturbations. Approaches to tackle this problem have varied, but one promising solution is to adopt predictive control methods. Given the predictable nature of ocean waves, the potential exists to incorporate disturbance estimations directly within the plant model; this requires inclusion of a wave predictor to provide online preview information. To this end, this paper presents a Nonlinear Model Predictive Controller with an integrated Deterministic Sea Wave Predictor for trajectory tracking of underwater vehicles. State information is obtained through an Extended Kalman Filter, forming a complete closed-loop strategy and facilitating online wave load estimations. The strategy is compared to a similar feed-forward disturbance mitigation scheme, showing mean performance improvements of 51% in positional error and 44.5% in attitude error. The preliminary results presented here provide strong evidence of the proposed method's high potential to effectively mitigate disturbances, facilitating accurate tracking performance even in the presence of high wave loading.
2023-07-27	Test of $^{116}$CdWO$_4$ and Li$_2$MoO$_4$ scintillating bolometers in the CROSS underground facility with upgraded detector suspension	A. Ahmine et.al.	2307.14831v1	null	In preparation for the CROSS $2\beta$ decay experiment, we installed a new detector suspension with magnetic damping inside a pulse-tube cryostat of a dedicated low-background facility at the LSC (Spain). The suspension was tested with two scintillating bolometers based on large-volume $^{116}$CdWO$_4$ (CWO-enr) and Li$_2$MoO$_4$ (LMO) crystals. The former, a reference device, was used for testing new noise conditions and for comparing bolometric performance of an advanced Li$_2$MoO$_4$ crystal developed in the framework of the CLYMENE project, in view of next-generation double-beta decay experiments like CUPID. We cooled down the detectors to 15 mK and achieved high performance for all tested devices. In particular, both CWO-enr and LMO bolometers demonstrated an energy resolution of 6 keV FWHM for the 2.6 MeV gamma quanta, among the best for thermal detectors based on such compounds. The baseline noise resolution (FWHM) of the CWO-enr detector was improved by 2 keV, compared to the best previous measurement of this detector in the CROSS facility, while the noise of the Ge-based optical bolometer was improved by a factor of 2, to 100 eV FWHM. Despite the evident progress in improving the noise conditions of the set-up, we see high-frequency harmonics of pulse-tube-induced noise, suggesting noise pick-up by the cabling. Another Ge light detector was assisted with signal amplification exploiting the Neganov-Trofimov-Luke effect, which allowed us to reach 20 eV FWHM noise resolution by applying a 60 V electrode bias. Highly efficient particle identification was achieved with both detectors, despite the low scintillation efficiency of the LMO material. The radiopurity level of the LMO crystal is rather high; only traces of $^{210}$Po and $^{226}$Ra were detected (0.1 mBq/kg each), while the $^{228}$Th activity is expected to be at least an order of magnitude lower, and the $^{40}$K activity is found to be < 6 mBq/kg.
2023-07-27	Understanding the polaritonic ground state in cavity quantum electrodynamics	Tor S. Haugland et.al.	2307.14822v1	null	Molecular polaritons arise when molecules interact so strongly with light that they become entangled with each other. This light-matter hybridization alters the chemical and physical properties of the molecular system and allows chemical reactions to be controlled without the use of external fields. We investigate the impact of strong light-matter coupling on the electronic structure using perturbative approaches and demonstrate that Rayleigh-Schr\"odinger perturbation theory can reproduce the ground state energies in optical cavities to comparable accuracy as ab initio cavity quantum electrodynamics methodologies for currently relevant coupling strengths. The method is effective in both low and high cavity frequency regimes and straightforward to implement via response functions. Furthermore, we establish simple relations between cavity-induced intermolecular forces and van der Waals forces. These findings provide valuable insight into the manipulation of ground-state polaritonic energy landscapes, shedding light on the systems and conditions in which modifications can be achieved.
2023-07-27	Experimental validation of particle-in-cell/Monte Carlo collisions simulations in low-pressure neon capacitively coupled plasmas	Chan-Won Park et.al.	2307.14821v1	null	Plasma simulations are powerful tools for understanding fundamental plasma science phenomena and for process optimization in applications. To ensure their quantitative accuracy, they must be validated against experiments. In this work, such an experimental validation is performed for a 1d3v particle-in-cell simulation complemented with the Monte Carlo treatment of collision processes of a capacitively coupled radio frequency plasma driven at 13.56 MHz and operated in neon gas. In a geometrically symmetric reactor, the electron density in the discharge center and the spatio-temporal distribution of the electron impact excitation rate from the ground into the Ne 2p$_1$ state are measured by a microwave cutoff probe and phase resolved optical emission spectroscopy, respectively. The measurements are conducted for electrode gaps between 50 mm and 90 mm, neutral gas pressures between 20 mTorr and 50 mTorr, and peak-to-peak values of the driving voltage waveform between 250 V and 650 V. Simulations are performed under identical discharge conditions. In the simulations, various combinations of surface coefficients characterising the interactions of electrons and heavy particles with the anodized aluminium electrode surfaces are adopted. We find that the simulations using a constant effective heavy particle induced secondary electron emission coefficient of 0.3 and a realistic electron-surface interaction model (which considers energy-dependent and material specific elastic and inelastic electron reflection, as well as the emission of true secondary electrons from the surface) yield results which are in good quantitative agreement with the experimental data.
2023-07-27	High-temperature superconductivity with zero-resistance and strange metal behavior in La$_{3}$Ni$_{2}$O$_{7}$	Yanan Zhang et.al.	2307.14819v1	null	Recently signatures of superconductivity were observed close to 80 K in La$_{3}$Ni$_{2}$O$_{7}$ under pressure [1]. This discovery positions La$_{3}$Ni$_{2}$O$_{7}$ as the first bulk nickelate with high-temperature superconductivity, but the lack of zero resistance presents a significant drawback for validating the findings. Here we show that La$_{3}$Ni$_{2}$O$_{7}$ exhibits zero resistance upon applying hydrostatic pressure up to 29.2~GPa using a liquid pressure medium. We find that La$_{3}$Ni$_{2}$O$_{7}$ remains metallic under applied pressures, suggesting the absence of a metal-insulator transition proximate to the superconductivity. Analysis of the normal state $T$-linear resistance suggests an intricate link between this strange metal behaviour and superconductivity, whereby at high pressures both the linear resistance coefficient and superconducting transition are slowly suppressed by pressure, while at intermediate pressures both the superconductivity and strange metal behaviour appear disrupted, possibly due to a nearby structural instability. The association between strange metal behaviour and high-temperature superconductivity is very much in line with diverse classes of unconventional superconductors, including the cuprates and iron-pnictides [2-6]. Understanding the superconductivity of La$_{3}$Ni$_{2}$O$_{7}$ evidently requires further revealing the interplay of strange metal behaviour, superconductivity, as well as possible competing electronic or structural phases.
2023-07-27	Polarization properties of X-ray tubes used for Imaging X-ray Polarimetry Explorer calibration	Ajay Ratheesh et.al.	2307.14814v1	null	In this work, we measured the polarization properties of the X-rays emitted from the X-ray tubes, which were used during the calibration of the instrument onboard Imaging X-ray Polarimetry Explorer (IXPE). X-ray tubes are used as a source of unpolarized X-rays to calibrate the response of the gas pixel detectors to unpolarized radiation. However, even though the characteristic fluorescent emission lines are unpolarized, continuum bremsstrahlung emission can be polarized based on the geometry of the accelerated electrons and emitted photons. Hence, characterizing the contribution of polarized X-rays from bremsstrahlung emission is of interest, also for future measurements. We find that when accelerated electrons are parallel to the emitted photons, the bremsstrahlung emission is unpolarized, and when they are perpendicular, the polarization increases with energy, as expected from the theoretical predictions. A comparison with the theoretical predictions is also shown.
2023-07-27	Understanding magnetoelectric switching in BiFeO$_3$ thin films	Natalya S. Fedorova et.al.	2307.14789v1	null	In this work we use a phenomenological theory of ferroelectric switching in BiFeO$_3$ thin films to uncover the mechanism of the two-step process that leads to the reversal of the weak magnetization of these materials. First, we introduce a realistic model of a BiFeO$_3$ film, including the Landau energy of isolated domains as well as the constraints that account for the presence of the substrate and the multidomain configuration found experimentally. We use this model to obtain statistical information about the switching behavior - by running dynamical simulations based on the Landau-Khalatnikov time-evolution equation, including thermal fluctuations - and we thus identify the factors that drive the two-step polarization reversal observed in the experiments. Additionally, we apply our model to test potential strategies for optimizing the switching characteristics.

Electromyography

Publish Date	Title	Authors	PDF	Code	Abstract
2023-07-25	Gait Cycle-Inspired Learning Strategy for Continuous Prediction of Knee Joint Trajectory from sEMG	Xueming Fu et.al.	2307.13209v1	null	Predicting lower limb motion intent is vital for controlling exoskeleton robots and prosthetic limbs. Surface electromyography (sEMG) has attracted increasing attention in recent years as it enables ahead-of-time prediction of motion intentions before actual movement. However, the estimation performance of human joint trajectory remains a challenging problem due to inter- and intra-subject variations. The former is related to physiological differences (such as height and weight) and preferred walking patterns of individuals, while the latter is mainly caused by irregular and gait-irrelevant muscle activity. This paper proposes a model integrating two gait cycle-inspired learning strategies to mitigate the challenge of predicting human knee joint trajectory. The first strategy is to decouple knee joint angles into motion patterns and amplitudes: the former exhibit low variability while the latter show high variability among individuals. By learning through separate network entities, the model manages to capture both the common and personalized gait features. In the second strategy, muscle principal activation masks are extracted from gait cycles in a prolonged walk. These masks are used to filter out components unrelated to walking from raw sEMG and provide auxiliary guidance to capture more gait-related features. Experimental results indicate that our model could predict knee angles 50 ms ahead of time with an average root mean square error (RMSE) of 3.03(0.49) degrees. To our knowledge, this is the best performance reported in the relevant literature, reducing the RMSE by at least 9.5%.
2023-07-20	Analysis of the rate of force development reveals high neuromuscular fatigability in elderly patients with chronic kidney disease	Antoine Chatrenet et.al.	2307.10691v1	null	Background: Chronic kidney disease (CKD) induces muscle wasting and a reduction in the maximum voluntary force (MVF). Little is known about the neuromuscular fatigability in CKD patients, defined as the reduction of muscle force capacities during exercise. Neuromuscular fatigability is a crucial physical parameter of daily living. The quantification of explosive force has been shown to be a sensitive means to assess neuromuscular fatigability. Thus, our study used explosive force estimates to assess neuromuscular fatigability in elderly CKD patients. Methods: Inclusion criteria for CKD patients were age $\ge$ 60 years old and glomerular filtration rate (GFR) < 45 mL/min/1.73 m$^2$ not on dialysis, and those for controls were GFR > 60 mL/min/1.73 m$^2$, age and diabetes matched. The fatigability protocol focused on a handgrip task coupled with surface electromyography (sEMG). Scalars were extracted from the rate of force development (RFD): absolute and normalized time periods (50, 75, 100, 150 and 200 ms; RFD$_{50}$, RFD$_{75}$, RFD$_{100}$, RFD$_{150}$ and RFD$_{200}$, respectively), peak RFD (RFD$_{peak}$ in absolute; NRFD$_{peak}$ normalized), time-to-peak RFD (t-RFD$_{peak}$) and the relative force at RFD$_{peak}$ (MVF-RFD$_{peak}$). A statistical parametric mapping approach was performed on the force, impulse and RFD-time curves. The integrated sEMG with time at 0-30, 0-50, 0-100 and 0-200 ms time intervals relative to onset of sEMG activity was extracted and groups were compared separately for each sex. Results: The cohort of 159 individuals had a median age of 69 (9 IQR) years and body mass index was 27.6 (6.2 IQR) kg/m$^2$. Propensity-score-matched groups balanced CKD patients and controls by gender with 66 males and 34 females. In scalar analysis, CKD patients manifested a higher decrement than controls in the early phase of contraction, regarding the NRFD$_{peak}$ (P = 0.009; $\eta_p^2$ = 0.034) and RFD$_{75}$ and RFD$_{100}$ (for both P < 0.001; $\eta_p^2$ = 0.068 and 0.064). The one-dimensional analysis confirmed that CKD males manifest higher and delayed neuromuscular fatigability, especially before 100 ms from onset of contraction. sEMG was lower in CKD patients than controls in the 0-100 ms (at rest: P = 0.049, Cohen's d = 0.458) and 0-200 ms (at rest: P = 0.016, Cohen's d = 0.496; during exercise: P = 0.006, Cohen's d = 0.421) time windows. Controls showed greater decrease of sEMG than CKD patients in the 0-30 ms (P = 0.020, Cohen's d = 0.533) and 0-50 ms (P = 0.010, Cohen's d = 0.640) time windows. As opposed to females, males showed almost the same differences between groups. Conclusions: Our study is the first to show that CKD patients have higher fatigability than controls, which may be associated with an impaired motor-unit recruitment, highlighting a neural drive disturbance with CKD. Further studies are needed to confirm these findings.
2023-07-13	Combining Vision and EMG-Based Hand Tracking for Extended Reality Musical Instruments	Max Graf et.al.	2307.10203v1	null	Hand tracking is a critical component of natural user interactions in extended reality (XR) environments, including extended reality musical instruments (XRMIs). However, self-occlusion remains a significant challenge for vision-based hand tracking systems, leading to inaccurate results and degraded user experiences. In this paper, we propose a multimodal hand tracking system that combines vision-based hand tracking with surface electromyography (sEMG) data for finger joint angle estimation. We validate the effectiveness of our system through a series of hand pose tasks designed to cover a wide range of gestures, including those prone to self-occlusion. By comparing the performance of our multimodal system to a baseline vision-based tracking method, we demonstrate that our multimodal approach significantly improves tracking accuracy for several finger joints prone to self-occlusion. These findings suggest that our system has the potential to enhance XR experiences by providing more accurate and robust hand tracking, even in the presence of self-occlusion.
2023-07-08	A Physics-Informed Low-Shot Learning For sEMG-Based Estimation of Muscle Force and Joint Kinematics	Yue Shi et.al.	2307.05361v1	null	Muscle force and joint kinematics estimation from surface electromyography (sEMG) are essential for real-time biomechanical analysis of the dynamic interplay among neural muscle stimulation, muscle dynamics, and kinetics. Recent advances in deep neural networks (DNNs) have shown the potential to improve biomechanical analysis in a fully automated and reproducible manner. However, the small sample nature and physical interpretability of biomechanical analysis limit the applications of DNNs. This paper presents a novel physics-informed low-shot learning method for sEMG-based estimation of muscle force and joint kinematics. This method seamlessly integrates Lagrange's equation of motion and an inverse dynamic muscle model into the generative adversarial network (GAN) framework for structured feature decoding and extrapolated estimation from the small sample data. Specifically, Lagrange's equation of motion is introduced into the generative model to restrain the structured decoding of the high-level features following the laws of physics. In addition, a physics-informed policy gradient is designed to improve the adversarial learning efficiency by rewarding the consistent physical representation of the extrapolated estimations and the physical references. Experimental validations are conducted on two scenarios (i.e., the walking trials and wrist motion trials). Results indicate that the estimations of the muscle forces and joint kinematics are unbiased compared to the physics-based inverse dynamics, which outperforms the selected benchmark methods, including physics-informed convolution neural network (PI-CNN), vanilla generative adversarial network (GAN), and multi-layer extreme learning machine (ML-ELM).
2023-07-07	An Improved Compound Gaussian Model for Bivariate Surface EMG Signals Related to Strength Training	Durgesh Kusuru et.al.	2307.03403v1	null	Recent literature suggests that surface electromyography (sEMG) signals have non-stationary statistical characteristics, specifically due to the random nature of the covariance. Thus, the suitability of a statistical model for sEMG signals is determined by the choice of an appropriate model for describing the covariance. The purpose of this study is to propose a Compound-Gaussian (CG) model for multivariate sEMG signals in which the latent variable of the covariance is modeled as a random variable that follows an exponential model. The parameters of the model are estimated using the iterative Expectation Maximization (EM) algorithm. Further, a new dataset, electromyography analysis of human activities database 2 (EMAHA-DB2), is developed. Based on the model fitting analysis on the sEMG signals from EMAHA-DB2, it is found that the proposed CG model fits more closely to the empirical pdf of sEMG signals than the existing models. The proposed model is validated by visual inspection, and further validated by matching central moments and better quantitative metrics in comparison with other models. The proposed compound model provides an improved fit to the statistical behavior of sEMG signals. Further, the estimate of the rate parameter of the exponential model shows a clear relation to the training weights. Finally, the average signal power estimates of the channels show a distinctive dependency on the training weights, the subject's training experience and the type of activity.
2023-06-29	Labour Monitoring in Pregnant Women Using Phonocardiography, Electrocardiography and Electromyography Technique	Anushka Tiwari et.al.	2306.17198v1	null	Continuous monitoring of fetal and maternal vital signs, particularly during labor, can be critical for the child and mother's health. We present a novel wearable electronic system that measures, in real time, maternal heart rate using phonocardiography (PCG) and electrocardiography (ECG), and uterine contractions (UC) using electromyography (EMG). In later stages of the design, we employed the ECG technique for maternal heart rate monitoring. The heart rate is determined using moving average filters to remove noise from the signal and the autocorrelation function (ACF) to determine periodicity. For UC monitoring, we kept the same EMG technique. We also tried employing the EMG technique to monitor the fetal heart rate (FHR). However, in later stages of this design, this idea was abandoned, as we concluded that it needs further research on pregnancy stages and would require more intricate sensor integration that might not be within our reach at the moment. The system is accurate, low-cost, and portable, so it can be deployed at primary healthcare centers in low-income countries. The system can also be used by women in the comfort of their homes. At the same time, the data collected is transferred to their doctor for analysis and diagnosis, which can bring a revolutionary change in the continuous monitoring of fetal wellbeing during labor.
2023-06-22	Influence of Force-Length Relationship and Task-Specific Constraints on Finger Force-Generating Capacities	Benjamin Goislard de Monsabert et.al.	2306.12842v1	null	Grip strength loss in extended and flexed wrist postures has been explained by reduced force-generating capacities of extrinsic finger flexors resulting from non-optimal length, owing to the force-length relationship. Recent works suggested that other muscles, especially wrist extensors, participate in this grip strength loss. The objective of this study was to clarify the role of the force-length relationship in finger force production. 18 participants performed maximal isometric finger force production during pinch grip (Pinch) and four-finger pressing (Press) tasks in four different wrist postures (extended, flexed, neutral, spontaneous). The maximum finger force (MFF), finger and wrist joint angles, as well as activation of four muscles were determined using dynamometry, motion capture, and electromyography. The force and length of the four muscles were estimated from joint angles and muscle activation using a musculoskeletal model. MFF decreased for flexed wrist during Pinch but remained stable across wrist postures during Press. The results suggested that the loss of pinch grip force in deviated wrist posture is partially related to the force-length relationship of finger extensors. In contrast, MFF during Press was not influenced by the modulation of muscle capacities but was probably first limited by mechanical and neural factors related to finger interdependence.
2023-05-28	Multi-Modal Wireless Flexible Gel-Free Sensors with Edge Deep Learning for Detecting and Alerting Freezing of Gait in Parkinson's Patients	Yuhan Hou et.al.	2305.17629v1	null	Freezing of gait (FoG) is a debilitating symptom of Parkinson's disease (PD). This work develops flexible wearable sensors that can detect FoG and alert patients and companions to help prevent falls. FoG is detected on the sensors using a deep learning (DL) model with multi-modal sensory inputs collected from distributed wireless sensors. Two types of wireless sensors are developed, including: (1) a C-shape central node placed around the patient's ears, which collects electroencephalogram (EEG), detects FoG using an on-device DL model, and generates auditory alerts when FoG is detected; (2) a stretchable patch-type sensor attached to the patient's legs, which collects electromyography (EMG) and movement information from accelerometers. The patch-type sensors wirelessly send collected data to the central node through low-power ultra-wideband (UWB) transceivers. All sensors are fabricated on flexible printed circuit boards. Adhesive gel-free acetylene carbon black and polydimethylsiloxane electrodes are fabricated on the flexible substrate to allow conformal wear over the long term. Custom integrated circuits (IC) are developed in 180 nm CMOS technology and used in both types of sensors for signal acquisition, digitization, and wireless communication. A novel lightweight DL model is trained using multi-modal sensory data. The inference of the DL model is performed on a low-power microcontroller in the central node. The DL model achieves a high detection sensitivity of 0.81 and a specificity of 0.88. The developed wearable sensors are ready for clinical experiments and hold great promise in improving the quality of life of patients with PD. The proposed design methodologies can be used in wearable medical devices for the monitoring and treatment of a wide range of neurodegenerative diseases.
2023-05-26	A Multi-Resolution Physics-Informed Recurrent Neural Network: Formulation and Application to Musculoskeletal Systems	Karan Taneja et.al.	2305.16593v1	null	This work presents a multi-resolution physics-informed recurrent neural network (MR PI-RNN), for simultaneous prediction of musculoskeletal (MSK) motion and parameter identification of the MSK systems. The MSK application was selected as the model problem due to its challenging nature in mapping the high-frequency surface electromyography (sEMG) signals to the low-frequency body joint motion controlled by the MSK and muscle contraction dynamics. The proposed method utilizes the fast wavelet transform to decompose the mixed frequency input sEMG and output joint motion signals into nested multi-resolution signals. The prediction model is subsequently trained on coarser-scale input-output signals using a gated recurrent unit (GRU), and then the trained parameters are transferred to the next level of training with finer-scale signals. These training processes are repeated recursively under a transfer-learning fashion until the full-scale training (i.e., with unfiltered signals) is achieved, while satisfying the underlying dynamic equilibrium. Numerical examples on recorded subject data demonstrate the effectiveness of the proposed framework in generating a physics-informed forward-dynamics surrogate, which yields higher accuracy in motion predictions of elbow flexion-extension of an MSK system compared to the case with single-scale training. The framework is also capable of identifying muscle parameters that are physiologically consistent with the subject's kinematics data.
2023-05-18	Adaptive Learning based Upper-Limb Rehabilitation Training System with Collaborative Robot	Jun Hong Lim et.al.	2305.10642v2	null	Rehabilitation training for patients with motor disabilities usually requires specialized devices in rehabilitation centers. Home-based multi-purpose training would significantly increase treatment accessibility and reduce medical costs. While it is unlikely to equip a set of rehabilitation robots at home, we investigate the feasibility to use the general-purpose collaborative robot for rehabilitation therapies. In this work, we developed a new system for multi-purpose upper-limb rehabilitation training using a generic robot arm with human motor feedback and preference. We integrated surface electromyography, force/torque sensors, RGB-D cameras, and robot controllers with the Robot Operating System to enable sensing, communication, and control of the system. Imitation learning methods were adopted to imitate expert-provided training trajectories which could adapt to subject capabilities to facilitate in-home training. Our rehabilitation system is able to perform gross motor function and fine motor skill training with a gripper-based end-effector. We simulated system control in Gazebo and training effects (muscle activation level) in OpenSim and evaluated its real performance with human subjects. For all the subjects enrolled, our system achieved better training outcomes compared to specialist-assisted rehabilitation under the same conditions. Our work demonstrates the potential of utilizing collaborative robots for in-home motor rehabilitation training.
2023-05-06	Electromyography Signal Classification Using Deep Learning	Mekia Shigute Gaso et.al.	2305.04006v1	null	We have implemented a deep learning model with L2 regularization and trained it on Electromyography (EMG) data. The data comprises EMG signals collected from a control group, myopathy patients and ALS patients. Our proposed deep neural network consists of eight layers: five fully connected layers, two batch normalization layers and one dropout layer. The data is divided into training and testing sections, and the training data is subsequently divided into sub-training and validation sections. Having implemented this model, an accuracy of 99 percent is achieved on the test data set. The model was able to distinguish the normal cases (control group) from the others at a precision of 100 percent and classify the myopathy and ALS cases with high accuracy of 97.4 and 98.2 percent, respectively. Thus, we believe that these highly improved classification accuracies will be beneficial for the clinical diagnosis of neuromuscular disorders.
2023-04-08	Overview of processing techniques for surface electromyography signals	Alejandra Manjarres-Triana et.al.	2304.04098v1	null	Surface electromyography (sEMG) is a technology to assess muscle activation, which is an important component in applications related to diagnosis, treatment, progression assessment, and rehabilitation of specific individuals' conditions. Recently, the potential of sEMG has been shown, since it can be used in a non-invasive manner; nevertheless, it requires careful signal analysis to support health professionals reliably. This paper briefly describes the basic concepts involved in sEMG, such as the physiology of the muscles, the data acquisition, the signal processing techniques, and classification methods that may be used to identify disorders or signs of abnormalities according to muscular patterns. Specifically, classification methods encompass digital signal processing techniques and machine learning with high potential in the field. We hope that this work serves as an introduction to researchers interested in this field.
2023-04-02	A Framework and Call to Action for the Future Development of EMG-Based Input in HCI	Ethan Eddy et.al.	2304.00582v1	null	Electromyography (EMG) has been explored as an HCI input modality following a long history of success for prosthesis control. While EMG has the potential to address a range of hands-free interaction needs, it has yet to be widely accepted outside of prosthetics due to a perceived lack of robustness and intuitiveness. To understand how EMG input systems can be better designed, we sampled the ACM digital library to identify limitations in the approaches taken. Leveraging these works in combination with our research group's extensive interdisciplinary experience in this field, four themes emerged: (1) interaction design, (2) model design, (3) system evaluation, and (4) reproducibility. Using these themes, we provide a step-by-step framework for designing EMG-based input systems to strengthen the foundation on which EMG-based interactions are built. Additionally, we provide a call-to-action for researchers to unlock the hidden potential of EMG as a widely applicable and highly usable input modality.
2023-03-13	Discriminative sEMG-based features to assess damping ability and interpret activation patterns in lower-limb muscles of ACLR athletes	Mehran Hatamzadeh et.al.	2303.06954v1	null	Objective: The main goal of the athletes who undergo anterior cruciate ligament reconstruction (ACLR) surgery is a successful return-to-sport. At this stage, identifying muscular deficits becomes important. Hence, in this study, three discriminative features based on surface electromyographic signals (sEMG) acquired in a dynamic protocol are introduced to assess the damping ability and interpret activation patterns in lower-limb muscles of ACLR athletes. Methods: The features include the median frequency of the power spectrum density (PSD), the relative percentage of the equivalent damping or equivalent stiffness derived from the median frequency, and the energy of the signals in the time-frequency plane of the pseudo-Wigner-Ville distribution (PWVD). To evaluate the features, 11 healthy and 11 ACLR athletes (6 months post-reconstruction surgery) were recruited to acquire the sEMG signals from the medial and the lateral parts of the hamstrings, quadriceps, and gastrocnemius muscles in pre- and post-fatigue single-leg landings. Results: A significant damping deficiency is observed in the hamstring muscles of ACLR athletes by evaluating the proposed features. This deficiency indicates that more attention should be paid to this muscle of ACLR athletes in pre-return-to-sport rehabilitations. Conclusion: The quality of electromyography-based pre-return-to-sport assessments on ACLR subjects depends on the sEMG acquisition protocol, as well as the type and nature of the extracted features. Hence, combinatorial application of both energy-based features (derived from the PWVD) and power-based features (derived from the PSD) could facilitate the assessment process by providing additional biomechanical information regarding the behavior of the muscles surrounding the knee.
2023-03-11	AI-Enhanced Intensive Care Unit: Revolutionizing Patient Care with Pervasive Sensing	Subhash Nerella et.al.	2303.06252v1	null	The intensive care unit (ICU) is a specialized hospital space where critically ill patients receive intensive care and monitoring. Comprehensive monitoring is imperative in assessing patients' conditions, in particular acuity, and ultimately the quality of care. However, the extent of patient monitoring in the ICU is limited due to time constraints and the workload on healthcare providers. Currently, visual assessments for acuity, including fine details such as facial expressions, posture, and mobility, are sporadically captured, or not captured at all. These manual observations are subjective to the individual, prone to documentation errors, and overburden care providers with the additional workload. Artificial Intelligence (AI) enabled systems have the potential to augment the patient visual monitoring and assessment due to their exceptional learning capabilities. Such systems require robust annotated data to train. To this end, we have developed a pervasive sensing and data processing system which collects data from multiple modalities (depth images, color RGB images, accelerometry, electromyography, sound pressure, and light levels) in the ICU for developing intelligent monitoring systems for continuous and granular acuity, delirium risk, pain, and mobility assessment. This paper presents the Intelligent Intensive Care Unit (I2CU) system architecture we developed for real-time patient monitoring and visual assessment.
2023-02-19	Estimation and Early Prediction of Grip Force Based on sEMG Signals and Deep Recurrent Neural Networks	Atusa Ghorbani et.al.	2302.09555v1	null	Hands are used for communicating with the surrounding environment and have a complex structure that enables them to perform various tasks with their multiple degrees of freedom. Hand amputation can prevent a person from performing their daily activities. In that event, finding a suitable, fast, and reliable alternative for the missing limb can affect the lives of people who suffer from such conditions. As the most important use of the hands is to grasp objects, the purpose of this study is to accurately predict gripping force from surface electromyography (sEMG) signals during a pinch-type grip. In that regard, gripping force and sEMG signals are derived from 10 healthy subjects. Results show that for this task, recurrent networks outperform nonrecurrent ones, such as a fully connected multilayer perceptron (MLP) network. Gated recurrent unit (GRU) and long short-term memory (LSTM) networks can predict the gripping force with R-squared values of 0.994 and 0.992, respectively, and a prediction rate of over 1300 predictions per second. The predominant advantage of using such frameworks is that the gripping force can be predicted straight from preprocessed sEMG signals without any form of feature extraction, not to mention the ability to predict future force values using larger prediction horizons adequately. The methods presented in this study can be used in the myoelectric control of prosthetic hands or robotic grippers.
2023-02-17	Sleep Model -- A Sequence Model for Predicting the Next Sleep Stage	Iksoo Choi et.al.	2302.12709v1	null	As sleep disorders are becoming more prevalent, there is an urgent need to classify sleep stages in a less disturbing way. In particular, sleep-stage classification using simple sensors, such as single-channel electroencephalography (EEG), electrooculography (EOG), electromyography (EMG), or electrocardiography (ECG) has gained substantial interest. In this study, we proposed a sleep model that predicts the next sleep stage and used it to improve sleep classification accuracy. The sleep models were built using sleep-sequence data and employed either statistical $n$-gram or deep neural network-based models. We developed beam-search decoding to combine the information from the sensor and the sleep models. Furthermore, we evaluated the performance of the $n$-gram and long short-term memory (LSTM) recurrent neural network (RNN)-based sleep models and demonstrated the improvement of sleep-stage classification using an EOG sensor. The developed sleep models significantly improved the accuracy of sleep-stage classification, particularly in the absence of an EEG sensor.
2023-02-15	Automated Movement Detection with Dirichlet Process Mixture Models and Electromyography	Navin Cooray et.al.	2302.07509v1	null	Numerous sleep disorders are characterised by movement during sleep; these include rapid-eye movement sleep behaviour disorder (RBD) and periodic limb movement disorder. The process of diagnosing movement-related sleep disorders requires laborious and time-consuming visual analysis of sleep recordings. This process involves sleep clinicians visually inspecting electromyogram (EMG) signals to identify abnormal movements. The distribution of characteristics that represent movement can be diverse and varied, ranging from brief moments of tensing to violent outbursts. This study proposes a framework for automated limb-movement detection by fusing data from two EMG sensors (from the left and right limb) through a Dirichlet process mixture model. Several features are extracted from 10-second mini-epochs, where each mini-epoch has been classified as 'leg-movement' or 'no leg-movement' based on annotations of movement from sleep clinicians. The distributions of the features from each category can be estimated accurately using Gaussian mixture models with the Dirichlet process as a prior. The available dataset includes 36 participants that have all been diagnosed with RBD. The performance of this framework was evaluated by a 10-fold cross validation scheme (participant independent). The study was compared to a random forest model and outperformed it with a mean accuracy, sensitivity, and specificity of 94\%, 48\%, and 95\%, respectively. These results demonstrate the ability of this framework to automate the detection of limb movement for the potential application of assisting clinical diagnosis and decision-making.
2023-02-08	Simplified markerless stride detection pipeline (sMaSDP) for surface EMG segmentation	Rafael Castro Aguiar et.al.	2302.04243v1	null	People with mobility impairments are often recommended for gait assessment studies to diagnose their condition and to select appropriate physiotherapy to improve their mobility. These studies are often conducted in clinical or lab settings, where subjects are assessed in a foreign environment, which may influence their motivation, coordination and overall mobility. Alternatively, if the subject's gait could be assessed in their daily-lives, in unconstrained settings, a more naturalistic gait assessment could be performed. Kinematic analysis of a gait pattern on its own may not be sufficient to characterise a subject's mobility. To better diagnose gait deficiencies, analysis of the patient's muscle activity should be conducted as well. To do so, gait studies should collect, synchronously, Electromyography (EMG) and kinematic data. This method introduces a simplified markerless gait event detection pipeline for the segmentation of EMG signals, via synchronously recorded Inertial Measurement Unit (IMU) data. In an unconstrained walking experiment, healthy subjects walk through a designed course with their kinematic and EMG data recorded. This course comprises 5 different walking modalities (level walking, ramp up/down, staircase up/down), mimicking everyday walking. Through timepoint matching, segmentation and filtering, we generate an algorithm that detects heel-strike (HS) events using a single IMU, and isolates EMG activity of gait cycles, in the different walking modalities. This gait event detection algorithm can be adapted to different datasets, and was tested in both healthy and Parkinson's Disease (PD) gait. Results demonstrate the extracted muscle activity levels in a healthy subject's level ground walking, and the extracted HS events of a PD patient. Adjustments to algorithm parameters are possible (e.g., expected velocity, cadence) and can further increase the detection accuracy.
2023-02-01	Upper-limb Geometric MyoPassivity Map for Physical Human-Robot Interaction	Xingyuan Zhou et.al.	2302.00495v1	null	The intrinsic biomechanical characteristic of the human upper limb plays a central role in absorbing the interactive energy during physical human-robot interaction (pHRI). We have recently shown that based on the concept of "Excess of Passivity (EoP)" from nonlinear control theory, it is possible to decode such energetic behavior for both upper and lower limbs. The extracted knowledge can be used in the design of controllers for optimizing the transparency and fidelity of force fields in human-robot interaction and in haptic systems. In this paper, for the first time, we investigate the frequency behavior of the passivity map for the upper limb when the muscle co-activation was controlled in real-time through visual electromyographic feedback. Five healthy subjects (age: 27 +/- 5) were included in this study. The energetic behavior was evaluated at two stimulation frequencies at eight interaction directions over two controlled muscle co-activation levels. Electromyography (EMG) was captured using the Delsys Wireless Trigno system. Results showed a correlation between EMG and EoP, which was further altered by increasing the frequency. The proposed energetic behavior is named the Geometric MyoPassivity (GMP) map. The findings indicate that the GMP map has the potential to be used in real-time to quantify the absorbable energy, thus passivity margin of stability for upper limb interaction during pHRI.
2023-01-31	A Prototype System for High Frame Rate Ultrasound Imaging based Prosthetic Arm Control	Ayush Singh et.al.	2301.13809v3	null	The creation of unique control methods for a hand prosthesis is still a problem that has to be addressed. The best choice of a human-machine interface (HMI) that should be used to enable natural control is still a challenge. Surface electromyography (sEMG), the most popular option, has a variety of difficult-to-fix issues (electrode displacement, sweat, fatigue). The ultrasound imaging-based methodology offers a means of recognising complex muscle activity and configuration with a greater SNR and less hardware requirements as compared to sEMG. In this study, a prototype system for high frame rate ultrasound imaging for prosthetic arm control is proposed. Using the proposed framework, a virtual robotic hand simulation is developed that can mimic a human hand as illustrated in the link [10]. The proposed classification model simulating four hand gestures has a classification accuracy of more than 90%.
2023-01-23	Long-term stable Electromyography classification using Canonical Correlation Analysis	Elisa Donati et.al.	2301.09729v1	null	Discrimination of hand gestures based on the decoding of surface electromyography (sEMG) signals is a well-established approach for controlling prosthetic devices and for Human-Machine Interfaces (HMI). However, despite the promising results achieved by this approach in well-controlled experimental conditions, its deployment in long-term real-world application scenarios is still hindered by several challenges. One of the most critical challenges is maintaining high EMG data classification performance across multiple days without retraining the decoding system. The drop in performance is mostly due to the high EMG variability caused by electrode shift, muscle artifacts, fatigue, user adaptation, or skin-electrode interfacing issues. Here we propose a novel statistical method based on canonical correlation analysis (CCA) that stabilizes EMG classification performance across multiple days for long-term control of prosthetic devices. We show how CCA can dramatically decrease the performance drop of standard classifiers observed across days, by maximizing the correlation among multiple-day acquisition data sets. Our results show how the performance of a classifier trained on EMG data acquired only on the first day of the experiment maintains 90% relative accuracy across multiple days, compensating for the EMG data variability that occurs over long-term periods, using the CCA transformation on data obtained from a small number of gestures. This approach eliminates the need for large data sets and multiple or periodic training sessions, which currently hamper the usability of conventional pattern recognition based approaches.
2023-01-23	High-density magnetomyography is superior to high-density surface electromyography for motor unit decomposition: a simulation study	Thomas Klotz et.al.	2301.09494v2	null	Objective: Studying motor units (MUs) is essential for understanding motor control, the detection of neuromuscular disorders and the control of human-machine interfaces. Individual motor unit firings are currently identified in vivo by decomposing electromyographic (EMG) signals. Due to our body's properties and anatomy, individual motor units can only be separated to a limited extent with surface EMG. Unlike electrical signals, magnetic fields do not interact with human tissues. This physical property and the emerging technology of quantum sensors make magnetomyography (MMG) a highly promising methodology. However, the full potential of MMG to study neuromuscular physiology has not yet been explored. Approach: In this work, we perform in silico trials that combine a biophysical model of EMG and MMG with state-of-the-art algorithms for the decomposition of motor units. This allows the prediction of an upper-bound for the motor unit decomposition accuracy. Main results: It is shown that non-invasive high-density MMG data is superior over comparable high-density surface EMG data for the robust identification of the discharge patterns of individual motor units. Decomposing MMG instead of EMG increased the number of identifiable motor units by 76%. Notably, MMG exhibits a less pronounced bias to detect superficial motor units. Significance: The presented simulations provide insights into methods to study the neuromuscular system non-invasively and in vivo that would not be easily feasible by other means. Hence, this study provides guidance for the development of novel biomedical technologies.
2023-01-13	Analysis of LGM Model for sEMG Signals related to Weight Training	Durgesh Kusuru et.al.	2301.05417v1	null	Statistical models of Surface electromyography (sEMG) signals have several applications such as better understanding of sEMG signal generation, improved pattern recognition based control of wearable exoskeletons and prostheses, improving training strategies in sports activities, and EMG simulation studies. Most of the existing studies analysed the statistical model of sEMG signals acquired under isometric contractions. However, there is no study that addresses the statistical model under isotonic contractions. In this work, a new dataset, electromyography analysis of human activities - database 2 (EMAHA-DB2) is developed. It consists of two experiments based on both isometric and isotonic activities during weight training. Previously, a novel Laplacian-Gaussian Mixture (LGM) model was demonstrated for a few benchmark datasets consisting of basic movements and gestures. In this work, the model suitability analysis is extended to the EMAHA-DB2 dataset. Further, the LGM model is compared with three existing statistical models including the recent scale-mixture model. According to qualitative and quantitative analyses, the LGM model has a better fit to the empirical pdf of the recorded sEMG signals compared with the scale mixture model and the other standard models. The variance and mixing weight of the Laplacian component of the signal are analyzed with respect to the type of muscle, type of muscle contraction, dumb-bell weight and training experience of the subjects. The sEMG variance (the Laplacian component) increases with respect to the weights and is greater for isotonic activity, especially for the biceps. For isotonic activity, the signal variance increases with training experience. Importantly, the ratio of the variances from the two muscle sites is observed to be nearly independent of the lifted weight and consistently increases with the training experience.
2023-01-09	EMAHA-DB1: A New Upper Limb sEMG Dataset for Classification of Activities of Daily Living	Naveen Kumar Karnam et.al.	2301.03325v1	null	In this paper, we present electromyography analysis of human activity - database 1 (EMAHA-DB1), a novel dataset of multi-channel surface electromyography (sEMG) signals to evaluate the activities of daily living (ADL). The dataset is acquired from 25 able-bodied subjects while performing 22 activities categorised according to functional arm activity behavioral system (FAABOS) (3 - full hand gestures, 6 - open/close office draw, 8 - grasping and holding of small office objects, 2 - flexion and extension of finger movements, 2 - writing and 1 - rest). The sEMG data is measured by a set of five Noraxon Ultium wireless sEMG sensors with Ag/AgCl electrodes placed on a human hand. The dataset is analyzed for hand activity recognition classification performance. The classification is performed using four state-of-the-art machine learning classifiers, including Random Forest (RF), Fine K-Nearest Neighbour (KNN), Ensemble KNN (sKNN) and Support Vector Machine (SVM) with seven combinations of time domain and frequency domain feature sets. The state-of-the-art classification accuracy on five FAABOS categories is 83.21% by using the SVM classifier with the third order polynomial kernel using energy feature and auto regressive feature set ensemble. The classification accuracy on 22 class hand activities is 75.39% by the same SVM classifier with the log moments in frequency domain (LMF) feature, modified LMF, time domain statistical (TDS) feature, spectral band powers (SBP), channel cross correlation and local binary patterns (LBP) set ensemble. The analysis depicts the technical challenges addressed by the dataset. The developed dataset can be used as a benchmark for various classification methods as well as for sEMG signal analysis corresponding to ADL and for the development of prosthetics and other wearable robotics.
2023-01-04	A Novel Power-optimized CMOS sEMG Device with Ultra Low-noise integrated with ConvNet (VGG16) for Biomedical Applications	Ahmed Ayman - Mohamed Sabry et.al.	2301.09570v2	null	The needle bio-potential sensors for measuring muscle and brain activity need invasive surgical targeted muscle reinnervation (TMR) and a demanding process to maintain, but surface bio-potential sensors lack clear bio-signal reading (Signal-Interference). In this research, a novel power-optimized complementary metal-oxide-semiconductor (CMOS) Surface Electromyography (sEMG) is developed to improve the efficiency and quality of captured bio-signal for biomedical application: The early diagnosis of neurological disorders (Dystonia) and a novel compatible mind-controlled prosthetic leg with human daily activities. A novel sEMG composed of CMOS Op-Amp based PIC16F877A 8-bit CMOS Flash-based Microcontroller is utilized to minimize power consumption and data processing time. sEMG Circuit is implemented with developed analog filter along with infinite impulse response (IIR) digital filter via Fast Fourier Transform (FFT), Z-transform, and difference equations. The analysis shows a significant improvement of 169.2% noise-reduction in recorded EMG signal using developed digital filter compared to analog one according to numerical root mean square error (RMSE). Moreover, digital IIR was tested in two stages: algorithmic and real-world. As a result, IIR's algorithmic (MATLAB) and real-world RMSEs were 0.03616 and 0.05224, respectively. A notable advancement of 20.8% in data processing duration in EMG signal analysis. Optimizing VGG, AlexNet, and ResNet ConvNet as trained and tested on 15 public EEG (62-electrode) and 18 subjects' observed EMG data. The results indicate that VGG16-1D is 98.43% higher. During real testing, the accuracy was 95.8 +/- 4.6% for 16 subjects (6 Amputees-10 Dystonia). This study demonstrates the potential for sEMG, paving the way for biomedical applications.
2023-01-03	A Laplacian Gaussian Mixture Model for Surface EMG Signals of Human Arm Activity	Durgesh Kusuru et.al.	2301.01080v1	null	The probability density function (pdf) of surface Electromyography (sEMG) signals follows any one of the standalone standard distributions: the Gaussian or the Laplacian. Further, the choice of the model is dependent on muscle contraction force (MCF) levels. Hence, a unified model is proposed which explains the statistical nature of sEMG signals at different MCF levels. In this paper, we propose the Laplacian Gaussian Mixture (LGM) model for the signals recorded from upper limbs. This model is able to explain the sEMG signals from different activities corresponding to different MCF levels. The model is tested on different bench-mark sEMG data sets and is validated using both the qualitative and quantitative perspectives. It is determined that for low and medium contraction force levels the proposed mixture model is more accurate than both the Laplacian and the Gaussian models. Whereas for high contraction force level, the LGM model behaves as a Gaussian model. The mixing weights of the LGM model are analyzed and it is observed that for low and medium MCF levels both the mixing weights of LGM model do contribute. Whereas for high contraction force levels the Laplacian weight becomes weaker. The proposed LGM model for sEMG signals from upper limbs explains sEMG signals at different MCF levels. The proposed model helps in improved understanding of statistical nature of sEMG signals and better feature representation in the classification problems.
2022-12-28	Joint Action is a Framework for Understanding Partnerships Between Humans and Upper Limb Prostheses	Michael R. Dawson et.al.	2212.14124v1	null	Recent advances in upper limb prostheses have led to significant improvements in the number of movements provided by the robotic limb. However, the method for controlling multiple degrees of freedom via user-generated signals remains challenging. To address this issue, various machine learning controllers have been developed to better predict movement intent. As these controllers become more intelligent and take on more autonomy in the system, the traditional approach of representing the human-machine interface as a human controlling a tool becomes limiting. One possible approach to improve the understanding of these interfaces is to model them as collaborative, multi-agent systems through the lens of joint action. The field of joint action has been commonly applied to two human partners who are trying to work jointly together to achieve a task, such as singing or moving a table together, by effecting coordinated change in their shared environment. In this work, we compare different prosthesis controllers (proportional electromyography with sequential switching, pattern recognition, and adaptive switching) in terms of how they present the hallmarks of joint action. The results of the comparison lead to a new perspective for understanding how existing myoelectric systems relate to each other, along with recommendations for how to improve these systems by increasing the collaborative communication between each partner.
2022-12-24	Agent-based Modeling and Simulation of Human Muscle For Development of Software to Analyze the Human Gait	Sina Saadati et.al.	2212.12760v1	null	In this research, we present an agent-based model of human muscle which can be used in the analysis of human movement. As the model is designed based on the physiological structure of the muscle, the simulation calculations would be natural, and it is also possible to analyze human movement using reverse engineering methods. The model is also a suitable choice for use in modern prostheses, because it requires less computation than other machine learning models such as artificial neural network algorithms, which makes our algorithm battery-friendly. We will also devise a method that can calculate the intensity of human muscle during the gait cycle using a reverse engineering solution. The algorithm, called Boots, is different from some optimization methods, so it would be able to compute the activities of both agonist and antagonist muscles in a joint. As a consequence, with an agent-based model of human muscle and the Boots algorithm, we would be able to develop software that can calculate the nervous stimulation of human lower body muscles based on the angular displacement during the gait cycle without using painful methods like electromyography. By developing the application as open-source software, we hope to help researchers and physicians who work in the medical and biomechanical fields.
2022-12-20	Pain level and pain-related behaviour classification using GRU-based sparsely-connected RNNs	Mohammad Mahdi Dehshibi et.al.	2212.14806v1	null	There is a growing body of studies on applying deep learning to biometrics analysis. Certain circumstances, however, could impair the objective measures and accuracy of the proposed biometric data analysis methods. For instance, people with chronic pain (CP) unconsciously adapt specific body movements to protect themselves from injury or additional pain. Because there is no dedicated benchmark database to analyse this correlation, we considered one of the specific circumstances that potentially influence a person's biometrics during daily activities in this study and classified pain level and pain-related behaviour in the EmoPain database. To achieve this, we proposed a sparsely-connected recurrent neural networks (s-RNNs) ensemble with the gated recurrent unit (GRU) that incorporates multiple autoencoders using a shared training framework. This architecture is fed by multidimensional data collected from inertial measurement unit (IMU) and surface electromyography (sEMG) sensors. Furthermore, to compensate for variations in the temporal dimension that may not be perfectly represented in the latent space of s-RNNs, we fused hand-crafted features derived from information-theoretic approaches with represented features in the shared hidden state. We conducted several experiments which indicate that the proposed method outperforms the state-of-the-art approaches in classifying both pain level and pain-related behaviour.

actigraphy

Publish Date	Title	Authors	PDF	Code	Abstract
2023-07-07	A Bayesian Circadian Hidden Markov Model to Infer Rest-Activity Rhythms Using 24-hour Actigraphy Data	Jiachen Lu et.al.	2307.03832v1	null	24-hour actigraphy data collected by wearable devices offer valuable insights into physical activity types, intensity levels, and rest-activity rhythms (RAR). RARs, or patterns of rest and activity exhibited over a 24-hour period, are regulated by the body's circadian system, synchronizing physiological processes with external cues like the light-dark cycle. Disruptions to these rhythms, such as irregular sleep patterns, daytime drowsiness or shift work, have been linked to adverse health outcomes including metabolic disorders, cardiovascular disease, depression, and even cancer, making RARs a critical area of health research. In this study, we propose a Bayesian Circadian Hidden Markov Model (BCHMM) that explicitly incorporates 24-hour circadian oscillators mirroring human biological rhythms. The model assumes that observed activity counts are conditional on hidden activity states through Gaussian emission densities, with transition probabilities modeled by state-specific sinusoidal functions. Our comprehensive simulation study reveals that BCHMM outperforms frequentist approaches in identifying the underlying hidden states, particularly when the activity states are difficult to separate. BCHMM also excels with smaller Kullback-Leibler divergence on estimated densities. With the Bayesian framework, we address the label-switching problem inherent to hidden Markov models via a positive constraint on mean parameters. From the proposed BCHMM, we can infer the 24-hour rest-activity profile via time-varying state probabilities, to characterize the person-level RAR. We demonstrate the utility of the proposed BCHMM using 2011-2014 National Health and Nutrition Examination Survey (NHANES) data, where worsened RAR, indicated by lower probabilities in low-activity state during the day and higher probabilities in high-activity state at night, is associated with an increased risk of diabetes.
2023-03-14	Transfer Learning for Real-time Deployment of a Screening Tool for Depression Detection Using Actigraphy	Rajanikant Ghate et.al.	2303.07847v1	null	Automated depression screening and diagnosis is a highly relevant problem today. There are a number of limitations of the traditional depression detection methods, namely, high dependence on clinicians and biased self-reporting. In recent years, research has suggested strong potential in machine learning (ML) based methods that make use of the user's passive data collected via wearable devices. However, ML is data hungry. Especially in the healthcare domain primary data collection is challenging. In this work, we present an approach based on transfer learning, from a model trained on a secondary dataset, for the real time deployment of the depression screening tool based on the actigraphy data of users. This approach enables machine learning modelling even with limited primary data samples. A modified version of leave one out cross validation approach performed on the primary set resulted in mean accuracy of 0.96, where in each iteration one subject's data from the primary set was set aside for testing.
2023-01-04	KIDS: kinematics-based (in)activity detection and segmentation in a sleep case study	Omar Elnaggar et.al.	2301.03469v1	null	Sleep behaviour and in-bed movements contain rich information on the neurophysiological health of people, and have a direct link to the general well-being and quality of life. Standard clinical practices rely on polysomnography for sleep assessment; however, it is intrusive, performed in unfamiliar environments and requires trained personnel. Progress has been made on less invasive sensor technologies, such as actigraphy, but clinical validation raises concerns over their reliability and precision. Additionally, the field lacks a widely acceptable algorithm, with proposed approaches ranging from raw signal or feature thresholding to data-hungry classification models, many of which are unfamiliar to medical staff. This paper proposes an online Bayesian probabilistic framework for objective (in)activity detection and segmentation based on clinically meaningful joint kinematics, measured by a custom-made wearable sensor. Intuitive three-dimensional visualisations of kinematic timeseries were accomplished through dimension reduction based preprocessing, offering out-of-the-box framework explainability potentially useful for clinical monitoring and diagnosis. The proposed framework attained up to 99.2\% $F_1$-score and 0.96 Pearson's correlation coefficient in, respectively, the posture change detection and inactivity segmentation tasks. The work paves the way for a reliable home-based analysis of movements during sleep which would serve patient-centred longitudinal care plans.
2022-12-31	Definition and clinical validation of Pain Patient States from high-dimensional mobile data: application to a chronic pain cohort	Jenna M. Reinen et.al.	2301.00299v1	null	The technical capacity to monitor patients with a mobile device has drastically expanded, but data produced from this approach are often difficult to interpret. We present a solution to produce a meaningful representation of patient status from large, complex data streams, leveraging a data-driven approach and using clinical knowledge to validate results. Data were collected from a clinical trial enrolling chronic pain patients, and included questionnaires, voice recordings, actigraphy, and standard health assessments. The data were reduced using a clustering analysis. In an initial exploratory analysis with only questionnaire data, we found up to 3 stable cluster solutions that grouped symptoms on a positive to negative spectrum. Objective features (actigraphy, speech) expanded the cluster solution granularity. Using a 5 state solution with questionnaire and actigraphy data, we found significant correlations between cluster properties and assessments of disability and quality-of-life. The correlation coefficient values showed an ordinal distinction, confirming the cluster ranking on a negative to positive spectrum. This suggests we captured novel, distinct Pain Patient States with this approach, even when multiple clusters were equated on pain magnitude. Relative to using complex time courses of many variables, Pain Patient States holds promise as an interpretable, useful, and actionable metric for a clinician or caregiver to simplify and provide timely delivery of care.
2022-12-21	A hidden Markov modeling approach combining objective measure of activity and subjective measure of self-reported sleep to estimate the sleep-wake cycle	Semhar B. Ogbagaber et.al.	2212.11224v1	null	Characterizing the sleep-wake cycle in adolescents is an important prerequisite to better understand the association of abnormal sleep patterns with subsequent clinical and behavioral outcomes. The aim of this research was to develop hidden Markov models (HMM) that incorporate both objective (actigraphy) and subjective (sleep log) measures to estimate the sleep-wake cycle using data from the NEXT longitudinal study, a large population-based cohort study. The model was estimated with a negative binomial distribution for the activity counts (1-minute epochs) to account for overdispersion relative to a Poisson process. Furthermore, self-reported measures were dichotomized (for each one-minute interval) and subject to misclassification. We assumed that the unobserved sleep-wake cycle follows a two-state Markov chain with transitional probabilities varying according to a circadian rhythm. Maximum-likelihood estimation using a backward-forward algorithm was applied to fit the longitudinal data on a subject by subject basis. The algorithm was used to reconstruct the sleep-wake cycle from sequences of self-reported sleep and activity data. Furthermore, we conduct simulations to examine the properties of this approach under different observational patterns including both complete and partially observed measurements on each individual.
2022-08-30	Mediation analysis with densities as mediators with an application to iCOMPARE trial	Jingru Zhang et.al.	2208.13939v1	null	Physical activity has long been shown to be associated with biological and physiological performance and risk of diseases. It is of great interest to assess whether the effect of an exposure or intervention on an outcome is mediated through physical activity measured by modern wearable devices such as actigraphy. However, existing methods for mediation analysis focus almost exclusively on a mediation variable that lies in Euclidean space, and so cannot be applied directly to the actigraphy data of physical activity. Such data is best summarized in the form of a histogram or density. In this paper, we extend the structural equation models (SEMs) to the settings where a density is treated as the mediator to study the indirect mediation effect of physical activity on an outcome. We provide sufficient conditions for identifying the average causal effects of density mediator and present methods for estimating the direct and mediating effects of density on an outcome. We apply our method to the data set from the iCOMPARE trial that compares flexible duty-hour policies and standard duty-hour policies on interns' sleep related outcomes to explore the mediation effect of physical activity on the causal path between flexible duty-hour policies and sleep related outcomes.
2021-11-29	Validating CircaCP: a Generic Sleep-Wake Cycle Detection Algorithm	Shanshan Chen et.al.	2111.14960v1	link	Sleep-wake cycle detection is a key step when extrapolating sleep patterns from actigraphy data. Numerous supervised detection algorithms have been developed with parameters estimated from and optimized for a particular dataset, yet their generalizability from sensor to sensor or study to study is unknown. In this paper, we propose and validate an unsupervised algorithm -- CircaCP -- to detect sleep-wake cycles from minute-by-minute actigraphy data. It first uses a robust cosinor model to estimate circadian rhythm, then searches for a single change point (CP) within each cycle. We used CircaCP to estimate sleep/wake onset times (S/WOTs) from 2125 individuals' data in the MESA Sleep study and compared the estimated S/WOTs against self-reported S/WOT event markers. Lastly, we quantified the biases between estimated and self-reported S/WOTs, as well as variation in S/WOTs contributed by the two methods, using linear mixed-effects models and variance component analysis. On average, SOTs estimated by CircaCP were five minutes behind those reported by event markers, and WOTs estimated by CircaCP were less than one minute behind those reported by markers. These differences accounted for less than 0.2% variability in SOTs and in WOTs, taking into account other sources of between-subject variations. By focusing on the commonality in human circadian rhythms captured by actigraphy, our algorithm transferred seamlessly from hip-worn ActiGraph data collected from children in our previous study to wrist-worn Actiwatch data collected from adults. The large between- and within-subject variability highlights the need for estimating individual-level S/WOTs when conducting actigraphy research. The generalizability of our algorithm also suggests that it could be widely applied to actigraphy data collected by other wearable sensors.
2021-07-08	Circadian Rhythms are Not Captured Equal: Exploring Circadian Metrics Extracted by Different Computational Methods from Smartphone Accelerometer and GPS Sensors in Daily Life Tracking	Congyu Wu et.al.	2107.04135v1	null	Circadian rhythm is the natural biological cycle manifested in human daily routines. A regular and stable rhythm is found to be correlated with good physical and mental health. With the wide adoption of mobile and wearable technology, many types of sensor data, such as GPS and actigraphy, provide evidence for researchers to objectively quantify the circadian rhythm of a user and further use these quantified metrics of circadian rhythm to infer the user's health status. Researchers in computer science and psychology have investigated circadian rhythm using various mobile and wearable sensors in ecologically valid human sensing studies, but questions remain whether and how different data types produce different circadian rhythm results when simultaneously used to monitor a user. We hypothesize that different sensor data reveal different aspects of the user's daily behavior, thus producing different circadian rhythm patterns. In this paper we focus on two data types: GPS and accelerometer data from smartphones. We used smartphone data from 225 college student participants and applied four circadian rhythm characterization methods. We found significant and interesting discrepancies in the rhythmic patterns discovered among sensors, which suggests circadian rhythms discovered from different personal tracking sensors have different levels of sensitivity to device usage and aspects of daily behavior.
2021-07-01	Long-Short Ensemble Network for Bipolar Manic-Euthymic State Recognition Based on Wrist-worn Sensors	Ulysse Côté-Allard et.al.	2107.00710v3	link	Manic episodes of bipolar disorder can lead to uncritical behaviour and delusional psychosis, often with destructive consequences for those affected and their surroundings. Early detection and intervention of a manic episode are crucial to prevent escalation, hospital admission and premature death. However, people with bipolar disorder may not recognize that they are experiencing a manic episode and symptoms such as euphoria and increased productivity can also deter affected individuals from seeking help. This work proposes to perform user-independent, automatic mood-state detection based on actigraphy and electrodermal activity acquired from a wrist-worn device during mania and after recovery (euthymia). This paper proposes a new deep learning-based ensemble method leveraging long (20h) and short (5 minutes) time-intervals to discriminate between the mood-states. When tested on 47 bipolar patients, the proposed classification scheme achieves an average accuracy of 91.59% in euthymic/manic mood-state recognition.
2021-05-05	Activity-Aware Deep Cognitive Fatigue Assessment using Wearables	Mohammad Arif Ul Alam et.al.	2105.02824v1	null	Cognitive fatigue has been a common problem among workers and has become an increasing global problem since the emergence of COVID-19 as a global pandemic. While existing multi-modal wearable sensor-aided automatic cognitive fatigue monitoring tools have focused on physical and physiological sensor (ECG, PPG, actigraphy) analytics for specific groups of people (say gamers, athletes, construction workers), activity-awareness is of utmost importance because activity elicits different physiological responses in different persons. In this paper, we propose a novel framework, Activity-Aware Recurrent Neural Network (AcRoNN), that can generalize individual activity recognition and improve cognitive fatigue estimation significantly. We evaluate and compare our proposed method with state-of-the-art methods using one real-time collected dataset from 5 individuals and another publicly available dataset from 27 individuals, achieving a maximum improvement of 19%.
2021-04-28	Optimizing Rescoring Rules with Interpretable Representations of Long-Term Information	Aaron Fisher et.al.	2104.14291v1	null	Analyzing temporal data (e.g., wearable device data) requires a decision about how to combine information from the recent and distant past. In the context of classifying sleep status from actigraphy, Webster's rescoring rules offer one popular solution based on the long-term patterns in the output of a moving-window model. Unfortunately, the question of how to optimize rescoring rules for any given setting has remained unsolved. To address this problem and expand the possible use cases of rescoring rules, we propose rephrasing these rules in terms of epoch-specific features. Our features take two general forms: (1) the time lag between now and the most recent [or closest upcoming] bout of time spent in a given state, and (2) the length of the most recent [or closest upcoming] bout of time spent in a given state. Given any initial moving window model, these features can be defined recursively, allowing for straightforward optimization of rescoring rules. Joint optimization of the moving window model and the subsequent rescoring rules can also be implemented using gradient-based optimization software, such as Tensorflow. Beyond binary classification problems (e.g., sleep-wake), the same approach can be applied to summarize long-term patterns for multi-state classification problems (e.g., sitting, walking, or stair climbing). We find that optimized rescoring rules improve the performance of sleep-wake classifiers, achieving accuracy comparable to that of certain neural network architectures.
2021-01-05	Bayesian Hierarchical Modeling and Analysis for Actigraph Data from Wearable Devices	Pierfrancesco Alaimo Di Loro et.al.	2101.01624v4	link	The majority of Americans fail to achieve recommended levels of physical activity, which leads to numerous preventable health problems such as diabetes, hypertension, and heart diseases. This has generated substantial interest in monitoring human activity to gear interventions toward environmental features that may relate to higher physical activity. Wearable devices, such as wrist-worn sensors that monitor gross motor activity (actigraph units) continuously record the activity levels of a subject, producing massive amounts of high-resolution measurements. Analyzing actigraph data needs to account for spatial and temporal information on trajectories or paths traversed by subjects wearing such devices. Inferential objectives include estimating a subject's physical activity levels along a given trajectory; identifying trajectories that are more likely to produce higher levels of physical activity for a given subject; and predicting expected levels of physical activity in any proposed new trajectory for a given set of health attributes. Here, we devise a Bayesian hierarchical modeling framework for spatial-temporal actigraphy data to deliver fully model-based inference on trajectories while accounting for subject-level health attributes and spatial-temporal dependencies. We undertake a comprehensive analysis of an original dataset from the Physical Activity through Sustainable Transport Approaches in Los Angeles (PASTA-LA) study to ascertain spatial zones and trajectories exhibiting significantly higher levels of physical activity while accounting for various sources of heterogeneity.
2020-11-14	Using Convolutional Variational Autoencoders to Predict Post-Trauma Health Outcomes from Actigraphy Data	Ayse S. Cakmak et.al.	2011.07406v2	null	Depression and post-traumatic stress disorder (PTSD) are psychiatric conditions commonly associated with experiencing a traumatic event. Estimating mental health status through non-invasive techniques such as activity-based algorithms can help to identify successful early interventions. In this work, we used locomotor activity captured from 1113 individuals who wore a research grade smartwatch post-trauma. A convolutional variational autoencoder (VAE) architecture was used for unsupervised feature extraction from four weeks of actigraphy data. By using VAE latent variables and the participant's pre-trauma physical health status as features, a logistic regression classifier achieved an area under the receiver operating characteristic curve (AUC) of 0.64 to estimate mental health outcomes. The results indicate that the VAE model is a promising approach for actigraphy data analysis for mental health outcomes in long-term studies.
2020-08-06	Fatigue Assessment using ECG and Actigraphy Sensors	Yang Bai et.al.	2008.02871v2	link	Fatigue is one of the key factors in the loss of work efficiency and health-related quality of life, and most fatigue assessment methods were based on self-reporting, which may suffer from many factors such as recall bias. To address this issue, we developed an automated system using wearable sensing and machine learning techniques for objective fatigue assessment. ECG/Actigraphy data were collected from subjects in free-living environments. Preprocessing and feature engineering methods were applied before an interpretable solution and a deep learning solution were introduced. Specifically, for the interpretable solution, we proposed a feature selection approach that selects less correlated and highly informative features for a better understanding of the system's decision-making process. For the deep learning solution, we used a state-of-the-art self-attention model, based on which we further proposed a consistency self-attention (CSA) mechanism for fatigue assessment. Extensive experiments were conducted, and very promising results were achieved.
2019-06-03	Deep learning from wristband sensor data: towards wearable, non-invasive seizure forecasting	Christian Meisel et.al.	1906.00511v2	null	Seizure forecasting may provide patients with timely warnings to adapt their daily activities and help clinicians deliver more objective, personalized treatments. While recent work has convincingly demonstrated that seizure risk assessment is possible, these early approaches relied largely on complex, often invasive setups including intracranial electrocorticography, implanted devices and multi-channel EEG, which limits translation of these methods to broad clinical application. To facilitate broader adaptation of seizure forecasting in clinical practice, non-invasive, easily applicable techniques that reliably assess seizure risk, in combination with clinical information, are crucial. Wristbands that continuously record physiological parameters, including electrodermal activity, body temperature, blood volume pressure and actigraphy, may afford monitoring of autonomous nervous system function and movement relevant for such a task, hence minimizing potential complications associated with invasive monitoring, and avoiding stigma associated with bulky external monitoring devices on the head. Here, we use deep learning to analyze long-term, multi-modal wristband sensor data from 50 patients with epilepsy (total duration $>$1400 hours) to assess its capability to distinguish preictal from interictal states. Prediction performance is assessed using area under the receiver operating characteristic (AUC) and improvement over chance (IoC) based on F1 scores. Using one- and two-dimensional convolutional neural networks, we identified better-than-chance predictability in out-of-sample test data in 60\% of the patients in leave-one-out and 43\% of patients in pseudo-prospective approaches. These results provide a step towards developing easier to apply, non-invasive methods for seizure risk assessments in patients with epilepsy.
2019-03-28	A Generic Algorithm for Sleep-Wake Cycle Detection using Unlabeled Actigraphy Data	Shanshan Chen et.al.	1904.05313v1	null	One key component when analyzing actigraphy data for sleep studies is sleep-wake cycle detection. Most detection algorithms rely on accurate sleep diary labels to generate supervised classifiers, with parameters optimized for a particular dataset. However, once the actigraphy trackers are deployed in the field, labels for training models and validating detection accuracy are often not available. In this paper, we propose a generic, training-free algorithm to detect sleep-wake cycles from minute-by-minute actigraphy. Leveraging a robust nonlinear parametric model, our proposed method refines the detection region by searching for a single change point within bounded regions defined by the parametric model. Challenged by the absence of ground truth labels, we also propose an evaluation metric dedicated to this problem. Tested on week-long actigraphy from 112 children, the results show that the proposed algorithm improves on the baseline model consistently and significantly (p<3e-15). Moreover, focusing on the commonality in human circadian rhythm captured by actigraphy, the proposed method is generic to data collected by various actigraphy trackers, circumventing the laborious label collection step in developing customized classifiers for sleep detection.
2019-02-10	Classifying attention deficit hyperactivity disorder in children with non-linearities in actigraphy	Jeremi K. Ochab et.al.	1902.03530v1	null	Objective: This study provides an objective measure based on actigraphy for Attention Deficit Hyperactivity Disorder (ADHD) diagnosis in children. We search for motor activity features that could allow further investigation into their association with other neurophysiological disordered traits. Method: The study involved $n=29$ (48 eligible) male participants aged $9.89\pm0.92$ years (8 controls, and 7 in each group: ADHD combined subtype, ADHD hyperactive-impulsive subtype, and autism spectrum disorder, ASD) wearing a wristwatch actigraph continuously for a week ($9\%$ losses in daily records) in two acquisition modes. We analyzed 47 quantities: from sleep duration or movement intensity to theory-driven scaling exponents or non-linear prediction errors of both diurnal and nocturnal activity. We used them in supervised classification to obtain cross-validated diagnostic performance. Results: We report the best performing measures, including a nearest neighbors 4-feature classifier providing $69.4\pm1.6\%$ accuracy, $78.0\pm2.2\%$ sensitivity and $60.8\pm2.6\%$ specificity in a binary ADHD vs control classification and $46.5\pm1.1\%$ accuracy (against $25\%$ baseline), $61.8\pm1.4\%$ sensitivity and $79.30 \pm0.43\%$ specificity in a 4-class task (two ADHD subtypes, ASD, and control). The most informative feature is skewness of the shape of Zero Crossing Mode (ZCM) activity. Mean and standard deviation of nocturnal activity are among the least informative. Conclusion: Actigraphy causes only minor discomfort to the subjects and is inexpensive. The range of existing mathematical and machine learning tools also allows it to be a useful add-on test for ADHD or differential diagnosis between ADHD subtypes. The study was limited to a small, male sample without the inattentive ADHD subtype.
2018-12-03	A Hidden Markov Model Based Unsupervised Algorithm for Sleep/Wake Identification Using Actigraphy	Xinyue Li et.al.	1812.00553v2	null	Actigraphy is widely used in sleep studies but lacks a universal unsupervised algorithm for sleep/wake identification. In this study, we proposed a Hidden Markov Model (HMM) based unsupervised algorithm that can automatically and effectively infer sleep/wake states. It is an individualized data-driven approach that analyzes actigraphy from each individual respectively to learn activity characteristics and further separate sleep and wake states. We used Actiwatch and polysomnography (PSG) data from 43 individuals in the Multi-Ethnic Study of Atherosclerosis to evaluate the performance of our method. Epoch-by-epoch comparisons were made between our HMM algorithm and that embedded in the Actiwatch software (AS). The percent agreement between HMM and PSG was 85.7%, and that between AS and PSG was 84.7%. Positive predictive values for sleep epochs were 85.6% and 84.6% for HMM and AS, respectively, and 95.5% and 85.6% for wake epochs. Both methods have similar performance and tend to overestimate sleep and underestimate wake compared to PSG. Our HMM approach is able to quantify the variability in activity counts that allow us to differentiate relatively active and sedentary individuals: individuals with higher estimated variabilities tend to show more frequent sedentary behaviors. In conclusion, our unsupervised data-driven HMM algorithm achieves slightly better performance compared to the commonly used algorithm in the Actiwatch software. HMM can help expand the application of actigraphy in large-scale studies and in cases where intrusive PSG is hard to acquire or unavailable. In addition, the estimated HMM parameters can characterize individual activity patterns that can be utilized for further analysis.
2018-08-20	Bayesian Function-on-Scalars Regression for High Dimensional Data	Daniel R. Kowal et.al.	1808.06689v2	null	We develop a fully Bayesian framework for function-on-scalars regression with many predictors. The functional data response is modeled nonparametrically using unknown basis functions, which produces a flexible and data-adaptive functional basis. We incorporate shrinkage priors that effectively remove unimportant scalar covariates from the model and reduce sensitivity to the number of (unknown) basis functions. For variable selection in functional regression, we propose a decision theoretic posterior summarization technique, which identifies a subset of covariates that retains nearly the predictive accuracy of the full model. Our approach is broadly applicable for Bayesian functional regression models, and unlike existing methods provides joint rather than marginal selection of important predictor variables. Computationally scalable posterior inference is achieved using a Gibbs sampler with linear time complexity in the number of predictors. The resulting algorithm is empirically faster than existing frequentist and Bayesian techniques, and provides joint estimation of model parameters, prediction and imputation of functional trajectories, and uncertainty quantification via the posterior distribution. A simulation study demonstrates improvements in estimation accuracy, uncertainty quantification, and variable selection relative to existing alternatives. The methodology is applied to actigraphy data to investigate the association between intraday physical activity and responses to a sleep questionnaire.
2018-04-25	The Intelligent ICU Pilot Study: Using Artificial Intelligence Technology for Autonomous Patient Monitoring	Anis Davoudi et.al.	1804.10201v2	null	Currently, many critical care indices are repetitively assessed and recorded by overburdened nurses, e.g. physical function or facial pain expressions of nonverbal patients. In addition, much essential information on patients and their environment is not captured at all, or is captured in a non-granular manner, e.g. sleep disturbance factors such as bright light, loud background noise, or excessive visitations. In this pilot study, we examined the feasibility of using pervasive sensing technology and artificial intelligence for autonomous and granular monitoring of critically ill patients and their environment in the Intensive Care Unit (ICU). As an exemplar prevalent condition, we also characterized delirious and non-delirious patients and their environment. We used wearable sensors, light and sound sensors, and a high-resolution camera to collect data on patients and their environment. We analyzed collected data using deep learning and statistical analysis. Our system performed face detection, face recognition, facial action unit detection, head pose detection, facial expression recognition, posture recognition, actigraphy analysis, sound pressure and light level detection, and visitation frequency detection. We were able to detect patient's face (Mean average precision (mAP)=0.94), recognize patient's face (mAP=0.80), and their postures (F1=0.94). We also found that all facial expressions, 11 activity features, visitation frequency during the day, visitation frequency during the night, light levels, and sound pressure levels during the night were significantly different between delirious and non-delirious patients (p-value<0.05). In summary, we showed that granular and autonomous monitoring of critically ill patients and their environment is feasible and can be used for characterizing critical care conditions and related environment factors.
2018-03-31	Continuous Circadian Phase Estimation Using Adaptive Notch Filter	Wei Qiao et.al.	1804.00115v1	null	Actigraphy has been widely used for the analysis of circadian rhythm. Current practice applies regression analysis to data from multiple days to estimate the circadian phase. This paper presents a filtering method for online processing of biometric data to estimate the circadian phase. We apply the proposed method on actigraphy data of fruit flies (Drosophila melanogaster).
2018-02-22	Actigraphy-based Sleep/Wake Pattern Detection using Convolutional Neural Networks	Lena Granovsky et.al.	1802.07945v1	null	Common medical conditions are often associated with sleep abnormalities. Patients with medical disorders often suffer from poor sleep quality compared to healthy individuals, which in turn may worsen the symptoms of the disorder. Accurate detection of sleep/wake patterns is important in developing personalized digital markers, which can be used for objective measurements and efficient disease management. Big Data technologies and advanced analytics methods hold the promise to revolutionize clinical research processes, enabling the effective blending of digital data into clinical trials. Actigraphy, a non-invasive activity monitoring method is heavily used to detect and evaluate activities and movement disorders, and assess sleep/wake behavior. In order to study the connection between sleep/wake patterns and a cluster headache disorder, activity data was collected using a wearable device in the course of a clinical trial. This study presents two novel modeling schemes that utilize Deep Convolutional Neural Networks (CNN) to identify sleep/wake states. The proposed methods are a sequential CNN, reminiscent of the bi-directional CNN for slot filling, and a Multi-Task Learning (MTL) based model. Furthermore, we expand standard "Sleep" and "Wake" activity states space by adding the "Falling asleep" and "Siesta" states. We show that the proposed methods provide promising results in accurate detection of the expanded sleep/wake states. Finally, we explore the relations between the detected sleep/wake patterns and onset of cluster headache attacks, and present preliminary observations.
2017-12-27	Co-Morbidity Exploration on Wearables Activity Data Using Unsupervised Pre-training and Multi-Task Learning	Karan Aggarwal et.al.	1712.09527v1	null	Physical activity and sleep play a major role in the prevention and management of many chronic conditions. It is not a trivial task to understand their impact on chronic conditions. Currently, data from electronic health records (EHRs), sleep lab studies, and activity/sleep logs are used. The rapid increase in the popularity of wearable health devices provides a significant new data source, making it possible to track the user's lifestyle real-time through web interfaces, both to consumer as well as their healthcare provider, potentially. However, at present there is a gap between lifestyle data (e.g., sleep, physical activity) and clinical outcomes normally captured in EHRs. This is a critical barrier for the use of this new source of signal for healthcare decision making. Applying deep learning to wearables data provides a new opportunity to overcome this barrier. To address the problem of the unavailability of clinical data from a major fraction of subjects and unrepresentative subject populations, we propose a novel unsupervised (task-agnostic) time-series representation learning technique called act2vec. act2vec learns useful features by taking into account the co-occurrence of activity levels along with periodicity of human activity patterns. The learned representations are then exploited to boost the performance of disorder-specific supervised learning models. Furthermore, since many disorders are often related to each other, a phenomenon referred to as co-morbidity, we use a multi-task learning framework for exploiting the shared structure of disorder inducing life-style choices partially captured in the wearables data. Empirical evaluation using actigraphy data from 4,124 subjects shows that our proposed method performs and generalizes substantially better than the conventional time-series symbolic representational methods and task-specific deep learning models.
2017-12-18	Activity and Circadian Rhythm of Sepsis Patients in the Intensive Care Unit	Anis Davoudi et.al.	1712.06631v1	null	Early mobilization of critically ill patients in the Intensive Care Unit (ICU) can prevent adverse outcomes such as delirium and post-discharge physical impairment. To date, no studies have characterized activity of sepsis patients in the ICU using granular actigraphy data. This study characterizes the activity of sepsis patients in the ICU to aid in future mobility interventions. We have compared the actigraphy features of 24 patients in four groups: Chronic Critical Illness (CCI) sepsis patients in the ICU, Rapid Recovery (RR) sepsis patients in the ICU, non-sepsis ICU patients (control-ICU), and healthy subjects. We used several statistical and circadian rhythm features extracted from the patients' actigraphy data collected over a five-day period. Our results show that the four groups are significantly different in terms of activity features. In addition, we observed that the CCI and control-ICU patients show less regularity in their circadian rhythm compared to the RR patients. These results show the potential of using actigraphy data for guiding mobilization practices, classifying sepsis recovery subtype, as well as for tracking patients' recovery.
2017-11-02	Sleep Stage Classification Based on Multi-level Feature Learning and Recurrent Neural Networks via Wearable Device	Xin Zhang et.al.	1711.00629v1	null	This paper proposes a practical approach for automatic sleep stage classification based on a multi-level feature learning framework and Recurrent Neural Network (RNN) classifier using heart rate and wrist actigraphy derived from a wearable device. The feature learning framework is designed to extract low- and mid-level features. Low-level features capture temporal and frequency domain properties and mid-level features learn compositions and structural information of signals. Since sleep staging is a sequential problem with long-term dependencies, we take advantage of RNNs with Bidirectional Long Short-Term Memory (BLSTM) architectures for sequence data learning. To simulate the actual situation of daily sleep, experiments are conducted with a resting group in which sleep is recorded in resting state, and a comprehensive group in which both resting sleep and non-resting sleep are included. We evaluate the algorithm based on an eight-fold cross validation to classify five sleep stages (W, N1, N2, N3, and REM). The proposed algorithm achieves weighted precision, recall and F1 score of 58.0%, 60.3%, and 58.2% in the resting group and 58.5%, 61.1%, and 58.5% in the comprehensive group, respectively. Various comparison experiments demonstrate the effectiveness of feature learning and BLSTM. We further explore the influence of depth and width of RNNs on performance. Our method is specially proposed for wearable devices and is expected to be applicable for long-term sleep monitoring at home. Without using too much prior domain knowledge, our method has the potential to generalize sleep disorder detection.
2017-05-10	Visualization of Wearable Data and Biometrics for Analysis and Recommendations in Childhood Obesity	Michael Aupetit et.al.	1705.03691v1	null	Obesity is one of the major health risk factors behind the rise of non-communicable conditions. Understanding the factors influencing obesity is very complex since there are many variables that can affect the health behaviors leading to it. Nowadays, multiple data sources can be used to study health behaviors, such as wearable sensors for physical activity and sleep, social media, mobile and health data. In this paper we describe the design of a dashboard for the visualization of actigraphy and biometric data from a childhood obesity camp in Qatar. This dashboard allows quantitative discoveries that can be used to guide patient behavior and orient qualitative research.
2017-02-13	On multifractals: a non-linear study of actigraphy data	Lucas Gabriel Souza França et.al.	1702.03912v2	link	This work aimed to determine the characteristics of activity series by applying fractal geometry concepts, and to evaluate the possibility of identifying individuals with fibromyalgia. Activity level data were collected from 27 healthy subjects and 27 fibromyalgia patients, with the use of clock-like devices equipped with accelerometers, for about four weeks, all day long. The activity series were evaluated through fractal and multifractal methods. Hurst exponent analysis exhibited values in agreement with other studies ($H>0.5$) for both groups ($H=0.98\pm0.04$ for healthy subjects and $H=0.97\pm0.03$ for fibromyalgia patients); however, it is not possible to distinguish between the two groups by such analysis. Activity time series also exhibited a multifractal pattern. A paired analysis of the spectra indices for the sleep and awake states revealed differences between healthy subjects and fibromyalgia patients. The individuals feature differences between awake and sleep states, having statistically significant differences for $\alpha_{q-} - \alpha_{0}$ in healthy subjects ($p = 0.014$) and $D_{0}$ for patients with fibromyalgia ($p = 0.013$). The approach has proven to be an option for the characterisation of this kind of signal and was able to differentiate between the healthy and fibromyalgia groups. This outcome suggests changes in the physiologic mechanisms of movement control.
2016-09-12	Hearables: Multimodal physiological in-ear sensing	Valentin Goverdovsky et.al.	1609.03330v2	null	Future health systems require the means to assess and track the neural and physiological function of a user over long periods of time and in the community. Human body responses are manifested through multiple modalities, such as the mechanical, electrical and chemical; yet current physiological monitors (actigraphy, heart rate) largely lack in both the desired cross-modal and non-stigmatizing aspects. We address these challenges through an inconspicuous and comfortable earpiece, equipped with miniature multimodal sensors, which benefits from the relatively stable position of the ear canal with respect to vital organs to robustly measure the brain, cardiac and respiratory functions. Comprehensive experiments validate each modality within the proposed earpiece, while its potential in health monitoring is illustrated through case studies. We further demonstrate how combining data from multiple sensors within such an integrated wearable device improves both the accuracy of measurements and the ability to deal with artifacts in real-life scenarios.
2016-07-30	Learning Tree-Structured Detection Cascades for Heterogeneous Networks of Embedded Devices	Hamid Dadkhahi et.al.	1608.00159v4	null	In this paper, we present a new approach to learning cascaded classifiers for use in computing environments that involve networks of heterogeneous and resource-constrained, low-power embedded compute and sensing nodes. We present a generalization of the classical linear detection cascade to the case of tree-structured cascades where different branches of the tree execute on different physical compute nodes in the network. Different nodes have access to different features, as well as access to potentially different computation and energy resources. We concentrate on the problem of jointly learning the parameters for all of the classifiers in the cascade given a fixed cascade architecture and a known set of costs required to carry out the computation at each node. To accomplish the objective of joint learning of all detectors, we propose a novel approach to combining classifier outputs during training that better matches the hard cascade setting in which the learned system will be deployed. This work is motivated by research in the area of mobile health where energy efficient real time detectors integrating information from multiple wireless on-body sensors and a smart phone are needed for real-time monitoring and delivering just-in-time adaptive interventions. We apply our framework to two activity recognition datasets as well as the problem of cigarette smoking detection from a combination of wrist-worn actigraphy data and respiration chest band data.
2016-07-24	Impact of Physical Activity on Sleep: A Deep Learning Based Exploration	Aarti Sathyanarayana et.al.	1607.07034v1	null	The importance of sleep is paramount for maintaining physical, emotional and mental wellbeing. Though the relationship between sleep and physical activity is known to be important, it is not yet fully understood. The explosion in popularity of actigraphy and wearable devices provides a unique opportunity to understand this relationship. Leveraging this information source requires new tools to be developed to facilitate data-driven research for sleep and activity patient-recommendations. In this paper we explore the use of deep learning to build sleep quality prediction models based on actigraphy data. We first use deep learning as a pure model building device by performing human activity recognition (HAR) on raw sensor data, and using deep learning to build sleep prediction models. We compare the deep learning models with those built using classical approaches, i.e. logistic regression, support vector machines, random forest and adaboost. Secondly, we employ the advantage of deep learning with its ability to handle high dimensional datasets. We explore several deep learning models on the raw wearable sensor output without performing HAR or any other feature extraction. Our results show that using a convolutional neural network on the raw wearables output improves the predictive value of sleep quality from physical activity, by an additional 8% compared to state-of-the-art non-deep learning approaches, which itself shows a 15% improvement over current practice. Moreover, utilizing deep learning on raw data eliminates the need for data pre-processing and simplifies the overall workflow to analyze actigraphy data for sleep and physical activity research.

Electroencephalography

Publish Date	Title	Authors	PDF	Code	Abstract
2023-07-26	NeuroHeed: Neuro-Steered Speaker Extraction using EEG Signals	Zexu Pan et.al.	2307.14303v1	null	Humans possess the remarkable ability to selectively attend to a single speaker amidst competing voices and background noise, known as selective auditory attention. Recent studies in auditory neuroscience indicate a strong correlation between the attended speech signal and the corresponding brain's elicited neuronal activities, the latter of which can be measured using affordable and non-intrusive electroencephalography (EEG) devices. In this study, we present NeuroHeed, a speaker extraction model that leverages EEG signals to establish a neuronal attractor which is temporally associated with the speech stimulus, facilitating the extraction of the attended speech signal in a cocktail party scenario. We propose both an offline and an online NeuroHeed, with the latter designed for real-time inference. In the online NeuroHeed, we additionally propose an autoregressive speaker encoder, which accumulates past extracted speech signals for self-enrollment of the attended speaker information into an auditory attractor that retains the attentional momentum over time. Online NeuroHeed extracts the current window of the speech signals with guidance from both attractors. Experimental results demonstrate that NeuroHeed effectively extracts brain-attended speech signals, achieving high signal quality, excellent perceptual quality, and intelligibility in a two-speaker scenario.
2023-07-17	How time window influences biometrics performance: an EEG-based fingerprints connectivity study	Luca Didaci et.al.	2307.08291v2	null	EEG-based biometric represents a relatively recent research field that aims to recognize individuals based on their recorded brain activity by means of electroencephalography (EEG). Among the numerous features that have been proposed, connectivity-based approaches represent one of the more promising methods tested so far. In this paper, we investigate how the performance of an EEG biometric system varies with respect to different time windows to understand if it is possible to define the optimal duration of EEG signal that can be used to extract those distinctive features. Overall, the results have shown a pronounced effect of the time window on the biometric performance measured in terms of EER (equal error rate) and AUC (area under the curve), with an evident increase of the biometric performance with an increase of the time window. In conclusion, we want to highlight that EEG connectivity has the potential to represent an optimal candidate as EEG fingerprint and that, in this context, it is very important to define a sufficient time window able to collect the subject specific features. Moreover, our preliminary results show that extending the window size beyond a certain maximum does not improve biometric systems' performance.
2023-07-13	Corticomorphic Hybrid CNN-SNN Architecture for EEG-based Low-footprint Low-latency Auditory Attention Detection	Richard Gall et.al.	2307.08501v1	null	In a multi-speaker "cocktail party" scenario, a listener can selectively attend to a speaker of interest. Studies into the human auditory attention network demonstrate cortical entrainment to speech envelopes resulting in highly correlated Electroencephalography (EEG) measurements. Current trends in EEG-based auditory attention detection (AAD) using artificial neural networks (ANN) are not practical for edge-computing platforms due to longer decision windows using several EEG channels, with higher power consumption and larger memory footprint requirements. Nor are ANNs capable of accurately modeling the brain's top-down attention network since the cortical organization is complex and layered. In this paper, we propose a hybrid convolutional neural network-spiking neural network (CNN-SNN) corticomorphic architecture, inspired by the auditory cortex, which uses EEG data along with multi-speaker speech envelopes to successfully decode auditory attention with low latency down to 1 second, using only 8 EEG electrodes strategically placed close to the auditory cortex, at a significantly higher accuracy of 91.03%, compared to the state-of-the-art. Simultaneously, when compared to a traditional CNN reference model, our model uses ~15% fewer parameters at a lower bit precision resulting in ~57% memory footprint reduction. The results show great promise for edge-computing in brain-embedded devices, like smart hearing aids.
2023-07-06	A Hybrid End-to-End Spatio-Temporal Attention Neural Network with Graph-Smooth Signals for EEG Emotion Recognition	Shadi Sartipi et.al.	2307.03068v1	null	Recently, physiological data such as electroencephalography (EEG) signals have attracted significant attention in affective computing. In this context, the main goal is to design an automated model that can assess emotional states. Lately, deep neural networks have shown promising performance in emotion recognition tasks. However, designing a deep architecture that can extract practical information from raw data is still a challenge. Here, we introduce a deep neural network that acquires interpretable physiological representations by a hybrid structure of spatio-temporal encoding and recurrent attention network blocks. Furthermore, a preprocessing step is applied to the raw data using graph signal processing tools to perform graph smoothing in the spatial domain. We demonstrate that our proposed architecture exceeds state-of-the-art results for emotion classification on the publicly available DEAP dataset. To explore the generality of the learned model, we also evaluate the performance of our architecture towards transfer learning (TL) by transferring the model parameters from a specific source to other target domains. Using DEAP as the source dataset, we demonstrate the effectiveness of our model in performing cross-modality TL and improving emotion classification accuracy on DREAMER and the Emotional English Word (EEWD) datasets, which involve EEG-based emotion classification tasks with different stimuli.
2023-07-06	Trends in Machine Learning and Electroencephalogram (EEG): A Review for Undergraduate Researchers	Nathan Koome Murungi et.al.	2307.02819v1	null	This paper presents a systematic literature review on Brain-Computer Interfaces (BCIs) in the context of Machine Learning. Our focus is on Electroencephalography (EEG) research, highlighting the latest trends as of 2023. The objective is to provide undergraduate researchers with an accessible overview of the BCI field, covering tasks, algorithms, and datasets. By synthesizing recent findings, our aim is to offer a fundamental understanding of BCI research, identifying promising avenues for future investigations.
2023-07-06	Brain Computer Interface (BCI) based on Electroencephalographic (EEG) patterns due to new cognitive tasks	Zahmeeth Sayed Sakkaff et.al.	2307.02780v1	null	New mental tasks were investigated for suitability in Brain-Computer Interface (BCI). Electroencephalography (EEG) signals were collected and analyzed to identify these mental tasks. MS Windows-based software was developed for investigating and classifying recorded EEG data with unnecessary frequencies filtered out with Bandpass filtering. To identify the best feature vector construction method for a given mental task, feature vectors were constructed using Bandpower, Principal Component Analysis, and Downsampling separately. These feature vectors were then classified with Linear Discriminant Analysis, Linear Support Vector Machines, Critical Distance Classifiers, Nearest Neighbor Classifiers, and their Non-Linear counterparts to find the best-performing classifier. For comparison purposes, performances of already well-known mental tasks in the BCI community were computed along with that of new mental tasks introduced in this thesis. In the preliminary studies, it was found that the most promising new mental task which a BCI system could identify is the imagination of hitting a given square with an imaginary arrow from above (or below) and right (or left) to the screen. The group of these mental tasks was named as 'Hit Series' (HS). A detailed investigation of HS was carried out and compared with the performance of Motor Imagery (MI) events which are the most heavily used mental tasks in EEG-based BCI systems. One subject achieved the maximum average performance for HS, 100 pct in the binary classifications while 99 pct in overall combined performance. The best average performances of the other two subjects for the same mental tasks were 93 pct and 87 pct with the overall performance of 89 pct and 78 pct. Performances of the same three subjects for mental tasks in MI were relatively poor. The average performances were 92, 78, and 92 pct while overall performances were 87, 69, and 88 pct.
2023-07-04	K-complex Detection Using Fourier Spectrum Analysis In EEG	Alexey Protopopov et.al.	2307.01754v1	null	K-complexes are an important marker of brain activity and are used both in clinical practice to perform sleep scoring, and in research. However, due to the size of electroencephalography (EEG) records, as well as the subjective nature of K-complex detection performed by somnologists, it is reasonable to automate K-complex detection. Previous works in this field of research have relied on the values of true positive rate and false positive rate to quantify the effectiveness of proposed methods, however this set of metrics may be misleading. The objective of the present research is to find a more accurate set of metrics and use them to develop a new method of K-complex detection, which would not rely on neural networks. Thus, the present article proposes two new methods for K-complex detection based on the fast Fourier transform. The results achieved demonstrated that the proposed methods offered a quality of K-complex detection that is either similar or superior to the quality of the methods demonstrated in previous works, including the methods employing neural networks, while requiring less computational power, meaning that K-complex detection does not require the use of neural networks. The proposed methods were evaluated using a new set of metrics, which is more representative of the quality of K-complex detection.
2023-07-04	Sensors and Systems for Monitoring Mental Fatigue: A systematic review	Prabin Sharma et.al.	2307.01666v1	null	Mental fatigue is a leading cause of motor vehicle accidents, medical errors, loss of workplace productivity, and student disengagements in e-learning environment. Development of sensors and systems that can reliably track mental fatigue can prevent accidents, reduce errors, and help increase workplace productivity. This review provides a critical summary of theoretical models of mental fatigue, a description of key enabling sensor technologies, and a systematic review of recent studies using biosensor-based systems for tracking mental fatigue in humans. We conducted a systematic search and review of recent literature which focused on detection and tracking of mental fatigue in humans. The search yielded 57 studies (N=1082), majority of which used electroencephalography (EEG) based sensors for tracking mental fatigue. We found that EEG-based sensors can provide a moderate to good sensitivity for fatigue detection. Notably, we found no incremental benefit of using high-density EEG sensors for application in mental fatigue detection. Given the findings, we provide a critical discussion on the integration of wearable EEG and ambient sensors in the context of achieving real-world monitoring. Future work required to advance and adapt the technologies toward widespread deployment of wearable sensors and systems for fatigue monitoring in semi-autonomous and autonomous industries is examined.
2023-06-27	Network inference in a stochastic multi-population neural mass model via approximate Bayesian computation	Susanne Ditlevsen et.al.	2306.15787v1	link	In this article, we propose a 6N-dimensional stochastic differential equation (SDE), modelling the activity of N coupled populations of neurons in the brain. This equation extends the Jansen and Rit neural mass model, which has been introduced to describe human electroencephalography (EEG) rhythms, in particular signals with epileptic activity. Our contributions are threefold: First, we introduce this stochastic N-population model and construct a reliable and efficient numerical method for its simulation, extending a splitting procedure for one neural population. Second, we present a modified Sequential Monte Carlo Approximate Bayesian Computation (SMC-ABC) algorithm to infer both the continuous and the discrete model parameters, the latter describing the coupling directions within the network. The proposed algorithm further develops a previous reference-table acceptance rejection ABC method, initially proposed for the inference of one neural population. On the one hand, the considered SMC-ABC approach reduces the computational cost due to the basic acceptance-rejection scheme. On the other hand, it is designed to account for both marginal and coupled interacting dynamics, allowing to identify the directed connectivity structure. Third, we illustrate the derived algorithm on both simulated data and real multi-channel EEG data, aiming to infer the brain's connectivity structure during epileptic seizure. The proposed algorithm may be used for parameter and network estimation in other multi-dimensional coupled SDEs for which a suitable numerical simulation method can be derived.
2023-06-23	Virtual Reality Sickness Reduces Attention During Immersive Experiences	Katherine J. Mimnaugh et.al.	2306.13505v1	null	In this paper, we show that Virtual Reality (VR) sickness is associated with a reduction in attention, which was detected with the P3b Event-Related Potential (ERP) component from electroencephalography (EEG) measurements collected in a dual-task paradigm. We hypothesized that sickness symptoms such as nausea, eyestrain, and fatigue would reduce the users' capacity to pay attention to tasks completed in a virtual environment, and that this reduction in attention would be dynamically reflected in a decrease of the P3b amplitude while VR sickness was experienced. In a user study, participants were taken on a tour through a museum in VR along paths with varying amounts of rotation, shown previously to cause different levels of VR sickness. While paying attention to the virtual museum (the primary task), participants were asked to silently count tones of a different frequency (the secondary task). Control measurements for comparison against the VR sickness conditions were taken when the users were not wearing the Head-Mounted Display (HMD) and while they were immersed in VR but not moving through the environment. This exploratory study shows, across multiple analyses, that the effect mean amplitude of the P3b collected during the task is associated with both sickness severity measured after the task with a questionnaire (SSQ) and with the number of counting errors on the secondary task. Thus, VR sickness may impair attention and task performance, and these changes in attention can be tracked with ERP measures as they happen, without asking participants to assess their sickness symptoms in the moment.
2023-06-21	Reporting existing datasets for automatic epilepsy diagnosis and seizure detection	Palak Handa et.al.	2306.12292v1	null	More than 50 million individuals are affected by epilepsy, a chronic neurological disorder characterized by unprovoked, recurring seizures and psychological symptoms. Researchers are working to automatically detect or predict epileptic episodes through Electroencephalography (EEG) signal analysis, and machine, and deep learning methods. Good quality, open-source, and free EEG data acts as a catalyst in this ongoing battle to manage this disease. This article presents 40+ publicly available EEG datasets for adult and pediatric human populations from 2001-2023. A comparative analysis and discussion on open and private EEG datasets have been done based on objective parameters in this domain. Bonn and CHB-MIT remain the benchmark datasets used for the automatic detection of epileptic and seizure EEG signals. Meta-data has also been released for large EEG data like CHB-MIT. This article will be updated every year to report the progress and changing trends in the development of EEG datasets in this field.
2023-06-13	Empirical Measurement of Aesthetic Experience of Music	Abhishek Gupta et.al.	2306.07802v1	null	Chills or goosebumps, also called frisson, is a phenomenon that is often associated with an aesthetic experience e.g., music or some other ecstatic experience. The temporal and spatial cause of frisson in the brain has been one of the biggest mysteries of human nature. Accumulating evidence suggests that aesthetic, namely subjective, affective, and evaluative processes are at play while listening to music, hence, it is an important subjective stimulus for systematic investigation. Advances in neuroimaging and cognitive neuroscience, have given impetus to neuro-aesthetics, a novel approach to music providing a phenomenological brain-based framework for the aesthetic experience of music with the potential to open the scope for future research. In this paper, we present an affordable, wearable, easy-to-carry device to measure phenomenological goosebumps intensity on our skin with respect to real-time data using IoT devices (Raspberry pi 3, model B). To test the device subjects were asked to provide a list of songs that elicit goosebumps. Wireless earphones were provided, allowing participants to walk around and dance while listening to their music. (Some subjects moved during sessions). Results indicate that goosebumps were reliably detected by the device after visual inspection of the videos/music. The effective measurement when interfaced with neurophysiological devices such as electroencephalography (EEG) can help interpret biomarkers of ecstatic emotions. The second part of the study focuses on identifying primary brain regions involved in goosebump experience during musical stimulation.
2023-06-10	TS-MoCo: Time-Series Momentum Contrast for Self-Supervised Physiological Representation Learning	Philipp Hallgarten et.al.	2306.06522v1	link	Limited availability of labeled physiological data often prohibits the use of powerful supervised deep learning models in the biomedical machine intelligence domain. We approach this problem and propose a novel encoding framework that relies on self-supervised learning with momentum contrast to learn representations from multivariate time-series of various physiological domains without needing labels. Our model uses a transformer architecture that can be easily adapted to classification problems by optimizing a linear output classification layer. We experimentally evaluate our framework using two publicly available physiological datasets from different domains, i.e., human activity recognition from embedded inertial sensory and emotion recognition from electroencephalography. We show that our self-supervised learning approach can indeed learn discriminative features which can be exploited in downstream classification tasks. Our work enables the development of domain-agnostic intelligent systems that can effectively analyze multivariate time-series data from physiological domains.
2023-06-05	Gotta Go Fast: Measuring Input/Output Latencies of Virtual Reality 3D Engines for Cognitive Experiments	Taeho Kang et.al.	2306.02637v1	null	Virtual Reality (VR) is seeing increased adoption across many fields. The field of experimental cognitive science is also testing utilization of the technology combined with physiological measures such as electroencephalography (EEG) and eye tracking. Quantitative measures of human behavior and cognition process, however, are sensitive to minuscule time resolutions that are often overlooked in the scope of consumer-level VR hardware and software stacks. In this preliminary study, we implement VR testing environments in two prominent 3D Virtual Reality frameworks (Unity and Unreal Engine) to measure latency values for stimulus onset execution code to Head-Mount Display (HMD) pixel change, as well as the latency between human behavioral response input to its registration in the engine environment under a typical cognitive experiment hardware setup. We find that whereas the specifics of the latency may further be influenced by different hardware and software setups, the variations in consumer hardware are apparent regardless, and we report detailed statistics on these latencies. Such consideration should be taken into account when designing VR-based cognitive experiments that measure human behavior.
2023-05-22	Towards Ultrasound Tongue Image prediction from EEG during speech production	Tamás Gábor Csapó et.al.	2306.05374v1	link	Previous initial research has already been carried out to propose speech-based BCI using brain signals (e.g., non-invasive EEG and invasive sEEG / ECoG), but there is a lack of combined methods that investigate non-invasive brain, articulation, and speech signals together and analyze the cognitive processes in the brain, the kinematics of the articulatory movement and the resulting speech signal. In this paper, we describe our multimodal (electroencephalography, ultrasound tongue imaging, and speech) analysis and synthesis experiments, as a feasibility study. We extend the analysis of brain signals recorded during speech production with ultrasound-based articulation data. From the brain signal measured with EEG, we predict ultrasound images of the tongue with a fully connected deep neural network. The results show that there is a weak but noticeable relationship between EEG and ultrasound tongue images, i.e. the network can differentiate articulated speech and neutral tongue position.
2023-05-19	Energy-efficient memcapacitive physical reservoir computing system for temporal data processing	Md Razuan Hossain et.al.	2305.12025v1	null	Reservoir computing is a highly efficient machine learning framework for processing temporal data by extracting features from the input signal and mapping them into higher dimensional spaces. Physical reservoir layers have been realized using spintronic oscillators, atomic switch networks, silicon photonic modules, ferroelectric transistors, and volatile memristors. However, these devices are intrinsically energy-dissipative due to their resistive nature, which leads to increased power consumption. Therefore, capacitive memory devices can provide a more energy-efficient approach. Here, we leverage volatile biomembrane-based memcapacitors that closely mimic certain short-term synaptic plasticity functions as reservoirs to solve classification tasks and analyze time-series data in simulation and experimentally. Our system achieves a 98% accuracy rate for spoken digit classification and a normalized mean square error of 0.0012 in a second-order non-linear regression task. Further, to demonstrate the device's real-time temporal data processing capability, we demonstrate a 100% accuracy for an electroencephalography (EEG) signal classification problem for epilepsy detection. Most importantly, we demonstrate that for a random input sequence, each memcapacitor consumes on average 41.5fJ of energy per spike, irrespective of the chosen input voltage pulse width, and 415fW of average power for 100 ms pulse width, orders of magnitude lower than the state-of-the-art devices. Lastly, we believe the biocompatible, soft nature of our memcapacitor makes it highly suitable for computing and signal-processing applications in biological environments.
2023-05-18	Temporal Aware Mixed Attention-based Convolution and Transformer Network (MACTN) for EEG Emotion Recognition	Xiaopeng Si et.al.	2305.18234v1	null	Emotion recognition plays a crucial role in human-computer interaction, and electroencephalography (EEG) is advantageous for reflecting human emotional states. In this study, we propose MACTN, a hierarchical hybrid model for jointly modeling local and global temporal information. The model is inspired by neuroscience research on the temporal dynamics of emotions. MACTN extracts local emotional features through a convolutional neural network (CNN) and integrates sparse global emotional features through a transformer. Moreover, we employ channel attention mechanisms to identify the most task-relevant channels. Through extensive experimentation on two publicly available datasets, namely THU-EP and DEAP, our proposed method, MACTN, consistently achieves superior classification accuracy and F1 scores compared to other existing methods in most experimental settings. Furthermore, ablation studies have shown that the integration of both self-attention mechanisms and channel attention mechanisms leads to improved classification performance. Finally, an earlier version of this method, which shares the same ideas, won the Emotional BCI Competition's final championship in the 2022 World Robot Contest.
2023-05-18	Robust inference of causality in high-dimensional dynamical processes from the Information Imbalance of distance ranks	Vittorio Del Tatto et.al.	2305.10817v2	link	We introduce an approach which allows inferring causal relationships between variables for which the time evolution is available. Our method builds on the ideas of Granger Causality and Transfer Entropy, but overcomes most of their limitations. Specifically, our approach tests whether the predictability of a putative driven system Y can be improved by incorporating information from a potential driver system X, without making assumptions on the underlying dynamics and without the need to compute probability densities of the dynamic variables. Causality is assessed by a rigorous variational scheme based on the Information Imbalance of distance ranks, a recently developed statistical test capable of inferring the relative information content of different distance measures. This framework makes causality detection possible even for high-dimensional systems where only few of the variables are known or measured. Benchmark tests on coupled dynamical systems demonstrate that our approach outperforms other model-free causality detection methods, successfully handling both unidirectional and bidirectional couplings, and it is capable of detecting the arrow of time when present. We also show that the method can be used to robustly detect causality in electroencephalography data in humans.
2023-05-17	BASEN: Time-Domain Brain-Assisted Speech Enhancement Network with Convolutional Cross Attention in Multi-talker Conditions	Jie Zhang et.al.	2305.09994v1	link	Time-domain single-channel speech enhancement (SE) still remains challenging to extract the target speaker without any prior information on multi-talker conditions. It has been shown via auditory attention decoding that the brain activity of the listener contains the auditory information of the attended speaker. In this paper, we thus propose a novel time-domain brain-assisted SE network (BASEN) incorporating electroencephalography (EEG) signals recorded from the listener for extracting the target speaker from monaural speech mixtures. The proposed BASEN is based on the fully-convolutional time-domain audio separation network. In order to fully leverage the complementary information contained in the EEG signals, we further propose a convolutional multi-layer cross attention module to fuse the dual-branch features. Experimental results on a public dataset show that the proposed model outperforms the state-of-the-art method in several evaluation metrics. The reproducible code is available at https://github.com/jzhangU/Basen.git.
2023-04-24	Time delay multi-feature correlation analysis to extract subtle dependencies from EEG signals	Jarek Duda et.al.	2305.09478v2	null	Electroencephalography (EEG) signals are resultants of extremely complex brain activity. Some details of these hidden dynamics might be accessible through e.g. joint distributions $\rho_{\Delta t}$ of signals of pairs of electrodes shifted by various time delays (lag $\Delta t$). A standard approach is monitoring a single evaluation of such joint distributions, like Pearson correlation (or mutual information), which turns out relatively uninteresting - as expected, there is usually a small peak for zero delay and nearly symmetric drop with delay. In contrast, such a complex signal might be composed of multiple types of statistical dependencies - this article proposes an approach to automatically decompose and extract them. Specifically, we model such joint distributions as polynomials, estimated separately for all considered lag dependencies, then with PCA dimensionality reduction we find the dominant joint density distortion directions $f_v$. This way we get a few lag dependent features $a_i(\Delta t)$ describing separate dominating statistical dependencies of known contributions: $\rho_{\Delta t}(y,z)\approx \sum_{i=1}^r a_i(\Delta t)\, f_{v_i}(y,z)$. Such features complement Pearson correlation, extracting hidden more complex behavior, e.g. with asymmetry which might be related to the direction of information transfer, extrema suggesting characteristic delays, or oscillatory behavior suggesting some periodicity. An extension of Granger causality to such multi-feature joint density analysis is also discussed, suggesting e.g. two separate causality waves. While this early article is initial fundamental research, in the future it might help e.g. with understanding of cortex hidden dynamics, diagnosis of pathologies like epilepsy, determination of precise electrode position, or building brain-computer interface.
2023-04-21	A Convolutional Spiking Network for Gesture Recognition in Brain-Computer Interfaces	Yiming Ai et.al.	2304.11106v2	null	Brain-computer interfaces are being explored for a wide variety of therapeutic applications. Typically, this involves measuring and analyzing continuous-time electrical brain activity via techniques such as electrocorticogram (ECoG) or electroencephalography (EEG) to drive external devices. However, due to the inherent noise and variability in the measurements, the analysis of these signals is challenging and requires offline processing with significant computational resources. In this paper, we propose a simple yet efficient machine learning-based approach for the exemplary problem of hand gesture classification based on brain signals. We use a hybrid machine learning approach that uses a convolutional spiking neural network employing a bio-inspired event-driven synaptic plasticity rule for unsupervised feature learning of the measured analog signals encoded in the spike domain. We demonstrate that this approach generalizes to different subjects with both EEG and ECoG data and achieves superior accuracy in the range of 92.74-97.07% in identifying different hand gesture classes and motor imagery tasks.
2023-04-21	Interpretable and Robust AI in EEG Systems: A Survey	Xinliang Zhou et.al.	2304.10755v1	null	The close coupling of artificial intelligence (AI) and electroencephalography (EEG) has substantially advanced human-computer interaction (HCI) technologies in the AI era. Different from traditional EEG systems, the interpretability and robustness of AI-based EEG systems are becoming particularly crucial. The interpretability clarifies the inner working mechanisms of AI models and thus can gain the trust of users. The robustness reflects the AI's reliability against attacks and perturbations, which is essential for sensitive and fragile EEG signals. Thus the interpretability and robustness of AI in EEG systems have attracted increasing attention, and their research has achieved great progress recently. However, there is still no survey covering recent advances in this field. In this paper, we present the first comprehensive survey and summarize the interpretable and robust AI techniques for EEG systems. Specifically, we first propose a taxonomy of interpretability by characterizing it into three types: backpropagation, perturbation, and inherently interpretable methods. Then we classify the robustness mechanisms into four classes: noise and artifacts, human variability, data acquisition instability, and adversarial attacks. Finally, we identify several critical and unresolved challenges for interpretable and robust AI in EEG systems and further discuss their future directions.
2023-04-12	Adaptive Gated Graph Convolutional Network for Explainable Diagnosis of Alzheimer's Disease using EEG Data	Dominik Klepl et.al.	2304.05874v1	null	Graph neural network (GNN) models are increasingly being used for the classification of electroencephalography (EEG) data. However, GNN-based diagnosis of neurological disorders, such as Alzheimer's disease (AD), remains a relatively unexplored area of research. Previous studies have relied on functional connectivity methods to infer brain graph structures and used simple GNN architectures for the diagnosis of AD. In this work, we propose a novel adaptive gated graph convolutional network (AGGCN) that can provide explainable predictions. AGGCN adaptively learns graph structures by combining convolution-based node feature enhancement with a well-known correlation-based measure of functional connectivity. Furthermore, the gated graph convolution can dynamically weigh the contribution of various spatial scales. The proposed model achieves high accuracy in both eyes-closed and eyes-open conditions, indicating the stability of learned representations. Finally, we demonstrate that the proposed AGGCN model generates consistent explanations of its predictions that might be relevant for further study of AD-related alterations of brain networks.
2023-04-12	Dynamic Graph Representation Learning with Neural Networks: A Survey	Leshanshui Yang et.al.	2304.05729v1	null	In recent years, Dynamic Graph (DG) representations have been increasingly used for modeling dynamic systems due to their ability to integrate both topological and temporal information in a compact representation. Dynamic graphs make it possible to efficiently handle applications such as social network prediction, recommender systems, traffic forecasting or electroencephalography analysis, that cannot be addressed using standard numeric representations. As a direct consequence of the emergence of dynamic graph representations, dynamic graph learning has emerged as a new machine learning problem, combining challenges from both sequential/temporal data processing and static graph learning. In this research area, Dynamic Graph Neural Network (DGNN) has become the state-of-the-art approach and a plethora of models have been proposed in recent years. This paper aims at providing a review of problems and models related to dynamic graph learning. The various dynamic graph supervised learning settings are analysed and discussed. We identify the similarities and differences between existing models with respect to the way time information is modeled. Finally, general guidelines for a DGNN designer when faced with a dynamic graph learning problem are provided.
2023-04-01	Upper Limb Movement Execution Classification using Electroencephalography for Brain Computer Interface	Saadat Ullah Khan et.al.	2304.06036v1	null	An accurate classification of upper limb movements using electroencephalography (EEG) signals is gaining significant importance in recent years due to the prevalence of brain-computer interfaces. The upper limbs in the human body are crucial since different skeletal segments combine to make a range of motion that helps us in our trivial daily tasks. Decoding EEG-based upper limb movements can be of great help to people with spinal cord injury (SCI) or other neuro-muscular diseases such as amyotrophic lateral sclerosis (ALS), primary lateral sclerosis, and periodic paralysis. This can manifest in a loss of sensory and motor function, which could make a person reliant on others to provide care in day-to-day activities. We can detect and classify upper limb movement activities, whether they be executed or imagined using an EEG-based brain-computer interface (BCI). Toward this goal, we focus our attention on decoding movement execution (ME) of the upper limb in this study. For this purpose, we utilize a publicly available EEG dataset that contains EEG signal recordings from fifteen subjects acquired using a 61-channel EEG device. We propose a method to classify four ME classes for different subjects using spectrograms of the EEG data through pre-trained deep learning (DL) models. Our proposed method of using EEG spectrograms for the classification of ME has shown significant results, where the highest average classification accuracy (for four ME classes) obtained is 87.36%, with one subject achieving the best classification accuracy of 97.03%.
2023-03-29	Parkinsons Disease Detection via Resting-State Electroencephalography Using Signal Processing and Machine Learning Techniques	Krish Desai et.al.	2304.01214v1	null	Parkinson's Disease (PD) is a neurodegenerative disorder resulting in motor deficits due to advancing degeneration of dopaminergic neurons. PD patients report experiencing tremor, rigidity, visual impairment, bradykinesia, and several cognitive deficits. Although Electroencephalography (EEG) indicates abnormalities in PD patients, one major challenge is the lack of a consistent, accurate, and systemic biomarker for PD in order to closely monitor the disease with therapeutic treatments and medication. In this study, we collected Electroencephalographic data from 15 PD patients and 16 Healthy Controls (HC). We first preprocessed every EEG signal using several techniques and extracted relevant features using many feature extraction algorithms. Afterwards, we applied several machine learning algorithms to classify PD versus HC. We found the most significant metrics to be achieved by the Random Forest ensemble learning approach, with an accuracy, precision, recall, F1 score, and AUC of 97.5%, 100%, 95%, 0.967, and 0.975, respectively. The results of this study show promise for exposing PD abnormalities using EEG during clinical diagnosis, and automating this process using signal processing techniques and ML algorithms to evaluate the difference between healthy individuals and PD patients.
2023-03-27	EEGMatch: Learning with Incomplete Labels for Semi-Supervised EEG-based Cross-Subject Emotion Recognition	Rushuang Zhou et.al.	2304.06496v1	link	Electroencephalography (EEG) is an objective tool for emotion recognition and shows promising performance. However, the label scarcity problem is a main challenge in this field, which limits the wide application of EEG-based emotion recognition. In this paper, we propose a novel semi-supervised learning framework (EEGMatch) to leverage both labeled and unlabeled EEG data. First, an EEG-Mixup based data augmentation method is developed to generate more valid samples for model learning. Second, a semi-supervised two-step pairwise learning method is proposed to bridge prototype-wise and instance-wise pairwise learning, where the prototype-wise pairwise learning measures the global relationship between EEG data and the prototypical representation of each emotion class and the instance-wise pairwise learning captures the local intrinsic relationship among EEG data. Third, a semi-supervised multi-domain adaptation is introduced to align the data representation among multiple domains (labeled source domain, unlabeled source domain, and target domain), where the distribution mismatch is alleviated. Extensive experiments are conducted on two benchmark databases (SEED and SEED-IV) under a cross-subject leave-one-subject-out cross-validation evaluation protocol. The results show the proposed EEGMatch performs better than the state-of-the-art methods under different incomplete label conditions (with 6.89% improvement on SEED and 1.44% improvement on SEED-IV), which demonstrates the effectiveness of the proposed EEGMatch in dealing with the label scarcity problem in emotion recognition using EEG signals. The source code is available at https://github.com/KAZABANA/EEGMatch.
2023-03-26	Driver Drowsiness Detection with Commercial EEG Headsets	Qazal Rezaee et.al.	2303.14841v1	null	Driver Drowsiness is one of the leading causes of road accidents. Electroencephalography (EEG) is highly affected by drowsiness; hence, EEG-based methods detect drowsiness with the highest accuracy. Developments in manufacturing dry electrodes and headsets have made recording EEG more convenient. Vehicle-based features used for detecting drowsiness are easy to capture but do not have the best performance. In this paper, we investigated the performance of EEG signals recorded in 4 channels with commercial headsets against the vehicle-based technique in drowsiness detection. We recorded EEG signals of 50 volunteers driving a simulator in drowsy and alert states by commercial devices. The observer rating of the drowsiness method was used to determine the drowsiness level of the subjects. The meaningful separation of vehicle-based features, recorded by the simulator, and EEG-based features of the two states of drowsiness and alertness have been investigated. The comparison results indicated that the EEG-based features are separated with lower p-values than the vehicle-based ones in the two states. It is concluded that EEG headsets can be feasible alternatives with better performance compared to vehicle-based methods for detecting drowsiness.
2023-03-20	Relate auditory speech to EEG by shallow-deep attention-based network	Fan Cui et.al.	2303.10897v1	null	Electroencephalography (EEG) plays a vital role in detecting how the brain responds to different stimuli. In this paper, we propose a novel Shallow-Deep Attention-based Network (SDANet) to classify the correct auditory stimulus evoking the EEG signal. It adopts the Attention-based Correlation Module (ACM) to discover the connection between auditory speech and EEG from a global aspect, and the Shallow-Deep Similarity Classification Module (SDSCM) to decide the classification result via the embeddings learned from the shallow and deep layers. Moreover, various training strategies and data augmentation are used to boost the model robustness. Experiments are conducted on the dataset provided by Auditory EEG challenge (ICASSP Signal Processing Grand Challenge 2023). Results show that the proposed model has a significant gain over the baseline on the match-mismatch track.
2023-03-19	Enabling Immersion and Presence in the Metaverse with Over-the-Air Brain-Computer Interface	Nguyen Quang Hieu et.al.	2303.10577v1	null	Decoding brain signals can not only reveal Metaverse users' expectations but also provide early detection of error-related behaviors such as stress, drowsiness, and motion sickness. For that, this article proposes a pioneering framework using wireless/over-the-air Brain-Computer Interface (BCI) to assist creation of virtual avatars as human representation in the Metaverse. Specifically, to eliminate the computational burden for Metaverse users' devices, we leverage Wireless Edge Servers (WES) that are popular in 5G architecture and therein URLLC, enhanced broadband features to obtain and process the brain activities, i.e., electroencephalography (EEG) signals (via uplink wireless channels). As a result, the WES can learn human behaviors, adapt system configurations, and allocate radio resources to create individualized settings and enhance user experiences. Despite the potential of BCI, the inherent noisy/fading wireless channels and the uncertainty in Metaverse users' demands and behaviors make the related resource allocation and learning/classification problems particularly challenging. We formulate the joint learning and resource allocation problem as a Quality-of-Experience (QoE) maximization problem that takes into account the latency, brain classification accuracy, and resources of the system. To tackle this mixed integer programming problem, we then propose two novel algorithms that are (i) a hybrid learning algorithm to maximize the user QoE and (ii) a meta-learning algorithm to exploit the neurodiversity of the brain signals among multiple Metaverse users. The extensive experiment results with different BCI datasets show that our proposed algorithms can not only provide low delay for virtual reality (VR) applications but also can achieve high classification accuracy for the collected brain signals.

PPG

Photoplethysmography

Publish Date	Title	Authors	PDF	Code	Abstract
2023-07-24	Remote Bio-Sensing: Open Source Benchmark Framework for Fair Evaluation of rPPG	Dae Yeol Kim et.al.	2307.12644v1	link	Remote Photoplethysmography (rPPG) is a technology that utilizes the light absorption properties of hemoglobin, captured via camera, to analyze and measure blood volume pulse (BVP). By analyzing the measured BVP, various physiological signals such as heart rate, stress levels, and blood pressure can be derived, enabling applications such as the early prediction of cardiovascular diseases. rPPG is a rapidly evolving field as it allows the measurement of vital signals using camera-equipped devices without the need for additional devices such as blood pressure monitors or pulse oximeters, and without the assistance of medical experts. Despite extensive efforts and advances in this field, serious challenges remain, including issues related to skin color, camera characteristics, ambient lighting, and other sources of noise, which degrade performance accuracy. We argue that fair and evaluable benchmarking is urgently required to overcome these challenges and make any meaningful progress from both academic and commercial perspectives. In most existing work, models are trained, tested, and validated only on limited datasets. Worse still, some studies lack available code or reproducibility, making it difficult to fairly evaluate and compare performance. Therefore, the purpose of this study is to provide a benchmarking framework to evaluate various rPPG techniques across a wide range of datasets for fair evaluation and comparison, including both conventional non-deep neural network (non-DNN) and deep neural network (DNN) methods. GitHub URL: https://github.com/remotebiosensing/rppg.
2023-07-18	Robust peak detection for photoplethysmography signal analysis	Márton Á. Goda et.al.	2307.10398v1	null	Efficient and accurate evaluation of long-term photoplethysmography (PPG) recordings is essential for both clinical assessments and consumer products. In 2021, the top open-source peak detectors were benchmarked on the Multi-Ethnic Study of Atherosclerosis (MESA) database consisting of polysomnography (PSG) recordings and continuous sleep PPG data, where the Automatic Beat Detector (Aboy) had the best accuracy. This work presents Aboy++, an improved version of the original Aboy beat detector. The algorithm was evaluated on 100 adult PPG recordings from the MESA database, which contains more than 4.25 million reference beats. Aboy++ achieved an F1-score of 85.5%, compared to 80.99% for the original Aboy peak detector. On average, Aboy++ processed a 1-hour-long recording in less than 2 seconds. This is compared to 115 seconds (i.e., over 57 times longer) for the open-source implementation of the original Aboy peak detector. This study demonstrated the importance of developing robust algorithms like Aboy++ to improve PPG data analysis and clinical outcomes. Overall, Aboy++ is a reliable tool for evaluating long-term wearable PPG measurements in clinical and consumer contexts.
2023-07-17	Quality Assessment of Photoplethysmography Signals For Cardiovascular Biomarkers Monitoring Using Wearable Devices	Felipe M. Dias et.al.	2307.08766v1	null	Photoplethysmography (PPG) is a non-invasive technology that measures changes in blood volume in the microvascular bed of tissue. It is commonly used in medical devices such as pulse oximeters and wrist-worn heart rate monitors to monitor cardiovascular hemodynamics. PPG allows for the assessment of parameters (e.g., heart rate, pulse waveform, and peripheral perfusion) that can indicate conditions such as vasoconstriction or vasodilation, and provides information about microvascular blood flow, making it a valuable tool for monitoring cardiovascular health. However, PPG is subject to a number of sources of variations that can impact its accuracy and reliability, especially when using a wearable device for continuous monitoring, such as motion artifacts, skin pigmentation, and vasomotion. In this study, we extracted 27 statistical features from the PPG signal for training machine-learning models based on gradient boosting (XGBoost and CatBoost) and Random Forest (RF) algorithms to assess quality of PPG signals that were labeled as good or poor quality. We used the PPG time series from a publicly available dataset and evaluated the algorithm's performance using Sensitivity (Se), Positive Predictive Value (PPV), and F1-score (F1) metrics. Our model achieved Se, PPV, and F1-score of 94.4, 95.6, and 95.0 for XGBoost, 94.7, 95.9, and 95.3 for CatBoost, and 93.7, 91.3 and 92.5 for RF, respectively. Our findings are comparable to state-of-the-art reported in the literature but using a much simpler model, indicating that ML models are promising for developing remote, non-invasive, and continuous measurement devices.
2023-07-14	An Embedded Auto-Calibrated Offset Current Compensation Technique for PPG/fNIRS System	Sadan Saquib Khan et.al.	2307.07414v1	null	Usually, the current generated by the photodiode proportional to the oxygenated blood in the photoplethysmography (PPG) and functional near-infrared spectroscopy (fNIRS) based recording systems is small as compared to the offset current. The offset current is the combination of the dark current of the photodiode, the current due to ambient light, and the current due to the reflected light from fat and skull. The relatively large value of the offset current limits the amplification of the signal current and affects the overall performance of the PPG/fNIRS recording systems. In this paper, we present a mixed-signal auto-calibrated offset current compensation technique for PPG and fNIRS recording systems. The system auto-calibrates the offset current, compensates using a dual discrete loop technique, and amplifies the signal current. Thanks to the amplification, the system provides better sensitivity. A prototype of the system is built and tested for PPG signal recording. The prototype is developed for a 3.3 V single supply. The results show that the proposed system is able to effectively compensate for the offset current.
2023-07-12	Personalized Anomaly Detection in PPG Data using Representation Learning and Biometric Identification	Ramin Ghorbani et.al.	2307.06380v1	null	Photoplethysmography (PPG) signals, typically acquired from wearable devices, hold significant potential for continuous fitness-health monitoring. In particular, heart conditions that manifest in rare and subtle deviating heart patterns may be interesting. However, robust and reliable anomaly detection within these data remains a challenge due to the scarcity of labeled data and high inter-subject variability. This paper introduces a two-stage framework leveraging representation learning and personalization to improve anomaly detection performance in PPG data. The proposed framework first employs representation learning to transform the original PPG signals into a more discriminative and compact representation. We then apply three different unsupervised anomaly detection methods for movement detection and biometric identification. We validate our approach using two different datasets in both generalized and personalized scenarios. The results show that representation learning significantly improves anomaly detection performance while reducing the high inter-subject variability. Personalized models further enhance anomaly detection performance, underscoring the role of personalization in PPG-based fitness-health monitoring systems. The results from biometric identification show that it's easier to distinguish a new user from one intended authorized user than from a group of users. Overall, this study provides evidence of the effectiveness of representation learning and personalization for anomaly detection in PPG data.
2023-07-07	A Self-Supervised Algorithm for Denoising Photoplethysmography Signals for Heart Rate Estimation from Wearables	Pranay Jain et.al.	2307.05339v1	null	Smart watches and other wearable devices are equipped with photoplethysmography (PPG) sensors for monitoring heart rate and other aspects of cardiovascular health. However, PPG signals collected from such devices are susceptible to corruption from noise and motion artifacts, which cause errors in heart rate estimation. Typical denoising approaches filter or reconstruct the signal in ways that eliminate much of the morphological information, even from the clean parts of the signal that would be useful to preserve. In this work, we develop an algorithm for denoising PPG signals that reconstructs the corrupted parts of the signal, while preserving the clean parts of the PPG signal. Our novel framework relies on self-supervised training, where we leverage a large database of clean PPG signals to train a denoising autoencoder. As we show, our reconstructed signals provide better estimates of heart rate from PPG signals than the leading heart rate estimation methods. Further experiments show significant improvement in Heart Rate Variability (HRV) estimation from PPG signals using our algorithm. We conclude that our algorithm denoises PPG signals in a way that can improve downstream analysis of many different health metrics from wearable devices.
2023-07-06	Learned Kernels for Interpretable and Efficient PPG Signal Quality Assessment and Artifact Segmentation	Sully F. Chen et.al.	2307.05385v1	null	Photoplethysmography (PPG) provides a low-cost, non-invasive method to continuously monitor various cardiovascular parameters. PPG signals are generated by wearable devices and frequently contain large artifacts caused by external factors, such as motion of the human subject. In order to ensure robust and accurate extraction of physiological parameters, corrupted areas of the signal need to be identified and handled appropriately. Previous methodology relied either on handcrafted feature detectors or signal metrics which yield sub-optimal performance, or relied on machine learning techniques such as deep neural networks (DNN) which lack interpretability and are computationally and memory intensive. In this work, we present a novel method to learn a small set of interpretable convolutional kernels that has performance similar to -- and often better than -- the state-of-the-art DNN approach with several orders of magnitude fewer parameters. This work allows for efficient, robust, and interpretable signal quality assessment and artifact segmentation on low-power devices.
2023-06-19	ApSense: Data-driven Algorithm in PPG-based Sleep Apnea Sensing	Tanut Choksatchawathi et.al.	2306.10863v1	null	In this paper, we utilized obstructive sleep apnea and cardiovascular disease-related photoplethysmography (PPG) features in constructing the input to deep learning (DL). The features are pulse wave amplitude (PWA), beat-to-beat or RR interval, a derivative of PWA, a derivative of RR interval, systolic phase duration, diastolic phase duration, and pulse area. Then, we develop DL architectures to evaluate the proposed features' usefulness. Eventually, we demonstrate that in human-machine settings where the medical staff only needs to label 20% of the PPG recording length, our proposed features with the developed DL architectures achieve 79.95% and 73.81% recognition accuracy in MESA and HeartBEAT datasets. This simplifies the labelling task of the medical staff during the sleep test yet provides accurate apnea event recognition.
2023-06-16	Camera PPG waveforms at the forehead	A. C. den Brinker et.al.	2306.09879v1	null	In order to obtain insights into the feasibility of replacing ECG-guided triggering in magnetic resonance imaging (MRI) by a system based on video photoplethysmography (PPG), PPG and ECG data were collected from volunteers in an MRI scanner. PPG waveforms obtained using remote camera PPG directed at the forehead are studied in qualitative and quantitative sense over a number of volunteers. The data analysis considers variations in PPG waveforms across volunteers, modelling of the waveforms in Fourier series, dependencies of waveforms and features on the interbeat interval (IBI) and breath-holding, and models for ECG-blind estimation of R-peak position. The main findings are that the PPG waveform depends on the volunteer and that its shape changes with IBI and does not depend on breath-holding in the given scenario. Low-order harmonic models provide accurate approximations to the PPG waveform, where for higher IBI the waveform shows more temporal details. Accurate predictions (20 ms std) of the delays between markers in ECG and PPG appear feasible from a single PPG feature.
2023-06-13	BeliefPPG: Uncertainty-aware Heart Rate Estimation from PPG signals via Belief Propagation	Valentin Bieri et.al.	2306.07730v2	link	We present a novel learning-based method that achieves state-of-the-art performance on several heart rate estimation benchmarks extracted from photoplethysmography signals (PPG). We consider the evolution of the heart rate in the context of a discrete-time stochastic process that we represent as a hidden Markov model. We derive a distribution over possible heart rate values for a given PPG signal window through a trained neural network. Using belief propagation, we incorporate the statistical distribution of heart rate changes to refine these estimates in a temporal context. From this, we obtain a quantized probability distribution over the range of possible heart rate values that captures a meaningful and well-calibrated estimate of the inherent predictive uncertainty. We show the robustness of our method on eight public datasets with three different cross-validation experiments.
2023-06-04	rPPG-MAE: Self-supervised Pre-training with Masked Autoencoders for Remote Physiological Measurement	Xin Liu et.al.	2306.02301v1	null	Remote photoplethysmography (rPPG) is an important technique for perceiving human vital signs, which has received extensive attention. For a long time, researchers have focused on supervised methods that rely on large amounts of labeled data. These methods are limited by the requirement for large amounts of data and the difficulty of acquiring ground truth physiological signals. To address these issues, several self-supervised methods based on contrastive learning have been proposed. However, they focus on contrastive learning between samples, which neglects the inherent self-similar prior in physiological signals and seems to have a limited ability to cope with noise. In this paper, a linear self-supervised reconstruction task was designed for extracting the inherent self-similar prior in physiological signals. Besides, a specific noise-insensitive strategy was explored for reducing the interference of motion and illumination. The proposed framework in this paper, namely rPPG-MAE, demonstrates excellent performance even on the challenging VIPL-HR dataset. We also evaluate the proposed method on two public datasets, namely PURE and UBFC-rPPG. The results show that our method not only outperforms existing self-supervised methods but also exceeds the state-of-the-art (SOTA) supervised methods. One important observation is that the quality of the dataset seems more important than the size in self-supervised pre-training of rPPG. The source code is released at https://github.com/linuxsino/rPPG-MAE.
2023-06-01	Privacy-Preserving Remote Heart Rate Estimation from Facial Videos	Divij Gupta et.al.	2306.01141v1	null	Remote Photoplethysmography (rPPG) is the process of estimating PPG from facial videos. While this approach benefits from contactless interaction, it is reliant on videos of faces, which often constitutes an important privacy concern. Recent research has revealed that deep learning techniques are vulnerable to attacks, which can result in significant data breaches making deep rPPG estimation even more sensitive. To address this issue, we propose a data perturbation method that involves extraction of certain areas of the face with less identity-related information, followed by pixel shuffling and blurring. Our experiments on two rPPG datasets (PURE and UBFC) show that our approach reduces the accuracy of facial recognition algorithms by over 60%, with minimal impact on rPPG extraction. We also test our method on three facial recognition datasets (LFW, CALFW, and AgeDB), where our approach reduced performance by nearly 50%. Our findings demonstrate the potential of our approach as an effective privacy-preserving solution for rPPG estimation.
2023-05-25	Mask Attack Detection Using Vascular-weighted Motion-robust rPPG Signals	Chenglin Yao et.al.	2305.15940v1	null	Detecting 3D mask attacks to a face recognition system is challenging. Although genuine faces and 3D face masks show significantly different remote photoplethysmography (rPPG) signals, rPPG-based face anti-spoofing methods often suffer from performance degradation due to unstable face alignment in the video sequence and weak rPPG signals. To enhance the rPPG signal in a motion-robust way, a landmark-anchored face stitching method is proposed to align the faces robustly and precisely at the pixel-wise level by using both SIFT keypoints and facial landmarks. To better encode the rPPG signal, a weighted spatial-temporal representation is proposed, which emphasizes the face regions with rich blood vessels. In addition, characteristics of rPPG signals in different color spaces are jointly utilized. To improve the generalization capability, a lightweight EfficientNet with a Gated Recurrent Unit (GRU) is designed to extract both spatial and temporal features from the rPPG spatial-temporal representation for classification. The proposed method is compared with the state-of-the-art methods on five benchmark datasets under both intra-dataset and cross-dataset evaluations. The proposed method shows a significant and consistent improvement in performance over other state-of-the-art rPPG-based methods for face spoofing detection.
2023-05-24	Promoting Generalization in Cross-Dataset Remote Photoplethysmography	Nathan Vance et.al.	2305.15199v1	null	Remote Photoplethysmography (rPPG), or the remote monitoring of a subject's heart rate using a camera, has seen a shift from handcrafted techniques to deep learning models. While current solutions offer substantial performance gains, we show that these models tend to learn a bias to pulse wave features inherent to the training dataset. We develop augmentations to mitigate this learned bias by expanding both the range and variability of heart rates that the model sees while training, resulting in improved model convergence when training and cross-dataset generalization at test time. Through a 3-way cross dataset analysis we demonstrate a reduction in mean absolute error from over 13 beats per minute to below 3 beats per minute. We compare our method with other recent rPPG systems, finding similar performance under a variety of evaluation parameters.
2023-05-23	Amplitude-Independent Machine Learning for PPG through Visibility Graphs and Transfer Learning	Yuyang Miao et.al.	2305.14062v1	null	Photoplethysmography (PPG) signals are omnipresent in wearable devices, as they measure blood volume variations using LED technology. These signals provide insight into the body's circulatory system and can be employed to extract various bio-features, such as heart rate and vascular ageing. Although several algorithms have been proposed for this purpose, many exhibit limitations, including heavy reliance on human calibration, high signal quality requirements, and a lack of generalization. In this paper, we introduce a PPG signal processing framework that integrates graph theory and computer vision algorithms, which is invariant to affine transformations, offers rapid computation speed, and exhibits robust generalization across tasks and datasets.
2023-05-21	Your smartphone could act as a pulse-oximeter and as a single-lead ECG	Ahsan Mehmood et.al.	2305.12583v1	null	In the post-covid19 era, every new wave of the pandemic causes an increased concern among the masses to learn more about their state of well-being. Therefore, it is the need of the hour to come up with ubiquitous, low-cost, non-invasive tools for rapid and continuous monitoring of body vitals that reflect the status of one's overall health. Against this backdrop, this work proposes a deep learning approach to turn a smartphone, the popular hand-held personal gadget, into a diagnostic tool to measure/monitor the three most important body vitals, i.e., pulse rate (PR), blood oxygen saturation level (aka SpO2), and respiratory rate (RR). Furthermore, we propose another method that could extract a single-lead electrocardiograph (ECG) of the subject. The proposed methods include the following core steps: the subject records a short video of his/her fingertip by placing his/her finger on the rear camera of the smartphone, and the recorded video is pre-processed to extract the filtered and/or detrended video-photoplethysmography (vPPG) signal, which is then fed to custom-built convolutional neural networks (CNN), which eventually output the vitals (PR, SpO2, and RR) as well as a single-lead ECG of the subject. To be precise, the contribution of this paper is two-fold: 1) estimation of the three body vitals (PR, SpO2, RR) from the vPPG data using custom-built CNNs, a vision transformer, and most importantly the CLIP model; 2) a novel discrete cosine transform+feedforward neural network-based method that translates the recorded video-PPG signal to a single-lead ECG signal. The proposed method is anticipated to find its application in several use-case scenarios, e.g., remote healthcare, mobile health, fitness, sports, etc.
2023-05-09	Predicting Cardiovascular Disease Risk using Photoplethysmography and Deep Learning	Wei-Hung Weng et.al.	2305.05648v1	null	Cardiovascular diseases (CVDs) are responsible for a large proportion of premature deaths in low- and middle-income countries. Early CVD detection and intervention is critical in these populations, yet many existing CVD risk scores require a physical examination or lab measurements, which can be challenging in such health systems due to limited accessibility. Here we investigated the potential to use photoplethysmography (PPG), a sensing technology available on most smartphones that can potentially enable large-scale screening at low cost, for CVD risk prediction. We developed a deep learning PPG-based CVD risk score (DLS) to predict the probability of having major adverse cardiovascular events (MACE: non-fatal myocardial infarction, stroke, and cardiovascular death) within ten years, given only age, sex, smoking status and PPG as predictors. We compared the DLS with the office-based refit-WHO score, which adopts the shared predictors from WHO and Globorisk scores (age, sex, smoking status, height, weight and systolic blood pressure) but refitted on the UK Biobank (UKB) cohort. In the UKB cohort, the DLS's C-statistic (71.1%, 95% CI 69.9-72.4) was non-inferior to that of the office-based refit-WHO score (70.9%, 95% CI 69.7-72.2; non-inferiority margin of 2.5%, p<0.01). The calibration of the DLS was satisfactory, with a 1.8% mean absolute calibration error. Adding DLS features to the office-based score increased the C-statistic by 1.0% (95% CI 0.6-1.4). The DLS predicts ten-year MACE risk comparably to the office-based refit-WHO score. It provides a proof-of-concept and suggests the potential of PPG-based strategies for community-based primary prevention in resource-limited regions.
2023-04-28	Non-Contact Heart Rate Measurement from Deteriorated Videos	Nhi Nguyen et.al.	2304.14789v1	null	Remote photoplethysmography (rPPG) offers a state-of-the-art, non-contact methodology for estimating human pulse by analyzing facial videos. Despite its potential, rPPG methods can be susceptible to various artifacts, such as noise, occlusions, and other obstructions caused by sunglasses, masks, or even involuntary facial contact, such as individuals inadvertently touching their faces. In this study, we apply image processing transformations to intentionally degrade video quality, mimicking these challenging conditions, and subsequently evaluate the performance of both non-learning and learning-based rPPG methods on the deteriorated data. Our results reveal a significant decrease in accuracy in the presence of these artifacts, prompting us to propose the application of restoration techniques, such as denoising and inpainting, to improve heart-rate estimation outcomes. By addressing these challenging conditions and occlusion artifacts, our approach aims to make rPPG methods more robust and adaptable to real-world situations. To assess the effectiveness of our proposed methods, we undertake comprehensive experiments on three publicly available datasets, encompassing a wide range of scenarios and artifact types. Our findings underscore the potential to construct a robust rPPG system by employing an optimal combination of restoration algorithms and rPPG techniques. Moreover, our study contributes to the advancement of privacy-conscious rPPG methodologies, thereby bolstering the overall utility and impact of this innovative technology in the field of remote heart-rate estimation under realistic and diverse conditions.
2023-04-21	Heart Rate Extraction from Abdominal Audio Signals	Jake Stuchbury-Wass et.al.	2304.11020v1	null	Abdominal sounds (ABS) have been traditionally used for assessing gastrointestinal (GI) disorders. However, the assessment requires a trained medical professional to perform multiple abdominal auscultation sessions, which is resource-intense and may fail to provide an accurate picture of patients' continuous GI wellbeing. This has generated a technological interest in developing wearables for continuous capture of ABS, which enables a fuller picture of patient's GI status to be obtained at reduced cost. This paper seeks to evaluate the feasibility of extracting heart rate (HR) from such ABS monitoring devices. The collection of HR directly from these devices would enable gathering vital signs alongside GI data without the need for additional wearable devices, providing further cost benefits and improving general usability. We utilised a dataset containing 104 hours of ABS audio, collected from the abdomen using an e-stethoscope, and electrocardiogram as ground truth. Our evaluation shows for the first time that we can successfully extract HR from audio collected from a wearable on the abdomen. As heart sounds collected from the abdomen suffer from significant noise from GI and respiratory tracts, we leverage wavelet denoising for improved heart beat detection. The mean absolute error of the algorithm for average HR is 3.4 BPM with mean directional error of -1.2 BPM over the whole dataset. A comparison to photoplethysmography-based wearable HR sensors shows that our approach exhibits comparable accuracy to consumer wrist-worn wearables for average and instantaneous heart rate.
2023-04-21	IoT-Based Solution for Paraplegic Sufferer to Send Signals to Physician via Internet	L. Srinivasan et.al.	2304.10840v1	null	We come across hospitals and non-profit organizations that care for people with paralysis, in whom all or part of the body has been incapacitated. Due to a lack of motor coordination, these persons are typically unable to communicate their requirements because they cannot speak clearly or use sign language. In such a case, we suggest a system that enables a disabled person to move any part of the body capable of moving to display a text on an LCD. This method also addresses the circumstance in which the patient cannot be attended to in person and instead sends an SMS message using GSM. Our suggested system operates by detecting the tilt direction of the user's body part. As a result, patients can communicate with physicians, therapists, or their loved ones at home or work over the web. Patient-specific data, such as heart rate, must be continuously reported in health centers. The suggested method tracks the patient's pulse rate and other comparable data. For instance, photoplethysmography is used to assess heart rate. The decoded periodic data is transmitted continually via a microcontroller coupled to a transmitting module. The physician's cabin contains a receiver device that obtains and deciphers the data and continuously displays it on graphical interfaces viewable on a laptop. As a result, the physician can monitor and handle multiple patients at once.
2023-04-14	PPG Signals for Hypertension Diagnosis: A Novel Method using Deep Learning Models	Graham Frederick et.al.	2304.06952v1	null	Hypertension is a medical condition characterized by high blood pressure, and classifying it into its various stages is crucial to managing the disease. In this project, a novel method is proposed for classifying stages of hypertension using Photoplethysmography (PPG) signals and deep learning models, namely AvgPool_VGG-16. The PPG signal is a non-invasive method of measuring blood pressure through the use of light sensors that measure the changes in blood volume in the microvasculature of tissues. PPG images from the publicly available blood pressure classification dataset were used to train the model. Multiclass classification of the various stages was performed. The results show the proposed method achieves high accuracy in classifying hypertension stages, demonstrating the potential of PPG signals and deep learning models in hypertension diagnosis and management.
2023-04-05	Deep Learning Systems for Advanced Driving Assistance	Francesco Rundo et.al.	2304.06041v1	null	Next generation cars embed intelligent assessment of car driving safety through innovative solutions often based on usage of artificial intelligence. Driving safety monitoring can be carried out using several methodologies widely treated in scientific literature. In this context, the author proposes an innovative approach that uses an ad-hoc bio-sensing system suitable to reconstruct the physio-based attentional status of the car driver. To reconstruct the car driver's physiological status, the author proposes the use of a bio-sensing probe consisting of LEDs in the near-infrared (NiR) spectrum coupled with a photodetector. This probe, placed over the monitored subject, allows detection of a physiological signal called PhotoPlethysmoGraphy (PPG). The PPG signal formation is regulated by the change in oxygenated and non-oxygenated hemoglobin concentration in the monitored subject's bloodstream, which is directly connected to cardiac activity, in turn regulated by the Autonomic Nervous System (ANS) that characterizes the subject's attention level. The car driver drowsiness monitoring designed in this way will be combined with further driving safety assessment based on correlated intelligent driving scenario understanding.
2023-03-23	Efficient and Direct Inference of Heart Rate Variability using Both Signal Processing and Machine Learning	Yuntong Zhang et.al.	2303.13637v1	null	Heart Rate Variability (HRV) measures the variation of the time between consecutive heartbeats and is a major indicator of physical and mental health. Recent research has demonstrated that photoplethysmography (PPG) sensors can be used to infer HRV. However, many prior studies had high errors because they only employed signal processing or machine learning (ML), or because they indirectly inferred HRV, or because large training datasets were lacking. Many prior studies may also require large ML models. The low accuracy and large model sizes limit their applications to small embedded devices and potential future use in healthcare. To address the above issues, we first collected a large dataset of PPG signals and HRV ground truth. With this dataset, we developed HRV models that combine signal processing and ML to directly infer HRV. Evaluation results show that our method had errors between 3.5% and 25.7% and outperformed signal-processing-only and ML-only methods. We also explored different ML models, which showed that Decision Trees and Multi-layer Perceptrons have 13.0% and 9.1% errors on average, with models of at most hundreds of KB and inference times of less than 1 ms. Hence, they are more suitable for small embedded devices and potentially enable the future use of PPG-based HRV monitoring in healthcare.
2023-03-23	PPG-based Heart Rate Estimation with Efficient Sensor Sampling and Learning Models	Yuntong Zhang et.al.	2303.13636v1	null	Recent studies showed that Photoplethysmography (PPG) sensors embedded in wearable devices can estimate heart rate (HR) with high accuracy. However, despite prior research efforts, applying PPG-sensor-based HR estimation to embedded devices still faces challenges due to the energy-intensive high-frequency PPG sampling and the resource-intensive machine-learning models. In this work, we aim to explore HR estimation techniques that are more suitable for lower-power and resource-constrained embedded devices. More specifically, we seek to design techniques that could provide high-accuracy HR estimation with low-frequency PPG sampling, small model size, and fast inference time. First, we show that by combining signal processing and ML, it is possible to reduce the PPG sampling frequency from 125 Hz to only 25 Hz while providing higher HR estimation accuracy. This combination also helps to reduce the ML model feature size, leading to smaller models. Additionally, we present a comprehensive analysis of different ML models and feature sizes to compare their accuracy, model size, and inference time. The models explored include Decision Tree (DT), Random Forest (RF), K-nearest neighbor (KNN), Support vector machines (SVM), and Multi-layer perceptron (MLP). Experiments were conducted using both a widely-utilized dataset and our self-collected dataset. The experimental results show that our method, which combines signal processing and ML, had only 5% error for HR estimation using low-frequency PPG data. Moreover, our analysis showed that DT models with 10 to 20 input features usually have good accuracy while being several orders of magnitude smaller in model size and faster in inference time.
2023-03-21	Motion Matters: Neural Motion Transfer for Better Camera Physiological Sensing	Akshay Paruchuri et.al.	2303.12059v2	link	Machine learning models for camera-based physiological measurement can have weak generalization due to a lack of representative training data. Body motion is one of the most significant sources of noise when attempting to recover the subtle cardiac pulse from a video. We explore motion transfer as a form of data augmentation to introduce motion variation while preserving physiological changes. We adapt a neural video synthesis approach to augment videos for the task of remote photoplethysmography (PPG) and study the effects of motion augmentation with respect to 1) the magnitude and 2) the type of motion. After training on motion-augmented versions of publicly available datasets, the presented inter-dataset results on five benchmark datasets show improvements of up to 75% over existing state-of-the-art results. Our findings illustrate the utility of motion transfer as a data augmentation technique for improving the generalization of models for camera-based physiological sensing. We release our code and pre-trained models for using motion transfer as a data augmentation technique on our project page: https://motion-matters.github.io/
2023-03-17	HDformer: A Higher Dimensional Transformer for Diabetes Detection Utilizing Long Range Vascular Signals	Ella Lan et.al.	2303.11340v1	null	Diabetes mellitus is a worldwide concern, and early detection can help to prevent serious complications. Low-cost, non-invasive detection methods, which take cardiovascular signals into deep learning models, have emerged. However, limited accuracy constrains their clinical usage. In this paper, we present a new Transformer-based architecture, Higher Dimensional Transformer (HDformer), which takes long-range photoplethysmography (PPG) signals to detect diabetes. The long-range PPG contains broader and deeper signal contextual information compared to the less-than-one-minute PPG signals commonly utilized in existing research. To increase the capability and efficiency of processing the long-range data, we propose a new attention module, Time Square Attention (TSA), reducing the volume of the tokens by more than 10x while retaining the local/global dependencies. It converts the 1-dimensional inputs into 2-dimensional representations and groups adjacent points into a single 2D token, using the 2D Transformer models as the backbone of the encoder. It generates the dynamic patch sizes into a gated mixture-of-experts (MoE) network as decoder, which optimizes the learning on different attention areas. Extensive experiments show that HDformer achieves state-of-the-art performance (sensitivity 98.4, accuracy 97.3, specificity 92.8, and AUC 0.929) on the standard MIMIC-III dataset, surpassing existing studies. This work is the first to apply a Transformer to long-range, non-invasive PPG signals for diabetes detection, achieving a more scalable and convenient solution compared to traditional invasive approaches. The proposed HDformer can also be scaled to analyze general long-range biomedical waveforms. A wearable prototype finger-ring is designed as a proof of concept.
2023-03-16	Full-Body Cardiovascular Sensing with Remote Photoplethysmography	Lu Niu et.al.	2303.09638v1	null	Remote photoplethysmography (rPPG) allows for noncontact monitoring of blood volume changes from a camera by detecting minor fluctuations in reflected light. Prior applications of rPPG focused on face videos. In this paper we explored the feasibility of rPPG from non-face body regions such as the arms, legs, and hands. We collected a new dataset titled Multi-Site Physiological Monitoring (MSPM), which will be released with this paper. The dataset consists of 90 frames per second video of exposed arms, legs, and face, along with 10 synchronized PPG recordings. We performed baseline heart rate estimation experiments from non-face regions with several state-of-the-art rPPG approaches, including chrominance-based (CHROM), plane-orthogonal-to-skin (POS) and RemotePulseNet (RPNet). To our knowledge, this is the first evaluation of the fidelity of rPPG signals simultaneously obtained from multiple regions of a human body. Our experiments showed that skin pixels from arms, legs, and hands are all potential sources of the blood volume pulse. The best-performing approach, POS, achieved a mean absolute error peaking at 7.11 beats per minute from non-facial body parts compared to 1.38 beats per minute from the face. Additionally, we performed experiments on pulse transit time (PTT) from both the contact PPG and rPPG signals. We found that remote PTT is possible with moderately high frame rate video when distal locations on the body are visible. These findings and the supporting dataset should facilitate new research on non-face rPPG and monitoring blood flow dynamics over the whole body with a camera.
2023-03-16	Image Enhancement for Remote Photoplethysmography in a Low-Light Environment	Lin Xi et.al.	2303.09336v1	link	With the improvement of sensor technology and significant algorithmic advances, the accuracy of remote heart rate monitoring technology has been significantly improved. Despite these advances, the performance of rPPG algorithms can degrade during long-term, high-intensity continuous work that occurs in the evening or in poorly lit environments. One of the main challenges is that lost facial detail and low contrast cause detection and tracking to fail. Also, insufficient lighting during video capture hurts the quality of the physiological signal. In this paper, we collect a large-scale dataset designed for remote heart rate estimation, recorded under various illumination conditions, to evaluate the performance of rPPG algorithms (Green, ICA, and POS). We also propose a low-light enhancement solution (technical solution) for remote heart rate estimation under low-light conditions. Using the collected dataset, we found that 1) the face detection algorithm cannot detect faces in video captured in low-light conditions; 2) a decrease in the amplitude of the pulsatile signal causes noise to dominate; and 3) the chrominance-based method suffers because its assumption about skin tone does not hold, and the Green and ICA methods are less affected than POS in dark environments. The proposed solution for the rPPG process is effective for detection and improves the signal-to-noise ratio and precision of the pulsatile signal.
2023-03-14	Non-Contrastive Unsupervised Learning of Physiological Signals from Video	Jeremy Speth et.al.	2303.07944v1	link	Subtle periodic signals such as blood volume pulse and respiration can be extracted from RGB video, enabling remote health monitoring at low cost. Advancements in remote pulse estimation -- or remote photoplethysmography (rPPG) -- are currently driven by deep learning solutions. However, modern approaches are trained and evaluated on benchmark datasets with associated ground truth from contact-PPG sensors. We present the first non-contrastive unsupervised learning framework for signal regression to break free from the constraints of labelled video data. With minimal assumptions of periodicity and finite bandwidth, our approach is capable of discovering the blood volume pulse directly from unlabelled videos. We find that encouraging sparse power spectra within normal physiological bandlimits and variance over batches of power spectra is sufficient for learning visual features of periodic signals. We perform the first experiments utilizing unlabelled video data not specifically created for rPPG to train robust pulse rate estimators. Given the limited inductive biases and impressive empirical results, the approach is theoretically capable of discovering other periodic signals from video, enabling multiple physiological measurements without the need for ground truth signals. Codes to fully reproduce the experiments are made available along with the paper.
2023-03-14	ForDigitStress: A multi-modal stress dataset employing a digital job interview scenario	Alexander Heimerl et.al.	2303.07742v1	null	We present a multi-modal stress dataset that uses digital job interviews to induce stress. The dataset provides multi-modal data of 40 participants including audio, video (motion capturing, facial recognition, eye tracking) as well as physiological information (photoplethysmography, electrodermal activity). In addition to that, the dataset contains time-continuous annotations for stress and occurred emotions (e.g. shame, anger, anxiety, surprise). In order to establish a baseline, five different machine learning classifiers (Support Vector Machine, K-Nearest Neighbors, Random Forest, Long-Short-Term Memory Network) have been trained and evaluated on the proposed dataset for a binary stress classification task. The best-performing classifier achieved an accuracy of 88.3% and an F1-score of 87.5%.

camera

wearable camera

Publish Date	Title	Authors	PDF	Code	Abstract
2023-07-27	PointOdyssey: A Large-Scale Synthetic Dataset for Long-Term Point Tracking	Yang Zheng et.al.	2307.15055v1	null	We introduce PointOdyssey, a large-scale synthetic dataset, and data generation framework, for the training and evaluation of long-term fine-grained tracking algorithms. Our goal is to advance the state-of-the-art by placing emphasis on long videos with naturalistic motion. Toward the goal of naturalism, we animate deformable characters using real-world motion capture data, we build 3D scenes to match the motion capture environments, and we render camera viewpoints using trajectories mined via structure-from-motion on real videos. We create combinatorial diversity by randomizing character appearance, motion profiles, materials, lighting, 3D assets, and atmospheric effects. Our dataset currently includes 104 videos, averaging 2,000 frames long, with orders of magnitude more correspondence annotations than prior work. We show that existing methods can be trained from scratch in our dataset and outperform the published variants. Finally, we introduce modifications to the PIPs point tracking method, greatly widening its temporal receptive field, which improves its performance on PointOdyssey as well as on two real-world benchmarks. Our data and code are publicly available at: https://pointodyssey.com
2023-07-27	MapNeRF: Incorporating Map Priors into Neural Radiance Fields for Driving View Simulation	Chenming Wu et.al.	2307.14981v1	null	Simulating camera sensors is a crucial task in autonomous driving. Although neural radiance fields are exceptional at synthesizing photorealistic views in driving simulations, they still fail in generating extrapolated views. This paper proposes to incorporate map priors into neural radiance fields to synthesize out-of-trajectory driving views with semantic road consistency. The key insight is that map information can be utilized as a prior to guide the training of the radiance fields with uncertainty. Specifically, we utilize the coarse ground surface as uncertain information to supervise the density field and warp depth with uncertainty from unknown camera poses to ensure multi-view consistency. Experimental results demonstrate that our approach can produce semantic consistency in deviated views for vehicle camera simulation.
2023-07-27	GET3D--: Learning GET3D from Unconstrained Image Collections	Fanghua Yu et.al.	2307.14918v1	null	The demand for efficient 3D model generation techniques has grown exponentially, as manual creation of 3D models is time-consuming and requires specialized expertise. While generative models have shown potential in creating 3D textured shapes from 2D images, their applicability in 3D industries is limited due to the lack of a well-defined camera distribution in real-world scenarios, resulting in low-quality shapes. To overcome this limitation, we propose GET3D--, the first method that directly generates textured 3D shapes from 2D images with unknown pose and scale. GET3D-- comprises a 3D shape generator and a learnable camera sampler that captures the 6D external changes on the camera. In addition, we propose a novel training schedule to stably optimize both the shape generator and camera sampler in a unified framework. By controlling external variations using the learnable camera sampler, our method can generate aligned shapes with clear textures. Extensive experiments demonstrate the efficacy of GET3D--, which precisely fits the 6D camera pose distribution and generates high-quality shapes on both synthetic and realistic unconstrained datasets.
2023-07-27	Weakly Supervised Multi-Modal 3D Human Body Pose Estimation for Autonomous Driving	Peter Bauer et.al.	2307.14889v1	null	Accurate 3D human pose estimation (3D HPE) is crucial for enabling autonomous vehicles (AVs) to make informed decisions and respond proactively in critical road scenarios. Promising results of 3D HPE have been gained in several domains such as human-computer interaction, robotics, sports and medical analytics, often based on data collected in well-controlled laboratory environments. Nevertheless, the transfer of 3D HPE methods to AVs has received limited research attention, due to the challenges posed by obtaining accurate 3D pose annotations and the limited suitability of data from other domains. We present a simple yet efficient weakly supervised approach for 3D HPE in the AV context by employing a high-level sensor fusion between camera and LiDAR data. The weakly supervised setting enables training on the target datasets without any 2D/3D keypoint labels by using an off-the-shelf 2D joint extractor and pseudo labels generated from LiDAR to image projections. Our approach outperforms state-of-the-art results by up to $\sim$ 13% on the Waymo Open Dataset in the weakly supervised setting and achieves state-of-the-art results in the supervised setting.
2023-07-27	Learning Full-Head 3D GANs from a Single-View Portrait Dataset	Yiqian Wu et.al.	2307.14770v1	null	3D-aware face generators are commonly trained on 2D real-life face image datasets. Nevertheless, existing facial recognition methods often struggle to extract face data captured from various camera angles. Furthermore, in-the-wild images with diverse body poses introduce a high-dimensional challenge for 3D-aware generators, making it difficult to utilize data that contains complete neck and shoulder regions. Consequently, these face image datasets often contain only near-frontal face data, which poses challenges for 3D-aware face generators to construct \textit{full-head} 3D portraits. To this end, we first create the dataset {$\it{360}^{\circ}$}-\textit{Portrait}-\textit{HQ} (\textit{$\it{360}^{\circ}$PHQ}), which consists of high-quality single-view real portraits annotated with a variety of camera parameters {(the yaw angles span the entire $360^{\circ}$ range)} and body poses. We then propose \textit{3DPortraitGAN}, the first 3D-aware full-head portrait generator that learns a canonical 3D avatar distribution from the body-pose-various \textit{$\it{360}^{\circ}$PHQ} dataset with body pose self-learning. Our model can generate view-consistent portrait images from all camera angles (${360}^{\circ}$) with a full-head 3D representation. We incorporate a mesh-guided deformation field into volumetric rendering to produce deformed results to generate portrait images that conform to the body pose distribution of the dataset using our canonical generator. We integrate two pose predictors into our framework to predict more accurate body poses to address the issue of inaccurately estimated body poses in our dataset. Our experiments show that the proposed framework can generate view-consistent, realistic portrait images with complete geometry from all camera angles and accurately predict portrait body pose.
2023-07-27	High Dynamic Range Imaging via Visual Attention Modules	Ali Reza Omrani et.al.	2307.14705v1	link	Thanks to High Dynamic Range (HDR) imaging methods, the scope of photography has seen profound changes recently. To be more specific, such methods try to reconstruct the lost luminosity of the real world caused by the limitation of regular cameras from the Low Dynamic Range (LDR) images. Additionally, although the State-Of-The-Art methods in this topic perform well, they mainly concentrate on combining different exposures and pay less attention to extracting the informative parts of the images. Thus, this paper aims to introduce a new model capable of incorporating information from the most visible areas of each image extracted by a visual attention module (VAM), which is a result of a segmentation strategy. In particular, the model, based on a deep learning architecture, utilizes the extracted areas to produce the final HDR image. The results demonstrate that our method outperformed most of the State-Of-The-Art algorithms.
2023-07-27	FS-Depth: Focal-and-Scale Depth Estimation from a Single Image in Unseen Indoor Scene	Chengrui Wei et.al.	2307.14624v1	null	It has long been an ill-posed problem to predict absolute depth maps from single images in real (unseen) indoor scenes. We observe that it is essentially due to not only the scale-ambiguous problem but also the focal-ambiguous problem that decreases the generalization ability of monocular depth estimation. That is, images may be captured by cameras of different focal lengths in scenes of different scales. In this paper, we develop a focal-and-scale depth estimation model to well learn absolute depth maps from single images in unseen indoor scenes. First, a relative depth estimation network is adopted to learn relative depths from single images with diverse scales/semantics. Second, multi-scale features are generated by mapping a single focal length value to focal length features and concatenating them with intermediate features of different scales in relative depth estimation. Finally, relative depths and multi-scale features are jointly fed into an absolute depth estimation network. In addition, a new pipeline is developed to augment the diversity of focal lengths of public datasets, which are often captured with cameras of the same or similar focal lengths. Our model is trained on augmented NYUDv2 and tested on three unseen datasets. Our model considerably improves the generalization ability of depth estimation by 41%/13% (RMSE) with/without data augmentation compared with five recent SOTAs and well alleviates the deformation problem in 3D reconstruction. Notably, our model well maintains the accuracy of depth estimation on original NYUDv2.
2023-07-27	White-light superflare and long-term activity of the nearby M7 type binary EI~Cnc observed with GWAC system	Hua-Li Li et.al.	2307.14594v1	null	Stellar white-light flares are believed to play an essential role on the physical and chemical properties of the atmosphere of the surrounding exoplanets. Here we report an optical monitoring campaign on the nearby flaring system EI~Cnc carried out by the Ground-based Wide Angle Cameras (GWAC) and its dedicated follow-up telescope. A superflare, coming from the brighter component EI~CncA, was detected and observed, in which four components are required to properly model the complex decay light curve. The lower limit of flare energy in the $R-$band is estimated to be $3.3\times10^{32}$ ergs. 27 flares are additionally detected from the GWAC archive data with a total duration of 290 hours. The inferred cumulative flare frequency distribution follows a quite shallow power-law function with a slope of $\beta=-0.50\pm 0.03$ over the energy range between $10^{30}$ and $10^{33}$ erg, which reinforces the trend that stars cooler than M4 show enhanced superflare activity. The flares identified in EI~Cnc enable us to extend the $\tau-E$ relationship previously established in the white-light superflares of solar-type stars down to an energy as low as $\sim10^{30}$erg (i.e., by three orders): $\tau\propto E^{0.42\pm0.02}$, which suggests a common flare mechanism for stars with a type from M to solar-like, and implies an invariant of $B^{1/3}\upsilon_{\rm A}$ in the white-light flares.
2023-07-27	MCPA: Multi-scale Cross Perceptron Attention Network for 2D Medical Image Segmentation	Liang Xu et.al.	2307.14588v1	link	The UNet architecture, based on Convolutional Neural Networks (CNN), has demonstrated its remarkable performance in medical image analysis. However, it faces challenges in capturing long-range dependencies due to the limited receptive fields and inherent bias of convolutional operations. Recently, numerous transformer-based techniques have been incorporated into the UNet architecture to overcome this limitation by effectively capturing global feature correlations. However, the integration of the Transformer modules may result in the loss of local contextual information during the global feature fusion process. To overcome these challenges, we propose a 2D medical image segmentation model called Multi-scale Cross Perceptron Attention Network (MCPA). The MCPA consists of three main components: an encoder, a decoder, and a Cross Perceptron. The Cross Perceptron first captures the local correlations using multiple Multi-scale Cross Perceptron modules, facilitating the fusion of features across scales. The resulting multi-scale feature vectors are then spatially unfolded, concatenated, and fed through a Global Perceptron module to model global dependencies. Furthermore, we introduce a Progressive Dual-branch Structure to address the semantic segmentation of the image involving finer tissue structures. This structure gradually shifts the segmentation focus of MCPA network training from large-scale structural features to more sophisticated pixel-level features. We evaluate our proposed MCPA model on several publicly available medical image datasets from different tasks and devices, including the open large-scale dataset of CT (Synapse), MRI (ACDC), fundus camera (DRIVE, CHASE_DB1, HRF), and OCTA (ROSE). The experimental results show that our MCPA model achieves state-of-the-art performance. The code is available at https://github.com/simonustc/MCPA-for-2D-Medical-Image-Segmentation.
2023-07-27	A Memory-Augmented Multi-Task Collaborative Framework for Unsupervised Traffic Accident Detection in Driving Videos	Rongqin Liang et.al.	2307.14575v1	null	Identifying traffic accidents in driving videos is crucial to ensuring the safety of autonomous driving and driver assistance systems. To address the potential danger caused by the long-tailed distribution of driving events, existing traffic accident detection (TAD) methods mainly rely on unsupervised learning. However, TAD is still challenging due to the rapid movement of cameras and dynamic scenes in driving scenarios. Existing unsupervised TAD methods mainly rely on a single pretext task, i.e., an appearance-based or future object localization task, to detect accidents. However, appearance-based approaches are easily disturbed by the rapid movement of the camera and changes in illumination, which significantly reduce the performance of traffic accident detection. Methods based on future object localization may fail to capture appearance changes in video frames, making it difficult to detect ego-involved accidents (e.g., out of control of the ego-vehicle). In this paper, we propose a novel memory-augmented multi-task collaborative framework (MAMTCF) for unsupervised traffic accident detection in driving videos. Different from previous approaches, our method can more accurately detect both ego-involved and non-ego accidents by simultaneously modeling appearance changes and object motions in video frames through the collaboration of optical flow reconstruction and future object localization tasks. Further, we introduce a memory-augmented motion representation mechanism to fully explore the interrelation between different types of motion representations and exploit the high-level features of normal traffic patterns stored in memory to augment motion representations, thus enlarging the difference from anomalies. Experimental results on recently published large-scale dataset demonstrate that our method achieves better performance compared to previous state-of-the-art approaches.
2023-07-26	Patterns of Vehicle Lights: Addressing Complexities in Curation and Annotation of Camera-Based Vehicle Light Datasets and Metrics	Ross Greer et.al.	2307.14521v1	null	This paper explores the representation of vehicle lights in computer vision and its implications for various tasks in the field of autonomous driving. Different specifications for representing vehicle lights, including bounding boxes, center points, corner points, and segmentation masks, are discussed in terms of their strengths and weaknesses. Three important tasks in autonomous driving that can benefit from vehicle light detection are identified: nighttime vehicle detection, 3D vehicle orientation estimation, and dynamic trajectory cues. Each task may require a different representation of the light. The challenges of collecting and annotating large datasets for training data-driven models are also addressed, leading to the introduction of the LISA Vehicle Lights Dataset and associated Light Visibility Model, which provides light annotations specifically designed for downstream applications in vehicle detection, intent and trajectory prediction, and safe path planning. A comparison of existing vehicle light datasets is provided, highlighting the unique features and limitations of each dataset. Overall, this paper provides insights into the representation of vehicle lights and the importance of accurate annotations for training effective detection models in autonomous driving applications. Our dataset and model are made available at https://cvrr.ucsd.edu/vehicle-lights-dataset
2023-07-26	Technical note: ShinyAnimalCV: open-source cloud-based web application for object detection, segmentation, and three-dimensional visualization of animals using computer vision	Jin Wang et.al.	2307.14487v1	null	Computer vision (CV), a non-intrusive and cost-effective technology, has furthered the development of precision livestock farming by enabling optimized decision-making through timely and individualized animal care. The availability of affordable two- and three-dimensional camera sensors, combined with various machine learning and deep learning algorithms, has provided a valuable opportunity to improve livestock production systems. However, despite the availability of various CV tools in the public domain, applying these tools to animal data can be challenging, often requiring users to have programming and data analysis skills, as well as access to computing resources. Moreover, the rapid expansion of precision livestock farming is creating a growing need to educate and train animal science students in CV. This presents educators with the challenge of efficiently demonstrating the complex algorithms involved in CV. Thus, the objective of this study was to develop ShinyAnimalCV, an open-source cloud-based web application. This application provides a user-friendly interface for performing CV tasks, including object segmentation, detection, three-dimensional surface visualization, and extraction of two- and three-dimensional morphological features. Nine pre-trained CV models using top-view animal data are included in the application. ShinyAnimalCV has been deployed online using cloud computing platforms. The source code of ShinyAnimalCV is available on GitHub, along with detailed documentation on training CV models using custom data and deploying ShinyAnimalCV locally to allow users to fully leverage the capabilities of the application. ShinyAnimalCV can contribute to CV research and teaching in the animal science community.
2023-07-26	AutoSourceID-Classifier. Star-Galaxy Classification using a Convolutional Neural Network with Spatial Information	F. Stoppa et.al.	2307.14456v1	null	Aims. Traditional star-galaxy classification techniques often rely on feature estimation from catalogues, a process susceptible to introducing inaccuracies, thereby potentially jeopardizing the classification's reliability. Certain galaxies, especially those not manifesting as extended sources, can be misclassified when their shape parameters and flux solely drive the inference. We aim to create a robust and accurate classification network for identifying stars and galaxies directly from astronomical images. By leveraging convolutional neural networks (CNN) and additional information about the source position, we aim to accurately classify all stars and galaxies within a survey, particularly those with a signal-to-noise ratio (S/N) near the detection limit. Methods. The AutoSourceID-Classifier (ASID-C) algorithm developed here uses 32x32 pixel single filter band source cutouts generated by the previously developed ASID-L code. ASID-C utilizes CNNs to distinguish these cutouts into stars or galaxies, leveraging their strong feature-learning capabilities. Subsequently, we employ a modified Platt Scaling calibration for the output of the CNN. This technique ensures that the derived probabilities are effectively calibrated, delivering precise and reliable results. Results. We show that ASID-C, trained on MeerLICHT telescope images and using the Dark Energy Camera Legacy Survey (DECaLS) morphological classification, outperforms similar codes like SourceExtractor. ASID-C opens up new possibilities for accurate celestial object classification, especially for sources with a S/N near the detection limit. Potential applications of ASID-C, like real-time star-galaxy classification and transient's host identification, promise significant contributions to astronomical research.
2023-07-26	US & MR Image-Fusion Based on Skin Co-Registration	Martina Paccini et.al.	2307.14288v1	null	The study and development of innovative solutions for the advanced visualisation, representation and analysis of medical images offer different research directions. Current practice in medical imaging consists in combining real-time US with imaging modalities that allow internal anatomy acquisitions, such as CT, MRI, PET or similar. Application of image-fusion approaches can be found in tracking surgical tools and/or needles, in real-time during interventions. Thus, this work proposes a fusion imaging system for the registration of CT and MRI images with real-time US acquisition leveraging a 3D camera sensor. The main focus of the work is the portability of the system and its applicability to different anatomical districts.
2023-07-26	Probing reflection from aerosols with the near-infrared dayside spectrum of WASP-80b	Bob Jacobs et.al.	2307.14399v1	null	The presence of aerosols is intimately linked to the global energy budget and composition of planet atmospheres. Their ability to reflect incoming light prevents energy from being deposited into the atmosphere, and they shape spectra of exoplanets. We observed one near-infrared secondary eclipse of WASP-80b with the Wide Field Camera 3 aboard the Hubble Space Telescope to provide constraints on the presence and properties of atmospheric aerosols. We detect a broadband eclipse depth of $34\pm10$ ppm for WASP-80b, making this the lowest equilibrium temperature planet for which a secondary eclipse has been detected so far with WFC3. We detect a higher planetary flux than expected from thermal emission alone at $1.6\sigma$ that hints toward the presence of reflecting aerosols on this planet's dayside. We paired the WFC3 data with Spitzer data and explored multiple atmospheric models with and without aerosols to interpret this spectrum. Albeit consistent with a clear dayside atmosphere, we found a slight preference for near-solar metallicities and for dayside clouds over hazes. We exclude soot haze formation rates higher than $10^{-10.7}$ g cm$^{-2}$s$^{-1}$ and tholin formation rates higher than $10^{-12.0}$ g cm$^{-2}$s$^{-1}$ at $3\sigma$. We applied the same atmospheric models to a previously published WFC3/Spitzer transmission spectrum for this planet and find weak haze formation. A single soot haze formation rate best fits both the dayside and the transmission spectra simultaneously. However, we emphasize that no models provide satisfactory fits in terms of chi-square of both spectra simultaneously, indicating longitudinal dissimilarity in the atmosphere's aerosol composition.
2023-07-26	DisguisOR: Holistic Face Anonymization for the Operating Room	Lennart Bastian et.al.	2307.14241v1	link	Purpose: Recent advances in Surgical Data Science (SDS) have contributed to an increase in video recordings from hospital environments. While methods such as surgical workflow recognition show potential in increasing the quality of patient care, the quantity of video data has surpassed the scale at which images can be manually anonymized. Existing automated 2D anonymization methods under-perform in Operating Rooms (OR), due to occlusions and obstructions. We propose to anonymize multi-view OR recordings using 3D data from multiple camera streams. Methods: RGB and depth images from multiple cameras are fused into a 3D point cloud representation of the scene. We then detect each individual's face in 3D by regressing a parametric human mesh model onto detected 3D human keypoints and aligning the face mesh with the fused 3D point cloud. The mesh model is rendered into every acquired camera view, replacing each individual's face. Results: Our method shows promise in locating faces at a higher rate than existing approaches. DisguisOR produces geometrically consistent anonymizations for each camera view, enabling more realistic anonymization that is less detrimental to downstream tasks. Conclusion: Frequent obstructions and crowding in operating rooms leaves significant room for improvement for off-the-shelf anonymization methods. DisguisOR addresses privacy on a scene level and has the potential to facilitate further research in SDS.
2023-07-26	The nature of the X-ray sources in dwarf galaxies in nearby clusters from the KIWICS	Şeyda Şen et.al.	2307.14230v1	null	We present a deep search for and analysis of X-ray sources in a sample of dwarf galaxies ($M_r < -15.5$ mag) located within twelve galaxy clusters from the Kapteyn IAC WEAVE INT Cluster Survey (KIWICS) of photometric observations in the $\textit{r}$ and $\textit{g}$ using the Wide Field Camera (WFC) at the 2.5-m Isaac Newton telescope (INT). We first investigated the optical data, identified 2720 dwarf galaxies in all fields and determined their characteristics; namely, their colors, effective radii, and stellar masses. We then searched the $\textit{Chandra}$ data archive for X-ray counterparts of optically detected dwarf galaxies. We found a total of 20 X-ray emitting dwarf galaxies, with X-ray flux ranging from 1.7$\times10^{-15}$ to 4.1$\times10^{-14}$ erg cm$^{-2}$ s$^{-1}$ and X-ray luminosities varying from 2$\times10^{39}$ to 5.4$\times10^{41}$ erg s$^{-1}$. Our results indicate that the X-ray luminosity of the sources in our sample is larger than the Eddington luminosity limit for a typical neutron star, even at the lowest observed levels. This leads us to conclude that the sources emitting X-rays in our sample are likely black holes. Additionally, we have employed a scaling relation between black hole and stellar mass to estimate the masses of the black holes in our sample, and have determined a range of black hole masses from 4.6$\times10^{4}$ to 1.5$\times10^{6}$ M$_\odot$. Finally, we find a trend between X-ray to optical flux ratio and X-ray flux. We discuss the implications of our findings and highlight the importance of X-ray observations in studying the properties of dwarf galaxies.
2023-07-26	Tackling Scattering and Reflective Flare in Mobile Camera Systems: A Raw Image Dataset for Enhanced Flare Removal	Fengbo Lan et.al.	2307.14180v1	null	The increasing prevalence of mobile devices has led to significant advancements in mobile camera systems and improved image quality. Nonetheless, mobile photography still grapples with challenging issues such as scattering and reflective flare. The absence of a comprehensive real image dataset tailored for mobile phones hinders the development of effective flare mitigation techniques. To address this issue, we present a novel raw image dataset specifically designed for mobile camera systems, focusing on flare removal. Capitalizing on the distinct properties of raw images, this dataset serves as a solid foundation for developing advanced flare removal algorithms. It encompasses a wide variety of real-world scenarios captured with diverse mobile devices and camera settings. The dataset comprises over 2,000 high-quality full-resolution raw image pairs for scattering flare and 1,100 for reflective flare, which can be further segmented into up to 30,000 and 2,200 paired patches, respectively, ensuring broad adaptability across various imaging conditions. Experimental results demonstrate that networks trained with synthesized data struggle to cope with complex lighting settings present in this real image dataset. We also show that processing data through a mobile phone's internal ISP compromises image quality while using raw image data presents significant advantages for addressing the flare removal problem. Our dataset is expected to enable an array of new research in flare removal and contribute to substantial improvements in mobile image quality, benefiting mobile photographers and end-users alike.
2023-07-26	Memory-Efficient Graph Convolutional Networks for Object Classification and Detection with Event Cameras	Kamil Jeziorek et.al.	2307.14124v1	null	Recent advances in event camera research emphasize processing data in its original sparse form, which allows the use of its unique features such as high temporal resolution, high dynamic range, low latency, and resistance to image blur. One promising approach for analyzing event data is through graph convolutional networks (GCNs). However, current research in this domain primarily focuses on optimizing computational costs, neglecting the associated memory costs. In this paper, we consider both factors together in order to achieve satisfying results and relatively low model complexity. For this purpose, we performed a comparative analysis of different graph convolution operations, considering factors such as execution time, the number of trainable model parameters, data format requirements, and training outcomes. Our results show a 450-fold reduction in the number of parameters for the feature extraction module and a 4.5-fold reduction in the size of the data representation while maintaining a classification accuracy of 52.3%, which is 6.3% higher compared to the operation used in state-of-the-art approaches. To further evaluate performance, we implemented the object detection architecture and evaluated its performance on the N-Caltech101 dataset. The results showed an accuracy of 53.7 % mAP@0.5 and reached an execution rate of 82 graphs per second.
2023-07-26	Learning heterogeneous delays in a layer of spiking neurons for fast motion detection	Antoine Grimaldi et.al.	2307.14077v1	null	The precise timing of spikes emitted by neurons plays a crucial role in shaping the response of efferent biological neurons. This temporal dimension of neural activity holds significant importance in understanding information processing in neurobiology, especially for the performance of neuromorphic hardware, such as event-based cameras. Nonetheless, many artificial neural models disregard this critical temporal dimension of neural activity. In this study, we present a model designed to efficiently detect temporal spiking motifs using a layer of spiking neurons equipped with heterogeneous synaptic delays. Our model capitalizes on the diverse synaptic delays present on the dendritic tree, enabling specific arrangements of temporally precise synaptic inputs to synchronize upon reaching the basal dendritic tree. We formalize this process as a time-invariant logistic regression, which can be trained using labeled data. To demonstrate its practical efficacy, we apply the model to naturalistic videos transformed into event streams, simulating the output of the biological retina or event-based cameras. To evaluate the robustness of the model in detecting visual motion, we conduct experiments by selectively pruning weights and demonstrate that the model remains efficient even under significantly reduced workloads. In conclusion, by providing a comprehensive, event-driven computational building block, the incorporation of heterogeneous delays has the potential to greatly improve the performance of future spiking neural network algorithms, particularly in the context of neuromorphic chips.
2023-07-26	Three-year performance of the IceAct telescopes at the IceCube Neutrino Observatory	Lars Heuermann et.al.	2307.13969v1	null	IceAct is an array of compact Imaging Air Cherenkov Telescopes at the ice surface as part of the IceCube Neutrino Observatory. The telescopes, featuring a camera of 61 silicon photomultipliers and fresnel-lens-based optics, are optimized to be operated in harsh environmental conditions, such as at the South Pole. Since 2019, the first two telescopes have been operating in a stereoscopic configuration in the center of IceCube's surface detector IceTop. With an energy threshold of about 10 TeV and a wide field-of-view, the IceAct telescopes show promising capabilities of improving current cosmic-ray composition studies: measuring the Cherenkov light emissions in the atmosphere adds new information about the shower development not accessible with the current detectors. First simulations indicate that the added information of a single telescope leads, e.g., to an improved discrimination between flux contributions from different primary particle species in the sensitive energy range. We review the performance and detector operations of the telescopes during the past 3 years (2020-2022) and give an outlook on the future of IceAct.
2023-07-26	Towards a cosmic ray composition measurement with the IceAct telescopes at the IceCube Neutrino Observatory	Larissa Paul et.al.	2307.13965v1	null	The IceCube Neutrino Observatory is equipped with the unique possibility to measure cosmic ray induced air showers simultaneously by their particle footprint on the surface with the IceTop detector and by the high-energy muonic shower component at a depth of more than 1.5 km. Since 2019 additionally two Imaging Air Cherenkov Telescopes, called IceAct, measure the electromagnetic component of air showers in the atmosphere above the IceCube detector. This opens the possibility to measure air shower parameters in three independent detectors and allows to improve mass composition studies with the IceCube data. One IceAct camera consists of 61 SiPM pixels in a hexagonal grid. Each pixel has a field of view of 1.5 degree resulting in an approximately 12-degree field of view per camera. A single telescope tube has a diameter of 50 cm, is built robust enough to withstand the harsh Antarctic conditions, and is able to detect cosmic ray particles with energies above approximately 10 TeV. A Graph Neural Network (GNN) is trained to determine the air shower properties from IceAct data. The composition analysis is then performed using Random Forest Regression (RF). Since all three detectors have a different energy threshold, we train several RFs with different inputs, combining the different detectors and taking advantage of the lower energy threshold of the IceAct telescopes. This will result in composition measurements for different detector combinations and enables cross-checks of the results in overlapping energy bands. We present the method, parameters for data selection, and the status of this analysis.
2023-07-25	Decisive Data using Multi-Modality Optical Sensors for Advanced Vehicular Systems	Muhammad Ali Farooq et.al.	2307.13600v1	null	Optical sensors have played a pivotal role in acquiring real-world data for critical applications. This data, when integrated with advanced machine learning algorithms, provides meaningful information, thus enhancing human vision. This paper focuses on various optical technologies for the design and development of state-of-the-art out-cabin forward vision systems and in-cabin driver monitoring systems. The optical sensors covered include Longwave Thermal Imaging (LWIR) cameras, Near Infrared (NIR) cameras, Neuromorphic/event cameras, Visible CMOS cameras and Depth cameras. Further, the paper discusses different potential applications which can be employed using the unique strengths of each of these optical modalities in real-time environments.
2023-07-25	HeightFormer: Explicit Height Modeling without Extra Data for Camera-only 3D Object Detection in Bird's Eye View	Yiming Wu et.al.	2307.13510v1	null	Vision-based Bird's Eye View (BEV) representation is an emerging perception formulation for autonomous driving. The core challenge is to construct BEV space with multi-camera features, which is a one-to-many ill-posed problem. Examining previous BEV representation generation methods, we found that most of them fall into two types: modeling depths in image views or modeling heights in the BEV space, mostly in an implicit way. In this work, we propose to explicitly model heights in the BEV space, which needs no extra data like LiDAR and can fit arbitrary camera rigs and types compared to modeling depths. Theoretically, we give a proof of the equivalence between height-based methods and depth-based methods. Considering the equivalence and some advantages of modeling heights, we propose HeightFormer, which models heights and uncertainties in a self-recursive way. Without any extra data, the proposed HeightFormer can estimate heights in BEV accurately. Benchmark results show that HeightFormer achieves state-of-the-art (SOTA) performance among camera-only methods.
2023-07-25	Prior Based Online Lane Graph Extraction from Single Onboard Camera Image	Yigit Baran Can et.al.	2307.13344v1	null	The local road network information is essential for autonomous navigation. This information is commonly obtained from offline HD-Maps in terms of lane graphs. However, the local road network at a given moment can be drastically different from the one given in the offline maps due to construction work, accidents, etc. Moreover, the autonomous vehicle might be at a location not covered in the offline HD-Map. Thus, online estimation of the lane graph is crucial for widespread and reliable autonomous navigation. In this work, we tackle online Bird's-Eye-View lane graph extraction from a single onboard camera image. We propose to use prior information to increase the quality of the estimations. The prior is extracted from the dataset through a transformer-based Wasserstein Autoencoder. The autoencoder is then used to enhance the initial lane graph estimates. This is done through optimization of the latent space vector. The optimization encourages the lane graph estimation to be logical by discouraging it from diverging from the prior distribution. We test the method on two benchmark datasets, NuScenes and Argoverse. The results show that the proposed method significantly improves the performance compared to state-of-the-art methods.
2023-07-25	A Visual Quality Assessment Method for Raster Images in Scanned Document	Justin Yang et.al.	2307.13241v1	null	Image quality assessment (IQA) is an active research area in the field of image processing. Most prior works focus on visual quality of natural images captured by cameras. In this paper, we explore visual quality of scanned documents, focusing on raster image areas. Different from many existing works which aim to estimate a visual quality score, we propose a machine learning based classification method to determine whether the visual quality of a scanned raster image at a given resolution setting is acceptable. We conduct a psychophysical study to determine the acceptability at different image resolutions based on human subject ratings and use them as the ground truth to train our machine learning model. However, this dataset is unbalanced as most images were rated as visually acceptable. To address the data imbalance problem, we introduce several noise models to simulate the degradation of image quality during the scanning process. Our results show that by including augmented data in training, we can significantly improve the performance of the classifier to determine whether the visual quality of raster images in a scanned document is acceptable or not for a given resolution setting.
2023-07-24	Why Don't You Clean Your Glasses? Perception Attacks with Dynamic Optical Perturbations	Yi Han et.al.	2307.13131v1	null	Camera-based autonomous systems that emulate human perception are increasingly being integrated into safety-critical platforms. Consequently, an established body of literature has emerged that explores adversarial attacks targeting the underlying machine learning models. Adapting adversarial attacks to the physical world is desirable for the attacker, as this removes the need to compromise digital systems. However, the real world poses challenges related to the "survivability" of adversarial manipulations given environmental noise in perception pipelines and the dynamicity of autonomous systems. In this paper, we take a sensor-first approach. We present EvilEye, a man-in-the-middle perception attack that leverages transparent displays to generate dynamic physical adversarial examples. EvilEye exploits the camera's optics to induce misclassifications under a variety of illumination conditions. To generate dynamic perturbations, we formalize the projection of a digital attack into the physical domain by modeling the transformation function of the captured image through the optical pipeline. Our extensive experiments show that EvilEye's generated adversarial perturbations are much more robust across varying environmental light conditions relative to existing physical perturbation frameworks, achieving a high attack success rate (ASR) while bypassing state-of-the-art physical adversarial detection frameworks. We demonstrate that the dynamic nature of EvilEye enables attackers to adapt adversarial examples across a variety of objects with a significantly higher ASR compared to state-of-the-art physical world attack frameworks. Finally, we discuss mitigation strategies against the EvilEye attack.
2023-07-24	Automatic Infant Respiration Estimation from Video: A Deep Flow-based Algorithm and a Novel Public Benchmark	Sai Kumar Reddy Manne et.al.	2307.13110v1	link	Respiration is a critical vital sign for infants, and continuous respiratory monitoring is particularly important for newborns. However, neonates are sensitive and contact-based sensors present challenges in comfort, hygiene, and skin health, especially for preterm babies. As a step toward fully automatic, continuous, and contactless respiratory monitoring, we develop a deep-learning method for estimating respiratory rate and waveform from plain video footage in natural settings. Our automated infant respiration flow-based network (AIRFlowNet) combines video-extracted optical flow input and spatiotemporal convolutional processing tuned to the infant domain. We support our model with the first public annotated infant respiration dataset with 125 videos (AIR-125), drawn from eight infant subjects, set in varied pose, lighting, and camera conditions. We include manual respiration annotations and optimize AIRFlowNet training on them using a novel spectral bandpass loss function. When trained and tested on the AIR-125 infant data, our method significantly outperforms other state-of-the-art methods in respiratory rate estimation, achieving a mean absolute error of $\sim$2.9 breaths per minute, compared to $\sim$4.7--6.2 for other public models designed for adult subjects and more uniform environments.
2023-07-24	Freeform three-mirror anastigmatic large-aperture telescope and receiver optics for CMB-S4	Patricio A. Gallardo et.al.	2307.12931v1	null	CMB-S4, the next-generation ground-based cosmic microwave background (CMB) observatory, will provide detailed maps of the CMB at millimeter wavelengths to dramatically advance our understanding of the origin and evolution of the universe. CMB-S4 will deploy large and small aperture telescopes with hundreds of thousands of detectors to observe the CMB at arcminute and degree resolutions at millimeter wavelengths. Inflationary science benefits from a deep delensing survey at arcminute resolutions capable of observing a large field of view at millimeter wavelengths. This kind of survey acts as a complement to a degree angular resolution survey. The delensing survey requires a nearly uniform distribution of cameras per frequency band across the focal plane. We present a large-throughput, large-aperture (5-meter diameter) freeform three-mirror anastigmatic telescope and an array of 85 cameras for CMB observations at arcminute resolutions, which meets the needs of the delensing survey of CMB-S4. A detailed prescription of this three-mirror telescope and cameras is provided, with a series of numerical calculations that indicate expected optical performance and mechanical tolerance.
2023-07-24	Trust-aware Safe Control for Autonomous Navigation: Estimation of System-to-human Trust for Trust-adaptive Control Barrier Functions	Saad Ejaz et.al.	2307.12815v1	null	A trust-aware safe control system for autonomous navigation in the presence of humans, specifically pedestrians, is presented. The system combines model predictive control (MPC) with control barrier functions (CBFs) and trust estimation to ensure safe and reliable navigation in complex environments. Pedestrian trust values are computed based on features, extracted from camera sensor images, such as mutual eye contact and smartphone usage. These trust values are integrated into the MPC controller's CBF constraints, allowing the autonomous vehicle to make informed decisions considering pedestrian behavior. Simulations conducted in the CARLA driving simulator demonstrate the feasibility and effectiveness of the proposed system, showcasing more conservative behaviour around inattentive pedestrians and vice versa. The results highlight the practicality of the system in real-world applications, providing a promising approach to enhance the safety and reliability of autonomous navigation systems, especially self-driving vehicles.