Nilah Ravi Nair, Lena Schmid, Christopher Reining, Fernando Moya Rueda, Markus Pauly and Gernot A. Fink
Proc. Int. Conf. on Pattern Recognition, 2024.
Kolkata, India
Neural networks trained on human motion data have various industrial and daily-living applications, such as activity recognition, gesture recognition, and gait-based biometrics. These models are often trained on industrial or research datasets designed for a specific application and recorded from a narrow subject pool. Given that subject re-identification and the identification of soft biometrics, such as age, gender, and height, are feasible using neural networks trained on human activity data, the influence of these characteristics on human activity recognition (HAR) models cannot be ignored. Biased datasets can prevent neural networks from generalizing to unseen subjects. However, the biases found in activity data are not explicit. This paper therefore focuses on representation biases caused by the characteristics of the training-data subjects in multi-channel time-series human activity data obtained from sensor technologies. We provide a statistical approach to evaluate the biases in existing datasets, a method to account for biases, and a perspective on subject selection criteria for future human activity datasets. The study is a step towards fair and trustworthy artificial intelligence by attempting to quantify the subject bias in multi-channel time-series HAR data.