Innovative Statistical Methods to Model and Evaluate Physical Activity Programs Engagement


According to recent statistics from the World Health Organization, one in four people aged 18 years and over is not sufficiently physically active. This may have contributed to the growing popularity of online health, wellbeing and physical activity programs among various communities. Advancements in sensor technology have enabled these programs to track and visualise the performance of participants throughout the program period. However, the programs lack structured statistical and machine learning frameworks to enhance engagement and personalisation. Therefore, this thesis develops a structured framework for the Virgin Pulse Global Challenge program.
Given the competitive nature of these programs, enrolled participants may manipulate the data they enter, potentially discouraging other participants and undermining the accuracy of program outcomes. This study develops two parallel models for detecting participant characteristics and abnormal step count entries, respectively. The first model uses penalised logistic regression with a synthetic minority oversampling technique to detect possible persons of interest, while the second model uses an outlier detection method to detect and reject abnormal step entries based on previously entered data. The study also introduces an application known as the abnormal activities detector to enhance decision making and limit the collection of anomalous step entries.
To achieve better outcomes from the program, it is important to detect the engagement dynamics of participants. The literature provides many ways of detecting and modelling engagement behaviours of participants. However, the outputs of these methods are not easily interpreted or visualised. This study uses a mixture hidden Markov model to identify underlying engagement behaviours based on observable behaviour sequences. The study also uses a novel approach for visualising the complex outputs generated by the model.
When participants are fully engaged, the program should provide them with effective encouragement and motivation. Predicting whether a participant will achieve 10,000 steps and identifying whether step count has improved will enable the program to further personalise interventions. Therefore, the study uses a type of recurrent neural network known as a long short-term memory (LSTM) model to predict step count improvement one day ahead, and compares the performance of five machine learning models in predicting the achievement of step count goals one day ahead. The thesis presents new knowledge and prospects for online physical activity programs such as the Virgin Pulse Global Challenge.
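The idea of rejecting an abnormal step entry based on a participant's previously entered data can be illustrated with a minimal sketch. The abstract does not name the outlier detection method used in the thesis, so the robust z-score below (median and median absolute deviation) is a stand-in chosen for illustration only. The function name `is_abnormal_entry`, the threshold of 3.5 and the minimum history length are all assumptions, not values taken from the thesis.

```python
from statistics import median


def is_abnormal_entry(history, new_steps, threshold=3.5, min_history=5):
    """Flag a new daily step entry that lies far from a participant's past entries.

    Illustrative stand-in only: uses a robust z-score based on the median and
    the median absolute deviation (MAD). All parameter values are assumptions.
    """
    if len(history) < min_history:
        # Too few previous entries to judge; do not flag.
        return False

    med = median(history)
    mad = median([abs(x - med) for x in history])
    if mad == 0:
        # No spread in the history to scale against; do not flag automatically.
        return False

    # 0.6745 rescales the MAD so the score is comparable to a standard z-score
    # when the data are roughly normally distributed.
    robust_z = 0.6745 * (new_steps - med) / mad
    return abs(robust_z) > threshold


# Hypothetical step history for one participant (made-up numbers).
past_entries = [8000, 9500, 10200, 7600, 11000, 9800, 8700]
print(is_abnormal_entry(past_entries, 60000))  # far above the usual range
print(is_abnormal_entry(past_entries, 10500))  # within the usual range
```

A median-based score is used here because a few extreme past entries would inflate a mean and standard deviation and hide later anomalies. A production detector would also need to handle missing days and genuine changes in activity level.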

Swinburne University of Technology, 2021
S. Sandun M. Silva

My research interests include biostatistics, data science and genome-wide association studies (GWAS).