Abstract

The Plantiga platform provides an ideal framework for the application of machine learning principles in the analysis of human movement. The platform was designed with this application in mind and, as such, incorporates a variety of hardware and user interface innovations that produce high quality tagged data. The convergence of these factors is exemplified in a preliminary human movement recognition case study, detailed in this document.

Introduction

Several interrelated factors make the Plantiga movement data collection system well suited to integration with artificial intelligence.

Firstly, the Plantiga system employs a high fidelity inertial measurement unit (IMU) sensor placed in an insole under the foot in a fixed position. The IMUs are calibrated and synchronized between the left and right feet. Many body-worn wearable devices are prone to movement artefacts that increase measurement error because of where they are mounted on the body. The secure location of the sensors under the foot, inside footwear, provides the most direct measurement of human locomotion while minimizing the possibility of sensor damage or displacement, even during high impact or erratic movements. Additionally, the high sampling rate (416 Hz) of the Plantiga sensors allows high density data to be collected for detailed analysis.

Secondly, the mobile and unobtrusive form factor of the Plantiga hardware permits the deployment of the Plantiga system for a wide range of users and activities, contributing to a broad dataset. This is augmented by the long hardware uptime (five hours), rechargeability, and ruggedness. Additionally, the cross-platform data collection and analysis application allows easy use in the majority of environments. To date, our data collection efforts have produced over 4400 clean, tagged datasets from more than 200 individuals, amounting to just over half a terabyte of data.

Lastly, the data collection application allows users to identify activities, and the various factors affecting them, in a streamlined way. Tagging activity types and events of interest within activities, as well as identifying environmental and physical factors related to the activity, is not labour intensive for the user, and the process is designed to fit the normal data collection procedures of most athletic and tactical organizations. The UX/UI design for tagging and metadata acquisition is shown in Figures 1(a), (b), and (c).

Figures 1(a), 1(b), and 1(c): User tagging functionality

Figure 1(a): Selecting activity type (activity type tag)

Figure 1(b): Marking movements or factors of interest within activities (Marks)

Figure 1(c): Filling in metadata related to activity (Physical and Environmental factors)

The ease of use of this functionality encourages live movement tagging by the user, which in turn leads to the collection of a dataset well suited to supervised machine learning with minimal post-collection data preparation.

The combination of high quality, high density and high volume tagged data makes the Plantiga system ideal for a machine learning approach to automatic human activity recognition. We expect this approach to provide a framework to easily expand the Plantiga movement taxonomy beyond what is currently supported in the application. A case study on the identification of walking, running, and jumping movements by our machine learning model is detailed in this document; several additional studies applying machine learning models to various aspects of our dataset are in progress.

Developing an Accurate Human Activity Recognition Model

Analysis was completed on a previously collected dataset to validate a human activity recognition model. Participant data from a small sample of individuals performing walking, running, and jumping movements was used to support the creation of machine learning algorithms focused on recognition of a multitude of complex human movements [1]. The aim of this human activity recognition model was to automatically detect and classify these three preliminary movements, with the longer-term goal of expanding the algorithms to classify a larger variety of movements.

A recurrent neural network (RNN), a class of deep learning model, was employed (Keras 2.2 + TensorFlow 1.13) to create the machine learning models in this report [2]. An RNN is an artificial neural network in which connections between nodes form a directed graph along a temporal sequence, allowing the network to capture the dynamic temporal behavior of a signal. Unlike feedforward neural networks, RNNs can use their internal state (memory) to process sequences of inputs. This makes them applicable to tasks such as unsegmented connected handwriting recognition, speech recognition, and movement recognition.
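As a rough illustration, the sketch below assembles a minimal RNN classifier of this kind in Keras. The window length follows the 2-second windows at 416 Hz described in the Machine Learning Model section; the channel count (12, i.e. 3-axis accelerometer and gyroscope per foot), layer sizes, and training settings are illustrative assumptions, not a description of the production model.

    # Minimal sketch of an LSTM-based activity classifier (Keras-style API).
    # Assumptions (not from the report): 12 input channels and the layer
    # sizes below; the 832-sample window is 2 s at 416 Hz.
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import LSTM, Dense, Dropout

    WINDOW_SAMPLES = 832   # 2 s at 416 Hz
    N_CHANNELS = 12        # assumed: 3-axis accel + gyro per foot
    N_CLASSES = 3          # walk, run, jump

    model = Sequential([
        # The recurrent layer consumes each window sample by sample,
        # carrying internal state (memory) across the sequence.
        LSTM(64, input_shape=(WINDOW_SAMPLES, N_CHANNELS)),
        Dropout(0.5),
        Dense(N_CLASSES, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    # model.fit(X_train, y_train, epochs=20, batch_size=64) would then be
    # called on windowed, labelled data such as that sketched later.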

Testing Protocol

Seven healthy individuals performed weekly testing over four months. Participants were of above-average fitness and consistently engaged in their own personally designed fitness training regimens a minimum of 3 days per week.

Participants performed 2 sets of 20 m walks, 2 sets of 30 m runs, 2 sets of 5 countermovement jumps, and 2 sets of 20 cyclic jumps. Subsequently, a supervised machine learning algorithm was employed with the aim of automatic movement pattern recognition using the data collected by the Plantiga system.

To test the machine learning algorithm, a new participant was recruited with the same fitness and activity profile as the subjects in the training dataset. The validation subject performed a series of consecutive and non-consecutive walks, runs, and jumps of varying length and intensity. The data obtained from this participant was used to evaluate the ability of the Plantiga machine learning algorithm to detect and classify movement patterns on an unseen, unstructured (random order) dataset. Future projects should expand the sample size and include participants with varying levels of physical fitness and injury status.

Machine Learning Model

The dataset contained more than 390 tagged activities totalling 120 minutes of data. Data was sliced into 8000 overlapping 2-second windows, each containing 12,000 data points. The resulting features were split into a test feature set containing 25% of the data and a training feature set containing the remaining 75%.
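The windowing and split can be sketched as follows. The 50% overlap (one-second stride) is an assumption: it roughly reproduces the ~8000 windows quoted above for 120 minutes of data, but the report does not state the exact hop size, and labelling each window by the tag at its midpoint is likewise one simple, hypothetical choice.

    # Illustrative windowing and 75/25 split; the stride and labelling
    # rule are assumptions, and the placeholder arrays stand in for
    # recorded IMU data.
    import numpy as np
    from sklearn.model_selection import train_test_split

    def make_windows(signal, labels, window=832, stride=416):
        """Slice a (samples, channels) stream into overlapping windows."""
        xs, ys = [], []
        for start in range(0, len(signal) - window + 1, stride):
            xs.append(signal[start:start + window])
            ys.append(labels[start + window // 2])  # tag at window midpoint
        return np.stack(xs), np.array(ys)

    rng = np.random.default_rng(0)
    signal = rng.standard_normal((50_000, 12))  # placeholder IMU stream
    labels = rng.integers(0, 3, size=50_000)    # placeholder activity tags

    X, y = make_windows(signal, labels)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=0)   # 75/25 split as in the text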