Data-Driven Prediction of Athletes’ Performance Based on Their Social Media Presence
We investigated whether proxies of athletes’ social media activity are useful features for a machine learning model that predicts their performance in upcoming competitions. We extracted millions of tweets that NBA players either posted themselves or were tagged in, and from these derived features reflecting players’ mood, social media behaviour, and sleep quality before games. Combining these with features unrelated to social media, we ran statistical tests to determine whether the social media features significantly improve the accuracy of a random forest model that predicts players’ Box Plus/Minus (BPM) scores in upcoming games.
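As an illustration of the kind of comparison described above, the following minimal sketch tests whether a model with extra features has lower prediction error than a baseline model evaluated on the same folds. It is an assumption-laden example, not the procedure used in this work: the choice of a paired sign-flip permutation test, the function name `paired_permutation_test`, and its parameters are all hypothetical, and the per-fold errors would have to come from the reader's own baseline and extended models.

```python
import random


def paired_permutation_test(errors_baseline, errors_extended,
                            n_permutations=10_000, seed=0):
    """One-sided paired sign-flip permutation test (illustrative assumption).

    errors_baseline: per-fold errors of the model WITHOUT social media features.
    errors_extended: per-fold errors of the model WITH social media features,
                     measured on the same folds in the same order.
    Returns (mean error reduction, p-value) for the null hypothesis that
    adding the features does not reduce the error.
    """
    if len(errors_baseline) != len(errors_extended):
        raise ValueError("error lists must have the same length")
    if not errors_baseline:
        raise ValueError("error lists must not be empty")

    # Positive difference means the extended model had the lower error.
    diffs = [b - e for b, e in zip(errors_baseline, errors_extended)]
    n = len(diffs)
    observed = sum(diffs) / n

    rng = random.Random(seed)
    at_least_as_large = 0
    for _ in range(n_permutations):
        # Under the null hypothesis the sign of each paired difference
        # is arbitrary, so flip each sign with probability 0.5.
        permuted = sum(d if rng.random() < 0.5 else -d for d in diffs) / n
        if permuted >= observed:
            at_least_as_large += 1

    # Add-one correction keeps the estimated p-value strictly above zero.
    p_value = (at_least_as_large + 1) / (n_permutations + 1)
    return observed, p_value
```

To use it, one would pass in two equally long lists of errors (for example, mean absolute error in BPM per cross-validation fold) and compare the returned p-value against a chosen significance level. A paired test is used in this sketch because both models are scored on the same folds, so fold-to-fold difficulty cancels out in the differences.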