Tinnitus-related distress after multimodal treatment can be characterized using a key subset of baseline variables

A machine learning workflow that predicts tinnitus distress after treatment from self-report questionnaire data acquired at baseline, and that extracts statistics and visualizations of feature importance at the population, subpopulation, and individual level.

Paper authors: Uli Niemann, Benjamin Boecking, Petra Brueggemann, Wilhelm Mebus, Birgit Mazurek, and Myra Spiliopoulou

PLOS ONE

By Uli Niemann in Research Tinnitus

January 30, 2020

Abstract

Background

Chronic tinnitus is a complex condition that can be associated with considerable distress. Whilst cognitive-behavioral treatment (CBT) approaches have been shown to be effective, not all patients benefit from psychological or psychologically anchored multimodal therapies. Determinants of tinnitus-related distress thus provide valuable information about tinnitus characterization and therapy planning.

Objective

The study aimed to develop machine learning models that use variables (or “features”) obtained before treatment to characterize patients’ tinnitus-related distress status after treatment. Whilst initially all available variables were considered for model training, the final model was required to achieve highest predictive performance using only a small number of features.

Methods

1,416 tinnitus patients (decompensated tinnitus: 32%) who completed a 7-day multimodal treatment encompassing tinnitus-specific components, CBT, physiotherapy and informational counseling were included in the analysis. At baseline, patients were assessed using 205 features from 10 questionnaires comprising sociodemographic and clinical information. A data-driven workflow was developed consisting of (a) an initial exploratory correlation analysis, (b) supervised machine learning to predict tinnitus-related distress after treatment (T1) using baseline data only (T0), and (c) post-hoc analysis of the best model to facilitate model inspection and understanding. Classification methods were embedded in a feature elimination wrapper that iteratively learned on features found to be important for the model in the preceding iteration, in order to keep the performance stable while successively reducing the model complexity. 10-fold cross-validation with area under the curve (AUC) as performance measure was implemented for model generalization error estimation.
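As a rough illustration of the feature elimination wrapper and the 10-fold cross-validated AUC described above, the following Python sketch may help. It is not the authors' implementation. The model is a deliberately simple stand-in (class-mean differences as weights) rather than gradient boosted trees, the data are synthetic, and the rule of keeping the top half of the features per iteration (`keep=0.5`) is an assumption, since the abstract does not state the selection criterion. All function and feature names are hypothetical.

```python
import random
from statistics import mean

def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    return sum((p > n) + 0.5 * (p == n) for p in pos for n in neg) / (len(pos) * len(neg))

def stratified_folds(y, k, seed=0):
    """Deal the shuffled indices of each class round-robin into k folds."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for label in (0, 1):
        members = [i for i, v in enumerate(y) if v == label]
        rng.shuffle(members)
        for j, i in enumerate(members):
            folds[j % k].append(i)
    return folds

def fit_weights(X, y, features):
    """Stand-in model (NOT gradient boosting): weight = difference of class means."""
    return {
        f: mean(r[f] for r, v in zip(X, y) if v == 1)
        - mean(r[f] for r, v in zip(X, y) if v == 0)
        for f in features
    }

def cv_auc(X, y, features, k=10):
    """Return the k-fold cross-validated AUC and the mean absolute weight per feature."""
    folds = stratified_folds(y, k)
    aucs, importance = [], {f: 0.0 for f in features}
    for test in folds:
        train = [i for fold in folds if fold is not test for i in fold]
        w = fit_weights([X[i] for i in train], [y[i] for i in train], features)
        scores = [sum(w[f] * X[i][f] for f in features) for i in test]
        aucs.append(auc([y[i] for i in test], scores))
        for f in features:
            importance[f] += abs(w[f]) / k
    return mean(aucs), importance

def eliminate_features(X, y, features, keep=0.5, min_features=2):
    """Retrain repeatedly on the most important features of the previous iteration."""
    history = []
    while True:
        score, importance = cv_auc(X, y, features)
        history.append((list(features), score))
        n_keep = max(min_features, int(len(features) * keep))  # assumed rule
        if n_keep >= len(features):
            return history
        features = sorted(features, key=importance.get, reverse=True)[:n_keep]

# Synthetic data: two informative features and six pure-noise features.
rng = random.Random(42)
names = ["inf_a", "inf_b"] + [f"noise_{i}" for i in range(6)]
y = [i % 2 for i in range(300)]
X = [{f: rng.gauss(2.0 * label if f.startswith("inf") else 0.0, 1.0) for f in names}
     for label in y]

history = eliminate_features(X, y, names)
for feats, score in history:
    print(len(feats), "features, cross-validated AUC:", round(score, 3))
```

Each entry in `history` pairs a feature subset with its cross-validated AUC, so one can pick the smallest subset whose performance is still close to that of the full model.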

Results

The best machine learning classifier (gradient boosted trees) can predict tinnitus-related distress in T1 with AUC = 0.890 using 26 features. Subjectively perceived tinnitus-related impairment, depressivity, sleep problems, physical health-related impairments in quality of life, time spent to complete questionnaires and educational level exhibited a high attribution towards model prediction.

Conclusions

Machine learning can reliably identify baseline features recorded prior to treatment commencement that characterize tinnitus-related distress after treatment. The identification of key features can contribute to an improved understanding of the multifactorial contributors to tinnitus-related distress and to multimodal treatment strategies based on that understanding.

Important figure

Figure 4. SHAP analysis results for the best model (GBT, i = 7). (A) Global feature importance based on the mean absolute magnitude of the SHAP values over all training instances. Values represent absolute change in log odds where higher values indicate higher feature importance. (B) Instance-individual SHAP values. A point represents the SHAP value for the feature depicted on the y-axis with respect to a single patient. The farther a point lies from the vertical line at 0.0, the larger the attribution of the corresponding feature value to the model prediction. Vertically offset points depict regions of high density. Points are colored according to the actual feature value of the respective patient. (C) Combined SHAP feature attribution for all patients. Patients are ordered according to hierarchical clustering with complete linkage and k = 5. Blue horizontal lines depict the average sum of SHAP values of the cluster members.

Feature attributions of a model predicting tinnitus distress at the study population, subpopulation, and individual level.
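The aggregations behind panels (A) and (C) can be sketched in a few lines of Python. This is a minimal illustration only. The SHAP values below are made-up numbers for six hypothetical patients and three hypothetical feature names, the toy example uses k = 3 clusters instead of the k = 5 in the figure, and the naive complete-linkage routine is written out by hand rather than taken from the authors' code.

```python
from itertools import combinations
from math import dist
from statistics import mean

def global_importance(shap_rows, names):
    """Panel (A): mean absolute SHAP value per feature, sorted in descending order."""
    imp = {n: mean(abs(v) for v in col) for n, col in zip(names, zip(*shap_rows))}
    return dict(sorted(imp.items(), key=lambda kv: kv[1], reverse=True))

def complete_linkage(rows, k):
    """Panel (C): agglomerative clustering where the distance between two clusters
    is the largest Euclidean distance between any pair of their members."""
    clusters = [[i] for i in range(len(rows))]

    def gap(a, b):
        return max(dist(rows[i], rows[j]) for i in a for j in b)

    while len(clusters) > k:
        a, b = min(combinations(clusters, 2), key=lambda pair: gap(*pair))
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a + b)
    return clusters

# Made-up SHAP values (log odds): one row per patient, one column per feature.
names = ["impairment", "depressivity", "sleep"]  # hypothetical feature names
shap_rows = [
    [1.2, 0.4, 0.1],
    [1.0, 0.5, 0.2],
    [-0.9, -0.3, 0.1],
    [-1.1, -0.4, -0.1],
    [0.1, 0.0, -0.2],
    [0.0, 0.1, -0.1],
]

importance = global_importance(shap_rows, names)
clusters = complete_linkage(shap_rows, k=3)

# Average sum of SHAP values per cluster (the horizontal lines in panel C).
cluster_mean = {
    tuple(sorted(c)): mean(sum(shap_rows[i]) for i in c) for c in clusters
}

print(importance)
print(cluster_mean)
```

A positive cluster average indicates that, for the members of that cluster, the features jointly push the prediction towards the positive class; a negative average indicates the opposite.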

BibTeX citation

@article{Niemann:PONE2020,
  author    = {Niemann, Uli and Boecking, Benjamin and Brueggemann, Petra and
               Mebus, Wilhelm and Mazurek, Birgit and Spiliopoulou, Myra},
  journal   = {PLOS ONE},
  title     = {Tinnitus-related distress after multimodal treatment can be
               characterized using a key subset of baseline variables},
  year      = {2020},
  number    = {1},
  pages     = {1--18},
  volume    = {15},
  doi       = {10.1371/journal.pone.0228037},
  groups    = {Own publications},
  url       = {https://doi.org/10.1371/journal.pone.0228037},
}

Categories: Research, Tinnitus

Tags: Explainable AI, Predictive Modeling
