A support vector machine (SVM) model surpassed a nontrained human rater in reliably and accurately examining the resting tremor and bradykinesia of patients with Parkinson disease (PD), according to a study published in Neurology.
Study researchers sought to determine whether the SVM model, a machine learning-based automatic rating system, would be a suitable alternative to the Movement Disorder Society (MDS)-Unified Parkinson Disease Rating Scale (UPDRS), which is often inefficient.
A movement disorder specialist certified for MDS-UPDRS watched video clips of resting tremor in 55 patients with PD and video clips of finger tapping of 55 patients with PD and rated them using the MDS-UPDRS. A neurology resident without the certification rated the clips separately, without knowing the specialist’s scores.
The study researchers measured the maximum and mean amplitude of hand movement to analyze resting tremor. For finger tapping, they measured the mean, maximum, and minimum tapping amplitude and tapping fatigue per 3-second segment, along with tapping fatigue frequency. They tested the SVM model by training it on 80 of the 110 video clip sets and having it score the remaining 30 sets.
They then measured the absolute agreement rate in the UPDRS rating and the inter-rater reliability among the gold standard rating, the nontrained human rater, and the SVM model.
In the manual rating system, logarithms of resting tremor maximum and mean amplitudes were positively correlated with the gold standard rating (β=1.20, P <.001; β=1.20, P <.001). Finger tapping speed was negatively correlated with the gold standard rating (β=-0.87, P <.001), as were mean, maximum, and minimum finger tapping amplitudes (β=-0.31; β=-0.42; β=-0.15; P <.001 for all). Finger tapping fatigue was not correlated with the gold standard rating.
The SVM model had higher absolute and relative agreement rates with the gold standard rating (63% and 100%, respectively) than the nontrained human rater had (46% and 97%, respectively). Inter-rater reliability with the gold standard rating, measured by weighted κ, was also higher for the SVM model (0.791) than for the nontrained human rater (0.662).
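The absolute agreement rate and weighted κ reported here can be sketched in a few lines of Python. This is a minimal illustration and not the study's code: the function names and example scores are hypothetical, the 0 to 4 scale follows MDS-UPDRS item scoring, and linear weights are an assumption, since the weighting scheme is not stated here.

```python
from collections import Counter


def absolute_agreement(a, b):
    """Fraction of clips on which two raters give the identical score."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def weighted_kappa(a, b, categories=range(5)):
    """Cohen's kappa with linear disagreement weights |i - j| / (k - 1).

    Assumes integer scores drawn from `categories` (0 to 4 by default).
    Undefined (division by zero) if both raters use one identical score
    for every clip.
    """
    cats = list(categories)
    k = len(cats)
    n = len(a)
    observed = Counter(zip(a, b))  # joint counts of (score_a, score_b)
    row = Counter(a)               # marginal counts for rater a
    col = Counter(b)               # marginal counts for rater b
    num = 0.0  # observed weighted disagreement
    den = 0.0  # weighted disagreement expected by chance
    for i in cats:
        for j in cats:
            w = abs(i - j) / (k - 1)
            num += w * observed[(i, j)] / n
            den += w * (row[i] / n) * (col[j] / n)
    return 1.0 - num / den


# Hypothetical scores for eight clips (not data from the study).
gold = [0, 1, 2, 3, 4, 2, 1, 3]
rater = [0, 1, 2, 3, 4, 3, 1, 2]

print(absolute_agreement(gold, rater))
print(weighted_kappa(gold, rater))
```

Because the weights grow with the distance between scores, a rater who is off by 1 point is penalized less than one who is off by 3, which is why weighted κ suits an ordinal scale better than the unweighted statistic.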
The same pattern held for intraclass correlation coefficients (ICC): the ICC between the SVM model and the gold standard rating was 0.927, while the ICC between the nontrained human rater and the gold standard rating was 0.861.
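For readers unfamiliar with the ICC, the sketch below computes one common form, ICC(2,1) (two-way random effects, absolute agreement, single rater), from a subjects-by-raters table. Which ICC form the study used is not stated here, so this choice is an assumption, and the function name and example scores are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1) from a list of rows, one row per subject, one score per rater."""
    n = len(ratings)       # number of subjects (clips)
    k = len(ratings[0])    # number of raters
    grand = sum(sum(r) for r in ratings) / (n * k)
    row_means = [sum(r) / k for r in ratings]
    col_means = [sum(r[j] for r in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between raters
    ss_total = sum((x - grand) ** 2 for r in ratings for x in r)
    ss_err = ss_total - ss_rows - ss_cols                    # residual

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


# Hypothetical scores: each row is one clip, columns are (gold standard, other rater).
scores = [[0, 0], [1, 1], [2, 3], [3, 3], [4, 4], [2, 2], [1, 2], [3, 3]]
print(icc_2_1(scores))
```

Unlike the agreement rate, this statistic compares between-clip variance with rater and residual variance, so a consistent offset between two raters lowers it even when their rankings match.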
However, SVM model reliability was lower for finger tapping than for resting tremor. Study researchers attributed this to the higher number of parameters needed to rate finger tapping and to the lesser proximity between test and training sets in the finger tapping analysis compared with the tremor analysis. Alternatively, the nontrained human rater may have also considered the patient’s demographics and hypomimia in analyzing bradykinesia.
Limitations of the study included technical errors, its retrospective nature, some lack of compliance with MDS-UPDRS protocols, and assessment of only 2 factors of the MDS-UPDRS Part III.
“Machine learning-based algorithms that automatically rate PD cardinal symptoms are feasible, with more accurate results than nontrained human ratings,” concluded the study researchers.
Park KW, Lee E, Lee JS, et al. Machine learning-based automatic rating for cardinal symptoms of Parkinson disease. Neurology. 2021;96(13):e1761-e1769. doi:10.1212/WNL.0000000000011654