CLASSIFICATION
One of the goals of this project was to create a classification system that would allow us to differentiate between normal and myopathic EMG data. To achieve this, we first had to determine which features to compare and how to extract them. Through our research we determined that passing raw EMG data through a Daubechies 45 wavelet filter and then performing a Fourier transform would allow us to extract more accurate frequency data from the EMG signals [24]. This frequency data could then be used as features in our classifier. The frequency analysis must be conducted after the wavelet filter is applied because EMG signals are time-varying, meaning that applying a Fourier transform directly to the raw data would cause us to lose information. Once the wavelet filter was applied and the Fourier transform was taken, we calculated the power spectral density (PSD) using the equation PSD = |Y|^2 / (n * fs), where Y is the Fourier transform of the filtered data, n is the number of data points, and fs is the sampling frequency. The power spectral density can be seen in the figure below.
Figure 1. Power spectrum for analysis of feature data for classification of EMG signals.
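As an illustration of the PSD equation above, the following is a minimal Python sketch, not the code used in this project (which relied on MATLAB). It assumes the signal has already been wavelet filtered and uses a naive discrete Fourier transform to stay dependency-free; the function name `power_spectrum` and the synthetic 50 Hz test signal are hypothetical.

```python
import cmath
import math


def power_spectrum(x, fs):
    """Return (freqs, psd) for the real signal x sampled at fs Hz.

    Applies PSD = |Y|^2 / (n * fs), with Y the discrete Fourier transform
    of x. Only the non-negative frequency bins (0 to fs/2) are returned.
    """
    n = len(x)
    freqs = []
    psd = []
    for k in range(n // 2 + 1):
        # Naive O(n^2) DFT bin; an FFT routine would give the same values.
        y_k = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        freqs.append(k * fs / n)
        psd.append(abs(y_k) ** 2 / (n * fs))
    return freqs, psd


# Hypothetical stand-in for a filtered EMG signal: a 50 Hz sine sampled at 400 Hz.
fs = 400.0
signal = [math.sin(2 * math.pi * 50.0 * t / fs) for t in range(64)]
freqs, psd = power_spectrum(signal, fs)
peak_bin = max(range(len(psd)), key=psd.__getitem__)
```

With 64 samples at 400 Hz the bins are 6.25 Hz apart, so the test tone falls exactly on one bin and the spectrum peaks there.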

From the power spectrum, the team extracted the peak frequency (the location of the maximum value) as well as the mean and median frequencies using MATLAB functions. Myopathic muscle data usually yields lower values for these metrics than normal muscle data.
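The three frequency features can be sketched in Python as follows. This is an illustrative approximation rather than the MATLAB functions the team used: the mean frequency is taken as the power-weighted average of the frequency axis, and the median frequency as the first bin at which cumulative power reaches half of the total, without interpolation. The function name and the toy spectrum are hypothetical.

```python
def frequency_features(freqs, psd):
    """Return (peak, mean, median) frequency of a one-sided power spectrum."""
    total = sum(psd)
    # Peak frequency: location of the maximum of the power spectrum.
    peak = freqs[max(range(len(psd)), key=psd.__getitem__)]
    # Mean frequency: power-weighted average of the frequency axis.
    mean = sum(f * p for f, p in zip(freqs, psd)) / total
    # Median frequency: first bin where cumulative power reaches half the total.
    running = 0.0
    median = freqs[-1]
    for f, p in zip(freqs, psd):
        running += p
        if running >= total / 2:
            median = f
            break
    return peak, mean, median


# Hypothetical toy spectrum (frequencies in Hz, arbitrary power units).
toy_freqs = [0.0, 10.0, 20.0, 30.0, 40.0]
toy_psd = [1.0, 2.0, 4.0, 2.0, 1.0]
peak, mean, median = frequency_features(toy_freqs, toy_psd)
```

For this symmetric toy spectrum, all three features coincide at 20 Hz; on real EMG spectra they generally differ.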
To improve the accuracy of our classification, we looked for more features to compare. The team found research linking lower mean power, lower total power, and lower raw EMG values to myopathic data [25]. All of these values could be extracted from either the power spectrum data or the filtered EMG data.
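A rough Python sketch of these additional features is given below. The exact definitions used in the project are not stated, so the choices here are assumptions: total power is approximated as the sum of the PSD values times the bin width, mean power as the average PSD value, and the amplitude feature as the mean absolute value of the filtered samples. All names and numbers are hypothetical.

```python
def power_features(psd, df, samples):
    """Return (total_power, mean_power, mean_abs_amplitude).

    psd     -- one-sided power spectral density values
    df      -- spacing between frequency bins in Hz
    samples -- filtered time-domain EMG samples
    """
    # Assumed definition: rectangle-rule integral of the PSD over frequency.
    total_power = sum(psd) * df
    # Assumed definition: average PSD value across all bins.
    mean_power = sum(psd) / len(psd)
    # Assumed amplitude measure: mean absolute value of the samples.
    mean_abs = sum(abs(s) for s in samples) / len(samples)
    return total_power, mean_power, mean_abs


# Hypothetical toy inputs, not taken from the project data.
toy_psd = [1.0, 2.0, 4.0, 2.0, 1.0]
toy_samples = [0.5, -1.5, 2.0, -1.0]
total_power, mean_power, mean_abs = power_features(toy_psd, 10.0, toy_samples)
```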
The final features added to the classification method were the age and gender of the patients tested. This data was provided by the online database used. The age data was included to help the classifier determine whether the signals were from a patient with myopathy or simply from an older patient with weaker muscles. The gender data was included because female patients generally have weaker signals than male patients. These features helped increase the accuracy of our classifier from around 75% with the initial data to a final accuracy of 98.3%. A sample of the data used to train the classifier can be seen in the figure below.
Figure 2. Small sample of the features extracted from each EMG signal, as well as the type of data (normal or disordered).

The classifier was created using a total of 407 signals from 16 different patients, 10 of whom had normal EMG data and 6 of whom had myopathy. The final classifier was a RUSBoosted ensemble classifier. This method likely performed best for our datasets because of the imbalance between normal and disordered data, as it is designed for imbalanced data. A confusion matrix for the classifier can be seen in the figure below.
Figure 3. Confusion matrix for the RUSBoosted ensemble classifier. Blue tiles represent correctly classified data, while the light pink tiles represent misclassified data. From this figure it can be seen that the classifier has a high accuracy despite the imbalanced training data.
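RUSBoost handles class imbalance by combining boosting with random undersampling: each boosting round trains a weak learner on a subset in which the larger class has been randomly reduced. The Python sketch below illustrates only the undersampling step, not the boosting, and is not the classifier used in this project. It assumes full balancing to the size of the smallest class, and the feature rows are hypothetical.

```python
import random


def random_undersample(rows, labels, seed=0):
    """Balance classes by randomly dropping rows from the larger classes."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    # Every class is reduced to the size of the smallest class.
    smallest = min(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        for row in rng.sample(group, smallest):
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Hypothetical rows: (mean frequency in Hz, total power), 6 normal vs 2 myopathy.
rows = [(110.0, 5.1), (120.0, 4.8), (105.0, 5.5), (115.0, 5.0),
        (125.0, 4.9), (118.0, 5.2), (70.0, 2.0), (65.0, 1.8)]
labels = ["normal"] * 6 + ["myopathy"] * 2
balanced_rows, balanced_labels = random_undersample(rows, labels)
```

In a full RUSBoost procedure this sampling would be repeated with a fresh random draw in every boosting round, so that different majority-class rows are seen across rounds.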

The misclassified data was analyzed to check that it was composed of outliers, which would indicate that the errors did not stem from the classification method itself. Below is a graph of the misclassified data.
Figure 4. Misclassified data plotted as a function of total power and age. After analyzing the other features of these data sets, it was found that the data had mostly high values for all other features, meaning that the total power was likely the last factor considered. Because most of these errors were single signals from a larger set of data taken from the same patient, they are assumed to be outliers.

Once the classifier had an accuracy that was deemed acceptable by a member of our team, we created another dataset of myopathic and normal EMG signals to further test the classifier. This test set was composed of data from 3 myopathic patients and 2 normal patients. A sample from this test set can be seen below.
Figure 5. Features extracted from the test data, along with the actual type and the type determined by the classifier.

This test set was predicted with an accuracy of 85.5%. The classifier struggled with one particular dataset. Although the data was from a patient with normal EMG signals, it had low values for many of the features, which resulted in misclassification. The reason this dataset was inconsistent is unknown, but it is assumed to be an outlier based on the other datasets used for training and testing. The overall results of the classifier were considered a success by the team.
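For reference, accuracy figures of this kind are the fraction of signals whose predicted type matches the actual type, which can be read directly from a confusion matrix. The Python sketch below shows the calculation on made-up labels; it does not reproduce the project's test set or its 85.5% result, and the function names are hypothetical.

```python
def confusion_counts(actual, predicted, positive="myopathy"):
    """Count true/false positives and negatives for a two-class problem."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["tp" if a == positive else "fp"] += 1
        else:
            counts["fn" if a == positive else "tn"] += 1
    return counts


def accuracy(counts):
    """Fraction of all signals that were classified correctly."""
    correct = counts["tp"] + counts["tn"]
    return correct / sum(counts.values())


# Made-up labels for eight signals, for illustration only.
actual = ["myopathy", "myopathy", "myopathy", "normal",
          "normal", "normal", "normal", "normal"]
predicted = ["myopathy", "myopathy", "normal", "normal",
             "normal", "normal", "myopathy", "myopathy"]
counts = confusion_counts(actual, predicted)
acc = accuracy(counts)
```

In this toy case, two normal signals are flagged as myopathic, mirroring the kind of error described above where a normal dataset with low feature values is misclassified.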