" <p> Multiclass classification can also be tranferred to multiple binary classification problems. One strategy is called One-vs-rest, where one classifier is trained per class. In our case this means that for each arm movement one classifier is trained by considering only the labels of the respective arm movement.\n",
" </p>\n",
"\n",
"\n",
"</div>"
"</div>"
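
A minimal sketch of the one-vs-rest idea, assuming scikit-learn is available. The features and labels are synthetic stand-ins, and the choice of `LogisticRegression` is only an illustration, not necessarily the classifier used in this chapter. One binary classifier is fitted per event column and scored with ROC AUC on a held-out, later part of the data.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

event_names = ["HandStart", "FirstDigitTouch", "BothStartLoadPhase",
               "LiftOff", "Replace", "BothReleased"]

# Synthetic stand-in data (assumption): 600 samples, 10 features.
rng = np.random.default_rng(0)
n_samples, n_features = 600, 10
X = rng.normal(size=(n_samples, n_features))
# Each event column depends on one feature plus noise, so there is a signal to learn.
y = (X[:, :6] + 0.5 * rng.normal(size=(n_samples, 6)) > 0.5).astype(int)

# Split without shuffling, as one would for time-ordered data.
X_train, X_test = X[:400], X[400:]
y_train, y_test = y[:400], y[400:]

scores = {}
for k, name in enumerate(event_names):
    # One binary classifier per event, trained only on that event's labels.
    clf = LogisticRegression()
    clf.fit(X_train, y_train[:, k])
    proba = clf.predict_proba(X_test)[:, 1]
    scores[name] = roc_auc_score(y_test[:, k], proba)

for name, auc in scores.items():
    print(f"{name}: {auc:.3f}")
```

Because every event gets its own classifier, the events can also be scored and tuned independently of each other.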
...
%% Cell type:code id: tags:
``` python
import numpy as np
import matplotlib.pyplot as plt
```
%% Cell type:markdown id: tags:
# Chapter 9: Use case - prediction of arm movements
<figcaption>Arrangement of electrodes on head.</figcaption>
</figure>
</center>
%% Cell type:markdown id: tags:
This data contains EEG recordings of one subject performing **grasp-and-lift (GAL)** trials.
There is **1 subject** in total, **10 series** of trials for this subject, and approximately **30 trials** within each series. The number of trials varies from series to series.

For each **GAL**, you are tasked to detect 6 events:
- HandStart
- FirstDigitTouch
- BothStartLoadPhase
- LiftOff
- Replace
- BothReleased

These events always occur in the same order. In this dataset, there are two files for each subject + series combination:
- the `*_data.csv` files contain the raw 32-channel EEG data (sampling rate 500 Hz)
- the `*_events.csv` files contain the ground-truth frame-wise labels for all events

Detailed information about the data can be found here:
Luciw MD, Jarocka E, Edin BB (2014) Multi-channel EEG recordings during 3,936 grasp and lift trials with varying weight and friction. Scientific Data 1:140047. www.nature.com/articles/sdata201447

*Description from https://www.kaggle.com/c/grasp-and-lift-eeg-detection/data*
%% Cell type:markdown id: tags:
<center>
<figure>
<img src="./images/eeg_signal_preprocessing.png" title="made at imgflip.com" width=75% />
<figcaption>Preprocessing steps for EEG-signals.</figcaption>
</figure>
</center>
%% Cell type:markdown id: tags:
### Load data
%% Cell type:markdown id: tags:
The data can be found in `/data/eeg_use_case` and contains:
- 8 series of recorded EEG data
- 8 series of events of arm movements

Load the EEG data and the events:
- combine all EEG series in one array (size: (total number of time steps, number of channels))
- combine all events in one array (size: (total number of time steps, number of different arm movements))
- pay attention to the order of the series
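
The combination step can be sketched as follows. This is only an illustration under stated assumptions: the per-series arrays are random stand-ins, and the names `eeg_series` and `event_series` are hypothetical. The point is that both lists are concatenated along the time axis in the same series order, so that row `i` of the EEG array still belongs to row `i` of the event array.

```python
import numpy as np

n_channels, n_events = 32, 6

# Random stand-ins for three loaded series of different lengths (assumption).
rng = np.random.default_rng(0)
series_lengths = [5, 3, 4]
eeg_series = [rng.normal(size=(n, n_channels)) for n in series_lengths]
event_series = [rng.integers(0, 2, size=(n, n_events)) for n in series_lengths]

# Stack the series along the time axis, using the same order for both lists.
eeg = np.concatenate(eeg_series, axis=0)
events = np.concatenate(event_series, axis=0)

print(eeg.shape, events.shape)  # (12, 32) (12, 6)
```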
%% Cell type:markdown id: tags:
<div class="alert alert-block alert-warning">
<i class="fa fa-info-circle"></i> <strong>Filter strings with the lambda operator</strong>
The lambda operator creates anonymous functions, i.e. functions without a name. An anonymous function takes any number of parameters, evaluates a single expression and returns the value of this expression. The lambda operator can be applied in the following way to filter the file names:
<code>all_data_files = list(filter(lambda x: '_data' in x, os.listdir(path)))</code>
</div>
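
A small sketch of the filter in action, combined with a sort by series number. The file names below are made up for illustration (the pattern `subj1_seriesN_data.csv` is an assumption), and `series_number` is a hypothetical helper. A plain alphabetical sort would place `series10` before `series2`, which is one way the order of the series can silently go wrong.

```python
import re

# Hypothetical file names; in the notebook they would come from os.listdir(path).
file_names = [
    "subj1_series10_data.csv",
    "subj1_series2_events.csv",
    "subj1_series1_data.csv",
    "subj1_series2_data.csv",
    "subj1_series1_events.csv",
    "subj1_series10_events.csv",
]


def series_number(file_name):
    """Return the integer after 'series', so that series10 sorts after series2."""
    return int(re.search(r"series(\d+)", file_name).group(1))


# Keep only the matching names, then sort them numerically by series.
all_data_files = sorted(filter(lambda x: "_data" in x, file_names), key=series_number)
all_event_files = sorted(filter(lambda x: "_events" in x, file_names), key=series_number)

print(all_data_files)
print(all_event_files)
```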
%% Cell type:code id: tags:
``` python
import os

import pandas as pd


def load_data(file_names, path):
    # read each csv file and drop the id column
    dfs = []
    for f in file_names:
        # os.path.join works whether or not path ends with a separator
        df = pd.read_csv(os.path.join(path, f)).drop('id', axis=1)
        dfs.append(df)
    return dfs
```
%% Cell type:code id: tags:
``` python
# define path and list of all data and event files
```
The purpose of the feature extraction is to extract time-dependent features from the EEG data. To do so, sliding windows of 500 data points each are used (1 second at the sampling rate of 500 Hz). The features of three consecutive time windows are used together to predict the event in the following time step.

Extract time-dependent features from the EEG data:
- define the start and end points of a sliding window with a length of 500 data points and a step size of 2
- loop through those start and end points
- per iteration:
    - take three consecutive time windows (window_1 = data[start:end,:], window_2 = data[start+500:end+500,:], window_3 = data[start+1000:end+1000,:])
    - compute the average power per window (power: square of the signal)
    - combine the three arrays containing the average power into one array
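
The steps above can be sketched as follows. This is one possible reading of the task, not the reference solution: the function name `extract_power_features` and its parameters are hypothetical, and the input is random data standing in for the combined EEG array.

```python
import numpy as np


def extract_power_features(data, window_length=500, step=2, n_windows=3):
    """Average power of n_windows consecutive windows for every window start.

    data: array of shape (n_time_steps, n_channels)
    returns: array of shape (n_starts, n_windows * n_channels)
    """
    n_time_steps = data.shape[0]
    span = n_windows * window_length
    # The last valid start still leaves room for all consecutive windows.
    starts = np.arange(0, n_time_steps - span + 1, step)
    features = []
    for start in starts:
        end = start + window_length
        powers = []
        for k in range(n_windows):
            offset = k * window_length
            window = data[start + offset:end + offset, :]
            # Average power per channel: mean of the squared signal.
            powers.append(np.mean(window ** 2, axis=0))
        # Combine the per-window powers into one feature vector.
        features.append(np.concatenate(powers))
    return np.array(features)


# Random stand-in for 1600 time steps of 32-channel EEG data (assumption).
rng = np.random.default_rng(0)
data = rng.normal(size=(1600, 32))
features = extract_power_features(data)
print(features.shape)  # (51, 96)
```

With 1600 time steps, the three windows span 1500 points, so the valid starts are 0, 2, ..., 100. That gives 51 rows, each holding 3 x 32 average-power values.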
%% Cell type:markdown id: tags:
<center>
<figure>
<img src="./images/time_window.001.png" title="made at imgflip.com" width=75% />
<figcaption>Preprocessing steps for EEG-signals.</figcaption>
</figure>
</center>
The metric <strong>'roc-auc'</strong> is the area under the ROC curve. A value of 0.5 corresponds to random guessing and 1.0 to a perfect ranking. Thus, the higher this value, the better the performance of the classifier.
</p>
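
To make the metric concrete, here is a minimal NumPy sketch. It uses the fact that the area under the ROC curve equals the fraction of (positive, negative) pairs in which the positive sample receives the higher score. The function name `roc_auc` is hypothetical, and the sketch assumes that both classes occur in `y_true`. In practice a library routine such as scikit-learn's `roc_auc_score` would normally be used.

```python
import numpy as np


def roc_auc(y_true, y_score):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a correct pair.
    """
    y_true = np.asarray(y_true).astype(bool)
    y_score = np.asarray(y_score, dtype=float)
    pos = y_score[y_true]
    neg = y_score[~y_true]
    # Compare every positive score with every negative score.
    diff = pos[:, None] - neg[None, :]
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size


y_true = [0, 0, 1, 1]
print(roc_auc(y_true, [0.1, 0.4, 0.35, 0.8]))  # 0.75
print(roc_auc(y_true, [0.1, 0.2, 0.8, 0.9]))   # 1.0
```
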
<p> All figures are from: https://classeval.wordpress.com/introduction/introduction-to-the-roc-receiver-operating-characteristics-plot/
<p> Multiclass classification can also be transferred to multiple binary classification problems. One strategy is called one-vs-rest, where one classifier is trained per class. In our case, this means that one classifier is trained for each arm movement, using only the labels of that movement.