diff --git a/06_preprocessing_pipelines_and_hyperparameter_optimization.ipynb b/06_preprocessing_pipelines_and_hyperparameter_optimization.ipynb
index 25d57cfa94f5c20c8baaf2434a69892b7280f939..43ac12cbf7b79001a01504835f3efca70a74a885 100644
--- a/06_preprocessing_pipelines_and_hyperparameter_optimization.ipynb
+++ b/06_preprocessing_pipelines_and_hyperparameter_optimization.ipynb
@@ -154,29 +154,48 @@
     "\n",
     "The two most important ones in `scikit-learn` are\n",
     "\n",
-    "- `MinMaxScaler`:  after applying this scaler, the minumum in every column is 0, the maximum is 1.\n",
+    "- `sklearn.preprocessing.MinMaxScaler`: after applying this scaler, the minimum in every column is 0, the maximum is 1.\n",
     "\n",
-    "- `StandardScaler`: scales columns to mean value 0 and standard deviation 1.\n",
-    "\n",
-    "The reason to use a scaler is to compensate for different orders of magnitudes of the features. Some classifiers like `SVC` and `KNeighborsClassifier` use eucledian distances between features internally which would impose more weight on features having large values. So **don't forget to scale your features when using SVC or KNeighborsClassifier** !\n",
     "\n",
+    "- `sklearn.preprocessing.StandardScaler`: scales columns to mean value 0 and standard deviation 1.\n",
     "\n",
+    "The reason to use a scaler is to compensate for different orders of magnitude of the features. Some classifiers like `SVC` and `KNeighborsClassifier` use Euclidean distances between feature vectors internally, which gives more weight to features with large values. So **don't forget to scale your features when using SVC or KNeighborsClassifier**!"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
     "### PCA\n",
     "\n",
-    "Principal component analysis is a technique to reduce the dimensionality of a multi variate data set. One benefit of PCA is to remove redundancy in your data set, such as correlating columns or linear dependencies between columns.\n",
+    "Principal component analysis (`sklearn.decomposition.PCA` in `scikit-learn`) is a technique to reduce the dimensionality of a multivariate data set. One benefit of PCA is to remove redundancy in your data set, such as correlated columns or linear dependencies between columns.\n",
     "\n",
     "We discussed before that reducing redundancy and noise can help to avoid overfitting.\n",
     "\n",
-    "\n",
+    "This is an example of how to reduce a data set from 4 to 2 dimensions. Although `PCA` does not make much sense when applied to random data, we just want to demonstrate h"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
     "### Function transformers\n",
     "\n",
     "It can help to apply functions like `log` or `exp` or `1/x` to features to improve classification performance.\n",
     "\n",
     "Lets assume you want to forecast the outcome of car crash experiments and one variable is the time $t$ needed for the distance $l$ from start to crash. Transforming this to the actual speed $\\frac{l}{t}$ could be a more informative feature then $t$.\n",
     "\n",
+    "The corresponding class in `scikit-learn` is `sklearn.preprocessing.FunctionTransformer`.\n",
+    "\n",
     "### Imputing missing values\n",
     "\n",
-    "Sometimes data contain missing values. Data imputation is a strategy to fill up missing values, e.g. by the columnwise mean or by applying another strategy.\n"
+    "Sometimes data contain missing values. Data imputation is a strategy to fill in missing values, e.g. with the column-wise mean or by applying another strategy. You can find more about this [here](https://scikit-learn.org/stable/modules/impute.html#impute).\n",
+    "\n",
+    "\n",
+    "\n",
+    "### Encoding non-numerical values\n",
+    "\n",
+    "For example, `sklearn.preprocessing.LabelEncoder` encodes categorical text values as unique numbers. Let's assume one feature column has country codes like `DE`, `CH`, etc. The label encoder would replace these with unique numerical values assigned in sorted order of the codes, e.g. `CH -> 0`, `DE -> 1`, etc."
    ]
   },
   {
@@ -390,6 +409,13 @@
     "\n"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "<img src=\"https://i.imgflip.com/2xi5wt.jpg\" />"
+   ]
+  },
   {
    "cell_type": "markdown",
    "metadata": {},
@@ -407,20 +433,13 @@
     "6. Assess performance."
    ]
   },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "<img src=\"https://i.imgflip.com/2xi5wt.jpg\" />"
-   ]
-  },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
     "## The scikit-learn API (quick recap)\n",
     "\n",
-    "We've seen before that we can swap `scikit-learn` classifiers easily without changing much code. \n",
+    "We've seen before that we can interchange `scikit-learn` classifiers easily without changing much code. \n",
     "\n",
     "This is possible, because all classifiers have methods `.fit` and `.predict` which also have the same function signature (this means number and meaning of arguments is always the same for every implementation of `.fit` respectively `.predict`.)\n",
     "\n",
@@ -632,9 +651,9 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "## Parameter optimization\n",
+    "## Hyperparameter optimization\n",
     "\n",
-    "Classifiers and pipelines have parameters which must be adapted for improving performance. Finding good parameters is also called *hyper optimization* to distinguish from the optimization done during learning of many classification algorithms.\n",
+    "Classifiers and pipelines have parameters which must be tuned to improve performance. Finding good parameters is also called *hyperparameter optimization*, to distinguish it from the optimization done during the learning step of many classification algorithms.\n",
     "\n",
     "The simplest approach is to specify valid values for each parameter involved and then try out all possible combinations. This is called *grid search*:"
    ]
@@ -756,7 +775,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 289,
+   "execution_count": 24,
    "metadata": {},
    "outputs": [],
    "source": [
@@ -812,6 +831,233 @@
     "    \n",
     "    `make_pipeline(StandardScaler(with_mean=..., with_std=...), PCA(n=...), LogisticRegression(C=...))`"
    ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 23,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "/Users/uweschmitt/Projects/machinelearning-introduction-workshop/venv37/lib/python3.7/site-packages/ipykernel_launcher.py:9: UserWarning: get_ipython_dir has moved to the IPython.paths module since IPython 4.0.\n",
+      "  if __name__ == '__main__':\n"
+     ]
+    },
+    {
+     "data": {
+      "text/html": [
+       "<style>\n",
+       "    \n",
+       "    @import url('http://fonts.googleapis.com/css?family=Source+Code+Pro');\n",
+       "    \n",
+       "    @import url('http://fonts.googleapis.com/css?family=Kameron');\n",
+       "    @import url('http://fonts.googleapis.com/css?family=Crimson+Text');\n",
+       "    \n",
+       "    @import url('http://fonts.googleapis.com/css?family=Lato');\n",
+       "    @import url('http://fonts.googleapis.com/css?family=Source+Sans+Pro');\n",
+       "    \n",
+       "    @import url('http://fonts.googleapis.com/css?family=Lora'); \n",
+       "\n",
+       "    \n",
+       "    body {\n",
+       "        font-family: 'Lora', Consolas, sans-serif;\n",
+       "       \n",
+       "        -webkit-print-color-adjust: exact important !;\n",
+       "        \n",
+       "      \n",
+       "       \n",
+       "    }\n",
+       "    \n",
+       "    .alert-block {\n",
+       "        width: 95%;\n",
+       "        margin: auto;\n",
+       "    }\n",
+       "    \n",
+       "    .rendered_html code\n",
+       "    {\n",
+       "        color: black;\n",
+       "        background: #eaf0ff;\n",
+       "        background: #f5f5f5; \n",
+       "        padding: 1pt;\n",
+       "        font-family:  'Source Code Pro', Consolas, monocco, monospace;\n",
+       "    }\n",
+       "    \n",
+       "    p {\n",
+       "      line-height: 140%;\n",
+       "    }\n",
+       "    \n",
+       "    strong code {\n",
+       "        background: red;\n",
+       "    }\n",
+       "    \n",
+       "    .rendered_html strong code\n",
+       "    {\n",
+       "        background: #f5f5f5;\n",
+       "    }\n",
+       "    \n",
+       "    .CodeMirror pre {\n",
+       "    font-family: 'Source Code Pro', monocco, Consolas, monocco, monospace;\n",
+       "    }\n",
+       "    \n",
+       "    .cm-s-ipython span.cm-keyword {\n",
+       "        font-weight: normal;\n",
+       "     }\n",
+       "     \n",
+       "     strong {\n",
+       "         background: #f5f5f5;\n",
+       "         margin-top: 4pt;\n",
+       "         margin-bottom: 4pt;\n",
+       "         padding: 2pt;\n",
+       "         border: 0.5px solid #a0a0a0;\n",
+       "         font-weight: bold;\n",
+       "         color: darkred;\n",
+       "     }\n",
+       "     \n",
+       "    \n",
+       "    div #notebook {\n",
+       "        # font-size: 10pt; \n",
+       "        line-height: 145%;\n",
+       "        }\n",
+       "        \n",
+       "    li {\n",
+       "        line-height: 145%;\n",
+       "    }\n",
+       "\n",
+       "    div.output_area pre {\n",
+       "        background: #fff9d8 !important;\n",
+       "        padding: 5pt;\n",
+       "       \n",
+       "       -webkit-print-color-adjust: exact; \n",
+       "        \n",
+       "    }\n",
+       " \n",
+       "    \n",
+       " \n",
+       "    h1, h2, h3, h4 {\n",
+       "        font-family: Kameron, arial;\n",
+       "    }\n",
+       "    \n",
+       "    div#maintoolbar {display: none !important;}\n",
+       "    </style>"
+      ],
+      "text/plain": [
+       "<IPython.core.display.HTML object>"
+      ]
+     },
+     "execution_count": 23,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "#REMOVEBEGIN\n",
+    "# THE LINES BELOW ARE JUST FOR STYLING THE CONTENT ABOVE !\n",
+    "\n",
+    "from IPython import utils\n",
+    "from IPython.core.display import HTML\n",
+    "import os\n",
+    "def css_styling():\n",
+    "    \"\"\"Load default custom.css file from ipython profile\"\"\"\n",
+    "    base = utils.path.get_ipython_dir()\n",
+    "    styles = \"\"\"<style>\n",
+    "    \n",
+    "    @import url('http://fonts.googleapis.com/css?family=Source+Code+Pro');\n",
+    "    \n",
+    "    @import url('http://fonts.googleapis.com/css?family=Kameron');\n",
+    "    @import url('http://fonts.googleapis.com/css?family=Crimson+Text');\n",
+    "    \n",
+    "    @import url('http://fonts.googleapis.com/css?family=Lato');\n",
+    "    @import url('http://fonts.googleapis.com/css?family=Source+Sans+Pro');\n",
+    "    \n",
+    "    @import url('http://fonts.googleapis.com/css?family=Lora'); \n",
+    "\n",
+    "    \n",
+    "    body {\n",
+    "        font-family: 'Lora', Consolas, sans-serif;\n",
+    "       \n",
+    "        -webkit-print-color-adjust: exact important !;\n",
+    "        \n",
+    "      \n",
+    "       \n",
+    "    }\n",
+    "    \n",
+    "    .alert-block {\n",
+    "        width: 95%;\n",
+    "        margin: auto;\n",
+    "    }\n",
+    "    \n",
+    "    .rendered_html code\n",
+    "    {\n",
+    "        color: black;\n",
+    "        background: #eaf0ff;\n",
+    "        background: #f5f5f5; \n",
+    "        padding: 1pt;\n",
+    "        font-family:  'Source Code Pro', Consolas, monocco, monospace;\n",
+    "    }\n",
+    "    \n",
+    "    p {\n",
+    "      line-height: 140%;\n",
+    "    }\n",
+    "    \n",
+    "    strong code {\n",
+    "        background: red;\n",
+    "    }\n",
+    "    \n",
+    "    .rendered_html strong code\n",
+    "    {\n",
+    "        background: #f5f5f5;\n",
+    "    }\n",
+    "    \n",
+    "    .CodeMirror pre {\n",
+    "    font-family: 'Source Code Pro', monocco, Consolas, monocco, monospace;\n",
+    "    }\n",
+    "    \n",
+    "    .cm-s-ipython span.cm-keyword {\n",
+    "        font-weight: normal;\n",
+    "     }\n",
+    "     \n",
+    "     strong {\n",
+    "         background: #f5f5f5;\n",
+    "         margin-top: 4pt;\n",
+    "         margin-bottom: 4pt;\n",
+    "         padding: 2pt;\n",
+    "         border: 0.5px solid #a0a0a0;\n",
+    "         font-weight: bold;\n",
+    "         color: darkred;\n",
+    "     }\n",
+    "     \n",
+    "    \n",
+    "    div #notebook {\n",
+    "        # font-size: 10pt; \n",
+    "        line-height: 145%;\n",
+    "        }\n",
+    "        \n",
+    "    li {\n",
+    "        line-height: 145%;\n",
+    "    }\n",
+    "\n",
+    "    div.output_area pre {\n",
+    "        background: #fff9d8 !important;\n",
+    "        padding: 5pt;\n",
+    "       \n",
+    "       -webkit-print-color-adjust: exact; \n",
+    "        \n",
+    "    }\n",
+    " \n",
+    "    \n",
+    " \n",
+    "    h1, h2, h3, h4 {\n",
+    "        font-family: Kameron, arial;\n",
+    "    }\n",
+    "    \n",
+    "    div#maintoolbar {display: none !important;}\n",
+    "    </style>\"\"\"\n",
+    "    return HTML(styles)\n",
+    "css_styling()\n",
+    "#REMOVEEND"
+   ]
   }
  ],
  "metadata": {
diff --git a/create_datasets.py.ipynb b/create_datasets.py.ipynb
index 5d43765f9ce2c78c6816f85d8e7f5fdcd30fb2cb..70eb58057b7a77734c79d0a0ec84880cf5ee2633 100644
--- a/create_datasets.py.ipynb
+++ b/create_datasets.py.ipynb
@@ -1562,7 +1562,25 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.6.6"
+   "version": "3.7.2"
+  },
+  "latex_envs": {
+   "LaTeX_envs_menu_present": true,
+   "autoclose": false,
+   "autocomplete": true,
+   "bibliofile": "biblio.bib",
+   "cite_by": "apalike",
+   "current_citInitial": 1,
+   "eqLabelWithNumbers": true,
+   "eqNumInitial": 1,
+   "hotkeys": {
+    "equation": "Ctrl-E",
+    "itemize": "Ctrl-I"
+   },
+   "labels_anchors": false,
+   "latex_user_defs": false,
+   "report_style_numbering": false,
+   "user_envs_cfg": false
   }
  },
  "nbformat": 4,
diff --git a/salmon.csv b/salmon.csv
new file mode 100644
index 0000000000000000000000000000000000000000..489e32add168a4f5e14a806da283911a5a311df1
--- /dev/null
+++ b/salmon.csv
@@ -0,0 +1,101 @@
+circumference,length,kind,weight
+25.5,85.5,atlantic,31.2
+22.5,62.5,atlantic,12.4
+29.0,88.0,atlantic,34.8
+32.5,85.5,atlantic,62.7
+24.5,74.5,atlantic,24.2
+29.0,77.0,atlantic,28.4
+27.0,85.0,atlantic,26.8
+29.0,74.5,atlantic,37.0
+21.0,75.0,atlantic,12.2
+24.5,66.0,atlantic,17.3
+22.0,68.0,atlantic,16.1
+24.0,76.0,atlantic,23.8
+28.0,75.5,atlantic,35.3
+34.0,103.0,atlantic,62.8
+21.5,61.0,atlantic,14.7
+26.0,72.5,atlantic,22.3
+24.0,74.0,atlantic,20.4
+21.0,76.0,atlantic,18.1
+20.0,70.5,atlantic,14.2
+31.5,81.0,atlantic,54.1
+33.0,95.0,atlantic,65.2
+24.5,71.0,atlantic,22.9
+33.5,92.5,atlantic,49.8
+31.5,90.5,atlantic,48.3
+28.0,80.0,atlantic,30.1
+20.5,68.0,atlantic,11.9
+33.5,89.0,atlantic,72.8
+23.5,63.0,atlantic,16.6
+34.0,101.0,atlantic,82.0
+32.0,84.5,atlantic,53.5
+32.0,98.5,atlantic,67.2
+23.5,70.0,atlantic,19.3
+20.0,64.5,atlantic,12.7
+22.0,64.5,atlantic,14.9
+28.0,85.0,atlantic,30.8
+34.5,91.0,atlantic,71.6
+24.5,64.0,atlantic,17.4
+21.0,61.5,atlantic,9.9
+22.0,68.5,atlantic,17.6
+30.0,91.5,atlantic,39.9
+25.5,79.0,atlantic,32.2
+21.5,73.5,atlantic,19.5
+20.5,67.0,atlantic,16.1
+27.5,75.5,atlantic,29.6
+30.5,84.5,atlantic,34.5
+25.0,67.5,atlantic,26.4
+24.0,75.5,atlantic,21.1
+28.0,76.5,atlantic,39.4
+33.5,96.5,atlantic,78.0
+31.0,96.0,atlantic,60.8
+20.5,65.0,sockeye,23.0
+19.0,66.0,sockeye,19.9
+23.0,65.0,sockeye,25.0
+19.0,64.5,sockeye,21.2
+24.5,79.5,sockeye,27.2
+25.5,84.0,sockeye,33.8
+22.5,72.0,sockeye,22.1
+24.5,81.5,sockeye,31.0
+21.5,61.0,sockeye,18.9
+22.0,71.5,sockeye,22.3
+18.0,53.0,sockeye,13.9
+19.0,64.5,sockeye,19.5
+23.5,65.5,sockeye,28.5
+23.0,75.0,sockeye,27.7
+22.5,68.5,sockeye,28.9
+22.0,62.0,sockeye,22.0
+26.0,82.0,sockeye,31.8
+19.0,59.5,sockeye,20.4
+24.0,70.5,sockeye,25.8
+23.0,75.0,sockeye,31.0
+21.0,74.0,sockeye,20.2
+18.0,53.0,sockeye,18.1
+22.0,63.0,sockeye,20.4
+19.0,63.5,sockeye,21.5
+24.0,65.0,sockeye,26.9
+22.5,78.5,sockeye,31.1
+18.0,66.0,sockeye,15.7
+27.0,82.0,sockeye,36.5
+22.0,71.5,sockeye,28.5
+27.0,89.0,sockeye,38.3
+18.5,54.0,sockeye,17.4
+18.0,59.5,sockeye,18.1
+25.5,73.5,sockeye,31.2
+24.0,81.5,sockeye,31.1
+20.5,60.5,sockeye,20.8
+27.5,79.5,sockeye,40.8
+21.0,77.0,sockeye,20.7
+25.0,81.5,sockeye,30.1
+22.5,78.0,sockeye,30.7
+21.5,59.5,sockeye,24.2
+20.5,77.0,sockeye,23.3
+24.0,73.0,sockeye,29.3
+24.0,80.0,sockeye,28.4
+18.5,63.0,sockeye,16.3
+22.0,78.0,sockeye,25.6
+19.0,69.5,sockeye,18.8
+18.5,67.0,sockeye,18.9
+24.5,67.5,sockeye,24.7
+21.0,66.5,sockeye,26.0
+27.5,86.5,sockeye,43.4