From 1b74fff4748dfd13fabb208788254761f09c36de Mon Sep 17 00:00:00 2001 From: Mikolaj Rybinski <mikolaj.rybinski@id.ethz.ch> Date: Fri, 17 Sep 2021 17:03:26 +0200 Subject: [PATCH] In Notebook 04, in exercise block 2: fix reference and decimal places typo --- 04_measuring_quality_of_a_classifier.ipynb | 203 +++++++++++---------- 1 file changed, 103 insertions(+), 100 deletions(-) diff --git a/04_measuring_quality_of_a_classifier.ipynb b/04_measuring_quality_of_a_classifier.ipynb index 8f8e887..c4fcf73 100644 --- a/04_measuring_quality_of_a_classifier.ipynb +++ b/04_measuring_quality_of_a_classifier.ipynb @@ -3,6 +3,8 @@ { "cell_type": "code", "execution_count": null, + "metadata": {}, + "outputs": [], "source": [ "# IGNORE THIS CELL WHICH CUSTOMIZES LAYOUT AND STYLING OF THE NOTEBOOK !\n", "import matplotlib.pyplot as plt\n", @@ -12,57 +14,56 @@ "warnings.filterwarnings('ignore', category=FutureWarning)\n", "warnings.filterwarnings = lambda *a, **kw: None\n", "from IPython.core.display import HTML; HTML(open(\"custom.html\", \"r\").read())" - ], - "outputs": [], - "metadata": {} + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ "# Chapter 4: Metrics for evaluating the performance of a classifier" - ], - "metadata": {} + ] }, { "cell_type": "code", "execution_count": null, + "metadata": {}, + "outputs": [], "source": [ "import sklearn.metrics as metrics\n", "import matplotlib\n", "import matplotlib.pyplot as plt\n", "import numpy as np\n", "import pandas as pd" - ], - "outputs": [], - "metadata": {} + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ "Up to now we used *accuracy*, the percentage of correct classifcations, to evaluate the quality of a classifier.\n", "\n", "Regrettably _accuracy_ can produce very misleading results. \n", "\n", "This chapter will discuss other metrics used to asses the quality of a classifier, including the possible pitfalls." - ], - "metadata": {} + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ "## The confusion matrix" - ], - "metadata": {} + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ "Before we define the **confusion matrix** we must introduce some additional terms. \n" - ], - "metadata": {} + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ "After applying a classifier to a data set with known labels `0` and `1`:\n", "\n", @@ -120,11 +121,11 @@ "\n", "<img src=\"./images/305c8j.jpg\" title=\"made at imgflip.com\" width=40%/>\n", "\n" - ], - "metadata": {} + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ "\n", "\n", @@ -148,11 +149,11 @@ "\n", "</div>\n", "\n" - ], - "metadata": {} + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ "## Pitfalls\n", "\n", @@ -190,22 +191,26 @@ "2. Does our test predict people as infected which are actually not: How many positive diagnoses are correct ?\n", "\n", "We come back to this example later." - ], - "metadata": {} + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ "## Exercise block 1\n", "\n", "1. A classifier predicts labels `[0, 1, 0, 1, 1, 0, 1, 0]` whereas true labels are `[0, 0, 1, 1, 1, 0, 1, 1]`. First write these values as a two columned table using pen & paper and assign `FP`, `TP`, ... to each row. Now create the confusion matrix and compute accuracy.\n", "\n", "2. A random classfier just assign a randomly chosen label `0` or `1` to a given sample. What is the average accuracy of such a classifier?" 
- ], - "metadata": {} + ] }, { "cell_type": "markdown", + "metadata": { + "tags": [ + "solution" + ] + }, "source": [ "SOLUTION 1.1 \n", "<pre>\n", @@ -228,27 +233,27 @@ "SOLUTION 1.2 \n", "\n", "On average all fields of the confusion matrix should contain same values, thus the accuracy would be 50 %." - ], - "metadata": { - "tags": [ - "solution" - ] - } + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ "### Optional exercise\n", "\n", "Assume the previously described test also produces wrong results on not-infected people, such that 5% will be diagnosed as infected. Compute the confusion matrix and the accuracy of this test.\n", "\n" - ], - "metadata": {} + ] }, { "cell_type": "markdown", + "metadata": { + "tags": [ + "solution" + ] + }, "source": [ - "\n", + "SOLUTION Optional exercise\n", "\n", "This is the new situation:\n", "- On average 10 out of 10000 people are infected with a disease `X`. \n", @@ -264,15 +269,11 @@ "\n", "accuracy = 9495.5 / 10000 = 94.96 %\n", "</pre>" - ], - "metadata": { - "tags": [ - "solution" - ] - } + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ "## Precision and Recall\n", "\n", @@ -291,11 +292,11 @@ "<img src=\"./images/precision-recall-1.png\" width=90% />\n", "\n", "\n" - ], - "metadata": {} + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ "### How to compute precision and recall for a classifier\n", "\n", @@ -382,21 +383,26 @@ "</div>\n", "\n", "For the medical test `Z` the `F1` score is `1 / 1.5 = 0.6666..`." - ], - "metadata": {} + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ "## Exercise block 2\n", "\n", - "Use your results from exercise 1.1 to compute precision, recall and F1 score." - ], - "metadata": {} + "Use your results from Exercise block 1.1 to compute precision, recall and F1 score." + ] }, { "cell_type": "markdown", + "metadata": { + "tags": [ + "solution" + ] + }, "source": [ + "SOLUTION 2\n", "<pre>\n", "TP = 3 FP = 1\n", "FN = 2 TN = 2\n", @@ -405,68 +411,67 @@ "recall = 3 / (3 + 2) = 60 %\n", "F1 = 2 * (0.6 * 0.75) / (0.6 + 0.75) = 66.66%\n", "</pre>" - ], - "metadata": { - "tags": [ - "solution" - ] - } + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ "### Optional exercise:\n", "\n", - "Compute precision, recall and F1-score for the test described in exercise 1.2." - ], - "metadata": {} + "Compute precision, recall and F1-score for the test described in Exercise block 1 Optional exercise." + ] }, { "cell_type": "markdown", + "metadata": { + "tags": [ + "solution" + ] + }, "source": [ + "SOLUTION 2 Optional exercise\n", + "\n", "<pre>\n", "TP = 5 FP = 499.5\n", "FN = 5 TN = 9490.5\n", "\n", "precision = 5 / (5 + 499.5) = 0.0099\n", "recall = 5 / (5 + 5) = 0.5\n", - "F1 = 2 * (0.099 * 0.5) / (0.0099 + 0.5) = 0.194\n", + "F1 = 2 * (0.0099 * 0.5) / (0.0099 + 0.5) = 0.0194\n", "</pre>" - ], - "metadata": { - "tags": [ - "solution" - ] - } + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ "## Other metrics\n", "\n", "The discussion above was just a quick introduction to measuring the accuracy of a classifier. 
We skipped other metrics such as `ROC` and `AUC` amongst others.\n", "\n", "A good introduction to `ROC` <a href=\"https://classeval.wordpress.com/introduction/introduction-to-the-roc-receiver-operating-characteristics-plot/\">can be found here.</a>" - ], - "metadata": {} + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ "## Metrics in scikit-learn" - ], - "metadata": {} + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ "`sklearn.metrics` contains all introduced above metrics, as well as the previously-used classification accuracy:" - ], - "metadata": {} + ] }, { "cell_type": "code", "execution_count": null, + "metadata": {}, + "outputs": [], "source": [ "from sklearn.metrics import precision_score, recall_score, f1_score, accuracy_score\n", "\n", @@ -483,60 +488,60 @@ "print(\"{:20s} {:.3f}\".format(\"recall\", recall_score(labels, predicted)))\n", "print(\"{:20s} {:.3f}\".format(\"f1\", f1_score(labels, predicted)))\n", "print(\"{:20s} {:.3f}\".format(\"accuracy\", accuracy_score(labels, predicted)))\n" - ], - "outputs": [], - "metadata": {} + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ "### Classification report\n", "\n", "`scikit-learn` also offers a function to print a classification report, which is an overview table of precision, recall and F1 metrics:" - ], - "metadata": {} + ] }, { "cell_type": "code", "execution_count": null, + "metadata": {}, + "outputs": [], "source": [ "from sklearn.metrics import classification_report\n", "\n", "print(classification_report(labels, predicted, ))" - ], - "outputs": [], - "metadata": {} + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ "1. The `support` column lists the number of samples in each class and in total.\n", "2. The `macro average` row lists unweighted mean of a metric for each label. This does NOT take classes imbalance into account.\n", "3. The `weighted average` row lists weighted by support mean of a metric for each label. This does take classes imbalance into account.\n", "\n", "Note: normally the precision, recall and F1 metrics are only the \"Positive\" (`1`) class metrics (cf. results above)." - ], - "metadata": {} + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ "### Confusion matrix" - ], - "metadata": {} + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ "The `sklearn.metrics` module contains also `confusion_matrix` utility which returns the confusion matrix.\n", "\n", "Beware: the matrix is transposed with respect to the conventional notation; actual (true) classes are given in rows, whereas predicted in columns." - ], - "metadata": {} + ] }, { "cell_type": "code", "execution_count": null, + "metadata": {}, + "outputs": [], "source": [ "from sklearn.metrics import confusion_matrix\n", "\n", @@ -553,20 +558,20 @@ "print()\n", "\n", "#\n" - ], - "outputs": [], - "metadata": {} + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ "Having a classifier object, the confusion matrix can also be visualized using a `plot_confusion_matrix` utility function." 
- ], - "metadata": {} + ] }, { "cell_type": "code", "execution_count": null, + "metadata": {}, + "outputs": [], "source": [ "from sklearn.linear_model import LogisticRegression\n", "from sklearn.metrics import plot_confusion_matrix\n", @@ -598,12 +603,11 @@ "\n", "cm_disp.ax_.set_title('Confusion matrix: \"beer\" dataset + LR classfier')\n", "plt.show()" - ], - "outputs": [], - "metadata": {} + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ "<div class=\"alert alert-block alert-info\">\n", "<p>\n", @@ -614,11 +618,11 @@ "<img src=\"./images/confusion_matrix-iris_svc.png\" width=\"50%\" />\n", "\n", "</div>" - ], - "metadata": {} + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ "### Case-study: an imbalanced dataset\n", "\n", @@ -628,12 +632,13 @@ "\n", "- the beer data samples in which labels distribution is almost 50:50, and\n", "- an unbalanced subset of the beer data samples." - ], - "metadata": {} + ] }, { "cell_type": "code", "execution_count": null, + "metadata": {}, + "outputs": [], "source": [ "import pandas as pd\n", "\n", @@ -651,13 +656,13 @@ "print(\"unbalanced data\")\n", "print(beer_data_unbalanced.shape)\n", "print(\"#class 1:\", sum(beer_data_unbalanced.iloc[:,-1] == 1))" - ], - "outputs": [], - "metadata": {} + ] }, { "cell_type": "code", "execution_count": null, + "metadata": {}, + "outputs": [], "source": [ "from sklearn.model_selection import cross_val_score\n", "from sklearn.linear_model import LogisticRegression\n", @@ -688,16 +693,14 @@ "\n", "print(\"unbalanced data\")\n", "assess(classifier, beer_data_unbalanced)" - ], - "outputs": [], - "metadata": {} + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ "You can see that for the balanced data set the values for `f1` and for `accuracy` are almost equal, but differ significantly for the unbalanced data set. The `f1` metric captures the `precision` and `recall` trade off which is visible for imbalanced datasets." - ], - "metadata": {} + ] }, { "cell_type": "markdown", -- GitLab
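As a quick sanity check of the corrected figures in Exercise block 2 above (the 0.0099 precision term and the F1 value of 0.0194), the metrics can be recomputed directly from the confusion-matrix counts quoted in the two solution cells. The standalone Python sketch below is not part of the patched notebook; it only restates the arithmetic from those solutions, and the helper function name prf1 is introduced here purely for illustration.

def prf1(tp, fp, fn):
    """Precision, recall and F1 computed from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Exercise block 2 solution: TP=3, FP=1, FN=2 -> precision 0.75, recall 0.60, F1 ~0.667
print("exercise 2:        precision={:.2f}  recall={:.2f}  f1={:.4f}".format(*prf1(3, 1, 2)))

# Optional exercise solution: expected counts TP=5, FP=499.5, FN=5
# -> precision ~0.0099, recall 0.5, F1 ~0.0194 (the decimal-places value fixed by this patch)
print("optional exercise: precision={:.4f}  recall={:.2f}  f1={:.4f}".format(*prf1(5, 499.5, 5)))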