01_introduction.ipynb 741 KiB
{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<style>\n",
       "    \n",
       "    @import url('http://fonts.googleapis.com/css?family=Source+Code+Pro');\n",
       "    \n",
       "    @import url('http://fonts.googleapis.com/css?family=Kameron');\n",
       "    @import url('http://fonts.googleapis.com/css?family=Crimson+Text');\n",
       "    \n",
       "    @import url('http://fonts.googleapis.com/css?family=Lato');\n",
       "    @import url('http://fonts.googleapis.com/css?family=Source+Sans+Pro');\n",
       "    \n",
       "    @import url('http://fonts.googleapis.com/css?family=Lora'); \n",
       "\n",
       "    \n",
       "    body {\n",
       "        font-family: 'Lora', Consolas, sans-serif;\n",
       "       \n",
       "        -webkit-print-color-adjust: exact important !;\n",
       "        \n",
       "      \n",
       "       \n",
       "    }\n",
       "    \n",
       "    .alert-block {\n",
       "        width: 95%;\n",
       "        margin: auto;\n",
       "    }\n",
       "    \n",
       "    .rendered_html code\n",
       "    {\n",
       "        color: black;\n",
       "        background: #eaf0ff;\n",
       "        background: #f5f5f5; \n",
       "        padding: 1pt;\n",
       "        font-family:  'Source Code Pro', Consolas, monocco, monospace;\n",
       "    }\n",
       "    \n",
       "    p {\n",
       "      line-height: 140%;\n",
       "    }\n",
       "    \n",
       "    strong code {\n",
       "        background: red;\n",
       "    }\n",
       "    \n",
       "    .rendered_html strong code\n",
       "    {\n",
       "        background: #f5f5f5;\n",
       "    }\n",
       "    \n",
       "    .CodeMirror pre {\n",
       "    font-family: 'Source Code Pro', monocco, Consolas, monocco, monospace;\n",
       "    }\n",
       "    \n",
       "    .cm-s-ipython span.cm-keyword {\n",
       "        font-weight: normal;\n",
       "     }\n",
       "     \n",
       "     strong {\n",
       "         background: #f5f5f5;\n",
       "         margin-top: 4pt;\n",
       "         margin-bottom: 4pt;\n",
       "         padding: 2pt;\n",
       "         border: 0.5px solid #a0a0a0;\n",
       "         font-weight: bold;\n",
       "         color: darkred;\n",
       "     }\n",
       "     \n",
       "    \n",
       "    div #notebook {\n",
       "        # font-size: 10pt; \n",
       "        line-height: 145%;\n",
       "        }\n",
       "        \n",
       "    li {\n",
       "        line-height: 145%;\n",
       "    }\n",
       "\n",
       "    div.output_area pre {\n",
       "        background: #fff9d8 !important;\n",
       "        padding: 5pt;\n",
       "       \n",
       "       -webkit-print-color-adjust: exact; \n",
       "        \n",
       "    }\n",
       " \n",
       "    \n",
       " \n",
       "    h1, h2, h3, h4 {\n",
       "        font-family: Kameron, arial;\n",
       "\n",
       "\n",
       "    }\n",
       "    \n",
       "    div#maintoolbar {display: none !important;}\n",
       "</style>\n",
       "    <script>\n",
       "IPython.OutputArea.prototype._should_scroll = function(lines) {\n",
       "        return false;\n",
       "}\n",
       "    </script>\n"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# IGNORE THIS CELL WHICH CUSTOMIZES LAYOUT AND STYLING OF THE NOTEBOOK !\n",
    "import matplotlib.pyplot as plt\n",
    "%matplotlib inline\n",
    "%config InlineBackend.figure_format = 'retina'\n",
    "import warnings\n",
    "warnings.filterwarnings('ignore', category=FutureWarning)\n",
    "warnings.filterwarnings = lambda *a, **kw: None\n",
    "from IPython.core.display import HTML; HTML(open(\"custom.html\", \"r\").read())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Chapter 1: General Introduction to machine learning (ML)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## ML = \"learning models from data\"\n",
    "\n",
    "\n",
    "### About models\n",
    "\n",
    "A \"model\" allows us to explain observations and to answer questions. For example:\n",
    "\n",
    "   1. Where will my car at given velocity stop if I apply break now?\n",
    "   2. Where on the night sky will I see the moon tonight?\n",
    "   3. Is the email I received spam?\n",
    "   4. What product should I recommend my customer `X` ?\n",
    "   \n",
    "- The first two questions can be answered based on existing physical models (formulas). \n",
    "\n",
    "- For the  questions 3 and 4 it is difficult to develop explicitly formulated models. \n",
    "\n",
    "### What is needed to apply ML ?\n",
    "\n",
    "\n",
    "- We have no explicit formula for such a task.\n",
    "\n",
    "\n",
    "- We have a vague understanding of the problem domain, e.g. we know that some words are specific to spam emails and others are specific to my personal and work-related emails.\n",
    "\n",
    "\n",
    "- We have enough example data, as my mailbox is full of both spam and non-spam emails.\n",
    "\n",
    "\n",
    "We could handcraft a personal spam classifier by hard coding rules, like \"mail contains 'no prescription' and comes from russia or china\" plus some statistics which would be very tedious\n",
    "\n",
    "<div class=\"alert alert-block alert-info\">\n",
    "<i class=\"fa fa-info-circle\"></i>\n",
    "    Systems with such hard coded rules are called <strong>expert systems</strong>\n",
    "</div>\n",
    "\n",
    "**In such cases machine learning offers approaches to automatically build predictive models based on example data.**\n",
    "\n",
    "\n",
    "<div class=\"alert alert-block alert-info\">\n",
    "<i class=\"fa fa-info-circle\"></i>\n",
    "The closely-related concept of <strong>data mining</strong> usually means use of predictive machine learning models to explicitly discover previously unknown knowledge from a specific data set, such as, for instance, association rules between customer and article types in the Problem 4 above.\n",
    "\n",
    "\n",
    "\n",
    "## ML: what is \"learning\" ?\n",
    "\n",
    "To create a predictive model, we must first **train** such a model on given data. \n",
    "<div class=\"alert alert-block alert-info\">\n",
    "<i class=\"fa fa-info-circle\"></i>\n",
    "Alternative names for \"to train\" a model are \"to <strong>fit</strong>\" or \"to <strong>learn</strong>\" a model.\n",
    "</div>\n",
    "\n",
    "All ML algorithms have in common that they rely on internal data structures and/or parameters. Learning then builds up such data structures or adjusts parameters based on the given data. After that such models can be used to explain observations or to answer questions.\n",
    "\n",
    "The important difference between explicit models and models learned from data:\n",