{
     "cells": [
      {
       "cell_type": "markdown",
       "metadata": {},
       "source": [
        "# Chapter 5: An overview of classifiers"
       ]
      },
    
      {
       "cell_type": "markdown",
       "metadata": {},
       "source": [
        "What classifiers ?\n",
        "\n",
        "- Neighrest neighbours\n",
        "- Logistic Regression\n",
        "- Linear SVM\n",
        "\n",
        "- Kernel SVM\n",
        "- Decision trees\n",
        "- Random forests\n",
        "\n",
    
        "- XGboost (https://xgboost.readthedocs.io/en/latest/) (not part of scikit-learn, won many kaggle competitions https://www.kaggle.com/dansbecker/xgboost, offers scikit-learn API https://www.kaggle.com/stuarthallows/using-xgboost-with-scikit-learn)\n",
        "\n",
        "\n",
    
        "For every classifier: some examples for decision surfaces.\n",
        "\n",
        "Historical information ?"
       ]
      },
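  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Below is a minimal, illustrative sketch (not part of the original outline) of using XGBoost through its scikit-learn-style API; the iris data and the default parameters are assumptions for demonstration, and the `xgboost` package must be installed separately."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Illustrative sketch: XGBoost via its scikit-learn API (assumes xgboost is installed).\n",
    "from sklearn.datasets import load_iris\n",
    "from sklearn.model_selection import train_test_split\n",
    "from xgboost import XGBClassifier\n",
    "\n",
    "X, y = load_iris(return_X_y=True)\n",
    "X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)\n",
    "\n",
    "clf = XGBClassifier()  # follows the usual fit / predict / score interface\n",
    "clf.fit(X_train, y_train)\n",
    "print(clf.score(X_test, y_test))"
   ]
  },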
      {
       "cell_type": "markdown",
       "metadata": {},
       "source": [
        "## Neighrest neighbours\n",
        "\n",
        "- For a new feature $x$ look for $N$ closests examples from learning data (usually using the euclidean distance).\n",
        "- Classify $x$ as the majority of labels among these closest examples.\n",
        "\n",
        "Parameter: $N$. the larger $N$ the smoother the decision surface.\n",
        "\n",
        "Benefit: simple\n",
        "\n",
        "Disadvanages: needs lots of data, does not work well for dimesions > 8(ish) (source !?)\n",
        "\n",
        "TODO: Commentary about course of dimensionality"
       ]
      },
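  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A minimal sketch (not from the original outline) of a k-nearest-neighbour classifier with scikit-learn; the iris data and the choice of `n_neighbors=5` are illustrative assumptions."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Illustrative sketch: k-nearest neighbours on assumed example data (iris).\n",
    "from sklearn.datasets import load_iris\n",
    "from sklearn.model_selection import train_test_split\n",
    "from sklearn.neighbors import KNeighborsClassifier\n",
    "\n",
    "X, y = load_iris(return_X_y=True)\n",
    "X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)\n",
    "\n",
    "# n_neighbors plays the role of N above: larger values give smoother decision surfaces.\n",
    "clf = KNeighborsClassifier(n_neighbors=5)\n",
    "clf.fit(X_train, y_train)\n",
    "print(clf.score(X_test, y_test))"
   ]
  },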
      {
       "cell_type": "markdown",
       "metadata": {},
       "source": [
        "## Logistic regression\n",
        "\n",
        "$\\sigma (t)={\\frac {e^{t}}{e^{t}+1}}={\\frac {1}{1+e^{-t}}}$\n",
        "\n",
        "plot !\n",
        "\n",
        "linear classifier, sigma shrinks result of linear combinations to interval 0, 1 which are interpreted as class probabilities.\n",
        "\n",
        "works better in high dimensions\n",
        "\n",
        "weights can be interpreted\n",
        "\n",
        "Parameters: C (https://stackoverflow.com/questions/22851316/what-is-the-inverse-of-regularization-strength-in-logistic-regression-how-shoul)\n",
        "\n",
        "Penelaty to avoid overfitting\n",
        "\n",
        "Plot logistig regression diagram as very simple neural network ?"
       ]
      },
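  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A minimal sketch (not from the original outline) that plots the sigmoid $\\sigma(t)$ and fits a scikit-learn `LogisticRegression`; the iris data and `C=1.0` are illustrative assumptions."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Illustrative sketch: the sigmoid function and a logistic regression fit (iris data assumed).\n",
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "from sklearn.datasets import load_iris\n",
    "from sklearn.linear_model import LogisticRegression\n",
    "\n",
    "# Plot sigma(t) = 1 / (1 + exp(-t)).\n",
    "t = np.linspace(-6, 6, 200)\n",
    "plt.plot(t, 1 / (1 + np.exp(-t)))\n",
    "plt.xlabel('t')\n",
    "plt.ylabel('sigma(t)')\n",
    "plt.show()\n",
    "\n",
    "# C is the inverse regularisation strength: smaller C means a stronger penalty.\n",
    "clf = LogisticRegression(C=1.0)\n",
    "X, y = load_iris(return_X_y=True)\n",
    "clf.fit(X, y)\n",
    "print(clf.coef_)  # the weights, which can be inspected and interpreted"
   ]
  },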
      {
       "cell_type": "markdown",
       "metadata": {},
       "source": [
        "## Linear SVM\n",
        "\n",
        "- linear classifier such that margin is maximised (show example)\n",
        "- based on \"empirical risk minization\" (vapnik)\n",
        "\n",
        "the final weight vector is a linear combination of a subset of the features from the learning set. These are called \"support vectors\".\n",
        "\n",
        "weights can be interpreted\n",
        "\n",
        "C: how much weight to we put on examples within the \"margin strip\"\n"
       ]
      },
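  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A minimal sketch (not from the original outline) of a linear SVM with scikit-learn's `SVC(kernel='linear')`; the iris data and `C=1.0` are illustrative assumptions."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Illustrative sketch: a linear SVM; C controls how much weight examples inside the margin get.\n",
    "from sklearn.datasets import load_iris\n",
    "from sklearn.svm import SVC\n",
    "\n",
    "X, y = load_iris(return_X_y=True)\n",
    "\n",
    "# kernel='linear' exposes the support vectors and the weight vector coef_.\n",
    "clf = SVC(kernel='linear', C=1.0)\n",
    "clf.fit(X, y)\n",
    "print(clf.support_vectors_.shape)  # the subset of training examples defining the boundary\n",
    "print(clf.coef_)  # interpretable weights (one row per pairwise decision boundary)"
   ]
  },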
      {
       "cell_type": "markdown",
       "metadata": {},
       "source": [
        "## Kernel based SVM\n",
        "\n",
        "So called kernels are used to build the classifiation surface. Default kernel is rbf.\n",
        "\n",
        "Hard to intepret the internals.\n",
        "\n",
    
        "for rbf: gamma parameter is \"decline rate\" of rbf functions, controls smoothness of decision surface.\n",
        "\n",
        "feature scaling is crucial for good performance !"
    
       ]
      },
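  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A minimal sketch (not from the original outline) of an RBF-kernel SVM combined with feature scaling in a pipeline; the iris data, `gamma=0.5` and `C=1.0` are illustrative assumptions."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Illustrative sketch: RBF-kernel SVM with feature scaling (scaling is crucial for SVMs).\n",
    "from sklearn.datasets import load_iris\n",
    "from sklearn.pipeline import make_pipeline\n",
    "from sklearn.preprocessing import StandardScaler\n",
    "from sklearn.svm import SVC\n",
    "\n",
    "X, y = load_iris(return_X_y=True)\n",
    "\n",
    "# gamma is the 'decline rate' of the RBF functions; larger gamma gives a less smooth surface.\n",
    "clf = make_pipeline(StandardScaler(), SVC(kernel='rbf', gamma=0.5, C=1.0))\n",
    "clf.fit(X, y)\n",
    "print(clf.score(X, y))"
   ]
  },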
      {
       "cell_type": "markdown",
       "metadata": {},
       "source": [
        "## Decision trees\n",
        "\n",
        "- simple example incl. plot\n",
        "- basic idea: \"optimal\" splits...\n",
        "\n",
        "- benefit: interpretability\n",
        "\n",
        "Parameter: depth, the deeper the higher the risk for overfitting."
       ]
      },
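  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A minimal sketch (not from the original outline) of a shallow decision tree; the iris data and `max_depth=3` are illustrative assumptions, and `plot_tree` requires scikit-learn >= 0.21."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Illustrative sketch: a shallow decision tree; max_depth limits the risk of overfitting.\n",
    "import matplotlib.pyplot as plt\n",
    "from sklearn.datasets import load_iris\n",
    "from sklearn.tree import DecisionTreeClassifier, plot_tree\n",
    "\n",
    "X, y = load_iris(return_X_y=True)\n",
    "\n",
    "clf = DecisionTreeClassifier(max_depth=3, random_state=0)\n",
    "clf.fit(X, y)\n",
    "\n",
    "# Plotting the fitted tree is the main interpretability benefit.\n",
    "plot_tree(clf, filled=True)\n",
    "plt.show()"
   ]
  },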
      {
       "cell_type": "markdown",
       "metadata": {},
       "source": [
        "## Random forests\n",
        "\n",
        "- generate many week classifiers by creating shallow trees with random splittings\n",
        "- use so call bagging to implement a good overall classifier\n",
        "\n",
        "- benefits: allows also estimates about feature importance\n",
        "\n",
        "- more robust to overfitting than decision trees\n"
       ]
      },
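  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A minimal sketch (not from the original outline) of a random forest and its feature importance estimates; the iris data, `n_estimators=100` and `max_depth=3` are illustrative assumptions."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Illustrative sketch: a random forest built from many shallow, randomised trees (bagging).\n",
    "from sklearn.datasets import load_iris\n",
    "from sklearn.ensemble import RandomForestClassifier\n",
    "\n",
    "X, y = load_iris(return_X_y=True)\n",
    "\n",
    "clf = RandomForestClassifier(n_estimators=100, max_depth=3, random_state=0)\n",
    "clf.fit(X, y)\n",
    "print(clf.feature_importances_)  # per-feature importance estimates"
   ]
  },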
    
      {
       "cell_type": "code",
       "execution_count": null,
       "metadata": {},
       "outputs": [],
       "source": []
      }
     ],
     "metadata": {
      "kernelspec": {
       "display_name": "Python 3",
       "language": "python",
       "name": "python3"
      },
      "language_info": {
       "codemirror_mode": {
        "name": "ipython",
        "version": 3
       },
       "file_extension": ".py",
       "mimetype": "text/x-python",
       "name": "python",
       "nbconvert_exporter": "python",
       "pygments_lexer": "ipython3",
       "version": "3.7.2"
      }
     },
     "nbformat": 4,
     "nbformat_minor": 2
    }