Commit c1f14503 authored by schmittu :beer:

added few things to 05_classifiers_overview.ipynb draft

parent 377739b7
%% Cell type:markdown id: tags:
# Chapter 5: An overview of classifiers
%% Cell type:markdown id: tags:
Which classifiers?

- Nearest neighbours
- Logistic Regression
- Linear SVM
- Kernel SVM
- Decision trees
- Random forests
- XGBoost (https://xgboost.readthedocs.io/en/latest/) (not part of scikit-learn, won many Kaggle competitions https://www.kaggle.com/dansbecker/xgboost, offers a scikit-learn API https://www.kaggle.com/stuarthallows/using-xgboost-with-scikit-learn)

For every classifier: show some example decision surfaces.

Historical information?
%% Cell type:markdown id: tags:
## Nearest neighbours

- For a new feature vector $x$, look for the $N$ closest examples in the training data (usually using the Euclidean distance).
- Classify $x$ according to the majority of labels among these closest examples.

Parameter: $N$. The larger $N$, the smoother the decision surface.

Benefit: simple.

Disadvantages: needs lots of data, does not work well for dimensions > 8(ish) (source !?).

TODO: Commentary about curse of dimensionality
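%% Cell type:markdown id: tags:
A minimal sketch of a nearest-neighbour classifier in scikit-learn (the iris dataset and `n_neighbors=5` are illustrative assumptions, not part of the draft above):
%% Cell type:code id: tags:
``` python
# Minimal sketch: k-nearest-neighbour classification (illustrative dataset and parameter choice).
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# n_neighbors plays the role of N above: the larger it is, the smoother the decision surface.
classifier = KNeighborsClassifier(n_neighbors=5)
classifier.fit(X_train, y_train)
print("test accuracy:", classifier.score(X_test, y_test))
```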
%% Cell type:markdown id: tags:
## Logistic regression

$\sigma (t)={\frac {e^{t}}{e^{t}+1}}={\frac {1}{1+e^{-t}}}$

plot !

A linear classifier: $\sigma$ squashes the result of a linear combination of the features into the interval $(0, 1)$, which is interpreted as a class probability.

Works better in high dimensions.

Weights can be interpreted.

Parameter: C (https://stackoverflow.com/questions/22851316/what-is-the-inverse-of-regularization-strength-in-logistic-regression-how-shoul), the inverse of the regularization strength. The regularization penalty helps to avoid overfitting.

Plot logistic regression diagram as a very simple neural network ?
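%% Cell type:markdown id: tags:
A minimal sketch of logistic regression with an explicit `C` value (the iris dataset and the value `C=1.0` are illustrative assumptions):
%% Cell type:code id: tags:
``` python
# Minimal sketch: logistic regression (illustrative dataset and parameter choice).
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# C is the inverse of the regularization strength: smaller C means a stronger penalty on the weights.
classifier = LogisticRegression(C=1.0, max_iter=1000)
classifier.fit(X_train, y_train)

print("test accuracy:", classifier.score(X_test, y_test))
# the learned weights can be inspected and interpreted (one weight vector per class here)
print("learned weights:", classifier.coef_)
```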
%% Cell type:markdown id: tags:
## Linear SVM

- a linear classifier such that the margin is maximised (show example)
- based on "empirical risk minimization" (Vapnik)

The final weight vector is a linear combination of a subset of the examples from the training set. These examples are called "support vectors".

Weights can be interpreted.

Parameter: C controls how much weight we put on examples within the "margin strip".
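%% Cell type:markdown id: tags:
A minimal sketch of a linear SVM, using `SVC(kernel="linear")` so that the support vectors can be inspected (the iris dataset and `C=1.0` are illustrative assumptions):
%% Cell type:code id: tags:
``` python
# Minimal sketch: linear SVM (illustrative dataset and parameter choice).
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# C controls how strongly examples inside the margin strip are penalized.
classifier = SVC(kernel="linear", C=1.0)
classifier.fit(X_train, y_train)

print("test accuracy:", classifier.score(X_test, y_test))
# the decision function is built from a subset of the training examples, the support vectors
print("support vectors per class:", classifier.n_support_)
print("learned weights:", classifier.coef_)
```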
%% Cell type:markdown id: tags:
## Kernel based SVM

So-called kernels are used to build the classification surface. The default kernel is rbf.

The internals are hard to interpret.

For rbf: the gamma parameter is the "decline rate" of the rbf functions and controls the smoothness of the decision surface.

Feature scaling is crucial for good performance !
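%% Cell type:markdown id: tags:
A minimal sketch of an rbf-kernel SVM combined with feature scaling (the iris dataset and the `gamma` and `C` values are illustrative assumptions):
%% Cell type:code id: tags:
``` python
# Minimal sketch: rbf-kernel SVM with feature scaling (illustrative dataset and parameter choices).
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Scale the features first (crucial for rbf SVMs), then fit the rbf-kernel SVC.
# gamma is the "decline rate" of the rbf functions: larger gamma gives a wigglier decision surface.
classifier = make_pipeline(StandardScaler(), SVC(kernel="rbf", gamma=1.0, C=1.0))
classifier.fit(X_train, y_train)
print("test accuracy:", classifier.score(X_test, y_test))
```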
%% Cell type:markdown id: tags:
## Decision trees

- simple example incl. plot
- basic idea: recursively split the data at "optimal" split points
- benefit: interpretability

Parameter: depth. The deeper the tree, the higher the risk of overfitting.
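%% Cell type:markdown id: tags:
A minimal sketch of a depth-limited decision tree, printing the learned splits (the iris dataset and `max_depth=3` are illustrative assumptions):
%% Cell type:code id: tags:
``` python
# Minimal sketch: decision tree with limited depth (illustrative dataset and parameter choice).
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target, random_state=42)

# max_depth limits the depth of the tree: deeper trees fit the training data
# more closely, but the risk of overfitting grows.
classifier = DecisionTreeClassifier(max_depth=3, random_state=0)
classifier.fit(X_train, y_train)

print("test accuracy:", classifier.score(X_test, y_test))
# the learned splits can be printed, which is what makes the model interpretable
print(export_text(classifier, feature_names=list(iris.feature_names)))
```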
%% Cell type:markdown id: tags:
## Random forests

- generate many weak classifiers by creating shallow trees with random splittings
- use so-called bagging to combine them into a good overall classifier
- benefit: also allows estimates of feature importance
- more robust to overfitting than single decision trees
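%% Cell type:markdown id: tags:
A minimal sketch of a random forest, including the feature-importance estimates mentioned above (the iris dataset and `n_estimators=100` are illustrative assumptions):
%% Cell type:code id: tags:
``` python
# Minimal sketch: random forest with feature importances (illustrative dataset and parameter choice).
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# An ensemble of randomized trees; the individual (weak) trees vote on the prediction.
classifier = RandomForestClassifier(n_estimators=100, random_state=0)
classifier.fit(X_train, y_train)

print("test accuracy:", classifier.score(X_test, y_test))
# relative importance of each feature, averaged over the trees in the forest
print("feature importances:", classifier.feature_importances_)
```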
%% Cell type:code id: tags:
``` python
```