From fc58226a7d36a9ef04c50d3a71c74c7afe6baa52 Mon Sep 17 00:00:00 2001
From: Mikolaj Rybinski <mikolaj.rybinski@id.ethz.ch>
Date: Tue, 21 Sep 2021 16:45:27 +0200
Subject: [PATCH] In Notebook 06: clarify RF OOB error

---
 06_classifiers_overview-part_2.ipynb | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/06_classifiers_overview-part_2.ipynb b/06_classifiers_overview-part_2.ipynb
index e6985db..4e70f84 100644
--- a/06_classifiers_overview-part_2.ipynb
+++ b/06_classifiers_overview-part_2.ipynb
@@ -546,8 +546,10 @@
     "Random forests are fast and shine with high dimensional data (many features).\n",
     "\n",
     "<div class=\"alert alert-block alert-info\">\n",
-    "<i class=\"fa fa-info-circle\"></i>\n",
-    "    Random Forest can estimate <em>out-of-bag error</em> (OOB) while learning (set <code>oob_score=True</code>). It's a generalisation/predictive error, similar to cross validation accuracy (cf. <a href=https://scikit-learn.org/stable/auto_examples/ensemble/plot_ensemble_oob.html>OOB Errors for Random Forests</a> )\n",
+    "    <p><i class=\"fa fa-info-circle\"></i>\n",
+    "        Random Forest can estimate <em>out-of-bag error</em> (OOB) while learning; set <code>oob_score=True</code>. (The out-of-bag (OOB) error is the average error for each data sample, calculated using predictions from the trees that do not contain that sample in their respective bootstrap samples.)\n",
+    "    OOB is a generalisation/predictive error that, together with <code>warm_start=True</code>, can be used for efficient search for a good-enough number of trees, i.e. the <code>n_estimators</code> hyperparameter value (see: <a href=https://scikit-learn.org/stable/auto_examples/ensemble/plot_ensemble_oob.html>OOB Errors for Random Forests</a>).\n",
+    "    </p>\n",
     "</div>"
    ]
   },
-- 
GitLab
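
The combination described in the patched note can be sketched in code. A minimal illustration (not part of the patch itself), assuming scikit-learn's `RandomForestClassifier` and a synthetic dataset from `make_classification`:

```python
# Sketch: use oob_score=True together with warm_start=True to search for a
# good-enough n_estimators, as in the linked scikit-learn OOB example.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data stands in for the notebook's dataset.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

clf = RandomForestClassifier(oob_score=True, warm_start=True, random_state=0)

oob_errors = {}
for n in range(25, 201, 25):
    clf.set_params(n_estimators=n)
    clf.fit(X, y)  # warm_start=True: only the newly added trees are fitted
    oob_errors[n] = 1.0 - clf.oob_score_  # OOB error = 1 - OOB accuracy

# Pick the smallest forest among those with the lowest OOB error.
best_n = min(oob_errors, key=oob_errors.get)
```

Because `warm_start=True` reuses the already-fitted trees, each `fit` call only grows the forest by the increment in `n_estimators`, making the sweep much cheaper than refitting from scratch at every size.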