From 605487a797b6e9856051981f3e0f176db2ff8534 Mon Sep 17 00:00:00 2001
From: Uwe Schmitt <uwe.schmitt@id.ethz.ch>
Date: Wed, 23 Jan 2019 21:40:51 +0100
Subject: [PATCH] updated course layout after meeting

---
 course_layout.md | 96 ++++++++++++++++++++++++++++--------------------
 1 file changed, 57 insertions(+), 39 deletions(-)

diff --git a/course_layout.md b/course_layout.md
index e27a0b4..fccff08 100644
--- a/course_layout.md
+++ b/course_layout.md
@@ -93,30 +93,6 @@ TBD: prepare coding session
 
 - learn regressor for movie scores.
 
-## Part 4: accuracy, F1, ROC, ...
-
-Intention: accuracy is usefull but has pitfalls
-
-- how to measure accuracy ?
-
-    - (TDB: skip ?) regression accuracy
-    -
-    - classifier accuracy:
-        - confusion matrix
-        - accurarcy
-        - pitfalls for unbalanced data sets~
-             e.g. diagnose HIV
-        - precision / recall
-        - ROC ?
-
-### Coding session
-
-- evaluate accuracy of linear beer classifier from latest section
-
-- determine precision / recall
-
-- fool them: give them other dataset where classifier fails.
-
 ## Part 3: underfitting/overfitting
 
 needs: simple accuracy measure.
@@ -148,20 +124,58 @@ classifiers / regressors have parameters / degrees of freedom.
 
 - ? run crossvalidation on movie regression problem
 
-## Part 6: pipelines / parameter tuning with scikit-learn
+## Part 4: accuracy, F1, ROC, ...
+
+Intention: accuracy is usefull but has pitfalls
+
+- how to measure accuracy ?
+
+    - (TDB: skip ?) regression accuracy
+    -
+    - classifier accuracy:
+        - confusion matrix
+        - accurarcy
+        - pitfalls for unbalanced data sets~
+             e.g. diagnose HIV
+        - precision / recall
+        - ROC ?
+
+- exercise: do cross val with other metrics
+
+### Coding session
+
+- evaluate accuracy of linear beer classifier from latest section
+
+- determine precision / recall
+
+- fool them: give them other dataset where classifier fails.
+
+
+# Day 2
+
+
+## Part 5: pipelines / parameter tuning with scikit-learn
 
 - Scicit learn api: recall what we have seen up to now.
 - pipelines, preprocessing (scaler, PCA)
 - cross validatioon
 - parameter tuning: grid search / random search.
+
+### Coding session
+
+- build SVM and Random forest crossval pipelines for previous examples
+- use PCA in pipeline for (+) to improve performance
+- find optimal SVM parameters
+- find optimal pca components number
+
 ### Coding par
 
 
 Planning: stop here, make time estimates.
 
 
-# DAY 2
-### Part 6:
+
+## Part 6: classifiers overview
 
 Intention: quick walk throught throug reliable classifiers, give some background
 idea if suitable, let them play withs some incl. modification of parameters.
@@ -175,28 +189,27 @@ diagram.
 - Random forests
 - Gradient Tree Boosting
 
-### Part 7: Start with neural networks. .5 day
+show decision surfaces of these classifiers on 2d examples.
+### Coding session
+
+- apply SVM, Random Forests, Gradient boosting to previous examples
+- apply clustering to previous examples
+- MNIST example
+
+
+## Part 7: Start with neural networks. .5 day
 
 
 
-### Coding session
 
-- apply SVM, Random Forests, Gradient boosting to previous examples
-- apply clustering to previous examples
-- MNIST example
 
 
 
-### Coding session
 
-- build SVM and Random forest crossval pipelines for previous examples
-- use PCA in pipeline for (+) to improve performance
-- find optimal SVM parameters
-- find optimal pca components number
 
 
 
-## Part 7: Best practices
+## Part 8: Best practices
 
 - visualize features: pairwise scatter, tSNE
 - PCA to undertand data
@@ -204,7 +217,7 @@ diagram.
 - start with baseline classifier / regressor
 - augment data to introduce variance
 
-## Part 8: neural networks
+## Part 9: neural networks
 
 - overview, history
 - perceptron
@@ -216,3 +229,8 @@ diagram.
 ### Coding Session
 
 - keras reuse network and play with it.
+
+
+
+
+
-- 
GitLab