# Targeted audience

- Researchers from DBIOL, BSSE and DGESS with no machine learning experience yet.
- Basic Python knowledge.
- Almost no math knowledge.

# Concepts

- Two-day workshop: 1.5 days of workshop + 0.5 day working on own data / prepared data.
- Smooth learning curve: explain fundamental concepts first; discuss exceptions, corner cases and pitfalls late.
- Plotting / pandas? / numpy first. Otherwise participants might fight with these basics during the coding sessions and be distracted from the actual learning goal of an exercise.
- Jupyter notebooks / conda; extra notebooks with solutions.
- Use prepared computers in the computer room; set up personal computers during the last day if required.
- Exercises: empty holes to fill.

TBD:

# Course structure

## Part 0: Preparation (UWE)

- Quick basics: matplotlib, numpy, pandas?

TBD: installation instructions preparation.
TBD: prepare coding session.

## Part 1: Introduction (UWE)

- What is machine learning?
  - learning from examples
  - working with hard-to-understand data
  - automation
- What are features / samples / a feature matrix?
  - always numerical / categorical vectors
  - examples: turning beer, movies, images and text into numerical examples
- Learning problems:
  - unsupervised: find structure in a set of features
    - beers: find groups of beer types
  - supervised:
    - classification: do I like this beer? Example: draw a decision tree.

## Part 2a: Supervised learning: classification

Intention: demonstrate one or two simple examples of classifiers; also introduce the concept of a decision boundary.

- Idea of a simple linear classifier: take features, produce a real value ("Uwe's beer score"), use a threshold to decide
  -> simple linear classifier (e.g. a linear SVM)
  -> beer example with some weights
- Show a code example with logistic regression for the beer data, show the weights, plot the decision function.

### Coding session

- Change the given code to use a linear SVM classifier.
- Use a different dataset (TBD) which cannot be classified well with a linear classifier.
- Tell participants to transform the data and run again (TBD: how exactly?).

## Part 2b: Supervised learning: regression (TBD: skip this?)

Intention: demonstrate one or two simple examples of regression.

- Regression: how would I rate this movie?
  - example: use a weighted sum, also an example of a linear regressor
  - example: fit a quadratic function
- Learn a regressor for movie scores.

## Part 3: Underfitting / overfitting

Needs: a simple accuracy measure. Classifiers / regressors have parameters / degrees of freedom.

- Underfitting:
  - a linear classifier for points on a quadratic function
- Overfitting:
  - features have actual noise, or not enough information
    - not enough information: orchid example in 2D; elevate to 3D using another feature
  - a polynomial of degree 5 to fit points on a line + noise
  - points in a circle: draw a very exact boundary line
- How to check for underfitting / overfitting?
  - measure accuracy or another metric on a test dataset
  - cross-validation

### Coding session

- How to do cross-validation with scikit-learn.
- Use a different beer feature set with a redundant feature (+).
- Run cross-validation on the classifier.
- ? Run cross-validation on the movie regression problem.

## Part 4: Accuracy, F1, ROC, ...

Intention: accuracy is useful but has pitfalls.

- How to measure accuracy?
  - (TBD: skip?) regression accuracy
  - classifier accuracy:
    - confusion matrix
    - accuracy
    - pitfalls for unbalanced datasets, e.g. diagnosing HIV
    - precision / recall
    - ROC?
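A minimal sketch of how these metrics and cross-validation could be computed with scikit-learn; the synthetic dataset (via `make_classification`) is a hypothetical stand-in for the course's beer feature matrix:

```python
# Minimal sketch: cross-validation and classification metrics with scikit-learn.
# The synthetic dataset is a hypothetical stand-in for the beer feature matrix.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, precision_score, recall_score
from sklearn.model_selection import cross_val_score, train_test_split

X, y = make_classification(n_samples=300, n_features=4, n_informative=3,
                           n_redundant=1, random_state=0)
clf = LogisticRegression()

# Cross-validation with different scoring metrics (accuracy alone can mislead
# on unbalanced data, hence precision / recall / ROC AUC).
for scoring in ["accuracy", "precision", "recall", "roc_auc"]:
    scores = cross_val_score(clf, X, y, cv=5, scoring=scoring)
    print(f"{scoring:>10}: {scores.mean():.3f} +/- {scores.std():.3f}")

# Confusion matrix, precision and recall on a single held-out split.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
y_pred = clf.fit(X_train, y_train).predict(X_test)
print(confusion_matrix(y_test, y_pred))
print("precision:", precision_score(y_test, y_pred))
print("recall:   ", recall_score(y_test, y_pred))
```

Swapping the `scoring` values is one way the "cross-validation with other metrics" exercise below could be set up.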
- Exercise: do cross-validation with other metrics.

### Coding session

- Evaluate the accuracy of the linear beer classifier from the latest section.
- Determine precision / recall.
- Fool them: give them another dataset on which the classifier fails.

# Day 2

## Part 5: Classifiers overview

Intention: quick walk through reliable classifiers, give some background idea of when each is suitable, let participants play with some of them, incl. modification of parameters.

To consider: the decision graph from sklearn; come up with an easy-to-understand diagram.

- Nearest neighbours
- SVM classifier (SVC)
  - demo for the Radial Basis Function (RBF) kernel trick: influence of different parameters on the decision line
- ? Decision trees, or only within random forests?
- Random forests (ensemble method - averaging)
- Gradient tree boosting (ensemble method - boosting)
- Naive Bayes for text classification
- Mentions - big data:
  - Stochastic Gradient Descent classifier
  - kernel approximation transformation (explicitly approximates the kernel trick)
  - compare SVC incl. RBF vs. Random Kitchen Sinks (RBFSampler) + linear SVC
    (https://scikit-learn.org/stable/auto_examples/plot_kernel_approximation.html#sphx-glr-auto-examples-plot-kernel-approximation-py)

Topics to include:

- interpretability of results (in terms of feature importance, e.g. an SVM with a high-degree polynomial kernel)
- some rules of thumb: don't use KNN classifiers for 10 or more dimensions (why? paper link)
- show decision surfaces for different classifiers (extend the exercise in sec 3 using hyperparameters)

### Coding session

- Apply SVM, random forests and gradient boosting to previous examples.
- Apply clustering to previous examples.
- MNIST example.

## Part 6: Pipelines / parameter tuning with scikit-learn

- Scikit-learn API: recall what we have seen up to now.
- Pipelines, preprocessing (scaler, PCA).
- Cross-validation.
- Parameter tuning: grid search / random search.

### Coding session

- Build SVM and LinearRegression cross-validation pipelines for previous examples.
- Use PCA in the pipeline for (+) to improve performance.
- Find optimal SVM parameters.
- Find the optimal number of PCA components.
- (A minimal pipeline / grid search sketch appears at the end of this outline.)

## Part 7: Start with neural networks (.5 day)

## Planning

Stop here, make time estimates.

## Part 8: Best practices

- Visualize features: pairwise scatter plots, t-SNE.
- PCA to understand the data.
- Check the balance of the dataset; what if it is unbalanced?
- Start with a baseline classifier / regressor.
- Augment data to introduce variance.

## Part 9: Neural networks

- Overview, history
- Perceptron
- Multi-layer networks
- Multi-layer demo with the Google online tool
- Where neural networks work well
- Keras demo

### Coding session

- Keras: reuse a network and play with it (see the sketches below).
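Referring back to the Part 6 coding session, a minimal sketch of a scaler + PCA + SVC pipeline tuned with grid search, assuming scikit-learn and a synthetic dataset as a hypothetical stand-in for the course data:

```python
# Minimal sketch: scaler + PCA + SVC pipeline tuned by grid search (Part 6).
# The synthetic dataset is a hypothetical stand-in for the course data.
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=10, n_informative=5,
                           random_state=0)

pipeline = make_pipeline(StandardScaler(), PCA(), SVC())

# Parameter names follow the "<step>__<parameter>" convention of pipelines.
param_grid = {
    "pca__n_components": [2, 5, 8],
    "svc__C": [0.1, 1, 10],
    "svc__gamma": [0.01, 0.1, 1],
}
search = GridSearchCV(pipeline, param_grid, cv=5)
search.fit(X, y)
print(search.best_params_)
print(search.best_score_)
```

`search.best_params_` then answers both "find optimal SVM parameters" and "find the optimal number of PCA components" from that coding session.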
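For the Part 9 coding session, a minimal sketch of a small multi-layer network in Keras, assuming TensorFlow's Keras API and a synthetic two-moons dataset as a hypothetical stand-in for the prepared network and data:

```python
# Minimal sketch: a small multi-layer perceptron in Keras (Part 9).
# Assumes TensorFlow/Keras; the two-moons data is a hypothetical stand-in
# for the prepared dataset used in the coding session.
from sklearn.datasets import make_moons
from tensorflow import keras

X, y = make_moons(n_samples=500, noise=0.2, random_state=0)

model = keras.Sequential([
    keras.layers.Dense(16, activation="relu", input_shape=(2,)),
    keras.layers.Dense(16, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=30, batch_size=32, validation_split=0.2, verbose=0)
print(model.evaluate(X, y, verbose=0))  # [loss, accuracy]
```

Participants could then change the number of layers or units, the activations, or the number of epochs and watch the effect on validation accuracy.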