Commit 453040d7 authored by schmittu

reformatted proposal

parent bb86ffaf
Pipeline #401 failed
.ipynb_checkpoints/
venv*
# Target audience
- Researchers with no machine learning experience yet.
- Basic Python knowledge.
- Almost no math knowledge required.
# Course structure
- Two-day workshop: 1.5 days of teaching + 0.5 day working on own data / prepared data.
- Every part below includes a coding session using Jupyter notebooks.
- Coding sessions provide code frames which should be completed.
- We provide solutions.
# Day 1
## Part 0: Preparation
- Quick basics matplotlib, numpy, pandas?
### Coding session
- Read a dataframe with beer features from a CSV file or Excel sheet.
- Draw some feature vs. feature scatter plots.
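
A minimal sketch of what this session could look like. The beer data file is not part of this proposal, so the column names and values below are hypothetical stand-ins, and the CSV is read from an in-memory string instead of a file on disk.

```python
import io

import pandas as pd

# Hypothetical stand-in for the beer CSV file; columns and values are assumptions.
csv_text = """alcohol_content,bitterness,darkness,is_yummy
4.8,0.35,1.2,1
5.6,0.80,3.1,0
4.2,0.20,0.9,1
6.1,0.95,2.7,0
5.0,0.45,1.8,1
"""

# pd.read_csv accepts a file path or any file-like object.
beer = pd.read_csv(io.StringIO(csv_text))
print(beer.head())
print(beer.describe())


def scatter_features(df, x, y, label):
    """Scatter plot of feature x against feature y, colored by the label column."""
    import matplotlib.pyplot as plt  # imported here because it is only needed for plotting

    fig, ax = plt.subplots()
    ax.scatter(df[x], df[y], c=df[label])
    ax.set_xlabel(x)
    ax.set_ylabel(y)
    return fig
```

In a notebook, `scatter_features(beer, "alcohol_content", "bitterness", "is_yummy")` would draw one such plot; for an Excel sheet, `pd.read_excel` plays the role of `pd.read_csv`.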
## Part 1: Introduction
- What is machine learning?
- What are features / samples / feature matrix?
- Learning problems: supervised / unsupervised
### Code walkthrough:
- Classification: linear SVM classifier or logistic regression example
- Clustering: scikit-learn example to find clusters.
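
The walkthrough could contrast both learning problems on the same feature matrix. The following is a sketch on synthetic data (the data set, the cluster centers and the choice of `KMeans` are assumptions, not fixed by this proposal).

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression

# Feature matrix X: one row per sample, one column per feature.
# y holds the known labels (only used in the supervised case).
X, y = make_blobs(
    n_samples=200, centers=[[-2, -2], [2, 2]], cluster_std=1.0, random_state=0
)
print(X.shape)  # (200, 2)

# Supervised: learn from features AND labels.
clf = LogisticRegression()
clf.fit(X, y)
train_accuracy = clf.score(X, y)
print("training accuracy:", train_accuracy)

# Unsupervised: find structure in the features alone, labels are not used.
km = KMeans(n_clusters=2, n_init=10, random_state=0)
cluster_ids = km.fit_predict(X)
print("cluster sizes:", [int((cluster_ids == k).sum()) for k in (0, 1)])
```

Note that the cluster ids are arbitrary: cluster 0 does not have to correspond to label 0.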
## Part 2: Classification
Intention: demonstrate one or two simple classifiers and introduce
the concept of a decision boundary.
- Introduction: some simple two dimensional examples incl. decision function.
- Idea of linear classifier:
    - simple linear classifier (e.g. linear SVM)
    - beer example with some weights
- Discuss code example with logistic regression for beer data, show weights
### Coding session:
- Change the given code to use a linear SVM classifier.
- Use a different data set which cannot be classified well with a linear classifier.
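
A sketch of both exercises on synthetic data, since the beer data is not included here. It shows that swapping logistic regression for a linear SVM is a one-line change, that the decision function of a linear classifier is just `w . x + b`, and that a linear classifier fails on concentric circles. The data sets are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.datasets import make_blobs, make_circles
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC

X, y = make_blobs(n_samples=200, centers=[[-2, -2], [2, 2]], random_state=0)

weights = {}
for clf in (LogisticRegression(), LinearSVC()):
    clf.fit(X, y)
    w, b = clf.coef_[0], clf.intercept_[0]
    # Linear decision function: a weighted sum of the features plus an offset.
    scores = X @ w + b
    # The sign of the score decides the class; score == 0 is the decision boundary.
    manual_prediction = (scores > 0).astype(int)
    weights[type(clf).__name__] = (w, b, manual_prediction, clf.predict(X))
    print(type(clf).__name__, "weights:", w, "offset:", b, "accuracy:", clf.score(X, y))

# Concentric circles: no straight line separates the two classes.
X_c, y_c = make_circles(n_samples=200, noise=0.05, factor=0.5, random_state=0)
circle_accuracy = LinearSVC().fit(X_c, y_c).score(X_c, y_c)
print("linear SVM on circles:", circle_accuracy)
```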
## Part 3: Accuracy, F1, ROC, ...
Intention: accuracy is useful but has pitfalls.
- How to measure accuracy?
    - confusion matrix
    - accuracy
    - pitfalls for unbalanced data sets, e.g. diagnosing HIV
    - precision / recall
### Coding session
- Evaluate accuracy of linear beer classifier from the previous section
- Determine precision / recall
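
The accuracy pitfall for unbalanced data can be illustrated with made-up numbers: assume 1000 tested people of whom 10 are actually positive. The predictions below are constructed by hand and do not come from a real classifier.

```python
import numpy as np
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    precision_score,
    recall_score,
)

# Made-up, strongly unbalanced ground truth: 10 positives, 990 negatives.
y_true = np.array([1] * 10 + [0] * 990)

# A useless "classifier" that always answers negative still looks accurate.
y_trivial = np.zeros_like(y_true)
trivial_accuracy = accuracy_score(y_true, y_trivial)  # 0.99
trivial_recall = recall_score(y_true, y_trivial)      # 0.0, it finds no positive case

# A hand-constructed prediction: misses 2 positives, raises 20 false alarms.
y_pred = y_true.copy()
y_pred[:2] = 0     # 2 false negatives
y_pred[10:30] = 1  # 20 false positives

# For binary labels the confusion matrix is [[tn, fp], [fn, tp]].
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
accuracy = accuracy_score(y_true, y_pred)    # (tp + tn) / all    = 0.978
precision = precision_score(y_true, y_pred)  # tp / (tp + fp)     = 8 / 28
recall = recall_score(y_true, y_pred)        # tp / (tp + fn)     = 8 / 10
print(tn, fp, fn, tp, accuracy, precision, recall)
```

The second prediction has lower accuracy than the trivial one although it is far more useful, which is why precision and recall are reported alongside accuracy.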
## Part 4: Underfitting / overfitting
Classifiers / regressors have parameters / degrees of freedom.
- underfitting: linear classifier on nonlinear problem
- overfitting:
    - features have actual noise, or not enough information: orchid example in 2D, lifted to 3D using another feature
    - polynomial of degree 5 fitted to points on a line + noise
    - points in a circle: draw a very exact boundary line
- How to check for underfitting / overfitting?
    - measure accuracy or another metric on a test data set
    - cross validation
### Coding session:
- How to do cross validation with scikit-learn
- run cross validation on classifier for beer data
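
A sketch of both checks on synthetic data (the two-moons data set and the RBF SVM with an exaggerated `gamma` are assumptions picked to provoke overfitting; the session itself would use the beer classifier).

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.svm import SVC

X, y = make_moons(n_samples=300, noise=0.3, random_state=0)

# Check 1: hold back a test set the classifier never sees during training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# A very large gamma lets the SVM memorize single training points.
flexible = SVC(kernel="rbf", gamma=1000.0).fit(X_train, y_train)
train_accuracy = flexible.score(X_train, y_train)
test_accuracy = flexible.score(X_test, y_test)
print("train:", train_accuracy, "test:", test_accuracy)

# Check 2: cross validation repeats the split 5 times, so that every sample
# is used for testing exactly once.
cv_scores = cross_val_score(SVC(kernel="rbf", gamma=1.0), X, y, cv=5)
print("cv scores:", cv_scores, "mean:", cv_scores.mean())
```

A large gap between training and test accuracy hints at overfitting; low accuracy on both hints at underfitting.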
## Part 5: Pipelines / parameter tuning with scikit-learn
- scikit-learn API incl. summary of what we have seen so far.
- pipelines, preprocessing (scaler, PCA)
- cross validation
- Hyperparameter tuning: grid search / random search.
### Coding session
- examples
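
One possible example, sketched on synthetic data: a pipeline of scaler, PCA and classifier, tuned with a grid search. The step names, the classifier and the parameter grid are assumptions for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_classification(
    n_samples=200, n_features=6, n_informative=3, random_state=0
)

# A pipeline chains preprocessing steps and a final estimator into one object
# which offers the usual fit / predict / score methods.
pipe = Pipeline(
    [
        ("scaler", StandardScaler()),
        ("pca", PCA()),
        ("clf", SVC()),
    ]
)

# Parameters of a step are addressed as <step name>__<parameter name>.
param_grid = {
    "pca__n_components": [2, 4],
    "clf__C": [0.1, 1.0, 10.0],
}

# Grid search tries all 2 x 3 combinations, each scored by 5-fold cross validation.
search = GridSearchCV(pipe, param_grid, cv=5)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```

Because scaler and PCA are inside the pipeline, they are refitted on the training part of every cross validation split, so no information leaks from the validation part. `RandomizedSearchCV` follows the same pattern for random search.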
# Day 2
## Part 6: Overview classifiers
- Nearest neighbours
- SVMs
    - demo for RBF kernel: influence of different parameters on the decision boundary
- Random forests
- Gradient Tree Boosting
### Coding session
- Prepare examples for 2D classification problems incl. visualization of different decision surfaces.
- Play with different classifiers on beer data
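
A sketch of how such a prepared example might look: the four classifier families listed above are compared on a synthetic 2D problem, and a helper evaluates a fitted classifier on a grid so its decision surface can be plotted. Data set, hyperparameters and the helper name `decision_surface` are assumptions.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

X, y = make_moons(n_samples=300, noise=0.25, random_state=0)

classifiers = {
    "nearest neighbours": KNeighborsClassifier(n_neighbors=5),
    "SVM (RBF kernel)": SVC(kernel="rbf", gamma=1.0),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "gradient tree boosting": GradientBoostingClassifier(random_state=0),
}

# Mean 5-fold cross validation accuracy per classifier.
results = {
    name: cross_val_score(clf, X, y, cv=5).mean()
    for name, clf in classifiers.items()
}
for name, score in results.items():
    print(f"{name}: {score:.3f}")


def decision_surface(clf, X, steps=50):
    """Predict the class on a regular grid covering the 2D data."""
    x_min, x_max = X[:, 0].min() - 0.5, X[:, 0].max() + 0.5
    y_min, y_max = X[:, 1].min() - 0.5, X[:, 1].max() + 0.5
    xx, yy = np.meshgrid(
        np.linspace(x_min, x_max, steps), np.linspace(y_min, y_max, steps)
    )
    grid = np.column_stack([xx.ravel(), yy.ravel()])
    return xx, yy, clf.predict(grid).reshape(xx.shape)


svm = classifiers["SVM (RBF kernel)"].fit(X, y)
xx, yy, zz = decision_surface(svm, X)
```

With matplotlib, `plt.contourf(xx, yy, zz, alpha=0.3)` followed by a scatter plot of `X` shows the decision surface; changing `gamma` or `C` of the SVM and replotting makes the influence of the parameters visible.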
## Part 7: Regression
- What are differences compared to classification: output, how to measure accuracy, ...
- Example: fit polynomial, examples for underfitting and overfitting
### Coding session
Introduce the movie data set and train an SVR or another regressor on it.
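
The movie data set is not part of this proposal, so the following sketch uses made-up points on a line plus noise. It fits polynomials of degree 1 and 5, compares training and test error, and trains an SVR on the same data; all numbers and parameters are assumptions.

```python
import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.svm import SVR

rng = np.random.default_rng(0)

# Made-up data: points on the line y = 2 x + 1 plus Gaussian noise.
x_train = np.linspace(0.0, 1.0, 10)
y_train = 2 * x_train + 1 + rng.normal(scale=0.2, size=x_train.size)
x_test = np.linspace(0.05, 0.95, 10)
y_test = 2 * x_test + 1 + rng.normal(scale=0.2, size=x_test.size)

# Regression predicts numbers, so quality is measured by an error such as the
# mean squared error instead of the fraction of correct labels.
errors = {}
for degree in (1, 5):
    coeffs = np.polyfit(x_train, y_train, deg=degree)
    train_mse = mean_squared_error(y_train, np.polyval(coeffs, x_train))
    test_mse = mean_squared_error(y_test, np.polyval(coeffs, x_test))
    errors[degree] = (train_mse, test_mse)
    print(f"degree {degree}: train MSE {train_mse:.4f}, test MSE {test_mse:.4f}")

# scikit-learn regressors expect a 2D feature matrix, hence the reshape.
svr = SVR(kernel="rbf", C=10.0)
svr.fit(x_train.reshape(-1, 1), y_train)
svr_test_mse = mean_squared_error(y_test, svr.predict(x_test.reshape(-1, 1)))
print(f"SVR: test MSE {svr_test_mse:.4f}")
```

The degree 5 polynomial always reaches a training error at least as low as the straight line because it has more degrees of freedom; whether it generalizes is decided by the test error.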
## Part 8: Introduction neural networks
- Overview of the field
- Introduction to feed forward neural networks
- Demo Keras
### Coding Session
- Reuse an existing Keras network and play with it.
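
To support the introduction to feed forward networks, the forward pass of a tiny network can be written out by hand. This sketch deliberately uses plain NumPy and not Keras; layer sizes, activation functions and the random weights are assumptions, and no training takes place.

```python
import numpy as np


def relu(z):
    """Rectified linear unit: negative inputs are set to zero."""
    return np.maximum(z, 0.0)


def sigmoid(z):
    """Squash any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


rng = np.random.default_rng(0)

# Untrained random weights: 2 input features -> 4 hidden units -> 1 output.
W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)


def forward(X):
    """Feed the samples through both layers, one after the other."""
    hidden = relu(X @ W1 + b1)        # layer 1: linear map + nonlinearity
    return sigmoid(hidden @ W2 + b2)  # layer 2: output read as a probability


X = rng.normal(size=(5, 2))  # 5 samples with 2 features each
probabilities = forward(X)
print(probabilities.ravel())
```

Each layer is a linear classifier-like weighted sum followed by a nonlinearity; libraries such as Keras add the training of `W1`, `b1`, `W2` and `b2` on top of this.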
## Workshop
- Assist attendees in setting up the workshop material on their own computers.
- Provide example problems if attendees don't bring their own data.
%% Cell type:markdown id: tags:
# Introduction to machine learning with Python
### Target audience
- Researchers with no machine learning experience yet.
- Basic Python knowledge.
- Almost no math knowledge required.
### Course structure
- Two-day workshop: 1.5 days of teaching + 0.5 day working on own data / prepared data.
- Every part below includes a coding session using Jupyter notebooks.
- Coding sessions provide code frames which should be completed.
- We provide solutions.
## Day 1
### Part 0: Preparation
- Quick basics matplotlib, numpy, pandas?
#### Coding session
- Read a dataframe with beer features from a CSV file or Excel sheet.
- Draw some feature vs. feature scatter plots.
### Part 1: Introduction
- What is machine learning?
- What are features / samples / feature matrix?
- Learning problems: supervised / unsupervised
#### Code walkthrough:
- Classification: linear SVM classifier or logistic regression example
- Clustering: scikit-learn example to find clusters.
### Part 2: Classification
Intention: demonstrate one or two simple classifiers and introduce
the concept of a decision boundary.
- Introduction: some simple two dimensional examples incl. decision function.
- Idea of linear classifier:
    - simple linear classifier (e.g. linear SVM)
    - beer example with some weights
- Discuss code example with logistic regression for beer data, show weights
#### Coding session:
- Change the given code to use a linear SVM classifier.
- Use a different data set which cannot be classified well with a linear classifier.
### Part 3: Accuracy, F1, ROC, ...
Intention: accuracy is useful but has pitfalls.
- How to measure accuracy?
    - confusion matrix
    - accuracy
    - pitfalls for unbalanced data sets, e.g. diagnosing HIV
    - precision / recall
#### Coding session
- Evaluate accuracy of linear beer classifier from the previous section
- Determine precision / recall
### Part 4: Underfitting / overfitting
Classifiers / regressors have parameters / degrees of freedom.
- underfitting: linear classifier on nonlinear problem
- overfitting:
    - features have actual noise, or not enough information: orchid example in 2D, lifted to 3D using another feature
    - polynomial of degree 5 fitted to points on a line + noise
    - points in a circle: draw a very exact boundary line
- How to check for underfitting / overfitting?
    - measure accuracy or another metric on a test data set
    - cross validation
#### Coding session:
- How to do cross validation with scikit-learn
- run cross validation on classifier for beer data
### Part 5: Pipelines / parameter tuning with scikit-learn
- scikit-learn API incl. summary of what we have seen so far.
- pipelines, preprocessing (scaler, PCA)
- cross validation
- Hyperparameter tuning: grid search / random search.
#### Coding session
- examples
## Day 2
### Part 6: Overview classifiers
- Nearest neighbours
- SVMs
    - demo for RBF kernel: influence of different parameters on the decision boundary
- Random forests
- Gradient Tree Boosting
#### Coding session
- Prepare examples for 2D classification problems incl. visualization of different decision surfaces.
- Play with different classifiers on beer data
### Part 7: Regression
- What are differences compared to classification: output, how to measure accuracy, ...
- Example: fit polynomial, examples for underfitting and overfitting
#### Coding session
Introduce the movie data set and train an SVR or another regressor on it.
### Part 8: Introduction neural networks
- Overview of the field
- Introduction to feed forward neural networks
- Demo Keras
#### Coding Session
- Reuse an existing Keras network and play with it.
## Workshop
- Assist attendees in setting up the workshop material on their own computers.
- Provide example problems if attendees don't bring their own data.