Skip to content
Snippets Groups Projects
08_b-neural_networks.ipynb 43.5 KiB
Newer Older
  • Learn to ignore specific revisions
  • chadhat's avatar
    chadhat committed
    {
     "cells": [
    
      {
       "cell_type": "code",
    
    chadhat's avatar
    chadhat committed
       "execution_count": null,
       "metadata": {},
       "outputs": [],
    
       "source": [
        "# IGNORE THIS CELL WHICH CUSTOMIZES LAYOUT AND STYLING OF THE NOTEBOOK !\n",
        "from numpy.random import seed\n",
        "\n",
        "seed(42)\n",
        "import tensorflow as tf\n",
        "\n",
    
    chadhat's avatar
    chadhat committed
        "tf.random.set_seed(46)\n",
    
        "import matplotlib as mpl\n",
        "import matplotlib.pyplot as plt\n",
        "import seaborn as sns\n",
        "\n",
        "sns.set(style=\"darkgrid\")\n",
        "mpl.rcParams[\"lines.linewidth\"] = 3\n",
        "%matplotlib inline\n",
        "%config InlineBackend.figure_format = 'retina'\n",
        "%config IPCompleter.greedy=True\n",
        "import warnings\n",
        "\n",
        "warnings.filterwarnings(\"ignore\", category=FutureWarning)\n",
        "from IPython.core.display import HTML\n",
        "\n",
        "HTML(open(\"custom.html\", \"r\").read())"
       ]
      },
      {
       "cell_type": "markdown",
       "metadata": {},
       "source": [
    
    chadhat's avatar
    chadhat committed
        "# Chapter 8b: Introduction to Tensorflow"
    
       ]
      },
      {
       "cell_type": "markdown",
       "metadata": {},
       "source": [
        "## Introduction to TensorFlow (keras API)"
       ]
      },
      {
       "cell_type": "markdown",
       "metadata": {},
       "source": [
        "### A bit about Keras?\n",
        "\n",
        "* It is a high level API to create and work with neural networks\n",
        "* Used to support multiple backends such as **TensorFlow** from Google, **Theano** (Theano is dead now) and **CNTK** (Microsoft Cognitive Toolkit), up till release 2.3.0 \n",
        "* Very good for creating neural nets quickly and hides away a lot of tedious work\n",
        "* Has been incorporated into official TensorFlow (which obviously only works with tensorflow) and is its main API as of version 2.0"
       ]
      },
      {
       "cell_type": "markdown",
       "metadata": {},
       "source": [
        "<center>\n",
        "<figure>\n",
        "<img src=\"./images/neuralnets/neural_net_keras_1.svg\" width=\"700\"/>\n",
        "<figcaption>Building this model in TensorFlow (Keras)</figcaption>\n",
        "</figure>\n",
        "</center>"
       ]
      },
      {
       "cell_type": "code",
    
    chadhat's avatar
    chadhat committed
       "execution_count": null,
       "metadata": {},
       "outputs": [],
    
       "source": [
        "# Say hello to Tensorflow\n",
        "from tensorflow.keras.layers import Activation, Dense\n",
        "from tensorflow.keras.models import Sequential\n",
        "\n",
        "# Creating a model\n",
        "model = Sequential()\n",
        "\n",
        "# Adding layers to this model\n",
        "# 1st Hidden layer\n",
        "# A Dense/fully-connected layer which takes as input a\n",
        "# feature array of shape (samples, num_features)\n",
        "# Here input_shape = (2,) means that the layer expects an input with num_features = 2\n",
        "# and the sample size could be anything\n",
        "# The activation function for this layer is set to \"relu\"\n",
        "model.add(Dense(units=4, input_shape=(2,), activation=\"relu\"))\n",
        "\n",
        "# 2nd Hidden layer\n",
        "# This is also a fully-connected layer and we do not need to specify the\n",
        "# shape of the input anymore (We need to do that only for the first layer)\n",
        "# NOTE: Now we didn't add the activation seperately. Instead we just added it\n",
        "# while calling Dense(). This and the way used for the first layer are Equivalent!\n",
        "model.add(Dense(units=4, activation=\"relu\"))\n",
        "\n",
        "\n",
        "# The output layer\n",
        "model.add(Dense(units=1))\n",
        "model.add(Activation(\"sigmoid\"))\n",
        "\n",
        "model.summary()"
       ]
      },
      {
       "cell_type": "markdown",
       "metadata": {},
       "source": [
        "### XOR using neural networks"
       ]
      },
      {
       "cell_type": "code",
    
    chadhat's avatar
    chadhat committed
       "execution_count": null,
    
       "metadata": {},
       "outputs": [],
       "source": [
        "import matplotlib.pyplot as plt\n",
        "import numpy as np\n",
        "import pandas as pd\n",
        "import seaborn as sns\n",
        "from sklearn.model_selection import train_test_split\n",
        "from tensorflow.keras.layers import Dense\n",
        "from tensorflow.keras.models import Sequential"
       ]
      },
      {
       "cell_type": "code",
    
    chadhat's avatar
    chadhat committed
       "execution_count": null,
       "metadata": {},
       "outputs": [],
    
       "source": [
        "# Creating a network to solve the XOR problem\n",
        "\n",
        "# Loading and plotting the data\n",
        "xor = pd.read_csv(\"data/xor.csv\")\n",
        "\n",
        "# Using x and y coordinates as featues\n",
        "features = xor.iloc[:, :-1]\n",
        "# Convert boolean to integer values (True->1 and False->0)\n",
        "labels = 1 - xor.iloc[:, -1].astype(int)\n",
        "\n",
        "colors = [[\"steelblue\", \"chocolate\"][i] for i in labels]\n",
        "plt.figure(figsize=(5, 5))\n",
        "plt.xlim([-2, 2])\n",
        "plt.ylim([-2, 2])\n",
        "plt.title(\"Blue points are False\")\n",
        "plt.scatter(features[\"x\"], features[\"y\"], color=colors, marker=\"o\");"
       ]
      },
      {
       "cell_type": "code",
    
    chadhat's avatar
    chadhat committed
       "execution_count": null,
    
       "metadata": {},
       "outputs": [],
       "source": [
        "# Building a simple Tensorflow model\n",
        "\n",
    
    chadhat's avatar
    chadhat committed
        "\n",
    
        "def a_simple_NN():\n",
        "\n",
        "    model = Sequential()\n",
        "\n",
        "    model.add(Dense(4, input_shape=(2,), activation=\"relu\"))\n",
        "\n",
        "    model.add(Dense(4, activation=\"relu\"))\n",
        "\n",
        "    model.add(Dense(1, activation=\"sigmoid\"))\n",
        "\n",
        "    model.compile(loss=\"binary_crossentropy\", optimizer=\"rmsprop\", metrics=[\"accuracy\"])\n",
        "\n",
        "    return model"
       ]
      },
      {
       "cell_type": "code",
    
    chadhat's avatar
    chadhat committed
       "execution_count": null,
       "metadata": {},
       "outputs": [],
    
       "source": [
        "# Instantiating the model\n",
        "model = a_simple_NN()\n",
        "\n",
        "# Splitting the dataset into training (70%) and validation sets (30%)\n",
        "X_train, X_test, y_train, y_test = train_test_split(features, labels, test_size=0.3)\n",
        "\n",
        "# Setting the number of passes through the entire training set\n",
        "num_epochs = 300\n",
        "\n",
        "# model.fit() is used to train the model\n",
        "# We can pass validation data while training\n",
        "model_run = model.fit(\n",
        "    X_train, y_train, epochs=num_epochs, validation_data=(X_test, y_test)\n",
        ")"
       ]
      },
      {
       "cell_type": "markdown",
       "metadata": {},
       "source": [
        "<div class=\"alert alert-block alert-info\"><p><i class=\"fa fa-info-circle\"></i>&nbsp;\n",
        "    NOTE: We can pass \"verbose=0\" to model.fit() to suppress the printing of model output on the terminal/notebook.\n",
        "</p></div>"
       ]
      },
      {
       "cell_type": "code",
    
    chadhat's avatar
    chadhat committed
       "execution_count": null,
       "metadata": {},
       "outputs": [],
    
       "source": [
        "# Plotting the loss and accuracy on the training and validation sets during the training\n",
        "# This can be done by using TensorFlow (Keras) callback \"history\" which is applied by default\n",
        "history_model = model_run.history\n",
        "\n",
        "print(\"The history has the following data: \", history_model.keys())\n",
        "\n",
        "# Plotting the training and validation accuracy during the training\n",
        "sns.lineplot(\n",
    
    chadhat's avatar
    chadhat committed
        "    x=model_run.epoch, y=history_model[\"accuracy\"], color=\"blue\", label=\"Training set\"\n",
    
        ")\n",
        "sns.lineplot(\n",
    
    chadhat's avatar
    chadhat committed
        "    x=model_run.epoch, y=history_model[\"val_accuracy\"], color=\"red\", label=\"Valdation set\"\n",
    
        ")\n",
        "plt.xlabel(\"epochs\")\n",
        "plt.ylabel(\"accuracy\");"
       ]
      },
      {
       "cell_type": "markdown",
       "metadata": {
        "tags": []
       },
       "source": [
        "<div class=\"alert alert-block alert-warning\">\n",
        "<p><i class=\"fa fa-warning\"></i>&nbsp;\n",
        "The plots such as above are essential for analyzing the behaviour and performance of the network and to tune it in the right direction. However, for the example above we don't expect to derive a lot of insight from this plot as the function we are trying to fit is quite simple and there is not too much noise. We will see the significance of these curves in a later example.\n",
        "</p>\n",
        "</div>"
       ]
      },
      {
       "cell_type": "code",
    
    chadhat's avatar
    chadhat committed
       "execution_count": null,
       "metadata": {},
       "outputs": [],
    
       "source": [
        "# Before we move on forward we see how to save and load a TensorFlow (keras) model\n",
        "model.save(\"./data/my_first_NN.h5\")\n",
        "model.save(\"./data/my_first_NN\")\n",
        "\n",
        "\n",
        "# Optional: See what is in the hdf5 file we just created above\n",
        "\n",
        "from tensorflow.keras.models import load_model\n",
        "\n",
        "model = load_model(\"./data/my_first_NN.h5\")\n",
        "model_pb = load_model(\"./data/my_first_NN\")"
       ]
      },
      {
       "cell_type": "markdown",
       "metadata": {},
       "source": [
        "For the training and validation in the example above we split our dataset into a 70-30 train-validation set. We know from previous chapters that to more robustly estimate the accuracy of our model we can use **K-fold cross-validation**.\n",
        "This is even more important when we have small datasets and cannot afford to reserve a validation set!\n",
        "\n",
        "One way to do the cross-validation here would be to write our own function to do this. However, we also know that **scikit-learn** provides several handy functions to evaluate and tune the models. So the question is:\n",
        "\n",
        "\n",
        "<div class=\"alert alert-block alert-warning\">\n",
        "<p><i class=\"fa fa-warning\"></i>&nbsp;\n",
        "    Can we somehow use the scikit-learn functions or the ones we wrote ourselves for scikit-learn models to evaluate and tune our TensorFlow (Keras) models?\n",
        "\n",
        "\n",
        "The Answer is **YES !**\n",
        "</p>\n",
        "</div>\n",
        "\n",
        "\n",
        "\n",
        "We show how to do this in the following section."
       ]
      },
      {
       "cell_type": "markdown",
       "metadata": {},
       "source": [
        "## Using scikit-learn functions on TensorFlow (Keras) models\n",
        "\n",
        "\n",
        "<div class=\"alert alert-block alert-warning\">\n",
        "<p><i class=\"fa fa-warning\"></i>&nbsp;\n",
        "TensorFlow (Keras) offers 2 wrappers which allow its Sequential models to be used with scikit-learn. \n",
        "\n",
        "There are: **KerasClassifier** and **KerasRegressor**.\n",
        "\n",
        "For more information:\n",
        "https://keras.io/scikit-learn-api/\n",
        "</p>\n",
        "</div>\n",
        "\n",
        "\n",
        "\n",
        "**Now lets see how this works!**"
       ]
      },
      {
       "cell_type": "code",
    
    chadhat's avatar
    chadhat committed
       "execution_count": null,
    
       "metadata": {},
       "outputs": [],
       "source": [
        "# We wrap the TensorFlow (Keras) model we created above with KerasClassifier\n",
        "from sklearn.model_selection import cross_val_score\n",
    
    chadhat's avatar
    chadhat committed
        "#from tensorflow.keras.wrappers.scikit_learn import KerasClassifier\n",
        "from scikeras.wrappers import KerasClassifier\n",
    
        "\n",
        "# Wrapping TensorFlow (Keras) model\n",
        "# NOTE: We pass verbose=0 to suppress the model output\n",
        "num_epochs = 400\n",
    
    chadhat's avatar
    chadhat committed
        "model_scikit = KerasClassifier(model=a_simple_NN, epochs=num_epochs, verbose=0)"
    
    chadhat's avatar
    chadhat committed
       "execution_count": null,
    
       "metadata": {},
       "outputs": [],
       "source": [
        "# Let's reuse the function to visualize the decision boundary which we saw in chapter 2 with minimal change\n",
        "\n",
        "\n",
    
    chadhat's avatar
    chadhat committed
        "#def list_flatten(list_of_list):\n",
        "#    flattened_list = [i for j in list_of_list for i in j]\n",
        "#    return flattened_list\n",
        "\n",
    
        "def list_flatten(list_of_list):\n",
    
    chadhat's avatar
    chadhat committed
        "  for item in list_of_list:\n",
        "    if isinstance(item, list):\n",
        "      for subitem in item: yield subitem\n",
        "    else:\n",
        "      yield item\n",
    
        "\n",
        "\n",
        "def plot_points(plt=plt, marker=\"o\"):\n",
        "    colors = [[\"steelblue\", \"chocolate\"][i] for i in labels]\n",
        "    plt.scatter(features.iloc[:, 0], features.iloc[:, 1], color=colors, marker=marker)\n",
        "\n",
        "\n",
        "def train_and_plot_decision_surface(\n",
        "    name, classifier, features_2d, labels, preproc=None, plt=plt, marker=\"o\", N=400\n",
        "):\n",
        "\n",
        "    features_2d = np.array(features_2d)\n",
        "    xmin, ymin = features_2d.min(axis=0)\n",
        "    xmax, ymax = features_2d.max(axis=0)\n",
        "\n",
        "    x = np.linspace(xmin, xmax, N)\n",
        "    y = np.linspace(ymin, ymax, N)\n",
        "    points = np.array(np.meshgrid(x, y)).T.reshape(-1, 2)\n",
        "\n",
        "    if preproc is not None:\n",
        "        points_for_classifier = preproc.fit_transform(points)\n",
        "        features_2d = preproc.fit_transform(features_2d)\n",
        "    else:\n",
        "        points_for_classifier = points\n",
        "\n",
        "    classifier.fit(features_2d, labels, verbose=0)\n",
        "\n",
        "    if name == \"Neural Net\":\n",
        "        # predicted = classifier.predict(features_2d)\n",
        "        # predicted = list_flatten(predicted)\n",
    
    chadhat's avatar
    chadhat committed
        "        predicted = list(list_flatten(\n",
    
        "            (classifier.predict(features_2d) > 0.5).astype(\"int32\")\n",
    
    chadhat's avatar
    chadhat committed
        "        ))\n",
    
        "    # else:\n",
        "    # predicted = classifier.predict(features_2d)\n",
        "\n",
        "    if preproc is not None:\n",
        "        name += \" (w/ preprocessing)\"\n",
        "    print(name + \":\\t\", sum(predicted == labels), \"/\", len(labels), \"correct\")\n",
        "\n",
        "    if name == \"Neural Net\":\n",
        "        # classes = np.array(list_flatten(classifier.predict(points_for_classifier)), dtype=bool)\n",
        "        classes = np.array(\n",
    
    chadhat's avatar
    chadhat committed
        "            list(list_flatten(\n",
    
        "                (classifier.predict(points_for_classifier) > 0.5).astype(\"int32\")\n",
    
    chadhat's avatar
    chadhat committed
        "            )),\n",
    
        "            dtype=bool,\n",
        "        )\n",
        "    # else:\n",
        "    # classes = np.array(classifier.predict(points_for_classifier), dtype=bool)\n",
        "    plt.plot(\n",
        "        points[~classes][:, 0],\n",
        "        points[~classes][:, 1],\n",
        "        \"o\",\n",
        "        color=\"steelblue\",\n",
        "        markersize=1,\n",
        "        alpha=0.01,\n",
        "    )\n",
        "    plt.plot(\n",
        "        points[classes][:, 0],\n",
        "        points[classes][:, 1],\n",
        "        \"o\",\n",
        "        color=\"chocolate\",\n",
        "        markersize=1,\n",
        "        alpha=0.04,\n",
        "    )"
       ]
      },
      {
       "cell_type": "code",
    
    chadhat's avatar
    chadhat committed
       "execution_count": null,
       "metadata": {},
       "outputs": [],
    
       "source": [
        "_, ax = plt.subplots(figsize=(6, 6))\n",
        "\n",
        "train_and_plot_decision_surface(\"Neural Net\", model_scikit, features, labels, plt=ax)\n",
        "plot_points(plt=ax)"
       ]
      },
      {
       "cell_type": "code",
    
    chadhat's avatar
    chadhat committed
       "execution_count": null,
       "metadata": {},
       "outputs": [],
    
       "source": [
        "# Applying K-fold cross-validation\n",
        "# Here we pass the whole dataset, i.e. features and labels, instead of splitting it.\n",
        "num_folds = 5\n",
        "cross_validation = cross_val_score(\n",
        "    model_scikit, features, labels, cv=num_folds, verbose=0\n",
        ")\n",
        "\n",
        "print(\"The acuracy on the \", num_folds, \" validation folds:\", cross_validation)\n",
        "print(\n",
        "    \"The Average acuracy on the \",\n",
        "    num_folds,\n",
        "    \" validation folds:\",\n",
        "    np.mean(cross_validation),\n",
        ")"
       ]
      },
      {
       "cell_type": "markdown",
       "metadata": {},
       "source": [
        "<div class=\"alert alert-block alert-warning\">\n",
        "<p><i class=\"fa fa-warning\"></i>&nbsp;\n",
        "The code above took quite long to finish even though we used only 5  CV folds and the neural network and data size are very small! This gives an indication of the enormous compute requirements of training production-grade deep neural networks.\n",
        "</p>\n",
        "</div>"
       ]
      },
      {
       "cell_type": "markdown",
       "metadata": {},
       "source": [
        "## Hyperparameter optimization"
       ]
      },
      {
       "cell_type": "markdown",
       "metadata": {},
       "source": [
        "We know from chapter 6 that there are 2 types of parameters which need to be tuned for a machine learning model.\n",
        "* Internal model parameters (weights) which can be learned for e.g. by gradient-descent\n",
        "* Hyperparameters\n",
        "\n",
        "In the model created above we made some arbitrary choices such as the choice of the optimizer we used, optimizer's learning rate, number of hidden units and so on ...\n",
        "\n",
        "Now that we have the TensorFlow (keras) model wrapped as a scikit-learn model we can use the grid search functions we have seen in chapter 6."
       ]
      },
      {
       "cell_type": "code",
    
    chadhat's avatar
    chadhat committed
       "execution_count": null,
    
       "metadata": {},
       "outputs": [],
       "source": [
        "from sklearn.model_selection import GridSearchCV\n",
        "\n",
        "# Just to remember\n",
        "model_scikit = KerasClassifier(\n",
    
    chadhat's avatar
    chadhat committed
        "    model=a_simple_NN, **{\"epochs\": num_epochs, \"verbose\": 0}\n",
    
    chadhat's avatar
    chadhat committed
       "execution_count": null,
       "metadata": {},
       "outputs": [],
    
       "source": [
        "HP_grid = {\"epochs\": [30, 50, 100]}\n",
        "search = GridSearchCV(estimator=model_scikit, param_grid=HP_grid)\n",
        "search.fit(features, labels)\n",
        "print(search.best_score_, search.best_params_)"
       ]
      },
      {
       "cell_type": "code",
    
    chadhat's avatar
    chadhat committed
       "execution_count": null,
       "metadata": {},
       "outputs": [],
    
       "source": [
        "HP_grid = {\"epochs\": [10, 15, 30], \"batch_size\": [10, 20, 30]}\n",
        "search = GridSearchCV(estimator=model_scikit, param_grid=HP_grid)\n",
        "search.fit(features, labels)\n",
        "print(search.best_score_, search.best_params_)"
       ]
      },
      {
       "cell_type": "code",
    
    chadhat's avatar
    chadhat committed
       "execution_count": null,
    
       "metadata": {},
       "outputs": [],
       "source": [
        "# A more general model for further Hyperparameter optimization\n",
        "from tensorflow.keras import optimizers\n",
        "\n",
    
    chadhat's avatar
    chadhat committed
        "\n",
    
        "def a_simple_NN(activation=\"relu\", num_hidden_neurons=[4, 4], learning_rate=0.01):\n",
        "\n",
        "    model = Sequential()\n",
        "\n",
        "    model.add(Dense(num_hidden_neurons[0], input_shape=(2,), activation=activation))\n",
        "\n",
        "    model.add(Dense(num_hidden_neurons[1], activation=activation))\n",
        "\n",
        "    model.add(Dense(1, activation=\"sigmoid\"))\n",
        "\n",
        "    model.compile(\n",
        "        loss=\"binary_crossentropy\",\n",
        "        optimizer=optimizers.RMSprop(learning_rate=learning_rate),\n",
        "        metrics=[\"accuracy\"],\n",
        "    )\n",
        "\n",
        "    return model"
       ]
      },
    
    chadhat's avatar
    chadhat committed
      {
       "cell_type": "markdown",
       "metadata": {},
       "source": [
    
        "### keras-tuner"
       ]
      },
      {
       "cell_type": "code",
    
    chadhat's avatar
    chadhat committed
       "execution_count": null,
    
       "metadata": {},
       "outputs": [],
       "source": [
        "!pip install -q -U keras-tuner"
       ]
      },
      {
       "cell_type": "code",
    
    chadhat's avatar
    chadhat committed
       "execution_count": null,
    
       "metadata": {},
       "outputs": [],
       "source": [
        "def a_simple_NN(activation=\"relu\", num_hidden_neurons=[4, 4], learning_rate=0.01):\n",
        "\n",
        "    model = Sequential()\n",
        "\n",
        "    model.add(Dense(num_hidden_neurons[0], input_shape=(2,), activation=activation))\n",
        "\n",
        "    model.add(Dense(num_hidden_neurons[1], activation=activation))\n",
        "\n",
        "    model.add(Dense(1, activation=\"sigmoid\"))\n",
        "\n",
        "    model.compile(\n",
        "        loss=\"binary_crossentropy\",\n",
        "        optimizer=optimizers.RMSprop(learning_rate=learning_rate),\n",
        "        metrics=[\"accuracy\"],\n",
        "    )\n",
    
    chadhat's avatar
    chadhat committed
        "    return model"
    
    chadhat's avatar
    chadhat committed
       "execution_count": null,
    
       "metadata": {},
       "outputs": [],
       "source": [
        "import keras_tuner as kt\n",
        "\n",
    
    chadhat's avatar
    chadhat committed
        "\n",
    
        "def model_builder(hp):\n",
        "\n",
        "    # Tune the number of units in the first Dense layer\n",
    
    chadhat's avatar
    chadhat committed
        "    hp_units = hp.Int(\"units\", min_value=4, max_value=8, step=2)\n",
        "    hp_units_2 = hp.Int(\"units2\", min_value=4, max_value=16, step=2)\n",
    
        "\n",
        "    # Tune the learning rate for the optimizer\n",
    
    chadhat's avatar
    chadhat committed
        "    hp_learning_rate = hp.Choice(\"learning_rate\", values=[1e-2, 1e-3, 1e-4])\n",
    
    chadhat's avatar
    chadhat committed
        "    # Tune the choice of the activation function\n",
        "    activation = hp.Choice(name=\"activation\", values=[\"relu\", \"sigmoid\"])\n",
    
        "\n",
        "    model = a_simple_NN(activation, [hp_units, hp_units_2], hp_learning_rate)\n",
    
    chadhat's avatar
    chadhat committed
        "\n",
    
        "    return model\n",
        "\n",
    
    chadhat's avatar
    chadhat committed
        "\n",
        "# The argument ‘hp’ is an instance of the class HyperParameters."
    
    chadhat's avatar
    chadhat committed
       "execution_count": null,
    
       "metadata": {},
       "outputs": [],
       "source": [
    
    chadhat's avatar
    chadhat committed
        "tuner = kt.BayesianOptimization(\n",
        "    model_builder,\n",
        "    objective=\"val_accuracy\",\n",
        "    max_trials=10,\n",
        "    project_name=\"intro_to_kt\",\n",
        "    overwrite=True,\n",
        ")"
    
    chadhat's avatar
    chadhat committed
       "execution_count": null,
       "metadata": {},
       "outputs": [],
    
       "source": [
        "tuner.search(X_train, y_train, epochs=100, validation_data=(X_test, y_test))\n",
        "best_model = tuner.get_best_models()[0]\n",
        "print(tuner.get_best_hyperparameters()[0].values)"
    
    chadhat's avatar
    chadhat committed
       ]
      },
      {
       "cell_type": "markdown",
       "metadata": {},
       "source": [
    
        "## Exercise section:  \n",
        "1. Create a neural network to classify the 2d points example from chapter 2 (Optional: As you create the model read a bit on the different TensorFlow (keras) commands we have used)\n",
        "2. Plot the decision boundary\n",
        "3. Choose and optimize a couple of hyperparameters\n",
        "4. **OPTIONAL:** What function from scikit-learn other than GridSearchCV can we use for hyperparameter optimization? Use it (or use the equivalent method from keras-tuner)"
       ]
      },
      {
       "cell_type": "code",
    
    chadhat's avatar
    chadhat committed
       "execution_count": null,
    
       "metadata": {},
       "outputs": [],
    
    chadhat's avatar
    chadhat committed
       "source": [
    
        "import matplotlib.pyplot as plt\n",
        "import numpy as np\n",
        "import pandas as pd\n",
        "import seaborn as sns\n",
    
    chadhat's avatar
    chadhat committed
        "from sklearn.model_selection import GridSearchCV, cross_val_score, train_test_split\n",
    
        "from tensorflow.keras import optimizers\n",
        "from tensorflow.keras.layers import Dense\n",
        "from tensorflow.keras.models import Sequential\n",
    
    chadhat's avatar
    chadhat committed
        "#from tensorflow.keras.wrappers.scikit_learn import KerasClassifier\n",
        "from scikeras.wrappers import KerasClassifier\n",
    
    chadhat's avatar
    chadhat committed
        "import keras_tuner as kt"
    
    chadhat's avatar
    chadhat committed
       "execution_count": null,
       "metadata": {},
       "outputs": [],
    
       "source": [
        "circle = pd.read_csv(\"data/circle.csv\")\n",
        "# Using x and y coordinates as featues\n",
        "features = circle.iloc[:, :-1]\n",
        "# Convert boolean to integer values (True->1 and False->0)\n",
        "labels = circle.iloc[:, -1].astype(int)\n",
        "\n",
        "colors = [[\"steelblue\", \"chocolate\"][i] for i in circle[\"label\"]]\n",
        "plt.figure(figsize=(5, 5))\n",
        "plt.xlim([-2, 2])\n",
        "plt.ylim([-2, 2])\n",
        "\n",
        "plt.scatter(features[\"x\"], features[\"y\"], color=colors, marker=\"o\");"
       ]
      },
      {
       "cell_type": "code",
    
    chadhat's avatar
    chadhat committed
       "execution_count": null,
    
       "metadata": {
        "tags": []
       },
       "outputs": [],
       "source": [
    
    chadhat's avatar
    chadhat committed
        "# Insert Code here"
    
    chadhat's avatar
    chadhat committed
       "execution_count": null,
    
       "metadata": {
        "tags": [
         "solution"
        ]
       },
       "outputs": [],
       "source": [
        "# Solution\n",
        "def circle_NN(activation=\"relu\", learning_rate=0.01):\n",
        "\n",
        "    model = Sequential()\n",
        "\n",
        "    model.add(Dense(4, input_shape=(2,), activation=activation))\n",
    
    chadhat's avatar
    chadhat committed
        "\n",
    
        "    model.add(Dense(4, activation=activation))\n",
        "\n",
        "    model.add(Dense(1, activation=\"sigmoid\"))\n",
        "\n",
    
    chadhat's avatar
    chadhat committed
        "    model.compile(\n",
        "        loss=\"binary_crossentropy\",\n",
        "        optimizer=optimizers.RMSprop(learning_rate=learning_rate),\n",
        "        metrics=[\"accuracy\"],\n",
        "    )\n",
    
        "\n",
        "    return model\n",
        "\n",
    
    chadhat's avatar
    chadhat committed
        "\n",
    
        "# Instantiating the model\n",
        "model = circle_NN()\n",
        "\n",
        "# Splitting the dataset into training (70%) and validation sets (30%)\n",
        "X_train, X_test, y_train, y_test = train_test_split(features, labels, test_size=0.3)\n",
        "\n",
        "# Setting the number of passes through the entire training set\n",
        "num_epochs = 400\n",
        "\n",
        "# model.fit() is used to train the model\n",
        "# We can pass validation data while training\n",
        "model_run = model.fit(\n",
        "    X_train, y_train, epochs=num_epochs, validation_data=(X_test, y_test), verbose=0\n",
        ")"
       ]
      },
      {
       "cell_type": "code",
    
    chadhat's avatar
    chadhat committed
       "execution_count": null,
    
       "metadata": {
        "tags": [
         "solution"
        ]
       },
    
    chadhat's avatar
    chadhat committed
       "outputs": [],
    
       "source": [
        "# solution\n",
        "_, ax = plt.subplots(figsize=(6, 6))\n",
        "\n",
        "num_epochs = 400\n",
    
    chadhat's avatar
    chadhat committed
        "circle_scikit = KerasClassifier(model=circle_NN, epochs=num_epochs, verbose=0)\n",
    
        "\n",
        "train_and_plot_decision_surface(\"Neural Net\", circle_scikit, features, labels, plt=ax)\n",
    
    chadhat's avatar
    chadhat committed
        "plot_points(plt=ax)"
    
    chadhat's avatar
    chadhat committed
       ]
      },
      {
       "cell_type": "code",
    
    chadhat's avatar
    chadhat committed
       "execution_count": null,
    
       "metadata": {
        "tags": [
         "solution"
        ]
       },
    
    chadhat's avatar
    chadhat committed
       "outputs": [],
    
    chadhat's avatar
    chadhat committed
        "# solution (older method)\n",
        "\"\"\"\n",
        "HP_grid = {\n",
        "    \"activation\": [\"relu\", \"sigmoid\"],\n",
        "    \"learning_rate\": [0.01, 0.005, 0.001],\n",
        "}\n",
    
        "search = GridSearchCV(estimator=circle_scikit, param_grid=HP_grid)\n",
        "search.fit(features, labels)\n",
    
    chadhat's avatar
    chadhat committed
        "print(search.best_score_, search.best_params_)\n",
        "\"\"\""
       ]
      },
      {
       "cell_type": "code",
    
    chadhat's avatar
    chadhat committed
       "execution_count": null,
    
    chadhat's avatar
    chadhat committed
       "metadata": {
        "tags": [
         "solution"
        ]
       },
    
    chadhat's avatar
    chadhat committed
       "outputs": [],
    
    chadhat's avatar
    chadhat committed
       "source": [
        "def model_builder(hp):\n",
        "    # Tune the learning rate for the optimizer\n",
        "    hp_learning_rate = hp.Choice(\"learning_rate\", values=[1e-2, 1e-3, 1e-4])\n",
        "    # Tune the choice of the activation function\n",
        "    activation = hp.Choice(name=\"activation\", values=[\"relu\", \"sigmoid\"])\n",
        "\n",
        "    model = circle_NN(activation, hp_learning_rate)\n",
        "\n",
        "    return model\n",
        "\n",
        "\n",
        "tuner = kt.BayesianOptimization(\n",
        "    model_builder,\n",
        "    objective=\"val_accuracy\",\n",
        "    max_trials=10,\n",
        "    project_name=\"circle_exercise\",\n",
        "    overwrite=True,\n",
        ")\n",
        "tuner.search(X_train, y_train, epochs=400, validation_data=(X_test, y_test))\n",
        "best_model = tuner.get_best_models()[0]\n",
        "print(tuner.get_best_hyperparameters()[0].values)"
    
    chadhat's avatar
    chadhat committed
       ]
      },
      {
       "cell_type": "markdown",
       "metadata": {},
       "source": [
    
        "<div class=\"alert alert-block alert-warning\">\n",
        "<p><i class=\"fa fa-warning\"></i>&nbsp;\n",
        "Another library which you should definitely look at for doing hyperparameter optimization with keras models is the <a href=\"https://github.com/maxpumperla/hyperas\">Hyperas library</a> which is a wrapper around the <a href=\"https://github.com/hyperopt/hyperopt\">Hyperopt library</a>. \n",
        "\n",
        "</p>\n",
        "</div>"
    
    chadhat's avatar
    chadhat committed
       ]
      },
      {
       "cell_type": "markdown",
       "metadata": {},
       "source": [
    
        "The examples we saw above are really nice to show various features of the TensorFlow (Keras) library and to understand how we build and train a model. However, they are not the ideal problems one should solve using neural networks. They are too simple and can be solved easily by classical machine learning algorithms. \n",
    
    chadhat's avatar
    chadhat committed
        "\n",
    
        "Now we show examples where Neural Networks really shine over classical machine learning algorithms."
    
    chadhat's avatar
    chadhat committed
       ]
      },
      {
       "cell_type": "markdown",
       "metadata": {},
       "source": [
    
        "## Handwritten Digits Classification (multi-class classification)\n",
        "**MNIST Dataset**\n",
    
    chadhat's avatar
    chadhat committed
        "\n",
    
        "MNIST datasets is a very common dataset used in machine learning. It is widely used to train and validate models.\n",
    
    chadhat's avatar
    chadhat committed
        "\n",
        "\n",
    
        "> The MNIST database of handwritten digits, available from this page, has a training set of 60,000 examples, and a > test set of 10,000 examples. It is a subset of a larger set available from NIST. The digits have been size-normalized and centered in a fixed-size image.\n",
        "> It is a good database for people who want to try learning techniques and pattern recognition methods on real-world \n",
        "> data while spending minimal efforts on preprocessing and formatting.\n",
        "> source: http://yann.lecun.com/exdb/mnist/\n",
        "\n",
        "This dataset consists of images of handwritten digits between 0-9 and their corresponsing labels. We want to train a neural network which is able to predict the correct digit on the image. \n",
        "This is a multi-class classification problem. Unlike binary classification which we have seen till now we will classify data into 10 different classes."
    
    chadhat's avatar
    chadhat committed
       ]
      },
      {
       "cell_type": "code",
    
    chadhat's avatar
    chadhat committed
       "execution_count": null,
    
    chadhat's avatar
    chadhat committed
       "metadata": {},
    
    chadhat's avatar
    chadhat committed
       "source": [
        "import matplotlib.pyplot as plt\n",
        "import numpy as np\n",
    
        "import seaborn as sns"
    
    chadhat's avatar
    chadhat committed
       ]
      },
      {
       "cell_type": "code",
    
    chadhat's avatar
    chadhat committed
       "execution_count": null,
    
    chadhat's avatar
    chadhat committed
       "metadata": {},
    
    chadhat's avatar
    chadhat committed
       "source": [
    
        "# Loading the dataset in TensorFlow (keras)\n",
        "# Later you can explore and play with other datasets with come with TensorFlow (Keras)\n",
        "from tensorflow.keras.datasets import mnist\n",
    
    chadhat's avatar
    chadhat committed
        "\n",
    
        "# Loading the train and test data\n",
    
    chadhat's avatar
    chadhat committed
        "\n",
    
        "(X_train, y_train), (X_test, y_test) = mnist.load_data()"
    
    chadhat's avatar
    chadhat committed
       ]
      },
      {
       "cell_type": "code",
    
    chadhat's avatar
    chadhat committed
       "execution_count": null,
       "metadata": {},
       "outputs": [],
    
    chadhat's avatar
    chadhat committed
       "source": [
    
        "# Looking at the dataset\n",
        "print(X_train.shape)"
    
      {
       "cell_type": "code",
    
    chadhat's avatar
    chadhat committed
       "execution_count": null,
       "metadata": {},
       "outputs": [],
    
    chadhat's avatar
    chadhat committed
       "source": [
    
        "# We can see that the training set consists of 60,000 images of size 28x28 pixels\n",
        "i = np.random.randint(0, X_train.shape[0])\n",
        "sns.set_style(\"white\")\n",
        "plt.imshow(X_train[i], cmap=\"gray_r\")\n",
        "sns.set(style=\"darkgrid\")\n",
        "print(\"This digit is: \", y_train[i])"
       ]
      },
      {
       "cell_type": "code",
    
    chadhat's avatar
    chadhat committed
       "execution_count": null,
       "metadata": {},
       "outputs": [],
    
       "source": [
        "# Look at the data values for a couple of images\n",
        "print(X_train[0].min(), X_train[1].max())"
       ]
      },
      {
       "cell_type": "markdown",
       "metadata": {},
       "source": [
        "The data consists of values between 0-255 representing the **grayscale level**"
       ]
      },
      {
       "cell_type": "code",
    
    chadhat's avatar
    chadhat committed
       "execution_count": null,
       "metadata": {},
       "outputs": [],
    
       "source": [
        "# The labels are the digit on the image\n",
        "print(y_train.shape)"
       ]
      },
      {
       "cell_type": "code",
    
    chadhat's avatar
    chadhat committed
       "execution_count": null,
    
       "metadata": {},
       "outputs": [],
       "source": [
        "# Scaling the data\n",
        "# It is important to normalize the input data to (0-1) before providing it to a neural net\n",
        "# We could use the previously introduced function from scikit-learn. However, here it is sufficient to\n",
        "# just divide the input data by 255\n",
        "X_train_norm = X_train / 255.0\n",
        "X_test_norm = X_test / 255.0\n",
        "\n",
        "# Also we need to reshape the input data such that each sample is a vector and not a 2D matrix\n",
        "X_train_prep = X_train_norm.reshape(X_train_norm.shape[0], 28 * 28)\n",
        "X_test_prep = X_test_norm.reshape(X_test_norm.shape[0], 28 * 28)"
       ]
      },
      {
       "cell_type": "markdown",
       "metadata": {},
       "source": [
        "<div class=\"alert alert-block alert-warning\">\n",
        "<p><i class=\"fa fa-warning\"></i>&nbsp;\n",
        "One-Hot encoding\n",
        "\n",
        "In multi-class classification problems the labels are provided to the neural network as something called **One-hot encodings**. The categorical labels (0-9 here) are converted to vectors.\n",
        "\n",
        "For the MNIST problem where the data has **10 categories** we will convert every label to a vector of length 10. \n",
        "All the entries of this vector will be zero **except** for the index which is equal to the (integer) value of the label.\n",
        "\n",
        "For example:\n",
        "if label is 4. The one-hot vector will look like **[0 0 0 0 1 0 0 0 0 0]**\n",
        "\n",
        "Fortunately, TensorFlow (Keras) has a built-in function to achieve this and we do not have to write a code for this ourselves.\n",
        "</p>\n",
        "</div>"
    
    chadhat's avatar
    chadhat committed
       ]
      },
      {
       "cell_type": "code",
    
    chadhat's avatar
    chadhat committed
       "execution_count": null,
       "metadata": {},
       "outputs": [],
    
    chadhat's avatar
    chadhat committed
       "source": [
    
        "from tensorflow.keras import utils\n",
        "\n",
        "y_train_onehot = utils.to_categorical(y_train, num_classes=10)\n",
        "y_test_onehot = utils.to_categorical(y_test, num_classes=10)\n",
        "\n",
        "print(y_train_onehot.shape)"
    
    chadhat's avatar
    chadhat committed
       ]
      },
      {
       "cell_type": "code",
    
    chadhat's avatar
    chadhat committed
       "execution_count": null,
       "metadata": {},
       "outputs": [],
    
    chadhat's avatar
    chadhat committed
       "source": [
    
        "# Building the tensorflow model\n",