
# whichFlower : a flower species recognition app using tensorflow/keras and React-Native




![App demo](demo.gif)

As someone passionate about computer vision (CV), I came to realize that deployment matters as much as model development, because a model's usefulness is ultimately measured by the satisfaction of its end users. In a previous project, DEmoClassi (Demographic (age, gender, race) and Emotion (happy, neutral, angry, ...) Classification), I turned my trained models into a standalone Python module that runs on Windows/Linux using OpenCV. You can check it here.


In this new project I decided to give mobile technologies a try. Models are migrating more and more to edge devices (mobile, sensors, and IoT in general). So I started by learning React Native, a cross-platform mobile development framework developed by Facebook. The course I followed is available on YouTube; it is a little long, but well worth the time. My end goal was to combine my two passions, CV and programming, into another project: this time I opted for training a CV model and deploying it on a mobile device, a flower species recognition app I called, with no suspense, `WhichFlower`.


I'll describe my journey in this post, which is composed of three sections:


## Exploratory Data Analysis

The dataset I'll explore here comes from Kaggle and can be downloaded following this link. It consists of flower images belonging to five species:

  • Daisy
  • Dandelion
  • Rose
  • Sunflower
  • Tulip

First, let's import some useful libs :

In [1]:
%load_ext autoreload
%autoreload 2
In [2]:
import sys
sys.path.append("../utils") # or whatever the path to the utils directory containing the utilities script
In [3]:
from bokeh.io import output_notebook
from data_utils import split_flower_data
from plot_utils import display_sample_images, plot_class_distribution, plot_image_dimensions
In [4]:
output_notebook()
Loading BokehJS ...
In [5]:
%matplotlib inline

The original dataset is not divided into training, validation and test sets, so let's do that:

In [6]:
train_percent = 0.7  # part of the data to consider for train set
root_folder = "../../../flowers-recognition/flowers"  # directory with one subdirectory of images per class
out_folder = "../../../flowers-recognition/split_flower"  # folder where the split data will be saved
classes = ['daisy', 'dandelion', 'rose', 'sunflower', 'tulip']  # class names, matching the subdirectory names in `root_folder`
In [7]:
%%time
split_flower_data(root_folder, out_folder, classes, train_percent)
==========CLASS : daisy===========
*******[COPYING TRAIN IMAGES]***********
*******[COPYING VALIDATION IMAGES]***********
*******[COPYING TEST IMAGES]***********
===============================


==========CLASS : dandelion===========
*******[COPYING TRAIN IMAGES]***********
*******[COPYING VALIDATION IMAGES]***********
*******[COPYING TEST IMAGES]***********
===============================


==========CLASS : rose===========
*******[COPYING TRAIN IMAGES]***********
*******[COPYING VALIDATION IMAGES]***********
*******[COPYING TEST IMAGES]***********
===============================


==========CLASS : sunflower===========
*******[COPYING TRAIN IMAGES]***********
*******[COPYING VALIDATION IMAGES]***********
*******[COPYING TEST IMAGES]***********
===============================


==========CLASS : tulip===========
*******[COPYING TRAIN IMAGES]***********
*******[COPYING VALIDATION IMAGES]***********
*******[COPYING TEST IMAGES]***********
===============================


Wall time: 45.9 s
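As a side note, here is a minimal sketch of what such a directory-based split could look like (the actual implementation lives in `data_utils`; the fixed seed and the equal valid/test split of the remainder are assumptions):

```python
import os
import random
import shutil

def split_flower_data_sketch(root_folder, out_folder, classes, train_percent):
    """Copy images into out_folder/{train,valid,test}/<class> subfolders.
    The remaining (1 - train_percent) is shared equally between valid and test."""
    random.seed(123)  # assumption: any fixed seed, for reproducibility
    for cls in classes:
        images = os.listdir(os.path.join(root_folder, cls))
        random.shuffle(images)
        n_train = int(len(images) * train_percent)
        n_valid = (len(images) - n_train) // 2
        splits = {
            "train": images[:n_train],
            "valid": images[n_train:n_train + n_valid],
            "test": images[n_train + n_valid:],
        }
        for split_name, files in splits.items():
            dest = os.path.join(out_folder, split_name, cls)
            os.makedirs(dest, exist_ok=True)
            for f in files:
                shutil.copy(os.path.join(root_folder, cls, f), dest)
```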

After splitting the data, let's display some sample images from each class in the training set:

In [8]:
sample_path = "../../../flowers-recognition/split_flower/train/"
In [9]:
display_sample_images(sample_path)
In [10]:
display_sample_images(sample_path)
In [11]:
display_sample_images(sample_path)

Some images are quite noisy: they do not contain only the flower, and sometimes the flower is not even visible. We should pay particular attention to this when assessing the model's performance, as it may be affected by these noisy samples.


One important aspect to investigate is the class distribution of our dataset, as an imbalanced dataset may yield poor results. Below is the distribution of the train, validation and test sets:

In [12]:
path_train = "../../../../COMPUTER_VISION/flowers-recognition/split_flower/train"
path_valid = "../../../../COMPUTER_VISION/flowers-recognition/split_flower/valid"
path_test = "../../../../COMPUTER_VISION/flowers-recognition/split_flower/test"
In [13]:
plot_class_distribution(path_train, title="Train set classes distribution")
In [14]:
plot_class_distribution(path_valid, title="Validation set classes distribution")
In [15]:
plot_class_distribution(path_test, title="Test set classes distribution")

From the plots above, we can see that the dataset is fairly balanced. Two classes, dandelion and tulip, each hold a little more than 20% of the data, while the other classes sit around 17-18%.

Overall, we have 3028 training samples, 651 validation and 644 test ones.
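For a quick sanity check of these counts without the plotting utility, a minimal sketch (assuming the `split_flower/<split>/<class>` layout created above) could be:

```python
import os

def class_counts(path):
    # count images in each class subdirectory
    return {cls: len(os.listdir(os.path.join(path, cls)))
            for cls in sorted(os.listdir(path))}

print(class_counts(path_train))
print(class_counts(path_valid))
print(class_counts(path_test))
```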


When using a pretrained model for transfer learning, we may be required to resize our images to match the original model's input shape. In that case the choice of image dimensions is straightforward.

However, for our own custom models we need some insight to choose the right dimensions. Let's visualize the dimensions (height and width) of the training images:

In [16]:
plot_image_dimensions(path_train)

The mean height is around 245-260 pixels and the mean width around 320-350 pixels. We can take these values into account when resizing images to custom dimensions.
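If you want to reproduce these statistics without the plotting utility, a minimal sketch (assuming Pillow is installed and the `train/<class>/` layout) could be:

```python
import os
import numpy as np
from PIL import Image

def image_dimension_stats(path):
    heights, widths = [], []
    for cls in os.listdir(path):
        cls_dir = os.path.join(path, cls)
        for fname in os.listdir(cls_dir):
            with Image.open(os.path.join(cls_dir, fname)) as img:
                w, h = img.size  # PIL returns (width, height)
            widths.append(w)
            heights.append(h)
    print(f"mean height: {np.mean(heights):.0f}, mean width: {np.mean(widths):.0f}")

image_dimension_stats(path_train)
```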


## Flower images classification

In this notebook I'll train several neural network models on the flower image classification task (for more information about the dataset, see the notebook EDA.ipynb in the same directory).

Let's first clone the repository where I store my utility functions for data processing, model training and evaluation:

In [0]:
!git clone https://AlkaSaliss:********@github.com/AlkaSaliss/flowerClassif.git
Cloning into 'flowerClassif'...
remote: Enumerating objects: 80, done.
remote: Counting objects: 100% (80/80), done.
remote: Compressing objects: 100% (59/59), done.
remote: Total 80 (delta 20), reused 74 (delta 14), pack-reused 0
Unpacking objects: 100% (80/80), done.
In [0]:
# Run this cell to mount your Google Drive.
from google.colab import drive
drive.mount('/content/drive')

Install and import some necessary packages:

In [0]:
# !pip install tensorflow==1.13.1  # uncomment this if using TPU instead of GPU on google colab
!pip install talos scipy==1.2 git+https://www.github.com/keras-team/keras-contrib.git
In [0]:
import sys
sys.path.append('/content/flowerClassif/utils')

import tensorflow as tf
import keras
import keras_contrib
from google.colab import files
import importlib
import os
from data_utils import split_flower_data
from plot_utils import plot_training_history
from model_utils import plot_confusion_matrix
from model_utils import inference_val_gen
from model_utils import CopyCheckpointToDrive
import matplotlib.pyplot as plt
import bokeh
from bokeh.io import output_notebook
from bokeh.resources import INLINE
from bokeh.layouts import gridplot, row
from bokeh.plotting import show, figure
from bokeh.models import ColumnDataSource
import json
import numpy as np
import pandas as pd
import talos as ta
import logging
import tqdm
Using TensorFlow backend.
In [0]:
output_notebook(resources=INLINE)
In [0]:
%matplotlib inline
In [0]:
print(tf.__version__)
print(tf.test.is_gpu_available())
1.15.0-rc3
True

1. Loading data

The flower dataset is hosted on Kaggle, so we need to download it using the Kaggle CLI (to install it, simply run: pip install kaggle).

  • Copy the Kaggle credentials file from Google Drive, so that we can authenticate with the Kaggle CLI:
In [0]:
# create a directory for storing kaggle credentials if it doesn't exist
os.makedirs('/root/.kaggle', exist_ok=True)

# copy the credential file into the right directory
!cp "/content/drive/My Drive/kaggle.json" /root/.kaggle/kaggle.json

Download the data from Kaggle and save it to the data directory:

In [0]:
# create a directory where the data will be stored
os.makedirs('/content/data', exist_ok=True)
In [0]:
!kaggle datasets download -d alxmamaev/flowers-recognition --unzip -p /content/data/
Downloading flowers-recognition.zip to /content/data
 99% 447M/450M [00:04<00:00, 110MB/s] 
100% 450M/450M [00:04<00:00, 115MB/s]

Finally, we split the dataset into train (70%), validation (15%) and test (15%) sets:

In [0]:
data_path = '/content/data/flowers/'
out_path = '/content/data/flowers-split'
classes = ['daisy', 'dandelion', 'rose', 'sunflower', 'tulip']
train_split = 0.7
In [0]:
split_flower_data(data_path, out_path, classes, train_split)

2. Train a baseline CNN

To make sure everything is OK, let's train a simple CNN model as a baseline.

First we create data generators for our flower images :

In [0]:
train_path = "/content/data/flowers-split/train/"
valid_path = "/content/data/flowers-split/valid/"
In [0]:
train_gen = tf.keras.preprocessing.image.ImageDataGenerator(
    rescale=1./255    
)

val_gen = tf.keras.preprocessing.image.ImageDataGenerator(
    rescale=1./255,
)

train_data = train_gen.flow_from_directory(train_path, target_size=(224, 224), batch_size=128)
val_data = val_gen.flow_from_directory(valid_path, target_size=(224, 224), batch_size=256, shuffle=False)
Found 3028 images belonging to 5 classes.
Found 651 images belonging to 5 classes.

Next, let's create the Keras model:

In [0]:
tf.keras.backend.clear_session()

simple_cnn = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(64, 5, activation='relu', input_shape=(224, 224, 3),
                           kernel_initializer=tf.keras.initializers.he_normal()),
    tf.keras.layers.Conv2D(64, 5, activation='relu',
                           kernel_initializer=tf.keras.initializers.he_normal()),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Conv2D(128, 5, activation='relu',
                           kernel_initializer=tf.keras.initializers.he_normal()),
    tf.keras.layers.Conv2D(128, 5, activation='relu',
                           kernel_initializer=tf.keras.initializers.he_normal()),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Dense(128, activation='relu',
                          kernel_initializer=tf.keras.initializers.he_normal()),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Dense(5, activation='softmax')
])

simple_cnn_tpu = tf.contrib.tpu.keras_to_tpu_model(
        simple_cnn,
        strategy=tf.contrib.tpu.TPUDistributionStrategy(
            tf.contrib.cluster_resolver.TPUClusterResolver(tpu='grpc://' + os.environ['COLAB_TPU_ADDR'])
        )
    )
In [0]:
simple_cnn_tpu.compile(
    optimizer=tf.train.AdagradOptimizer(learning_rate=1e-3),
    loss='categorical_crossentropy',
    metrics=['accuracy']
)
In [0]:
print(simple_cnn_tpu.summary())
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_input (InputLayer)    (None, 224, 224, 3)       0         
_________________________________________________________________
conv2d (Conv2D)              (None, 220, 220, 64)      4864      
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 216, 216, 64)      102464    
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 108, 108, 64)      0         
_________________________________________________________________
batch_normalization_v1 (Batc (None, 108, 108, 64)      256       
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 104, 104, 128)     204928    
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 100, 100, 128)     409728    
_________________________________________________________________
global_average_pooling2d (Gl (None, 128)               0         
_________________________________________________________________
batch_normalization_v1_1 (Ba (None, 128)               512       
_________________________________________________________________
dense (Dense)                (None, 128)               16512     
_________________________________________________________________
batch_normalization_v1_2 (Ba (None, 128)               512       
_________________________________________________________________
dense_1 (Dense)              (None, 5)                 645       
=================================================================
Total params: 740,421
Trainable params: 739,781
Non-trainable params: 640
_________________________________________________________________
None

Before starting training, let's add some callbacks:

  • tensorboard
  • early_stopping
  • model checkpointing
In [0]:
os.makedirs("/content/data/checkpoints/", exist_ok=True)
In [0]:
logdir = '/content/data/logs/simple_cnn_tpu'
os.makedirs(logdir, exist_ok=True)
tb_callback = tf.keras.callbacks.TensorBoard(log_dir=logdir)
In [0]:
earlystop_callback = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=50, restore_best_weights=True, verbose=1)
In [0]:
ckpt_path = '/content/data/checkpoints/simple_cnn_tpu.h5'
os.makedirs("/content/data/checkpoints/", exist_ok=True)
ckpt_callback = tf.keras.callbacks.ModelCheckpoint(ckpt_path)
In [0]:
list_callbacks = [earlystop_callback, ckpt_callback, tb_callback]
In [0]:
history = simple_cnn_tpu.fit_generator(train_data,
                                       validation_data=val_data,
                                       epochs=300,
                                       callbacks=list_callbacks)

Now that training has finished, let's visualize the training and validation metrics:

In [0]:
plot_training_history(history)

Let's download the history data and the model :

In [0]:
history_dict = {k: [float(i) for i in v] for k, v in history.history.items()}
json.dump(history_dict, open('/content/data/checkpoints/simple_cnn_tpu.json', 'w'))
In [0]:
files.download("/content/data/checkpoints/simple_cnn_tpu.json")
In [0]:
files.download("/content/data/checkpoints/simple_cnn_tpu.h5")

Finally, let's evaluate the model by plotting some classification metrics and the confusion matrix.

In [0]:
simple_cnn_tpu.load_weights("/content/data/checkpoints/simple_cnn_tpu.h5")

Get the true labels from generator, and predictions from the model :

In [0]:
y_true = np.concatenate([y for y in inference_val_gen(val_data, gen_type="y")])
In [0]:
y_true.shape
Out[0]:
(656,)
In [0]:
# Make predictions
y_pred = simple_cnn_tpu.predict_generator(inference_val_gen(val_data), steps=len(val_data), verbose=1)
y_pred = np.argmax(y_pred, axis=1)
INFO:tensorflow:New input shapes; (re-)compiling: mode=infer (# of cores 8), [TensorSpec(shape=(18, 224, 224, 3), dtype=tf.float32, name='conv2d_input_10')]
INFO:tensorflow:Overriding default placeholder.
INFO:tensorflow:Remapping placeholder for conv2d_input
INFO:tensorflow:Started compiling
INFO:tensorflow:Finished compiling. Time elapsed: 4.6480162143707275 secs
3/3 [==============================] - 8s 3s/step
In [0]:
y_pred.shape
Out[0]:
(656,)
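Note that the shapes are 656 rather than 651: 656 is divisible by the 8 TPU cores, so `inference_val_gen` presumably pads the last batch. A minimal sketch of such a generator (the real one lives in `model_utils`; the padding behavior is an assumption inferred from the shapes above):

```python
import numpy as np

def inference_val_gen_sketch(data_iter, gen_type="x", n_cores=8):
    """Yield image batches (gen_type='x') or integer labels (gen_type='y'),
    padding the final batch so the total sample count is divisible by n_cores."""
    for i in range(len(data_iter)):
        x, y = data_iter[i]
        if i == len(data_iter) - 1:  # the last batch may be smaller
            pad = (-len(x)) % n_cores  # here: 5 extra samples, 651 + 5 = 656
            x = np.concatenate([x, x[:pad]])
            y = np.concatenate([y, y[:pad]])
        yield x if gen_type == "x" else np.argmax(y, axis=1)
```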

plot the confusion matrix

In [0]:
# get the labels and labels names
label_items = sorted(list(val_data.class_indices.items()), key=lambda x: x[1])
label_names = [item[0] for item in label_items]
labels = [item[1] for item in label_items]
print(labels)
print(label_names)
[0, 1, 2, 3, 4]
['daisy', 'dandelion', 'rose', 'sunflower', 'tulip']
In [0]:
plot_confusion_matrix(y_true, y_pred, title='Classification Report - CNN baseline',
                          labels_=labels, target_names=label_names, normalize=False)
              precision    recall  f1-score   support

       daisy       0.85      0.72      0.78       116
   dandelion       0.81      0.87      0.84       158
        rose       0.77      0.59      0.67       118
   sunflower       0.89      0.79      0.84       111
       tulip       0.65      0.84      0.73       153

    accuracy                           0.77       656
   macro avg       0.79      0.76      0.77       656
weighted avg       0.78      0.77      0.77       656

We got an accuracy of 77% with this baseline CNN. The most difficult class for the model is rose, which has the lowest recall (59%).

Often, when the model is wrong about rose, it misclassifies it as tulip.


3. Baseline + data augmentation

Let's add some data augmentation techniques to see whether they help improve the model's performance.

In [0]:
train_path = "/content/data/flowers-split/train/"
valid_path = "/content/data/flowers-split/valid/"

train_gen = tf.keras.preprocessing.image.ImageDataGenerator(
    rescale=1./255,
    horizontal_flip=True,
    vertical_flip=True,
    rotation_range=15,
    width_shift_range=0.2,
    height_shift_range=0.2
)


val_gen = tf.keras.preprocessing.image.ImageDataGenerator(
    rescale=1./255,
)

train_data = train_gen.flow_from_directory(train_path, target_size=(224, 224), batch_size=128)
val_data = val_gen.flow_from_directory(valid_path, target_size=(224, 224), batch_size=256, shuffle=False)
Found 3028 images belonging to 5 classes.
Found 651 images belonging to 5 classes.
In [0]:
tf.keras.backend.clear_session()

simple_cnn = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(64, 5, activation='relu', input_shape=(224, 224, 3),
                           kernel_initializer=tf.keras.initializers.he_normal()),
    tf.keras.layers.Conv2D(64, 5, activation='relu',
                           kernel_initializer=tf.keras.initializers.he_normal()),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Conv2D(128, 5, activation='relu',
                           kernel_initializer=tf.keras.initializers.he_normal()),
    tf.keras.layers.Conv2D(128, 5, activation='relu',
                           kernel_initializer=tf.keras.initializers.he_normal()),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Dense(128, activation='relu',
                          kernel_initializer=tf.keras.initializers.he_normal()),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Dense(5, activation='softmax')
])

simple_cnn_tpu_dataaug = tf.contrib.tpu.keras_to_tpu_model(
        simple_cnn,
        strategy=tf.contrib.tpu.TPUDistributionStrategy(
            tf.contrib.cluster_resolver.TPUClusterResolver(tpu='grpc://' + os.environ['COLAB_TPU_ADDR'])
        )
    )
In [0]:
simple_cnn_tpu_dataaug.compile(
    optimizer=tf.train.AdamOptimizer(learning_rate=1e-3),
    loss='categorical_crossentropy',
    metrics=['accuracy']
)
In [0]:
print(simple_cnn_tpu_dataaug.summary())
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_input (InputLayer)    (None, 224, 224, 3)       0         
_________________________________________________________________
conv2d (Conv2D)              (None, 220, 220, 64)      4864      
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 216, 216, 64)      102464    
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 108, 108, 64)      0         
_________________________________________________________________
batch_normalization_v1 (Batc (None, 108, 108, 64)      256       
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 104, 104, 128)     204928    
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 100, 100, 128)     409728    
_________________________________________________________________
global_average_pooling2d (Gl (None, 128)               0         
_________________________________________________________________
batch_normalization_v1_1 (Ba (None, 128)               512       
_________________________________________________________________
dense (Dense)                (None, 128)               16512     
_________________________________________________________________
batch_normalization_v1_2 (Ba (None, 128)               512       
_________________________________________________________________
dense_1 (Dense)              (None, 5)                 645       
=================================================================
Total params: 740,421
Trainable params: 739,781
Non-trainable params: 640
_________________________________________________________________
None
In [0]:
earlystop_callback = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=50, restore_best_weights=True, verbose=1)
In [0]:
ckpt_path = '/content/data/checkpoints/simple_cnn_tpu_dataaug.h5'
os.makedirs("/content/data/checkpoints/", exist_ok=True)
ckpt_callback = tf.keras.callbacks.ModelCheckpoint(ckpt_path)
In [0]:
list_callbacks = [earlystop_callback, ckpt_callback]
In [0]:
history = simple_cnn_tpu_dataaug.fit_generator(train_data,
                                       validation_data=val_data,
                                       epochs=300,
                                       callbacks=list_callbacks)
In [0]:
plot_training_history(history)
In [0]:
history_dict = {k: [float(i) for i in v] for k, v in history.history.items()}
json.dump(history_dict, open('/content/data/checkpoints/simple_cnn_tpu_dataaug.json', 'w'))
In [0]:
files.download("/content/data/checkpoints/simple_cnn_tpu_dataaug.json")
In [0]:
files.download("/content/data/checkpoints/simple_cnn_tpu_dataaug.h5")

Model evaluation with classification report

In [0]:
# load the best model

simple_cnn_tpu_dataaug.load_weights("/content/data/checkpoints/simple_cnn_tpu_dataaug.h5")

Get the true labels from generator, and predictions from the model :

In [0]:
y_true = np.concatenate([y for y in inference_val_gen(val_data, gen_type="y")])
In [0]:
y_true.shape
Out[0]:
(656,)
In [0]:
y_pred = simple_cnn_tpu_dataaug.predict_generator(inference_val_gen(val_data), steps=len(val_data), verbose=1)
y_pred = np.argmax(y_pred, axis=1)
y_pred.shape
3/3 [==============================] - 3s 1s/step
Out[0]:
(656,)

plot the confusion matrix

In [0]:
# get the labels and labels names
label_items = sorted(list(val_data.class_indices.items()), key=lambda x: x[1])
label_names = [item[0] for item in label_items]
labels = [item[1] for item in label_items]
print(labels)
print(label_names)
[0, 1, 2, 3, 4]
['daisy', 'dandelion', 'rose', 'sunflower', 'tulip']
In [0]:
plot_confusion_matrix(y_true, y_pred, title='Classification Report - CNN baseline',
                          labels_=labels, target_names=label_names, normalize=False)
              precision    recall  f1-score   support

       daisy       0.93      0.70      0.80       116
   dandelion       0.85      0.91      0.87       158
        rose       0.74      0.73      0.73       118
   sunflower       0.95      0.86      0.90       111
       tulip       0.70      0.84      0.77       153

    accuracy                           0.81       656
   macro avg       0.83      0.81      0.81       656
weighted avg       0.83      0.81      0.81       656

Data augmentation gained us 4 points of accuracy over the baseline, going from 77% to 81%.

Here also, the most difficult classes for this model are daisy (70% recall) and rose (73% recall).

As previously, the model confuses rose with tulip when making mistakes, so we may need to put more weight on this class to help improve its classification, as sketched below.
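As a sketch of that idea, `tf.keras` accepts a `class_weight` dict in `fit_generator`; the weight for rose (index 2) below is purely illustrative, not tuned:

```python
# upweight the rose class so its misclassifications cost more during training
class_weight = {0: 1.0, 1: 1.0, 2: 1.5, 3: 1.0, 4: 1.0}  # illustrative values

# history = simple_cnn_tpu_dataaug.fit_generator(
#     train_data, validation_data=val_data, epochs=300,
#     callbacks=list_callbacks, class_weight=class_weight)
```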


Hyperparameter search with talos

Talos is a hyperparameter tuning library for Keras models. You can check it here.

We'll tune three hyperparameters :

  • kernel size: 3 or 5
  • dropout rate: 0.25 or 0.5
  • activation function for hidden layers: relu or elu
In [0]:
train_path = "/content/data/flowers-split/train/"
valid_path = "/content/data/flowers-split/valid/"

train_gen = tf.keras.preprocessing.image.ImageDataGenerator(
    rescale=1./255,
    horizontal_flip=True,
    vertical_flip=True,
    rotation_range=15,
    width_shift_range=0.2,
    height_shift_range=0.2
    
)


val_gen = tf.keras.preprocessing.image.ImageDataGenerator(
    rescale=1./255,
)

train_data = train_gen.flow_from_directory(train_path, target_size=(224, 224), batch_size=128)
val_data = val_gen.flow_from_directory(valid_path, target_size=(224, 224), batch_size=512, shuffle=False)
Found 3028 images belonging to 5 classes.
Found 651 images belonging to 5 classes.

Create a directory in Google Drive where the results of the hyperparameter search will be saved:

In [0]:
result_path = "/content/drive/My Drive/DeepLearning/flowers/hp_search_results/run1"
os.makedirs(result_path, exist_ok=True)

First, we define a dictionary containing the ranges/choices of the hyperparameters we want to optimize:

In [0]:
params = {
    "kernel_size": [3, 5],
    "dropout": [0.25, 0.50],
    "activation": ["relu", "elu"],
}

Then we define a function that takes as input the data and the hyperparameters dictionary. Talos expects the data in four parts: training features, training labels, validation features and validation labels. But since we are working with generators here, I'll just pass dummy x_train, y_train, x_val and y_val to the optimization function, and inside the function I'll access the generators I created earlier as global variables to do the training:

In [0]:
def model_fn(x_train, y_train, x_val, y_val, params):

    tf.keras.backend.clear_session()
    model = tf.keras.models.Sequential([
        tf.keras.layers.Conv2D(64, params['kernel_size'],
                               activation=params['activation'],
                               input_shape=(224, 224, 3),
                               kernel_initializer=tf.keras.initializers.he_normal()),
        tf.keras.layers.Conv2D(64, params['kernel_size'],
                               activation=params['activation'],
                               kernel_initializer=tf.keras.initializers.he_normal()),
        tf.keras.layers.MaxPooling2D(2),
        tf.keras.layers.Dropout(params['dropout']),
        tf.keras.layers.BatchNormalization(),

        tf.keras.layers.Conv2D(128, params['kernel_size'],
                               activation=params['activation'],
                               kernel_initializer=tf.keras.initializers.he_normal()),
        tf.keras.layers.Conv2D(128, params['kernel_size'],
                               activation=params['activation'],
                               kernel_initializer=tf.keras.initializers.he_normal()),
        tf.keras.layers.GlobalAvgPool2D(),
        tf.keras.layers.Dropout(params['dropout']),
        tf.keras.layers.BatchNormalization(),
        tf.keras.layers.Dense(64,
                              activation=params['activation'],
                              kernel_initializer=tf.keras.initializers.he_normal()),
        tf.keras.layers.BatchNormalization(),
        tf.keras.layers.Dense(5, activation='softmax')
    ])

    tpu_model = tf.contrib.tpu.keras_to_tpu_model(
        model,
        strategy=tf.contrib.tpu.TPUDistributionStrategy(
            tf.contrib.cluster_resolver.TPUClusterResolver(
                tpu='grpc://' + os.environ['COLAB_TPU_ADDR'])
        )
    )

    tpu_model.compile(
        optimizer=tf.train.AdamOptimizer(learning_rate=1e-3, ),
        loss=tf.keras.losses.categorical_crossentropy,
        metrics=['acc']
    )
    
    out = tpu_model.fit_generator(train_data, validation_data=val_data,
                epochs=75)
    return out, tpu_model.sync_to_cpu()
In [0]:
x_dum, y_dum = np.zeros((1, 224, 224, 3)), np.zeros((1, 5))
In [0]:
h = ta.Scan(x_dum, y_dum, params, model_fn, experiment_no="1")
In [0]:
# Get the best model index, i.e. the one with the highest 'val_acc'
model_id = h.data['val_acc'].astype('float').argmax() - 1
# Clear any previous TensorFlow session.
tf.keras.backend.clear_session()

# Load the model parameters from the scanner.
model = tf.keras.models.model_from_json(h.saved_models[model_id])
model.set_weights(h.saved_weights[model_id])
model.summary()
model.save(os.path.join(result_path, 'best_model.h5'))
h.data.to_csv(os.path.join(result_path, 'configs.csv'), index=False)
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d (Conv2D)              (None, 222, 222, 64)      1792      
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 220, 220, 64)      36928     
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 110, 110, 64)      0         
_________________________________________________________________
dropout (Dropout)            (None, 110, 110, 64)      0         
_________________________________________________________________
batch_normalization_v1 (Batc (None, 110, 110, 64)      256       
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 108, 108, 128)     73856     
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 106, 106, 128)     147584    
_________________________________________________________________
global_average_pooling2d (Gl (None, 128)               0         
_________________________________________________________________
dropout_1 (Dropout)          (None, 128)               0         
_________________________________________________________________
batch_normalization_v1_1 (Ba (None, 128)               512       
_________________________________________________________________
dense (Dense)                (None, 64)                8256      
_________________________________________________________________
batch_normalization_v1_2 (Ba (None, 64)                256       
_________________________________________________________________
dense_1 (Dense)              (None, 5)                 325       
=================================================================
Total params: 269,765
Trainable params: 269,253
Non-trainable params: 512
_________________________________________________________________

Load the best configuration and train a model using it:

In [0]:
configs = pd.read_csv(os.path.join(result_path, 'configs.csv')).sort_values(['val_acc'])
configs
Out[0]:
   round_epochs      loss       acc  val_loss   val_acc  kernel_size  dropout activation
0            75  0.664851  0.752645  0.652653  0.733025            3     0.25        elu
5            75  0.766767  0.708333  0.675255  0.745370            3     0.50        elu
3            75  0.646514  0.751323  0.642547  0.750000            5     0.50        elu
4            75  0.610366  0.772817  0.604745  0.751543            3     0.50       relu
2            75  0.552488  0.798611  0.601875  0.791667            5     0.25        elu
7            75  0.519103  0.810516  0.538852  0.805556            5     0.50       relu
1            75  0.427520  0.844246  0.539754  0.811728            3     0.25       relu
6            75  0.385719  0.862765  0.518036  0.831790            5     0.25       relu

The best hyperparameter configuration is:

  • kernel_size : 5
  • activation : relu
  • dropout : 0.25
In [0]:
tf.keras.backend.clear_session()
model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(64, 5,
                           activation='relu',
                           input_shape=(224, 224, 3),
                           kernel_initializer=tf.keras.initializers.he_normal()),
    tf.keras.layers.Conv2D(64, 5,
                           activation='relu',
                           kernel_initializer=tf.keras.initializers.he_normal()),
    tf.keras.layers.MaxPooling2D(2),
    tf.keras.layers.Dropout(0.25),
    tf.keras.layers.BatchNormalization(),

    tf.keras.layers.Conv2D(128, 5,
                           activation='relu',
                           kernel_initializer=tf.keras.initializers.he_normal()),
    tf.keras.layers.Conv2D(128, 5,
                           activation='relu',
                           kernel_initializer=tf.keras.initializers.he_normal()),
    tf.keras.layers.GlobalAvgPool2D(),
    tf.keras.layers.Dropout(0.25),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Dense(64,
                          activation='relu',
                          kernel_initializer=tf.keras.initializers.he_normal()),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Dense(5, activation='softmax')
])

tpu_model = tf.contrib.tpu.keras_to_tpu_model(
    model,
    strategy=tf.contrib.tpu.TPUDistributionStrategy(
        tf.contrib.cluster_resolver.TPUClusterResolver(
            tpu='grpc://' + os.environ['COLAB_TPU_ADDR'])
    )
)

tpu_model.compile(
    optimizer=tf.train.AdamOptimizer(learning_rate=1e-3, ),
    loss=tf.keras.losses.categorical_crossentropy,
    metrics=['acc']
)

print(tpu_model.summary())
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/layers/core.py:143: calling dropout (from tensorflow.python.ops.nn_ops) with keep_prob is deprecated and will be removed in a future version.
Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.
INFO:tensorflow:Querying Tensorflow master (grpc://10.47.236.90:8470) for TPU system metadata.
INFO:tensorflow:Found TPU system:
INFO:tensorflow:*** Num TPU Cores: 8
INFO:tensorflow:*** Num TPU Workers: 1
INFO:tensorflow:*** Num TPU Cores Per Worker: 8
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:CPU:0, CPU, -1, 13185486934593269880)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:XLA_CPU:0, XLA_CPU, 17179869184, 644367969013081572)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:0, TPU, 17179869184, 8770085351396939743)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:1, TPU, 17179869184, 7391174983090245070)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:2, TPU, 17179869184, 3562341780813676187)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:3, TPU, 17179869184, 11624688845676187994)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:4, TPU, 17179869184, 11830354006431143299)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:5, TPU, 17179869184, 3432732814019816995)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:6, TPU, 17179869184, 2397811451879219779)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:7, TPU, 17179869184, 3365187378787287639)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU_SYSTEM:0, TPU_SYSTEM, 8589934592, 17898618049197454467)
WARNING:tensorflow:tpu_model (from tensorflow.contrib.tpu.python.tpu.keras_support) is experimental and may change or be removed at any time, and without warning.
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_input (InputLayer)    (None, 224, 224, 3)       0         
_________________________________________________________________
conv2d (Conv2D)              (None, 220, 220, 64)      4864      
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 216, 216, 64)      102464    
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 108, 108, 64)      0         
_________________________________________________________________
dropout (Dropout)            (None, 108, 108, 64)      0         
_________________________________________________________________
batch_normalization_v1 (Batc (None, 108, 108, 64)      256       
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 104, 104, 128)     204928    
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 100, 100, 128)     409728    
_________________________________________________________________
global_average_pooling2d (Gl (None, 128)               0         
_________________________________________________________________
dropout_1 (Dropout)          (None, 128)               0         
_________________________________________________________________
batch_normalization_v1_1 (Ba (None, 128)               512       
_________________________________________________________________
dense (Dense)                (None, 64)                8256      
_________________________________________________________________
batch_normalization_v1_2 (Ba (None, 64)                256       
_________________________________________________________________
dense_1 (Dense)              (None, 5)                 325       
=================================================================
Total params: 731,589
Trainable params: 731,077
Non-trainable params: 512
_________________________________________________________________
None
In [0]:
# define callbacks
earlystop_callback = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=100, restore_best_weights=True, verbose=1)

ckpt_path = '/content/data/checkpoints/best_from_hp_search.h5'
os.makedirs("/content/data/checkpoints/", exist_ok=True)
ckpt_callback = tf.keras.callbacks.ModelCheckpoint(ckpt_path, monitor='val_loss')


# special callback to copy the best model checkpoint from local storage to google drive
dest_path = "/content/drive/My Drive/DeepLearning/flowers/hp_search_results/best"
os.makedirs(dest_path, exist_ok=True)
copy_cb = CopyCheckpointToDrive(ckpt_path, dest_path)

list_callbacks = [earlystop_callback, ckpt_callback, copy_cb]
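For reference, a minimal sketch of what a callback like `CopyCheckpointToDrive` could look like (the actual implementation lives in `model_utils`; the copy-on-epoch-end timing is an assumption):

```python
import os
import shutil
import tensorflow as tf

class CopyCheckpointToDriveSketch(tf.keras.callbacks.Callback):
    """Copy the latest checkpoint to Google Drive after each epoch,
    so it survives Colab runtime disconnects."""

    def __init__(self, ckpt_path, dest_dir):
        super().__init__()
        self.ckpt_path = ckpt_path
        self.dest_dir = dest_dir

    def on_epoch_end(self, epoch, logs=None):
        if os.path.exists(self.ckpt_path):
            shutil.copy(self.ckpt_path, self.dest_dir)
```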
In [0]:
%%time
history = tpu_model.fit_generator(train_data, validation_data=val_data, callbacks=list_callbacks,
                                    epochs=300)
In [0]:
plot_training_history(history)

Get the true labels from generator, and predictions from the model :

In [0]:
y_true = np.concatenate([y for y in inference_val_gen(val_data, gen_type="y")])
In [0]:
y_true.shape
Out[0]:
(656,)
In [0]:
# Load best model
tpu_model.load_weights("/content/data/checkpoints/best_from_hp_search.h5")
In [0]:
y_pred = tpu_model.predict_generator(inference_val_gen(val_data), steps=len(val_data), verbose=1)
y_pred = np.argmax(y_pred, axis=1)
y_pred.shape
INFO:tensorflow:New input shapes; (re-)compiling: mode=infer (# of cores 8), [TensorSpec(shape=(64, 224, 224, 3), dtype=tf.float32, name='conv2d_input_10')]
INFO:tensorflow:Overriding default placeholder.
INFO:tensorflow:Remapping placeholder for conv2d_input
INFO:tensorflow:Started compiling
INFO:tensorflow:Finished compiling. Time elapsed: 7.300978422164917 secs
INFO:tensorflow:Setting weights on TPU model.
1/2 [==============>...............] - ETA: 12sINFO:tensorflow:New input shapes; (re-)compiling: mode=infer (# of cores 8), [TensorSpec(shape=(18, 224, 224, 3), dtype=tf.float32, name='conv2d_input_10')]
INFO:tensorflow:Overriding default placeholder.
INFO:tensorflow:Remapping placeholder for conv2d_input
INFO:tensorflow:Started compiling
INFO:tensorflow:Finished compiling. Time elapsed: 5.988226890563965 secs
2/2 [==============================] - 20s 10s/step
Out[0]:
(656,)

plot the confusion matrix

In [0]:
# get the labels and labels names
label_items = sorted(list(val_data.class_indices.items()), key=lambda x: x[1])
label_names = [item[0] for item in label_items]
labels = [item[1] for item in label_items]
print(labels)
print(label_names)
[0, 1, 2, 3, 4]
['daisy', 'dandelion', 'rose', 'sunflower', 'tulip']
In [0]:
plot_confusion_matrix(y_true, y_pred, title='Classification Report - CNN baseline',
                          labels_=labels, target_names=label_names, normalize=False)
              precision    recall  f1-score   support

       daisy       0.99      0.76      0.86       116
   dandelion       0.91      0.91      0.91       158
        rose       0.81      0.67      0.73       118
   sunflower       0.87      0.94      0.90       111
       tulip       0.73      0.92      0.81       153

    accuracy                           0.84       656
   macro avg       0.86      0.84      0.84       656
weighted avg       0.86      0.84      0.84       656

The hyperparameter search gave us another 1% gain in accuracy. We are still facing an overfitting issue (see the accuracy and loss curves above). This could be explained by the fact that we only have a few hundred images per class, which is rather small for learning a model capable of generalizing well.

Thus, in the next sections we'll use transfer learning to overcome the small-data issue.


Transfer learning

Next, we'll leverage the power of transfer learning, given the small amount of data we have.

VGG16 as feature extractor
In [0]:
tf.keras.backend.clear_session()


# get the pretrained VGG16 on imagenet
vgg_base = tf.keras.applications.vgg16.VGG16(
    include_top=False,
    weights='imagenet',
    input_shape=(224, 224, 3),
    pooling='avg'
)


# freeze pretrained weights
for layer in tqdm.tqdm_notebook(vgg_base.layers):
    layer.trainable = False
    

# add classification layer
vgg_full = tf.keras.Sequential([
    vgg_base,
    tf.keras.layers.Dense(256, activation='relu', kernel_initializer=tf.keras.initializers.he_normal()),
    tf.keras.layers.Dense(5, activation='softmax')
])



vgg_full.compile(
    optimizer=tf.keras.optimizers.Adam(lr=1e-4),
    loss='categorical_crossentropy',
    metrics=['acc']
)

print(vgg_full.summary())
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
vgg16 (Model)                (None, 512)               14714688  
_________________________________________________________________
dense (Dense)                (None, 256)               131328    
_________________________________________________________________
dense_1 (Dense)              (None, 5)                 1285      
=================================================================
Total params: 14,847,301
Trainable params: 132,613
Non-trainable params: 14,714,688
_________________________________________________________________
None
In [0]:
# The data generators
train_path = "/content/data/flowers-split/train/"
valid_path = "/content/data/flowers-split/valid/"

train_gen = tf.keras.preprocessing.image.ImageDataGenerator(
    rescale=1./255,
    horizontal_flip=True,
    vertical_flip=True,
    rotation_range=15,
    width_shift_range=0.2,
    height_shift_range=0.2
    
)


val_gen = tf.keras.preprocessing.image.ImageDataGenerator(
    rescale=1./255,
)

train_data = train_gen.flow_from_directory(train_path, target_size=(224, 224), batch_size=64)
val_data = val_gen.flow_from_directory(valid_path, target_size=(224, 224), batch_size=64, shuffle=False)
Found 3028 images belonging to 5 classes.
Found 651 images belonging to 5 classes.
In [0]:
# define callbacks
earlystop_callback = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=50, restore_best_weights=True, verbose=1)

ckpt_path = '/content/data/checkpoints/vgg16_fext.h5'
os.makedirs("/content/data/checkpoints/", exist_ok=True)
ckpt_callback = tf.keras.callbacks.ModelCheckpoint(ckpt_path, monitor='val_loss', verbose=1)


# special callback to copy the best model checkpoint from local storage to google drive
dest_path = "/content/drive/My Drive/DeepLearning/flowers/transfer/vgg/feature_ext"
os.makedirs(dest_path, exist_ok=True)
copy_cb = CopyCheckpointToDrive(ckpt_path, dest_path)

list_callbacks = [earlystop_callback, ckpt_callback, copy_cb]
In [0]:
%%time

# train

history = vgg_full.fit_generator(train_data, validation_data=val_data, callbacks=list_callbacks,
                                        epochs=300)
In [0]:
# plot the learning curves
plot_training_history(history)

Get the true labels from generator, and predictions from the model :

In [0]:
val_data = val_gen.flow_from_directory(valid_path, target_size=(224, 224), batch_size=64, shuffle=False)
y_true = []
len_val_data = len(val_data)
for i, (x, y) in enumerate(tqdm.tqdm_notebook(val_data)):
    y_true.append(np.argmax(y, axis=1))
    if i == len_val_data - 1:
        break
y_true = np.concatenate(y_true)
Found 651 images belonging to 5 classes.

In [0]:
y_true.shape
Out[0]:
(651,)
In [0]:
# Load best model
vgg_full.load_weights("/content/data/checkpoints/vgg16_fext.h5")
In [0]:
y_pred = vgg_full.predict_generator(val_data, verbose=1)
y_pred = np.argmax(y_pred, axis=1)
y_pred.shape
11/11 [==============================] - 3s 284ms/step
Out[0]:
(651,)

plot the confusion matrix

In [0]:
# get the labels and labels names
label_items = sorted(list(val_data.class_indices.items()), key=lambda x: x[1])
label_names = [item[0] for item in label_items]
labels = [item[1] for item in label_items]
print(labels)
print(label_names)
[0, 1, 2, 3, 4]
['daisy', 'dandelion', 'rose', 'sunflower', 'tulip']
In [0]:
plot_confusion_matrix(y_true, y_pred, title='Classification Report - VGG16 features extractor',
                          labels_=labels, target_names=label_names, normalize=False)
              precision    recall  f1-score   support

       daisy       0.82      0.81      0.82       116
   dandelion       0.89      0.89      0.89       158
        rose       0.79      0.85      0.82       118
   sunflower       0.83      0.85      0.84       111
       tulip       0.81      0.76      0.79       148

    accuracy                           0.83       651
   macro avg       0.83      0.83      0.83       651
weighted avg       0.83      0.83      0.83       651

Using VGG16 as a feature extractor we were able to achieve 83% accuracy, 1 point less than the best model from the talos hyperparameter search. However, with VGG16 the precision and recall metrics are more balanced across classes.

In the next section we'll unfreeze the VGG convolutional layers and finetune the whole model.

Finetuning VGG16

Here we'll use a lower learning rate to avoid perturbing the information already learned by the pretrained layers.

Let's first load the model from the previous section so we can start finetuning from it:

In [0]:
tf.keras.backend.clear_session()
vgg_finetune = tf.keras.models.load_model("/content/data/checkpoints/vgg16_fext.h5")
print(vgg_finetune.summary())
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
vgg16 (Model)                (None, 512)               14714688  
_________________________________________________________________
dense (Dense)                (None, 256)               131328    
_________________________________________________________________
dense_1 (Dense)              (None, 5)                 1285      
=================================================================
Total params: 14,847,301
Trainable params: 132,613
Non-trainable params: 14,714,688
_________________________________________________________________
None

Unfreeze the layers and recompile the model with a lower learning rate:

In [0]:
for layer in vgg_finetune.layers:
    layer.trainable = True
    
vgg_finetune.compile(
    optimizer=tf.keras.optimizers.Adam(lr=1e-5),
    loss='categorical_crossentropy',
    metrics=['accuracy']
)

print(vgg_finetune.summary())
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
vgg16 (Model)                (None, 512)               14714688  
_________________________________________________________________
dense (Dense)                (None, 256)               131328    
_________________________________________________________________
dense_1 (Dense)              (None, 5)                 1285      
=================================================================
Total params: 14,847,301
Trainable params: 14,847,301
Non-trainable params: 0
_________________________________________________________________
None
In [0]:
# The data generators
train_path = "/content/data/flowers-split/train/"
valid_path = "/content/data/flowers-split/valid/"

train_gen = tf.keras.preprocessing.image.ImageDataGenerator(
    rescale=1./255,
    horizontal_flip=True,
    vertical_flip=True,
    rotation_range=15,
    width_shift_range=0.2,
    height_shift_range=0.2
    
)


val_gen = tf.keras.preprocessing.image.ImageDataGenerator(
    rescale=1./255,
)

train_data = train_gen.flow_from_directory(train_path, target_size=(224, 224), batch_size=64)
val_data = val_gen.flow_from_directory(valid_path, target_size=(224, 224), batch_size=64, shuffle=False)
Found 3028 images belonging to 5 classes.
Found 651 images belonging to 5 classes.
In [0]:
# define callbacks
earlystop_callback = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=100, restore_best_weights=True, verbose=1)

ckpt_path = '/content/data/checkpoints/vgg16_finetune.h5'
os.makedirs("/content/data/checkpoints/", exist_ok=True)
ckpt_callback = tf.keras.callbacks.ModelCheckpoint(ckpt_path, monitor='val_loss', verbose=1)


# special callback to copy the best model checkpoint from local storage to google drive
dest_path = "/content/drive/My Drive/DeepLearning/flowers/transfer/vgg/finetune"
os.makedirs(dest_path, exist_ok=True)
copy_cb = CopyCheckpointToDrive(ckpt_path, dest_path)

list_callbacks = [earlystop_callback, ckpt_callback, copy_cb]
In [0]:
%%time

# train

history_vgg_finetune = vgg_finetune.fit_generator(train_data, validation_data=val_data, callbacks=list_callbacks,
                                        epochs=300)
In [0]:
plot_training_history(history_vgg_finetune)

Evaluation of the finetuned model

In [0]:
val_data = val_gen.flow_from_directory(valid_path, target_size=(224, 224), batch_size=64, shuffle=False)
y_true = []
len_val_data = len(val_data)
for i, (x, y) in enumerate(tqdm.tqdm_notebook(val_data)):
    y_true.append(np.argmax(y, axis=1))
    if i == len_val_data - 1:
        break
y_true = np.concatenate(y_true)
Found 651 images belonging to 5 classes.

In [0]:
y_true.shape
Out[0]:
(651,)
In [0]:
# Load best model
vgg_finetune.load_weights("/content/data/checkpoints/vgg16_finetune.h5")
In [0]:
y_pred = vgg_finetune.predict_generator(val_data, verbose=1)
y_pred = np.argmax(y_pred, axis=1)
y_pred.shape
11/11 [==============================] - 3s 297ms/step
Out[0]:
(651,)

plot the confusion matrix

In [0]:
# get the labels and labels names
label_items = sorted(list(val_data.class_indices.items()), key=lambda x: x[1])
label_names = [item[0] for item in label_items]
labels = [item[1] for item in label_items]
print(labels)
print(label_names)
[0, 1, 2, 3, 4]
['daisy', 'dandelion', 'rose', 'sunflower', 'tulip']
In [0]:
plot_confusion_matrix(y_true, y_pred, title='Classification Report - VGG16 Finetuned',
                          labels_=labels, target_names=label_names, normalize=False)
              precision    recall  f1-score   support

       daisy       0.91      0.84      0.87       116
   dandelion       0.93      0.94      0.93       158
        rose       0.86      0.86      0.86       118
   sunflower       0.90      0.93      0.91       111
       tulip       0.86      0.88      0.87       148

    accuracy                           0.89       651
   macro avg       0.89      0.89      0.89       651
weighted avg       0.89      0.89      0.89       651

Finetuning the whole network pushes accuracy to 89%, a clear improvement over the 83% obtained with the frozen feature extractor.
Finetuning VGG + dropout regularization
In [0]:
tf.keras.backend.clear_session()


# get the pretrained VGG16 on imagenet
vgg_base = tf.keras.applications.vgg16.VGG16(
    include_top=False,
    weights='imagenet',
    input_shape=(224, 224, 3),
    pooling='avg'
)


print(vgg_base.summary())
Model: "vgg16"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         [(None, 224, 224, 3)]     0         
_________________________________________________________________
block1_conv1 (Conv2D)        (None, 224, 224, 64)      1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, 224, 224, 64)      36928     
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, 112, 112, 64)      0         
_________________________________________________________________
block2_conv1 (Conv2D)        (None, 112, 112, 128)     73856     
_________________________________________________________________
block2_conv2 (Conv2D)        (None, 112, 112, 128)     147584    
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, 56, 56, 128)       0         
_________________________________________________________________
block3_conv1 (Conv2D)        (None, 56, 56, 256)       295168    
_________________________________________________________________
block3_conv2 (Conv2D)        (None, 56, 56, 256)       590080    
_________________________________________________________________
block3_conv3 (Conv2D)        (None, 56, 56, 256)       590080    
_________________________________________________________________
block3_pool (MaxPooling2D)   (None, 28, 28, 256)       0         
_________________________________________________________________
block4_conv1 (Conv2D)        (None, 28, 28, 512)       1180160   
_________________________________________________________________
block4_conv2 (Conv2D)        (None, 28, 28, 512)       2359808   
_________________________________________________________________
block4_conv3 (Conv2D)        (None, 28, 28, 512)       2359808   
_________________________________________________________________
block4_pool (MaxPooling2D)   (None, 14, 14, 512)       0         
_________________________________________________________________
block5_conv1 (Conv2D)        (None, 14, 14, 512)       2359808   
_________________________________________________________________
block5_conv2 (Conv2D)        (None, 14, 14, 512)       2359808   
_________________________________________________________________
block5_conv3 (Conv2D)        (None, 14, 14, 512)       2359808   
_________________________________________________________________
block5_pool (MaxPooling2D)   (None, 7, 7, 512)         0         
_________________________________________________________________
global_average_pooling2d (Gl (None, 512)               0         
=================================================================
Total params: 14,714,688
Trainable params: 14,714,688
Non-trainable params: 0
_________________________________________________________________
None
In [0]:
# Freeze all layers before block5_conv1; finetune block5 onwards

set_trainable = False
for layer in vgg_base.layers:
    if layer.name == 'block5_conv1':
        set_trainable = True
    layer.trainable = set_trainable

print(vgg_base.summary())
Model: "vgg16"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         [(None, 224, 224, 3)]     0         
_________________________________________________________________
block1_conv1 (Conv2D)        (None, 224, 224, 64)      1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, 224, 224, 64)      36928     
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, 112, 112, 64)      0         
_________________________________________________________________
block2_conv1 (Conv2D)        (None, 112, 112, 128)     73856     
_________________________________________________________________
block2_conv2 (Conv2D)        (None, 112, 112, 128)     147584    
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, 56, 56, 128)       0         
_________________________________________________________________
block3_conv1 (Conv2D)        (None, 56, 56, 256)       295168    
_________________________________________________________________
block3_conv2 (Conv2D)        (None, 56, 56, 256)       590080    
_________________________________________________________________
block3_conv3 (Conv2D)        (None, 56, 56, 256)       590080    
_________________________________________________________________
block3_pool (MaxPooling2D)   (None, 28, 28, 256)       0         
_________________________________________________________________
block4_conv1 (Conv2D)        (None, 28, 28, 512)       1180160   
_________________________________________________________________
block4_conv2 (Conv2D)        (None, 28, 28, 512)       2359808   
_________________________________________________________________
block4_conv3 (Conv2D)        (None, 28, 28, 512)       2359808   
_________________________________________________________________
block4_pool (MaxPooling2D)   (None, 14, 14, 512)       0         
_________________________________________________________________
block5_conv1 (Conv2D)        (None, 14, 14, 512)       2359808   
_________________________________________________________________
block5_conv2 (Conv2D)        (None, 14, 14, 512)       2359808   
_________________________________________________________________
block5_conv3 (Conv2D)        (None, 14, 14, 512)       2359808   
_________________________________________________________________
block5_pool (MaxPooling2D)   (None, 7, 7, 512)         0         
_________________________________________________________________
global_average_pooling2d (Gl (None, 512)               0         
=================================================================
Total params: 14,714,688
Trainable params: 7,079,424
Non-trainable params: 7,635,264
_________________________________________________________________
None

Sanity check: only the three conv layers of block5 are now trainable, i.e. 3 × 2,359,808 = 7,079,424 parameters, which matches the trainable parameter count in the summary above.
In [0]:
# add classification layer
vgg_full = tf.keras.Sequential([
    vgg_base,
    tf.keras.layers.Dropout(0.75),
    tf.keras.layers.Dense(256, activation='relu', kernel_initializer=tf.keras.initializers.he_normal()),
    tf.keras.layers.Dense(5, activation='softmax')
])


vgg_full.compile(
    optimizer=tf.keras.optimizers.Adam(lr=1e-4),
    loss='categorical_crossentropy',
    metrics=['acc']
)

print(vgg_full.summary())
W0713 20:40:06.727561 139695093868416 nn_ops.py:4224] Large dropout rate: 0.75 (>0.5). In TensorFlow 2.x, dropout() uses dropout rate instead of keep_prob. Please ensure that this is intended.
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
vgg16 (Model)                (None, 512)               14714688  
_________________________________________________________________
dropout (Dropout)            (None, 512)               0         
_________________________________________________________________
dense (Dense)                (None, 256)               131328    
_________________________________________________________________
dense_1 (Dense)              (None, 5)                 1285      
=================================================================
Total params: 14,847,301
Trainable params: 7,212,037
Non-trainable params: 7,635,264
_________________________________________________________________
None
In [0]:
# The data generators
train_path = "/content/data/flowers-split/train/"
valid_path = "/content/data/flowers-split/valid/"

train_gen = tf.keras.preprocessing.image.ImageDataGenerator(
    rescale=1./255,           # scale pixel values to [0, 1]
    horizontal_flip=True,     # random horizontal and vertical flips
    vertical_flip=True,
    rotation_range=15,        # random rotations of up to 15 degrees
    width_shift_range=0.2,    # random shifts of up to 20% of width/height
    height_shift_range=0.2
)


# no augmentation for the validation data, only rescaling
val_gen = tf.keras.preprocessing.image.ImageDataGenerator(
    rescale=1./255,
)

train_data = train_gen.flow_from_directory(train_path, target_size=(224, 224), batch_size=64)
val_data = val_gen.flow_from_directory(valid_path, target_size=(224, 224), batch_size=64, shuffle=False)
Found 3028 images belonging to 5 classes.
Found 651 images belonging to 5 classes.
In [0]:
# define callbacks
earlystop_callback = tf.keras.callbacks.EarlyStopping(monitor='val_acc', patience=100, restore_best_weights=True, verbose=1)

ckpt_path = '/content/data/checkpoints/vgg16_finetune_dropout.h5'
os.makedirs("/content/data/checkpoints/", exist_ok=True)
ckpt_callback = tf.keras.callbacks.ModelCheckpoint(ckpt_path, monitor='val_acc', save_best_only=True, verbose=1)


# special callback to copy the best model checkpoint from local storage to google drive
dest_path = "/content/drive/My Drive/DeepLearning/flowers/transfer/vgg/finetune"
os.makedirs(dest_path, exist_ok=True)
copy_cb = CopyCheckpointToDrive(ckpt_path, dest_path)

list_callbacks = [earlystop_callback, ckpt_callback, copy_cb]
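
The `CopyCheckpointToDrive` callback comes from my utilities script. For reference, here is a minimal sketch of what such a callback could look like (the actual implementation may differ):

```python
import os
import shutil
import tensorflow as tf

class CopyCheckpointToDrive(tf.keras.callbacks.Callback):
    """Copy the best checkpoint from local storage to Google Drive."""

    def __init__(self, ckpt_path, dest_path):
        super().__init__()
        self.ckpt_path = ckpt_path
        self.dest_path = dest_path

    def on_epoch_end(self, epoch, logs=None):
        # ModelCheckpoint only writes a file when the monitored
        # metric improves, hence the existence check
        if os.path.exists(self.ckpt_path):
            shutil.copy(self.ckpt_path, self.dest_path)
```

This way a Colab disconnection does not lose the best weights, since they are mirrored to Drive after every epoch.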
In [0]:
%%time

# train

history_vgg_finetune = vgg_full.fit_generator(train_data,
                                              validation_data=val_data,
                                              callbacks=list_callbacks,
                                              epochs=300)
In [0]:
plot_training_history(history_vgg_finetune)
[Interactive Bokeh plot: training history curves for the finetuned VGG16]

Evaluation of the finetuned model

In [0]:
val_data = val_gen.flow_from_directory(valid_path, target_size=(224, 224), batch_size=64, shuffle=False)
y_true = []
len_val_data = len(val_data)
# a Keras DirectoryIterator loops forever, so stop after one full pass
for i, (x, y) in enumerate(tqdm.tqdm_notebook(val_data)):
    y_true.append(np.argmax(y, axis=1))
    if i == len_val_data - 1:
        break
y_true = np.concatenate(y_true)
Found 651 images belonging to 5 classes.

In [0]:
y_true.shape
Out[0]:
(651,)
In [0]:
y_pred = vgg_full.predict_generator(val_data, verbose=1)
y_pred = np.argmax(y_pred, axis=1)
y_pred.shape
11/11 [==============================] - 3s 291ms/step
Out[0]:
(651,)

plot the confusion matrix

In [0]:
# get the labels and labels names
label_items = sorted(list(val_data.class_indices.items()), key=lambda x: x[1])
label_names = [item[0] for item in label_items]
labels = [item[1] for item in label_items]
print(labels)
print(label_names)
[0, 1, 2, 3, 4]
['daisy', 'dandelion', 'rose', 'sunflower', 'tulip']
In [0]:
plot_confusion_matrix(y_true, y_pred, title='Classification Report - VGG16 Finetuned',
                          labels_=labels, target_names=label_names, normalize=False)
              precision    recall  f1-score   support

       daisy       0.96      0.87      0.91       116
   dandelion       0.94      0.97      0.96       158
        rose       0.89      0.84      0.86       118
   sunflower       0.91      0.95      0.93       111
       tulip       0.86      0.89      0.87       148

    accuracy                           0.91       651
   macro avg       0.91      0.91      0.91       651
weighted avg       0.91      0.91      0.91       651
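
By the way, `plot_confusion_matrix` is another helper from the utilities script. The textual report above has the same format as scikit-learn's `classification_report`, so a stripped-down version of the helper could look like this (the real one also draws the confusion matrix):

```python
from sklearn.metrics import classification_report, confusion_matrix

def plot_confusion_matrix(y_true, y_pred, title, labels_, target_names, normalize=False):
    # print the per-class precision/recall/f1 report
    print(classification_report(y_true, y_pred, labels=labels_, target_names=target_names))
    # compute the confusion matrix; the real helper also plots it with
    # the given title, normalizing rows when normalize=True
    return confusion_matrix(y_true, y_pred, labels=labels_)
```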


Final evaluation on test set

Finally, let's evaluate all the trained models on the held-out test set to choose the best one.

  • Baseline CNN
In [0]:
# Loading the test data 
test_path = '/content/data/flowers-split/test/'

test_gen = tf.keras.preprocessing.image.ImageDataGenerator(
    rescale=1./255,
)
test_data = test_gen.flow_from_directory(test_path, target_size=(224, 224), batch_size=128, shuffle=False)
Found 644 images belonging to 5 classes.
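
`inference_val_gen` is a small helper from the utilities script that wraps a Keras directory iterator and yields only the images or only the labels. A simplified sketch is shown below; the real helper, defined earlier, may differ, for instance it appears to pad the last batch (644 images found, but arrays of length 648 below):

```python
def inference_val_gen(data_iterator, gen_type="x"):
    """Yield only the images (gen_type='x') or only the labels
    (gen_type='y') from a Keras directory iterator, one batch at a time."""
    for i in range(len(data_iterator)):
        x, y = data_iterator[i]
        yield x if gen_type == "x" else y
```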
In [0]:
# get the true labels
y_true = np.concatenate([y for y in inference_val_gen(test_data, gen_type="y")])
y_true.shape

Out[0]:
(648,)
In [0]:
# Load the best model
model_path = '/content/data/checkpoints/simple_cnn_tpu.h5'
baseline_cnn = tf.keras.models.load_model(model_path)
WARNING:tensorflow:No training configuration found in save file: the model was *not* compiled. Compile it manually.
In [0]:
baseline_cnn.summary()
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d (Conv2D)              (None, 220, 220, 64)      4864      
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 216, 216, 64)      102464    
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 108, 108, 64)      0         
_________________________________________________________________
batch_normalization_v1 (Batc (None, 108, 108, 64)      256       
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 104, 104, 128)     204928    
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 100, 100, 128)     409728    
_________________________________________________________________
global_average_pooling2d (Gl (None, 128)               0         
_________________________________________________________________
batch_normalization_v1_1 (Ba (None, 128)               512       
_________________________________________________________________
dense (Dense)                (None, 128)               16512     
_________________________________________________________________
batch_normalization_v1_2 (Ba (None, 128)               512       
_________________________________________________________________
dense_1 (Dense)              (None, 5)                 645       
=================================================================
Total params: 740,421
Trainable params: 739,781
Non-trainable params: 640
_________________________________________________________________
In [0]:
# Make predictions using the baseline model
y_pred = baseline_cnn.predict_generator(inference_val_gen(test_data), steps=len(test_data), verbose=1)
y_pred = np.argmax(y_pred, axis=1)
6/6 [==============================] - 10s 2s/step
In [0]:
y_pred.shape
Out[0]:
(648,)

plot the confusion matrix

In [0]:
# get the labels and labels names
label_items = sorted(list(test_data.class_indices.items()), key=lambda x: x[1])
label_names = [item[0] for item in label_items]
labels = [item[1] for item in label_items]
print(labels)
print(label_names)
[0, 1, 2, 3, 4]
['daisy', 'dandelion', 'rose', 'sunflower', 'tulip']
In [0]:
plot_confusion_matrix(y_true, y_pred, title='Classification Report - CNN baseline',
                          labels_=labels, target_names=label_names, normalize=False)
              precision    recall  f1-score   support

       daisy       0.79      0.71      0.75       114
   dandelion       0.78      0.84      0.81       157
        rose       0.81      0.50      0.62       117
   sunflower       0.81      0.81      0.81       109
       tulip       0.66      0.85      0.75       151

    accuracy                           0.75       648
   macro avg       0.77      0.74      0.75       648
weighted avg       0.76      0.75      0.75       648

  • Baseline CNN + Data augmentation
In [0]:
# Load the best model
model_path = '/content/data/checkpoints/simple_cnn_tpu_dataaug.h5'
baseline_cnn_dataaug = tf.keras.models.load_model(model_path)
In [0]:
baseline_cnn_dataaug.summary()
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d (Conv2D)              (None, 220, 220, 64)      4864      
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 216, 216, 64)      102464    
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 108, 108, 64)      0         
_________________________________________________________________
batch_normalization_v1 (Batc (None, 108, 108, 64)      256       
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 104, 104, 128)     204928    
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 100, 100, 128)     409728    
_________________________________________________________________
global_average_pooling2d (Gl (None, 128)               0         
_________________________________________________________________
batch_normalization_v1_1 (Ba (None, 128)               512       
_________________________________________________________________
dense (Dense)                (None, 128)               16512     
_________________________________________________________________
batch_normalization_v1_2 (Ba (None, 128)               512       
_________________________________________________________________
dense_1 (Dense)              (None, 5)                 645       
=================================================================
Total params: 740,421
Trainable params: 739,781
Non-trainable params: 640
_________________________________________________________________
In [0]:
# Make predictions using the baseline model trained with data augmentation
y_pred = baseline_cnn_dataaug.predict_generator(inference_val_gen(test_data), steps=len(test_data), verbose=1)
y_pred = np.argmax(y_pred, axis=1)
6/6 [==============================] - 4s 611ms/step
In [0]:
y_pred.shape
Out[0]:
(648,)

plot the confusion matrix

In [0]:
plot_confusion_matrix(y_true, y_pred, title='Classification Report - CNN baseline + data augmentation',
                          labels_=labels, target_names=label_names, normalize=False)
              precision    recall  f1-score   support

       daisy       0.89      0.71      0.79       114
   dandelion       0.80      0.87      0.83       157
        rose       0.82      0.68      0.74       117
   sunflower       0.89      0.81      0.85       109
       tulip       0.71      0.89      0.79       151

    accuracy                           0.80       648
   macro avg       0.82      0.79      0.80       648
weighted avg       0.81      0.80      0.80       648

  • Tuned hyperparameters with Talos lib
In [0]:
# Load the best model
model_path = "/content/drive/My Drive/DeepLearning/flowers/hp_search_results/best/best_from_hp_search.h5"
best_from_hp_search = tf.keras.models.load_model(model_path)
WARNING:tensorflow:No training configuration found in save file: the model was *not* compiled. Compile it manually.
In [0]:
best_from_hp_search.summary()
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d (Conv2D)              (None, 220, 220, 64)      4864      
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 216, 216, 64)      102464    
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 108, 108, 64)      0         
_________________________________________________________________
dropout (Dropout)            (None, 108, 108, 64)      0         
_________________________________________________________________
batch_normalization_v1 (Batc (None, 108, 108, 64)      256       
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 104, 104, 128)     204928    
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 100, 100, 128)     409728    
_________________________________________________________________
global_average_pooling2d (Gl (None, 128)               0         
_________________________________________________________________
dropout_1 (Dropout)          (None, 128)               0         
_________________________________________________________________
batch_normalization_v1_1 (Ba (None, 128)               512       
_________________________________________________________________
dense (Dense)                (None, 64)                8256      
_________________________________________________________________
batch_normalization_v1_2 (Ba (None, 64)                256       
_________________________________________________________________
dense_1 (Dense)              (None, 5)                 325       
=================================================================
Total params: 731,589
Trainable params: 731,077
Non-trainable params: 512
_________________________________________________________________
In [0]:
# Make predictions using the best model from the hyperparameter search
y_pred = best_from_hp_search.predict_generator(inference_val_gen(test_data), steps=len(test_data), verbose=1)
y_pred = np.argmax(y_pred, axis=1)
6/6 [==============================] - 4s 655ms/step
In [0]:
y_pred.shape
Out[0]:
(648,)

plot the confusion matrix

In [0]:
plot_confusion_matrix(y_true, y_pred, title='Classification Report - Tuned hyperparameters',
                          labels_=labels, target_names=label_names, normalize=False)
              precision    recall  f1-score   support

       daisy       0.95      0.74      0.83       114
   dandelion       0.89      0.85      0.87       157
        rose       0.79      0.66      0.72       117
   sunflower       0.81      0.95      0.87       109
       tulip       0.72      0.87      0.79       151

    accuracy                           0.82       648
   macro avg       0.83      0.81      0.82       648
weighted avg       0.83      0.82      0.82       648

  • VGG as feature extractor
In [0]:
# Load the best model
model_path = "/content/drive/My Drive/DeepLearning/flowers/transfer/vgg/feature_ext/vgg16_fext.h5"
vgg16_fext = tf.keras.models.load_model(model_path)
In [0]:
vgg16_fext.summary()
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
vgg16 (Model)                (None, 512)               14714688  
_________________________________________________________________
dense (Dense)                (None, 256)               131328    
_________________________________________________________________
dense_1 (Dense)              (None, 5)                 1285      
=================================================================
Total params: 14,847,301
Trainable params: 132,613
Non-trainable params: 14,714,688
_________________________________________________________________
In [0]:
# Make predictions using the VGG16 feature extractor
y_pred = vgg16_fext.predict_generator(inference_val_gen(test_data), steps=len(test_data), verbose=1)
y_pred = np.argmax(y_pred, axis=1)
6/6 [==============================] - 16s 3s/step
In [0]:
y_pred.shape
Out[0]:
(648,)

plot the confusion matrix

In [0]:
plot_confusion_matrix(y_true, y_pred, title='Classification Report - VGG16 feature extraction',
                          labels_=labels, target_names=label_names, normalize=False)
              precision    recall  f1-score   support

       daisy       0.85      0.78      0.81       114
   dandelion       0.88      0.86      0.87       157
        rose       0.81      0.86      0.84       117
   sunflower       0.77      0.81      0.79       109
       tulip       0.82      0.82      0.82       151

    accuracy                           0.83       648
   macro avg       0.83      0.83      0.83       648
weighted avg       0.83      0.83      0.83       648

  • Finetuning VGG
In [0]:
# Load the best model
model_path = "/content/drive/My Drive/DeepLearning/flowers/transfer/vgg/finetune/vgg16_finetune.h5"
vgg16_finetune = tf.keras.models.load_model(model_path)
In [0]:
vgg16_finetune.summary()
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
vgg16 (Model)                (None, 512)               14714688  
_________________________________________________________________
dense (Dense)                (None, 256)               131328    
_________________________________________________________________
dense_1 (Dense)              (None, 5)                 1285      
=================================================================
Total params: 14,847,301
Trainable params: 14,847,301
Non-trainable params: 0
_________________________________________________________________
In [0]:
# Make predictions using the finetuned VGG16
y_pred = vgg16_finetune.predict_generator(inference_val_gen(test_data), steps=len(test_data), verbose=1)
y_pred = np.argmax(y_pred, axis=1)
6/6 [==============================] - 6s 1s/step
In [0]:
y_pred.shape
Out[0]:
(648,)

plot the confusion matrix

In [0]:
plot_confusion_matrix(y_true, y_pred, title='Classification Report - VGG16 finetuned',
                          labels_=labels, target_names=label_names, normalize=False)
              precision    recall  f1-score   support

       daisy       0.88      0.83      0.86       114
   dandelion       0.90      0.89      0.89       157
        rose       0.84      0.87      0.85       117
   sunflower       0.85      0.89      0.87       109
       tulip       0.85      0.83      0.84       151

    accuracy                           0.86       648
   macro avg       0.86      0.86      0.86       648
weighted avg       0.86      0.86      0.86       648

  • Finetuning VGG + Dropout
In [0]:
# Load the best model
model_path = "/content/drive/My Drive/DeepLearning/flowers/transfer/vgg/finetune/vgg16_finetune_dropout.h5"
vgg16_finetune_dropout = tf.keras.models.load_model(model_path)
WARNING:tensorflow:Large dropout rate: 0.75 (>0.5). In TensorFlow 2.x, dropout() uses dropout rate instead of keep_prob. Please ensure that this is intended.
In [0]:
vgg16_finetune_dropout.summary()
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
vgg16 (Model)                (None, 512)               14714688  
_________________________________________________________________
dropout (Dropout)            (None, 512)               0         
_________________________________________________________________
dense (Dense)                (None, 256)               131328    
_________________________________________________________________
dense_1 (Dense)              (None, 5)                 1285      
=================================================================
Total params: 14,847,301
Trainable params: 7,212,037
Non-trainable params: 7,635,264
_________________________________________________________________
In [0]:
# Make predictions using the finetuned VGG16 with dropout
y_pred = vgg16_finetune_dropout.predict_generator(inference_val_gen(test_data), steps=len(test_data), verbose=1)
y_pred = np.argmax(y_pred, axis=1)
6/6 [==============================] - 6s 1s/step
In [0]:
y_pred.shape
Out[0]:
(648,)

plot the confusion matrix

In [0]:
plot_confusion_matrix(y_true, y_pred, title='Classification Report - VGG16 finetuned + dropout',
                          labels_=labels, target_names=label_names, normalize=False)
              precision    recall  f1-score   support

       daisy       0.97      0.83      0.90       114
   dandelion       0.90      0.93      0.92       157
        rose       0.85      0.85      0.85       117
   sunflower       0.85      0.94      0.89       109
       tulip       0.87      0.87      0.87       151

    accuracy                           0.89       648
   macro avg       0.89      0.89      0.89       648
weighted avg       0.89      0.89      0.89       648


Summary

Let's summarize all the results in a table to facilitate the comparison:

| Model | Validation accuracy (%) | Test accuracy (%) |
| --- | --- | --- |
| Baseline | 77.29 | 75.46 |
| Baseline + data augmentation | 81.40 | 80.25 |
| Tuned hyperparameters | 84.45 | 81.79 |
| VGG16 feature extraction | 83.10 | 82.87 |
| VGG16 finetuned | 88.94 | 86.27 |
| VGG16 finetuned + dropout | 90.94 | 88.73 |

From the table above, we can conclude that the best model in terms of both validation and test accuracy is the finetuned VGG16 regularized with dropout.
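
As a quick sanity check of the winning model, here is a minimal sketch of single-image inference (the image path is a placeholder; the preprocessing mirrors the generators used above):

```python
import numpy as np
import tensorflow as tf

# load the best checkpoint (path used earlier in this notebook)
model = tf.keras.models.load_model('/content/data/checkpoints/vgg16_finetune_dropout.h5')
class_names = ['daisy', 'dandelion', 'rose', 'sunflower', 'tulip']

# 'my_flower.jpg' is a placeholder image path
img = tf.keras.preprocessing.image.load_img('my_flower.jpg', target_size=(224, 224))
x = tf.keras.preprocessing.image.img_to_array(img) / 255.  # same rescaling as the generators
x = np.expand_dims(x, axis=0)                              # add the batch dimension

probs = model.predict(x)[0]
print(class_names[int(np.argmax(probs))], float(probs.max()))
```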

Improvement ideas:

  • Tune the hyperparameters of the finetuned VGG16, as there may still be room for improvement (batch size, dropout rate, learning rate, ...)
  • Try other architectures for transfer learning, such as ResNet or Xception
  • Add more data
  • Use more advanced techniques such as learning rate scheduling (see the sketch below) or label smoothing
  • Put yours here ...
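
To illustrate the learning rate scheduling idea, here is a minimal sketch using a built-in Keras callback (the schedule itself is hypothetical and would need tuning):

```python
import tensorflow as tf

def schedule(epoch, lr):
    # keep the initial learning rate for the first 10 epochs,
    # then decay it by 5% every epoch
    return lr if epoch < 10 else lr * 0.95

lr_callback = tf.keras.callbacks.LearningRateScheduler(schedule, verbose=1)
# then add lr_callback to the callbacks list passed to fit_generator
```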

App structure

A full walkthrough of the application code is beyond the scope of this post; nevertheless, I'll give an overview of the app in this last section.
The model is deployed using a library called "tflite-react-native" (see this github page for more details). The workflow consists of first converting the trained keras model into the tensorflow lite format, then using the tflite-react-native library to integrate the tflite model into the react-native application.
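
For the first step, the conversion can be done with the TFLite converter that ships with tensorflow. A minimal sketch (the checkpoint name is the one used earlier; the converter API depends on the tensorflow version, `from_keras_model_file` in TF 1.x vs `from_keras_model` in TF 2.x):

```python
import tensorflow as tf

# convert the best Keras checkpoint to TensorFlow Lite (TF 1.x style API)
converter = tf.lite.TFLiteConverter.from_keras_model_file('vgg16_finetune_dropout.h5')
tflite_model = converter.convert()

with open('vgg16_finetune_dropout.tflite', 'wb') as f:
    f.write(tflite_model)
```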
The application consists mainly of 4 screens, whose code is located in the `screens` folder of the repository. The screens are as follows: