WhichFlower: a flower species recognition app using TensorFlow/Keras and React Native
As someone passionate about computer vision (CV), I came to realize that deployment is an important part of the model development process, since the usefulness of a model is ultimately measured by the satisfaction of its end users. In a previous project named DEmoClassi (Demographic (age, gender, race) and Emotion (happy, neutral, angry, ...) Classification), I turned my trained models into a standalone Python module that can be run on Windows/Linux using OpenCV. You can check it here.
In this new project I decided to give mobile technologies a try. Today, models are migrating more and more to edge devices (mobile, sensors, ... IoT in general). So I started by learning React Native, a cross-platform mobile development framework developed by Facebook. The course I followed is available on YouTube; it is a little long, but well worth the time. My end goal was to combine my two passions, CV and programming, into another project: this time I opted for training a CV model and deploying it on a mobile device, as a flower species recognition app I called, with no suspense, `WhichFlower`.
I'll describe my journey in this post, which is composed of three sections:
- Exploratory Data Analysis in which I analyse the flower images dataset I'll use
- Image classification models training
- Model deployment using React-Native
Exploratory Data Analysis
The dataset I'll explore here comes from Kaggle and can be downloaded following this link. It consists of flower images belonging to five species:
- Daisy
- Dandelion
- Rose
- Sunflower
- Tulip
First, let's import some useful libraries:
%load_ext autoreload
%autoreload 2
import sys
sys.path.append("../utils") # or whatever the path to the utils directory containing the utilities script
from bokeh.io import output_notebook
from data_utils import split_flower_data
from plot_utils import display_sample_images, plot_class_distribution, plot_image_dimensions
output_notebook()
%matplotlib inline
The original dataset is not divided into training, validation and test sets, so let's do it:
train_percent = 0.7 # part of the data to consider for train set
root_folder = "../../../flowers-recognition/flowers" # directory containing subdirectory
# each containing images for 1 class
out_folder = "../../../flowers-recognition/split_flower" # folder where to save the split data
classes = ['daisy', 'dandelion', 'rose', 'sunflower', 'tulip'] # class names corresponding also to subdirs on `root_folder`
%%time
split_flower_data(root_folder, out_folder, classes, train_percent)
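For reference, `split_flower_data` is one of the helpers in my utils scripts. A minimal sketch of what such a directory-level split could look like (an illustration under simplifying assumptions, not the exact implementation):
import os
import random
import shutil

def split_data_sketch(root_folder, out_folder, classes, train_percent, seed=42):
    # illustrative split: copy each class's images into train/valid/test subfolders,
    # keeping `train_percent` for training and splitting the rest evenly
    random.seed(seed)
    for cls in classes:
        images = os.listdir(os.path.join(root_folder, cls))
        random.shuffle(images)
        n_train = int(train_percent * len(images))
        n_valid = (len(images) - n_train) // 2
        splits = {
            "train": images[:n_train],
            "valid": images[n_train:n_train + n_valid],
            "test": images[n_train + n_valid:],
        }
        for split_name, files in splits.items():
            dest = os.path.join(out_folder, split_name, cls)
            os.makedirs(dest, exist_ok=True)
            for fname in files:
                shutil.copy(os.path.join(root_folder, cls, fname), dest)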
After splitting the data, let's display some sample images from the training set for each class:
sample_path = "../../../flowers-recognition/split_flower/train/"
# each call below displays a new random sample of training images
display_sample_images(sample_path)
display_sample_images(sample_path)
display_sample_images(sample_path)
Some images are quite noisy: they are not composed of the flower alone, and sometimes the flower is not even visible. We should pay particular attention to this when assessing the model's performance, as it may be affected by these noisy samples.
One important aspect to investigate is the distribution of classes in our dataset, as an imbalanced dataset may yield poor results. Below is the distribution of our data across the train, validation and test sets:
path_train = "../../../../COMPUTER_VISION/flowers-recognition/split_flower/train"
path_valid = "../../../../COMPUTER_VISION/flowers-recognition/split_flower/valid"
path_test = "../../../../COMPUTER_VISION/flowers-recognition/split_flower/test"
plot_class_distribution(path_train, title="Train set classes distribution")
plot_class_distribution(path_valid, title="Validation set classes distribution")
plot_class_distribution(path_test, title="Test set classes distribution")
From the above plots, we can see that the dataset is fairly balanced. Two classes, dandelion and tulip, each hold a little more than 20% of the data; the other classes have proportions of around 17-18%.
Overall, we have 3028 training samples, 651 validation samples and 644 test samples.
When using a pretrained model for transfer learning, we may be required to resize our images to match the original model's input shape. In that case the choice of image dimensions is straightforward.
However, for our own custom models we may need some insight for choosing the right dimensions. Let's visualize the heights and widths of the training images:
plot_image_dimensions(path_train)
The mean height is around 245-260 pixels and the mean width around 320-350 pixels. We may take these values into account when resizing images to custom dimensions.
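In case you are curious how such statistics can be gathered, here is a rough sketch (assuming PIL is available and the images are .jpg files; the real `plot_image_dimensions` also draws the distribution plots):
import glob
import os
import numpy as np
from PIL import Image

def image_dimension_stats(folder):
    # collect (width, height) of every jpg image under `folder`, recursively
    paths = glob.glob(os.path.join(folder, "**", "*.jpg"), recursive=True)
    sizes = np.array([Image.open(p).size for p in paths])  # PIL returns (width, height)
    print("mean width : {:.1f}".format(sizes[:, 0].mean()))
    print("mean height: {:.1f}".format(sizes[:, 1].mean()))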
Flower image classification
In this notebook I'll go through training some neural network models on the task of flower image classification (for more information about the dataset, see the notebook EDA.ipynb in the same directory).
Let's first clone the repository where I store my utility functions for data processing, model training and evaluation:
!git clone https://AlkaSaliss:********@github.com/AlkaSaliss/flowerClassif.git
# # Run this cell to mount your Google Drive.
from google.colab import drive
drive.mount('/content/drive')
Install and import some necessary packages:
# !pip install tensorflow==1.13.1 # uncomment this if using TPU instead of GPU on google colab
!pip install talos scipy==1.2 git+https://www.github.com/keras-team/keras-contrib.git
import sys
sys.path.append('/content/flowerClassif/utils')
import tensorflow as tf
import keras
import keras_contrib
from google.colab import files
import importlib
import os
from data_utils import split_flower_data
from plot_utils import plot_training_history
from model_utils import plot_confusion_matrix
from model_utils import inference_val_gen
from model_utils import CopyCheckpointToDrive
import matplotlib.pyplot as plt
import bokeh
from bokeh.io import output_notebook
from bokeh.resources import INLINE
from bokeh.layouts import gridplot, row
from bokeh.plotting import show, figure
from bokeh.models import ColumnDataSource
import json
import numpy as np
import pandas as pd
import talos as ta
import logging
import tqdm
output_notebook(resources=INLINE)
%matplotlib inline
print(tf.__version__)
print(tf.test.is_gpu_available())
1. Loading data
The flower dataset is hosted on Kaggle, so we need to download it using the Kaggle CLI (note that to install the Kaggle CLI you can simply do: pip install kaggle).
- Copy the Kaggle credentials file from Google Drive, so that we can authenticate with the Kaggle CLI:
# create a directory for storing kaggle credentials if it doesn't exist
os.makedirs('/root/.kaggle', exist_ok=True)
# copy the credential file into the right directory
!cp "/content/drive/My Drive/kaggle.json" /root/.kaggle/kaggle.json
Download the data from Kaggle and save it to the data directory:
# create a directory where the data will be stored
os.makedirs('/content/data', exist_ok=True)
!kaggle datasets download -d alxmamaev/flowers-recognition --unzip -p /content/data/
Finally we split the dataset into train (70%), test (15%) and validation (15%) sets :
data_path = '/content/data/flowers/'
out_path = '/content/data/flowers-split'
classes = ['daisy', 'dandelion', 'rose', 'sunflower', 'tulip']
train_split = 0.7
split_flower_data(data_path, out_path, classes, train_split)
2. Train a baseline CNN
To ensure everything is OK, let's train a simple CNN model as a baseline.
First we create data generators for our flower images :
train_path = "/content/data/flowers-split/train/"
valid_path = "/content/data/flowers-split/valid/"
train_gen = tf.keras.preprocessing.image.ImageDataGenerator(
rescale=1./255
)
val_gen = tf.keras.preprocessing.image.ImageDataGenerator(
rescale=1./255,
)
train_data = train_gen.flow_from_directory(train_path, target_size=(224, 224), batch_size=128)
val_data = val_gen.flow_from_directory(valid_path, target_size=(224, 224), batch_size=256, shuffle=False)
Next, let's create the Keras model:
tf.keras.backend.clear_session()
simple_cnn = tf.keras.models.Sequential([
tf.keras.layers.Conv2D(64, 5, activation='relu', input_shape=(224, 224, 3),
kernel_initializer=tf.keras.initializers.he_normal()),
tf.keras.layers.Conv2D(64, 5, activation='relu',
kernel_initializer=tf.keras.initializers.he_normal()),
tf.keras.layers.MaxPooling2D(2, 2),
tf.keras.layers.BatchNormalization(),
tf.keras.layers.Conv2D(128, 5, activation='relu',
kernel_initializer=tf.keras.initializers.he_normal()),
tf.keras.layers.Conv2D(128, 5, activation='relu',
kernel_initializer=tf.keras.initializers.he_normal()),
tf.keras.layers.GlobalAveragePooling2D(),
tf.keras.layers.BatchNormalization(),
tf.keras.layers.Dense(128, activation='relu',
kernel_initializer=tf.keras.initializers.he_normal()),
tf.keras.layers.BatchNormalization(),
tf.keras.layers.Dense(5, activation='softmax')
])
simple_cnn_tpu = tf.contrib.tpu.keras_to_tpu_model(
simple_cnn,
strategy=tf.contrib.tpu.TPUDistributionStrategy(
tf.contrib.cluster_resolver.TPUClusterResolver(tpu='grpc://' + os.environ['COLAB_TPU_ADDR'])
)
)
simple_cnn_tpu.compile(
optimizer=tf.train.AdagradOptimizer(learning_rate=1e-3),
loss='categorical_crossentropy',
metrics=['accuracy']
)
print(simple_cnn_tpu.summary())
Before starting training, let's add some callbacks:
- tensorboard
- early_stopping
- model checkpointing
os.makedirs("/content/data/checkpoints/", exist_ok=True)
logdir = '/content/data/logs/simple_cnn_tpu'
os.makedirs(logdir, exist_ok=True)
tb_callback = tf.keras.callbacks.TensorBoard(log_dir=logdir)
earlystop_callback = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=50, restore_best_weights=True, verbose=1)
ckpt_path = '/content/data/checkpoints/simple_cnn_tpu.h5'
os.makedirs("/content/data/checkpoints/", exist_ok=True)
ckpt_callback = tf.keras.callbacks.ModelCheckpoint(ckpt_path)
list_callbacks = [tb_callback, earlystop_callback, ckpt_callback]
history = simple_cnn_tpu.fit_generator(train_data,
validation_data=val_data,
epochs=300,
callbacks=list_callbacks)
Now that training has ended, let's visualize the training and validation metrics:
plot_training_history(history)
Let's download the history data and the model :
history_dict = {k: [float(i) for i in v] for k, v in history.history.items()}
json.dump(history_dict, open('/content/data/checkpoints/simple_cnn_tpu.json', 'w'))
files.download("/content/data/checkpoints/simple_cnn_tpu.json")
files.download("/content/data/checkpoints/simple_cnn_tpu.h5")
Finally, let's evaluate the model by plotting some classification metrics and the confusion matrix.
simple_cnn_tpu.load_weights("/content/data/checkpoints/simple_cnn_tpu.h5")
Get the true labels from the generator, and the predictions from the model:
y_true = np.concatenate([y for y in inference_val_gen(val_data, gen_type="y")])
y_true.shape
# Make predictions
y_pred = simple_cnn_tpu.predict_generator(inference_val_gen(val_data), steps=len(val_data), verbose=1)
y_pred = np.argmax(y_pred, axis=1)
y_pred.shape
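As a side note, `inference_val_gen` is a small helper from my utils scripts; conceptually, it wraps the non-shuffled validation iterator and yields either only the image batches or only the class labels, for exactly one pass. A simplified sketch of the idea (not the exact implementation):
def inference_val_gen_sketch(data_iterator, gen_type="x"):
    # yield only images (gen_type="x") or only class indices (gen_type="y")
    # for a single pass over a non-shuffled Keras directory iterator
    data_iterator.reset()  # make sure we start from the first batch
    for _ in range(len(data_iterator)):
        x, y = next(data_iterator)
        if gen_type == "x":
            yield x
        else:
            yield np.argmax(y, axis=1)  # one-hot labels -> class indices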
Plot the confusion matrix:
# get the labels and labels names
label_items = sorted(list(val_data.class_indices.items()), key=lambda x: x[1])
label_names = [item[0] for item in label_items]
labels = [item[1] for item in label_items]
print(labels)
print(label_names)
plot_confusion_matrix(y_true, y_pred, title='Classification Report - CNN baseline',
labels_=labels, target_names=label_names, normalize=False)
We got an accuracy of 77% with this baseline CNN. The most difficult class for the model to predict seems to be rose, which shows the lowest recall (59%). When the model is wrong about this class, it most often misclassifies the rose as a tulip.
3. Baseline + data augmentation
Let's add some data augmentation techniques to see if they help improve the model's performance:
train_path = "/content/data/flowers-split/train/"
valid_path = "/content/data/flowers-split/valid/"
train_gen = tf.keras.preprocessing.image.ImageDataGenerator(
rescale=1./255,
horizontal_flip=True,
vertical_flip=True,
rotation_range=15,
width_shift_range=0.2,
height_shift_range=0.2
)
val_gen = tf.keras.preprocessing.image.ImageDataGenerator(
rescale=1./255,
)
train_data = train_gen.flow_from_directory(train_path, target_size=(224, 224), batch_size=128)
val_data = val_gen.flow_from_directory(valid_path, target_size=(224, 224), batch_size=256, shuffle=False)
tf.keras.backend.clear_session()
simple_cnn = tf.keras.models.Sequential([
tf.keras.layers.Conv2D(64, 5, activation='relu', input_shape=(224, 224, 3),
kernel_initializer=tf.keras.initializers.he_normal()),
tf.keras.layers.Conv2D(64, 5, activation='relu',
kernel_initializer=tf.keras.initializers.he_normal()),
tf.keras.layers.MaxPooling2D(2, 2),
tf.keras.layers.BatchNormalization(),
tf.keras.layers.Conv2D(128, 5, activation='relu',
kernel_initializer=tf.keras.initializers.he_normal()),
tf.keras.layers.Conv2D(128, 5, activation='relu',
kernel_initializer=tf.keras.initializers.he_normal()),
tf.keras.layers.GlobalAveragePooling2D(),
tf.keras.layers.BatchNormalization(),
tf.keras.layers.Dense(128, activation='relu',
kernel_initializer=tf.keras.initializers.he_normal()),
tf.keras.layers.BatchNormalization(),
tf.keras.layers.Dense(5, activation='softmax')
])
simple_cnn_tpu_dataaug = tf.contrib.tpu.keras_to_tpu_model(
simple_cnn,
strategy=tf.contrib.tpu.TPUDistributionStrategy(
tf.contrib.cluster_resolver.TPUClusterResolver(tpu='grpc://' + os.environ['COLAB_TPU_ADDR'])
)
)
simple_cnn_tpu_dataaug.compile(
optimizer=tf.train.AdamOptimizer(learning_rate=1e-3),
loss='categorical_crossentropy',
metrics=['accuracy']
)
print(simple_cnn_tpu_dataaug.summary())
earlystop_callback = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=50, restore_best_weights=True, verbose=1)
ckpt_path = '/content/data/checkpoints/simple_cnn_tpu_dataaug.h5'
os.makedirs("/content/data/checkpoints/", exist_ok=True)
ckpt_callback = tf.keras.callbacks.ModelCheckpoint(ckpt_path)
list_callbacks = [earlystop_callback, ckpt_callback]
history = simple_cnn_tpu_dataaug.fit_generator(train_data,
validation_data=val_data,
epochs=300,
callbacks=list_callbacks)
plot_training_history(history)
history_dict = {k: [float(i) for i in v] for k, v in history.history.items()}
json.dump(history_dict, open('/content/data/checkpoints/simple_cnn_tpu_dataaug.json', 'w'))
files.download("/content/data/checkpoints/simple_cnn_tpu_dataaug.json")
files.download("/content/data/checkpoints/simple_cnn_tpu_dataaug.h5")
Model evaluation with the classification report:
# load the best model
simple_cnn_tpu_dataaug.load_weights("/content/data/checkpoints/simple_cnn_tpu_dataaug.h5")
Get the true labels from the generator, and the predictions from the model:
y_true = np.concatenate([y for y in inference_val_gen(val_data, gen_type="y")])
y_true.shape
y_pred = simple_cnn_tpu_dataaug.predict_generator(inference_val_gen(val_data), steps=len(val_data), verbose=1)
y_pred = np.argmax(y_pred, axis=1)
y_pred.shape
Plot the confusion matrix:
# get the labels and labels names
label_items = sorted(list(val_data.class_indices.items()), key=lambda x: x[1])
label_names = [item[0] for item in label_items]
labels = [item[1] for item in label_items]
print(labels)
print(label_names)
plot_confusion_matrix(y_true, y_pred, title='Classification Report - CNN baseline + data augmentation',
labels_=labels, target_names=label_names, normalize=False)
Using data augmentation gained us 4 points of accuracy over the baseline model, going from 77% to 81%.
Here too, the most difficult classes for this model to identify are daisy (70% recall) and rose (73% recall).
As previously, the model confuses rose with tulip when making mistakes. Maybe we need to put more weight on this class to help improve its classification, as sketched below.
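As an illustration of this class-weighting idea, Keras accepts a `class_weight` dictionary in `fit_generator`, which makes errors on the chosen class cost more in the loss. A quick sketch (the weight value is an arbitrary illustration, not a tuned one):
# hypothetical example: upweight the 'rose' class
rose_idx = train_data.class_indices['rose']
class_weights = {i: 1.0 for i in range(5)}
class_weights[rose_idx] = 2.0  # arbitrary illustrative weight, to be tuned

history = simple_cnn_tpu_dataaug.fit_generator(
    train_data,
    validation_data=val_data,
    epochs=300,
    callbacks=list_callbacks,
    class_weight=class_weights  # misclassified roses now cost twice as much
)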
Hyperparameter search with Talos
Talos is a hyperparameter tuning library for Keras models. You can check it out here.
We'll tune three hyperparameters:
- kernel size: 3 or 5
- dropout rate: 0.25 or 0.5
- activation function for hidden layers: relu or elu
train_path = "/content/data/flowers-split/train/"
valid_path = "/content/data/flowers-split/valid/"
train_gen = tf.keras.preprocessing.image.ImageDataGenerator(
rescale=1./255,
horizontal_flip=True,
vertical_flip=True,
rotation_range=15,
width_shift_range=0.2,
height_shift_range=0.2
)
val_gen = tf.keras.preprocessing.image.ImageDataGenerator(
rescale=1./255,
)
train_data = train_gen.flow_from_directory(train_path, target_size=(224, 224), batch_size=128)
val_data = val_gen.flow_from_directory(valid_path, target_size=(224, 224), batch_size=512, shuffle=False)
Create a directory in Google Drive where the hyperparameter search results will be saved:
result_path = "/content/drive/My Drive/DeepLearning/flowers/hp_search_results/run1"
os.makedirs(result_path, exist_ok=True)
First we define a dictionary containing the choices of hyperparameters we want to optimize:
params = {
"kernel_size": [3, 5],
"dropout": [0.25, 0.50],
"activation": ["relu", "elu"],
}
Then we define a function that takes as input the data and the hyperparameters dictionary. Talos expects the data in four parts: training features, training labels, validation features and validation labels. But since we are working with generators here, I will just pass dummy x_train, y_train, x_val and y_val to the optimization function, and inside the function I'll access the generators I created as global variables to do the training:
def model_fn(x_train, y_train, x_val, y_val, params):
tf.keras.backend.clear_session()
model = tf.keras.models.Sequential([
tf.keras.layers.Conv2D(64, params['kernel_size'],
activation=params['activation'],
input_shape=(224, 224, 3),
kernel_initializer=tf.keras.initializers.he_normal()),
tf.keras.layers.Conv2D(64, params['kernel_size'],
activation=params['activation'],
kernel_initializer=tf.keras.initializers.he_normal()),
tf.keras.layers.MaxPooling2D(2),
tf.keras.layers.Dropout(params['dropout']),
tf.keras.layers.BatchNormalization(),
tf.keras.layers.Conv2D(128, params['kernel_size'],
activation=params['activation'],
kernel_initializer=tf.keras.initializers.he_normal()),
tf.keras.layers.Conv2D(128, params['kernel_size'],
activation=params['activation'],
kernel_initializer=tf.keras.initializers.he_normal()),
tf.keras.layers.GlobalAvgPool2D(),
tf.keras.layers.Dropout(params['dropout']),
tf.keras.layers.BatchNormalization(),
tf.keras.layers.Dense(64,
activation=params['activation'],
kernel_initializer=tf.keras.initializers.he_normal()),
tf.keras.layers.BatchNormalization(),
tf.keras.layers.Dense(5, activation='softmax')
])
tpu_model = tf.contrib.tpu.keras_to_tpu_model(
model,
strategy=tf.contrib.tpu.TPUDistributionStrategy(
tf.contrib.cluster_resolver.TPUClusterResolver(
tpu='grpc://' + os.environ['COLAB_TPU_ADDR'])
)
)
tpu_model.compile(
optimizer=tf.train.AdamOptimizer(learning_rate=1e-3, ),
loss=tf.keras.losses.categorical_crossentropy,
metrics=['acc']
)
out = tpu_model.fit_generator(train_data, validation_data=val_data,
epochs=75)
return out, tpu_model.sync_to_cpu()
x_dum, y_dum = np.zeros((1, 224, 224, 3)), np.zeros((1, 5))
h = ta.Scan(x_dum, y_dum, params, model_fn, experiment_no="1")
# Get the index of the best model, i.e. the one with the highest 'val_acc'
model_id = h.data['val_acc'].astype('float').argmax() - 1
# Clear any previous TensorFlow session.
tf.keras.backend.clear_session()
# Load the model parameters from the scanner.
model = tf.keras.models.model_from_json(h.saved_models[model_id])
model.set_weights(h.saved_weights[model_id])
model.summary()
model.save(os.path.join(result_path, 'best_model.h5'))
h.data.to_csv(os.path.join(result_path, 'configs.csv'), index=False)
Load the best configuration and train a model using it:
configs = pd.read_csv(os.path.join(result_path, 'configs.csv')).sort_values(['val_acc'])
configs
The best hyperparameter configuration is:
- kernel_size : 5
- activation : relu
- dropout : 0.25
tf.keras.backend.clear_session()
model = tf.keras.models.Sequential([
tf.keras.layers.Conv2D(64, 5,
activation='relu',
input_shape=(224, 224, 3),
kernel_initializer=tf.keras.initializers.he_normal()),
tf.keras.layers.Conv2D(64, 5,
activation='relu',
kernel_initializer=tf.keras.initializers.he_normal()),
tf.keras.layers.MaxPooling2D(2),
tf.keras.layers.Dropout(0.25),
tf.keras.layers.BatchNormalization(),
tf.keras.layers.Conv2D(128, 5,
activation='relu',
kernel_initializer=tf.keras.initializers.he_normal()),
tf.keras.layers.Conv2D(128, 5,
activation='relu',
kernel_initializer=tf.keras.initializers.he_normal()),
tf.keras.layers.GlobalAvgPool2D(),
tf.keras.layers.Dropout(0.25),
tf.keras.layers.BatchNormalization(),
tf.keras.layers.Dense(64,
activation='relu',
kernel_initializer=tf.keras.initializers.he_normal()),
tf.keras.layers.BatchNormalization(),
tf.keras.layers.Dense(5, activation='softmax')
])
tpu_model = tf.contrib.tpu.keras_to_tpu_model(
model,
strategy=tf.contrib.tpu.TPUDistributionStrategy(
tf.contrib.cluster_resolver.TPUClusterResolver(
tpu='grpc://' + os.environ['COLAB_TPU_ADDR'])
)
)
tpu_model.compile(
optimizer=tf.train.AdamOptimizer(learning_rate=1e-3, ),
loss=tf.keras.losses.categorical_crossentropy,
metrics=['acc']
)
print(tpu_model.summary())
# define callbacks
earlystop_callback = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=100, restore_best_weights=True, verbose=1)
ckpt_path = '/content/data/checkpoints/best_from_hp_search.h5'
os.makedirs("/content/data/checkpoints/", exist_ok=True)
ckpt_callback = tf.keras.callbacks.ModelCheckpoint(ckpt_path, monitor='val_loss')
# special callback to copy the best model checkpoint from local storage to google drive
dest_path = "/content/drive/My Drive/DeepLearning/flowers/hp_search_results/best"
os.makedirs(dest_path, exist_ok=True)
copy_cb = CopyCheckpointToDrive(ckpt_path, dest_path)
list_callbacks = [earlystop_callback, ckpt_callback, copy_cb]
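For reference, `CopyCheckpointToDrive` is a tiny custom callback from my utils scripts. The idea, in a simplified sketch (not the exact implementation), is to copy the local checkpoint file to Google Drive at the end of each epoch, so the best model survives a recycled Colab runtime:
import shutil

class CopyCheckpointToDriveSketch(tf.keras.callbacks.Callback):
    def __init__(self, ckpt_path, dest_dir):
        super().__init__()
        self.ckpt_path = ckpt_path
        self.dest_dir = dest_dir

    def on_epoch_end(self, epoch, logs=None):
        # copy the checkpoint (if it exists) to the Drive folder
        if os.path.exists(self.ckpt_path):
            shutil.copy(self.ckpt_path, self.dest_dir)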
%%time
history = tpu_model.fit_generator(train_data, validation_data=val_data, callbacks=list_callbacks,
epochs=300)
plot_training_history(history)
Get the true labels from the generator, and the predictions from the model:
y_true = np.concatenate([y for y in inference_val_gen(val_data, gen_type="y")])
y_true.shape
# Load best model
tpu_model.load_weights("/content/data/checkpoints/best_from_hp_search.h5")
y_pred = tpu_model.predict_generator(inference_val_gen(val_data), steps=len(val_data), verbose=1)
y_pred = np.argmax(y_pred, axis=1)
y_pred.shape
Plot the confusion matrix:
# get the labels and labels names
label_items = sorted(list(val_data.class_indices.items()), key=lambda x: x[1])
label_names = [item[0] for item in label_items]
labels = [item[1] for item in label_items]
print(labels)
print(label_names)
plot_confusion_matrix(y_true, y_pred, title='Classification Report - Tuned CNN',
labels_=labels, target_names=label_names, normalize=False)
Using hyperparameter search gave us a further gain in accuracy (see the summary table at the end of this notebook). We are still facing an overfitting issue, as the accuracy and loss curves above show. This could be explained by the fact that we only have a few hundred images per class, which is somewhat small for learning a model capable of generalizing well.
Thus in the next sections we'll go with transfer learning to overcome the small data issue.
Transfer learning
Next we'll leverage the power of transfer learning given the small amount of data we have.
VGG16 as feature extractor
tf.keras.backend.clear_session()
# get the pretrained VGG16 on imagenet
vgg_base = tf.keras.applications.vgg16.VGG16(
include_top=False,
weights='imagenet',
input_shape=(224, 224, 3),
pooling='avg'
)
# freeze pretrained weights
for layer in tqdm.tqdm_notebook(vgg_base.layers):
layer.trainable = False
# add classification layer
vgg_full = tf.keras.Sequential([
vgg_base,
tf.keras.layers.Dense(256, activation='relu', kernel_initializer=tf.keras.initializers.he_normal()),
tf.keras.layers.Dense(5, activation='softmax')
])
vgg_full.compile(
optimizer=tf.keras.optimizers.Adam(lr=1e-4),
loss='categorical_crossentropy',
metrics=['acc']
)
print(vgg_full.summary())
# The data generators
train_path = "/content/data/flowers-split/train/"
valid_path = "/content/data/flowers-split/valid/"
train_gen = tf.keras.preprocessing.image.ImageDataGenerator(
rescale=1./255,
horizontal_flip=True,
vertical_flip=True,
rotation_range=15,
width_shift_range=0.2,
height_shift_range=0.2
)
val_gen = tf.keras.preprocessing.image.ImageDataGenerator(
rescale=1./255,
)
train_data = train_gen.flow_from_directory(train_path, target_size=(224, 224), batch_size=64)
val_data = val_gen.flow_from_directory(valid_path, target_size=(224, 224), batch_size=64, shuffle=False)
# define callbacks
earlystop_callback = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=50, restore_best_weights=True, verbose=1)
ckpt_path = '/content/data/checkpoints/vgg16_fext.h5'
os.makedirs("/content/data/checkpoints/", exist_ok=True)
ckpt_callback = tf.keras.callbacks.ModelCheckpoint(ckpt_path, monitor='val_loss', verbose=1)
# special callback to copy the best model checkpoint from local storage to google drive
dest_path = "/content/drive/My Drive/DeepLearning/flowers/transfer/vgg/feature_ext"
os.makedirs(dest_path, exist_ok=True)
copy_cb = CopyCheckpointToDrive(ckpt_path, dest_path)
list_callbacks = [earlystop_callback, ckpt_callback, copy_cb]
%%time
# train
history = vgg_full.fit_generator(train_data, validation_data=val_data, callbacks=list_callbacks,
epochs=300)
# plot the learning curves
plot_training_history(history)
Get the true labels from the generator, and the predictions from the model:
val_data = val_gen.flow_from_directory(valid_path, target_size=(224, 224), batch_size=64, shuffle=False)
y_true = []
len_val_data = len(val_data)
for i, (x, y) in enumerate(tqdm.tqdm_notebook(val_data)):
y_true.append(np.argmax(y, axis=1))
if i == len_val_data - 1:
break
y_true = np.concatenate(y_true)
y_true.shape
# Load best model
vgg_full.load_weights("/content/data/checkpoints/vgg16_fext.h5")
y_pred = vgg_full.predict_generator(val_data, verbose=1)
y_pred = np.argmax(y_pred, axis=1)
y_pred.shape
Plot the confusion matrix:
# get the labels and labels names
label_items = sorted(list(val_data.class_indices.items()), key=lambda x: x[1])
label_names = [item[0] for item in label_items]
labels = [item[1] for item in label_items]
print(labels)
print(label_names)
plot_confusion_matrix(y_true, y_pred, title='Classification Report - VGG16 feature extractor',
labels_=labels, target_names=label_names, normalize=False)
Using VGG16 as a feature extractor, we were able to achieve 83% accuracy, about 1 point less than the tuned baseline found through hyperparameter search with the Talos lib. However, with VGG16 the recall and precision metrics are more balanced across classes.
In the next section we'll unfreeze the VGG convolutional layers and finetune the whole model.
Finetuning VGG16
Here we'll use a lower learning rate to avoid perturbing the information already learned by the pretrained layers.
Let's first load the model from the previous section so we can start finetuning from it:
tf.keras.backend.clear_session()
vgg_finetune = tf.keras.models.load_model("/content/data/checkpoints/vgg16_fext.h5")
print(vgg_finetune.summary())
Unfreeze the layers and recompile the model with a lower learning rate:
for layer in vgg_finetune.layers:
layer.trainable = True
vgg_finetune.compile(
optimizer=tf.keras.optimizers.Adam(lr=1e-5),
loss='categorical_crossentropy',
metrics=['accuracy']
)
print(vgg_finetune.summary())
# The data generators
train_path = "/content/data/flowers-split/train/"
valid_path = "/content/data/flowers-split/valid/"
train_gen = tf.keras.preprocessing.image.ImageDataGenerator(
rescale=1./255,
horizontal_flip=True,
vertical_flip=True,
rotation_range=15,
width_shift_range=0.2,
height_shift_range=0.2
)
val_gen = tf.keras.preprocessing.image.ImageDataGenerator(
rescale=1./255,
)
train_data = train_gen.flow_from_directory(train_path, target_size=(224, 224), batch_size=64)
val_data = val_gen.flow_from_directory(valid_path, target_size=(224, 224), batch_size=64, shuffle=False)
# define callbacks
earlystop_callback = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=100, restore_best_weights=True, verbose=1)
ckpt_path = '/content/data/checkpoints/vgg16_finetune.h5'
os.makedirs("/content/data/checkpoints/", exist_ok=True)
ckpt_callback = tf.keras.callbacks.ModelCheckpoint(ckpt_path, monitor='val_loss', verbose=1)
# special callback to copy the best model checkpoint from local storage to google drive
dest_path = "/content/drive/My Drive/DeepLearning/flowers/transfer/vgg/finetune"
os.makedirs(dest_path, exist_ok=True)
copy_cb = CopyCheckpointToDrive(ckpt_path, dest_path)
list_callbacks = [earlystop_callback, ckpt_callback, copy_cb]
%%time
# train
history_vgg_finetune = vgg_finetune.fit_generator(train_data, validation_data=val_data, callbacks=list_callbacks,
epochs=300)
plot_training_history(history_vgg_finetune)
Evaluation of the finetuned model:
val_data = val_gen.flow_from_directory(valid_path, target_size=(224, 224), batch_size=64, shuffle=False)
y_true = []
len_val_data = len(val_data)
for i, (x, y) in enumerate(tqdm.tqdm_notebook(val_data)):
y_true.append(np.argmax(y, axis=1))
if i == len_val_data - 1:
break
y_true = np.concatenate(y_true)
y_true.shape
# Load best model
y_pred = vgg_finetune.predict_generator(val_data, verbose=1)
y_pred = np.argmax(y_pred, axis=1)
y_pred.shape
Plot the confusion matrix:
# get the labels and labels names
label_items = sorted(list(val_data.class_indices.items()), key=lambda x: x[1])
label_names = [item[0] for item in label_items]
labels = [item[1] for item in label_items]
print(labels)
print(label_names)
plot_confusion_matrix(y_true, y_pred, title='Classification Report - VGG16 Finetuned',
labels_=labels, target_names=label_names, normalize=False)
Finetuning VGG + dropout regularization
tf.keras.backend.clear_session()
# get the pretrained VGG16 on imagenet
vgg_base = tf.keras.applications.vgg16.VGG16(
include_top=False,
weights='imagenet',
input_shape=(224, 224, 3),
pooling='avg'
)
print(vgg_base.summary())
# Freeze first layers
set_trainable = False
for layer in vgg_base.layers:
if layer.name == 'block5_conv1':
set_trainable = True
if set_trainable:
layer.trainable = True
else:
layer.trainable = False
print(vgg_base.summary())
# add classification layer
vgg_full = tf.keras.Sequential([
vgg_base,
tf.keras.layers.Dropout(0.75),
tf.keras.layers.Dense(256, activation='relu', kernel_initializer=tf.keras.initializers.he_normal()),
tf.keras.layers.Dense(5, activation='softmax')
])
vgg_full.compile(
optimizer=tf.keras.optimizers.Adam(lr=1e-4),
loss='categorical_crossentropy',
metrics=['acc']
)
print(vgg_full.summary())
# The data generators
train_path = "/content/data/flowers-split/train/"
valid_path = "/content/data/flowers-split/valid/"
train_gen = tf.keras.preprocessing.image.ImageDataGenerator(
rescale=1./255,
horizontal_flip=True,
vertical_flip=True,
rotation_range=15,
width_shift_range=0.2,
height_shift_range=0.2
)
val_gen = tf.keras.preprocessing.image.ImageDataGenerator(
rescale=1./255,
)
train_data = train_gen.flow_from_directory(train_path, target_size=(224, 224), batch_size=64)
val_data = val_gen.flow_from_directory(valid_path, target_size=(224, 224), batch_size=64, shuffle=False)
# define callbacks
earlystop_callback = tf.keras.callbacks.EarlyStopping(monitor='val_acc', patience=100, restore_best_weights=True, verbose=1)
ckpt_path = '/content/data/checkpoints/vgg16_finetune_dropout.h5'
os.makedirs("/content/data/checkpoints/", exist_ok=True)
ckpt_callback = tf.keras.callbacks.ModelCheckpoint(ckpt_path, monitor='val_acc', save_best_only=True, verbose=1)
# special callback to copy the best model checkpoint from local storage to google drive
dest_path = "/content/drive/My Drive/DeepLearning/flowers/transfer/vgg/finetune"
os.makedirs(dest_path, exist_ok=True)
copy_cb = CopyCheckpointToDrive(ckpt_path, dest_path)
list_callbacks = [earlystop_callback, ckpt_callback, copy_cb]
%%time
# train
history_vgg_finetune = vgg_full.fit_generator(train_data,
validation_data=val_data,
callbacks=list_callbacks,
epochs=300)
plot_training_history(history_vgg_finetune)
Evaluation of the finetuned model:
val_data = val_gen.flow_from_directory(valid_path, target_size=(224, 224), batch_size=64, shuffle=False)
y_true = []
len_val_data = len(val_data)
for i, (x, y) in enumerate(tqdm.tqdm_notebook(val_data)):
y_true.append(np.argmax(y, axis=1))
if i == len_val_data - 1:
break
y_true = np.concatenate(y_true)
y_true.shape
y_pred = vgg_full.predict_generator(val_data, verbose=1)
y_pred = np.argmax(y_pred, axis=1)
y_pred.shape
Plot the confusion matrix:
# get the labels and labels names
label_items = sorted(list(val_data.class_indices.items()), key=lambda x: x[1])
label_names = [item[0] for item in label_items]
labels = [item[1] for item in label_items]
print(labels)
print(label_names)
plot_confusion_matrix(y_true, y_pred, title='Classification Report - VGG16 Finetuned + dropout',
labels_=labels, target_names=label_names, normalize=False)
Final evaluation on the test set
Finally, let's evaluate all the trained models on the held-out test set to choose the best one.
- Baseline CNN
# Loading the test data
test_path = '/content/data/flowers-split/test/'
test_gen = tf.keras.preprocessing.image.ImageDataGenerator(
rescale=1./255,
)
test_data = test_gen.flow_from_directory(test_path, target_size=(224, 224), batch_size=128, shuffle=False)
# get the true labels
y_true = np.concatenate([y for y in inference_val_gen(test_data, gen_type="y")])
y_true.shape
# Load the best model
model_path = '/content/data/checkpoints/simple_cnn_tpu.h5'
baseline_cnn = tf.keras.models.load_model(model_path)
baseline_cnn.summary()
# Make prediction using baseline model
y_pred = baseline_cnn.predict_generator(inference_val_gen(test_data), steps=len(test_data), verbose=1)
y_pred = np.argmax(y_pred, axis=1)
y_pred.shape
Plot the confusion matrix:
# get the labels and labels names
label_items = sorted(list(test_data.class_indices.items()), key=lambda x: x[1])
label_names = [item[0] for item in label_items]
labels = [item[1] for item in label_items]
print(labels)
print(label_names)
plot_confusion_matrix(y_true, y_pred, title='Classification Report - CNN baseline',
labels_=labels, target_names=label_names, normalize=False)
- Baseline CNN + Data augmentation
# Load the best model
model_path = '/content/data/checkpoints/simple_cnn_tpu_dataaug.h5'
baseline_cnn_dataaug = tf.keras.models.load_model(model_path)
baseline_cnn_dataaug.summary()
# Make prediction using baseline model
y_pred = baseline_cnn_dataaug.predict_generator(inference_val_gen(test_data), steps=len(test_data), verbose=1)
y_pred = np.argmax(y_pred, axis=1)
y_pred.shape
Plot the confusion matrix:
plot_confusion_matrix(y_true, y_pred, title='Classification Report - CNN baseline + data augmentation',
labels_=labels, target_names=label_names, normalize=False)
- Tuned hyperparameters with Talos lib
# Load the best model
model_path = "/content/drive/My Drive/DeepLearning/flowers/hp_search_results/best/best_from_hp_search.h5"
best_from_hp_search = tf.keras.models.load_model(model_path)
best_from_hp_search.summary()
# Make prediction using baseline model
y_pred = best_from_hp_search.predict_generator(inference_val_gen(test_data), steps=len(test_data), verbose=1)
y_pred = np.argmax(y_pred, axis=1)
y_pred.shape
Plot the confusion matrix:
plot_confusion_matrix(y_true, y_pred, title='Classification Report - Tuned CNN',
labels_=labels, target_names=label_names, normalize=False)
- VGG16 as feature extractor
# Load the best model
model_path = "/content/drive/My Drive/DeepLearning/flowers/transfer/vgg/feature_ext/vgg16_fext.h5"
vgg16_fext = tf.keras.models.load_model(model_path)
vgg16_fext.summary()
# Make prediction using baseline model
y_pred = vgg16_fext.predict_generator(inference_val_gen(test_data), steps=len(test_data), verbose=1)
y_pred = np.argmax(y_pred, axis=1)
y_pred.shape
Plot the confusion matrix:
plot_confusion_matrix(y_true, y_pred, title='Classification Report - VGG16 feature extractor',
labels_=labels, target_names=label_names, normalize=False)
- Finetuning VGG
# Load the best model
model_path = "/content/drive/My Drive/DeepLearning/flowers/transfer/vgg/finetune/vgg16_finetune.h5"
vgg16_finetune = tf.keras.models.load_model(model_path)
vgg16_finetune.summary()
# Make prediction using baseline model
y_pred = vgg16_finetune.predict_generator(inference_val_gen(test_data), steps=len(test_data), verbose=1)
y_pred = np.argmax(y_pred, axis=1)
y_pred.shape
Plot the confusion matrix:
plot_confusion_matrix(y_true, y_pred, title='Classification Report - VGG16 finetuned',
labels_=labels, target_names=label_names, normalize=False)
- Finetuning VGG + Dropout
# Load the best model
model_path = "/content/drive/My Drive/DeepLearning/flowers/transfer/vgg/finetune/vgg16_finetune_dropout.h5"
vgg16_finetune_dropout = tf.keras.models.load_model(model_path)
vgg16_finetune_dropout.summary()
# Make prediction using baseline model
y_pred = vgg16_finetune_dropout.predict_generator(inference_val_gen(test_data), steps=len(test_data), verbose=1)
y_pred = np.argmax(y_pred, axis=1)
y_pred.shape
Plot the confusion matrix:
plot_confusion_matrix(y_true, y_pred, title='Classification Report - VGG16 finetuned + dropout',
labels_=labels, target_names=label_names, normalize=False)
Summary
Let's summarize all the results in a table to facilitate the comparison:
| Model | Validation accuracy (%) | Test accuracy (%) |
|---|---|---|
| Baseline | 77.29 | 75.46 |
| Baseline + data augmentation | 81.40 | 80.25 |
| Tuned hyperparameters | 84.45 | 81.79 |
| VGG16 feature extraction | 83.10 | 82.87 |
| VGG16 finetuned | 88.94 | 86.27 |
| VGG16 finetuned + dropout | 90.94 | 88.73 |
From the above table we can conclude that the best model in terms of validation and test accuracy is the finetuned VGG16 regularized with dropout.
Improvement ideas:
- Tune the hyperparameters of the VGG16 model, as there may be room for improvement (batch size, dropout rate, learning rate, ...)
- Try different architectures for transfer learning, such as ResNet, Xception, ...
- Try to add more data
- Use some advanced techniques such as learning rate scheduling (see the sketch right below), label smoothing, ...
- Put yours here ...
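To make the learning rate scheduling idea concrete, Keras ships a `ReduceLROnPlateau` callback that could simply be appended to our callbacks list; a sketch (the factor/patience values are guesses to be tuned, not recommendations):
reduce_lr = tf.keras.callbacks.ReduceLROnPlateau(
    monitor='val_loss',  # watch the validation loss
    factor=0.5,          # halve the learning rate ...
    patience=10,         # ... after 10 epochs without improvement
    min_lr=1e-6,
    verbose=1
)
list_callbacks = [earlystop_callback, ckpt_callback, copy_cb, reduce_lr]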
App structure
A full walkthrough of the application code is beyond the scope of this post. Nevertheless, I'll give an overview of the app in this last section.
The model is deployed using a library called "tflite-react-native" (see this github page for more details). The workflow consists of first converting the trained Keras model into TensorFlow Lite (tflite) format, then using the tflite-react-native library to integrate the tflite model into the React Native application.
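To give an idea of that first step, with the TF 1.x API used throughout this post the conversion can look like the sketch below (file names are illustrative), followed by a quick sanity check of the converted model with the Python tflite interpreter:
import numpy as np
import tensorflow as tf

# convert the trained Keras model (HDF5 file) to TensorFlow Lite
converter = tf.lite.TFLiteConverter.from_keras_model_file("vgg16_finetune_dropout.h5")
tflite_model = converter.convert()
with open("flowers.tflite", "wb") as f:
    f.write(tflite_model)

# sanity check: run one dummy image through the converted model
interpreter = tf.lite.Interpreter(model_path="flowers.tflite")
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
dummy = np.zeros((1, 224, 224, 3), dtype=np.float32)
interpreter.set_tensor(input_details[0]['index'], dummy)
interpreter.invoke()
print(interpreter.get_tensor(output_details[0]['index']))  # 5 class probabilities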
The application consists mainly of 4 screens, whose code is located in the `screens` folder of the repository. The screens are as follows:
- The login screen (`login-screen.js` file) is the entry point of the app. It contains a simple login form (user name and password). However, there is no real validation of the user name and password being typed, as the app is not connected to any server; the only check I implemented is on the number of characters typed. So it is just a kind of dummy login screen.
- Once logged in, the user is redirected to the home screen (`home.js` file), from which they can take a picture of a flower using the phone's camera, or upload a flower image directly from the phone's storage.
- The camera screen (`camera-screen.js` file) allows the user to capture an image (hopefully of a flower :) ) using the phone's camera.
- Once an image has been uploaded or captured, the user moves to the final screen, the predict screen (`predict-screen.js` file), which lets them predict the species of the corresponding flower.
- There is another file (`barChart.js`) which contains the implementation of the bar chart (representing the predicted probability for each flower class) displayed in the predict screen.