In this post, I’ll explain how to use Google Colab to train an image classifier.
If you are a researcher working in AI, or a developer eager to bring modern AI applications like face detection and object detection into your application, Google Colab is the perfect sidekick to help you out. Don’t worry if you are a beginner or completely new to the field; just follow along with me. And if you have no idea what AI research is capable of, you could check out the recent research covered on Two Minute Papers, which will help you understand it.
Coming back to Google Colab, you can consider yourself lucky, because you don’t need to waste hours and hours configuring the environment and installing Python, TensorFlow, etc. like we did before Google Colab came along.
What is Google Colab?
Colaboratory is a Google research project created to help disseminate machine learning education and research. It’s a Jupyter notebook environment that requires no setup to use and runs entirely in the cloud.
Colab Notebooks – Magenta Tensorflow
Yes, you read it correctly. It comes with everything you’ll need for machine learning education and research pre-installed. All you need to do is start coding; it’s that simple. So bye-bye to the days of installing Python, TensorFlow, or Anaconda locally.
Is that all? No! That’s not the main reason why I recommend Colab. The main reason is the hardware performance of the workspace they offer. If you are a student or just a developer, you may not be able to afford a high-performance computer with a powerful GPU, and the time it takes to train any AI model depends on hardware performance. An ordinary laptop, even one with an Nvidia notebook GPU, is not enough unless you are patient enough to wait hours for the training process to finish. But worry no more: Google Colab comes with a free Tesla K80 GPU, which costs about $5000.
So if you dare to compare an ordinary laptop GPU with this giant, here’s how it’s going to end up.
It is sort of like comparing a bullet train to a Toyota Corolla. Tesla K80 has 24GB GDDR5 RAM, the 940m has 4GB GDDR3. The Tesla has a memory bandwidth of 480 GB/sec, the 940m has 14 GB/sec. The Tesla has 5000 CUDA cores, the 940m has 384.
Stefan Gebhardt, Chemical Process Engineer
I hope the numbers make clear what I’m trying to say here. Besides, don’t overlook the fact that it comes with 12 GB of memory of its own.
So if I summarize,
Why Google Colab
- No setup to use, everything pre-installed.
- Saves time.
- Runs entirely in the cloud.
- Powerful GPU and memory.
- Only a browser is needed, can even be accessed via a smartphone.
Let’s get started
- Go to https://colab.research.google.com. You will see the introduction page (it’s better if you can go through the introduction).
- Click File -> New Python 3 Notebook. You’ll be asked to sign in to a Google account. Once you have done that you should be in an empty notebook page.
What are we going to build?
We are going to build a simple image classifier to classify images of flowers. The technique we are going to use is called transfer learning, which means we are going to re-train a model that has previously been trained for another problem. The advantage here is that the pre-trained model already has some knowledge of image classification features like colors and edges, but not specifically of flower images, so its accuracy on our task won’t be good enough out of the box. By re-training the model we can sharpen its filters to classify flower images and save a lot of time, because training from scratch would take days.
1. Choose a model
Convolutional Neural Networks (CNNs) are a class of deep neural networks most commonly applied to analyzing visual imagery. Since we are building an image classifier, we’ll need a good CNN model. TensorFlow ships with a lot of ready-to-use CNN models, so we don’t need to worry about implementing the model ourselves.
But we are going to use MobileNet, a small but efficient convolutional neural network, since large models take a lot of time to train. MobileNet is simple and a good fit for this task. It is trained on the ImageNet Large Visual Recognition Challenge dataset, so it can differentiate between 1,000 different classes, like Dalmatian or dishwasher.
2. Prepare the Dataset
First, we need to download the flower dataset to our Colab workspace.
!curl http://download.tensorflow.org/example_images/flower_photos.tgz | tar xz
Paste this code into the Colab notebook and press Shift + Enter.
In a Colab notebook we write our code in cells, and to run a cell we press Shift + Enter.
To distinguish terminal commands from Python code, we put ! before the command.
You are doing good if you get this.
Check whether the downloaded folder is there using the ls command. The folder contains 5 classes of flowers: Daisy, Dandelion, Roses, Sunflowers, and Tulips.
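If you would rather do the same check from Python instead of ls, here is a rough sketch that counts the files in each class folder. The helper name count_images_per_class is my own, and it assumes the archive was extracted to a flower_photos folder with one subfolder per class.

```python
import os

def count_images_per_class(root):
    """Return {class_folder_name: number_of_files} for each subfolder of root."""
    counts = {}
    for name in sorted(os.listdir(root)):
        class_dir = os.path.join(root, name)
        if os.path.isdir(class_dir):  # skip loose files sitting next to the class folders
            counts[name] = sum(
                1 for f in os.listdir(class_dir)
                if os.path.isfile(os.path.join(class_dir, f))
            )
    return counts

# Only runs the check if the dataset folder is actually present.
if os.path.isdir("flower_photos"):
    print(count_images_per_class("flower_photos"))
```

If one of the five folders is missing or empty, the download probably did not finish.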
To train a model we need to split the dataset into 3 sets: train, test, and validate. I prefer 60% for training, 20% for testing, and 20% for validating.
- The training dataset is used to train the model.
- The validation dataset is used to detect overfitting, which means the model has fitted itself too closely to the training dataset. In such cases, the model gives high accuracy for images in the training dataset, but when it is given an image it hasn’t seen before, the accuracy is low. To catch this while training, after each cycle in which the model is trained on all the images in the training dataset, it is also run on the images from the validation dataset, and the accuracy on both is measured. If the validation accuracy is much lower than the training accuracy, it means the model is overfitted to the training dataset, and the training then needs to be adjusted so that it does not overfit. It is also important to make sure the validation dataset covers all the kinds of scenarios found in the dataset, like close-ups, wide shots, background changes, etc.
- The testing dataset is used to test the final accuracy of the model at the end.
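To make the 60/20/20 idea concrete, here is a minimal sketch of such a split on a plain list of file names. This is not how the package I use below works internally; split_items and its parameters are names I made up for illustration.

```python
import random

def split_items(items, ratio=(0.6, 0.2, 0.2), seed=1337):
    """Shuffle items reproducibly and cut them into train/val/test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed gives the same split every run
    n_train = int(len(shuffled) * ratio[0])
    n_val = int(len(shuffled) * ratio[1])
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever is left over
    return train, val, test

train, val, test = split_items([f"img_{i}.jpg" for i in range(10)])
print(len(train), len(val), len(test))
```

In practice you would do this once per class folder, so that every class keeps roughly the same proportions in each set.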
Now we need to split the downloaded dataset into the above-mentioned sets. For this, I use a Python package.
!pip install split-folders tqdm
After installing, run the following command
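import split_folders

# If this import fails on a newer release of split-folders,
# the module may be named splitfolders instead.
# The ratio is given as (train, val, test).
split_folders.ratio('flower_photos', output="output", seed=1337, ratio=(.6, .2, .2))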
3. Train the model
To train the model I’ll be using the Keras framework. Keras is a high-level neural network API, written in Python and capable of running on top of TensorFlow. Keras takes care of the boilerplate code and lets us use its APIs, which are very easy to work with.
Let’s import the libraries that we’ll need
import split_folders
import keras
from keras.preprocessing.image import ImageDataGenerator
from keras.layers import Dense, Flatten
from keras.models import Model
from keras.optimizers import Adam
import numpy as np
import itertools
from sklearn.metrics import confusion_matrix
import matplotlib.pyplot as plt
%matplotlib inline
Now import the dataset we prepared
train_path = 'output/train'
valid_path = 'output/val'
test_path = 'output/test'

train_batches = ImageDataGenerator(
    preprocessing_function=keras.applications.mobilenet.preprocess_input
).flow_from_directory(train_path, target_size=(224, 224), batch_size=64)

valid_batches = ImageDataGenerator(
    preprocessing_function=keras.applications.mobilenet.preprocess_input
).flow_from_directory(valid_path, target_size=(224, 224), batch_size=32)

# shuffle=False keeps the test images in a fixed order so predictions line up with labels
test_batches = ImageDataGenerator(
    preprocessing_function=keras.applications.mobilenet.preprocess_input
).flow_from_directory(test_path, target_size=(224, 224), batch_size=32, shuffle=False)
Here the ImageDataGenerator does the preprocessing and splits the images into batches. I have chosen 64 as the batch size for the training set and 32 for the validation and test sets. You should get this response if you do it correctly.
Found 2199 images belonging to 5 classes.
Found 731 images belonging to 5 classes.
Found 740 images belonging to 5 classes.
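In case you wonder what the preprocessing function does to each image: as far as I know, the MobileNet preprocess_input rescales pixel values from the 0 to 255 range into the -1 to 1 range. Here is a tiny sketch of that scaling on single values; scale_pixel is a name I made up, and the real function works on whole image arrays.

```python
def scale_pixel(value):
    """Map a pixel value from [0, 255] to [-1, 1], the range MobileNet expects."""
    return value / 127.5 - 1.0

for raw in (0, 127.5, 255):
    print(raw, "->", scale_pixel(raw))
```

This matters later too: any image you feed to the trained model has to go through the same scaling.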
Import the model
mobile = keras.applications.mobilenet.MobileNet()
You can get a summary of the model by calling mobile.summary().
Modify the model
Update the model to match the requirement. As I mentioned earlier, the MobileNet model is trained to classify 1000 categories. In our case, we want the model to classify between 5 flower types, so we need to change the model to give only 5 outputs.
x = mobile.layers[-1].output
predictions = Dense(5, activation='softmax')(x)
model = Model(inputs=mobile.input, outputs=predictions)
Here we take the output of MobileNet’s last layer, which was set up to classify 1000 categories, and feed it into our own Dense layer with just 5 categories. Strictly speaking, layers[-1] keeps the original 1000-category output in place rather than removing it; to actually drop it you would branch from an earlier layer. In the future, the 5 is what you need to change to match the categories you have in your specific scenario. Then we combine everything into a new model.
Since we are going to retrain the model, we want to preserve some weights and keep them fixed. To do that I freeze all the layers except the last 5, so we are actually going to train the last 5 layers only. This has a major impact on the final accuracy of the model. In general, the more layers you train, the better the accuracy tends to get, but training also takes longer. That’s why I’m using 5. You can use any number up to the total number of layers.
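A quick aside on the softmax activation used in that Dense layer: it turns the 5 raw scores into 5 probabilities that add up to 1, and the largest one is the predicted class. Below is a plain Python sketch of the formula with made-up scores; Keras does this for you, so this is only to show the idea.

```python
import math

def softmax(scores):
    """Turn a list of raw scores into probabilities that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]  # subtracting the max avoids overflow
    total = sum(exps)
    return [e / total for e in exps]

# Five made-up scores, one per flower class.
probs = softmax([2.0, 1.0, 0.1, 0.0, -1.0])
print(probs)
print(sum(probs))
```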
for layer in model.layers[:-5]:
    layer.trainable = False
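If the slicing is not obvious, here is the same idea on a toy list of stand-in layers, so you can try other numbers without loading the real model. The SimpleNamespace objects only mimic the trainable attribute of Keras layers, and freeze_all_but_last is a helper name I made up.

```python
from types import SimpleNamespace

def freeze_all_but_last(layers, n_trainable):
    """Mark every layer as frozen except the last n_trainable ones."""
    cutoff = max(len(layers) - n_trainable, 0)
    for index, layer in enumerate(layers):
        layer.trainable = index >= cutoff
    # Return how many layers will actually be trained.
    return sum(1 for layer in layers if layer.trainable)

# Eight stand-in layers; only the last five stay trainable.
toy_layers = [SimpleNamespace(name=f"layer_{i}", trainable=True) for i in range(8)]
print(freeze_all_but_last(toy_layers, 5))
print([layer.name for layer in toy_layers if not layer.trainable])
```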
Compile the model
Here steps_per_epoch is equal to the total number of training images divided by the training batch size.
So 2199/64 rounded down is 34. The same applies to validation steps: 731/32 rounded down is 22.
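Here is that arithmetic as a small sketch. The function name steps_for is my own; it returns both the number of full batches (the numbers I use) and the number of batches you would need if you also wanted to include the leftover images.

```python
import math

def steps_for(num_images, batch_size):
    """Return (full batches only, batches needed to cover every image)."""
    full_batches = num_images // batch_size
    all_batches = math.ceil(num_images / batch_size)
    return full_batches, all_batches

print(steps_for(2199, 64))  # training set
print(steps_for(731, 32))   # validation set
```

Rounding down means the last partial batch is skipped in each epoch, which is fine for training since the generator keeps cycling through the images.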
Here epochs means how many times we want to send all the images through the network to train the model. Up to a point, the more epochs you use, the more the accuracy will increase.
Before you start training, make sure you have enabled the GPU runtime.
Runtime -> Change runtime type -> Hardware Accelerator -> GPU
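If you want to double-check that the runtime really has a GPU, one rough way is to ask the nvidia-smi tool from Python. This is only a sketch: gpu_visible is a name I made up, and it assumes nvidia-smi is on the path whenever a GPU runtime is active.

```python
import shutil
import subprocess

def gpu_visible():
    """Return True if nvidia-smi is installed and lists at least one GPU."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return False  # tool not found, so most likely no GPU runtime
    try:
        result = subprocess.run([exe, "-L"], capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.SubprocessError):
        return False
    return result.returncode == 0 and "GPU" in result.stdout

print("GPU available:", gpu_visible())
```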
Now start training…
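For reference, here is a sketch of what the compile and training calls can look like. Treat every setting in it as an assumption rather than the exact configuration behind the numbers below: the "adam" optimizer with default settings, the categorical_crossentropy loss, and the train helper itself are my picks for illustration. It uses fit_generator to match the older Keras API used in the rest of this post; newer Keras releases fold this into model.fit.

```python
def train(model, train_batches, valid_batches, epochs=50):
    """Compile the model and train it; all hyperparameters here are assumptions."""
    model.compile(
        optimizer="adam",                  # assumed optimizer with default settings
        loss="categorical_crossentropy",   # one-hot labels, one class per image
        metrics=["accuracy"],
    )
    # Full batches only, e.g. 2199 // 64 = 34 and 731 // 32 = 22.
    steps = train_batches.samples // train_batches.batch_size
    val_steps = valid_batches.samples // valid_batches.batch_size
    return model.fit_generator(
        train_batches,
        steps_per_epoch=steps,
        validation_data=valid_batches,
        validation_steps=val_steps,
        epochs=epochs,
        verbose=2,  # one summary line per epoch
    )

# In the notebook this would be called as:
# history = train(model, train_batches, valid_batches, epochs=50)
```

Calling it again on the same model continues from the current weights, which is what training again means further down.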
You should get an output like this.
Here is the result of the last epoch
Epoch 50/50 - 8s - loss: 1.2775 - acc: 0.8974 - val_loss: 1.3544 - val_acc: 0.7296
I’m not satisfied. Let’s train again. What happens when you train again is that the model improves. It starts with higher accuracy than last time and tries to improve it more. I’m going to try 1000 epochs this time.
Usually, we stop the training when the accuracy goes above 0.99 and the model stops showing any improvement. So after 1000 epochs, this is the result I got.
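That stopping rule can be written down as a small check on the accuracy history. This is a sketch of the idea only: should_stop, the 0.99 target, and the patience of 5 epochs are values I picked for illustration. Keras also ships an EarlyStopping callback for this kind of rule.

```python
def should_stop(accuracies, target=0.99, patience=5):
    """Stop once accuracy has reached target and not improved for `patience` epochs."""
    if len(accuracies) <= patience:
        return False  # not enough history yet
    best_before = max(accuracies[:-patience])
    recent_best = max(accuracies[-patience:])
    return best_before >= target and recent_best <= best_before

still_improving = [0.50, 0.60, 0.70, 0.80, 0.85, 0.90, 0.95]
flat_at_top = [0.50, 0.90, 0.995, 0.995, 0.995, 0.995, 0.995, 0.995]
print(should_stop(still_improving))
print(should_stop(flat_at_top))
```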
Epoch 1000/1000 - 16s - loss: 8.3051e-04 - acc: 1.0000 - val_loss: 0.5433 - val_acc: 0.8709
Now let’s test our model with the test dataset.
# Predict on every test batch so the predictions line up with all the test labels.
predictions = model.predict_generator(test_batches, steps=len(test_batches), verbose=0)
test_labels = test_batches.classes

def plot_confusion_matrix(cm, classes, normalize=False, title='Confusion matrix', cmap=plt.cm.Blues):
    """
    This function prints and plots the confusion matrix.
    Normalization can be applied by setting `normalize=True`.
    """
    plt.imshow(cm, interpolation='nearest', cmap=cmap)
    plt.title(title)
    plt.colorbar()
    tick_marks = np.arange(len(classes))
    plt.xticks(tick_marks, classes, rotation=45)
    plt.yticks(tick_marks, classes)

    if normalize:
        cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]
        print("Normalized confusion matrix")
    else:
        print('Confusion matrix, without normalization')
    print(cm)

    # Write each count into its cell, in white on dark cells and black on light ones.
    thresh = cm.max() / 2.
    for i, j in itertools.product(range(cm.shape[0]), range(cm.shape[1])):
        plt.text(j, i, cm[i, j], horizontalalignment="center",
                 color="white" if cm[i, j] > thresh else "black")

    plt.tight_layout()
    plt.ylabel('True label')
    plt.xlabel('Predicted label')

cm = confusion_matrix(test_labels, predictions.argmax(axis=1))
cm_plot_labels = ['Daisy', 'Dandelion', 'Roses', 'Sunflowers', 'Tulips']
plot_confusion_matrix(cm, cm_plot_labels, title='Confusion Matrix')
This code will draw a confusion matrix for the test dataset. It’s a true label vs predicted label matrix. So you can see how many labels are predicted correctly and how many are not.
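If you have not met a confusion matrix before, here is a tiny pure Python version with made-up labels for three classes. It is only meant to show how the counts are built; the function names are my own.

```python
def confusion_counts(true_labels, predicted_labels, num_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(true_labels, predicted_labels):
        matrix[t][p] += 1
    return matrix

def accuracy_from_matrix(matrix):
    """Correct predictions sit on the diagonal of the matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0

# Made-up example: six images, three classes.
toy_true = [0, 0, 1, 1, 2, 2]
toy_pred = [0, 1, 1, 1, 2, 0]
toy_matrix = confusion_counts(toy_true, toy_pred, 3)
for row in toy_matrix:
    print(row)
print(accuracy_from_matrix(toy_matrix))
```

Every number off the diagonal is a mistake, and its row and column tell you which class was confused with which.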
model.evaluate_generator(test_batches, steps=len(test_batches), verbose=0)
From this code, you can get an overall evaluation of the model. The first number is the loss and the other is the accuracy.
In my next article, I’ll show you how to save the trained model and use it in mobile applications and web applications.