
How to Build an Image Classifier with Keras

In this article, we will learn how to use a convolutional neural network (CNN) to build an image classifier. We will use Keras with TensorFlow as the backend. Image classification helps us recognize and identify images, and it is applied in fields such as healthcare, agriculture, education, and surveillance. Here, we will see how image classifiers apply to healthcare: our goal is to train a CNN model to classify COVID-19 and normal chest X-ray scans of patients.

The Colab notebook for this project is here.

    Table of contents

• Importing Libraries and Exploring the Dataset
• Data Visualization
• Data Preprocessing and Augmentation
• Build CNN Model
• Compile and Train the Model
• Performance Evaluation
• Prediction on New Data

    Prerequisites

Before we begin, it would be helpful to have a basic understanding of the following:

• Basics of Convolutional Neural Networks. I recommend this article by Willies Ogola to get started.
• Python programming.
• Colab notebooks.

    Importing libraries and exploring the dataset

    Let us start by importing the libraries we will be using for this project.

    # Importing Libraries
    import tensorflow as tf
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense
    from tensorflow.keras.optimizers import Adam
    from tensorflow.keras.preprocessing.image import ImageDataGenerator
    import numpy as np
    import matplotlib.pyplot as plt
    

We will create a directory where we can store our model as it trains. Let's also check the version of TensorFlow we are using and confirm whether we are using a GPU.

    # creating models directory
    import os
    if not os.path.isdir('models'):
      os.mkdir('models')
    # checking TensorFlow version and GPU usage
    print('Tensorflow version:', tf.__version__)
    print('Is using GPU?', tf.test.is_gpu_available())
    

[Output: the TensorFlow version and the GPU availability check]

    We are using TensorFlow 2.4, but our output is False for GPU. We need to connect to a GPU runtime.

    To do that, we:

1. Click on Runtime.
2. Select Change runtime type.
3. Select GPU as the hardware accelerator.
4. Click on Save.

    Let us run the code above again.

[Output: Is using GPU? True]

    Now our output is True. It indicates that we are using a GPU on Google’s Colab. With a GPU, we can train our model faster.
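Note that tf.test.is_gpu_available() is deprecated in newer TensorFlow releases. If you are on a more recent version, here is a quick sketch of the recommended check:

# Preferred GPU check in recent TF 2.x releases: list the visible GPU devices
gpus = tf.config.list_physical_devices('GPU')
print('Num GPUs available:', len(gpus))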

    Let us clone the GitHub repo containing the dataset.

    # Clone and Explore Dataset
    !git clone https://github.com/education454/datasets.git
    

[Output: git clone progress for the datasets repository]

    The dataset is now available in Colab.

[Screenshot: the datasets folder in the Colab file browser]

    Let us set the path to the main directory of our dataset.

    # setting path to the main directory
    main_dir = "/content/datasets/Data"
    

    We also set the paths to the training set and testing set subdirectories.

    # Setting path to the training directory
    train_dir = os.path.join(main_dir, 'train')
    
    # Setting path to the test directory
    test_dir = os.path.join(main_dir, 'test')
    

Let's set the paths to the training COVID-19 images and the training normal images. We will do the same for the test set.

    # Directory with train covid images
    train_covid_dir = os.path.join(train_dir, 'COVID19')
    
    # Directory with train normal images
    train_normal_dir = os.path.join(train_dir, 'NORMAL')
    
# Directory with test covid images
test_covid_dir = os.path.join(test_dir, 'COVID19')

# Directory with test normal images
test_normal_dir = os.path.join(test_dir, 'NORMAL')
    

    Let us put all the images in each directory in a list and print out the first ten file names in each list.

    # Creating a list of filenames in each directory
    train_covid_names = os.listdir(train_covid_dir)
    print(train_covid_names[:10])  # printing a list of the first 10 filenames
    
    train_normal_names = os.listdir(train_normal_dir)
    print(train_normal_names[:10])
    
    test_covid_names = os.listdir(test_covid_dir)
    print(test_covid_names[:10])
    
    test_normal_names = os.listdir(test_normal_dir)
    print(test_normal_names[:10])
    

[Output: the first ten filenames in each list]

    Let's see the total number of images available in the training set and test set.

    # Printing total number of images present in each set
    print('Total no of images in training set:', len(train_covid_names
                                                    + train_normal_names))
    print("Total no of images in test set:", len(test_covid_names
                                                + test_normal_names))
    

[Output: total image counts for the training and test sets]

    We have 1,811 images in the training set and 484 images in the test set.
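We can also check how the images are split between the two classes. Here is a quick sketch using the filename lists we created above:

# Sketch: per-class image counts to check the class balance
print('Train COVID19:', len(train_covid_names))
print('Train NORMAL:', len(train_normal_names))
print('Test COVID19:', len(test_covid_names))
print('Test NORMAL:', len(test_normal_names))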

    Data visualization

The image module in the matplotlib package enables us to load, rescale, and display images. We plot a grid of 16 images: eight COVID-19 images and eight normal images.

# Data Visualization
import matplotlib.image as mpimg
# Setting the number of rows and columns
ROWS = 4
COLS = 4
# Setting the figure size; plt.gcf() returns a reference to the current figure
fig = plt.gcf()
fig.set_size_inches(12, 12)
    

We create a list containing the paths to eight COVID-19 images in the training set and do the same for the normal class. We then merge the two lists to get the 16 images we will display.

    # get the directory to each image file in the trainset
    covid_pic = [os.path.join(train_covid_dir, filename) for filename in train_covid_names[:8]]
    normal_pic = [os.path.join(train_normal_dir, filename) for filename in train_normal_names[:8]]
    print(covid_pic)
    print(normal_pic)
    # merge covid and normal lists
    merged_list = covid_pic + normal_pic
    print(merged_list)
    

[Output: the lists of image file paths]

# Plotting the images in the merged list
for i, img_path in enumerate(merged_list):
    # getting the filename from the path
    data = os.path.basename(img_path)
    # creating a subplot with the set number of rows and columns and an index
    sp = plt.subplot(ROWS, COLS, i+1)
    # turn off the axis
    sp.axis('off')
    # reading the image data into an array
    img = mpimg.imread(img_path)
    # setting the title of the subplot as the filename
    sp.set_title(data, fontsize=6)
    # displaying the data as an image
    plt.imshow(img, cmap='gray')

plt.show()  # display the plot
    

[Output: a 4x4 grid of COVID-19 and normal chest X-ray images]

    Data preprocessing and augmentation

Our dataset has training data and testing data but no validation data, so we will set aside 20% of our training data for validation.

We use the training data to train the model, the validation data to monitor the model during training, and the testing data to evaluate the model after training.

    
    # Data Preprocessing and Augmentation
    # Generate training, testing and validation batches
    dgen_train = ImageDataGenerator(rescale=1./255,
                                    validation_split=0.2,  # using 20% of training data for validation 
                                    zoom_range=0.2,
                                    horizontal_flip=True)
    dgen_validation = ImageDataGenerator(rescale=1./255)
    dgen_test = ImageDataGenerator(rescale=1./255)
    

    Image Augmentation helps us increase the size of our training set. It also helps to reduce overfitting. The ImageDataGenerator() class generates batches of tensor image data with real-time data augmentation.

    We create objects of the class. One is for the training images, where we apply image augmentation. We also create objects for the validation images and the test images.

We begin by rescaling the images, which normalizes the pixel values: multiplying by 1/255 maps each pixel from the range [0, 255] to [0, 1]. This kind of feature scaling is vital when training neural networks.

    The validation_split parameter allows us to split a subset of our training data into the validation set. 0.2 means we use 20% of our training set as the validation set. We set zoom_range to 0.2. Finally, we set horizontal_flip to True.
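ImageDataGenerator supports several other transforms if you want stronger augmentation. Here is an illustrative sketch; the extra parameter values are assumptions you would tune for your own data:

# Sketch: a richer augmentation pipeline (illustrative values)
dgen_train_extra = ImageDataGenerator(rescale=1./255,
                                      validation_split=0.2,
                                      zoom_range=0.2,
                                      horizontal_flip=True,
                                      rotation_range=15,       # random rotations up to 15 degrees
                                      width_shift_range=0.1,   # random horizontal shifts up to 10%
                                      height_shift_range=0.1)  # random vertical shifts up to 10%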

Now that we have created the objects, we need to connect them to our dataset. The ImageDataGenerator() class has a method called flow_from_directory. It connects the image augmentation tool to the images in our dataset, taking the path to a directory and generating batches of augmented data.

    # Awesome HyperParameters!!!
    TARGET_SIZE = (200, 200)
    BATCH_SIZE = 32
    CLASS_MODE = 'binary'  # for two classes; categorical for over 2 classes
    
    # Connecting the ImageDataGenerator objects to our dataset
    train_generator = dgen_train.flow_from_directory(train_dir,
                                                     target_size=TARGET_SIZE,
                                                     subset='training',
                                                     batch_size=BATCH_SIZE,
                                                     class_mode=CLASS_MODE)
    
    validation_generator = dgen_train.flow_from_directory(train_dir,
                                                          target_size=TARGET_SIZE,
                                                          subset='validation',
                                                          batch_size=BATCH_SIZE,
                                                          class_mode=CLASS_MODE)
    test_generator = dgen_test.flow_from_directory(test_dir,
                                                   target_size=TARGET_SIZE,
                                                   batch_size=BATCH_SIZE,
                                                   class_mode=CLASS_MODE)
    

[Output: the number of images found for each generator, belonging to 2 classes]

    The first argument is the path to the dataset. The next parameter is the target_size. It resizes all the images to the specified target size of 200x200. The batch size defines how many images we want to have in each batch.

    We use a batch size of 32, and the class mode is either binary or categorical. Binary is for two output classes, while categorical is for more than two classes. Since we are working with only two classes, we set the class mode to binary.

    The subset parameter keeps track of the images we use for training and validation from the train_dir. We don’t need the subset parameter for the test generator. We use the subset parameter only if we use validation_split.

Note: We apply image augmentation only to the training images; the validation and test sets only need rescaling. One subtlety: because validation_split requires the validation generator to come from the same ImageDataGenerator, our validation_generator is created from dgen_train and therefore shares its settings, and the dgen_validation object goes unused here.
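To see what the augmentation does, we can pull one batch from train_generator and display a few images. A quick sketch:

# Sketch: preview a few augmented training images
aug_images, aug_labels = next(train_generator)  # one batch of 32 augmented images
fig = plt.gcf()
fig.set_size_inches(8, 4)
for i in range(8):
    sp = plt.subplot(2, 4, i + 1)
    sp.axis('off')
    plt.imshow(aug_images[i], cmap='gray')
plt.show()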

    To get the class indices, we use the class_indices attribute.

    # Get the class indices
    train_generator.class_indices
    

[Output: {'COVID19': 0, 'NORMAL': 1}]

We can also get the shape of our images using the image_shape attribute.

    # Get the image shape
    train_generator.image_shape
    

[Output: (200, 200, 3)]

    Build CNN model

    # Building CNN Model
    model = Sequential()
    model.add(Conv2D(32, (5,5), padding='same', activation='relu',
                    input_shape=(200, 200, 3)))
    model.add(MaxPooling2D(pool_size=(2,2)))
    model.add(Dropout(0.2))
    model.add(Conv2D(64, (5,5), padding='same', activation='relu'))
    model.add(MaxPooling2D(pool_size=(2,2)))
    model.add(Dropout(0.2))
    model.add(Flatten())
    model.add(Dense(256, activation='relu'))
    model.add(Dropout(0.2))
    model.add(Dense(1, activation='sigmoid'))
    model.summary()
    

The Convolutional layer helps us detect features in our images. We are using 32 filters, each of size 5x5. We set padding to same so that we do not lose information at the edges of the image. We add the ReLU activation function to introduce non-linearity. The input image is 200x200 with three color channels for RGB.

The MaxPooling layer with a pool_size of (2,2) halves the spatial dimensions of the image while retaining its prominent features. It also reduces the number of parameters, which shortens the training time.

    The Dropout layer helps us avoid overfitting. We use a rate of 0.2. You can learn more about the dropout layer here.

    We add another Convolutional layer with 64 filters and a MaxPooling layer. We also add another Dropout layer.

    The Flatten layer converts the data to a 1D vector. The dense layer is our fully connected layer. We are using 256 nodes with a ReLU activation function. We add another dropout layer, and we create an output layer with one node, using a sigmoid activation function. We are using one node at the output because it is a binary classification problem.

[Output: the model summary with layer output shapes and parameter counts]
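To see this shape arithmetic concretely, we can print each layer's output shape (model.summary() above shows the same information). The 'same' padding keeps 200x200 after each convolution, each 2x2 max-pool halves it (200 -> 100 -> 50), and Flatten produces a vector of 50 * 50 * 64 = 160,000 values:

# Tracing the output shape of each layer in the model
for layer in model.layers:
    print(layer.name, layer.output_shape)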

    Compile and train the model

    # Compile the Model
model.compile(Adam(learning_rate=0.001), loss='binary_crossentropy', metrics=['accuracy'])
    

    We compile our model using the Adam optimizer with a learning rate of 0.001. That is also the default learning rate.

    We are using binary_crossentropy loss since we have two classes. If we have more than two classes, then we would use categorical_crossentropy. The loss function gives the measure of our model’s performance. We also keep track of the accuracy while our model trains.
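For intuition, the binary cross-entropy for one example is -(y*log(p) + (1-y)*log(1-p)), where y is the true label and p is the predicted probability. A small worked sketch:

# Sketch: binary cross-entropy for a single prediction, computed by hand
y_true, y_pred = 1.0, 0.9   # true label 1, model predicts 0.9
loss = -(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))
print(loss)  # ~0.105; the loss grows as y_pred moves away from y_true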

# Train the Model
history = model.fit(train_generator,
                    epochs=30,
                    validation_data=validation_generator,
                    callbacks=[
                        # Stop training if val_accuracy doesn't improve for 20 epochs
                        tf.keras.callbacks.EarlyStopping(monitor='val_accuracy',
                                                         patience=20),
                        # Save the best model to the models directory;
                        # save_weights_only=False keeps the architecture, not just the weights
                        tf.keras.callbacks.ModelCheckpoint('models/model_{val_accuracy:.3f}.h5',
                                                           save_best_only=True,
                                                           save_weights_only=False,
                                                           monitor='val_accuracy')
                    ])
    

    We fit our model to the train_generator and train for 30 epochs. We set the validation_data parameter to the validation_generator. We also set some callbacks. As the model trains, we track the validation accuracy.

If the validation accuracy doesn't improve for 20 epochs, training stops early. With save_best_only set to True, the checkpoint keeps only the model with the highest validation accuracy seen so far. With save_weights_only set to False, it stores the whole model architecture, not just the weights.

[Output: per-epoch training logs showing loss, accuracy, val_loss, and val_accuracy]

We store the best performing model in the models directory that we created earlier.

[Screenshot: saved model files in the models directory]

    We added the validation accuracy to the name of the model file. It will be easy for us to identify the best model in the directory.
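Since the filenames encode the score, here is a quick sketch for listing the checkpoints and picking the best one (the example filenames are illustrative):

# Sketch: list saved checkpoints; the highest val_accuracy sorts last
checkpoints = sorted(os.listdir('models'))
print(checkpoints)  # e.g. ['model_0.953.h5', 'model_0.989.h5']
best_path = os.path.join('models', checkpoints[-1])
print('Best model:', best_path)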

    Our best performing model has a training loss of 0.0366 and a training accuracy of 0.9857. It has a validation loss of 0.0601 and a validation accuracy of 0.9890.

    If you do not get a good validation accuracy, you can increase the number of epochs for training. It’s advisable to get more training data. The more data you have, the more accurate your model will be.

    After training, we get the history object. It contains the progress of the network during training. That is the loss and the metrics we defined when we compiled our model.

    Performance evaluation

    Let us get the keys of the history object.

    # Performance Evaluation
    history.history.keys()
    

[Output: dict_keys(['loss', 'accuracy', 'val_loss', 'val_accuracy'])]

    The history object stores the values of loss, accuracy, validation loss, and validation accuracy in each epoch.

    We now plot a graph to visualize the training and validation loss after each epoch of training.

    # Plot graph between training and validation loss
    plt.plot(history.history['loss'])
    plt.plot(history.history['val_loss'])
    plt.legend(['Training', 'Validation'])
    plt.title('Training and Validation Losses')
    plt.xlabel('epoch')
    

[Plot: training and validation loss per epoch]

    Let us also plot a graph to visualize the training and validation accuracy after each epoch of training.

# Plot graph between training and validation accuracy
plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.legend(['Training', 'Validation'])
plt.title('Training and Validation Accuracy')
plt.xlabel('epoch')
    

[Plot: training and validation accuracy per epoch]

    Let us see how our model performs on the testing data. Remember, the testing data is the data the model hasn’t seen during the training process.

# loading the best performing model
    model = tf.keras.models.load_model('/content/models/model_0.989.h5')
    
    # Getting test accuracy and loss
    test_loss, test_acc = model.evaluate(test_generator)
    print('Test loss: {} Test Acc: {}'.format(test_loss, test_acc))
    

[Output: test loss and test accuracy]

    We get a test loss of approximately 0.0768 and a test accuracy of approximately 0.9731.
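Accuracy alone can hide class-level mistakes. As a sketch, we can compute a confusion matrix with scikit-learn (pre-installed on Colab); we rebuild the test generator with shuffle=False so the predictions line up with the labels:

# Sketch: per-class evaluation with a confusion matrix
from sklearn.metrics import confusion_matrix, classification_report

eval_generator = dgen_test.flow_from_directory(test_dir,
                                               target_size=TARGET_SIZE,
                                               batch_size=BATCH_SIZE,
                                               class_mode=CLASS_MODE,
                                               shuffle=False)  # keep file order
probs = model.predict(eval_generator)       # sigmoid outputs: P(NORMAL)
preds = (probs > 0.5).astype(int).ravel()   # 0 = COVID19, 1 = NORMAL
print(confusion_matrix(eval_generator.classes, preds))
print(classification_report(eval_generator.classes, preds,
                            target_names=['COVID19', 'NORMAL']))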

    Prediction on new data

Let us use our model to make predictions on new data. Our new data will be a random image from our test set. We get the path to the test image and pass it to the load_img() function.

# Making a Single Prediction
import numpy as np
from tensorflow.keras.preprocessing import image

# load and resize image to 200x200
test_image = image.load_img('/content/datasets/Data/test/COVID19/COVID-19 (457).jpg',
                            target_size=(200,200))

# convert image to a numpy array of shape (200, 200, 3)
images = image.img_to_array(test_image)
# rescale pixel values to [0, 1], matching the training preprocessing
images = images / 255.0
# expand dimensions: the model expects a batch, so shape becomes (1, 200, 200, 3)
images = np.expand_dims(images, axis=0)
# making a prediction; the sigmoid output is the probability of the NORMAL class
prediction = model.predict(images)

if prediction[0][0] < 0.5:
  print('COVID Detected')
else:
  print('Report is Normal')
    

[Output: COVID Detected]

The load_img() function from tensorflow.keras.preprocessing.image loads an image in PIL format. The first argument is the path to the image, and the second argument resizes the input image. After loading and resizing the image, we convert it to a numpy array and rescale it by 1/255 to match the preprocessing used during training.

We also have to expand the dimensions of the image. We trained our model on batches of images, with 32 images entering the model at a time, so we create a batch of one image before calling the predict method. The axis parameter of the expand_dims function specifies where to add the extra dimension; the batch dimension is always the first, so the argument is 0.

We then decode the result so the model reports whether the scan shows COVID-19 or is normal instead of a raw number. Remember, from the class indices, 0 represents a COVID-19 scan and 1 represents a normal scan. The sigmoid output is the probability of class 1 (normal), so a prediction below 0.5 means COVID-19 and anything above means normal. Our model has detected COVID-19 in the patient's X-ray scan.
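If you want to see how confident the model is rather than just the thresholded class, you can inspect the raw sigmoid output. A quick sketch:

# Sketch: inspect the raw probability behind the prediction
prob_normal = float(prediction[0][0])  # sigmoid output = P(NORMAL)
print('P(NORMAL) = {:.4f}, P(COVID19) = {:.4f}'.format(prob_normal, 1 - prob_normal))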

    Conclusion

In this article, we learned how to build an image classifier using Keras. We applied data augmentation to increase the size of our dataset and visualized our training images. We created a CNN model and trained it to distinguish COVID-19 chest X-ray scans from normal chest X-ray scans.

We got a test accuracy of about 97% and a test loss of 0.0768. We were also able to save our model and load it to make predictions on new data. Note that this model is not ready for real-world use because we did not train it on a large dataset.

Now you can build image classifiers for other tasks, such as distinguishing dogs from cats or cars from bikes.

    Thank you for reading!!!

    References

    1. Keras Documentation
    2. Coursera Guided Project

    Peer Review Contributions by: Lalithnarayan C

    Published on: Feb 8, 2021
    Updated on: Jul 15, 2024