Image Classification Using Convolutional Neural Networks in Keras with Your Own Dataset

In this post, I will try to give you an idea of Convolutional Neural Networks (CNNs) and why they are important in image classification. I will use my own image dataset for CNN training and testing. For an easier start, you can use the MNIST or CIFAR-10 datasets; there are already some tutorials online that use them.


Basic datasets for testing


  1. MNIST Database: http://yann.lecun.com/exdb/mnist/
  2. CIFAR10: https://www.cs.toronto.edu/~kriz/cifar.html

Why CNNs for Image Classification

  • In a training dataset, it is important to include different variations and orientations of each image. In a CNN pipeline, this can be implemented easily through image augmentation.
Picture with rotation 0 degrees
Picture with rotation 30 degrees
Picture with rotation 60 degrees
  • It is easy to design the layers in a pattern so that every layer independently does its own feature extraction.
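In Keras, rotation augmentation like the pictures above can be sketched with `ImageDataGenerator` (the 60-degree range and the random input batch are illustrative assumptions, not this post's actual data):

```python
# Rotation augmentation sketch: random rotations up to 60 degrees.
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(rotation_range=60)

# A dummy batch of one 32x32 RGB image, standing in for real training images.
images = np.random.rand(1, 32, 32, 3)

# flow() yields randomly augmented batches of the same shape as the input.
augmented = next(datagen.flow(images, batch_size=1))
print(augmented.shape)  # same shape as the input batch
```

Each call to the generator returns a freshly rotated copy, so the network sees a different orientation of the same image on every epoch.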

Convolution Neural Network(CNN)

A CNN is a form of feed-forward neural network. A CNN has 2 parts:
  • The 1st part, called the feature extractor, consists of
    • Convolution layers
    • Pooling layers
  • The 2nd part consists of fully connected layers, which act as the classifier
Convolution Neural Network
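This two-part structure can be sketched as a Keras Sequential model (the layer sizes and the 10-class output are illustrative assumptions, not the post's final model):

```python
# A minimal sketch of the two-part CNN structure: feature extractor + classifier.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

model = Sequential([
    # Part 1: feature extractor (convolution + pooling)
    Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)),
    MaxPooling2D((2, 2)),
    # Part 2: fully connected classifier
    Flatten(),
    Dense(64, activation='relu'),
    Dense(10, activation='softmax'),  # assuming 10 output classes
])
model.summary()
```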



Convolution layer

In simple terms, convolution is a weighted sum between two matrices:
  • The 1st matrix comes from the image dataset
  • The 2nd matrix is called the kernel matrix/filter
In image classification, convolution is computed at a particular location (x, y): we extract a k x k sized matrix from the image and combine it with the kernel.


ImageMatrix * KernelMatrix/filter = OutputMatrix

Note that: the operation here is element-wise multiplication of the two matrices followed by a sum of the products. The output matrix is also called an activation map.
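The weighted sum at a single location can be sketched in a few lines of NumPy (the image and kernel values here are made up for illustration):

```python
# Convolution as a weighted sum at one location (x, y) = (0, 0).
import numpy as np

image = np.arange(25).reshape(5, 5)   # toy 5x5 "image"
kernel = np.array([[1, 0, -1],
                   [1, 0, -1],
                   [1, 0, -1]])       # 3x3 filter (vertical-edge style)

# Extract the k x k patch at (0, 0), multiply element-wise, then sum.
patch = image[0:3, 0:3]
value = np.sum(patch * kernel)
print(value)  # -6
```

Repeating this at every location produces the full output matrix / activation map.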




Stride

The kernel can be moved over the entire image matrix by 1, 2, or 3 pixels at a time; this step size is called the stride. In the example above, I slid the kernel over the image matrix by 2 pixels.
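Sliding with a stride can be sketched as a pair of loops (the 5x5 image and all-ones kernel are illustrative assumptions):

```python
# Sliding a 3x3 kernel over a 5x5 image with stride 2.
import numpy as np

image = np.arange(25).reshape(5, 5)
kernel = np.ones((3, 3))
stride = 2
k = kernel.shape[0]

# Number of positions the kernel fits along each dimension.
out_size = (image.shape[0] - k) // stride + 1

output = np.zeros((out_size, out_size))
for i in range(out_size):
    for j in range(out_size):
        patch = image[i*stride:i*stride+k, j*stride:j*stride+k]
        output[i, j] = np.sum(patch * kernel)

print(output.shape)  # (2, 2)
```

A larger stride means fewer kernel positions, which is why the output matrix shrinks faster with stride 2 than with stride 1.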

Multiple Filters

You can use multiple filters/kernel matrices; each filter produces its own activation map. In this example, I used only 1 filter and created only 1 output matrix / activation map.

Activation Maps/Output

For 2D Image

Scenario 1

Input Image = 32x32
Filter = 3x3
Stride = 2
output/activation maps = 15x15 [Output Matrix size < Input Matrix Size]

Scenario 2

Input Image = 32x32
Filter = 3x3
Stride = 1
output/activation maps = 30x30 [Output Matrix size < Input Matrix Size]

For 3D Image

Scenario 1

Input Image = 32x32x3
Filter = 3x3x3
Stride = 1
output/activation maps = 30x30x1 [Output Matrix size < Input Matrix Size]
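All three scenarios follow the same size rule: with no zero padding, the output size per spatial dimension is (input - filter) / stride + 1. A quick check (the helper function name is my own):

```python
# Activation map size without zero padding: (input - filter) // stride + 1.
def activation_map_size(input_size, filter_size, stride):
    return (input_size - filter_size) // stride + 1

print(activation_map_size(32, 3, 2))  # Scenario 1: 15
print(activation_map_size(32, 3, 1))  # Scenario 2: 30
```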


Network Model Creation

will be updated...

Training the network

will be updated...

Find loss & Accuracy

will be updated...

Prediction

will be updated...
