In this post, I will try to give you an idea of Convolutional Neural Networks (CNNs) and why they are important in image classification. I will use my own image data set to train and test the CNN. If you want an easier start, you can use the MNIST or CIFAR10 data sets; there are already several tutorials online that use them.
Basic Data Sets for Testing
- MNIST Database: http://yann.lecun.com/exdb/mnist/
- CIFAR10: https://www.cs.toronto.edu/~kriz/cifar.html
Why CNN in Image Classification
- In machine learning, it is important that the training data set covers different variations and orientations of the images. This is easy to do when training a CNN by using image augmentation.
![Picture with rotation of 0 degrees]()
![Picture with rotation of 30 degrees]()
![Picture with rotation of 60 degrees]()
- It is easy to design the layers in a pattern so that each layer performs its own feature extraction independently.
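The image augmentation mentioned in the first point can be sketched in plain Python. This is only an illustration, not the method used later in this post: the helper names `rotate_image` and `augment_with_rotations` are hypothetical, the image is assumed to be a 2D list of pixel values, and nearest-neighbour sampling is an assumption chosen to keep the code short. In a real project you would more likely rely on the augmentation utilities of your deep learning library.

```python
import math

def rotate_image(image, angle_degrees, fill=0):
    """Rotate a 2D image (list of rows) about its centre.

    Uses nearest-neighbour sampling; pixels that fall outside the
    original image are set to `fill`.
    """
    height, width = len(image), len(image[0])
    cy, cx = (height - 1) / 2.0, (width - 1) / 2.0
    theta = math.radians(angle_degrees)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    rotated = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Map every output pixel back to the source pixel it comes from.
            dx, dy = x - cx, y - cy
            src_x = int(round(cos_t * dx + sin_t * dy + cx))
            src_y = int(round(-sin_t * dx + cos_t * dy + cy))
            if 0 <= src_x < width and 0 <= src_y < height:
                rotated[y][x] = image[src_y][src_x]
    return rotated

def augment_with_rotations(image, angles=(0, 30, 60)):
    """Return one rotated copy of the image per angle."""
    return [rotate_image(image, angle) for angle in angles]

# A tiny 3x3 "image" stands in for a real picture.
image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
variants = augment_with_rotations(image)  # copies at 0, 30 and 60 degrees
```

Each copy keeps the size of the original image, so all variants can be added to the training set with the same label.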
Convolutional Neural Network (CNN)
A CNN is a type of feed-forward neural network. It has two parts:
- The first part is the feature extractor, which consists of
  - Convolution layers
  - Pooling layers
- The second part consists of fully connected layers, which act as the classifier
![Convolutional Neural Network]()
Convolution layer
Put simply, convolution can be thought of as a weighted sum over two matrices.
- The first matrix comes from the image data set
- The second matrix is called the kernel matrix, or filter
In image classification, to compute the convolution at a particular location, we extract a patch from the image that has the same size as the kernel.
Note: Other mathematical operations can be used depending on the scenario. For simplicity, I use multiplication here. The output matrix is also called an activation map.
ImageMatrix * KernelMatrix/filter = OutputMatrix
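The weighted sum at a single location can be sketched as follows. This is a minimal illustration under my own assumptions: the helper names `extract_patch` and `weighted_sum`, the 4x4 image, and the kernel values are all made up for the example, and the image is stored as a plain list of rows.

```python
def extract_patch(image, top, left, size):
    """Cut a size x size patch whose top-left corner is at (top, left)."""
    return [row[left:left + size] for row in image[top:top + size]]

def weighted_sum(patch, kernel):
    """Multiply matching entries of patch and kernel, then add the products."""
    total = 0
    for patch_row, kernel_row in zip(patch, kernel):
        for pixel, weight in zip(patch_row, kernel_row):
            total += pixel * weight
    return total

# Hypothetical 4x4 image and 3x3 kernel.
image = [
    [1, 2, 3, 0],
    [4, 5, 6, 1],
    [7, 8, 9, 2],
    [1, 0, 1, 3],
]
kernel = [
    [1, 0, -1],
    [1, 0, -1],
    [1, 0, -1],
]

patch = extract_patch(image, top=0, left=0, size=3)
value = weighted_sum(patch, kernel)
print(value)  # -6
```

The single number `value` is one entry of the output matrix; repeating the same step at every location fills in the whole activation map.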
Stride
The kernel can be moved across the image matrix in steps of 1, 2, 3, or more pixels. This step size is called the stride. In the example above, I slid the kernel over the image matrix 2 pixels at a time.
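To make the effect of the stride concrete, here is a small sketch that slides a kernel over a whole image. The function name `convolve2d`, the 5x5 image, and the kernel of ones are assumptions made for the illustration; the sketch also assumes a square kernel and no padding.

```python
def convolve2d(image, kernel, stride=1):
    """Slide the kernel over the image in steps of `stride` (no padding)."""
    k = len(kernel)  # assumes a square k x k kernel
    height, width = len(image), len(image[0])
    output = []
    for top in range(0, height - k + 1, stride):
        row = []
        for left in range(0, width - k + 1, stride):
            # Weighted sum of the patch under the kernel at this position.
            total = 0
            for i in range(k):
                for j in range(k):
                    total += image[top + i][left + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output

# 5x5 image holding the values 0..24, and a 3x3 kernel of ones.
image = [[r * 5 + c for c in range(5)] for r in range(5)]
kernel = [[1] * 3 for _ in range(3)]

stride_1 = convolve2d(image, kernel, stride=1)  # 3x3 activation map
stride_2 = convolve2d(image, kernel, stride=2)  # 2x2 activation map
print(stride_2)  # [[54, 72], [144, 162]]
```

A larger stride visits fewer positions, so the activation map gets smaller.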
Multiple Filters
You can use multiple filters (kernel matrices), and each filter produces its own activation map. In this example, I used only one filter, so only one output matrix (activation map) was created.
Activation Maps/Output
For 2D Image
Scenario 1
Input Image = 32x32
Filter = 3x3
Stride = 2
output/activation maps = 15x15 [Output Matrix size < Input Matrix Size]
Scenario 2
Input Image = 32x32
Filter = 3x3
Stride = 1
output/activation maps = 30x30 [Output Matrix size < Input Matrix Size]
For 3D Image
Scenario 1
Input Image = 32x32x3
Filter = 3x3x3
Stride = 1
output/activation maps = 30x30x1 [Output Matrix size < Input Matrix Size]
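The sizes in the scenarios above follow from a simple rule when no padding is used: output size = floor((input size - filter size) / stride) + 1. The sketch below checks the three scenarios with it. The function names `conv_output_size` and `conv_output_shape` are hypothetical, and the sketch assumes square inputs and filters, and a filter depth equal to the input depth, so that each filter contributes exactly one activation map.

```python
def conv_output_size(input_size, filter_size, stride=1):
    """Side length of the activation map for a square input, without padding."""
    return (input_size - filter_size) // stride + 1

def conv_output_shape(input_size, filter_size, stride=1, num_filters=1):
    """(height, width, depth) of the output; depth equals the number of filters."""
    side = conv_output_size(input_size, filter_size, stride)
    return (side, side, num_filters)

print(conv_output_size(32, 3, stride=2))   # 15 -> 15x15 (2D, scenario 1)
print(conv_output_size(32, 3, stride=1))   # 30 -> 30x30 (2D, scenario 2)
print(conv_output_shape(32, 3, stride=1))  # (30, 30, 1) (3D, scenario 1)
```

With more than one filter, only the last number changes: for example, `conv_output_shape(32, 3, stride=1, num_filters=8)` gives `(30, 30, 8)`.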
Network Model Creation
will be updated...
Training the network
will be updated...
Find loss & Accuracy
will be updated...
Prediction
will be updated...




