Deep learning (also known as deep structured learning or hierarchical learning) is part of a broader family of machine learning methods based on learning data representations, as opposed to task-specific algorithms. Learning can be supervised, semi-supervised or unsupervised.
Deep learning architectures such as deep neural networks, deep belief networks, convolutional neural network and recurrent neural networks have been applied to fields including computer vision, speech recognition, natural language processing,image processing, audio recognition, social network filtering, machine translation, bioinformatics, drug design and board game programs, where they have produced results comparable to and in some cases superior to human experts. Deep learning models are vaguely inspired by information processing and communication patterns in biological nervous systems yet have various differences from the structural and functional properties of biological brains (especially human brains), which make them incompatible with neuroscience evidences. Because of the rapid advancement in computer and the development of sophisticated algorthm, computer can compute and predict somthing samrter than ever before. When for the first time competion was released for image processing it was thought that more than 50% accuracy will not be possible. But beacuse of some breakthrough algorithms such as CNN, RNN and evolving of deep learning image recognition and image processing drastically increased the prediction. And today’s coputer vision can predict with 99.0% of accuracy.
2.0 Project Objectives
- To observe students dress up among the IIUM community
- To collect pictures of students
- To learn about how to collect data
- T0 learn about data preprocessing and data augmentation
- To apply deep learning for imgage classification
- To build deep learning model and evaluate it.
- To learn how to use deep learning libray to solve real world problem such as image classification
- To dive deep into deep learning field and gather knowledge.
- Obtain practical experience with machine learning
3.0 Expected output/ Result
From our collected data which are images of students, we will classify it into two parts. One classification is to put together all images of students whose dress up are align with university dresscode. The images of students whose dresscode are not align with the university dresscode go to the category of another classification. Our goal is to classify it using convolutional nueral retwork. Once the model is built we will evaluate it using validation set of our data. As we use binary classification in our model, our model will return in range between 0 and 1 as output whereas the greater score represent that the student is well dressed up.
4.0 Gathering Knowledge
To Create the our model, we have used keras deep learning library. We have used most of the effective features of keras for image classification. For better understading of the keras features, such as models, layer, Activation functions, we have picked up some of the explanation from the keras documentation which goes as follows.
4.1 What is Keras?
Keras is a high-level neural networks API, written in Python and capable of running on top of TensorFlow, CNTK, or Theano. It was developed with a focus on enabling fast experimentation. Being able to go from idea to result with the least possible delay is key to doing good research.
4.2 Keras Sequential model
The core data structure of Keras is a model, a way to organize layers. The simplest type of model is the Sequential model, a linear stack of layers
4.3 Keras Layer
keras.layers.Conv2D(filters, kernel_size, strides=(1, 1), padding=’valid’, data_format=None, dilation_rate=(1, 1), activation=None, use_bias=True, kernel_initializer=’glorot_uniform’, bias_initializer=’zeros’, kernel_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, bias_constraint=None) 2D convolution layer (e.g. spatial convolution over images).
This layer creates a convolution kernel that is convolved with the layer input to produce a tensor of outputs. If use_bias is True, a bias vector is created and added to the outputs. Finally, if activation is not None, it is applied to the outputs as well.
When using this layer as the first layer in a model, provide the keyword argument input_shape (tuple of integers, does not include the batch axis), e.g. input_shape=(128, 128, 3) for 128x128 RGB pictures in data_format=”channels_last”.
ReLU keras.layers.ReLU(max_value=None, negative_slope=0.0, threshold=0.0) Rectified Linear Unit activation function.
With default values, it returns element-wise max(x, 0).
Otherwise, it follows: f(x) = max_value for x >= max_value, f(x) = x for threshold <= x < max_value, f(x) = negative_slope * (x - threshold) otherwise.
Arbitrary. Use the keyword argument input_shape (tuple of integers, does not include the samples axis) when using this layer as the first layer in a model.
Same shape as the input.
MaxPooling2D keras.layers.MaxPooling2D(pool_size=(2, 2), strides=None, padding=’valid’, data_format=None) Max pooling operation for spatial data. Input shape
If data_format=’channels_last’: 4D tensor with shape: (batch_size, rows, cols, channels) If data_format=’channels_first’: 4D tensor with shape: (batch_size, channels, rows, cols) Output shape
If data_format=’channels_last’: 4D tensor with shape: (batch_size, pooled_rows, pooled_cols, channels) If data_format=’channels_first’: 4D tensor with shape: (batch_size, channels, pooled_rows, pooled_cols)
5.0 Gooing Deep
5.1 Importing Modules
from keras.preprocessing.image import ImageDataGenerator from keras.models import Sequential from keras.layers import Conv2D, MaxPooling2D from keras.layers import Activation, Dropout, Flatten, Dense from keras import backend as K from keras.models import load_model import numpy as np from PIL import Image from keras.preprocessing import image from keras.utils import plot_model from IPython.display import SVG from keras.utils.vis_utils import model_to_dot import matplotlib.pyplot as plt import os os.environ["PATH"] += os.pathsep + 'C:/Program Files (x86)/Graphviz2.38/bin/'
5.2 Data pre-processing and data augmentation
In order to make the most of our few training examples, we will “augment” them via a number of random transformations, so that our model would never see twice the exact same picture. This helps prevent overfitting and helps the model generalize better.
In Keras this can be done via the keras.preprocessing.image.ImageDataGenerator class.
datagen = ImageDataGenerator( rotation_range=40, width_shift_range=0.2, height_shift_range=0.2, rescale=1./255, shear_range=0.2, zoom_range=0.2, horizontal_flip=True, fill_mode='nearest')
# dimensions of our images img_width, img_height = 150, 150 train_data_dir = 'data/train' validation_data_dir = 'data/validation' nb_train_samples = 65 nb_validation_samples = 12 epochs = 50 batch_size = 16
if K.image_data_format() == 'channels_first': input_shape = (3, img_width, img_height) else: input_shape = (img_width, img_height, 3)
5.3 Training a small convnet
In our case we will use a very small convnet with few layers and few filters per layer, alongside data augmentation and dropout. Dropout also helps reduce overfitting, by preventing a layer from seeing twice the exact same pattern, thus acting in a way analoguous to data augmentation (you could say that both dropout and data augmentation tend to disrupt random correlations occuring in your data).
The code snippet below is our first model, a simple stack of 3 convolution layers with a ReLU activation and followed by max-pooling layers.
model = Sequential() model.add(Conv2D(32, (3, 3), input_shape=input_shape)) model.add(Activation('relu')) model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(32, (3, 3))) model.add(Activation('relu')) model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, (3, 3))) model.add(Activation('relu')) model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten()) model.add(Dense(64)) model.add(Activation('relu')) model.add(Dropout(0.5)) model.add(Dense(1)) model.add(Activation('sigmoid')) model.compile(loss='binary_crossentropy', optimizer='rmsprop', metrics=['accuracy'])
train_datagen = ImageDataGenerator( rescale=1. / 255, shear_range=0.2, zoom_range=0.2, horizontal_flip=True)
test_datagen = ImageDataGenerator(rescale=1. / 255)
train_generator = train_datagen.flow_from_directory( train_data_dir, target_size=(img_width, img_height), batch_size=batch_size, class_mode='binary')
Found 130 images belonging to 2 classes.
validation_generator = test_datagen.flow_from_directory( validation_data_dir, target_size=(img_width, img_height), batch_size=batch_size, class_mode='binary')
Found 24 images belonging to 2 classes.
model.fit_generator( train_generator, steps_per_epoch=nb_train_samples, epochs=epochs, validation_data=validation_generator, validation_steps=nb_validation_samples) model.save('dressClassifier2.h5')
Epoch 1/50 65/65 [==============================] - 194s 3s/step - loss: 0.6203 - acc: 0.7259 - val_loss: 0.3312 - val_acc: 0.8750 ..... Epoch 48/50 65/65 [==============================] - 103s 2s/step - loss: 0.0021 - acc: 0.9990 - val_loss: 4.1991 - val_acc: 0.6667 Epoch 49/50 65/65 [==============================] - 102s 2s/step - loss: 0.0049 - acc: 0.9971 - val_loss: 3.4218 - val_acc: 0.7500 Epoch 50/50 65/65 [==============================] - 95s 1s/step - loss: 1.4678e-05 - acc: 1.0000 - val_loss: 3.3415 - val_acc: 0.7917
Now that our model has trained and we can see the accuracy of our model with our validation data set. We can test our model anywhere we want. As our model will be not go in production level. So we test it locally. In order to test it need to load the model. At the same time we need to preprocessing our images before we can eventually test it.
#laoding model dressPredictor = load_model('dressClassifier2.h5')
test_image = image.load_img('data/test/t5.png', target_size = (150, 150)) test_image
test_image = image.img_to_array(test_image) test_image = np.expand_dims(test_image, axis = 0) result = dressPredictor.predict(test_image) print(result) if result == 1: prediction = 'good_Attire' else: prediction = 'bad_Attire' print('Flag: ',prediction)
[[0.]] Flag: bad_Attire
7.0 Model Visualization
_________________________________________________________________ Layer (type) Output Shape Param # ================================================================= conv2d_1 (Conv2D) (None, 148, 148, 32) 896 _________________________________________________________________ activation_1 (Activation) (None, 148, 148, 32) 0 _________________________________________________________________ max_pooling2d_1 (MaxPooling2 (None, 74, 74, 32) 0 _________________________________________________________________ conv2d_2 (Conv2D) (None, 72, 72, 32) 9248 _________________________________________________________________ activation_2 (Activation) (None, 72, 72, 32) 0 _________________________________________________________________ max_pooling2d_2 (MaxPooling2 (None, 36, 36, 32) 0 _________________________________________________________________ conv2d_3 (Conv2D) (None, 34, 34, 64) 18496 _________________________________________________________________ activation_3 (Activation) (None, 34, 34, 64) 0 _________________________________________________________________ max_pooling2d_3 (MaxPooling2 (None, 17, 17, 64) 0 _________________________________________________________________ flatten_1 (Flatten) (None, 18496) 0 _________________________________________________________________ dense_1 (Dense) (None, 64) 1183808 _________________________________________________________________ activation_4 (Activation) (None, 64) 0 _________________________________________________________________ dropout_1 (Dropout) (None, 64) 0 _________________________________________________________________ dense_2 (Dense) (None, 1) 65 _________________________________________________________________ activation_5 (Activation) (None, 1) 0 ================================================================= Total params: 1,212,513 Trainable params: 1,212,513 Non-trainable params: 0 _________________________________________________________________