How can TensorFlow be used to pre-process the flower training dataset?
The flower dataset in TensorFlow's large catalog of datasets is a collection of several thousand flower photographs. The images are divided into five classes:
- Daisy
- Dandelion
- Roses
- Sunflowers
- Tulips
In this article, we will study how to preprocess this dataset. The same steps can be applied to other large image datasets. Before that, let’s look at what preprocessing is and why it is needed.
Preprocessing and its need
A Convolutional Neural Network (CNN) can be trained on raw images, but accuracy usually suffers when the inputs do not have the size and value range the model expects. Preprocessing the images on the fly during training also takes longer than training on images that were preprocessed beforehand.
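Preprocessing converts each image into the form the neural network expects. In this article, that means resizing every image to a fixed resolution and normalizing its pixel values with the preprocess_input function, which in its default mode reorders the channels from RGB to BGR and subtracts the ImageNet mean from each channel. Other pipelines go further, for example by cropping away the background or reducing noise so that only the important parts of the image remain, but those steps are not covered here.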
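The arithmetic behind this normalization can be sketched in plain NumPy. The helper name caffe_style_preprocess is hypothetical and only imitates what preprocess_input from imagenet_utils does in its default "caffe" mode; it is not the library's own code.

```python
import numpy as np

# ImageNet channel means in BGR order, as used by the default "caffe" mode.
IMAGENET_BGR_MEAN = np.array([103.939, 116.779, 123.68], dtype=np.float32)


def caffe_style_preprocess(rgb_batch):
    """Hypothetical helper: imitate the default preprocess_input arithmetic."""
    x = np.asarray(rgb_batch, dtype=np.float32)
    x = x[..., ::-1]              # reorder the last axis from RGB to BGR
    return x - IMAGENET_BGR_MEAN  # zero-centre each channel, no rescaling


# A single pixel whose RGB values equal the ImageNet means maps to zeros.
pixel = np.array([[[[123.68, 116.779, 103.939]]]], dtype=np.float32)
result = caffe_style_preprocess(pixel)
print(result.shape)
print(np.allclose(result, 0.0, atol=1e-4))
```

Note that the result is no longer in the 0 to 255 range: values can be negative, which matters when the preprocessed array is displayed as an image.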
Preprocessing the Image data in the Flower Dataset
The image files are first located with the glob function. Each image is then loaded with the keras.utils.load_img function, which also resizes it to 224 × 224 pixels, a common input size for ImageNet-pretrained CNNs such as VGG16 and ResNet50. The loaded image is converted into a NumPy array of shape (224, 224, 3), and np.expand_dims adds a leading batch axis so that the shape becomes (1, 224, 224, 3). This array is finally passed to the preprocess_input function.
Let’s start with the code.
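The array shapes in this pipeline can be checked without downloading anything. The sketch below uses an array of zeros as a stand-in for a loaded photo (an assumption made purely for illustration) and shows the effect of adding the batch axis.

```python
import numpy as np

# Stand-in for one loaded photo: height 224, width 224, 3 colour channels.
fake_image = np.zeros((224, 224, 3), dtype=np.float32)

# Add a leading batch axis, the step performed before preprocess_input.
batch = np.expand_dims(fake_image, axis=0)

print(fake_image.ndim, fake_image.shape)
print(batch.ndim, batch.shape)
```

The batch axis is what lets the same array layout hold one image or many, since Keras models expect input of the form (batch, height, width, channels).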
Importing Libraries
First, we import the libraries used in this article:
- NumPy – provides fast array operations; here it adds the batch axis to the image array.
- Matplotlib – used to display the images.
- PIL (Pillow) – used to open the original image file.
- glob and pathlib – used to locate the image files on disk.
- TensorFlow – used to download the dataset and to load and preprocess the images.
Python3
# Importing required modules
import glob
import pathlib

import numpy as np
import PIL.Image
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.keras.applications.imagenet_utils import preprocess_input
Downloading Dataset
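The dataset archive is downloaded and extracted with the Keras utility tf.keras.utils.get_file, which caches it under ~/.keras/datasets by default.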
Python3
# Google Cloud Storage URL of the dataset archive
dataset_url = "https://storage.googleapis.com/download.tensorflow.org/example_images/flower_photos.tgz"

# Download the archive (cached after the first run) and extract it
data_dir = tf.keras.utils.get_file('flower_photos', origin=dataset_url, untar=True)
data_dir = pathlib.Path(data_dir)
print(data_dir)
print("Flower dataset has been downloaded")
Output:
/root/.keras/datasets/flower_photos
Flower dataset has been downloaded
Counting images in directory
Even before preprocessing, we can count the images in the dataset with the glob module and collect the file paths of one class to explore.
Python3
# Counting the images and listing the files of one class
print("Total number of flower images in the database are:")
print(len(glob.glob(str(data_dir) + '/*/*.jpg')))
print("Exploring the flower images inside the dataset:")
print("A Dandelion")
Dandelion = glob.glob(str(data_dir) + '/dandelion/*')
Output:
Total number of flower images in the database are:
3670
Exploring the flower images inside the dataset:
A Dandelion
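Beyond the overall total, it is often useful to know how many images each class contains. The following is a minimal sketch using pathlib; the helper name count_images_per_class is hypothetical, and the demo runs on a temporary directory that only imitates the one-folder-per-class layout rather than on the real dataset.

```python
import pathlib
import tempfile


def count_images_per_class(root):
    """Hypothetical helper: map each class folder name to its number of .jpg files."""
    root = pathlib.Path(root)
    return {
        class_dir.name: len(list(class_dir.glob("*.jpg")))
        for class_dir in sorted(root.iterdir())
        if class_dir.is_dir()  # skip stray files that sit next to the class folders
    }


# Self-contained demo on a throwaway directory with two made-up class folders.
with tempfile.TemporaryDirectory() as tmp:
    demo_root = pathlib.Path(tmp)
    for name, n_files in [("daisy", 2), ("tulips", 3)]:
        (demo_root / name).mkdir()
        for i in range(n_files):
            (demo_root / name / f"{i}.jpg").touch()
    (demo_root / "notes.txt").touch()  # non-directory entry, ignored by the helper
    demo_counts = count_images_per_class(demo_root)

print(demo_counts)
```

With the real dataset, calling count_images_per_class(data_dir) should return one entry per flower class, which makes class imbalance easy to spot before training.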
Preprocessing the Image
In the final step, we preprocess the image using the preprocess_input() function.
Python3
# Preprocessing one of the dandelion images listed above
img_path = str(Dandelion[8])
img = tf.keras.utils.load_img(img_path, target_size=(224, 224))
x = tf.keras.utils.img_to_array(img)  # shape (224, 224, 3)
x = np.expand_dims(x, axis=0)         # shape (1, 224, 224, 3)
x = preprocess_input(x)

# Displaying the preprocessed and original image
# (imshow clips float values outside [0, 1], so the preprocessed image looks distorted)
plt.imshow(x[0, :, :, :3])
plt.show()
PIL.Image.open(Dandelion[8])
Output:
The loaded image is resized to 224 × 224 pixels and converted into a NumPy array, which is then passed to the preprocess_input function from the keras.applications.imagenet_utils module. The preprocessed image is displayed with the plt.imshow function from the matplotlib.pyplot library, and the original image is opened with PIL for comparison.
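Because the preprocessed array is zero-centred and in BGR order, it does not display as a natural-looking picture. To view it in its original colours, the transformation can be reversed. The sketch below assumes preprocess_input was called in its default "caffe" mode; the helper name undo_caffe_preprocess is hypothetical and is not part of Keras.

```python
import numpy as np

# ImageNet channel means in BGR order, as subtracted by the default "caffe" mode.
IMAGENET_BGR_MEAN = np.array([103.939, 116.779, 123.68], dtype=np.float32)


def undo_caffe_preprocess(x):
    """Hypothetical helper: turn a caffe-mode preprocessed array back into RGB uint8."""
    restored = np.asarray(x, dtype=np.float32) + IMAGENET_BGR_MEAN  # add the means back
    restored = restored[..., ::-1]                                   # BGR -> RGB
    return np.clip(np.rint(restored), 0, 255).astype(np.uint8)


# Round-trip check on random pixel values standing in for a real photo.
rng = np.random.default_rng(0)
original = rng.integers(0, 256, size=(4, 4, 3)).astype(np.float32)
preprocessed = original[..., ::-1] - IMAGENET_BGR_MEAN
roundtrip = undo_caffe_preprocess(preprocessed)
print(np.array_equal(roundtrip, original.astype(np.uint8)))
```

In the article's pipeline, plt.imshow(undo_caffe_preprocess(x[0])) would then show the resized image with its original colours.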