Object Detection with TensorFlow Object Detection API
Object detection is a computer vision technique for identifying and locating objects in an image or video. Given images annotated with the object classes of interest, we can train a deep learning model that identifies those classes in completely new images.
The field of object detection and classification has advanced rapidly. In this article we will look at a simple way to use the TensorFlow Object Detection (TFOD) API for an object detection task.
We will set up the complete API on Google Colab. Colab runs on a hosted Linux machine, which makes the TFOD API steps easy to follow there. When running a notebook, we can select a GPU runtime, which trains the model much faster than a conventional CPU-only system [ Note: a free Colab account lets you use the GPU option for 12 hours only, so for more compute time you can switch to Colab Pro at any time ]. Let's move on to the installation without wasting much time.
Installing/upgrading the required libraries and dependencies
To start the TFOD installation, open a fresh notebook and select GPU as the runtime type from the Runtime menu. A GPU will be assigned to your notebook once you click the Connect button.
You can verify this with the command below:
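One way to run this check from a Python cell is sketched below. It assumes the `nvidia-smi` tool is on the PATH, which is normally the case on a Colab GPU runtime; the helper name `gpu_summary` is made up for this illustration.

```python
import shutil
import subprocess


def gpu_summary():
    """Return the nvidia-smi report, or None when no NVIDIA tool is on PATH."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return None
    result = subprocess.run([exe], capture_output=True, text=True, check=False)
    # A non-zero exit code usually means the driver is not reachable.
    return result.stdout if result.returncode == 0 else None


summary = gpu_summary()
print(summary if summary else "No GPU detected; check Runtime > Change runtime type.")
```

Once TensorFlow is imported, `tf.config.list_physical_devices("GPU")` answers the same question from inside TensorFlow.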
Once you are sure about the GPU, we can start installing the required packages for TFOD in the following order:
Cloning the TFOD GitHub Repository
With the command below, we can clone the GitHub repository of the TensorFlow Object Detection API.
This creates a folder named models in the current path, as shown below.
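The clone step can be sketched from Python as follows. The repository URL is the public TensorFlow models repo; the shallow `--depth 1` clone and the helper names are choices made for this sketch, not necessarily what the original notebook used.

```python
import subprocess
from pathlib import Path

REPO_URL = "https://github.com/tensorflow/models.git"


def clone_command(depth=1):
    """Build the git command for a shallow clone of the TensorFlow models repo."""
    return ["git", "clone", "--depth", str(depth), REPO_URL]


def clone_if_missing(target=Path("models")):
    """Clone only when the target folder does not exist yet; return True if cloned."""
    if target.exists():
        return False
    subprocess.run(clone_command() + [str(target)], check=True)
    return True


# Printing keeps this cell side-effect free; call clone_if_missing() to run it.
print(" ".join(clone_command()))
```

In a notebook, the printed command can also be run directly in a cell by prefixing it with `!`.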
Once that has finished, we can install protoc. The TFOD API uses protobufs to configure models and training parameters, so protoc must be installed in the current environment before the framework can be used.
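As a rough sketch of the compile step, the helper below builds the `protoc` call for every `.proto` file under `object_detection/protos`. It assumes the repository was cloned into `models`, so the working directory is `models/research`; the temporary folder and the single `anchor.proto` file only stand in for the real layout so the example can run anywhere.

```python
import tempfile
from pathlib import Path


def protoc_command(research_dir):
    """Build the protoc call that compiles the TFOD .proto files to Python."""
    research_dir = Path(research_dir)
    protos = sorted(research_dir.glob("object_detection/protos/*.proto"))
    # protoc is run from inside research_dir, so the paths are made relative.
    relative = [p.relative_to(research_dir).as_posix() for p in protos]
    return ["protoc", *relative, "--python_out=."]


# Stand-in layout for demonstration; a real run would pass "models/research".
with tempfile.TemporaryDirectory() as tmp:
    protos_dir = Path(tmp, "object_detection", "protos")
    protos_dir.mkdir(parents=True)
    (protos_dir / "anchor.proto").touch()
    demo_cmd = protoc_command(tmp)

print(demo_cmd)
```

For the real run, something like `subprocess.run(protoc_command("models/research"), cwd="models/research", check=True)` would execute it, after which the API package itself is installed from the same folder.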
Installation of COCO API
For TensorFlow 2.x, the TF Object Detection API depends on pycocotools. To satisfy this requirement, we can clone the git repository using the command below.
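Whichever way pycocotools gets installed, a quick check like the one below confirms that the module is importable in the current environment. The helper name `is_installed` is made up for this sketch.

```python
import importlib.util


def is_installed(module_name):
    """Return True when a top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None


has_coco = is_installed("pycocotools")
print("pycocotools available:", has_coco)
```

If this prints False, installing the `pycocotools` package from PyPI with pip is a common alternative to building it from a clone.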
Once you have completed the steps above, which covered:
- TensorFlow for GPU
- Cloning the TensorFlow Object Detection Git repository
- Protobuf installation
- COCO API cloning
we can test the installation using the command below:
Once the execution has finished, the cell output should look something like this.
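The TFOD repository ships a model builder test that is commonly used for this check. The sketch below only builds the command; it assumes the repository sits in `models` relative to the notebook, and the helper name is hypothetical.

```python
import sys
from pathlib import Path


def model_builder_test_command(research_dir="models/research"):
    """Build the command that runs the TF2 model builder test script."""
    test_file = Path(research_dir, "object_detection", "builders", "model_builder_tf2_test.py")
    # sys.executable keeps the test on the same interpreter as the notebook.
    return [sys.executable, test_file.as_posix()]


test_cmd = model_builder_test_command()
print(" ".join(test_cmd))
```

Running it with `subprocess.run(test_cmd, check=True)` raises an error if any of the tests fail.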
Creating folder structure for API
Now we will create a folder structure to hold the base annotation file, the training and testing images, and the Python scripts used in later steps.
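A minimal sketch of creating such a workspace is shown below. The `training_demo`, `annotations`, `models` and `pre-trained-models` names appear in this article; the `images/train` and `images/test` split is an assumption about how the images are organised. The demo runs inside a temporary directory so it leaves nothing behind.

```python
import tempfile
from pathlib import Path

# Sub-folders of the workspace; images/train and images/test are assumed names.
SUBFOLDERS = [
    "annotations",
    "images/train",
    "images/test",
    "models",
    "pre-trained-models",
]


def create_workspace(root):
    """Create the workspace folders and return every directory that now exists."""
    root = Path(root)
    for sub in SUBFOLDERS:
        (root / sub).mkdir(parents=True, exist_ok=True)
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_dir())


with tempfile.TemporaryDirectory() as tmp:
    created = create_workspace(Path(tmp) / "training_demo")

print(created)
```

On Colab you would call `create_workspace("training_demo")` directly so that the folders persist for the rest of the session.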
You can download the .py files from my git repo. The file annotations/label_map.pbtxt is available in the same repo. Make sure to edit it for your problem [ the file contains an id, name pair for each class in your use case ].
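To illustrate what the id, name pairs look like, here is a small sketch that generates the label map text. Only `mask_weared_incorrect` is named in this article; `with_mask` and `without_mask` are assumed names for the other two classes.

```python
def build_label_map(class_names):
    """Render class names as label_map.pbtxt text, with ids starting at 1."""
    entries = []
    for class_id, name in enumerate(class_names, start=1):
        # Id 0 is reserved for the background class, so numbering starts at 1.
        entries.append(f"item {{\n  id: {class_id}\n  name: '{name}'\n}}\n")
    return "\n".join(entries)


# Class names are illustrative; replace them with the labels in your annotations.
label_map_text = build_label_map(["with_mask", "without_mask", "mask_weared_incorrect"])
print(label_map_text)
```

The names must match the labels used in your annotation files exactly, including case.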
Now select the model best suited to your requirements. It can be chosen from this link, and a reference to it can be added in the cell below. For this run, I selected the “centernet_resnet101_v1_fpn” model.
Note: Once the model is loaded into the file system, you will see a folder named checkpoint inside the model folder. Rename this folder (I renamed it to Old_Checkpoint). When we run the cell that starts model training, the execution tries to create a file named checkpoint to store the model checkpoint. Because the path already contains a folder with that name, the cell raises an exception.
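The rename described in the note can be sketched as follows. The new folder name and the helper `set_aside_checkpoint` are illustrative, and the demo builds a throwaway model folder in a temporary directory rather than touching a real download.

```python
import tempfile
from pathlib import Path


def set_aside_checkpoint(model_dir, new_name="old_checkpoint"):
    """Rename the bundled checkpoint folder so training can create its own file."""
    model_dir = Path(model_dir)
    source = model_dir / "checkpoint"
    target = model_dir / new_name
    # Only rename when the folder exists and the new name is still free.
    if source.is_dir() and not target.exists():
        source.rename(target)
    return target


# Throwaway folder that mimics the layout of a downloaded model.
with tempfile.TemporaryDirectory() as tmp:
    model_dir = Path(tmp) / "pretrained_model"
    (model_dir / "checkpoint").mkdir(parents=True)
    (model_dir / "checkpoint" / "ckpt-0.index").touch()
    renamed = set_aside_checkpoint(model_dir)
    old_name_free = not (model_dir / "checkpoint").exists()
    files_moved = (renamed / "ckpt-0.index").exists()

print(renamed.name, old_name_free, files_moved)
```

Whatever name you choose here is the one that must appear in the fine_tune_checkpoint path later on.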
Once the model is in place, make sure you are in the training_demo directory and run generate_tfrecord.py with the input parameters below.
This creates train.record and test.record in the training_demo/annotations folder.
After all files have been uploaded, the folder structure should look like this:
Changes in the Default Model Configuration
Before we proceed to training, we need to make some changes to the model configuration file. But first, let's copy the file from the main model folder to a separate location under the folder named models.
Now change the following, in order:
- num_classes : the number of classes in your dataset
- batch_size : the number of images processed in each training step
- num_steps : the total number of training steps (iterations). Each step processes one batch, so this is not the same as a number of epochs.
- total_steps, warmup_steps : these parameters belong to the learning rate schedule in the optimizer section. Keep total_steps equal to num_steps, and keep warmup_steps less than or equal to total_steps. During the first warmup_steps iterations the learning rate ramps up from its warmup value to the base learning rate; with warmup_steps set to 100, for example, the ramp covers the first 100 iterations.
- fine_tune_checkpoint : in the train_config section you will find fine_tune_checkpoint and fine_tune_checkpoint_type. Set fine_tune_checkpoint to the checkpoint path of the pre-trained model (in my case, I copied it from ~/training_demo/pre-trained-models/centernet_resnet101_v1_fpn_512x512_coco17_tpu-8/old_checkpoint/ckpt-0, removing the .index extension).
- fine_tune_checkpoint_type : set this to “detection” when fine-tuning from a pre-trained object detection checkpoint, as we do here. The value “classification” is meant for checkpoints that come from a classification model.
- label_map_path : copy the path of ~/training_demo/annotations/label_map.pbtxt into the label_map_path parameter of both train_input_reader and eval_input_reader.
- input_path for train/test.record : copy the path of ~/training_demo/annotations/train.record into the input_path parameter of train_input_reader, and the path of test.record into the input_path parameter of eval_input_reader.
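The edits listed above can be made by hand in a text editor, or scripted. The sketch below does a plain text substitution on a shortened, made-up stand-in for pipeline.config; the `set_field` helper and the values 3, 8 and 5000 are illustrative only.

```python
import re

# Shortened stand-in for a real pipeline.config, for demonstration only.
sample_config = '''model {
  center_net {
    num_classes: 90
  }
}
train_config {
  batch_size: 128
  num_steps: 140000
  fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED"
  fine_tune_checkpoint_type: "classification"
}
'''


def set_field(config_text, field, value):
    """Replace every `field: <old value>` occurrence with the new value."""
    rendered = f'"{value}"' if isinstance(value, str) else str(value)
    pattern = re.compile(rf'(\b{re.escape(field)}:\s*)("[^"]*"|\S+)')
    # A function replacement avoids backslash surprises in path values.
    return pattern.sub(lambda m: m.group(1) + rendered, config_text)


updated = sample_config
for field, value in {
    "num_classes": 3,
    "batch_size": 8,
    "num_steps": 5000,
    "fine_tune_checkpoint_type": "detection",
}.items():
    updated = set_field(updated, field, value)

print(updated)
```

Because the helper replaces every occurrence of a field, it suits label_map_path, which takes the same value in both readers, but not input_path, which needs a different value in train_input_reader and eval_input_reader.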
With that, the setup for training with the Object Detection API is complete, and you are a few steps away from training the model. But wait…!!! Let's have a look at the errors that I encountered.
Avoiding Exceptions During Training
I observed the errors below after I started the model training.
1. OpenCV package version mismatch
You may encounter the error (ImportError: cannot import name ‘_registerMatType’ from ‘cv2.cv2’) if the installed OpenCV packages have mismatched versions. You can check for this and fix it as follows:
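One way to do the check is to compare the installed versions of the usual OpenCV distributions, as sketched here. The package list covers the common PyPI names, and the helper names are made up for this example.

```python
from importlib import metadata

# PyPI distributions that all provide the cv2 module.
OPENCV_PACKAGES = [
    "opencv-python",
    "opencv-python-headless",
    "opencv-contrib-python",
    "opencv-contrib-python-headless",
]


def installed_opencv_versions():
    """Map each installed OpenCV distribution to its version string."""
    versions = {}
    for name in OPENCV_PACKAGES:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            continue  # This distribution is not installed.
    return versions


def versions_match(versions):
    """True when at most one distinct version is installed."""
    return len(set(versions.values())) <= 1


opencv_versions = installed_opencv_versions()
print(opencv_versions, "consistent:", versions_match(opencv_versions))
```

If the versions differ, reinstalling the packages pinned to one common version with pip is the usual fix.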
2. cuDNN Compatibility Error
The TensorFlow build I used was compiled against cuDNN 8.1.0. If the environment has not been updated, the exception message reports the current cuDNN version as 8.0.5 (Loaded runtime CuDNN library: 8.0.5 but source was compiled with: 8.1.0. CuDNN library needs to have matching major version and equal or higher minor version). The command below fixes the package discrepancy in no time.
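The rule stated in that message (same major version, equal or higher minor version) can be expressed in a few lines. This sketch only parses the message and applies the rule; the function names are made up, and it does not change any installed package.

```python
import re

MESSAGE = (
    "Loaded runtime CuDNN library: 8.0.5 but source was compiled with: 8.1.0. "
    "CuDNN library needs to have matching major version and equal or higher minor version"
)


def versions_from_message(message):
    """Extract the (loaded, compiled) version strings from the error text."""
    match = re.search(r"library: (\d+\.\d+\.\d+) but source was compiled with: (\d+\.\d+\.\d+)", message)
    return match.groups() if match else None


def cudnn_compatible(loaded, compiled):
    """Apply the rule: same major version, loaded minor >= compiled minor."""
    loaded_major, loaded_minor = (int(part) for part in loaded.split(".")[:2])
    compiled_major, compiled_minor = (int(part) for part in compiled.split(".")[:2])
    return loaded_major == compiled_major and loaded_minor >= compiled_minor


loaded, compiled = versions_from_message(MESSAGE)
print(loaded, compiled, cudnn_compatible(loaded, compiled))
```

Here the major versions match, but minor version 0 is lower than the required 1, which is why the runtime library has to be upgraded to at least 8.1.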
Model API Training
Finally…….!!!
After going through a lot of checks, it's time for us to hit that Run Cell button to start the training. Haha…!!!
But before that, one more check. Make sure you are in the ~/training_demo directory before starting the model training as below.
cloned_model_folder : the path of the model folder that you cloned.
path of pipeline.config : the path of the pipeline.config file in which we changed num_classes, batch_size, num_steps, etc.
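As a sketch of how these two inputs fit together, the helper below builds a call to `model_main_tf2.py`, the TF2 training script in the TFOD repository, with its `--model_dir` and `--pipeline_config_path` flags. It assumes the script has been copied into training_demo; the two example paths are hypothetical.

```python
import sys


def training_command(model_dir, pipeline_config_path, script="model_main_tf2.py"):
    """Build the command that starts TFOD training for one model folder."""
    return [
        sys.executable,
        script,
        f"--model_dir={model_dir}",
        f"--pipeline_config_path={pipeline_config_path}",
    ]


# Hypothetical paths; use your own model folder and edited pipeline.config.
train_cmd = training_command("models/my_centernet", "models/my_centernet/pipeline.config")
print(" ".join(train_cmd))
```

Passing `train_cmd` to `subprocess.run(train_cmd, check=True)` from the training_demo directory would start the run; the printed string can also be pasted into a notebook cell with a leading `!`.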
During training, you will see the step number and the corresponding losses in the cell output.
Here are some inference results from a model trained on a mask detection dataset using the TFOD API. The model returns a bounding box and a confidence score for each class it detects in the image.
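To show how boxes and confidence scores are typically consumed, here is a small sketch that keeps only detections above a score threshold. It assumes the model outputs have already been converted to plain Python lists; all values below are made up, and the 0.5 threshold is an arbitrary choice.

```python
def filter_detections(boxes, scores, classes, min_score=0.5):
    """Keep detections whose confidence score reaches the threshold."""
    kept = []
    for box, score, class_id in zip(boxes, scores, classes):
        if score >= min_score:
            kept.append({"box": box, "score": score, "class_id": int(class_id)})
    return kept


# Made-up outputs: boxes are normalized [ymin, xmin, ymax, xmax] coordinates.
demo_boxes = [[0.10, 0.20, 0.50, 0.60], [0.30, 0.30, 0.90, 0.80], [0.05, 0.05, 0.15, 0.15]]
demo_scores = [0.92, 0.41, 0.77]
demo_classes = [1, 3, 2]

kept = filter_detections(demo_boxes, demo_scores, demo_classes)
for detection in kept:
    print(detection)
```

Raising the threshold trades missed objects for fewer false positives, which is worth tuning per class when one class has fewer training images.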
The dataset contains comparatively fewer images of people wearing a mask incorrectly than of the other two classes, so results for the mask_weared_incorrect class were less accurate. Performance can be improved by using a similar number of images per class.
You can refer to the actual IPython notebook in my GitHub repo for more details.
Finally… If this article helped you set up the complete environment for the TensorFlow Object Detection API, do upvote this article.
Stay Safe. Happy Coding…!!!