Region Proposal in R-CNN family

R-CNN stands for Region-based Convolutional Neural Network. It is a family of machine learning models for computer vision, specifically object detection. Traditionally, object detection was done by sliding frames of different sizes over every grid position of an image to identify each object’s location and class. Applying a CNN to every frame took a very long time. R-CNN eased this problem: it uses Selective Search to pick candidate regions and then applies the CNN to each region proposal. However, it was still slow because the CNN was applied repeatedly to overlapping candidate regions.
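To get a feel for why exhaustive scanning is expensive, the following minimal sketch counts how many frames a sliding-window scan would hand to the CNN. The image size, window sizes, and stride are illustrative assumptions, not values from any particular detector; the proposal budget of about 2,000 regions per image is the figure commonly cited for R-CNN with Selective Search.

```python
def count_sliding_windows(img_w, img_h, window_sizes, stride):
    """Count the frames an exhaustive scan would pass to the CNN."""
    total = 0
    for win_w, win_h in window_sizes:
        if win_w > img_w or win_h > img_h:
            continue  # this window does not fit in the image
        positions_x = (img_w - win_w) // stride + 1
        positions_y = (img_h - win_h) // stride + 1
        total += positions_x * positions_y
    return total


# Hypothetical settings, chosen only for illustration.
WINDOW_SIZES = [(64, 64), (128, 128), (256, 256)]
PROPOSAL_BUDGET = 2000  # approximate number of Selective Search proposals in R-CNN

n_windows = count_sliding_windows(640, 480, WINDOW_SIZES, stride=8)
print(f"sliding windows: {n_windows}, region proposals: ~{PROPOSAL_BUDGET}")
```

Even with only three square window sizes, the scan produces several times more frames than the proposal budget, and adding more sizes or aspect ratios widens the gap further.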

Fast R-CNN extracts features by applying the convolutional layers to the entire image once. It then picks out the CNN features that correspond to each region proposal obtained by Selective Search. As a result, Fast R-CNN was more than 200 times faster than R-CNN at test time, but the latency of generating region proposals with Selective Search was still high.
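The idea of reusing one feature map for many proposals can be sketched with a simplified RoI max pooling routine. This is a pure-Python illustration on a single-channel toy feature map, with RoIs already given in feature-map coordinates; the function name and the binning scheme are assumptions made for this sketch, not the exact implementation of any library.

```python
def roi_max_pool(feature_map, roi, out_h, out_w):
    """Max-pool the feature-map window `roi` into an out_h x out_w grid.

    feature_map: 2-D list (rows x cols), a single channel for brevity.
    roi: (x1, y1, x2, y2) in feature-map coordinates, inclusive.
    """
    x1, y1, x2, y2 = roi
    roi_h = y2 - y1 + 1
    roi_w = x2 - x1 + 1
    pooled = []
    for i in range(out_h):
        # Row range [r0, r1) of bin i; every bin covers at least one cell.
        r0 = y1 + (i * roi_h) // out_h
        r1 = max(y1 + ((i + 1) * roi_h + out_h - 1) // out_h, r0 + 1)
        row = []
        for j in range(out_w):
            c0 = x1 + (j * roi_w) // out_w
            c1 = max(x1 + ((j + 1) * roi_w + out_w - 1) // out_w, c0 + 1)
            row.append(max(feature_map[r][c]
                           for r in range(r0, r1)
                           for c in range(c0, c1)))
        pooled.append(row)
    return pooled


# The "backbone" output is computed once (a toy 6x6 map stands in for it) ...
feature_map = [[r * 6 + c for c in range(6)] for r in range(6)]
# ... and reused for every proposal, including overlapping ones.
rois = [(0, 0, 3, 3), (2, 2, 5, 5)]
pooled = [roi_max_pool(feature_map, roi, 2, 2) for roi in rois]
print(pooled)
```

Both RoIs read from the same precomputed map, so the overlap between them costs no extra convolution work, and each one comes out as a fixed-size grid regardless of its original size.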

Faster R-CNN eliminated the Selective Search bottleneck by using a neural network, the Region Proposal Network (RPN), for region proposal. The RPN reduced the latency by 10 times, and the model could run in near real time. It is more efficient because it operates on feature maps, whereas Selective Search operates on raw image pixels. Moreover, it adds little overhead because the feature maps are shared between the RPN and the rest of the network. Refer to the figure below.

Python

def create_faster_rcnn_model(features, scaled_gt_boxes, dims_input, cfg):
    # Load the pre-trained classification net and clone layers
    base_model = load_model(cfg['BASE_MODEL_PATH'])
    conv_layers = clone_conv_layers(base_model, cfg)
    fc_layers = clone_model(base_model,
                            [cfg["MODEL"].POOL_NODE_NAME],
                            [cfg["MODEL"].LAST_HIDDEN_NODE_NAME],
                            clone_method=CloneMethod.clone)

    # Normalization and conv layers
    feat_norm = features - Constant([[[v]] for v in cfg["MODEL"].IMG_PAD_COLOR])
    conv_out = conv_layers(feat_norm)

    # RPN and prediction targets
    rpn_rois, rpn_losses = create_rpn(conv_out, scaled_gt_boxes, dims_input, cfg)
    rois, label_targets, bbox_targets, bbox_inside_weights = \
        create_proposal_target_layer(rpn_rois, scaled_gt_boxes, cfg)

    # Fast RCNN and losses
    cls_score, bbox_pred = create_fast_rcnn_predictor(conv_out, rois, fc_layers, cfg)
    detection_losses = create_detection_losses(...)
    loss = rpn_losses + detection_losses
    pred_error = classification_error(cls_score, label_targets, axis=1)

    return loss, pred_error
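The `create_rpn` call above hides how candidate boxes are laid out over the shared feature map. In the Faster R-CNN design, the RPN scores a fixed set of reference boxes (anchors) of several scales and aspect ratios at every feature-map location. The sketch below generates such anchors in pure Python; the function name, the toy feature-map size, and the stride of 16 are assumptions for illustration, while the three scales and three ratios follow the commonly used defaults.

```python
import math


def generate_anchors(feat_w, feat_h, stride, scales, ratios):
    """Return anchors as (x1, y1, x2, y2) in image coordinates.

    One anchor per (scale, ratio) pair is centred on every feature-map cell.
    `ratio` is height / width; each anchor has an area of scale ** 2.
    """
    anchors = []
    for fy in range(feat_h):
        for fx in range(feat_w):
            # Centre of this feature-map cell, mapped back to image pixels.
            cx = (fx + 0.5) * stride
            cy = (fy + 0.5) * stride
            for scale in scales:
                for ratio in ratios:
                    w = scale / math.sqrt(ratio)
                    h = scale * math.sqrt(ratio)
                    anchors.append((cx - w / 2, cy - h / 2,
                                    cx + w / 2, cy + h / 2))
    return anchors


# Toy 4x3 feature map; 3 scales x 3 ratios gives 9 anchors per location.
anchors = generate_anchors(feat_w=4, feat_h=3, stride=16,
                           scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0))
print(len(anchors), anchors[1])
```

The RPN then predicts, for each anchor, an objectness score and a set of box offsets, and the highest-scoring adjusted anchors become the RoIs passed on to the detection head.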

The bounding boxes around the objects, that is, the RoIs proposed by the RPN, look like those in the image shown below. With the right training data, the model achieves good accuracy and precision.
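How well a proposed box matches an object is usually judged by its Intersection over Union (IoU) with a ground-truth box. The following minimal sketch computes IoU for axis-aligned boxes; the example boxes and the 0.5 threshold are illustrative assumptions, not values taken from the model above.

```python
def iou(box_a, box_b):
    """Intersection over Union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap; zero when the boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


# Hypothetical ground-truth box and two proposals.
gt_box = (50, 50, 150, 150)
proposals = [(60, 60, 160, 160), (100, 100, 200, 200)]
IOU_THRESHOLD = 0.5  # assumed threshold for counting a proposal as a match

matches = [iou(p, gt_box) >= IOU_THRESHOLD for p in proposals]
print(matches)
```

The first proposal overlaps the ground truth closely and passes the threshold, while the second only touches one corner region and does not.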

Region Proposal Network (RPN) in Object Detection

Object detection algorithms have evolved considerably in recent years, enabling applications that solve real-world problems efficiently and at real-time latency. In this article, we will look at Region Proposal Networks, which mark an important milestone in the advancement of object detection algorithms.

Table of Contents

  • What is Object Detection?
  • Region Proposal in R-CNN family
  • Working of Region Proposal Network (RPN)
