This is an IPU implementation of the Faster-RCNN detection framework, based on ruotianluo's pytorch-faster-rcnn. The model follows the original paper "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks".

Currently, we get an mAP of 80.5 (both with mixed precision on 4 IPUs and with FP32 on 8 IPUs) versus 80.3 (Detectron2, FP32) when training Faster-RCNN C4 with ResNet-50 on VOC.

NOTE: Different SDK versions can change the mAP slightly.

The instructions to run the model are as follows:

1. Prepare working environment

  1. Go to your local version of the examples repo:
cd public_examples/applications/popart/detection/faster-rcnn
  1. Install requirements
# install python third-party libraries
pip install -r requirements.txt
# make two custom ops named ROI-Align and NMS
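
For orientation, the sketch below shows what non-maximum suppression (NMS) computes in general: boxes are visited in descending score order, and a box is kept only if it does not overlap an already kept box too much. It is a plain-Python reference, not the IPU custom op built in this step. The function names, the (x1, y1, x2, y2) box convention and the 0.5 threshold are assumptions made for illustration, and the repo's op may differ in all of them.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: return indices of kept boxes, highest score first.

    The threshold value is an illustrative assumption, not the repo's setting.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep box i only if it overlaps no already kept box above the threshold.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep
```

This version is quadratic in the number of boxes and is meant only to make the behaviour of the op concrete.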

2. Download and set up the dataset

You can run bash to install the dataset in one step. If you are interested in how the dataset is set up, see the step-by-step instructions below.

NOTE: The data is placed in ./data by default. To keep the data separate from the code, change the variable $DATASETS_DIR in the script; you also need to modify _C.DATA_DIR in

  1. Download the PASCAL VOC training, validation and test datasets:

Extract all of these tars into one directory, which should have the following structure:


        VOC2007/ (from VOCtrainval_06-Nov-2007.tar)
        VOC2012/ (from VOCtrainval_11-May-2012.tar)
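
Before generating the reference files, it can help to confirm that the extracted layout is what the later commands expect. The sketch below is not part of this repo. It assumes the tars were extracted under a VOCdevkit directory inside the dataset directory (as the paths in the commands that follow suggest), and the helper name `missing_voc_dirs` is hypothetical.

```python
import os
from pathlib import Path

# Sub-directories the later steps expect under <datasets_dir>/VOCdevkit.
EXPECTED = ("VOC2007", "VOC2012")


def missing_voc_dirs(datasets_dir):
    """Return the expected VOC sub-directories that are absent.

    Hypothetical helper for illustration; not part of this repo.
    """
    devkit = Path(datasets_dir) / "VOCdevkit"
    return [name for name in EXPECTED if not (devkit / name).is_dir()]


if __name__ == "__main__":
    # Falls back to ./data, the default location mentioned above.
    root = os.environ.get("DATASETS_DIR", "./data")
    missing = missing_voc_dirs(root)
    print("missing:", ", ".join(missing) if missing else "none")
```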
  1. Generate data reference files (voc_train.txt and voc_test.txt)
python3 ${DATASETS_DIR}/VOCdevkit/VOC2007
python3 ${DATASETS_DIR}/VOCdevkit/VOC2012
  1. Merge VOC2007 and VOC2012

We train on VOC 2007 train+val plus VOC 2012 train+val and test on VOC 2007 test. This is a popular setup, called 07+12 in many papers. The alternative, 07++12 (testing on VOC 2012 test), requires submitting results to the official VOC server; those who are interested can try it themselves. The mAP for 07+12 should be around 80.5 for Faster-RCNN-R50-C4, compared with 80.3 for Detectron2.

To train on VOC 2007 train+val + VOC 2012 train+val, we need to put them together.

# merge VOC2007 and VOC2012 annotations
mkdir ${DATASETS_DIR}/VOC_annotrainval_2007_2012
cp ${DATASETS_DIR}/VOCdevkit/VOC2007/Annotations_trainval/* ${DATASETS_DIR}/VOC_annotrainval_2007_2012/
cp ${DATASETS_DIR}/VOCdevkit/VOC2012/Annotations_trainval/* ${DATASETS_DIR}/VOC_annotrainval_2007_2012/

# merge VOC2007 and VOC2012 images
mkdir ${DATASETS_DIR}/VOC_images
cp ${DATASETS_DIR}/VOCdevkit/VOC2007/JPEGImages/* ${DATASETS_DIR}/VOC_images/
cp ${DATASETS_DIR}/VOCdevkit/VOC2012/JPEGImages/* ${DATASETS_DIR}/VOC_images/
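
The same merge can be sketched in Python with one extra safeguard: plain `cp` silently overwrites a file that already exists under the same name, whereas the version below stops with an error. This is an illustrative alternative, not a script shipped with the repo, and the function name `merge_dirs` is hypothetical.

```python
import shutil
from pathlib import Path


def merge_dirs(sources, dest):
    """Copy every regular file from each source directory into dest.

    Raises FileExistsError instead of overwriting a file with the same name.
    Returns the number of files copied. Hypothetical helper for illustration.
    """
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    copied = 0
    for src in sources:
        for path in sorted(Path(src).iterdir()):
            if not path.is_file():
                continue  # skip sub-directories
            target = dest / path.name
            if target.exists():
                raise FileExistsError(f"{path.name} already exists in {dest}")
            shutil.copy2(path, target)
            copied += 1
    return copied
```

For example, calling `merge_dirs` with the two JPEGImages directories as `sources` and the VOC_images directory as `dest` mirrors the image merge above.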

3. Train and evaluate

  1. Get pretrained weights for the ResNet-50 backbone and convert them to IPU format. The pretrained backbone weights can be downloaded from here( Then make a folder named weights and put the downloaded file in it.
# The script will convert weights from weights/resnet50-caffe.pth to weights/GC_init_weights.pth 

The weights are from Detectron2(, they are licensed under the Apache 2.0

  1. Train and evaluate the model. With multi-batch, mixed-precision training on 4 IPUs, using VOC 2007 train+val plus VOC 2012 train+val for training and VOC 2007 test for evaluation (the 07+12 setup), you should get an mAP of around 80.5 on VOC 2007 test for Faster-RCNN-R50-C4. The corresponding Detectron2 result is 80.3.
bash yamls/example_mixed_precision_VOC0712_16batch.yaml

You can also try the FP32 config in yamls/, which requires 8 IPUs; example.yaml is intended for debugging. NOTE: Because of the custom ops and the complexity of the network, compiling the model takes a long time, but training itself is fast once compilation has finished.

4. Check the throughput of Faster-RCNN training

python3 ${your config}

This script writes the training throughput (Tput) to the log.
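
As a rough guide to reading that figure, the sketch below shows how a throughput number is commonly derived: samples processed divided by wall-clock time, measured after a warm-up step so that one-off costs are excluded. It assumes Tput means samples per second. The function names and the warm-up handling are hypothetical, and the repo's script may measure it differently.

```python
import time


def throughput(num_samples, elapsed_seconds):
    """Samples processed per second; rejects a non-positive duration."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return num_samples / elapsed_seconds


def measure(step_fn, batch_size, steps, warmup=1):
    """Time `steps` calls of step_fn after `warmup` untimed calls.

    step_fn stands in for one training iteration (hypothetical placeholder).
    """
    for _ in range(warmup):
        step_fn()  # untimed, so one-off start-up costs do not skew the result
    start = time.perf_counter()
    for _ in range(steps):
        step_fn()
    return throughput(batch_size * steps, time.perf_counter() - start)
```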

License information

This application is licensed under Apache License 2.0. Please see the LICENSE file in this directory for full details of the license conditions.

The following files are licensed under MIT license and are derived from the work of Microsoft: ./layer/ ./datasets/ ./datasets/ ./datasets/ ./datasets/

The files contained in following folder are licensed under MIT license and are derived from the work of ONNX Project: ./IPU/custom_ops/include/onnx

The files contained in the following folder are licensed under Google's license and are derived from the work of Google Inc.: ./IPU/custom_ops/include/google

The following files are licensed under MIT license and are derived from the work of Ross Girshick: ./datasets/

The following files are licensed under MIT license and are derived from the work of Bharath Hariharan: ./datasets/

The following files are licensed under Apache license 2.0 and are derived from the work of TensorFlow and modified by Graphcore Ltd.: ./layer/ ./layer/

The files contained in following folder are licensed under Apache license 2.0 and are derived from the work of RangiLyu: ./nanodata/

The following files are licensed under Apache license 2.0 and are derived from the work of RangiLyu and modified by Graphcore Ltd.: ./nanodata/dataset/ ./nanodata/dataset/

The following files are created by Graphcore Ltd. and are licensed under Apache 2.0: ./ ./ ./ ./ ./keys_mappin.txt ./Makefile ./ ./requirements.txt ./ ./ ./ ./ ./ ./yamls/ ./utils/ ./models/ ./layer/ ./layer/ ./layer/ ./IPU/ ./datasets/ ./tests

opencv-python, pytest, tensorboardX, wandb and PyYAML are licensed under MIT license. pycocotools and torch are licensed under BSD-3-Clause license. onnx is licensed under Apache 2.0 license.

easydict is licensed under LGPL license.