PyTorch Object Detector reference application on IPUs


Run Object Detector inference on Graphcore IPUs using PyTorch.

The following models are supported in inference:

  1. YOLOv4-P5, implementation of Scaled-YOLOv4: Scaling Cross Stage Partial Network. Original repository.

Folder structure

  • Reference script to run the application.
  • models Model definitions.
  • utils Contains common code such as tools and dataset functionalities.
  • configs Contains the default configuration for inference.
  • tests Contains all the different tests for the model.
  • This file.
  • requirements.txt Required Python packages.
  • Test helper functions.

Installation instructions

  1. Prepare the PopTorch environment. Install the Poplar SDK following the Getting Started guide for your IPU system. Make sure to source the scripts for Poplar and PopART and activate a Python virtualenv with PopTorch installed.

  2. Install the pip dependencies:

pip install -r requirements.txt
  3. Download labels and sample images for inference and training:

Download the labels:

curl -L -o && unzip -q -d '/localdata/datasets' && rm

Unzipping into /localdata/datasets may require root access (sudo).

Download the images:

bash utils/
  4. Build the custom ops:

Running inference

  1. Running inference without weights:
python will use the default config defined in configs/inference-yolov4p5.yaml which can be overridden by the following options:

  -h, --help            show this help message and exit
  --data DATA           Path to the dataset root dir (default:
  --config CONFIG       Configuration of the model (default:
  --show-config         Show configuration for the program (default: False)
  --weights WEIGHTS     Pretrained weight path to use if specified (default:
  --num-ipus NUM_IPUS   Number of IPUs to use (default: 1)
  --num-workers NUM_WORKERS
                        Number of workers to use (default: 20)
  --input-channels INPUT_CHANNELS
                        Number of channels in the input image (default: 3)
  --activation ACTIVATION
                        Activation function to use in the model (default:
  --normalization NORMALIZATION
                        Normalization function to use in the model (default:
  --num-classes NUM_CLASSES
                        Number of classes of the model (default: 80)
  --class-name-path CLASS_NAME_PATH
                        Path to the class names yaml (default:
  --image-size IMAGE_SIZE
                        Size of the input image (default: 896)
  --micro-batch-size MICRO_BATCH_SIZE
                        The number of samples calculated in one full
                        forward/backward pass (default: 1)
  --mode MODE           Mode to run the model (default: test)
  --half                Half precision (default: False)
  --benchmark           Run performance benchmark (default: False)
  --cpu                 Use cpu to run model (default: True)
  --batches-per-step BATCHES_PER_STEP
                        Number of batches per step (default: 1)
  --class-conf-threshold CLASS_CONF_THRESHOLD
                        Minimum threshold for class prediction probability
                        (default: 0.4)
  --obj-threshold OBJ_THRESHOLD
                        Minimum threshold for the objectness score (default:
  --iou-threshold IOU_THRESHOLD
                        Minimum threshold for IoU used in NMS (default: 0.65)
  --plot-step PLOT_STEP
                        Plot every n image (default: 250)
  --plot-dir PLOT_DIR   Directory for storing the plot output (default: plots)
  --dataset-name DATASET_NAME
                        Name of the dataset (default: coco)
  --max-bbox-per-scale MAX_BBOX_PER_SCALE
                        Maximum number of bounding boxes per image (default:
  --train-file TRAIN_FILE
                        Path to the train annotations (default: train2017.txt)
  --test-file TEST_FILE
                        Path to the test annotations (default: val2017.txt)
  --no-eval             Dont compute the precision recall metrics (default:
  --verbose             Print out class wise eval (default: False)
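
The override behaviour described above can be sketched as follows. This is a minimal illustration, not the application's actual code: the `DEFAULTS` dict stands in for values that would be loaded from the YAML file, the function name `parse_with_defaults` is hypothetical, and only three of the listed options are included.

```python
import argparse

# Stand-in for values read from the YAML config (assumed layout, not the real file).
DEFAULTS = {"image_size": 896, "micro_batch_size": 1, "half": False}


def parse_with_defaults(argv, defaults=DEFAULTS):
    """Return the config defaults updated with options given on the command line."""
    parser = argparse.ArgumentParser()
    # default=None lets us tell "option not given" apart from an explicit value.
    parser.add_argument("--image-size", type=int, default=None)
    parser.add_argument("--micro-batch-size", type=int, default=None)
    parser.add_argument("--half", action="store_true", default=None)
    args = parser.parse_args(argv)

    merged = dict(defaults)
    for key, value in vars(args).items():
        if value is not None:
            merged[key] = value  # command line wins over the config file
    return merged
```

Calling `parse_with_defaults(["--image-size", "640"])` would keep every default except the image size.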
  2. Running a pre-trained model:

To download the pretrained weights, run the following commands:

mkdir weights
cd weights
curl -o yolov4_p5_reference_weights.tar.gz && tar -zxvf yolov4_p5_reference_weights.tar.gz && rm yolov4_p5_reference_weights.tar.gz
cd ..

These weights are derived from a pre-trained model shared by the YOLOv4 author. We have post-processed them to remove the model description, leaving a state_dict compatible with the IPU model description.
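
As a rough illustration of that kind of post-processing, the sketch below pulls a plain mapping of parameter names to values out of an already loaded checkpoint. The checkpoint layout (a dict whose `model` entry holds either a module-like object or an existing state dict) and the function name `extract_state_dict` are assumptions made for the example, not a description of the actual conversion script.

```python
def extract_state_dict(checkpoint, key="model"):
    """Return a plain name -> value mapping from an already loaded checkpoint."""
    # Assumed layout: the weights live under checkpoint[key]; otherwise treat
    # the checkpoint itself as the object to convert.
    if isinstance(checkpoint, dict) and key in checkpoint:
        obj = checkpoint[key]
    else:
        obj = checkpoint
    # A module-like object exposes state_dict(); a state dict is already a mapping.
    if callable(getattr(obj, "state_dict", None)):
        obj = obj.state_dict()
    return dict(obj)
```

With PyTorch, the checkpoint would typically come from `torch.load` and the result could be written back with `torch.save`.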

To run inference with the weights:

python --weights weights/yolov4_p5_reference_weights/


To compute evaluation metrics run:

python --weights '/path/to/your/' --obj-threshold 0.001 --class-conf-threshold 0.001
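
To make the role of the thresholds concrete, here is a simplified pure-Python sketch of confidence filtering followed by greedy per-class NMS. It illustrates the general technique only and is not the application's implementation: the box format `(x1, y1, x2, y2)`, the detection tuple layout, the score used for ranking, and the function names are all assumptions.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def filter_and_nms(dets, obj_thr, cls_thr, iou_thr):
    """dets: list of (box, objectness, class_prob, class_id) tuples."""
    # Drop candidates below either confidence threshold.
    kept = [d for d in dets if d[1] >= obj_thr and d[2] >= cls_thr]
    # Rank by combined score (an assumed choice), best first.
    kept.sort(key=lambda d: d[1] * d[2], reverse=True)
    result = []
    for d in kept:
        # Keep a box unless it overlaps a better box of the same class too much.
        if all(d[3] != r[3] or iou(d[0], r[0]) <= iou_thr for r in result):
            result.append(d)
    return result
```

Very low confidence thresholds, such as the 0.001 values in the command above, keep almost every candidate, which is the usual practice when precision and recall are evaluated over the full confidence range.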

You can use the --verbose flag to print the metrics per class. Here is a comparison of our metrics against those of the original Scaled-YOLOv4 implementation on the COCO 2017 detection validation set:

Model  Image Size  Type  Classes  Precision  Recall   mAP@0.5  mAP@0.5:.95
GPU    896         FP32  all      0.4501     0.76607  0.6864   0.49034
GPU    896         FP16  all      0.44997    0.7663   0.68663  0.49037
IPU    896         FP16  all      0.45032    0.7674   0.68674  0.49159

We generated the GPU numbers by re-running the Scaled-YOLOv4 repository code on an AWS instance. These numbers differ slightly from those reported in that repository, which we attribute to the rect parameter: their inference sets it to True. The IPU currently cannot support images of different sizes, so we set rect to False in their evaluation to allow a fair comparison. Under this setting, the IPU results are on par with the GPU reference, within about 0.002 on every metric in the table.
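
The effect of the rect setting on input shapes can be illustrated with a small calculation. This is a sketch under assumptions, not code from either repository: the function name `letterbox_shape` is hypothetical, the stride of 32 is an illustrative value, and real rectangular inference usually picks the shape per batch rather than per image.

```python
import math


def letterbox_shape(height, width, target=896, rect=False, stride=32):
    """Return (resized_h, resized_w, padded_h, padded_w) for one image."""
    # Scale so the longer side matches the target size, keeping the aspect ratio.
    scale = target / max(height, width)
    new_h, new_w = round(height * scale), round(width * scale)
    if rect:
        # Rectangular mode: pad each side only up to the next stride multiple.
        pad_h = math.ceil(new_h / stride) * stride
        pad_w = math.ceil(new_w / stride) * stride
    else:
        # Fixed mode: pad to a square so every image has the same shape.
        pad_h = pad_w = target
    return new_h, new_w, pad_h, pad_w
```

For a 480x640 image, the fixed mode pads the input to 896x896, while the rectangular mode would leave it at 672x896, so the padded shape would vary from image to image.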

Running the tests

After following the installation instructions, run:
