Deep Reorganization (DERO): Retaining Residuals in TinyML


DERO is a simple yet systematic approach that exploits the characteristics and memory-allocation behavior of operations to reorganize the residual connections in a network model. A DERO-reorganized model keeps inference peak memory at the level of a plain-style (residual-free) model, while preserving the accuracy and training efficiency of the original model with residuals.
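
For intuition about why residuals inflate peak memory: during inference, a skip connection keeps its input tensor alive while the entire branch computes, so the working set of a residual block is roughly the skip tensor plus the largest branch intermediate, whereas a plain-style block only ever needs two consecutive activation tensors. The toy PyTorch blocks below are illustrative only (they are not DERO's actual analysis or transformation) and make the difference visible in the forward pass:

import torch
import torch.nn as nn

# Toy residual block: the input x must stay alive across the whole branch,
# so peak activation memory is roughly size(x) + size(largest intermediate).
class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.relu = nn.ReLU()

    def forward(self, x):
        out = self.relu(self.conv1(x))
        out = self.conv2(out)
        return self.relu(out + x)  # x is still needed here, long after conv1 ran

# Plain-style block: each input can be freed as soon as its output exists,
# so the peak working set is only two consecutive activation tensors.
class PlainBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.relu = nn.ReLU()

    def forward(self, x):
        x = self.relu(self.conv1(x))
        return self.relu(self.conv2(x))

if __name__ == "__main__":
    x = torch.randn(1, 16, 56, 56)
    print(ResidualBlock(16)(x).shape, PlainBlock(16)(x).shape)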

This repository contains the DERO tool and the training scripts used to evaluate the reorganized models. Follow the steps below to reproduce our results.
All models were trained on eight RTX 2080 Ti GPUs.

Requirements

  • Python>=3.7.0

  • PyTorch>=1.7.1

Initial steps

Clone the repository and install requirements.txt in a Python>=3.7.0 environment.
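
Assuming the repository is hosted under EMCLab-Sinica/DERO on GitHub, clone it first:

git clone https://github.com/EMCLab-Sinica/DERO.git
cd DERO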

pip install -r requirements.txt

DERO usage

Run the tool to reorganize the residuals of a model:

python dero.py --model resnet34 --output-dir <PATH_TO_DERO_OUTPUT> --input-size 224
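
The same invocation should work for the other evaluated backbones by swapping the --model argument (the model-name spellings here are an assumption, taken from the names accepted by train.py below):

python dero.py --model resnet50 --output-dir <PATH_TO_DERO_OUTPUT> --input-size 224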

Training

Training for baseline models:

torchrun --nproc_per_node=8 train.py --model resnet34 --data-path <PATH_TO_DATASET> --amp --output-dir <PATH_TO_MODEL_OUTPUT> -b 64 --wd 0.00004 --random-erase 0.1 --label-smoothing 0.1 --mixup-alpha 0.2 --cutmix-alpha 1.0

Replace resnet34 with resnet50, mcunet_v4, or densenet121 to train the other baseline models.
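
With --nproc_per_node=8 and -b 64, the effective global batch size is 8 × 64 = 512 images per optimizer step (assuming no gradient accumulation).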

Training for DERO models:

torchrun --nproc_per_node=8 train.py --model resnet34dero --data-path <PATH_TO_DATASET> --amp --output-dir <PATH_TO_MODEL_OUTPUT> -b 64 --wd 0.00004 --random-erase 0.1 --label-smoothing 0.1 --mixup-alpha 0.2 --cutmix-alpha 1.0

Replace resnet34dero with resnet50dero, mcunet_dero_v4, or densenet121_dero to train the other DERO models.

Evaluate

To evaluate a trained model:

python train.py --model <MODEL_NAME> --data-path <PATH_TO_DATASET> -b 64 --test-only --weights <PATH_TO_MODEL>
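
For standalone inference outside train.py, the checkpoint can be loaded directly. The snippet below is a minimal sketch that assumes the checkpoint is a dict with a "model" state-dict entry, as in torchvision's reference training scripts; adjust the key and the model constructor if this repository saves checkpoints differently.

import torch
import torchvision

# Hypothetical example: load a baseline ResNet-34 checkpoint produced by train.py.
# The "model" key is an assumption borrowed from torchvision's reference scripts.
checkpoint = torch.load("<PATH_TO_MODEL>", map_location="cpu")
model = torchvision.models.resnet34(num_classes=1000)
model.load_state_dict(checkpoint["model"])
model.eval()

with torch.no_grad():
    dummy = torch.randn(1, 3, 224, 224)  # input size used by dero.py above
    logits = model(dummy)
print("Predicted class index:", logits.argmax(dim=1).item())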

Comparison

[Figure: ground-truth and predicted detection results for YOLOv5 (Plain) vs. YOLOv5 (DERO)]

DERO model details

Model            Accuracy       Parameters (M)  Training time (h:mm:ss)  Latency (s)  Peak memory (KB)  Architecture
ResNet34 (DERO)  72.32%         20.64           24:23:29                 167.0        294.0             Orig./DERO
ResNet50 (DERO)  75.56%         21.78           25:53:13                 169.9        294.0             Orig./DERO
MCUNet (DERO)    55.59%         0.72            17:29:27                 6.4          302.5             Orig./DERO
DenseNet (DERO)  71.55%         7.58            32:52:39                 73.3         266.4             Orig./DERO
YOLOv5n (DERO)   25.90% (mAP)   1.73            43:22:35                 52.1         253.5             Orig./DERO
