Skip to content

Latest commit

 

History

History
49 lines (39 loc) · 2.06 KB

README.md

File metadata and controls

49 lines (39 loc) · 2.06 KB

Results below are obtained before AutoTVM was introduced.

When AutoTVM was merged, the TVM Direct convolution schedule was replaced with a completely new one.

So the "TVM Direct" results below do not represent current performance of TVM. It should be much better now.


All numbers in msec.

TVM Direct convolution cannot handle specific inputs shape. These are denoted as N/A.

R9 Nano

(batch,CI,size,CO) TVM Winograd (This code) TVM Direct MIOpen Winograd
(1, 128, 122, 128) 1.174 N/A 0.433
(1, 128, 128, 128) 1.139 1.404 0.437
(1, 64, 56, 64) 0.136 0.349 0.044
(1, 64, 64, 32) 0.173 0.743 0.044
(1, 64, 224, 64) 1.233 1.865 0.373
(1, 64, 112, 128) 0.650 1.126 0.206
(1, 512, 28, 512) 0.747 N/A 0.422
(1, 128, 28, 128) 0.179 0.218 0.066
(1, 256, 14, 256) 0.316 0.463 0.109
(8, 128, 122, 128) 9.288 N/A 4.204
(16, 64, 56, 64) 1.397 N/A 0.377
(32, 64, 64, 32) 5.732 N/A 0.473
(64, 128, 32, 128) 5.980 N/A 2.243

GTX1070 Ti

(batch,CI,size,CO) TVM Winograd (This code) TVM Direct TVM Winograd NVPTX (This code) TVM Direct NVPTX cuDNN Winograd
(1, 128, 122, 128) 1.668 2.272 1.099 N/A 0.391
(1, 128, 128, 128) 0.947 1.740 0.950 1.797 0.395
(1, 64, 56, 64) 0.180 0.165 0.116 0.162 0.038
(1, 64, 64, 32) 0.114 0.480 0.078 0.481 0.025
(1, 64, 224, 64) 1.252 1.817 1.238 1.722 0.362
(1, 64, 112, 128) 0.759 1.236 0.556 1.216 0.187
(1, 512, 28, 512) 1.103 N/A 0.654 N/A 0.603
(1, 128, 28, 128) 0.124 0.120 0.074 0.117 0.043
(1, 256, 14, 256) 0.182 0.487 0.118 0.332 0.098
(8, 128, 122, 128) 11.175 N/A 7.644 N/A 3.293
(16, 64, 56, 64) 1.783 N/A 1.330 N/A 0.391
(32, 64, 64, 32) 2.849 N/A 2.300 N/A 0.531
(64, 128, 32, 128) 3.707 N/A 3.651 N/A 1.589