
ranking metric acceleration on the gpu #5398

Merged
merged 21 commits into master on Mar 22, 2020

Conversation

@sriramch (Contributor) commented Mar 8, 2020

this is the last part of #5326 that has been split. the performance numbers are here

please note the following:

  • metrics applicable only to ranking datasets are fully accelerated on the gpu - map, ndcg, pre (see the usage sketch after this list)
  • metrics (auc[pr]) applicable to both ranking and non-ranking datasets work as follows:
    • when computed on non-ranking datasets, these metrics aren't accelerated - i.e. they still run on the cpu, but with an optimized cpu implementation (hence, they should be better than what we had before - see here)
      • this can be worked on as a follow-up pr
    • when computed on ranking datasets, these metrics are semi-accelerated - meaning multiple groups are still processed in parallel, but there is a linear iteration over the predictions within each group to bucketize them
      • note: it is still much better than the version that runs on the cpu (~6x-8x eval-time improvement), even for datasets with a large number of elements/group and a small group cardinality (which are atypical for ranking datasets). i tried a couple hundred groups with ~250k items/group, and it still performed well
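
for context, here is a minimal usage sketch (not part of this pr) of the kind of ranking workload whose metric evaluation this change accelerates. the synthetic data, group sizes, and parameter values are assumptions, and `tree_method='gpu_hist'` is assumed to be what routes training and metric evaluation to the gpu.

```python
# illustrative sketch only: synthetic data and parameter values are assumptions,
# not taken from this pr. it shows a grouped (learning-to-rank) workload whose
# ndcg/map evaluation the gpu-accelerated metrics target.
import numpy as np
import xgboost as xgb

rng = np.random.default_rng(0)
n_groups, items_per_group, n_features = 100, 1000, 20
X = rng.normal(size=(n_groups * items_per_group, n_features))
y = rng.integers(0, 2, size=n_groups * items_per_group)  # 0/1 relevance labels
group_sizes = [items_per_group] * n_groups               # query group boundaries

dtrain = xgb.DMatrix(X, label=y)
dtrain.set_group(group_sizes)

params = {
    "objective": "rank:ndcg",        # learning-to-rank objective
    "eval_metric": ["ndcg", "map"],  # ranking metrics evaluated each round
    "tree_method": "gpu_hist",       # gpu training; assumed to select the gpu metric path
}
bst = xgb.train(params, dtrain, num_boost_round=100,
                evals=[(dtrain, "train")], verbose_eval=10)
```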

@RAMitchell @trivialfis - please review.

@mli (Member) commented Mar 8, 2020

Codecov Report

Merging #5398 into master will not change coverage.
The diff coverage is n/a.


@@           Coverage Diff           @@
##           master    #5398   +/-   ##
=======================================
  Coverage   84.07%   84.07%           
=======================================
  Files          11       11           
  Lines        2411     2411           
=======================================
  Hits         2027     2027           
  Misses        384      384           


  - this will *significantly* help train non-ranking datasets that use the auc metric (which i hear is quite popular!)
  - i'll post the perf. numbers shortly
@sriramch (Contributor Author) commented:

auc metric performance numbers

test environment

  • 1 socket
  • 6 cores/socket
  • 2 threads/core
  • 80 gb system memory
  • v100 gpu

test

  • uses all cpu threads
  • builds 100 trees
  • metric eval times are reported below

results

  • no additional gpu memory was used
  • all times are in seconds
  • mortgage dataset used 60m training instances (to be able to fit into the available gpu memory)
| dataset  | eval time, master (s) | eval time, this pr (s) |
|----------|-----------------------|------------------------|
| higgs    | 70.79                 | 1.42                   |
| mortgage | 266.98                | 30.19                  |
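
as a rough illustration of how such eval-time measurements might be reproduced (this is not the author's benchmark harness; the synthetic data, parameter values, and timing approach are assumptions):

```python
# rough timing sketch, not the benchmark behind the table above.
# it trains a fixed number of trees, then times metric evaluation alone.
import time
import numpy as np
import xgboost as xgb

rng = np.random.default_rng(0)
X = rng.normal(size=(1_000_000, 20))      # synthetic stand-in for higgs/mortgage
y = rng.integers(0, 2, size=X.shape[0])
dtrain = xgb.DMatrix(X, label=y)

params = {
    "objective": "binary:logistic",
    "eval_metric": "auc",     # on non-ranking data this runs on the optimized cpu path
    "tree_method": "gpu_hist",
}
bst = xgb.train(params, dtrain, num_boost_round=100)

start = time.perf_counter()
print(bst.eval(dtrain))                   # runs only the metric evaluation
print(f"eval time: {time.perf_counter() - start:.2f}s")
```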

Review thread on src/metric/rank_metric.cu (outdated, resolved)
@sriramch (Contributor Author) commented:

@trivialfis i would appreciate your review when you get a chance.

@trivialfis (Member) left a comment


LGTM! Sorry for the long wait. Previously I mentioned to @RAMitchell that simple functions might be more suitable for implementing the GPU metrics, as I think the registry is just too tricky and unnecessary. But I won't block the PR on this, as we can refactor later when needed (e.g. when implementing other metrics).

@RAMitchell merged commit d2231fc into dmlc:master on Mar 22, 2020
The lock bot locked this conversation as resolved and limited it to collaborators on Jun 24, 2020