
Fix dask predict #6412

Merged
merged 2 commits into from Nov 20, 2020

Conversation

trivialfis
Member

Closes #6407.

@hcho3 hcho3 added the Blocking label Nov 19, 2020
@hcho3
Collaborator

hcho3 commented Nov 19, 2020

Do you have some performance numbers?

@trivialfis
Member Author

Yes. But please let me finish testing on GKE first. It's a bit cumbersome.

@trivialfis
Member Author

Simple test, using HIGGS:

Train::Duration 5.479685544967651
0.5290751
Predict::Duration 0.9758050441741943
from time import time

import dask_cudf
from dask.distributed import Client, wait
from dask_cuda import LocalCUDACluster
from xgboost import dask as dxgb

# Assumed inputs (not shown in the original snippet): path to the HIGGS CSV
# and its column names (label first, then the 28 feature columns).
fname = 'HIGGS.csv'
colnames = ['label'] + ['feature-%02d' % i for i in range(1, 29)]


def dmain():
    with LocalCUDACluster() as cluster:
        print('Dashboard link:', cluster.dashboard_link)
        with Client(cluster) as client:
            dask_df = dask_cudf.read_csv(fname, header=None, names=colnames)
            y = dask_df['label']
            X = dask_df[dask_df.columns.difference(['label'])]
            dtrain = dxgb.DaskDMatrix(client, X, y)

            start = time()
            output = dxgb.train(client, {'tree_method': 'gpu_hist'}, dtrain,
                                num_boost_round=10)
            end = time()
            print('Train::Duration', end - start)

            start = time()
            predictions = dxgb.predict(client, output, dtrain)
            predictions = client.persist(predictions)
            wait(predictions)
            predictions.mean().compute()
            end = time()

            print('Predict::Duration', end - start)
            return output['booster'], predictions


if __name__ == '__main__':
    dmain()

@hcho3
Collaborator

hcho3 commented Nov 19, 2020

@trivialfis Can you post the perf number before and after this patch? I want to know if this patch improves performance.

@trivialfis
Member Author

Before:

Train::Duration 5.479079008102417
Predict::Duration 1.6077783107757568
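Comparing with the post-patch run posted above (Predict::Duration ≈ 0.976 s), the improvement on this 2-GPU setup works out to roughly a 1.65x predict speedup; a quick check of the arithmetic:

```python
# Predict durations reported in this thread (seconds).
before = 1.6077783107757568  # before the patch
after = 0.9758050441741943   # after the patch (from the earlier comment)

speedup = before / after
print(f"predict speedup: {speedup:.2f}x")  # prints "predict speedup: 1.65x"
```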

The difference should be more significant on platforms with more workers; right now I'm only using 2 GPUs.
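The linked issue (#6407) reports that predictions ran serially over workers, so the cost grew with the worker count. A toy illustration of why concurrent dispatch helps (plain threads standing in for per-worker predict calls, not the actual dask code path):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def predict_on_worker(partition_id):
    # Stand-in for a per-worker predict call; sleep mimics the work.
    time.sleep(0.05)
    return partition_id

n_workers = 4

# Serial dispatch: each task is awaited before the next starts,
# so total time grows linearly with the number of workers.
start = time.time()
serial_results = [predict_on_worker(i) for i in range(n_workers)]
serial_time = time.time() - start

# Concurrent dispatch: all tasks are in flight at once, so total time
# stays close to the duration of a single task.
start = time.time()
with ThreadPoolExecutor(max_workers=n_workers) as pool:
    concurrent_results = list(pool.map(predict_on_worker, range(n_workers)))
concurrent_time = time.time() - start

print(f"serial: {serial_time:.2f}s, concurrent: {concurrent_time:.2f}s")
```

With 4 simulated workers, the serial loop takes about four times as long as the concurrent version, and the gap widens as workers are added.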

@trivialfis
Member Author

trivialfis commented Nov 19, 2020

I have finished testing on GKE. Could you please review?

@codecov-io

codecov-io commented Nov 19, 2020

Codecov Report

Merging #6412 (c53479a) into master (c763b50) will decrease coverage by 0.03%.
The diff coverage is 100.00%.


@@            Coverage Diff             @@
##           master    #6412      +/-   ##
==========================================
- Coverage   79.94%   79.92%   -0.03%     
==========================================
  Files          12       12              
  Lines        3476     3472       -4     
==========================================
- Hits         2779     2775       -4     
  Misses        697      697              
Impacted Files                   Coverage            Δ
python-package/xgboost/dask.py   81.00% <100.00%>    (-0.15%) ⬇️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c763b50...c53479a.

@trivialfis trivialfis merged commit a7b42ad into dmlc:master Nov 20, 2020
@trivialfis trivialfis deleted the fix-dask-predict branch November 20, 2020 02:10

Successfully merging this pull request may close these issues.

dask predictions are done in serial over workers in the multinode case