Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Batch UpdatePosition using cudaMemcpy #7964

Merged
merged 72 commits into from Jun 30, 2022

Conversation

RAMitchell
Copy link
Member

Alternative version of #7925 using cudaMemcpy and an unstable row partitioning scheme using warp aggregated atomics.

Considerably faster than master branch in row partitioning, but is unstable i.e. produces random orderings of rows in the partitioner. I think this is not a problem as the only consumer of the row partitions is the histogram kernel, which doesn't care about the ordering.

Will follow-up with some benchmarking and more extensive tests, in particular checking that we still have determinism.

src/tree/gpu_hist/row_partitioner.cu Show resolved Hide resolved
src/tree/updater_gpu_hist.cu Outdated Show resolved Hide resolved
@trivialfis
Copy link
Member

Overall looks good to me. Some small nitpicks. The optimization looks exciting!

@hcho3 hcho3 closed this Jun 27, 2022
@hcho3 hcho3 reopened this Jun 27, 2022
@hcho3 hcho3 closed this Jun 27, 2022
@hcho3 hcho3 reopened this Jun 27, 2022
@hcho3 hcho3 closed this Jun 27, 2022
@hcho3 hcho3 reopened this Jun 27, 2022
@hcho3 hcho3 closed this Jun 27, 2022
@hcho3 hcho3 reopened this Jun 27, 2022
@hcho3 hcho3 closed this Jun 27, 2022
@hcho3 hcho3 reopened this Jun 27, 2022
@hcho3
Copy link
Collaborator

hcho3 commented Jun 27, 2022

Sorry for the noise. I finally restored Jenkins back to health, and now builds jobs will launch automatically.

Copy link
Member

@trivialfis trivialfis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM overall. Some questions are in the comments.

src/tree/updater_gpu_hist.cu Show resolved Hide resolved
src/common/device_helpers.cuh Show resolved Hide resolved
src/tree/gpu_hist/row_partitioner.cu Show resolved Hide resolved
@RAMitchell RAMitchell merged commit bc4f802 into dmlc:master Jun 30, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants