Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Using column_sampler for optimization of ColWiseBuildHist #8319

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

razdoburdin
Copy link
Contributor

Hi,
this PR is a part of #7192.
It continues the optimization of BuildHistKernel, which was started at #8233.
In #8233 I removed the optimization related to column_sampler to simplify the reviewing process. Here I return it.

The methodology of the benchmarking is the same as in #8218. The only dataset affected by optimization is santander. Training speed for santander increases twice after applying this optimization. You can find the exact numbers at #8233.

I am looking forward for your review and comments!

@hcho3
Copy link
Collaborator

hcho3 commented Oct 7, 2022

Please ignore the failing MacOS tests from GitHub Actions. We're working to fix it.

@razdoburdin
Copy link
Contributor Author

Hi @trivialfis ,
what is your opinion about this optimization?

@trivialfis
Copy link
Member

Apologies for the slow response, I will get to this tomorrow.

@razdoburdin
Copy link
Contributor Author

Apologies for the slow response, I will get to this tomorrow.

Hi,
have you any decision about this PR?

Copy link
Member

@trivialfis trivialfis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies for the delay. May I ask the specific objective of this PR? Are you trying to optimize the case to build for less columns when sampling is enabled?

I'm concerned with optimizations that don't generalize and requires edge cases. I haven't been able to track which test can confirm the correctness of the code.

@@ -284,13 +300,14 @@ void ColsWiseBuildHistKernel(const std::vector<GradientPair> &gpair,
};

const size_t n_features = gmat.cut.Ptrs().size() - 1;
const size_t n_columns = n_features;
const size_t n_columns = kColumnSampling ? fids.size() : n_features;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is fids empty if there's no sampling?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fids is empty is case of condition here is false.

@@ -76,6 +84,20 @@ class HistogramBuilder {
buffer_.Reset(this->n_threads_, n_nodes, space, target_hists);
}

constexpr float kColsampleTh = 0.1;
bool column_sampling = (column_sampler_ != nullptr) &&
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When column sampler is nullptr?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I set it to nullptr in tests of the histogram builder without using this optimization.

@@ -76,6 +84,20 @@ class HistogramBuilder {
buffer_.Reset(this->n_threads_, n_nodes, space, target_hists);
}

constexpr float kColsampleTh = 0.1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why 0.1?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is an ad-hoc threshold value.

constexpr float kColsampleTh = 0.1;
bool column_sampling = (column_sampler_ != nullptr) &&
(train_param_.colsample_bytree < kColsampleTh ||
train_param_.colsample_bylevel < kColsampleTh);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about bynode? Is it used?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not now, maybe later, one can investigate these options more deeply.

@razdoburdin
Copy link
Contributor Author

razdoburdin commented Oct 21, 2022

Apologies for the delay. May I ask the specific objective of this PR? Are you trying to optimize the case to build for less columns when sampling is enabled?

yes, that was the idea

I'm concerned with optimizations that don't generalize and requires edge cases. I haven't been able to track which test can confirm the correctness of the code.

It is reasonable. I added a subcase for TestEvaluateSplits that forces using the code with kColumnSampling = true. The idea is to fill fids vector manually. It is enough for pushing the execution to the branch with kColumnSampling = true.

@trivialfis
Copy link
Member

Let me try to look deeper into this PR. The optimization is definitely important but I would like to keep lesser conditional branches than it currently has.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants