Partition optimization #7208
base: master
Conversation
(force-pushed from bb7253a to 137183e)
Codecov Report

@@           Coverage Diff           @@
##           master   #7208   +/-  ##
=====================================
  Coverage   82.67%   82.67%
=====================================
  Files          13       13
  Lines        4017     4017
=====================================
  Hits         3321     3321
  Misses        696      696
=====================================

Continue to review the full report at Codecov.
@trivialfis Can we start a detailed review of this PR, since this part covers the partition optimizations only?
check_hist(parent_hist, this_hist, sibling_hist, 0, total_bins);
++node_id;
}
// const common::OptPartitionBuilder* p_opt_partition_builder;
I think this is a sign of work in progress?
my mistake, sorry!
tests were aligned to optimized partition kernel usage
(force-pushed from 137183e to e932247)
@trivialfis failed steps
Yup, I'm aware of the failure. It's blocking the release.
@trivialfis - are we considering this PR to be integrated prior to the release?
No, I don't think it's a good idea to integrate this amount of changes. I looked briefly before and I think it could use some cleanups. I'm on holiday right now; I will provide a detailed review and possible changes once I come back.
@trivialfis - ok, thanks. Have a good holiday!
(force-pushed from a2a6ccd to fd8244c)
similar failures in the master branch here:
Restarted. @hcho3 The issue seems to be lingering.
Hi, I appreciate your excellent optimization, but I do think there's a way to write more readable code.
This review is not detailed yet. A new partitioner that implements std::partition, spans 550 LOC without any comments, and has 29 public variables is a bit too much for me. Also, the old one still seems to be here for some reason. I can't merge a PR with this level of complexity without seriously thinking about a redesign/rewrite. The C++ Core Guidelines have some good reading material and are highly recommended:
https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#main
https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#S-performance
@@ -115,7 +115,7 @@ class XGBoostGeneralSuite extends FunSuite with TmpFolderPerSuite with PerTest {
   val eval = new EvalError()
   val training = buildDataFrame(Classification.train)
   val testDM = new DMatrix(Classification.test.iterator)
-  val paramMap = Map("eta" -> "1", "gamma" -> "0.5", "max_depth" -> "0",
+  val paramMap = Map("eta" -> "1", "gamma" -> "0.5", "max_depth" -> "6",
What's changed?
reverted
src/common/opt_partition_builder.h
Outdated
@@ -0,0 +1,552 @@
Is the original one deleted?
yes, it is deleted now; the partition test was updated accordingly
src/common/opt_partition_builder.h
Outdated
dynamic_cast<const SparseColumn<BinIdxType>*>(GetColumnsRef<BinIdxType>()[fid].get());
}
for (size_t tid = 0; tid < nthreads; ++tid) {
  states[tid].resize(1 << (max_depth + 1), 0);
There's a MaxNodes function in TreeParam.
src/common/opt_partition_builder.h
Outdated
if (is_loss_guided) {
  const int cleft = compleate_trees_depth_wise[0];
  const int cright = compleate_trees_depth_wise[1];
  // template!
?
typo, fixed
src/tree/updater_quantile_hist.h
Outdated
std::vector<uint16_t> curr_level_nodes_;
std::vector<int32_t> split_conditions_;
std::vector<uint64_t> split_ind_;
std::vector<uint16_t> compleate_trees_depth_wise_;
compleate?
fixed, thanks!
src/common/opt_partition_builder.h
Outdated
std::vector<std::vector<Slice>> threads_addr;
std::vector<std::vector<uint16_t>> threads_id_for_nodes;
std::vector<std::vector<uint16_t>> node_id_for_threads;
std::vector<std::vector<uint32_t>> threads_rows_nodes_wise;
std::vector<std::vector<uint32_t>> threads_nodes_count;
std::vector<std::vector<int>> nodes_count;
std::vector<Slice> partitions;
std::vector<std::vector<uint32_t>> vec_rows;
std::vector<std::vector<uint32_t>> vec_rows_remain;
std::vector<std::unique_ptr<const Column<uint8_t> >> columns8;
std::vector<const DenseColumn<uint8_t, true>*> dcolumns8;
std::vector<const SparseColumn<uint8_t>*> scolumns8;
std::vector<std::unique_ptr<const Column<uint16_t> >> columns16;
std::vector<const DenseColumn<uint16_t, true>*> dcolumns16;
std::vector<const SparseColumn<uint16_t>*> scolumns16;
std::vector<std::unique_ptr<const Column<uint32_t> >> columns32;
std::vector<const DenseColumn<uint32_t, true>*> dcolumns32;
std::vector<const SparseColumn<uint32_t>*> scolumns32;
std::vector<std::vector<size_t> > states;
const RegTree* p_tree;
// can be common for all threads!
std::vector<std::vector<bool>> default_flags;
const uint8_t* data_hash;
std::vector<uint32_t> row_set_collection_vec;
uint32_t gmat_n_rows;
uint32_t* row_indices_ptr;
size_t n_threads = 0;
uint32_t summ_size = 0;
uint32_t summ_size_remain = 0;
uint32_t max_depth = 0;
29 public variables ... how do you keep track of that ...
src/common/opt_partition_builder.h
Outdated
const auto& dense_columns = GetDenseColumnsRef<BinIdxType>();
const auto& sparse_columns = GetSparseColumnsRef<BinIdxType>();
uint32_t count = 0;
uint32_t count2 = 0;
temp_0, temp_1 are not a good way to name variables.
changed, thanks!
const RowSetCollection::Elem e = row_set_collection_[nid];
for (const size_t *it = e.begin; it < e.end; ++it) {
  grad_stat.Add(gpair_h[*it].GetGrad(), gpair_h[*it].GetHess());
for (const GradientPair& gh : gpair_h) {
Maybe std::accumulate?
it would affect the precision; otherwise, we would need to support a new operator+(xgboost::detail::GradientPairInternal<double>, const xgboost::detail::GradientPairInternal<float>)
src/tree/updater_quantile_hist.cc
Outdated
for (auto const& entry : expand) {
  if (entry.IsValid(param_, *num_leaves)) {
    nodes_for_apply_split->push_back(entry);
    evaluator_->ApplyTreeSplit(entry, p_tree);
    (*num_leaves)++;
    curr_level_nodes_[2*entry.nid] = (*p_tree)[entry.nid].LeftChild();
    curr_level_nodes_[2*entry.nid + 1] = (*p_tree)[entry.nid].RightChild();
    compleate_node_ids.push_back((*p_tree)[entry.nid].LeftChild());
compleate?
fixed, thanks!
src/common/opt_partition_builder.h
Outdated
}

template<typename BinIdxType, bool is_loss_guided, bool all_dense = true>
void CommonPartition(size_t tid, const size_t row_indices_begin,
src/common/opt_partition_builder.h
Outdated
std::vector<std::unique_ptr<const Column<uint32_t> >> columns32;
std::vector<const DenseColumn<uint32_t, true>*> dcolumns32;
std::vector<const SparseColumn<uint32_t>*> scolumns32;
std::vector<std::vector<size_t> > states;
src/tree/updater_quantile_hist.cc
Outdated
    has_neg_hess = true;
  }
}
const size_t block_size = info.num_row_ / this->nthread_ + !!(info.num_row_ % this->nthread_);
This block size calculation is used in multiple places, including prediction, I think. Any plan to create some abstraction for it?
https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#Rp-library
std::swap is a small function, yet it's a worthy function to create.
Added a new function GetBlockSize and tried to use it in all similar cases. Thanks!
src/common/opt_partition_builder.h
Outdated
  vec_rows_remain.resize(nthreads);
} else {
  threads_nodes_count.clear();
  threads_nodes_count.resize(n_threads);
Why clear then resize?
We need to clear the full threads_nodes_count buffers. If we keep only resize(), the information in threads_nodes_count is not valid.
(force-pushed from fd8244c to 2e08a58)
@trivialfis thanks a lot for the review, and sorry for the long response time!
I have applied some of the comments, but the ones related to code complexity, the number of input arguments, and reworking in general are still in progress.
(force-pushed from eeb7a88 to 8ed8a91)
Got it. For some reason, I can't reply inline. I will take a deeper dive into the optimization and see if we can work on smaller subsets first. Thanks for the hard work!
Hi, I took another look, but only at the histogram building kernel this time. Can we start by working on this function first? Merging smaller functions makes review and testing easier, and it also has the benefit of decoupling different code sections. What do you think?
const size_t nrows = row_end - row_begin;
const size_t no_prefetch_size = Prefetch::NoPrefetchSize(nrows);

if (is_root) {
May I ask what's the difference between is_root and the existing contiguousBlock?
A contiguous block is mostly possible on the root node; it's a really rare case for nodes with depth > 0 to have contiguous blocks, so there is no need to handle such cases specifically.
Also, the is_root parameter handles resetting the node_ids buffer to zero. As described above, we need to track the node id for each row (rows are not grouped per node), so we have to reset the node_ids buffer to zero at the start of each tree build.
const uint32_t row_end,
const GHistIndexMatrix& gmat,
const BinIdxType* numa_data,
uint16_t* nodes_ids,
Could you please share why we need to pass nodes_ids for building the histogram, instead of passing the histogram buffer for this particular node? That way this function would have a smaller scope, which makes it easier to test.
Also, I think the node id should have the 32-bit type bst_node_t.
Sure.
The changes in hist_builder.h were introduced due to changes in the partition builder.
Before the optimizations, the "full" bin matrix went through memory for each node on the same level during histogram building: potentially, for each node on a specific level, we had to read the bin matrix almost from beginning to end with random strides. Memory bandwidth drops significantly under such conditions.
So, to mitigate this problem, the optimized code keeps "sorted" row indices for each specific level (though no actual sorting is done! :) ), and as a side effect (rows are not grouped by nodes) we have to know which node_id each specific row belongs to. node_id is required to know which node's histogram should be updated with a specific bin matrix row.
Sorry if the description is not clear; please let me know and I'll try to add more details.
As for the 32-bit type usage: right now it works perfectly for max_depth <= 16. I have to check the performance impact before changing it from 16 to 32 bits, and I hope there won't be any reason to add a template parameter for it.
Thank you for the detailed explanation! That is an insightful optimization.
For future reference with these types of optimizations, I think you should put that on the front page of the PR before anything else and in a code comment, and potentially open an RFC like #7308 before making the change, so we can be clear about the design choice. (I write design docs from time to time just to keep myself clear about the changes ...) For this PR, I asked myself again: there's no way I could guess what you are trying to do by looking at the code. ;-)
So, let me try to summarize your summary ;-) Feel free to correct me if I'm wrong. Instead of asking which row should we process given a node index, you ask which node should we process given a row index. This optimization is only applicable when the depth-wise grow policy is used, so the code is split and specialized.
Can you design an intuitive interface for the partitioner and the histogram builder so that the code can be easily reused? I can give it some thought if you can confirm my summary.
As for 32-bit type usage, now it works for max_depth <= 16 perfectly.
I don't think this optimization is necessary.
Instead of asking which row should we process given a node index, you ask which node should we process given a row index. This optimization is only applicable when the depth-wise grow policy is used, so the code is split and specialized.
It's clear and right summary :)
Yes. I had to specialize the partition-builder-related code for the depth-wise and loss-guided grow policies, as for the loss-guided policy we have to know exactly which set of rows is split to the left child node and which to the right.
Can you design an intuitive interface for the partitioner and the histogram builder so that the code can be easily reused? I can give it some thoughts if you can confirm my summary.
Would it be acceptable if the OptPartitionBuilder class supported the depthwise and lossguided policies via the strategy pattern, instead of the is_loss_guided template parameter?
As for the histogram builder, there are no significant changes in BuildHistKernel, only the node_id usage, which is applicable to both growing policies.
const size_t icol_end = any_missing ? row_ptr[ri+1] : icol_start + n_features;
const size_t row_size = icol_end - icol_start;
const size_t idx_gh = two * ri;
const uint32_t nid = is_root ? 0 : mapping_ids[nodes_ids[ri]];
It's unclear to me what this mapping is doing.
This is a simple mapping of the real node id (in the tree currently being built) to the node number on the current depth (0 <= nid < 2^current_depth). This mapping allows re-using the allocated histograms.
template<typename FPType, bool do_prefetch,
         typename BinIdxType, bool is_root,
         bool any_missing>
void BuildHistKernel(const std::vector<GradientPair>& gpair,
Excellent that it's now in a header so we don't have to define those specializations!
yes, it seems some simplification was achieved :)
(force-pushed from a5ec1d6 to fecf89d)
@trivialfis Thanks a lot for the review!
src/common/opt_partition_builder.h
Outdated
@@ -204,6 +204,13 @@ class OptPartitionBuilder {
  if (!all_dense) {
    std::vector<size_t>& local_states = states[tid];
    std::vector<bool>& local_default_flags = default_flags[tid];
    if (row_indices_begin >= row_indices_end) {
Hey, I saw that you pushed a fix, presumably for something found during local testing. But really, there's no way I can guess what you are trying to fix. These few lines look like a hacky workaround for some edge cases. That's fine; I do that all the time during development. But after the trial-and-error period, can we think of a more general design that doesn't have these conditions at all? Or do some cleanup so that the code can explain itself?
I changed this fix, thanks for the comment!
It doesn't look like a hacky workaround anymore ;)
And I agree that a clearer and more general design should be proposed for OptPartitionBuilder, possibly by applying the strategy pattern.
I'm happy to schedule a call if needed. I think we have some different points of view, and it would be great to share more.
(force-pushed from 906d5a9 to d155edc)
(force-pushed from bf0705f to b327799)
…ation_part_applysplit
(force-pushed from b327799 to 3b08089)
All changes in this PR are related to each other: the new partition builder required introducing changes into the histogram building kernels, and neither part works separately.
…ation_part_applysplit
(force-pushed from 5307902 to efb4f50)
…ation_part_applysplit
(force-pushed from d41893c to 2525030)
…Matrix are implemented.
Hello, everyone.
(force-pushed from 3ccb329 to 7780172)
…ation_part_applysplit
(force-pushed from 32aef0d to 9227502)
it's part of #7192
The current implementation of the partition allows reading bin matrix data with less random access