[Roadmap] 1.4.0 Roadmap #6500

trivialfis · 2020-12-14T19:35:17Z

@dmlc/xgboost-committer Please add your items here by editing this post. Let's ensure that:

- Each item is associated with a ticket.
- Major design or refactoring work is associated with an RFC before the code is committed.
- Blocking issues are marked as blocking.
- Breaking changes are marked as breaking.

For other contributors who have no permission to edit the post, please comment here about what you think should be in 1.4.0.

Main

Dask

- Make dask scikit-learn interface feature complete. ([dask] Complete features in regressor and classifier. #6471, [dask] Random forest estimators #6602)
  - Add ranking for dask. ([dask] Add DaskXGBRanker #6576)
  - Add other random forest interfaces. ([dask] Random forest estimators #6602)
  - Use inplace prediction for dask. ([breaking] Add prediction fucntion for DMatrix and use inplace predict for dask. #6668)
- Support training multiple models in parallel using dask. ([dask] Use distributed.MultiLock #6743)
- Optimize prediction performance. ([dask] Specify shape in prediction contrib and interaction. #6614)
- Support shap value computation. ([dask] Add a dummy sample to infer output shape. #6645)

In brief: by 1.4, the dask interface should be feature complete, categorical data support for GPU should be ready for public testing, and inplace prediction will be more mature.


Roffild · 2020-12-29T14:38:55Z

#6507

SmirnovEgorRu · 2020-12-29T21:17:14Z

I want to propose replacing 'approx' with 'hist' when tree_method is set to 'auto' on CPU. I saw that #5178 was about this, but it's closed.

hist is faster than approx, and in my observations its accuracy is even better or on par. GPUs also have only the 'hist' method, not 'approx'.

Are there any concerns about doing this?

CC: @trivialfis, @hcho3, @ShvetsKS
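
Until the default changes, one way to avoid depending on what 'auto' resolves to is to pin tree_method explicitly. Below is a minimal sketch, assuming the usual params-dict style; `make_params` is a hypothetical helper name, not part of xgboost, and the objective is just a placeholder.

```python
# Hypothetical helper (not an xgboost API): build a params dict with the
# tree method pinned, so behaviour does not depend on what 'auto' picks.
ALLOWED_TREE_METHODS = {"auto", "exact", "approx", "hist", "gpu_hist"}


def make_params(tree_method="hist"):
    if tree_method not in ALLOWED_TREE_METHODS:
        raise ValueError(f"unknown tree_method: {tree_method!r}")
    # The objective here is only a placeholder for illustration.
    return {"objective": "reg:squarederror", "tree_method": tree_method}


params = make_params()          # pins 'hist'
baseline = make_params("approx")  # the method 'hist' would replace
```

The resulting dict would be passed as the first argument to `xgboost.train`, which makes it straightforward to run the same data through both methods when comparing speed and accuracy.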

trivialfis · 2020-12-30T06:50:15Z

I will put up some documents on the theoretical aspects of the various tree methods, then we can decide together.

trivialfis · 2021-01-01T12:20:14Z

in #6564

SmirnovEgorRu · 2021-01-02T03:49:32Z

@trivialfis, should we probably run experiments to decide?

trivialfis · 2021-01-02T07:40:28Z

@SmirnovEgorRu I have no objection to changing the default in this or the next release. I mentioned to @ShvetsKS that there's a huge refactor for the CPU implementations. I would like to see some parts of it merged before making the change so we can make a fair comparison. I will come back to it after sorting out issues in the dask interface (which should be quite fast, as most of the features are now supported).

trivialfis · 2021-01-02T07:45:46Z

I closed the PR you referenced because I couldn't get all tests passing. I think even if we decided to change the default now, we would still have some blockers to track down. So refactoring first might help make the change clearer and easier.

trivialfis · 2021-01-14T18:17:28Z

Hi @ShvetsKS @SmirnovEgorRu I have been trying to refactor the CPU code for categorical data support based on the efficient CPU hist code. I found that on the URL dataset, CPU hist is slower than approx. It's not a conventional dataset, as it's unusually wide and sparse. Just curious whether you have plans to optimize it.

trivialfis · 2021-01-14T18:19:25Z

The approx implementation parallelizes over features with dynamic scheduling, so it has an advantage on this kind of dataset.
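
To illustrate the idea of feature-parallel work with dynamic scheduling, here is a toy sketch in plain Python. It is not the xgboost implementation (which is C++ with OpenMP); the column layout and the function names are assumptions made up for this example. Each feature is one task, and an idle worker picks up the next pending feature, so a few dense columns do not hold back the many near-empty ones.

```python
from concurrent.futures import ThreadPoolExecutor


def column_gradient_sum(column):
    """Toy per-feature work: sum the gradients of rows where the feature is present."""
    return sum(grad for _row, grad in column)


def scan_features_dynamic(columns, n_workers=4):
    # One task per feature. Workers pull the next pending task as soon as
    # they finish, which is the "dynamic scheduling" part of the idea.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(column_gradient_sum, columns))


# Sparse, column-wise toy data: each column is a list of (row_index, gradient).
columns = [
    [(0, 0.5), (3, -0.25)],
    [],  # a feature with no non-missing entries
    [(1, 1.0)],
    [(0, 0.25), (1, 0.25), (2, 0.25), (3, 0.25)],
]
totals = scan_features_dynamic(columns)
```

Python threads only demonstrate the scheduling pattern here; they would not give a real speed-up for this kind of CPU-bound loop.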

SmirnovEgorRu · 2021-01-15T05:59:25Z

@trivialfis, yep, we are thinking about how to tune for wide datasets as well. I suppose we can outperform approx with hist on URL.

Denisevi4 · 2021-02-23T01:39:05Z

"Support training multiple models in parallel using dask". Does this include cross-validation with early stopping?

trivialfis · 2021-02-23T18:38:19Z

@Denisevi4 No, it's for running multiple training sessions on a single cluster simultaneously. But it's a basic requirement for cv.
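
As a rough picture of what running several sessions on one cluster involves, the sketch below uses plain `threading` locks, one per worker, and has each session take all the locks it needs before "training". This is only an analogy for the multi-lock idea referenced in #6743, not the dask or xgboost code; the worker names and `train_session` are hypothetical.

```python
import threading

# One lock per (hypothetical) worker in the cluster.
worker_locks = {name: threading.Lock() for name in ("w0", "w1", "w2")}


def train_session(session_id, workers, log):
    # Acquire locks in a fixed (sorted) order so two sessions that share
    # workers cannot deadlock by grabbing them in opposite orders.
    ordered = sorted(workers)
    for name in ordered:
        worker_locks[name].acquire()
    try:
        # Stand-in for the actual training on the held workers.
        log.append((session_id, tuple(ordered)))
    finally:
        for name in reversed(ordered):
            worker_locks[name].release()


log = []
threads = [
    threading.Thread(target=train_session, args=("a", ["w0", "w1"], log)),
    threading.Thread(target=train_session, args=("b", ["w2", "w1"], log)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Sessions that share a worker (here `w1`) run one after the other on it, while sessions with disjoint workers could proceed at the same time.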

Roffild · 2021-02-24T10:02:36Z

#6731

trivialfis · 2021-03-04T17:38:07Z

@hcho3 I would like to get 1.4 out once we have AUC re-implemented. I can try fixing the gamma metric if the AUC re-implementation goes well.

trivialfis · 2021-03-20T08:59:25Z

I will branch out next week.

Roffild · 2021-03-20T09:12:38Z

Will you fix other metrics (gamma-nloglik, logloss)?

trivialfis · 2021-03-20T09:36:41Z

Yeah, I will take a deeper look into them this weekend.

trivialfis · 2021-03-22T18:07:43Z

@Roffild Will reply on the original thread: #6731

trivialfis · 2021-04-12T18:40:50Z

1.4 is out; submit status will be on #6793.

trivialfis added the type: roadmap label Dec 14, 2020

trivialfis pinned this issue Dec 14, 2020

hcho3 mentioned this issue Mar 29, 2021

1.4.0 Release Candidate #6793

Closed

8 tasks

trivialfis closed this as completed Apr 12, 2021

trivialfis unpinned this issue Apr 12, 2021
