Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[doc][dask] Update notes about k8s. #10271

Merged
merged 3 commits into from
May 13, 2024
Merged

Conversation

trivialfis
Copy link
Member

The new version of dask Kubernetes is based on the k8s operator. In addition, I added notes about XGBoost expects pods to remain unchanged during training.

@trivialfis trivialfis changed the title [dock][dask] Update notes about k8s. [doc][dask] Update notes about k8s. May 13, 2024
@trivialfis trivialfis merged commit 871fabe into dmlc:master May 13, 2024
22 of 29 checks passed
@trivialfis trivialfis deleted the doc-dask-k8s branch May 13, 2024 20:21
trivialfis added a commit that referenced this pull request May 23, 2024
* [pyspark] rework the log (#10077)

* Add CUDA iterator to tensor view. (#10074)

* Disable column sample by node for the exact tree method. (#10083)

* [R] Refactor callback structure and attributes (#9957)

* [sycl] add partitioning and related tests (#10080)

Co-authored-by: Dmitry Razdoburdin <>

* Small cleanup for mock tests. (#10085)

* [CI] Test R package with CMake (#10087)

* [CI] Test R package with CMake

* Fix

* Fix

* Update test_r_package.py

* Fix CMake flag for R package

* Install system deps

* Fix

* Use sudo

* [CI] Cancel GH Action job if a newer commit is published (#10088)

* Optional normalization for learning to rank. (#10094)

* Support graphviz plot for multi-target tree. (#10093)

* [R] Rename `watchlist` -> `evals` (#10032)

* [doc] Fix the default value for `lambdarank_pair_method`. (#10098)

* Fix pairwise objective with NDCG metric along with custom gain. (#10100)

* Fix pairwise objective with NDCG metric.

- Allow setting `ndcg_exp_gain` for `rank:pairwise`.

This is useful when using pairwise for objective but ndcg for metric.

* [R] deprecate watchlist (#10110)

* [SYCL] Add split evaluation (#10119)



---------

Co-authored-by: Dmitry Razdoburdin <>

* Fix compilation with the latest ctk. (#10123)

* Use `std::uint64_t` for row index. (#10120)


- Use std::uint64_t instead of size_t to avoid implementation-defined type.
- Rename to bst_idx_t, to account for other types of indexing.
- Small cleanup to the base header.

* Work with IPv6 in the new tracker. (#10125)

* [CI] Update scorecard actions. (#10133)

* [CI] Fix yml in github action. (#10134)

* add sycl reaslisation of ghist builder (#10138)

Co-authored-by: Dmitry Razdoburdin <>

* Cleanup set info. (#10139)

- Use the array interface internally.
- Deprecate `XGDMatrixSetDenseInfo`.
- Deprecate `XGDMatrixSetUIntInfo`.
- Move the handling of `DataType` into the deprecated C function.

---------

Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>

* Update collective implementation. (#10152)

* Update collective implementation.

- Cleanup resource during `Finalize` to avoid handling threads in destructor.
- Calculate the size for allgather automatically.
- Use simple allgather for small (smaller than the number of worker) allreduce.

* [R] Make `xgb.cv` work with `xgb.DMatrix` only, adding support for survival and ranking fields (#10031)



---------

Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>

* docs: fix bug in tutorial (#10143)

* Bump org.apache.maven.plugins:maven-gpg-plugin from 3.1.0 to 3.2.2 in /jvm-packages/xgboost4j-spark (#10151)

* Fix pyspark with verbosity=3. (#10172)

* Fix global config for external memory. (#10173)

Pass the thread-local configuration between threads.

* [doc] Update python3statement URL (#10179)

* [CI] Update create-pull-request action

* [SYCL] Add basic features for QuantileHistMaker (#10174)


---------

Co-authored-by: Dmitry Razdoburdin <>

* [CI] Use latest RAPIDS; Pandas 2.0 compatibility fix (#10175)

* [CI] Update RAPIDS to latest stable

* [CI] Use rapidsai stable channel; fix syntax errors in Dockerfile.gpu

* Don't combine astype() with loc()

* Work around #10181

* Fix formatting

* Fix test

---------

Co-authored-by: hcho3 <hcho3@users.noreply.github.com>
Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>

* docs: update Ruby package link (#10182)

* [CI] Reduce clutter from dependabot (#10187)

* [jvm-packages] Ombinus patch to update all minor dependencies (#10188)

* Fold in #10184

* Fold in #10176

* Fold in #10168

* Fold in #10165

* Fold in #10164

* Fold in #10155

* Fold in #10062

* Fold in #9984

* Fold in #9843

* Upgrade to Maven 3.6.3

* Bump org.apache.maven.plugins:maven-jar-plugin (#10191)

Bumps [org.apache.maven.plugins:maven-jar-plugin](https://github.com/apache/maven-jar-plugin) from 3.3.0 to 3.4.0.
- [Release notes](https://github.com/apache/maven-jar-plugin/releases)
- [Commits](apache/maven-jar-plugin@maven-jar-plugin-3.3.0...maven-jar-plugin-3.4.0)

---
updated-dependencies:
- dependency-name: org.apache.maven.plugins:maven-jar-plugin
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* [coll] Improve event loop. (#10199)

- Add a test for blocking calls.
- Do not require the queue to be empty after waking up; this frees up the thread to answer blocking calls.
- Handle EOF in read.
- Improve the error message in the result. Allow concatenation of multiple results.

* [CI] Update machine images (#10201)

* Bump org.apache.maven.plugins:maven-jar-plugin (#10202)

Bumps [org.apache.maven.plugins:maven-jar-plugin](https://github.com/apache/maven-jar-plugin) from 3.3.0 to 3.4.0.
- [Release notes](https://github.com/apache/maven-jar-plugin/releases)
- [Commits](apache/maven-jar-plugin@maven-jar-plugin-3.3.0...maven-jar-plugin-3.4.0)

---
updated-dependencies:
- dependency-name: org.apache.maven.plugins:maven-jar-plugin
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* [pyspark] Reuse the collective communicator. (#10198)

* Bump org.scala-lang.modules:scala-collection-compat_2.12 (#10193)

Bumps [org.scala-lang.modules:scala-collection-compat_2.12](https://github.com/scala/scala-collection-compat) from 2.11.0 to 2.12.0.
- [Release notes](https://github.com/scala/scala-collection-compat/releases)
- [Commits](scala/scala-collection-compat@v2.11.0...v2.12.0)

---
updated-dependencies:
- dependency-name: org.scala-lang.modules:scala-collection-compat_2.12
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump scalatest.version from 3.2.17 to 3.2.18 in /jvm-packages/xgboost4j (#10196)

Bumps `scalatest.version` from 3.2.17 to 3.2.18.

Updates `org.scalatest:scalatest_2.12` from 3.2.17 to 3.2.18
- [Release notes](https://github.com/scalatest/scalatest/releases)
- [Commits](scalatest/scalatest@release-3.2.17...release-3.2.18)

Updates `org.scalactic:scalactic_2.12` from 3.2.17 to 3.2.18
- [Release notes](https://github.com/scalatest/scalatest/releases)
- [Commits](scalatest/scalatest@release-3.2.17...release-3.2.18)

---
updated-dependencies:
- dependency-name: org.scalatest:scalatest_2.12
  dependency-type: direct:development
  update-type: version-update:semver-patch
- dependency-name: org.scalactic:scalactic_2.12
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* [coll] Add global functions. (#10203)

* Bump org.apache.flink:flink-clients in /jvm-packages (#10197)

Bumps [org.apache.flink:flink-clients](https://github.com/apache/flink) from 1.18.0 to 1.19.0.
- [Commits](apache/flink@release-1.18.0...release-1.19.0)

---
updated-dependencies:
- dependency-name: org.apache.flink:flink-clients
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* [pyspark] support stage-level for yarn/k8s (#10209)

* [coll] Implement shutdown for tracker and comm. (#10208)


- Force shutdown the tracker.
- Implement shutdown notice for error handling thread in comm.

* [doc] Add typing to dask demos. (#10207)

* [SYCL] Add sampling initialization (#10216)



---------

Co-authored-by: Dmitry Razdoburdin <>

* [CI] Test new setup-r. (#10228)

* [CI] Use native arm64 worker in GHAction to build M1 wheel (#10225)

* [CI] Use native arm64 worker in GHAction to build M1 wheel

* Set up Conda

* Use mamba

* debug

* fix

* fix

* fix

* fix

* fix

* Temporarily disable other tests

* Fix prefix

* Use micromamba

* Use conda-incubator/setup-miniconda

* Use mambaforge

* Fix

* Fix prefix

* Don't use deprecated set-output

* Add verbose output from build

* verbose

* Specify arch

* Bump setup-miniconda to v3

* Use Python 3.9

* Restore deleted files

* WAR.

---------

Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>

* Bump hadoop.version from 3.3.6 to 3.4.0 in /jvm-packages/xgboost4j (#10156)

Bumps `hadoop.version` from 3.3.6 to 3.4.0.

Updates `org.apache.hadoop:hadoop-hdfs` from 3.3.6 to 3.4.0

Updates `org.apache.hadoop:hadoop-common` from 3.3.6 to 3.4.0

---
updated-dependencies:
- dependency-name: org.apache.hadoop:hadoop-hdfs
  dependency-type: direct:production
  update-type: version-update:semver-minor
- dependency-name: org.apache.hadoop:hadoop-common
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump net.alchim31.maven:scala-maven-plugin in /jvm-packages/xgboost4j (#10217)

Bumps net.alchim31.maven:scala-maven-plugin from 4.8.1 to 4.9.0.

---
updated-dependencies:
- dependency-name: net.alchim31.maven:scala-maven-plugin
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump org.apache.maven.plugins:maven-jar-plugin (#10210)

Bumps [org.apache.maven.plugins:maven-jar-plugin](https://github.com/apache/maven-jar-plugin) from 3.4.0 to 3.4.1.
- [Release notes](https://github.com/apache/maven-jar-plugin/releases)
- [Commits](apache/maven-jar-plugin@maven-jar-plugin-3.4.0...maven-jar-plugin-3.4.1)

---
updated-dependencies:
- dependency-name: org.apache.maven.plugins:maven-jar-plugin
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump org.apache.maven.plugins:maven-gpg-plugin (#10211)

Bumps [org.apache.maven.plugins:maven-gpg-plugin](https://github.com/apache/maven-gpg-plugin) from 3.2.3 to 3.2.4.
- [Release notes](https://github.com/apache/maven-gpg-plugin/releases)
- [Commits](apache/maven-gpg-plugin@maven-gpg-plugin-3.2.3...maven-gpg-plugin-3.2.4)

---
updated-dependencies:
- dependency-name: org.apache.maven.plugins:maven-gpg-plugin
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* [pyspark] Sort workers by task ID. (#10220)

* Bump org.apache.spark:spark-mllib_2.12 (#10070)

Bumps org.apache.spark:spark-mllib_2.12 from 3.4.1 to 3.5.1.

---
updated-dependencies:
- dependency-name: org.apache.spark:spark-mllib_2.12
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Support more sklearn tags for testing. (#10230)

* Update nvtx. (#10227)

* [sycl] add data initialisation for training (#10222)

Co-authored-by: Dmitry Razdoburdin <>
Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>

* Fixes for numpy 2.0. (#10252)

* [jvm-packagaes] Freeze spark to 3.4.1 for now. (#10253)

The newer spark version for CPU conflicts with the more conservative version used by
rapids.

* [jvm-packages] fix group col for gpu packages (#10254)

* [sycl] add loss guided hist building (#10251)

Co-authored-by: Dmitry Razdoburdin <>

* Be more lenient on floating point error for AUC. (#10264)

* [CI] Upgrade setup-r. (#10267)

* Fixes for the latest pandas. (#10266)

Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>

* Keep GitHub Actions up to date with Dependabot (#10268)

# Fixes software supply chain safety warnings like at the bottom right of
https://github.com/dmlc/xgboost/actions/runs/9048469681

* [Keeping your actions up to date with Dependabot](https://docs.github.com/en/code-security/dependabot/working-with-dependabot/keeping-your-actions-up-to-date-with-dependabot)
* [Configuration options for the dependabot.yml file - package-ecosystem](https://docs.github.com/en/code-security/dependabot/dependabot-version-updates/configuration-options-for-the-dependabot.yml-file#package-ecosystem)

* [doc][dask] Update notes about k8s. (#10271)

* [CI] Fixes for using the latest modin. (#10285)

* Release data in cache. (#10286)

* Adopt new logo (#10270)

* Use a thread pool for external memory. (#10288)

* Fix pylint. (#10296)

* Revamp the rabit implementation. (#10112)

This PR replaces the original RABIT implementation with a new one, which has already been partially merged into XGBoost. The new one features:
- Federated learning for both CPU and GPU.
- NCCL.
- More data types.
- A unified interface for all the underlying implementations.
- Improved timeout handling for both tracker and workers.
- Exhausted tests with metrics (fixed a couple of bugs along the way).
- A reusable tracker for Python and JVM packages.

* Bump conda-incubator/setup-miniconda from 2.1.1 to 3.0.4 (#10278)

Bumps [conda-incubator/setup-miniconda](https://github.com/conda-incubator/setup-miniconda) from 2.1.1 to 3.0.4.
- [Release notes](https://github.com/conda-incubator/setup-miniconda/releases)
- [Commits](conda-incubator/setup-miniconda@v2.1.1...v3.0.4)

---
updated-dependencies:
- dependency-name: conda-incubator/setup-miniconda
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>

* Bump ossf/scorecard-action from 2.3.1 to 2.3.3 (#10280)

Bumps [ossf/scorecard-action](https://github.com/ossf/scorecard-action) from 2.3.1 to 2.3.3.
- [Release notes](https://github.com/ossf/scorecard-action/releases)
- [Changelog](https://github.com/ossf/scorecard-action/blob/main/RELEASE.md)
- [Commits](ossf/scorecard-action@0864cf1...dc50aa9)

---
updated-dependencies:
- dependency-name: ossf/scorecard-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>

* Bump actions/checkout from 2 to 4 (#10274)

Bumps [actions/checkout](https://github.com/actions/checkout) from 2 to 4.
- [Release notes](https://github.com/actions/checkout/releases)
- [Commits](actions/checkout@v2...v4)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>

* Bump commons-logging:commons-logging in /jvm-packages/xgboost4j (#10294)

Bumps commons-logging:commons-logging from 1.3.1 to 1.3.2.

---
updated-dependencies:
- dependency-name: commons-logging:commons-logging
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>

* [CI] Bump checkout action version. (#10305)

* [SYCL] Add nodes initialisation (#10269)


---------

Co-authored-by: Dmitry Razdoburdin <>
Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>

* Bump mamba-org/provision-with-micromamba from 14 to 16 (#10275)

Bumps [mamba-org/provision-with-micromamba](https://github.com/mamba-org/provision-with-micromamba) from 14 to 16.
- [Release notes](https://github.com/mamba-org/provision-with-micromamba/releases)
- [Commits](mamba-org/provision-with-micromamba@f347426...3c96c0c)

---
updated-dependencies:
- dependency-name: mamba-org/provision-with-micromamba
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* [JVM-packages] Prevent memory leak. (#10307)

* Bump dorny/paths-filter from 2 to 3 (#10276)

Bumps [dorny/paths-filter](https://github.com/dorny/paths-filter) from 2 to 3.
- [Release notes](https://github.com/dorny/paths-filter/releases)
- [Changelog](https://github.com/dorny/paths-filter/blob/master/CHANGELOG.md)
- [Commits](dorny/paths-filter@v2...v3)

---
updated-dependencies:
- dependency-name: dorny/paths-filter
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>

* Bump org.apache.maven.plugins:maven-deploy-plugin (#10235)

Bumps [org.apache.maven.plugins:maven-deploy-plugin](https://github.com/apache/maven-deploy-plugin) from 3.1.1 to 3.1.2.
- [Release notes](https://github.com/apache/maven-deploy-plugin/releases)
- [Commits](apache/maven-deploy-plugin@maven-deploy-plugin-3.1.1...maven-deploy-plugin-3.1.2)

---
updated-dependencies:
- dependency-name: org.apache.maven.plugins:maven-deploy-plugin
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>

* Add timeout for distributed tests. (#10315)

* [coll] Keep the tracker alive during initialization error. (#10306)

* Fix non-fed.

* Fix non-fed.

* macos.

* macos.

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Bobby Wang <wbo4958@gmail.com>
Co-authored-by: david-cortes <david.cortes.rivera@gmail.com>
Co-authored-by: Dmitry Razdoburdin <dmitry.razdoburdin@intel.com>
Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
Co-authored-by: Michael Mayer <mayermichael79@gmail.com>
Co-authored-by: Fabi <117525608+fabfabi@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Trinh Quoc Anh <trinhquocanh94@gmail.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: hcho3 <hcho3@users.noreply.github.com>
Co-authored-by: Eric Leung <2754821+erictleung@users.noreply.github.com>
Co-authored-by: Christian Clauss <cclauss@me.com>
Co-authored-by: Dmitry Razdoburdin <d.razdoburdin@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants