[pyspark][doc] add more doc for pyspark #8271

wbo4958 · 2022-09-26T03:40:19Z

Add documentation for PySpark GPU support.

wbo4958 · 2022-09-26T10:23:59Z

@WeichenXu123 @trivialfis please help to review this PR. Thx

trivialfis · 2022-09-26T11:08:20Z

doc/tutorials/spark_estimator.rst

+
+We recommend using Conda or Virtualenv to manage python dependencies
+in PySpark. Please refer to
+`How to Manage Python Dependencies in PySpark <https://www.databricks.com/blog/2020/12/22/how-to-manage-python-dependencies-in-pyspark.html>`_.


This varies between different providers of pyspark environments. On dataproc we can't submit the environment through spark-submit.

I mentioned that this tutorial is for Spark standalone mode; I didn't want to bring other CSPs into xgboost.
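
For reference, the standalone-mode flow described in the linked post can be sketched roughly as below. This is only an illustration under stated assumptions: the environment name `xgboost_env`, the package list, and `train.py` are hypothetical placeholders, and the helper only assembles command strings without running anything. As noted above, other providers such as Dataproc may not accept an environment passed through spark-submit.

```python
def conda_pack_workflow(env_name="xgboost_env", script="train.py",
                        packages=("conda-pack", "pandas", "pyarrow")):
    """Return shell commands that pack a Conda env and ship it with spark-submit.

    Every name here is a hypothetical placeholder; adjust to your own setup.
    """
    archive = f"{env_name}.tar.gz"
    return [
        # Create an environment holding the job's dependencies plus conda-pack.
        f"conda create -y -n {env_name} -c conda-forge {' '.join(packages)}",
        f"conda activate {env_name}",
        # Pack the active environment into a relocatable archive.
        f"conda pack -f -o {archive}",
        # Driver-side interpreter; not needed in cluster deploy modes.
        "export PYSPARK_DRIVER_PYTHON=python",
        # Executors use the interpreter unpacked from the archive.
        "export PYSPARK_PYTHON=./environment/bin/python",
        # '#environment' is the directory name the archive is unpacked under.
        f"spark-submit --archives {archive}#environment {script}",
    ]


if __name__ == "__main__":
    print("\n".join(conda_pack_workflow()))
```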

trivialfis · 2022-09-26T11:11:01Z

doc/tutorials/spark_estimator.rst

+  # load the model
+  model2 = SparkXGBRankerModel.load("/tmp/xgboost-pyspark-model")
+
+The above code snippet shows how to save/load xgboost pyspark model. And you can also


I strongly prefer using the booster attribute and would like to keep this special file name as a workaround that should be used sparingly.

Hmm, this is the standard Spark way to save/load a model. I can also mention the booster attribute.
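
To illustrate the booster-based route mentioned here, a minimal sketch follows. It assumes the fitted PySpark model exposes `get_booster()` returning a regular `xgboost.Booster`, whose `save_model` writes a standard model file. The helper name `export_booster` and the output path are hypothetical.

```python
def export_booster(spark_model, path):
    """Save the underlying booster of a fitted xgboost PySpark model.

    `spark_model` is assumed to be a fitted model such as SparkXGBRankerModel.
    """
    booster = spark_model.get_booster()  # plain xgboost.Booster
    booster.save_model(path)             # a regular xgboost model file
    return path


# Hypothetical usage, given a fitted `model`:
#   export_booster(model, "/tmp/xgboost-model.json")
#   bst = xgboost.Booster(model_file="/tmp/xgboost-model.json")
```

A file saved this way can be loaded by plain xgboost without Spark, which is the reason to prefer it over reading files out of the Spark model directory.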

trivialfis · 2022-09-26T11:12:38Z

doc/tutorials/spark_estimator.rst

+you can accelerate the whole pipeline (ETL, Train, Transform) for xgboost pyspark
+without any code change by leveraging GPU.
+
+You only need to add some configurations to enable RAPIDS plugin when submitting.


Suggested change

-You only need to add some configurations to enable RAPIDS plugin when submitting.
+You only need to add some configurations to enable RAPIDS plugin when submitting.
+
+Below is a simple example submit command for enabling GPU acceleration:
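
As a rough sketch of the kind of submit command this suggestion refers to, the helper below assembles spark-submit arguments that enable the RAPIDS plugin. The function name, master URL, script name, and plugin version are hypothetical placeholders, and a Scala 2.12 build of the plugin is assumed; the exact options merged into the tutorial may differ.

```python
def gpu_submit_args(script, master, rapids_version):
    """Assemble a spark-submit argument list that enables the RAPIDS plugin."""
    return [
        "spark-submit",
        "--master", master,
        # Fetch the RAPIDS Accelerator jar from Maven (Scala 2.12 build assumed).
        "--packages", f"com.nvidia:rapids-4-spark_2.12:{rapids_version}",
        # Register the plugin so SQL/DataFrame operations can run on the GPU.
        "--conf", "spark.plugins=com.nvidia.spark.SQLPlugin",
        script,
    ]


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(" ".join(gpu_submit_args("train.py", "spark://host:7077", "22.08.0")))
```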

WeichenXu123

LGTM

wbo4958 · 2022-09-27T10:58:06Z

@trivialfis please help to review it again. Thx

trivialfis · 2022-09-29T01:39:19Z

@wbo4958 I made some modifications to your PR, do you have any concerns?

wbo4958 · 2022-09-29T02:05:50Z

It looks much better than my original wording. Thx

[pyspark][doc] add more doc for pyspark

00a9b4e

wbo4958 marked this pull request as ready for review September 26, 2022 03:40

Merge remote-tracking branch 'upstream/master' into pyspark-doc

6f8fc34

trivialfis reviewed Sep 26, 2022

WeichenXu123 approved these changes Sep 26, 2022

resolve comments

668e492

trivialfis added this to In progress in 1.7 Roadmap Sep 28, 2022

Merge branch 'master' into pyspark-doc

80f51af

wbo4958 mentioned this pull request Sep 28, 2022

1.7.0 Release Roadmap #8282

Closed

Description.

7cf4e6a

trivialfis approved these changes Sep 28, 2022

1.7 Roadmap automation moved this from In progress to Reviewer approved Sep 28, 2022

trivialfis merged commit cbf3a5f into dmlc:master Sep 29, 2022

1.7 Roadmap automation moved this from Reviewer approved to Done Sep 29, 2022

wbo4958 deleted the pyspark-doc branch April 23, 2024 07:43
