Skip to content

Commit

Permalink
[doc] Fix broken links.
Browse files Browse the repository at this point in the history
* Fix most of the link checks from sphinx.
* Remove duplicate explicit target name.
  • Loading branch information
trivialfis committed Oct 19, 2021
1 parent fbb0dc4 commit 50bc97e
Show file tree
Hide file tree
Showing 7 changed files with 15 additions and 15 deletions.
2 changes: 1 addition & 1 deletion doc/contrib/release.rst
Expand Up @@ -25,6 +25,6 @@ Making a Release
4. Create a tag on release branch, either on GitHub or locally.
5. Make a release on GitHub tag page, which might be done with previous step if the tag is created on GitHub.
6. Submit pip, CRAN, and Maven packages.
- The pip package is maintained by [Hyunsu Cho](http://hyunsu-cho.io/) and [Jiaming Yuan](https://github.com/trivialfis). There's a helper script for downloading pre-built wheels on ``xgboost/dev/release-pypi.py`` along with simple instructions for using ``twine``.
- The pip package is maintained by [Hyunsu Cho](https://github.com/hcho3) and [Jiaming Yuan](https://github.com/trivialfis). There's a helper script for downloading pre-built wheels on ``xgboost/dev/release-pypi.py`` along with simple instructions for using ``twine``.
- The CRAN package is maintained by [Tong He](https://github.com/hetong007).
- The Maven package is maintained by [Nan Zhu](https://github.com/CodingCat).
6 changes: 3 additions & 3 deletions doc/gpu/index.rst
Expand Up @@ -95,13 +95,13 @@ XGBoost makes use of `GPUTreeShap <https://github.com/rapidsai/gputreeshap>`_ as
shap_interaction_values = model.predict(dtrain, pred_interactions=True)
See examples `here
<https://github.com/dmlc/xgboost/tree/master/demo/gpu_acceleration>`_.
<https://github.com/dmlc/xgboost/tree/master/demo/gpu_acceleration>`__.

Multi-node Multi-GPU Training
=============================
XGBoost supports fully distributed GPU training using `Dask <https://dask.org/>`_. For
getting started see our tutorial :doc:`/tutorials/dask` and worked examples `here
<https://github.com/dmlc/xgboost/tree/master/demo/dask>`_, also Python documentation
<https://github.com/dmlc/xgboost/tree/master/demo/dask>`__, also Python documentation
:ref:`dask_api` for complete reference.


Expand Down Expand Up @@ -238,7 +238,7 @@ Working memory is allocated inside the algorithm proportional to the number of r

The quantile finding algorithm also uses some amount of working device memory. It is able to operate in batches, but is not currently well optimised for sparse data.

If you are getting out-of-memory errors on a big dataset, try the `external memory version <../tutorials/external_memory.html>`_.
If you are getting out-of-memory errors on a big dataset, try the :doc:`external memory version </tutorials/external_memory>`.

Developer notes
===============
Expand Down
6 changes: 3 additions & 3 deletions doc/jvm/xgboost4j_spark_tutorial.rst
Expand Up @@ -79,7 +79,7 @@ The first thing in data transformation is to load the dataset as Spark's structu
StructField("class", StringType, true)))
val rawInput = spark.read.schema(schema).csv("input_path")
At the first line, we create a instance of `SparkSession <http://spark.apache.org/docs/latest/sql-programming-guide.html#starting-point-sparksession>`_ which is the entry of any Spark program working with DataFrame. The ``schema`` variable defines the schema of DataFrame wrapping Iris data. With this explicitly set schema, we can define the columns' name as well as their types; otherwise the column name would be the default ones derived by Spark, such as ``_col0``, etc. Finally, we can use Spark's built-in csv reader to load Iris csv file as a DataFrame named ``rawInput``.
At the first line, we create a instance of `SparkSession <https://spark.apache.org/docs/latest/sql-getting-started.html#starting-point-sparksession>`_ which is the entry of any Spark program working with DataFrame. The ``schema`` variable defines the schema of DataFrame wrapping Iris data. With this explicitly set schema, we can define the columns' name as well as their types; otherwise the column name would be the default ones derived by Spark, such as ``_col0``, etc. Finally, we can use Spark's built-in csv reader to load Iris csv file as a DataFrame named ``rawInput``.

Spark also contains many built-in readers for other format. The latest version of Spark supports CSV, JSON, Parquet, and LIBSVM.

Expand Down Expand Up @@ -130,7 +130,7 @@ labels. A DataFrame like this (containing vector-represented features and numeri
Dealing with missing values
~~~~~~~~~~~~~~~~~~~~~~~~~~~

XGBoost supports missing values by default (`as desribed here <https://xgboost.readthedocs.io/en/latest/faq.html#how-to-deal-with-missing-value>`_).
XGBoost supports missing values by default (`as desribed here <https://xgboost.readthedocs.io/en/latest/faq.html#how-to-deal-with-missing-values>`_).
If given a SparseVector, XGBoost will treat any values absent from the SparseVector as missing. You are also able to
specify to XGBoost to treat a specific value in your Dataset as if it was a missing value. By default XGBoost will treat NaN as the value representing missing.

Expand Down Expand Up @@ -369,7 +369,7 @@ Then we can load this model with single node Python XGBoost:

When interacting with other language bindings, XGBoost also supports saving-models-to and loading-models-from file systems other than the local one. You can use HDFS and S3 by prefixing the path with ``hdfs://`` and ``s3://`` respectively. However, for this capability, you must do **one** of the following:

1. Build XGBoost4J-Spark with the steps described in `here <https://xgboost.readthedocs.io/en/latest/jvm/index.html#installation-from-source>`_, but turning `USE_HDFS <https://github.com/dmlc/xgboost/blob/e939192978a0c152ad7b49b744630e99d54cffa8/jvm-packages/create_jni.py#L18>`_ (or USE_S3, etc. in the same place) switch on. With this approach, you can reuse the above code example by replacing "nativeModelPath" with a HDFS path.
1. Build XGBoost4J-Spark with the steps described in :ref:`here <install_jvm_packages>`, but turning `USE_HDFS <https://github.com/dmlc/xgboost/blob/e939192978a0c152ad7b49b744630e99d54cffa8/jvm-packages/create_jni.py#L18>`_ (or USE_S3, etc. in the same place) switch on. With this approach, you can reuse the above code example by replacing "nativeModelPath" with a HDFS path.

- However, if you build with USE_HDFS, etc. you have to ensure that the involved shared object file, e.g. libhdfs.so, is put in the LIBRARY_PATH of your cluster. To avoid the complicated cluster environment configuration, choose the other option.

Expand Down
6 changes: 3 additions & 3 deletions doc/parameter.rst
Expand Up @@ -366,8 +366,8 @@ Specify the learning task and the corresponding learning objective. The objectiv
- ``rank:pairwise``: Use LambdaMART to perform pairwise ranking where the pairwise loss is minimized
- ``rank:ndcg``: Use LambdaMART to perform list-wise ranking where `Normalized Discounted Cumulative Gain (NDCG) <http://en.wikipedia.org/wiki/NDCG>`_ is maximized
- ``rank:map``: Use LambdaMART to perform list-wise ranking where `Mean Average Precision (MAP) <http://en.wikipedia.org/wiki/Mean_average_precision#Mean_average_precision>`_ is maximized
- ``reg:gamma``: gamma regression with log-link. Output is a mean of gamma distribution. It might be useful, e.g., for modeling insurance claims severity, or for any outcome that might be `gamma-distributed <https://en.wikipedia.org/wiki/Gamma_distribution#Applications>`_.
- ``reg:tweedie``: Tweedie regression with log-link. It might be useful, e.g., for modeling total loss in insurance, or for any outcome that might be `Tweedie-distributed <https://en.wikipedia.org/wiki/Tweedie_distribution#Applications>`_.
- ``reg:gamma``: gamma regression with log-link. Output is a mean of gamma distribution. It might be useful, e.g., for modeling insurance claims severity, or for any outcome that might be `gamma-distributed <https://en.wikipedia.org/wiki/Gamma_distribution#Occurrence_and_applications>`_.
- ``reg:tweedie``: Tweedie regression with log-link. It might be useful, e.g., for modeling total loss in insurance, or for any outcome that might be `Tweedie-distributed <https://en.wikipedia.org/wiki/Tweedie_distribution#Occurrence_and_applications>`_.

* ``base_score`` [default=0.5]

Expand All @@ -390,7 +390,7 @@ Specify the learning task and the corresponding learning objective. The objectiv
- ``error@t``: a different than 0.5 binary classification threshold value could be specified by providing a numerical value through 't'.
- ``merror``: Multiclass classification error rate. It is calculated as ``#(wrong cases)/#(all cases)``.
- ``mlogloss``: `Multiclass logloss <http://scikit-learn.org/stable/modules/generated/sklearn.metrics.log_loss.html>`_.
- ``auc``: `Receiver Operating Characteristic Area under the Curve <http://en.wikipedia.org/wiki/Receiver_operating_characteristic#Area_under_curve>`_.
- ``auc``: `Receiver Operating Characteristic Area under the Curve <https://en.wikipedia.org/wiki/Receiver_operating_characteristic#Area_under_the_curve>`_.
Available for classification and learning-to-rank tasks.

- When used with binary classification, the objective should be ``binary:logistic`` or similar functions that work on probability.
Expand Down
2 changes: 1 addition & 1 deletion doc/tutorials/kubernetes.rst
Expand Up @@ -11,7 +11,7 @@ In order to run a XGBoost job in a Kubernetes cluster, perform the following ste

1. Install XGBoost Operator on the Kubernetes cluster.

a. XGBoost Operator is designed to manage the scheduling and monitoring of XGBoost jobs. Follow `this installation guide <https://github.com/kubeflow/xgboost-operator#installing-xgboost-operator>`_ to install XGBoost Operator.
a. XGBoost Operator is designed to manage the scheduling and monitoring of XGBoost jobs. Follow `this installation guide <https://github.com/kubeflow/xgboost-operator#install-xgboost-operator>`_ to install XGBoost Operator.

2. Write application code that will be executed by the XGBoost Operator.

Expand Down
6 changes: 3 additions & 3 deletions doc/tutorials/saving_model.rst
Expand Up @@ -227,15 +227,15 @@ XGBoost has a function called ``dump_model`` in Booster object, which lets you t
the model in a readable format like ``text``, ``json`` or ``dot`` (graphviz). The primary
use case for it is for model interpretation or visualization, and is not supposed to be
loaded back to XGBoost. The JSON version has a `schema
<https://github.com/dmlc/xgboost/blob/master/doc/dump.schema>`_. See next section for
<https://github.com/dmlc/xgboost/blob/master/doc/dump.schema>`__. See next section for
more info.

***********
JSON Schema
***********

Another important feature of JSON format is a documented `Schema
<https://json-schema.org/>`_, based on which one can easily reuse the output model from
Another important feature of JSON format is a documented `schema
<https://json-schema.org/>`__, based on which one can easily reuse the output model from
XGBoost. Here is the initial draft of JSON schema for the output model (not
serialization, which will not be stable as noted above). It's subject to change due to
the beta status. For an example of parsing XGBoost tree model, see ``/demo/json-model``.
Expand Down
2 changes: 1 addition & 1 deletion python-package/xgboost/core.py
Expand Up @@ -1792,7 +1792,7 @@ def predict(
.. note::
See `Prediction
<https://xgboost.readthedocs.io/en/latest/tutorials/prediction.html>`_
<https://xgboost.readthedocs.io/en/latest/prediction.html>`_
for issues like thread safety and a summary of outputs from this function.
Parameters
Expand Down

0 comments on commit 50bc97e

Please sign in to comment.