Commit

Addressing NCCL issue with binary classification for distributed trai…
Nikhil Raverkar committed Mar 17, 2023
1 parent 6dcd442 commit 5982cee
Showing 1 changed file with 3 additions and 0 deletions.
3 changes: 3 additions & 0 deletions docker/1.7-1/final/Dockerfile.cpu
@@ -69,6 +69,9 @@ ENV SM_INPUT /opt/ml/input
ENV SM_INPUT_TRAINING_CONFIG_FILE $SM_INPUT/config/hyperparameters.json
ENV SM_INPUT_DATA_CONFIG_FILE $SM_INPUT/config/inputdataconfig.json
ENV SM_CHECKPOINT_CONFIG_FILE $SM_INPUT/config/checkpointconfig.json
# See: https://github.com/dmlc/xgboost/issues/7982#issuecomment-1379390906 https://github.com/dmlc/xgboost/pull/8257
ENV NCCL_SOCKET_IFNAME eth


# Set SageMaker serving environment variables
ENV SM_MODEL_DIR /opt/ml/model
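
For readers unfamiliar with the variable set above: NCCL treats `NCCL_SOCKET_IFNAME` as a list of interface-name prefixes, so the value `eth` matches interfaces such as `eth0` and `eth1`. The sketch below is a simplified Python model of that documented filter syntax (comma-separated prefixes, `^` to exclude, `=` for exact names). It is not NCCL's implementation; the function name `select_interfaces` and the sample interface names are hypothetical, and NCCL's special handling of loopback is not modeled.

```python
def select_interfaces(available, ifname_filter):
    """Simplified, illustrative model of NCCL_SOCKET_IFNAME filtering.

    Not NCCL's code: it only mirrors the documented syntax, where
    "eth" is a prefix match, "=eth0" is an exact match, and a leading
    "^" inverts the filter into an exclusion list.
    """
    spec = ifname_filter
    exclude = spec.startswith("^")
    if exclude:
        spec = spec[1:]
    exact = spec.startswith("=")
    if exact:
        spec = spec[1:]
    patterns = [p for p in spec.split(",") if p]

    def matches(name):
        if exact:
            return name in patterns
        return any(name.startswith(p) for p in patterns)

    # Keep matching names for an include filter, non-matching for an exclude filter.
    return [name for name in available if matches(name) != exclude]


# Hypothetical interface list, as might be seen inside a container.
interfaces = ["lo", "docker0", "eth0", "eth1"]
selected = select_interfaces(interfaces, "eth")
print(selected)
```

Under this model, the value `eth` from the Dockerfile keeps only the `eth*` interfaces and leaves out `lo` and `docker0`, which is presumably the intent of the change discussed in the linked xgboost issue.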