[pyspark] fix empty data issue when constructing DMatrix #8245

Merged
trivialfis merged 8 commits into dmlc:master on Sep 20, 2022

Conversation

@wbo4958 wbo4958 commented Sep 13, 2022

  1. Fix the empty-data issue that occurs when the user specifies a validation column
  2. Fix the empty-partition issues

To fix #8221

# Construct the DMatrix even when there is no data: every worker must
# take part in the AllReduce during DMatrix construction, or training
# may hang forever.
dvalid = make(valid_data, kwargs) if has_validation_col else None
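The reasoning in the comment above can be illustrated with a toy simulation. This is only a sketch, not XGBoost code: `threading.Barrier` stands in for the AllReduce rendezvous, and `run_workers`, `skip_when_empty`, and `timeout` are hypothetical names introduced for illustration. It shows that when a worker with an empty partition skips the collective step, the remaining workers never complete it (here they time out instead of hanging forever).

```python
import threading


def run_workers(partitions, skip_when_empty, timeout=0.5):
    """Simulate one collective step; return True if every worker completed it.

    partitions: one list of rows per simulated worker (may be empty).
    skip_when_empty: if True, a worker with no rows skips the collective,
        which models the buggy behaviour described in the comment above.
    """
    n_workers = len(partitions)
    # Stand-in for an AllReduce: it only completes once all workers arrive.
    barrier = threading.Barrier(n_workers)
    completed = [False] * n_workers

    def worker(rank, rows):
        if skip_when_empty and not rows:
            # This worker returns early and never joins the collective.
            completed[rank] = True
            return
        try:
            # A real AllReduce would block indefinitely; the timeout here
            # only exists so the simulation can terminate.
            barrier.wait(timeout=timeout)
            completed[rank] = True
        except threading.BrokenBarrierError:
            completed[rank] = False

    threads = [
        threading.Thread(target=worker, args=(rank, rows))
        for rank, rows in enumerate(partitions)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return all(completed)


if __name__ == "__main__":
    data = [[1, 2], [], [3]]  # the second worker has an empty partition
    print("always participate:", run_workers(data, skip_when_empty=False))
    print("skip when empty:   ", run_workers(data, skip_when_empty=True))
```

In the simulation, letting the empty worker join the barrier is the analogue of constructing a DMatrix from empty data so that the worker still participates in the collective call.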

Good fix!

@trivialfis trivialfis left a comment

Thank you for the fix. Please resolve the CI errors.

hcho3 commented Sep 14, 2022

Re-running the CI, since we recently transitioned to another CI system.

wbo4958 commented Sep 15, 2022

[ERROR] Failed to execute goal org.apache.maven.plugins:maven-jar-plugin:3.0.2:jar (default-jar) on project xgboost4j-gpu_2.12: Execution default-jar of goal org.apache.maven.plugins:maven-jar-plugin:3.0.2:jar failed: Plugin org.apache.maven.plugins:maven-jar-plugin:3.0.2 or one of its dependencies could not be resolved: Could not transfer artifact org.codehaus.plexus:plexus-component-annotations:jar:1.6 from/to central (https://repo.maven.apache.org/maven2): Connection reset -> [Help 1]

It seems the connection was reset?

wbo4958 commented Sep 15, 2022

@hcho3 could you help re-trigger this build?

hcho3 commented Sep 15, 2022

Done

wbo4958 commented Sep 19, 2022

@hcho3 could you help trigger the CI?

@trivialfis

Started the CI.

@trivialfis trivialfis merged commit 520586f into dmlc:master Sep 20, 2022
@wbo4958 wbo4958 deleted the empty-dmatrix branch April 23, 2024 09:26
Development

Successfully merging this pull request may close these issues.

[pyspark] SparkXGBClassifier failed to train with early_stopping_rounds and validation_indicator_col