Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Address all lintr warnings. #8012

Closed
trivialfis opened this issue Jun 20, 2022 · 6 comments · Fixed by #8613
Closed

Address all lintr warnings. #8012

trivialfis opened this issue Jun 20, 2022 · 6 comments · Fixed by #8613

Comments

@trivialfis
Copy link
Member

https://github.com/dmlc/xgboost/runs/6965009976?check_suite_focus=true

@jameslamb
Copy link
Contributor

👋 Could I work on this one?

I originally set up LightGBM's {lintr} stuff (microsoft/LightGBM#2437) and recently updated it for {lintr} v3.0.0 (microsoft/LightGBM#5294), so I'm pretty familiar it.

@trivialfis
Copy link
Member Author

Thank you for the offer! Actually, I would love to hear your opinion on #7906 and microsoft/LightGBM#4968 before spending a non-trivial amount of time on the R package. I started to learn more about R and find some of the issues in the existing codebase of various packages including xgb and lgb. I saw microsoft/LightGBM#4968 , which is a great improvement for R users I believe. But it's using R6 and might bring significant breaking change.

@trivialfis
Copy link
Member Author

Forgot to ping @jameslamb ;-)

@jameslamb
Copy link
Contributor

Sorry for the delay in responding!


before spending a non-trivial amount of time on the R package

I think getting {lintr} linting running in CI again is worth doing regardless of any future design work. In my experience, most of the fixes for {lintr} warnings are quick, easy, and don't impact the public API of the library. It's also great to have prior to doing new development because {lintr} gives you some free minimal coverage of code that doesn't yet have tests.


to hear your opinion on #7906 and microsoft/LightGBM#4968

Short answer

I agree with @RAMitchell (#7906 (comment)). XGBoost should take on this list of proposed changes in the form of new public APIs, and then consider a lengthy deprecation cycle.

Long answer

In LightGBM, we've been accepting PRs with some of the breaking changes proposed in microsoft/LightGBM#4968 (I asked its author to break them into smaller pieces, so we could do higher-quality reviews and merge things gradually). The only reason that work isn't complete is because of a severe lack of contributor/maintainer availability in LightGBM, which has meant not much time left for things other than fixing CI, improving packaging details, and supporting new versions of dependencies.

That lack of contributor/maintainer attention is also why we've been accepting those breaking changes directly instead of doing a deprecation cycle. It's been more than a year since LightGBM had a substantive release...in its current state, the project just can't sustain a high enough frequency of releases to do deprecations well 😫

In light of this, in LightGBM, I've proposed that we treat the two main entrypoints for training differently:

  • lgb.train() = low-level interface to LightGBM, focused on performance
    • this is the one that external package developers, e.g. {bonsai}, are encouraged to integrate with
  • lightgbm() = high-level interface to LightGBM, focused on compatibility with the rest of the R modeling ecosystem
    • i.e., intended to be the more idiomatic, "R-thonic" interface, similar to the sklearn interface in the Python package
    • this is more likely to get breaking changes than lgb.train()

See microsoft/LightGBM#5273 and microsoft/LightGBM#5203 (comment).

But I don't think {xgboost} should do the same thing, given that {xgboost} is so much more widely used (based on the list of reverse dependencies at https://cran.r-project.org/web/packages/xgboost/index.html).

Tags

I'm going to also tag in @simonpcouch, who has been working on an alternative interface to LightGBM that fits with other components in the "tidymodels" ecosystem: https://github.com/tidymodels/bonsai.

@trivialfis
Copy link
Member Author

Thank you for the input!

a severe lack of contributor/maintainer availability in LightGBM,

We are facing the same issue here. :-(

fits with other components in the "tidymodels" ecosystem

That's interesting. If we can get XGBoost to be part of that project we can start recommending people use that interface instead.

It's also great to have prior to doing new development because {lintr} gives you some free minimal coverage of code that doesn't yet have tests.

That's an excellent suggestion! Please help look into the issues when you get a chance and feel free to ping me if there's anything I can help. ;-)

@simonpcouch
Copy link

Thanks for looping me in, @jameslamb! No comments, but will be sure to keep an eye on this and related issues.

If we can get XGBoost to be part of that project we can start recommending people use that interface instead.

We do indeed maintain an interface to xgboost in tidymodels.🙂 Some docs here and here, and a worked example here.

@trivialfis trivialfis added this to 2.0 TODO in 2.0 Roadmap via automation Dec 17, 2022
@trivialfis trivialfis moved this from 2.0 TODO to 2.0 In Progress in 2.0 Roadmap Dec 17, 2022
jameslamb added a commit to jameslamb/xgboost that referenced this issue Dec 19, 2022
2.0 Roadmap automation moved this from 2.0 In Progress to 2.0 Done Dec 24, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
2.0 Roadmap
  
2.0 Done
Development

Successfully merging a pull request may close this issue.

3 participants