Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[R-package] CRAN error: r-devel-linux-x86_64-debian-clang #5502

Closed
Tracked by #5525
jameslamb opened this issue Sep 24, 2022 · 20 comments
Closed
Tracked by #5525

[R-package] CRAN error: r-devel-linux-x86_64-debian-clang #5502

jameslamb opened this issue Sep 24, 2022 · 20 comments
Assignees

Comments

@jameslamb
Copy link
Collaborator

Description

CRAN checks for {lightgbm} are showing an ERROR for the r-devel-linux-x86_64-debian-clang check flavor.

image

https://cran.r-project.org/web/checks/check_results_lightgbm.html

Reproducible example

I haven't tried to reproduce this yet outside of the CRAN system.

According to CRAN's logs (link), exactly one test is failing.

     == Failed ======================================================================
     -- 1. Failure (test_basic.R:1276:5): lgb.train() supports non-ASCII feature name
     dumped_model[["feature_names"]] not identical to iconv(feature_names, to = "UTF-8").
     4/4 mismatches
     x[1]: "F_é\u009b¶"
     y[1]: "F_<U+96F6>"
    
     x[2]: "F_äž\u0080"
     y[2]: "F_<U+4E00>"
    
     x[3]: "F_äº\u008c"
     y[3]: "F_<U+4E8C>"
    
     x[4]: "F_äž\u0089"
     y[4]: "F_<U+4E09>"
    
     == DONE ========================================================================
     Error: Test failures
     Execution halted

Additional Comments

This project has a CI job (#4164) that is intended to exactly replicate CRAN's r-devel-linux-x86_64-debian-clang check. That job has been succeeding. I'm not sure yet what the difference is between that job and CRAN's setup.

@jameslamb
Copy link
Collaborator Author

@shiyu1994 can you please check your email and let us know here if CRAN has contacted you?

The last time something similar happened (#4923), we were given just a few days to submit a patched version (#4930) before CRAN archived the package.

If we need a v3.3.3 release with a patch, it's important that we start that work as soon as possible, to avoid disruptions to users.

@shiyu1994
Copy link
Collaborator

@jameslamb Thanks for the reminder! Just searched my email inbox and found this
截屏2022-10-08 00 58 09

For convenience, I'll paste the link here
https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcran.r-project.org%2Fweb%2Fchecks%2Fcheck_results_lightgbm.html&amp;data=05%7C01%7Cyushi2%40microsoft.com%7Cdf142fa6d9534ae2055a08daa0b88902%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637999012089128423%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&amp;sdata=uwrg7AukCXHgbUFoJ5GXCeGk%2BBan6wsSzKyWuTYbbe4%3D&amp;reserved=0

Seems that the issue is due to the test case for non-ASCII characters in feature names.

  -- 1. Failure (test_basic.R:1276:5): lgb.train() supports non-ASCII feature name
  dumped_model[["feature_names"]] not identical to iconv(feature_names, to = "UTF-8").
  4/4 mismatches
  x[1]: "F_é\u009b¶"
  y[1]: "F_<U+96F6>"
  
  x[2]: "F_äž\u0080"
  y[2]: "F_<U+4E00>"
  
  x[3]: "F_äº\u008c"
  y[3]: "F_<U+4E8C>"
  
  x[4]: "F_äž\u0089"
  y[4]: "F_<U+4E09>"

@shiyu1994
Copy link
Collaborator

Sorry for ignoring the email. Will pay more attention to such emails later on.

@jameslamb
Copy link
Collaborator Author

AH! @shiyu1994 thank you. This is a VERY serious problem.

It means we now only have 3 more days to upload a new version of the R package to CRAN, or {lightgbm} will be archived on CRAN, which would mean:

  • users will not be able to install it with install.packages()
  • the maintainers of the other packages depending on {lightgbm} will receive emails stating that their packages will be archived as well, which might lead them to drop their support for {lightgbm}

The current master branch contains many breaking changes intended for v4.0.0 (#5153), so I don't think it would be appropriate to release it.

So I think we should do a special, patch-only v3.3.3 release containing only the changes necessary to satisfy CRAN, similar to what we did the last time this happened (#4930).

I'm going to do the following:

  • create a new branch, release_v3.3.3, branched off of the v3.3.2 tag
  • push two fixes to it:
  • test the R package built from it on additional platforms using R Hub and the win-builder service

Then I think we should try to release from that branch.

If we run into any issues (e.g. maybe our CI jobs don't work any more with such an old version of the code), I think we can manually create a v3.3.3 release and ONLY release the R package to CRAN manually + not move the stable tag. I'd rather have a slight inconsistency in the project's tags and releases than have R users negatively impacted by the package being removed from CRAN.

Usually I'd rely heavily on @StrikerRUS 's support for this, but I haven't seen him here for a few weeks so I don't think we should wait for him.

cc @jmoralez @guolinke @btrotta so you're aware too

@shiyu1994
Copy link
Collaborator

@jameslamb Thanks. If you need any help during the release, please feel free to ping me.

@jameslamb
Copy link
Collaborator Author

ok thank you very much!

@jameslamb
Copy link
Collaborator Author

Actually @shiyu1994 ... could you review my proposed patch in #5503? I think the change to configure.win in that PR for inet_pton should probably also be included in this release to CRAN, since otherwise we might face issues with inet_pton when R eventually updates its compilers on Windows.

@jameslamb jameslamb self-assigned this Oct 7, 2022
@jameslamb
Copy link
Collaborator Author

I'm investigating this one right now.

@jameslamb
Copy link
Collaborator Author

jameslamb commented Oct 7, 2022

❗ I just noticed that on the most recent successful build of the CI job that is supposed to mimic this check (build link), we're getting the following environment:

* using R Under development (unstable) (2022-07-31 r82648)
* using platform: x86_64-pc-linux-gnu (64-bit)
* using session charset: ISO8859-15

That is 3 2 months older than the one being used by CRAN's checks (logs link) 😭

using R Under development (unstable) (2022-10-02 r82991)
using platform: x86_64-pc-linux-gnu (64-bit)
using session charset: ISO8859-15

It looks like the container image rhub/debian-clang-devel was last updated on August 1st, 2022.

Screen Shot 2022-10-07 at 1 38 02 PM

source: https://hub.docker.com/r/rhub/debian-clang-devel/tags

And I can see why.... the GitHub Actions job that is supposed to publish a new version of that image daily seems to have been broken since August 1st 😭

image

source: https://github.com/r-hub/rhub-linux-builders/actions/workflows/debian.yml

@jameslamb
Copy link
Collaborator Author

jameslamb commented Oct 7, 2022

Ok, so!

After applying a small change to the rhub/rhub-linux-builders images (r-hub/rhub-linux-builders#62), I'm happy to say that I was able to reproduce the same test failure as CRAN is reporting 🎉

I suspect that that means the root cause then is that between July 31st and September 24th, one of the following changed in a way that causes this test to fail:

  • something in R-devel changed
  • something else (e.g. the version of clang) in the setup for this check changed

I'll continue investigating.

@jameslamb
Copy link
Collaborator Author

I'm able to reproduce the failing test, but it's weird.... when I run the code of the test directly in an R REPL in the new rhub/debian-clang-devel image I've built, it succeeds!

But when I run something like the following in there

cd LightGBM/R-package/tests
Rscript testthat.R

That test on non-ASCII feature names fails 🤔 .

So maybe something about the {testthat} machinery is responsible? Also linking this relevant-looking issue with some good reproducible examples of similar issues on the same CRAN check flavor: r-lib/waldo#66.

@jameslamb
Copy link
Collaborator Author

It really seems to me like something {testthat} is doing is causing a behavior change in {lightgbm}.

Consider the following R script.

library(lightgbm)

dtrain <- lgb.Dataset(
  data = matrix(rnorm(400L), ncol =  4L)
  , label = rnorm(100L)
)
feature_names <- c("F_零", "F_一", "F_二", "F_三")
bst <- lgb.train(
  data = dtrain
  , nrounds = 5L
  , obj = "regression"
  , params = list(
    metric = "rmse"
    , verbose = -1L
  )
  , colnames = feature_names
)
dumped_model <- jsonlite::fromJSON(bst$dump_model())

all_equal <- all(
  dumped_model[["feature_names"]] == feature_names
)
stopifnot(all_equal)

testthat::test_that("empty test", {
  testthat::expect_true(TRUE)
})

The presence of a {testthat} test at the end shouldn't have any impact on whether or not the feature names in the dumped model are identical to the original ones, right?

Well...

Rscript --vanilla test-ascii.R

succeeds, like this:

Test passed

but using {testthat}...

Rscript --vanilla -e "testthat::test_file('test-ascii.R')"

fails!

-- Error (test-ascii.R:23:1): (code run outside of `test_that()`) --------------
Error in `eval(code, test_env)`: all_equal is not TRUE
Backtrace:
 1. base::stopifnot(all_equal)
      at test-ascii.R:23:0

[ FAIL 1 | WARN 0 | SKIP 0 | PASS 0 ]

I'm so confused 😫

@jameslamb
Copy link
Collaborator Author

I should have clarified....that difference only happens in the rhub/debian-clang-devel image I've built for this. Running on my Mac, for example, that script succeeds regardless of how I run the code.

@jameslamb
Copy link
Collaborator Author

Found some evidence that {testthat} DOES mess with the R session's locale 😬

https://github.com/r-lib/testthat/blob/df20dbb54f4aa125e22eb21e2fd5e55bc1112391/vignettes/third-edition.Rmd#L200

In the third edition, test_that() automatically calls local_reproducible_output() which automatically sets a number of options and environment variables to ensure output is as reproducible across systems.
This includes setting:

  • Sys.setLocale("LC_COLLATE" = "C") so that sorting a character vector returns the same order regardless of the system language.

I'm not sure how or if that's related to the behavior I'm seeing yet, but it does feel relevant.

@StrikerRUS
Copy link
Collaborator

Fixed via #5526 and #5525.

@jameslamb
Copy link
Collaborator Author

@StrikerRUS I know you have a lot of notifications to catch up on, want to be sure you see this: r-hub/rhub-linux-builders#62

^ until that PR is merged, the CI check here in LightGBM that tries to replicate this CRAN failure will be stuck on a version of r-devel from July 31st, 2022.

So you might want to subscribe to notifications there.

@StrikerRUS
Copy link
Collaborator

you have a lot of notifications to catch up o

350+, hahaha 😬

the CI check here in LightGBM that tries to replicate this CRAN failure will be stuck on a version of r-devel from July 31st, 2022.

Thanks for the reminder! Yeah, I saw that PR. Hope it will be merged.

Also, I saw you

manually test R package with new rhub/debian-clang-devel image

for the v3.3.3 release. Thank you!

@jameslamb
Copy link
Collaborator Author

r-hub/rhub-linux-builders#62 has been merged, and a new rhub/debian-clang-devel was successfully built! (build link)

I can see that that image contains a recent version of R-devel.

docker run \
    --rm \
    --entrypoint="" \
    -it rhub/debian-clang-devel \
    /opt/R-devel/bin/R --version

R Under development (unstable) (2022-11-04 r83277) -- "Unsuffered Consequences"

So the CI job in this repo using that image should again be replicating the CRAN check!

cc @hcho3 @trivialfis, might be helpful for you as welll

@trivialfis
Copy link

Thank you for the ping, that's really helpful as we looking to improve the tests for the R package.

@github-actions
Copy link

This issue has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 19, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

4 participants