GH-39214: [Java] Support reproducible build #39215

Open
wants to merge 6 commits into
base: main

Conversation

jbonofre
Member

@jbonofre jbonofre commented Dec 13, 2023

Rationale for this change

This PR adds support for reproducible builds.

Are these changes tested?

I verified that the builds are reproducible for arrow-vector, arrow-memory, ... using mvn clean verify artifact:compare.

Are there any user-facing changes?

No

⚠️ GitHub issue #39214 has been automatically assigned in GitHub to PR creator.

Member

@lidavidm lidavidm left a comment

Thanks!

github-actions bot added the awaiting merge label and removed the awaiting review label on Dec 13, 2023
@jbonofre
Member Author

@lidavidm let me check why the GH Actions run is not happy 😃

@lidavidm
Member

@github-actions crossbow submit java

Revision: 502eb17

Submitted crossbow builds: ursacomputing/crossbow @ actions-b2f5994f24

Task | Status
java-jars | GitHub Actions
verify-rc-source-java-linux-almalinux-8-amd64 | GitHub Actions
verify-rc-source-java-linux-conda-latest-amd64 | GitHub Actions
verify-rc-source-java-linux-ubuntu-20.04-amd64 | GitHub Actions
verify-rc-source-java-linux-ubuntu-22.04-amd64 | GitHub Actions
verify-rc-source-java-macos-amd64 | GitHub Actions

java/pom.xml (outdated)
Comment on lines 26 to 28
<properties>
  <project.build.outputTimestamp>2023-12-13T00:00:00Z</project.build.outputTimestamp>
</properties>
Contributor

Can you provide some commands or steps for validating a reproducible build, such as in the Arrow Format module?

Contributor

When I tried these commands, it finished ok with or without any changes:

$ mvn clean install
$ mvn clean verify artifact:compare
...
[INFO] Saved info on build to /Users/dsusanibar/fork/arrow/java/format/target/arrow-format-15.0.0-SNAPSHOT.buildinfo
[INFO] Checking against reference build from central...
[INFO] Reference buildinfo file not found: it will be generated from downloaded reference artifacts
[INFO] Reference build java.version: 11 (from MANIFEST.MF Build-Jdk-Spec)
[INFO] Reference build os.name: Unix (from pom.properties newline)
[INFO] Minimal buildinfo generated from downloaded artifacts: /Users/dsusanibar/fork/arrow/java/format/target/reference/arrow-format-15.0.0-SNAPSHOT.buildinfo
[INFO] Reproducible Build output summary: 5 files ok
[INFO] Reproducible Build output comparison saved to /Users/dsusanibar/fork/arrow/java/format/target/arrow-format-15.0.0-SNAPSHOT.buildcompare
...
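
For reference, the two commands above can be wrapped in a small script that prints only the summary line. This is a minimal sketch, not part of the PR: the function names extract_summary and check_reproducible and the compare.log file name are made up for illustration, while the mvn invocations and the summary text are the ones quoted above.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and compare.log are hypothetical.

# Read a Maven log on stdin and print the reproducible-build summary line(s),
# with the leading "[INFO] " prefix removed.
extract_summary() {
  grep -F 'Reproducible Build output summary:' | sed 's/^\[INFO\] //'
}

# Run the two commands from the comment above in the current module directory.
# The full log is kept in compare.log; only the summary is printed.
check_reproducible() {
  mvn clean install || return 1
  mvn clean verify artifact:compare | tee compare.log | extract_summary
}
```

Calling check_reproducible from a module directory such as java/format would then print a single summary line instead of the full log.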

Member Author

You are right about the check: running mvn clean install and then mvn clean verify artifact:compare performs the comparison.
This PR ensures that the whole build is reproducible; it does not mean that no module was reproducible before (AFAIR, flight and algorithm were the ones that were not).
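
As a rough illustration of what "reproducible" means here: with project.build.outputTimestamp pinned, two independent builds of the same sources are expected to produce byte-identical archives. The sketch below only shows that comparison. The helper name same_bytes and the file names in the usage comment are hypothetical, and it does not replace artifact:compare.

```shell
#!/usr/bin/env bash
# Sketch only: same_bytes is a hypothetical helper, not part of the PR.

# Succeeds only when the two files are byte-for-byte identical.
same_bytes() {
  cmp -s "$1" "$2"
}

# Hypothetical usage: first.jar and second.jar are copies of the same module's
# JAR saved from two separate builds of the same commit.
# same_bytes first.jar second.jar && echo "identical" || echo "differs"
```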

Member Author

You can also check with artifact:check-buildplan.

Member

What we might want is to add this to CI somewhere (or possibly the release verification process) so we can make sure it stays working.

Member Author

@lidavidm good idea. Let me look into adding a workflow in GH Actions.

github-actions bot added the awaiting changes label and removed the awaiting merge label on Dec 14, 2023
github-actions bot added the awaiting change review label and removed the awaiting changes label on Dec 15, 2023
@jbonofre
Member Author

FYI, I'm resuming work on this PR:

  • checking the build failure on CI
  • checking to add a GH Action workflow to verify

@lidavidm
Member

> checking to add a GH Action workflow to verify

It might be best to add it to an existing workflow, or again, adding it to the release verification script even (what do you think @assignUser?)

@jbonofre
Member Author

> checking to add a GH Action workflow to verify
>
> It might be best to add it to an existing workflow, or again, adding it to the release verification script even (what do you think @assignUser?)

@lidavidm that makes sense, I like this. Thanks!

@assignUser
Member

How exactly does this work? Does it compare the current build against a reference build? How does that work for dev versions?

We do have binary verification scripts, so it would make sense to add it there; that would also run it automatically. dev/release/verify-release-candidate.sh with `TEST_JARS=1` is run by the JAR binary verification script (this downloads previously built nightly JARs, IIRC). cc @raulcd, who is a bit more familiar with the release scripts.

@lidavidm
Member

https://maven.apache.org/plugins/maven-artifact-plugin/compare-mojo.html

Compare current build output (from package) against reference either previously install-ed or downloaded from a remote repository: comparison results go to .buildcompare file.

So maybe we finally have a useful thing to do in the binary JAR verification step.
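
If the comparison ends up in the JAR verification step, the script would need a pass/fail signal from the .buildcompare file mentioned above. The sketch below assumes that file contains key=value lines including a ko=<count> entry for mismatching files. That layout is an assumption that should be checked against the plugin's actual output, and buildcompare_ok is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch only. ASSUMPTION: the .buildcompare file has a line of the form
# "ko=<number of mismatching files>". Verify this against real output.

# Succeeds when the file given as $1 reports ko=0.
buildcompare_ok() {
  ko_count=$(sed -n 's/^ko=//p' "$1" | head -n 1)
  [ "$ko_count" = "0" ]
}

# Hypothetical usage inside a verification script:
# buildcompare_ok target/arrow-format-15.0.0-SNAPSHOT.buildcompare || exit 1
```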

@jbonofre
Member Author

@lidavidm yup, that's what I plan to do (and what I used while working on this PR 😄).

First I will fix the failing checks (AMD64 Conda Integration Test, etc).

@jbonofre
Member Author

Error:  Failed to execute goal org.apache.maven.plugins:maven-enforcer-plugin:3.4.1:enforce (enforce-maven-version) on project arrow-bom: 
Error:  Rule 0: org.apache.maven.enforcer.rules.version.RequireMavenVersion failed with message:
Error:  Detected Maven Version: 3.5.0 is not in the allowed range [3.6.3,).
Error:  -> [Help 1]

That's due to the Apache POM update. I will propose a fix as part of this PR.
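
To make the failure concrete: the range [3.6.3,) means "3.6.3 or newer", and the CI image reported 3.5.0. The sketch below approximates that check for plain dotted versions. It is a simplification of Maven's version-range semantics, version_at_least is a hypothetical helper name, and sort -V is a GNU coreutils extension that may not be available everywhere.

```shell
#!/usr/bin/env bash
# Sketch only: approximates the enforcer's minimum-version check for
# plain dotted version numbers.

# $1 = detected version, $2 = required minimum; succeeds when $1 >= $2.
version_at_least() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# The case from the error above: 3.5.0 does not satisfy a 3.6.3 minimum.
version_at_least 3.5.0 3.6.3 || echo "Maven 3.5.0 is older than 3.6.3"
```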

@lidavidm
Member

See #39279

@lidavidm
Member

@github-actions crossbow submit java

Only contributors can submit requests to this bot. Please ask someone from the community for help with getting the first commit in.
The Archery job run can be found at: https://github.com/apache/arrow/actions/runs/7584811545

@lidavidm
Member

Hmm well Archery doesn't like me but it seems CI still has a few issues even after a re-run.

@jbonofre
Member Author

@lidavidm yup, to be honest, I don't know why it's failing 😄 I will probably try on the main branch to compare.
I will do another pass; maybe I missed something 😄

@kou
Member

kou commented Jan 19, 2024

@assignUser Could you take a look at #39215 (comment) and #39215 (comment) ?

#39610 is related.

@assignUser
Member

@kou hm, weird: ASF membership is public for David, so the bot should pick it up... If this keeps happening, we may need to add a non-GH-API-based method to identify committers (e.g. a committer list; IIRC we have a YAML one around somewhere, right?)

@lidavidm
Member

I fixed my membership after that. You may want to update the bot's error message to tell people what to do; I only knew because I happened to see someone else ask the same question in a private channel.

@jbonofre
Member Author

I'm rebasing and doing new tests on this one.

jbonofre force-pushed the GH-39214 branch 3 times, most recently from 35de715 to 4f4343f, on January 30, 2024 16:27
@lidavidm
Member

@github-actions crossbow submit java

Revision: e6e5d16

Submitted crossbow builds: ursacomputing/crossbow @ actions-02510129a8

Task | Status
java-jars | GitHub Actions
verify-rc-source-java-linux-almalinux-8-amd64 | GitHub Actions
verify-rc-source-java-linux-conda-latest-amd64 | GitHub Actions
verify-rc-source-java-linux-ubuntu-20.04-amd64 | GitHub Actions
verify-rc-source-java-linux-ubuntu-22.04-amd64 | GitHub Actions
verify-rc-source-java-macos-amd64 | GitHub Actions

@jbonofre
Member Author

@lidavidm as I'm back on Arrow, I'm resuming several PRs and other work I have in progress 😄 Thanks!

Successfully merging this pull request may close these issues.

[Java] Provide reproducible builds
7 participants