Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Maven build generates consistent output #1211

Merged
merged 1 commit into from May 10, 2023

Conversation

bdemers
Copy link
Contributor

@bdemers bdemers commented Mar 29, 2023

@dsyer
Copy link
Member

dsyer commented Apr 13, 2023

It's an interesting point, but I'm not sure we need a reproducible build for petclinic (and a hard-coded timestamp wouldn't be the best way to achieve it, IMO).

@dsyer dsyer closed this Apr 13, 2023
@hboutemy
Copy link

hboutemy commented Apr 20, 2023

I'm not sure we need a reproducible build for petclinic

Sure, you don't absolutely need it.
But Pet Clinic is about learning best practices, and Reproducible Builds is a best practice that developers need to learn

a hard-coded timestamp wouldn't be the best way to achieve it, IMO

I worked on it very hard, and I can tell you that surprisingly it is. When people will use Maven Build Cache extensively in the future, the impact will be visible.
From a logical point of view, some people prefer using the timestamp of the last Git commit, which looks more reasonable for reasons I perfectly understand:

  • on releases, it brings the same result,
  • that's on SNAPSHOTs that it gives a moving value even if nothing changed in the content = that's why it's not the best choice (again, performance with Build Cache will show it in the future)

see https://maven.apache.org/guides/mini/guide-reproducible-builds.html#faq for a summary on value update strategies: if you have a new idea, I'm very much interested

I should definitively submit a talk to Devoxx UK next year, and in fact every Devoxx, to spread the word: too late for this year, I'll be there only as attendee...

@dsyer
Copy link
Member

dsyer commented Apr 21, 2023

There are open issues linked to in the wiki (https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=74682318). Note: I can see that probably most are fixed (although I was initially confused by the notation in the table).

I do think reproducible builds are a good idea, but it seems to be a bit of an evolving standard for Maven, so I'm not convinced we should take the dependency.

Also I don't understand how a hard-coded property in pom.xml fixes anything - it's just going to be stale every time someone makes a change and forgets to update the timestamp.

Maven Build Cache

I don't think I know what that is, and it isn't referenced in any of the links you posted (as far as I could tell).

@bdemers
Copy link
Contributor Author

bdemers commented Apr 21, 2023

The Maven Build Cache is an extension that adds caching to Maven builds. This works similarly to how local and remote caching works with Gradle builds and is compatible with other remote build caches (e.g. ge.spring.io)

The "hard-coded" property is similar to how other maven properties work, for example, if you set a dependency version as a property, it's hardcoded until someone or some automated process updates them (e.g. Dependabot).
In the case of this property, it's usually done via the maven-release-plugin or versions-maven-plugin automatically

This timestamp is used as the date in the table of contents of archive files (jars). When this property is not set, jars with identical contents would produce a different checksum.

NOTE: This property could also be epoch seconds (instead of a ISO-8601 formatted date), which may cause less confusion or at least more likely to be overlooked.

I can also update this PR with a comment too, if you think that would help! Just let me know!

@dsyer
Copy link
Member

dsyer commented Apr 22, 2023

A version property can be hard-coded because it's meaning is clear. Hard-coding a timestamp that is supposed to represent the state of the source code makes no sense at all to me - it seems like little more than a hack - it will always be wrong or misleading. I didn't see any features in the versions plugin that would help to update it. Maybe I need to see a complete example where the property value is actually updated in a sensible way?

@hboutemy
Copy link

  • it works at large scale https://github.com/jvm-repo-rebuild/reproducible-central
  • of course, the property does not fix everything magically: there is code in plugins that use this value and adapt their behaviour, particularly to get reproducible zip/jar files (order and timestamp)
  • on updating the value: that's why some people prefer git commit time, but other tested and saw how m-release-p and versions-m-p do the update (see MRELEASE-1029 and versions PR 453 for implementation details if you really want to dig into it)

now, if you find a new strategy to get a reproducible zip, please share: I'm all ears open

@dsyer
Copy link
Member

dsyer commented Apr 23, 2023

I can see the behaviour you implemented in the versions plugin. Unfortunately it only seems to update the timestamp when you change the version. For snapshots that doesn't make sense (to me).

That means that if I used this feature it would have to be supported by hand-written scripts, which I kind of dislike.

Also, apparently, not all plugins know the convention about epoch timestamps (if you try it the build will fail).

@bdemers
Copy link
Contributor Author

bdemers commented May 1, 2023

We could also hardcode the initial Maven build timestamp to the same one Gradle uses for its reproducible zip/jar tasks:

1980 February 1st CET.

https://github.com/gradle/gradle/blob/a0fc74acf9b31ac4282d8e763d2c8b49225633cd/subprojects/core/src/main/java/org/gradle/api/internal/file/archive/ZipCopyAction.java#L41-L56

NOTE: In Gradle, this date is used when the archive task has the preserveFileTimestamps flag enabled.

@dsyer
Copy link
Member

dsyer commented May 1, 2023

A hard-coded fictional date is kind of nonsense though, right? It's using a hard-to-guess arbitrary value to say "don't put any real timestamps in here please"? Is that necessary - maybe it is - maybe the JAR file standard (or something we absolutely cannot control) just has to have a timestamp, so this fictional one is the only workaround that preserves reproducible builds. Is that what is going on here? Setting a timestamp to an arbitrary value in the distant past seems like it will only confuse people (e.g. spring-projects/spring-framework#29633), but maybe it is less confusing than an arbitrary value in the near past. I still don't know.

@bdemers
Copy link
Contributor Author

bdemers commented May 1, 2023

Yes, the table of contents of various archives formats (zip, jar, etc) require a timestamp. There are some complexities with date/times (see the javadoc on why Gradle picked that 1980 date).

IMHO, the initial date is arbitrary, though aligning it with the last tag/release is probably ideal (keeps things consistent with future updates and can be used as an explanation as to why a date is picked)

If folks are NOT using a maven plugin that automatically updates the project.build.outputTimestamp property, the result is essentially the technique that Gradle is using (an arbitrary hardcoded date). And they still get the benefits of reproducible builds (both from a security and build cache perspective).

@dsyer
Copy link
Member

dsyer commented May 9, 2023

IMHO, the initial date is arbitrary, though aligning it with the last tag/release is probably ideal

Agree (but the adjective "initial" is misleading). However, this isn't supported by any of the plugins? We'd have to implement it with shell scripts or something.

@bdemers
Copy link
Contributor Author

bdemers commented May 9, 2023

@dsyer I'm not sure what you mean by isn't supported by any of the plugins.

To update the project version and date you could run something like:

mvn versions:set -DgenerateBackupPoms=false -DnewVersion=<your-new-version-here>
# DgenerateBackupPoms is optional

NOTE: @dsyer, I hope I'm not coming across as pushy. I just want to make sure I understand your concerns; maybe there is something we can do on the Maven side to make things better (even if there is no interest in merging this PR)

@dsyer
Copy link
Member

dsyer commented May 10, 2023

IIRC that only works if the version changes. So there’s no way for snapshots to be up to date. Maybe that’s acceptable, but I’m not sure why it has to be if it is.

No pushiness detected. I sense it is important, so it’s worth making sure I understand. Thank you for your patience.

@dsyer dsyer reopened this May 10, 2023
@dsyer dsyer merged commit cdb2eb1 into spring-projects:main May 10, 2023
2 checks passed
@bdemers bdemers deleted the mvn-reproducible-time-stamp branch May 15, 2023 16:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants