Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reproducible Builds not working when using modular jar #164

Closed
jorsol opened this issue Feb 24, 2021 · 18 comments · Fixed by #205
Closed

Reproducible Builds not working when using modular jar #164

jorsol opened this issue Feb 24, 2021 · 18 comments · Fixed by #205

Comments

@jorsol
Copy link
Contributor

jorsol commented Feb 24, 2021

I have a project that compiles a module-info.java using the <multiReleaseOutput> approach of maven-compiler-plugin as show here:

        <plugin>
          <groupId>org.apache.maven.plugins</groupId>
          <artifactId>maven-compiler-plugin</artifactId>
          <version>3.8.1</version>
          <executions>
            <execution>
              <id>compile-java9</id>
              <phase>compile</phase>
              <goals>
                <goal>compile</goal>
              </goals>
              <configuration>
                <fork>true</fork>
                <release>9</release>
                <compileSourceRoots>
                  <compileSourceRoot>${project.basedir}/src/main/java9</compileSourceRoot>
                </compileSourceRoots>
                <multiReleaseOutput>true</multiReleaseOutput>
              </configuration>
            </execution>
          </executions>
        </plugin>

The <project.build.outputTimestamp> property is set on the project and the jar is created correctly, all the files set the timestamp except for the /META-INF/versions/9/module-info.class file which causes that the build is no longer reproducible.

@jorsol
Copy link
Contributor Author

jorsol commented Feb 26, 2021

I have found where the issue lies, the JarToolModularJarArchiver uses the jar command as a postCreateArchive() step, which updates the module-info.class file changing the last modification date of that file.

I have tested to add the following to the last part of the postCreateArchive() method, where the moduleDescriptorsPath is a List with all the module descriptors found.

  // Fix Last Modified Date of the module descriptor.
  if (getLastModifiedDate() != null) {
    URI uri = URI.create("jar:" + getDestFile().toURI().toString());
    try (FileSystem zipfs = FileSystems.newFileSystem(uri,
        new java.util.HashMap<String, Object>()) ) {
      for (String path : moduleDescriptorsPath) {
        Path module = zipfs.getPath(path);
        long lastModDate = getLastModifiedDate().getTime();
        Files.setLastModifiedTime(module, FileTime.fromMillis(lastModDate));
      }
    }
  }

This more or less works but I think the precision of the last modified time set is not the same (for some reason it's not using UTC).

@hboutemy sorry if I'm pinging on you, but since the reproducible-builds is your area, without this, all the modular jars won't be reproducible never.

@jorsol jorsol changed the title Reproducible Builds not working when using multi-release jar Reproducible Builds not working when using modular jar Feb 26, 2021
@michael-o
Copy link
Member

There is an open issue about the jar tool breeaking this. Search JBS.

@jorsol
Copy link
Contributor Author

jorsol commented Feb 26, 2021

Ok, it might be a bug on the jar tool, but probably a workaround could be implemented here, or should we wait until a new release of Java fix this?

@michael-o
Copy link
Member

@michael-o
Copy link
Member

See my analysis here: https://issues.apache.org/jira/browse/MJAR-275
Worst bullshit ever by Oracle.

@michael-o
Copy link
Member

The root cause has been fixed in Java 18. Well, we now need a backport...

@jorsol
Copy link
Contributor Author

jorsol commented Feb 9, 2022

As this doesn't seem to be backported, I could provide a workaround for this issue that should work, WDYT?

IMHO depending on a specific Java release is less than ideal, not many users will use Java 18 just to compile (even if they can event target to Java 7) just to get reproducible builds.

@michael-o
Copy link
Member

@jorsol You can write to the appropriate mailing list for a backport request.

@hboutemy
Copy link
Member

or add a jar rewriting process after the jar tool execution:
https://github.com/codehaus-plexus/plexus-archiver/blob/master/src/main/java/org/codehaus/plexus/archiver/jar/JarToolModularJarArchiver.java#L99

notice that I did not test the update of Java 18, I don't get how the jar tool even in Java 18 will get a reproducible timestamp of the .class: IIUC, it will use the timestamp from the file system, but this timestamp is not reproducible

@jorsol
Copy link
Contributor Author

jorsol commented Feb 10, 2022

or add a jar rewriting process after the jar tool execution

I already have a working fix after the jar tool execution, that's why I'm asking if it's of interest.

I don't get how the jar tool even in Java 18 will get a reproducible timestamp of the .class

In previous versions of Java, the jar tool basically rewrites the module-info.class and in the rewrite process the last modified time of the jar file was lost, in Java 18 this process fetches the last modified time of the zip entry and then use that to set the last modified time of the zip entry that is overwritten.

@hboutemy
Copy link
Member

I think a temporary fix will be useful: if you can share yours, that would be awesome
and I suppose that we can check that if Java 18 is currently used, no fix is necessary

on the change, ok, if the Java 18 logic is "keep the existing module-info.class timestamp", we'll get reproducible result: did you check that we get expected result with JDK 18? Notice: now I remember that last Logback was done with JDK 18 https://github.com/jvm-repo-rebuild/reproducible-central/blob/master/wip/logback-1.3.0-alpha13.buildspec , which was a pain for me because there is no Maven Dockerhub image with JDK 18 yet, but I imagine that Ceki wanted to get that reproducible output by using this EA JDK 18. I suppose I'll test with that release :)

thanks for the feedback @jorsol

@jorsol
Copy link
Contributor Author

jorsol commented Feb 11, 2022

Thanks, @hboutemy, I could provide my fix but it actually depends on this PR #199, so if you or any of the maintainers don't mind reviewing and merging it would be nice.

@jorsol
Copy link
Contributor Author

jorsol commented Feb 18, 2022

I just fall into a rabbit hole trying to provide a fix, after many tests and research I found other issues:

The bad:

  • From my findings, JDK 18 does fix the module-info.class timestamp, BUT if the Main-Class attribute is set, then the MANIFEST.MF is updated without preserving the timestamp, so essentially the fix applied in the JDK 18 only partially fix the problem.
  • My original idea was to use the Zip File System Provider to update only the specific entries, BUT this uses Files.setLastModifiedTime and this sets the extended timestamp attributes, which is NOT great for reproducible builds.
  • In the future, on Java 8 (once this project uses Java 8 as minimum #200) the ZipEntry could use setLastModifiedTime(FileTime), but again, it has the same problem as it sets the extended timestamp attributes (see the previous item). This can be ok for normal Zip files but for jars, not so much.

The good:

  • There are some efforts in the OpenJDK project to allow reproducible builds (or at least set the timestamp), there is a new option in the JAR tool --date (bad name but whatever), this sets the timestamps in every entry of the zip file, that option is already integrated into Java 19-ea, and will be backported to Java 18, but the best part is that it's also being backported into the next Java 17.0.3 🎉
  • My (new) fix includes a "detection" of the --date option and uses that if available.
  • For every version that doesn't have the --date option, there is a fallback mechanism to fix the time so we can have reproducible builds way down to Java 9 (Java 8 and 7 should not be affected by modular jars).

The ugly:

  • The fallback mechanism will post-process the time in the zip entries, which essentially means that it needs to create a new copy of the jar file, this adds a few seconds to the jar generation depending on the size, BUT the good part is that the output will be practically the same as if the --date option was available, meaning that a jar created with Java 17.0.3 should be identical to a jar created with Java 17.0.2.

I think the post-processing part is not a bad trade-off to allow reproducible builds.

@michael-o
Copy link
Member

What a PITA instead of Oracle monkeys fixing this the right way and back porting down to 8. Joke.

@jorsol
Copy link
Contributor Author

jorsol commented Feb 21, 2022

I will open the PR as soon as the FileTime API is merged.

@jorsol
Copy link
Contributor Author

jorsol commented Mar 14, 2022

The PR #205 fixes this issue if someone want to take a look 👀 .

@hboutemy
Copy link
Member

I just merged #199 : this is the update that I was not confident on, but everybody looks ok
I'll have a look back at this PR later this week

thanks for you patience and efforts

@jorsol
Copy link
Contributor Author

jorsol commented Mar 15, 2022

I just merged #199 : this is the update that I was not confident on, but everybody looks ok

Thanks @hboutemy, that update was only changing the API from the old Date to FileTime, it looks large but all the functionality should be preserved, is backward compatible, and it can be used as-is with the maven-jar-plugin without changes (tested by using the local SNAPSHOT dependency of plexus-archiver in the jar-plugin).

But please, if you found something that is broken or something that needs clarification ping me to check it.

Now, back to the the real fix of the reproducible build with modular jars, the PR #205 is smaller in files touched, but is actually more invasive than the previous one, this is the one that probably you should look out more closely (not saying that is should broke anything).

This fix has many hours of research and it will use the new --date jar option if available, I'm confident that it should be ready, I think I added enough comments to explain it, but if something is not clear, please let me know.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants