Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix ArtifactsReportTask caching for included builds #1168

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

seregamorph
Copy link
Contributor

@seregamorph seregamorph commented Apr 16, 2024

Problem statement

We've observed an issue with our isolated (included) builds: the projectHealth was producing illegal advices to remove a dependency - and these advices were breaking the compilation. Worth to mention that problem only happened with cache enabled.

Root cause analysis

Finally, after debugging it was found that the root problem is in the ArtifactsReportTask and it's related to caching. With equal (from the Gradle perspective) task inputs it was reusing inappropriate cached task output from another isolated builds.
The detailed analysis highlighted key differences in generated build/reports/dependency-analysis/{sourceSet}/intermediates/artifacts.json file (taken from cache vs no build cache):

  • coordinates.identifier which can be different depending on current root build project (prefix)
  • coordinates.resolvedProject.gradleVariantIdentification.capabilities - same - different prefix, depends on current root build project

Solution

All possible ways to address the problem are to extend the task inputs - the input should be reproducible, but different for different isolated builds. Options considered:

  • include root project name as a task input (does not cover all cases)
  • include root project full path as a task input (does not cover all cases)
  • include relative path from current project to root (like "..") (explained below) (does not cover all cases)
  • include variant capabilities as task input as they reflect the difference of isolated builds (explained below)

In this PR you can see the root absolute project path as input and there are few comments why this solution seems to be valid:

  • the generated artifacts.json contains absolute file paths, so it seems to be reasonable to make inputs absolute-path sensitive e.g. to distinguish CI/CD and local builds where such paths are usually different
  • the ArtifactsReportTask already has an absolute-path sensitive input:
@PathSensitive(PathSensitivity.ABSOLUTE)
@InputFiles
fun getClasspathArtifactFiles(): FileCollection = artifacts.artifactFiles

Additional notes

This seems to be related to #916 authored by @jjohannes where there was an effort to address the problem. Unfortunately, the task input parameter buildPath always has value ":", so it's equal input for Gradle - and this fix was not enough.

Testing

I'm not quite sure how to write a proper functional test scenario for this change, advices are welcome.
This change is validated on our (pretty big) project.

Alternative solution: relative path to root

Using absolute root path as input is motivated, but still there is an opportunity to declare relative path from current project of task to root project (note: not vice versa). It means, that in most cases such Input parameter will have value like ".." or "../..". But for included builds it will look like "../../included/composite-project"

Alternative solution: Set Input

I'm not quite sure, but perhaps the capabilities as input parameter can be an alternative solution:

@Input
fun getClasspathCapabilities(): Set<Capability> = artifacts.asSequence()
    .filterNonGradle()
    .flatMap { it.variant.capabilities }
    .toSet()

Elements of this collection are different depending on current root project, the difference is in the field group (has specific prefix).
This was an initial implementation that worked pretty well. But I'd like to not to complicate things too early.

@@ -29,6 +29,10 @@ abstract class ArtifactsReportTask : DefaultTask() {
description = "Produces a report that lists all direct and transitive dependencies, along with their artifacts"
}

/** To properly support the case when analyzed project module is a part of included build */
@get:Input
val rootDir: String = project.rootDir.absolutePath
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the PR. I think that using the absolute file path for an input is probably overly aggressive. It will mean this task out would never come from the remote build cache on different machines.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the description I've mentioned, that this task already:

  • has getClasspathArtifactFiles Input with PathSensitivity.ABSOLUTE - isn't it the same? For each user it will be different as home dirs are not equal
  • declares generated output artifacts.json with absolute File path:
...
  {
    "coordinates": {
      "type": "module",
      "identifier": "com.google.code.findbugs:jsr305",
      "resolvedVersion": "3.0.2",
      "gradleVariantIdentification": {
        "capabilities": [
          "com.google.code.findbugs:jsr305"
        ],
        "attributes": {}
      }
    },
    "file": "/Users/username/.gradle/caches/modules-2/files-2.1/com.google.code.findbugs/jsr305/3.0.2/25ea2e8b0c338a877313bd4672d3fe056ea78f0d/jsr305-3.0.2.jar"
  },
...

So it seems that this new parameter will not make it worse. WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated the description and suggested one more option with relative path from current project to root.

Using absolute root path as input is motivated for reasons explained above, but still there is an opportunity to declare relative path from current project of task to root project (note: not vice versa). It means, that in most cases such Input parameter will have value like ".." or "../..". But for included builds it will look like "../../included/composite-project".

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're not wrong that it won't be worse, but nevertheless I would like to improve the situation. I have it on my agenda to inspect usages of this task more closely and see if we need to output the absolute path to the file. I'd like to improve the task's cacheability.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is good, I also think this task has potential for caching optimization.
I'd like to proceed with this PR. WDYT about changing absolute path to relative path here from current project to root project?
The comment above explains possible values. I'd like to add one more note here: in some special cases it's not possible to calculate relative path, e.g. for Windows if paths are on different logical disks. But in this case the absolute path can be used as a failover.
Should I proceed with this approach?

@autonomousapps autonomousapps added the bug Something isn't working label Apr 16, 2024
@autonomousapps autonomousapps added this to the next milestone Apr 16, 2024
@seregamorph
Copy link
Contributor Author

seregamorph commented Apr 26, 2024

Some updates from experience with this issue.
The fix with adding new Input with absolute path was not enough. Eventually similar problem happened on moving one of the modules in one of our included builds. So the set of input artifacts were the same (hence cached task result was obtained), but the actual project name became different. In the generated artifacts.json it was represented also as different coordindates.identifier and different gradleVariantIdentification.capabilities.

To fix the problem the mentioned capabilities input was added to task inputs:

@Input
fun getClasspathCapabilities(): Set<Capability> = artifacts.asSequence()
    .filterNonGradle()
    .flatMap { it.variant.capabilities }
    .toSet()

So far works good for our projects. I hope you will have time soon to dig deeper this issue.

Copy link
Owner

@autonomousapps autonomousapps left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you contribute a test fixture demonstrating the issue?

@seregamorph
Copy link
Contributor Author

seregamorph commented May 27, 2024

Sorry for the delay, I've created simple project reproducing the issue https://github.com/seregamorph/demo-gradle-included-issue.git

I'd like to focus attention:

  • the build has to be binary reproducible (files packed to jar have constant timestamp), it's done via ReproducibleBuildProjectPlugin
  • for the sake of own security you can run gradle command instead of ./gradlew - the result is the same
  • the remote cache is not required here, the problem is visible with local cache

Please follow the instruction. Run the build for isolated-build-1:

cd isolated-build-1
./gradlew clean artifactsReportMain -i

Check output:

> Task :module-impl:artifactsReportMain
Build cache key for task ':module-impl:artifactsReportMain' is 3c9a8a4ff40ea0212d6eefd2be101534
Task ':module-impl:artifactsReportMain' is not up-to-date because:
  No history is available.
Stored cache entry for task ':module-impl:artifactsReportMain' with cache key 3c9a8a4ff40ea0212d6eefd2be101534
^^^^ Pay attention to build cache key ^^^^

Then run for isolated-build-2

cd ../isolated-build-2
./gradlew clean artifactsReportMain -i

and check output:

> Task :module-impl:artifactsReportMain FROM-CACHE
Build cache key for task ':module-impl:artifactsReportMain' is 3c9a8a4ff40ea0212d6eefd2be101534
Task ':module-impl:artifactsReportMain' is not up-to-date because:
  No history is available.
Loaded cache entry for task ':module-impl:artifactsReportMain' with cache key 3c9a8a4ff40ea0212d6eefd2be101534

^^^^ Same build cache key ^^^^

Now we can check the task output in the console:

cat ../module-impl/build/reports/dependency-analysis/main/intermediates/artifacts.json | jq
[
  {
    "coordinates": {
      "type": "included_build",
      "identifier": "isolated-build-1:module-contracts",
      "resolvedProject": {
        "identifier": ":module-contracts",
        "gradleVariantIdentification": {
          "capabilities": [
            "isolated-build-1:module-contracts"
          ],
          "attributes": {}
        },
        "buildPath": ":"
      },
      "gradleVariantIdentification": {
        "capabilities": [
          "isolated-build-1:module-contracts"
        ],
        "attributes": {}
      }
    },
    "file": "/Users/username/Projects/zzz/tmp11/demo-gradle-included-issue/module-contracts/build/libs/module-contracts.jar"
  }
]

As you can see, the dependency coordinates is "identifier": "isolated-build-1:module-contracts" which is correct for first build, but not second.

@seregamorph seregamorph force-pushed the included-builds-ArtifactsReportTask branch from 28824db to 36ef06a Compare May 28, 2024 08:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants