Add ability to see the first location a package was added #1724

Draft
wants to merge 1 commit into base: main

wagoodman (Contributor) commented Apr 7, 2023

Adds a squashed-with-all-layers resolver, which acts like the squashed resolver but additionally returns the instances of each path found in all other layers. Combined with further changes to denote the layer index directly in locations, this makes it possible to determine the first location where a package was introduced.
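
The resolver behavior described above can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: `Location`, `Layer`, and `resolveAllLayers` are hypothetical names, and the layer model (a set of added paths plus a set of whiteouts) is a simplifying assumption.

```go
package main

import "fmt"

// Location is a hypothetical stand-in for a resolved file location,
// not syft's actual type.
type Location struct {
	Path       string
	LayerIndex int
}

// Layer is a simplified, hypothetical model of an image layer: the paths it
// adds (or overwrites) and the paths it deletes (whiteouts).
type Layer struct {
	Added   map[string]bool
	Removed map[string]bool
}

// resolveAllLayers returns one Location per layer that contains path, in
// layer order, but only when path is still visible in the squashed view.
func resolveAllLayers(layers []Layer, path string) []Location {
	visible := false
	var found []Location
	for i, layer := range layers {
		if layer.Removed[path] {
			visible = false
		}
		if layer.Added[path] {
			visible = true
			found = append(found, Location{Path: path, LayerIndex: i})
		}
	}
	if !visible {
		return nil // not in the squashed view, so nothing is reported
	}
	return found
}

func main() {
	// Illustrative layers only; /tmp/cache is a made-up path.
	layers := []Layer{
		{Added: map[string]bool{"/lib/apk/db/installed": true}},
		{Added: map[string]bool{"/lib/apk/db/installed": true, "/tmp/cache": true}},
		{Added: map[string]bool{"/lib/apk/db/installed": true}, Removed: map[string]bool{"/tmp/cache": true}},
	}
	fmt.Println(resolveAllLayers(layers, "/lib/apk/db/installed"))
	fmt.Println(resolveAllLayers(layers, "/tmp/cache"))
}
```

In this sketch a path deleted by a later layer yields no locations, while a path that survives yields one location per layer that wrote it.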

For example:

# Dockerfile for test:latest
FROM alpine:latest
RUN apk add wget
RUN apk add curl

When running syft...

$ syft -o json -s squashed-with-all-layers test:latest  -vvv
...
[0000] DEBUG discovered 58 packages cataloger=apkdb-cataloger
[0000] DEBUG found path duplicate of /lib/ld-musl-x86_64.so.1
[0000] DEBUG found path duplicate of /usr/share/apk/keys/alpine-devel@lists.alpinelinux.org-58199dcc.rsa.pub
[0000] DEBUG found path duplicate of /usr/share/apk/keys/alpine-devel@lists.alpinelinux.org-616ae350.rsa.pub
[0000] DEBUG found path duplicate of /usr/share/apk/keys/alpine-devel@lists.alpinelinux.org-524d27bb.rsa.pub
[0000] DEBUG found path duplicate of /usr/share/apk/keys/alpine-devel@lists.alpinelinux.org-616a9724.rsa.pub
...
[0000] TRACE merging similar packages id=291d1267b40d636f purl=pkg:apk/alpine/alpine-baselayout-data@3.4.0-r0?arch=x86_64&upstream=alpine-baselayout&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=d9700f02cf26e8b8 purl=pkg:apk/alpine/musl@1.2.3-r4?arch=x86_64&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=623d53216342d45e purl=pkg:apk/alpine/busybox@1.35.0-r29?arch=x86_64&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=256fc96b4a8c4da8 purl=pkg:apk/alpine/busybox-binsh@1.35.0-r29?arch=x86_64&upstream=busybox&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=92b19c7750fb559d purl=pkg:apk/alpine/alpine-baselayout@3.4.0-r0?arch=x86_64&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=2b5e23d349b556cf purl=pkg:apk/alpine/alpine-keys@2.4-r1?arch=x86_64&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=b805d823ae624f04 purl=pkg:apk/alpine/ca-certificates-bundle@20220614-r4?arch=x86_64&upstream=ca-certificates&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=d3084c788891fb28 purl=pkg:apk/alpine/libcrypto3@3.0.8-r3?arch=x86_64&upstream=openssl&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=2a95f0251fba7a33 purl=pkg:apk/alpine/libssl3@3.0.8-r3?arch=x86_64&upstream=openssl&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=b15247aafcd4a647 purl=pkg:apk/alpine/ssl_client@1.35.0-r29?arch=x86_64&upstream=busybox&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=94014313cfcd2b71 purl=pkg:apk/alpine/zlib@1.2.13-r0?arch=x86_64&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=e5f757b0df1f62bc purl=pkg:apk/alpine/apk-tools@2.12.10-r1?arch=x86_64&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=e903138d19e85b80 purl=pkg:apk/alpine/scanelf@1.3.5-r1?arch=x86_64&upstream=pax-utils&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=f71ecf5267e6c37b purl=pkg:apk/alpine/musl-utils@1.2.3-r4?arch=x86_64&upstream=musl&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=8126b232e2d3c608 purl=pkg:apk/alpine/libc-utils@0.7.2-r3?arch=x86_64&upstream=libc-dev&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=291d1267b40d636f purl=pkg:apk/alpine/alpine-baselayout-data@3.4.0-r0?arch=x86_64&upstream=alpine-baselayout&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=d9700f02cf26e8b8 purl=pkg:apk/alpine/musl@1.2.3-r4?arch=x86_64&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=623d53216342d45e purl=pkg:apk/alpine/busybox@1.35.0-r29?arch=x86_64&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=256fc96b4a8c4da8 purl=pkg:apk/alpine/busybox-binsh@1.35.0-r29?arch=x86_64&upstream=busybox&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=92b19c7750fb559d purl=pkg:apk/alpine/alpine-baselayout@3.4.0-r0?arch=x86_64&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=2b5e23d349b556cf purl=pkg:apk/alpine/alpine-keys@2.4-r1?arch=x86_64&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=b805d823ae624f04 purl=pkg:apk/alpine/ca-certificates-bundle@20220614-r4?arch=x86_64&upstream=ca-certificates&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=d3084c788891fb28 purl=pkg:apk/alpine/libcrypto3@3.0.8-r3?arch=x86_64&upstream=openssl&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=2a95f0251fba7a33 purl=pkg:apk/alpine/libssl3@3.0.8-r3?arch=x86_64&upstream=openssl&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=b15247aafcd4a647 purl=pkg:apk/alpine/ssl_client@1.35.0-r29?arch=x86_64&upstream=busybox&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=94014313cfcd2b71 purl=pkg:apk/alpine/zlib@1.2.13-r0?arch=x86_64&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=e5f757b0df1f62bc purl=pkg:apk/alpine/apk-tools@2.12.10-r1?arch=x86_64&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=e903138d19e85b80 purl=pkg:apk/alpine/scanelf@1.3.5-r1?arch=x86_64&upstream=pax-utils&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=f71ecf5267e6c37b purl=pkg:apk/alpine/musl-utils@1.2.3-r4?arch=x86_64&upstream=musl&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=8126b232e2d3c608 purl=pkg:apk/alpine/libc-utils@0.7.2-r3?arch=x86_64&upstream=libc-dev&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=58d60d9b7d1565f1 purl=pkg:apk/alpine/libunistring@1.1-r0?arch=x86_64&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=3841a3199a1ee118 purl=pkg:apk/alpine/libidn2@2.3.4-r0?arch=x86_64&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=e40c4f862e3949e8 purl=pkg:apk/alpine/pcre2@10.42-r0?arch=x86_64&distro=alpine-3.17.3
[0000] TRACE merging similar packages id=971b42d7909ea972 purl=pkg:apk/alpine/wget@1.21.3-r2?arch=x86_64&distro=alpine-3.17.3

# proceeds to output 25 packages, not 58

You'll see merged location elements for each package:

{
  "id": "94014313cfcd2b71",
  "name": "zlib",
  "version": "1.2.13-r0",
  "type": "apk",
  "foundBy": "apkdb-cataloger",
  "locations": [
    {
      "path": "/lib/apk/db/installed",
      "layerID": "sha256:0d71e44edab1e63f802dfd59cbf8c128c4f89f2ae3c4edb79475678dcedb5bff"
    },
    {
      "path": "/lib/apk/db/installed",
      "layerID": "sha256:a2ea955c0abfa7fb734e0991ef02fb4e4f35e8090ae76cd6f14dc58d037fa23e"
    },
    {
      "path": "/lib/apk/db/installed",
      "layerID": "sha256:f1417ff83b319fbdae6dd9cd6d8c9c88002dcd75ecf6ec201c8c6894681cf2b5"
    }
  ],
  "licenses": [
    "Zlib"
  ],
  "language": "",
  "cpes": [
    "cpe:2.3:a:zlib:zlib:1.2.13-r0:*:*:*:*:*:*:*"
  ],
  "purl": "pkg:apk/alpine/zlib@1.2.13-r0?arch=x86_64&distro=alpine-3.17.3",
...
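
Given merged locations like these, a consumer could look up the earliest layer once it knows the image's layer order. The sketch below assumes the caller already has the layer IDs ordered base-first (for example, from the image metadata); `Location`, `firstIntroduced`, and the placeholder layer IDs are hypothetical and are not part of syft's API or of the output above.

```go
package main

import "fmt"

// Location mirrors the two fields shown in the JSON output; the type itself
// is a hypothetical stand-in.
type Location struct {
	Path    string
	LayerID string
}

// firstIntroduced returns the location that belongs to the earliest layer,
// given the image's layer IDs ordered base-first. It reports false when no
// location matches a known layer.
func firstIntroduced(layerOrder []string, locations []Location) (Location, bool) {
	index := make(map[string]int, len(layerOrder))
	for i, id := range layerOrder {
		index[id] = i
	}
	best, bestIdx, ok := Location{}, len(layerOrder), false
	for _, loc := range locations {
		if i, known := index[loc.LayerID]; known && i < bestIdx {
			best, bestIdx, ok = loc, i, true
		}
	}
	return best, ok
}

func main() {
	// Placeholder IDs; a real image would use sha256 digests.
	layerOrder := []string{"layer-base", "layer-wget", "layer-curl"}
	locations := []Location{
		{Path: "/lib/apk/db/installed", LayerID: "layer-curl"},
		{Path: "/lib/apk/db/installed", LayerID: "layer-base"},
		{Path: "/lib/apk/db/installed", LayerID: "layer-wget"},
	}
	if loc, ok := firstIntroduced(layerOrder, locations); ok {
		fmt.Println("first introduced in", loc.LayerID)
	}
}
```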

TODO:

  • add tests 🧛 🩸
  • add layer index to location?
  • sort slice from location set not lexically, but by layer order.
  • there are a lot of "found path duplicate of" log entries, which hints that there is an issue with relationship creation for these duplicate packages.
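
For the layer-order sorting item in the TODO list, one possible approach is to rank each location by its layer's position in the image and sort on that rank. This is only a sketch under assumptions: `Location`, `sortByLayerOrder`, and the placeholder layer IDs are hypothetical, and the PR may end up storing a layer index on the location instead.

```go
package main

import (
	"fmt"
	"sort"
)

// Location is a hypothetical stand-in for a package location.
type Location struct {
	Path    string
	LayerID string
}

// sortByLayerOrder sorts locations in place by the position of their layer
// in layerOrder (base-first), falling back to path for ties. Locations whose
// layer is unknown sort last.
func sortByLayerOrder(layerOrder []string, locations []Location) {
	index := make(map[string]int, len(layerOrder))
	for i, id := range layerOrder {
		index[id] = i
	}
	rank := func(loc Location) int {
		if i, ok := index[loc.LayerID]; ok {
			return i
		}
		return len(layerOrder)
	}
	sort.SliceStable(locations, func(a, b int) bool {
		ra, rb := rank(locations[a]), rank(locations[b])
		if ra != rb {
			return ra < rb
		}
		return locations[a].Path < locations[b].Path
	})
}

func main() {
	// Placeholder IDs chosen so that lexical order differs from layer order.
	layerOrder := []string{"layer-c", "layer-a", "layer-b"}
	locations := []Location{
		{Path: "/lib/apk/db/installed", LayerID: "layer-a"},
		{Path: "/lib/apk/db/installed", LayerID: "layer-b"},
		{Path: "/lib/apk/db/installed", LayerID: "layer-c"},
	}
	sortByLayerOrder(layerOrder, locations)
	for _, loc := range locations {
		fmt.Println(loc.LayerID, loc.Path)
	}
}
```

With locations ordered this way, the first element of the slice is the first layer in which the path appears.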

Open question:

  • Should we omit packages for certain ecosystems that have been found in previous layers but are known to be the same? E.g. deb/apk/rpm packages are in a single DB, so adding any new package will make the previously installed packages look like they've been installed again, which isn't what's happening here.

Problems:

  • This will report packages that were removed in a later layer and are therefore not logically in the squashed representation, introducing false positives (FPs) relative to the squashed representation.
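
One conceivable mitigation for this problem, sketched here purely as an illustration, is to compare the all-layers result against the packages present in the squashed view and set aside anything that no longer exists there. `Package` and `splitBySquashed` are hypothetical names, and the scenario in `main` (wget being removed in a later layer) is made up for the example.

```go
package main

import "fmt"

// Package is a hypothetical, simplified package identity
// (not syft's package type).
type Package struct {
	Name    string
	Version string
}

// splitBySquashed partitions packages found across all layers into those
// that also appear in the squashed view (kept) and those that do not
// (dropped).
func splitBySquashed(allLayers, squashed []Package) (kept, dropped []Package) {
	inSquashed := make(map[Package]bool, len(squashed))
	for _, p := range squashed {
		inSquashed[p] = true
	}
	for _, p := range allLayers {
		if inSquashed[p] {
			kept = append(kept, p)
		} else {
			dropped = append(dropped, p)
		}
	}
	return kept, dropped
}

func main() {
	// Illustrative data only: assume wget was installed and later removed.
	allLayers := []Package{
		{Name: "zlib", Version: "1.2.13-r0"},
		{Name: "wget", Version: "1.21.3-r2"},
	}
	squashed := []Package{
		{Name: "zlib", Version: "1.2.13-r0"},
	}
	kept, dropped := splitBySquashed(allLayers, squashed)
	fmt.Println("kept:", kept)
	fmt.Println("dropped:", dropped)
}
```

Whether dropped packages should be omitted or only flagged is a design decision this sketch does not settle.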

Closes #435

Signed-off-by: Alex Goodman <alex.goodman@anchore.com>
@wagoodman wagoodman added the question Further information is requested label Apr 7, 2023
github-actions bot commented Apr 7, 2023

Benchmark Test Results

Benchmark results from the latest changes vs base branch
goos: linux
goarch: amd64
pkg: github.com/anchore/syft/test/integration
cpu: Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
                                                            │ ./.tmp/benchmark-14e8cb4.txt │
                                                            │            sec/op            │
ImagePackageCatalogers/alpmdb-cataloger-2                   11.80m ± 24%
ImagePackageCatalogers/ruby-gemspec-cataloger-2             856.1µ ±  2%
ImagePackageCatalogers/python-package-cataloger-2           3.097m ±  1%
ImagePackageCatalogers/php-composer-installed-cataloger-2   695.8µ ±  1%
ImagePackageCatalogers/javascript-package-cataloger-2       356.7µ ±  2%
ImagePackageCatalogers/dpkgdb-cataloger-2                   511.1µ ±  1%
ImagePackageCatalogers/rpm-db-cataloger-2                   491.1µ ±  3%
ImagePackageCatalogers/java-cataloger-2                     10.73m ±  1%
ImagePackageCatalogers/graalvm-native-image-cataloger-2     8.390µ ±  2%
ImagePackageCatalogers/apkdb-cataloger-2                    556.0µ ±  0%
ImagePackageCatalogers/go-module-binary-cataloger-2         18.95µ ±  2%
ImagePackageCatalogers/dotnet-deps-cataloger-2              981.6µ ±  1%
ImagePackageCatalogers/portage-cataloger-2                  344.5µ ±  1%
ImagePackageCatalogers/nix-store-cataloger-2                222.9µ ±  2%
ImagePackageCatalogers/sbom-cataloger-2                     110.8µ ±  0%
ImagePackageCatalogers/binary-cataloger-2                   190.1µ ±  0%
geomean                                                     451.0µ

                                                            │ ./.tmp/benchmark-14e8cb4.txt │
                                                            │             B/op             │
ImagePackageCatalogers/alpmdb-cataloger-2                   5.064Mi ± 0%
ImagePackageCatalogers/ruby-gemspec-cataloger-2             123.8Ki ± 0%
ImagePackageCatalogers/python-package-cataloger-2           947.4Ki ± 0%
ImagePackageCatalogers/php-composer-installed-cataloger-2   155.8Ki ± 0%
ImagePackageCatalogers/javascript-package-cataloger-2       90.79Ki ± 0%
ImagePackageCatalogers/dpkgdb-cataloger-2                   144.6Ki ± 0%
ImagePackageCatalogers/rpm-db-cataloger-2                   170.2Ki ± 0%
ImagePackageCatalogers/java-cataloger-2                     2.720Mi ± 0%
ImagePackageCatalogers/graalvm-native-image-cataloger-2     1.555Ki ± 0%
ImagePackageCatalogers/apkdb-cataloger-2                    129.2Ki ± 0%
ImagePackageCatalogers/go-module-binary-cataloger-2         3.133Ki ± 0%
ImagePackageCatalogers/dotnet-deps-cataloger-2              314.5Ki ± 0%
ImagePackageCatalogers/portage-cataloger-2                  77.23Ki ± 0%
ImagePackageCatalogers/nix-store-cataloger-2                36.07Ki ± 0%
ImagePackageCatalogers/sbom-cataloger-2                     13.57Ki ± 0%
ImagePackageCatalogers/binary-cataloger-2                   29.91Ki ± 0%
geomean                                                     101.7Ki

                                                            │ ./.tmp/benchmark-14e8cb4.txt │
                                                            │          allocs/op           │
ImagePackageCatalogers/alpmdb-cataloger-2                   86.71k ± 0%
ImagePackageCatalogers/ruby-gemspec-cataloger-2             2.049k ± 0%
ImagePackageCatalogers/python-package-cataloger-2           15.49k ± 0%
ImagePackageCatalogers/php-composer-installed-cataloger-2   3.457k ± 0%
ImagePackageCatalogers/javascript-package-cataloger-2       1.205k ± 0%
ImagePackageCatalogers/dpkgdb-cataloger-2                   2.646k ± 0%
ImagePackageCatalogers/rpm-db-cataloger-2                   3.759k ± 0%
ImagePackageCatalogers/java-cataloger-2                     38.26k ± 0%
ImagePackageCatalogers/graalvm-native-image-cataloger-2     40.00 ± 0%
ImagePackageCatalogers/apkdb-cataloger-2                    3.438k ± 0%
ImagePackageCatalogers/go-module-binary-cataloger-2         101.0 ± 0%
ImagePackageCatalogers/dotnet-deps-cataloger-2              5.011k ± 0%
ImagePackageCatalogers/portage-cataloger-2                  1.539k ± 0%
ImagePackageCatalogers/nix-store-cataloger-2                671.0 ± 0%
ImagePackageCatalogers/sbom-cataloger-2                     392.0 ± 0%
ImagePackageCatalogers/binary-cataloger-2                   872.0 ± 0%
geomean                                                     2.062k

@spiffcs spiffcs self-requested a review June 22, 2023 16:21
Deep232 commented Feb 21, 2024

May I know why this PR is not merged? It's extremely helpful for deduplicating components across layers.

@tomerse-sg

Do you have an ETA for this addition? It would be helpful.

tgerla (Contributor) commented Mar 28, 2024

Hi @tomerse-sg and @Deep232, thanks for the notes. We don't have an ETA, but we will take a look and see if we can move this forward. Thank you for letting us know this would be useful for you!

Successfully merging this pull request may close these issues.

Provide a way to get the LayerID the package was first found in
4 participants