Add support for go workspaces #457

ejegrova · 2024-01-24T12:42:02Z

discussion: #398

checksum validation:

go.work.sum and all go.sum files in the workspace modules are used to look up checksums
if a checksum is not found, the property cachi2:missing_hash:in_file is set to go.work.sum
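
A minimal sketch of the lookup described above, not the code in this PR. It assumes that a module counts as "found" when its (name, version) pair appears in any of the sum files; the helper names `parse_go_sum` and `missing_hash_property` and the shape of the returned property dict are hypothetical.

```python
from typing import Iterable, Optional, Set, Tuple


def parse_go_sum(text: str) -> Set[Tuple[str, str]]:
    """Collect (module, version) pairs from go.sum-style content."""
    modules = set()
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        name, version, _checksum = parts
        modules.add((name, version))
    return modules


def missing_hash_property(
    name: str,
    version: str,
    sum_file_contents: Iterable[str],
    fallback_file: str = "go.work.sum",
) -> Optional[dict]:
    """Return a missing-hash property if no sum file knows the module."""
    known: Set[Tuple[str, str]] = set()
    for text in sum_file_contents:
        known |= parse_go_sum(text)
    if (name, version) in known:
        return None
    # Assumption: the property points at go.work.sum when nothing matched.
    return {"name": "cachi2:missing_hash:in_file", "value": fallback_file}
```

Passing the contents of go.work.sum together with every module's go.sum gives one merged lookup set, which is the behaviour the description outlines.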

JIRA: STONEBLD-2043

Maintainers will complete the following section

Commit messages are descriptive enough
Code coverage from testing does not decrease and new code is covered
Docs updated (if applicable)
Docs links in the code are still valid (if docs were updated)

Note: if the contribution is external (not from an organization member), the CI
pipeline will not run automatically. After verifying that the CI is safe to run:

approve GitHub Actions workflows by clicking a button
approve the Red Hat Trusted App Pipeline container build by commenting /ok-to-test
(as is the standard for Pipelines as Code)

eskultety

Since this is still a draft, I only have some immediate comments on the proposal; I didn't do a detailed, in-depth review of how the logic fits the overall design.

ejegrova · 2024-05-13T09:15:05Z

Changed the integration test repo to include checks for the missing_hash property.

taylormadore

Functionally, it looks good to me 🚀 Hopefully my knowledge of go workspaces is sufficient for that statement to mean anything :harold:

Left a couple of minor nitpicks. Also, can the packages with missing checksums be noted in the README of the integration test repo+branch?

taylormadore · 2024-05-14T20:46:44Z

tests/unit/package_managers/test_gomod.py

+
+    repo_root = RootedPath(tmp_path)
+
+    go_work_root = _get_go_work_root(repo_root, Go(), {"cwd": repo_root})


Is this calling go directly? Should we be mocking that?

eskultety

Looking good. I learnt something new during the review: any repo-local submodule still needs to be declared uniquely among all Go modules, so it is imperative that one still uses the replace keyword to denote that the local implementation should be used, even though that is the only one!
Anyway, I think we need some documentation update on workspaces, don't we?

eskultety · 2024-05-14T10:14:09Z

tests/unit/package_managers/test_gomod.py

@@ -32,6 +32,8 @@
    _create_modules_from_parsed_data,


This commit fails unit tests with:

FAILED tests/unit/package_managers/test_gomod.py::test_resolve_gomod_vendor_without_flag - FileNotFoundError: [Errno 2] No such file or directory: '/usr/bin/go'
FAILED tests/unit/package_managers/test_gomod.py::test_get_go_work_root - FileNotFoundError: [Errno 2] No such file or directory: '/usr/bin/go'
FAILED tests/unit/package_managers/test_gomod.py::test_get_go_work_root_when_go_work_does_not_exist - FileNotFoundError: [Errno 2] No such file or directory: '/usr/bin/go'
FAILED tests/unit/package_managers/test_gomod.py::test_get_go_work_root_when_go_work_is_outside_of_repo - FileNotFoundError: [Errno 2] No such file or directory: '/usr/bin/go'

So a mock of subprocess.run, or maybe even better of get_go_work_root, needs to be added to the test cases in question. test_resolve_gomod_vendor_without_flag in particular might be harder to mock correctly without affecting anything else, but it might turn out just fine as well, you'll see.
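
For illustration, a self-contained sketch of the kind of mock suggested here. The `get_go_work_root` below is a hypothetical stand-in, not the PR's `_get_go_work_root`; it assumes the workspace root is derived from `go env GOWORK`, and `check_with_mock` is likewise only a name chosen for the example.

```python
import subprocess
from pathlib import Path
from typing import Optional
from unittest import mock


def get_go_work_root(cwd: str) -> Optional[Path]:
    """Hypothetical helper: ask Go where go.work lives, None if unused."""
    proc = subprocess.run(
        ["go", "env", "GOWORK"],
        cwd=cwd,
        capture_output=True,
        text=True,
        check=True,
    )
    go_work = proc.stdout.strip()
    if not go_work or go_work == "off":
        return None
    return Path(go_work).parent


def check_with_mock() -> Optional[Path]:
    """Run the helper without a Go binary by patching subprocess.run."""
    fake = subprocess.CompletedProcess(
        args=["go", "env", "GOWORK"],
        returncode=0,
        stdout="/repo/go.work\n",
        stderr="",
    )
    with mock.patch("subprocess.run", return_value=fake) as run_mock:
        root = get_go_work_root("/repo")
    run_mock.assert_called_once()
    return root
```

Because the patch replaces `subprocess.run` itself, the test no longer depends on `/usr/bin/go` being present on the machine running the unit tests.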

eskultety · 2024-05-15T16:15:30Z

cachi2/core/package_managers/gomod.py

+        else:
+            module_list.append(module)
+
+    # should never happen, since the main module will always be a part of the json stream


so all the workspace branches in the integration test repo are invalid then, correct?

I can currently see 3 branches that are related to workspaces: nested_main_module, worspace_separate_modules and go_workspaces. I believe the first two are leftovers that should be deleted. All of these were created as we evolved the code to try to cover more corner cases.

The go_workspaces branch now covers the weirdest scenario: the root of the repo is not a Go project, and the workspace root is the first nested directory (workspace_modules). I am not sure we need such a complicated case in the integration test, but the bright side is that we're covering the worst we could come up with.

eskultety · 2024-05-15T16:16:33Z

cachi2/core/package_managers/gomod.py

+
+def _process_modules_json_stream(
+    app_dir: RootedPath, modules_json_stream: str
+) -> tuple[ModuleDict, list[ModuleDict]]:


nitpick: we should return a single list with the main module as the first element, the extraction should be done on the caller's end.

IMO, that will make the output a little less clear, since there's no way to tell that the first module is always the main module unless you look at this function's docstring or implementation. By returning a tuple, it is at least implicit that the first tuple element is unique somehow when compared to the elements in the list.

That is IMO an inferior approach from a design POV. If an element is special, that needs to be reflected in the docs rather than by figuring out a creative way to represent such semantics in a data type.
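
To make the two return shapes under discussion concrete, here is a hedged sketch of parsing a `go list -m -json` stream (a sequence of concatenated JSON objects). It is not the PR's `_process_modules_json_stream`: the names are hypothetical, and it assumes the main module is the one whose `Dir` matches the directory being processed.

```python
import json
from pathlib import Path
from typing import List, Tuple


def split_json_stream(stream: str) -> List[dict]:
    """Decode back-to-back JSON objects from a single string."""
    decoder = json.JSONDecoder()
    objects = []
    pos = 0
    while pos < len(stream):
        # raw_decode does not skip leading whitespace, so do it here
        if stream[pos].isspace():
            pos += 1
            continue
        obj, pos = decoder.raw_decode(stream, pos)
        objects.append(obj)
    return objects


def process_modules(app_dir: str, stream: str) -> Tuple[dict, List[dict]]:
    """Tuple variant: (main module, other workspace modules)."""
    main = None
    others = []
    for module in split_json_stream(stream):
        # Assumption: the main module lives in the processed directory.
        if main is None and Path(module["Dir"]) == Path(app_dir):
            main = module
        else:
            others.append(module)
    if main is None:
        raise ValueError("main module not found in the JSON stream")
    return main, others


def process_modules_as_list(app_dir: str, stream: str) -> List[dict]:
    """List variant: main module first, as documented in this docstring."""
    main, others = process_modules(app_dir, stream)
    return [main, *others]
```

With the list variant the caller writes `main, *rest = process_modules_as_list(...)`, and the "first element is the main module" contract lives only in the docstring, which is the trade-off the two reviewers disagree on.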

eskultety · 2024-05-16T14:30:39Z

cachi2/core/package_managers/gomod.py

+        go_sum_files = _get_go_sum_files(go_work_root, go, run_params)
+        modules_in_go_sum = _parse_go_sum_files(go_sum_files)


The fact that we first need to get the list of sum files should be transparent to _resolve_gomod as a caller; it should instead be done in _parse_go_sum_files for clarity, especially since the variable is not used anywhere later in the function.
Alternatively, you could keep _get_go_sum_files here as a convenience function call and loop directly over its result, since _parse_go_sum_files doesn't seem to bring much value: it's just a simple loop wrapper. That is maybe even better than the former suggestion.
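
A rough sketch of the second alternative, assuming the list of sum files is derived from the `Use` entries (each with a `DiskPath`) in the output of `go work edit -json`. The function below is hypothetical and takes the JSON text directly instead of running Go, so its signature differs from the PR's `_get_go_sum_files`.

```python
import json
from pathlib import Path
from typing import List


def go_sum_files(go_work_root: str, go_work_json: str) -> List[Path]:
    """List go.work.sum plus one go.sum per module used by the workspace."""
    data = json.loads(go_work_json)
    root = Path(go_work_root)
    files = [root / "go.work.sum"]
    # "Use" may be missing or null when the workspace uses no modules
    for use in data.get("Use") or []:
        files.append(root / use["DiskPath"] / "go.sum")
    return files
```

The caller can then loop over the returned paths directly, for example reading each file that exists and merging the parsed entries into one set, without a separate wrapper function.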

eskultety · 2024-05-16T14:52:37Z

cachi2/core/package_managers/gomod.py

-    main_module_name = go([*go_list, "-m"], run_params).rstrip()
+    modules_json_stream = go([*go_list, "-m", "-json"], run_params).rstrip()
+    main_module_dict, workspace_dict_list = _process_modules_json_stream(
+        app_dir, modules_json_stream
+    )
+
+    path = main_module_dict["Path"]


Having recently worked on the toolchain selection, which gave me headaches wrt/ unit tests because _resolve_gomod is a beast of a function, my impression is that anything that needs to execute a Go command should go into a separate function, to potentially make mocking in test_resolve_gomod much easier.

Also, from a logical perspective, this particular block seems to be open-coded quite a bit. What if we introduced a function, say _parse_modules, along the following lines:

def _parse_modules(app_dir, version_resolver) -> list[ParsedModule]:
    run_go_list_json
    process_modules_json_stream
    ...
    return [main_module] + workspace_modules

and then in resolve_gomod we would just pop the main module out of the list for some further processing. The idea is to make the code in resolve_gomod cleaner, leaner and easier to follow. Do you think the ^above would help achieve that by consolidating the special casing of the main module we're doing here?

I completely agree this is a better approach, but since this is refactoring work, can we do it as a follow up?

I'd be afraid it would remain TODO forever :), but okay, UNLESS it turns out more substantial changes are needed within this PR.

eskultety · 2024-05-17T08:33:19Z

tests/integration/test_gomod.py

+        # Test case checks if cachi2 can process go workspaces properly.
+        pytest.param(
+            utils.TestParameters(
+                repo="https://github.com/cachito-testing/cachi2-gomod.git",


Looking at the repo, the workspace cases don't seem to include the nuance of a main module, i.e. having a go.mod file checked in. Shouldn't we change that and make sure the integration test repos are proper Go projects?

This case does have a go.mod file, it's just buried very deep within: https://github.com/cachito-testing/cachi2-gomod/tree/go_workspaces/workspace_modules/hello

I meant the main go.mod module: should a project, even if using workspaces, have a go.mod file right next to the go.work file?

Whenever workspaces are enabled, the "go list -m" command will return a list of all workspace modules instead of the usual single module present in the path being processed by Cachi2. For this reason, we need to properly parse this extra data so that it can be included in the resulting SBOM.
Signed-off-by: ejegrova <ejegrova@redhat.com>

eskultety · 2024-06-03T11:04:49Z

My question is this: what is the main problem with letting users point cachi2 directly at the go.work file, the same way we expect them to do with go.mod for the main module to be processed? That would lead to consistent behaviour from a UX perspective, where we'd have to process all dependent modules. What are the drawbacks of that? @brunoapimentel do you see a use case where ^this wouldn't work and would simply fail with the cachi2 logic?

eskultety · 2024-06-03T11:18:10Z

cachi2/core/package_managers/gomod.py

@@ -860,13 +880,27 @@ def _resolve_gomod(
        # Make Go ignore the vendor dir even if there is one
        go_list.extend(["-mod", "readonly"])

-    main_module_name = go([*go_list, "-m"], run_params).rstrip()
+    # breakpoint()


This looks like a left-over from some debugging.

eskultety · 2024-06-03T11:20:33Z

tests/unit/package_managers/test_gomod.py

+    else:
+        go_work_root = None


nitpick: if go_work_root is initialized at the top of the function, this clause isn't needed.

eskultety · 2024-06-03T12:05:51Z

tests/unit/package_managers/test_gomod.py

+    run_side_effects.append(
+        proc_mock(
+            "go work edit -json",
+            returncode=0,
+            stdout=get_mocked_data(data_dir, "workspaces/go_work.json"),
+        )
+    )


Apart from this particular hunk, this test function is pretty much bit-for-bit identical to the existing test_resolve_gomod function. Given that, we should rather mock _get_go_sum_files and add this as another parametrize case to the existing test. This is simply a beast of a test testing a beast of a function, which leads to enormous code redundancy.

eskultety reviewed Feb 1, 2024

ejegrova force-pushed the workspaces branch from a19c282 to 79b4d51 · February 19, 2024 12:08

ejegrova force-pushed the workspaces branch from 79b4d51 to 822098e · March 11, 2024 16:09

github-advanced-security bot found potential problems Mar 11, 2024 (cachi2/core/package_managers/gomod.py, fixed)

ejegrova force-pushed the workspaces branch from 822098e to fee8f1e · March 12, 2024 14:16

github-advanced-security bot found potential problems Mar 12, 2024 (cachi2/core/package_managers/gomod.py, fixed)

ejegrova force-pushed the workspaces branch 4 times, most recently from 63706be to 3be712f · March 14, 2024 12:05

ejegrova marked this pull request as ready for review · March 14, 2024 12:41

ejegrova requested review from brunoapimentel, taylormadore, ben-alkov and slimreaper35 · March 14, 2024 12:42

brunoapimentel reviewed Mar 20, 2024

brunoapimentel reviewed Mar 20, 2024

ejegrova force-pushed the workspaces branch from 3be712f to 66e52a6 · March 25, 2024 13:27

ejegrova marked this pull request as draft · April 10, 2024 09:49

ejegrova force-pushed the workspaces branch 2 times, most recently from c67073f to a1cbcdb · April 15, 2024 13:00

github-advanced-security bot found potential problems Apr 15, 2024 (tests/unit/package_managers/test_gomod.py, three findings, fixed)

ejegrova force-pushed the workspaces branch 2 times, most recently from 3c186c7 to a9b5e69 · April 17, 2024 08:19

ejegrova marked this pull request as ready for review · April 17, 2024 12:29

ben-alkov reviewed Apr 17, 2024

ben-alkov reviewed Apr 17, 2024

ejegrova force-pushed the workspaces branch from a9b5e69 to 971fe99 · April 18, 2024 09:16

ejegrova force-pushed the workspaces branch 2 times, most recently from 0e85b79 to 40477f4 · May 3, 2024 15:49

ben-alkov requested a review from eskultety · May 6, 2024 15:22

ejegrova force-pushed the workspaces branch from 40477f4 to 0f7d2de · May 7, 2024 11:54

ejegrova requested review from brunoapimentel and ben-alkov · May 7, 2024 12:47

ejegrova force-pushed the workspaces branch from 0f7d2de to 6f8cdc2 · May 13, 2024 09:14

ejegrova force-pushed the workspaces branch from 6f8cdc2 to af16bf7 · May 14, 2024 07:27

taylormadore reviewed May 14, 2024

eskultety reviewed May 17, 2024

ejegrova force-pushed the workspaces branch 4 times, most recently from 6c17a97 to cc7eb2d · May 28, 2024 15:26

ejegrova and others added 3 commits · May 28, 2024 18:17

Parse multiple go.sum files when workspaces are present (e2b5e66)

All go.sum files and go.work.sum are checked for checksums. If not found, the property cachi2:missing_hash:in_file has value of go.work.sum.
Signed-off-by: Bruno Pimentel <bpimente@redhat.com>

Add integration test for go workspaces (362afd4)

Signed-off-by: ejegrova <ejegrova@redhat.com>

ejegrova force-pushed the workspaces branch from cc7eb2d to 362afd4 · May 28, 2024 16:18

eskultety reviewed Jun 3, 2024
