-
Notifications
You must be signed in to change notification settings - Fork 19
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for go workspaces #457
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is still a draft I only had some immediate comments on the proposal but didn't do a detailed in-depth review of the logic fitting the overall design.
63706be
to
3be712f
Compare
c67073f
to
a1cbcdb
Compare
3c186c7
to
a9b5e69
Compare
0e85b79
to
40477f4
Compare
changed integration test repo to include checks for missing_hash property |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Functionally, it looks good to me 🚀 Hopefully my knowledge of go workspaces is sufficient for that statement to mean anything :harold:
Left a couple of minor nitpicks. Also, can the packages with missing checksums be noted in the README of the integration test repo+branch?
|
||
repo_root = RootedPath(tmp_path) | ||
|
||
go_work_root = _get_go_work_root(repo_root, Go(), {"cwd": repo_root}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this calling go directly? Should we be mocking that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good, I learnt something new during the review, e.g. that any repo-local submodule still needs to be declared uniquely within all Go modules and so it is imperative one still uses the replace
keyword to denote that the local implementation should be used even though that is the only one!
Anyway, I think we need some documentation update on workspaces, don't we?
@@ -32,6 +32,8 @@ | |||
_create_modules_from_parsed_data, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This commit fails unit tests with:
FAILED tests/unit/package_managers/test_gomod.py::test_resolve_gomod_vendor_without_flag - FileNotFoundError: [Errno 2] No such file or directory: '/usr/bin/go'
FAILED tests/unit/package_managers/test_gomod.py::test_get_go_work_root - FileNotFoundError: [Errno 2] No such file or directory: '/usr/bin/go'
FAILED tests/unit/package_managers/test_gomod.py::test_get_go_work_root_when_go_work_does_not_exist - FileNotFoundError: [Errno 2] No such file or directory: '/usr/bin/go'
FAILED tests/unit/package_managers/test_gomod.py::test_get_go_work_root_when_go_work_is_outside_of_repo - FileNotFoundError: [Errno 2] No such file or directory: '/usr/bin/go'
so a mock to subprocess.run
or maybe even better get_go_work_root
needs to be added to the test cases in question, though test_resolve_gomod_vendor_without_flag
in particular might be a harder one to mock correctly without affecting anything else, but it might turn out just fine as well, you'll see.
else: | ||
module_list.append(module) | ||
|
||
# should never happen, since the main module will always be a part of the json stream |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so all the workspace branches in the integration test repo are invalid then, correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can currently see 3 branches that are related to workspaces: nested_main_module
, worspace_separate_modules
and go_workspaces
. I believe the first two ones are leftovers that should be deleted. All of these were created as we evolved the code to try to cover more corner cases.
The go_workspaces
branch now covers the most weird scenario: the root repo is not a go project, and the workspaces root is the first nested directory (worspace_modules
). I am not sure we need such a complicated case in the integration test, but the bright side is that we're covering the worst we could come up with.
|
||
def _process_modules_json_stream( | ||
app_dir: RootedPath, modules_json_stream: str | ||
) -> tuple[ModuleDict, list[ModuleDict]]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick: we should return a single list with the main module as the first element, the extraction should be done on the caller's end.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO, that will make the output a little less clear, since there's no way to tell that the first module is always the main module unless you look at this function's docstring or implementation. By returning a tuple, it is at least implicit that the first tuple element is unique somehow when compared to the elements in the list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is IMO an inferior approach from design POV. If an element is special it needs to be reflected in the docs instead of trying to figure out a creative way how such a semantics needs to be represented in a data type.
go_sum_files = _get_go_sum_files(go_work_root, go, run_params) | ||
modules_in_go_sum = _parse_go_sum_files(go_sum_files) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fact we need to first get the list of sum files should be transparent to _resolve_gomod
as a caller and instead be done in _parse_go_sum_files
for enhanced clarity, especially since the variable is not being used anywhere later in the function.
Alternatively, you could keep _get_go_sum_files
here as a convenience function call, however, you'd then loop directly over it as _parse_go_sum_files
doesn't seem to bring much value really since it's just a simple loop wrapper which is maybe even better than the former suggestion.
main_module_name = go([*go_list, "-m"], run_params).rstrip() | ||
modules_json_stream = go([*go_list, "-m", "-json"], run_params).rstrip() | ||
main_module_dict, workspace_dict_list = _process_modules_json_stream( | ||
app_dir, modules_json_stream | ||
) | ||
|
||
path = main_module_dict["Path"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having recently worked on the toolchain selection which gave me headaches wrt/ unit tests because _resolve_gomod
is a beast of a function my impression has become that anything that needs to execute a Go command should go into a separate function to potentially make mocking in test_resolve_gomod
much easier.
Also from logical perspective this particular block seems to be open-coded quite a bit. What if we introduced a function, say _parse_modules
along the following lines:
def _parse_modules(app_dir, version_resolver) -> list[ParsedModule]:
run_go_list_json
process_modules_json_stream
...
return [main_module] + workspace_modules
and then in resolve_gomod
we would just pop the main module out of the list for some further processing, but the idea is to make the code in resolve_gomod
cleaner, leaner and easier to follow. Do you think the ^above would help achieving that by consolidating the special casing of the main module we're doing here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I completely agree this is a better approach, but since this is refactoring work, can we do it as a follow up?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd be afraid it would remain TODO forever :), but okay, UNLESS it turns out more substantial changes are needed within this PR.
# Test case checks if cachi2 can process go workspaces properly. | ||
pytest.param( | ||
utils.TestParameters( | ||
repo="https://github.com/cachito-testing/cachi2-gomod.git", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looking at the repo the workspace cases don't seem to include a nuance of a main module i.e. having a go.mod
file checked in - shouldn't we change that and make sure the integration test repos are proper Go projects?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This case does have a go.mod
file, it's just buried very deep within: https://github.com/cachito-testing/cachi2-gomod/tree/go_workspaces/workspace_modules/hello
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant the main go.mod module, should a project, even if using workspaces, have a go.mod
file right next to the go.work
file?
6c17a97
to
cc7eb2d
Compare
Whenever workspaces are enabled, the "go list -m" command will return a list of all workspaces modules instead of the usual single module present in the path being processed by Cachi2. For this reason, we need to properly parse this extra data so that they can be included in the resulting SBOM. Signed-off-by: ejegrova <ejegrova@redhat.com>
All go.sum files and go.work.sum are checked for checksums. If not found, the property cachi2:missing_hash:in_file has value of go.work.sum. Signed-off-by: Bruno Pimentel <bpimente@redhat.com>
Signed-off-by: ejegrova <ejegrova@redhat.com>
My question is this: what is the main problem of letting users point cachi2 to the |
@@ -860,13 +880,27 @@ def _resolve_gomod( | |||
# Make Go ignore the vendor dir even if there is one | |||
go_list.extend(["-mod", "readonly"]) | |||
|
|||
main_module_name = go([*go_list, "-m"], run_params).rstrip() | |||
# breakpoint() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a left-over from some debugging.
else: | ||
go_work_root = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick: if go_work_root
is initialized at the top of the function, this clause isn't needed.
run_side_effects.append( | ||
proc_mock( | ||
"go work edit -json", | ||
returncode=0, | ||
stdout=get_mocked_data(data_dir, "workspaces/go_work.json"), | ||
) | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apart from this particular hunk, this test function is pretty much bit-for-bit identical to the existing test_resolve_gomod
function. Given that that's the case, we should better mock _get_go_sum_files
and add this as another parametrize case to the existing test case, this is simply a beast of a test testing a beast of a function which leads to enormous code redundancy.
discussion: #398
checksum validation:
cachi2:missing_hash:in_file
has value of go.work.sumJIRA: STONEBLD-2043
Maintainers will complete the following section
Note: if the contribution is external (not from an organization member), the CI
pipeline will not run automatically. After verifying that the CI is safe to run:
/ok-to-test
(as is the standard for Pipelines as Code)