Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rpm PoC: Allow specifying DNF options via CLI input JSON #522

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

eskultety
Copy link
Collaborator

@eskultety eskultety commented Apr 18, 2024

Introduce a new input JSON attribute 'options' for the PoC rpm package
manager allowing consumers to pass arbitrary DNF repo configuration
options that would be applied later through the inject-files command.

The general schema for the input JSON is as follows:

{
    type: <package_manager>
    path: <relative_path>
    options: {
        <namespace>: {
            <key1>..<keyN>: Union[str, Any]<val1>
        }
    }
}

which translated to DNF repo options for RPM might then look like:

{
    type: rpm
    path: .
    options: {
        dnf: {
             repoid1: {gpgcheck: 0, deltarpm: 1},
             repoid2: {fastestmirro: 1, sslverify: 0}
        }
    }
}

which in turn would end up being dumped to the .build-config.json similarly to this:

{
  "environment_variables": [],
  "project_files": [],
  "options": {
    "rpm": {
      "dnf": {
        "releases": {
          "gpgcheck": 0,
          "enabled": 0,
          "type": "yum"
        }
      }
    }
  }

and finally the resulting generated .repo file might look like:

[releases]
gpgcheck = 0
enabled = 0
type = yum
baseurl = file:///etc/yum.repos.d/deps/rpm/x86_64/releases

Notes:

  • worth mentioning is that the way this is currently designed is that options are only ever formatted in the build config if some were actually set on the CLI (rather than being formatted as options: {}) so that this would be transparent to other package manager for which we don't intend to support options at the moment; this was all achieved by setting exclude_none=True globally for buildconfig.model_dump_json
  • the reason why there's an additional level of nesting in the .build-config.json namespaced by "rpm" is due to the fact that it's possible to specify more than one package with the input JSON on CLI, thus theoretically possible to provide various options for various package managers (if ever turned on) so we'd need to know which options relate to which package manager type which wouldn't otherwise be possible from current build config contents.

Maintainers will complete the following section

  • Commit messages are descriptive enough
  • Code coverage from testing does not decrease and new code is covered
  • Docs updated (if applicable)
  • Docs links in the code are still valid (if docs were updated)

Note: if the contribution is external (not from an organization member), the CI
pipeline will not run automatically. After verifying that the CI is safe to run:

@eskultety
Copy link
Collaborator Author

@brunoapimentel I wonder if I should extract the ConfigParser bits on .repo file generation and propose that as a separate PR as I didn't get any comments on that and I'm not sure whether we're in agreement or it got lost in context of the main change - options in RpmPackageInput.

@brunoapimentel
Copy link
Contributor

@brunoapimentel I wonder if I should extract the ConfigParser bits on .repo file generation and propose that as a separate PR as I didn't get any comments on that and I'm not sure whether we're in agreement or it got lost in context of the main change - options in RpmPackageInput.

I did some tests with the implementation based on ConfigParser, and it LGTM. So I'd probably keep it the way it is since the implementation is already done.

The only thing I was in middle of reviewing before getting carried away with something else was the custom validation you added in the last commit. From the few manual tests I did, it seemed that Pydantic handled these same use cases pretty well, and I thought that the error messages were reasonable. But I've likely missed the issues you had with unit tests.

@eskultety
Copy link
Collaborator Author

The only thing I was in middle of reviewing before getting carried away with something else was the custom validation you added in the last commit. From the few manual tests I did, it seemed that Pydantic handled these same use cases pretty well, and I thought that the error messages were reasonable. But I've likely missed the issues you had with unit tests.

Soo, let me illustrate the issue on 1 example:

x = _DNFOptions.model_validate({'type': 'rpm', 'options': 'bad_type'})
*** pydantic_core._pydantic_core.ValidationError: 3 validation errors for _DNFOptions
dnf
  Field required [type=missing, input_value={'type': 'rpm', 'options': 'bad_type'}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.6/v/missing
type
  Extra inputs are not permitted [type=extra_forbidden, input_value='rpm', input_type=str]
    For further information visit https://errors.pydantic.dev/2.6/v/extra_forbidden
options
  Extra inputs are not permitted [type=extra_forbidden, input_value='bad_type', input_type=str]
    For further information visit https://errors.pydantic.dev/2.6/v/extra_forbidden

^This is what I get without the validator, IOW the only bit of information I get is 'Field required' but it doesn't tell me anything else like what field is missing, so despite input_value the error message is IMO very vague.

vs

This is after introducing the validator

x = _DNFOptions.model_validate({'type': 'rpm', 'options': 'bad_type'})
*** pydantic_core._pydantic_core.ValidationError: 1 validation error for _DNFOptions
  Value error, Missing required namespace attribute in '{'type': 'rpm', 'options': 'bad_type'}': 'dnf' [type=value_error, input_value={'type': 'rpm', 'options': 'bad_type'}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.6/v/value_error

It's at least IMO clear that a keyword 'dnf' is missing, so from that perspective I think that message is clearer.
Besides being vague, based on the fact that you get multiple error messages with the default behaviour I had some troubles figuring out the error msg regex for unit tests properly and so I created a custom validator with more deterministic messages. @brunoapimentel do you think there's a better way? I'm fine with dropping the validator if there is :) .

@brunoapimentel
Copy link
Contributor

brunoapimentel commented Apr 30, 2024

@brunoapimentel do you think there's a better way? I'm fine with dropping the validator if there is :) .

Oh, I see it now. When you invoke the actual CLI, the output you get is different (although I'm not sure which piece is making the conversion).

I started with your example here, and extrapolated a little:

$ cachi2 fetch-deps --source ../test-repos/cachi2-rpm --dev-package-managers '{"type": "rpm", "options": "bad_type"}'
2024-04-29 22:18:48,960 ERROR InvalidInput: 1 validation error for user input\npackages -> 0 -> rpm -> options\n  Value error, Unexpected data type for 'options.bad_type' in input JSON: expected 'dict'
Error: InvalidInput: 1 validation error for user input
packages -> 0 -> rpm -> options
  Value error, Unexpected data type for 'options.bad_type' in input JSON: expected 'dict'

$ cachi2 fetch-deps --source ../test-repos/cachi2-rpm --dev-package-managers '{"type": "rpm", "options": {"bad_type": "this"}}'
2024-04-29 22:19:39,650 ERROR InvalidInput: 1 validation error for user input\npackages -> 0 -> rpm -> options\n  Value error, Missing required namespace attribute in '{'bad_type': 'this'}': 'dnf'
Error: InvalidInput: 1 validation error for user input
packages -> 0 -> rpm -> options
  Value error, Missing required namespace attribute in '{'bad_type': 'this'}': 'dnf'

$ cachi2 fetch-deps --source ../test-repos/cachi2-rpm --dev-package-managers '{"type": "rpm", "options": {"dnf": "this"}}'
2024-04-29 22:19:47,235 ERROR InvalidInput: 1 validation error for user input\npackages -> 0 -> rpm -> options\n  Value error, Unexpected data type for 'options.dnf.this' in input JSON: expected 'dict'
Error: InvalidInput: 1 validation error for user input
packages -> 0 -> rpm -> options
  Value error, Unexpected data type for 'options.dnf.this' in input JSON: expected 'dict'

I think these messages are a lot clearer than the ones you posted before, so maybe we could just go with them. Wdyt?

@eskultety
Copy link
Collaborator Author

I think these messages are a lot clearer than the ones you posted before, so maybe we could just go with them. Wdyt?

Uhm, I'm sorry, I'm afraid you lost me a bit in your examples, so do you want me to modify the messages I emit in the validator in any way or are we okay with what is proposed?

Copy link
Contributor

@brunoapimentel brunoapimentel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My only nitpick here is about implementing the custom validation, but I don't see it as blocker, so I'm leaving my ack.

@eskultety eskultety mentioned this pull request May 14, 2024
2 tasks
Cosmetic change - this is mainly to improve the UX of the test results
run summary helping to identify/describe what test cases we run for the
input validation.

Signed-off-by: Erik Skultety <eskultet@redhat.com>
Cosmetic change - we can convey the same message on a single
line and enhance code readability just a tiny bit without line splits.

Signed-off-by: Erik Skultety <eskultet@redhat.com>
This is called as a callback from the cli.py module where we always
have both from_output_dir and for_output_dir resolved. It's fair to
assume any inject_files post action would need both of these to
accomplish whatever it is the action needs to do, so listing both
arguments as positional in the actual callback implementation makes the
code a bit easier to read.

Signed-off-by: Erik Skultety <eskultet@redhat.com>
@eskultety
Copy link
Collaborator Author

@brunoapimentel I needed to update this PR to reflect your recent integration test changes by adjusting them to compare on the parsed structure contents rather than verbatim file contents when it comes to the Configparser instance I introduced. Let me know if you still agree with that adjustment so that I can finally merge this. (this time for real if CI passes)

Repo files follow the INI syntax for which we can use ConfigParser.
The motivation behind using a library abstraction to generate the INI
repo files stems from the fact that we already have some options
rendered conditionally with a fair chance of extending this in the
future with more custom user-supplied DNF repo options which means even
more conditionals which would stretch the capabilities of a plain
triple string (or a template for that matter) a bit too far.
Therefore, this patch proposes using ConfigParser instance for creating
and dumping the repo files in the inject_files post action.

Note that to make things a tiny bit more ergonomic, we subclass the
ConfigParser class to define a couple utility methods that can tell us
whether the ConfigParser instance is effectively empty (because an
empty instance of ConfigParser still evaluates to True) and thus can
be dumped to a file as well as some extra handling of default options,
e.g. defaulting to 'gpgcheck=1' explicitly in the generated .repo file.

Signed-off-by: Erik Skultety <eskultet@redhat.com>
The sorting was added solely to make comparing of the expected and
generated repo files' contents more deterministic in unit tests, but
now that we're driving the generation process with ConfigParser, we can
perform the comparisons on the object level rather than actual file
contents, and so we no longer need to sacrifice the beauty of using
iterdir's generator by converting it to a list just to keep tests
deterministic.

Signed-off-by: Erik Skultety <eskultet@redhat.com>
Introduce a new input JSON attribute 'options' for the PoC rpm package
manager allowing consumers to pass arbitrary DNF repo configuration
options that would be applied later through the inject-files command.

The general schema for the input JSON is as follows:

{
    type: <package_manager>
    path: <relative_path>
    options: {
        <namespace>: {
            <key1>..<keyN>: Union[str, Any]<val1>
        }
    }
}

which translated to DNF repo options for RPM might then look like:

{
    type: rpm
    path: .
    options: {
        dnf: {
             repoid1: {gpgcheck: 0, deltarpm: 1},
             repoid2: {fastestmirro: 1, sslverify: 0}
        }
    }
}

This implementation is trying to be generic enough to be potentially
later extended to other package managers as well which also means that
due to the fairly complex (multi-level nested dictionary) input
a new intermediary model _DNFOptions was needed to be introduced in
this patch, but most importantly a custom 'before' model validator for
this model had to be implemented to generate stable and deterministic
errors which could be reliably unit-tested (default pydantic error
messages for such complex structures are madness).

Worth noting would be that unless options were specified, they would
not appear in the build-config.json --> exclude_none=True

Signed-off-by: Erik Skultety <eskultet@redhat.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants