Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

render: add alpha support for filtering #1231

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

joelanford
Copy link
Member

@joelanford joelanford commented Mar 14, 2024

Description of the change:

This commit adds two alpha flags to opm render and updates FBC library code to support efficient filtering of FBC:

  • --alpha-filter-config=./path/to/config
  • --alpha-keep-packages=pkg1,pkg2

To support the filter config option, this commits defines a new configuration file format that supports:

  • filtering by package
  • within a package, filtering by channel
  • within a package, selecting a new default channel (required if the original default channel is filtered out)
  • within each channel, filtering by a version range

In order to support correct version range filtering, some bundles that fall outside the version range may remain in the channel to ensure that the filtered channel is still valid and includes all bundles that did fall in the range

Included in this commit is an update to declcfg.WalkMetasFS to enable concurrency, similar to recent changes in declcfg.LoadFS. declcfg.LoadFS is also refactored internally to use the new declcfg.WalkMetasFS implementation.

Motivation for the change:

There are two motivations:

  1. Filtering by individual versions is an often requested feature, especially in workflows that requiring mirroring catalogs to disconnected environments. In OLM, filtering versions requires semantic knowledge of the upgrade graph rules, so it is not as simple as keeping or removing arbitrary bundles and channel entries. This PR includes that semantic knowledge.
  2. When rendering an existing FBC, render is significantly faster than rendering an entire catalog and then using a post-render tool like jq or yq to filter after the fact.

I ran the existing benchmark to confirm that this update does not regress performance, even in the case of no filtering being applied. I will post those results in a comment.

Reviewer Checklist

  • Implementation matches the proposed design, or proposal is updated to match implementation
  • Sufficient unit test coverage
  • Sufficient end-to-end test coverage
  • Docs updated or added to /docs
  • Commit messages sensible and descriptive

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 14, 2024
Copy link

codecov bot commented Mar 14, 2024

Codecov Report

Attention: Patch coverage is 92.27053% with 32 lines in your changes are missing coverage. Please review.

Project coverage is 55.08%. Comparing base (e7ca8e3) to head (b498760).
Report is 4 commits behind head on master.

Current head b498760 differs from pull request most recent head eab0be1

Please upload reports for the commit eab0be1 to get more accurate results.

Files Patch % Lines
alpha/declcfg/load.go 74.71% 16 Missing and 6 partials ⚠️
alpha/declcfg/filter/config/v1alpha1/filter.go 95.07% 4 Missing and 3 partials ⚠️
alpha/action/render.go 89.47% 1 Missing and 1 partial ⚠️
alpha/declcfg/filter.go 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1231      +/-   ##
==========================================
+ Coverage   50.06%   55.08%   +5.01%     
==========================================
  Files         135      113      -22     
  Lines       12995    11577    -1418     
==========================================
- Hits         6506     6377     -129     
+ Misses       5405     4207    -1198     
+ Partials     1084      993      -91     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Comment on lines 9 to 40
type Filterer interface {
FilterCatalog(ctx context.Context, fbc *declcfg.DeclarativeConfig) (*declcfg.DeclarativeConfig, error)
}

type MetaFilterer interface {
KeepMeta(meta *declcfg.Meta) bool
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Andy did a great job of getting this stuck in my head over the course of a few PRs: https://go.dev/wiki/CodeReviewComments#interfaces

The gist is that the interfaces should be defined where the values of the interface are used and not by the package that implements them. I'm not strongly opinionated on this (yet) but is something that crossed my mind while reviewing so figured I would point it out.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, this is an outcome of the way this code evolved before I submitted it. I'll move it to the declcfg package.

Comment on lines 255 to 258
if r.Filterer != nil {
if metaFilterer, ok := r.Filterer.(filter.MetaFilterer); ok {
keepMeta = metaFilterer.KeepMeta
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not have an explicit field for the MetaFilterer?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the time, I think it was because I was actually implementing inlined functions with that same signature, so it was lighter weight than implementing a full struct.

I'll add a function type that implements the interface (sorta like http.Handler and http.HandlerFunc).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also thinking about names... the age old problem.

}
}

func (f *filterer) FilterCatalog(_ context.Context, fbc *declcfg.DeclarativeConfig) (*declcfg.DeclarativeConfig, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it anticipated that the context will be required by some implementation of the filterer interface? If not, we can probably remove it from the interface and as a parameter from this function

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've seen context.Context get added to so many functions at this point that I pretty much add it defensively for functions I expect:

  • could be long running
  • could need to call other functions that accept a context

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I'm going to leave it in. One possibility I thought of is an interaction with an OCI registry to get the labels of bundle images and filter based on them. I'm considering adding a context to the meta filter signature as well for the same reason. WDYT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO, we should have it for consistency reasons (most of the declcfg code follows this convention), but it's also proven really helpful when we refactored for parallelism previously, and the downside ... ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's fine to include if we are anticipating the addition of a filtering method that will use it in the near future. Just calling out that it may be a premature optimization that could be added in the future when needed.

I also am not strongly opinionated here so if folks feel strongly we should keep it - fine by me

keepPackages sets.Set[string]
}

func (f *packageFilter) FilterCatalog(ctx context.Context, fbc *declcfg.DeclarativeConfig) (*declcfg.DeclarativeConfig, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My eyes could be deceiving me, but looks like this function doesn't use the ctx parameter.

Comment on lines 186 to 194
type pathMetas struct {
path string
metas []pathMeta
}

type pathMeta struct {
meta *Meta
err error
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are these for? I can't seem to see where they are being used, am I just missing it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh good catch. PR went through some optimization iteration and these are now unused.

Comment on lines 141 to 161
var obj struct {
metav1.TypeMeta `json:",inline"`
}
if err := yaml.Unmarshal(data, &obj); err != nil {
return nil, err
}

if obj.Kind != "FilterConfiguration" {
return nil, fmt.Errorf("unexpected kind %q", obj.Kind)
}

switch obj.APIVersion {
case "olm.operatorframework.io/v1alpha1":
cfg, err := configv1alpha1.LoadFilterConfiguration(data)
if err != nil {
return nil, err
}
return configv1alpha1.NewFilterer(cfg, configv1alpha1.WithLogger(log)), nil
default:
return nil, fmt.Errorf("unexpected API version %q", obj.APIVersion)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like the configv1alpha1.LoadFilterConfiguration function already performs validation. Could this whole thing just be simplified to something like:

	if filterFile == "" {
		return nil, nil
	}

	data, err := os.ReadFile(filterFile)
	if err != nil {
		return nil, err
	}
	
	cfg, err := configv1alpha1.LoadFilterConfiguration(data)
	if err != nil {
		return nil, err
	}
	return configv1alpha1.NewFilterer(cfg, configv1alpha1.WithLogger(log)), nil

?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do realize that maybe this isn't as extensible because it only validates against the specific config version but this could be updated when/if we add another version

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, this is premature optimization. I'll reduce and add some comments about what we might do if we add new config file versions.

@joelanford joelanford force-pushed the filter-fbc branch 2 times, most recently from c47b060 to 29c9db1 Compare March 22, 2024 18:53
@joelanford joelanford force-pushed the filter-fbc branch 4 times, most recently from 3a2da31 to 4e18534 Compare March 26, 2024 20:44
@joelanford
Copy link
Member Author

Here's the promised benchstat output comparing master and this PR.

goos: darwin
goarch: arm64
                      │ master.out  │           filter-fbc.out           │
                      │   sec/op    │   sec/op     vs base               │
LoadFS/1_routines-10    718.5m ± 0%   717.7m ± 0%       ~ (p=0.353 n=10)
LoadFS/10_routines-10   110.7m ± 2%   110.5m ± 1%       ~ (p=0.684 n=10)
LoadFS/20_routines-10   110.9m ± 0%   110.9m ± 1%       ~ (p=0.971 n=10)
geomean                 206.6m        206.4m       -0.09%

                      │  master.out  │           filter-fbc.out            │
                      │     B/op     │     B/op      vs base               │
LoadFS/1_routines-10    234.8Mi ± 0%   233.6Mi ± 0%  -0.49% (p=0.000 n=10)
LoadFS/10_routines-10   237.7Mi ± 0%   236.6Mi ± 0%  -0.43% (p=0.000 n=10)
LoadFS/20_routines-10   237.1Mi ± 0%   236.0Mi ± 0%  -0.47% (p=0.000 n=10)
geomean                 236.5Mi        235.4Mi       -0.47%

                      │ master.out  │           filter-fbc.out           │
                      │  allocs/op  │  allocs/op   vs base               │
LoadFS/1_routines-10    2.561M ± 0%   2.559M ± 0%  -0.09% (p=0.000 n=10)
LoadFS/10_routines-10   2.562M ± 0%   2.559M ± 0%  -0.09% (p=0.000 n=10)
LoadFS/20_routines-10   2.561M ± 0%   2.559M ± 0%  -0.09% (p=0.000 n=10)
geomean                 2.561M        2.559M       -0.09%

Copy link
Contributor

openshift-ci bot commented Mar 28, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: everettraven, joelanford

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 3, 2024
This commit adds two alpha flags to `opm render` and updates
FBC library code to support efficient filtering of FBC:
- `--alpha-filter-config=./path/to/config`
- `--alpha-keep-packages=pkg1,pkg2`

To support the filter config option, this commits defines a new
configuration file format that supports:
- filtering by package
- within a package, filtering by channel
- within a package, selecting a new default channel (required if the original default
  channel is filtered out)
- within each channel, filtering by a version range

In order to support correct version range filtering, some bundles
that fall outside the version range may remain in the channel to
ensure that the filtered channel is still valid and includes all
bundles that did fall in the range

Included in this commit is an update to declcfg.WalkMetasFS to
enable concurrency, similar to recent changes in declcfg.LoadFS.
declcfg.LoadFS is also refactored internally to use the new
declcfg.WalkMetasFS implementation.

Signed-off-by: Joe Lanford <joe.lanford@gmail.com>
Signed-off-by: Joe Lanford <joe.lanford@gmail.com>
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 3, 2024
@perdasilva
Copy link
Contributor

/override go-apidiff

Copy link
Contributor

openshift-ci bot commented Apr 8, 2024

@perdasilva: Overrode contexts on behalf of perdasilva: go-apidiff

In response to this:

/override go-apidiff

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@operator-framework operator-framework deleted a comment from openshift-ci bot Apr 8, 2024
@grokspawn
Copy link
Contributor

/hold to discuss with another team

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 10, 2024
@perdasilva
Copy link
Contributor

perdasilva commented May 17, 2024

@joelanford I tried to resolve conflicts using the GitHub UI - feel free to force push over it =D it didn't work T_T

@joelanford
Copy link
Member Author

I think we're going to eventually close this, and hand the important bits of code to a stakeholder. For now, there is some consensus among maintainers that this is beyond the scope of core opm.

@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 21, 2024
@openshift-merge-robot
Copy link
Contributor

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants