Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Any way to do value-predicated filtering (rather than type) #270

Open
thockin opened this issue Aug 1, 2021 · 8 comments
Open

Any way to do value-predicated filtering (rather than type) #270

thockin opened this issue Aug 1, 2021 · 8 comments
Labels

Comments

@thockin
Copy link

thockin commented Aug 1, 2021

I have a function like this:

type T {
    // ... 
    IPs []string
    Port int
}

func Create(input T) T {
    output := T{}
    // ...

    // If user provided IPs, use that.  Else find free ones.
    if len(input.IPs) > 0 {
        output.IPs = input.IPs
    } else {
        output.IPs = allocateIPs()
    }

    // If user provided Port, use that.  Else find a free one.
    if input.Port != 0 {
        output.Port = input.Port
    } else {
        output.Port = allocatePort()
    }

    return output
}

I'm trying to use cmp.Equal and cmp.Diff to test. I specifically want to test permutations where the input specifies and doesn't specify input IPs and ports. The real logic is significantly more complicated than this reductive example, obviously.

What I am looking for is a cmp.Option similar to IgnoreFields (since the type is []string) crossed with Comparer(), but without the symmetric requirement. For example:

cmpIPs := FieldComparer(T{}, "IPs", func(x, y []string) bool {
    if len(x.IPs) == 0 {
        return true
    }
    return reflect.DeepEqual(x.IPs, y.IPs)
}
cmpPort := FieldComparer(T{}, "Port", func(x, y int) bool {
    if x.Port == 0 {
        return true
    }
    return x.Port == y.Port
}
cmp.Equal(input, output, cmpIPs, cmpPort)

Is this possible to express?

@dsnet
Copy link
Collaborator

dsnet commented Aug 2, 2021

Is this possible to express?

Technically, yes.

The entire cmpopts package is implemented in terms of the public API of cmp. The feature that seems most useful to you is the currently unexported cmpopts.filterField function.

Assuming that function were exported, you could do something like:

cmpIPs := cmpopts.FilterField(T{}, "IPs", cmp.Comparer(func(x, y []string) bool {
    return reflect.DeepEqual(x, y) || len(x) == 0 || len(y) == 0
}))
cmpPort := cmpopts.FilterField(T{}, "Port", cmp.Comparer(func(x, y int) bool {
    return x == y || x == 0 || y == 0
 }))
cmp.Equal(input, output, cmpIPs, cmpPort)

I modified the checks against 0 to be symmetric. The requirement that cmp.Equal be symmteric was partly to support #67, where we would like to keep the property that:

  • !cmp.Less(x, y) and !cmp.Less(y, x) implies that cmp.Equal(x, y) or cmp.Equal(y, x)

Regarding why cmpopts.filterField is currently unexported, it's because there are some issues with composability that need to be figured out. For now, perhaps its best to just fork the implementation of cmpopts.filterFields.

@dsnet
Copy link
Collaborator

dsnet commented Aug 2, 2021

For the composibility problems with cmpopts.FilterField, this is a discussion I had with Roger on Slack some time ago:

  • @rogpeppe: I was just trying to do something with cmp and remembered that it's perhaps harder than it should be. I want to apply a comparison option (in this case, cmpopts.EquateApproxTime) to one particular field in a given struct type. What's a nice way of doing that?

  • @dsnet: Ideally, you would use the unexported cmpopts.filterFields helper

  • @dsnet: I never exposed it since composability of filters is a tricky subject

  • @rogpeppe: ah, i'd just found that
    in general, i think the Path support is a weak spot of cmp currently

  • @dsnet: I don't know if it's fair to say that Path itself is a weakness, since it's just a data type that expresses how we got to the current value we're comparing relative to the root. At present, that type expresses everything that you might want to know.

    I think the weakness is stemming from what operations can easily be done with a Path, whether they be expressed as methods on the Path or as functions that operate on a Path.

    Taking the earlier question of how to filter cmpopts.EquateApproxTime to a specific struct field, the Path has all the information available to accomplish that. The question then becomes, how can we express that in the most natural and intuitive way.

    Regarding the unexported filterField function, suppose you do:

    cmpopts.FilterField(MyStruct{}, "MyField", someOpt)

    an ambiguity arises whether your intention is for the FilterField to:

    • only apply exactly to the value at MyStruct.MyField, OR
    • apply to the value at MyStruct.MyField and all sub-values of it.

    I can think of situations where you might want one semantic or the other.
    I feel like part of the challenge is that describing filters on a tree-like structure is a hard problem in computer science regardless of whether it's in cmp or elsewhere.

  • @rogpeppe: How about FilterFieldOnly as well as FilterField?
    the latter form would match anything under

  • @dsnet: Do you think most users would understand the difference?
    I feel like most push-back from the proposal to add cmp to the standard library is that the library is too complicated

  • @rogpeppe: good question. I'll think about that

  • @dsnet: In the past, I thought there was a good reason why "only apply exactly to the value at MyStruct.MyField" might be desired, but now I'm struggling to remember what it was. Maybe we should just expose cmpopts.FilterField with the "apply to the value at MyStruct.MyField and all sub-values of it" semantics.

  • @rogpeppe: yes, on thinking about it that sounds good
    after all, the standard options apply recursively to the root value

  • @dsnet: Quiz: would you expect there to be an semantic difference between:

    • cmpopts.FilterField(StructAlpha{}, "FieldAlpha", cmpopts.FilterField(StructBravo{}, "FieldBravo", opt))
    • cmpopts.FilterField(StructBravo{}, "FieldBravo", cmpopts.FilterField(StructAlpha{}, "FieldAlpha", opt))

    ?

  • @rogpeppe: i think they should both be identical
    after all, in both places, opt applies only when we are both inside StructAlpha.FieldAlpha and StructBravo.FieldBravo

  • @dsnet: I believe you are correct since FilterField would only check whether that StructX.FieldX is in the path. Also, chaining filters together is equivalent to a logical AND. Thus the above is equivalent to saying:
    (is StructAlpha.FieldAlpha in the path) AND (is StructBravo.FieldBravo in the path)
    (is StructBravo.FieldBravo in the path) AND (is StructAlpha.FieldAlpha in the path)

  • @rogpeppe: exactly

  • @dsnet: The downside of the filter chaining is that it visually looks like it would be a "comes before" operator where someone might expect it to be:
    (StructAlpha.FieldAlpha is in the path) AND it occurs before (StructBravo.FieldBravo in the path)
    I don't know if that's gonna be a significant gotcha or not

  • @rogpeppe: would FilterField take multiple option arguments?
    actually I've just realised they should be considered different.
    there is and should be a contains relationship there
    given that struct contains relationships aren't usually mutually recursive, that should be relatively intuitive

  • @dsnet: If the relationship should be a "contains", then it seems the API for options would also need to also be expanded such that you can unwrap an option.
    This way the implementation for cmpopts.FilterField can check whether the provided opt has previously been wrapped with a cmpopts.FilterField and enforce the "contains"-like semantic.
    (aside: isn't it interesting how many type-theory-ish problems exist for cmp?)

  • @rogpeppe: hmm, is that so? i need to think about this in the morning.
    why would FilterField act any differently from any other option in that respect?

  • @dsnet: You could argue that cmp.FilterPath (which is the underlying implementation of cmpopts.FilterField) should have gone with "contains" semantics instead of a logical AND semantic when wrapped, but that ship has unfortunately already sailed.

@thockin
Copy link
Author

thockin commented Aug 3, 2021

Interesting. The symmetric bit is a trick for this case though - the comparison I need is not symmetric. (nil, "value") is allowed while ("value", nil) is an error.

For the moment I just pre-process the object and do if left.IPs == nil { left.IPs = right.IPs }. It's hacky but it got the job done. So consider this request low-prio, I guess. :)

@dsnet dsnet added the question label Dec 9, 2021
@chhsia0
Copy link

chhsia0 commented Oct 31, 2022

I recently had the same question and ended up essentially did the same as @thockin with github.com/imdario/mergo:

// Make a copy of want so we can merge ignored fields into it for diff without
// modifying the original object.
want = want.DeepCopyObject()
wantVal := reflect.ValueOf(want)
gotVal := reflect.New(reflect.Indirect(wantVal).Type())
got := gotVal.Interface()

... // populate got

// Merge all non-empty ignored fields from got to want.
if err := mergo.Merge(want, reflect.Indirect(gotVal).Interface()); err != nil {
    return fmt.Errorf("unable to merge ignored fields: %w", err)
}
if diff := cmp.Diff(want, got); diff != "" {
    return fmt.Errorf("unexpected diff (-want +got):\n%s", diff)
}

Specifically, want and got are some k8s objects that has fields of type time.Time, which cannot be handled by mergo, so the mergo transformer listed here is needed to make the above trick work (with light modifications to make it work on metav1.Time): https://pkg.go.dev/github.com/imdario/mergo@v0.3.13#hdr-Transformers

@matteoolivi
Copy link

matteoolivi commented Nov 11, 2022

I have the exact same use case: diffing K8s API objects and not wanting symmetry.
Moreover, I'd like to specify this rule in a generic way, because the objects can have many nested subfield. It'd be tedious to specify an Option or Comparer for every nested struct/subfields.

i.e. I'd like a rule like: "when diffing O1 and O2, ignore all fields of O1 that are nil, no matter their type or path"

@dsnet
Copy link
Collaborator

dsnet commented Nov 11, 2022

The symmetric nature of cmp.Equal is a property I don't think we should give up, but I could imagine something like:

cmp.Diff(want, got, cmpopts.IgnoreUnpopulated(want))

where cmptsopts.IgnoreUnpopulated uses go reflection to dynamically build a cmp.FilterPath from want.
It functionally provides a way to have a asymmetric comparison from a symmetric comparison.

The exact semantics of cmpopts.IgnoreUnpopulated would have to be thought-out, though.

@chhsia0
Copy link

chhsia0 commented Nov 11, 2022

Another snippet to use in combination with the time transformer to correctly compare KRM object statuses (before darccio/mergo#222 is merged):

// The following function allows deep-merging slices (e.g., []metav1.Condition).
return func(dst, src reflect.Value) error {
    if dst.CanSet() {
        for i := 0; i < src.Len() && i < dst.Len(); i++ {
            srcElem := src.Index(i)
            dstPtr := dst.Index(i).Addr() // an element of a slice is always addressable
            if !srcElem.CanInterface() || !dstPtr.CanInterface() {
                continue
            }
            if err := mergo.Merge(dstPtr.Interface(), srcElem.Interface(), mergo.WithTransformers(self)); err != nil {
                return err
            }
        }
        if dst.Len() < src.Len() {
            dst.Set(reflect.AppendSlice(dst, src.Slice(dst.Len(), src.Len())))
        }
    }
    return nil
}

@matteoolivi
Copy link

The symmetric nature of cmp.Equal is a property I don't think we should give up

@dsnet I see that a non-symmetric equal would be confusing. OTOH, it might make sense because having Diff and Equal behave inconsistently might be equally surprising to users.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants