
Generalizing ad-hoc attribute callbacks #146

Open

hynek opened this issue Feb 12, 2017 · 18 comments

hynek (Member) commented Feb 12, 2017

While 37d38c4 is pretty cool, working on it made it very obvious to me that it should be unified with converter. After @Tinche pointed it out too, there's no way back.

IOW, something like this:

@attr.s
class C:
    x = attr.ib()

    @x.callback_but_a_better_name
    def take_number_make_it_a_hex_string(self, attribute, value):
        if not isinstance(value, int):
            raise TypeError(f"{attribute.name} must be an int")
        return hex(value)

Any other ideas?

The downside is that it breaks the simplicity of the current implementation, and we'd have to treat those callbacks in a special way (instead of just like any other validator).


Thinking about it, it might make sense to make it a different feature altogether (i.e. having a lightweight validator and something more substantial), because requiring the callback to return the new value is a source of errors… In any case, it mustn't be called validator because the semantics would be different.

I’d tend to say that people can just do a self.x = hex(value) but that breaks frozen classes and we have that problem already in #120.
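For illustration, a minimal sketch of the frozen-class problem, assuming the @x.validator decorator from 37d38c4; the object.__setattr__ call below only demonstrates the usual escape hatch, it isn't an endorsed pattern:

import attr

@attr.s(frozen=True)
class F:
    x = attr.ib()

    @x.validator
    def _hexify(self, attribute, value):
        # self.x = hex(value)  # would raise attr.exceptions.FrozenInstanceError
        # Bypassing __setattr__ directly is the usual (ugly) workaround:
        object.__setattr__(self, "x", hex(value))

F(255).x  # -> '0xff'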

hynek added the Feature label Feb 12, 2017
hynek added this to the 17.1 milestone Feb 12, 2017
hynek (Member Author) commented Feb 12, 2017

Maybe @x.pipe_through?

Tinche (Member) commented Feb 12, 2017

My line of thinking was basically that:

  • converters are a superset of validators (they can do everything validators do, and more)
  • converters have a less powerful interface than validators, but it's more convenient

Thus, if we can come up with a more sophisticated interface for converters, we could unify converters and validators and get rid of validators. I think the @x.validator syntax is nifty but of limited use (I almost always refactor my converters/validators out and reuse them). We should definitely support it, but it wasn't the primary motivation for my thinking; this idea is more about simplifying the core.

So here is my idea, which may be different than @hynek's in scope.

We define a new concept. I'll call it hook; I'm not really happy with that name, but let's go with it for now.

A simple class, to illustrate the concept.

def positive_int(inst, att, value):
    if value < 1:
        raise ValueError

@attr.s
class A:
    a = attr.ib(convert=int, validator=positive_int)

(I think it's telling I had to open attrs docs to look up the exact validator signature to write this.)

This results in the following __init__, basically:

class A:
    def __init__(self, a):
        self.a = int(a)
        positive_int(self, A.a, a)

This can be simplified by having a hook that does both.

def ensure_positive_int(inst, att, value):
    v = int(value)
    if v < 1:
        raise ValueError
    return v

@attr.s
class A:
    a = attr.ib(hook=ensure_positive_int)

# Resulting in:

class A:
    def __init__(self, a):
        self.a = ensure_positive_int(self, A.a, a)

So this is basically the same code, except we've eliminated one core concept. I'd say it's a cleaner design.

Now, about the differences in the interface. Validators take three arguments: the instance, the attribute, and the value. Converters only take the value. So let's be clever at __init__ creation time and inspect argument names and/or the number of arguments in the hook.
If the hook has one positional argument, just pass in the value. If there are more arguments, pass self for the argument called inst, value for value etc. This would allow for simpler validators and more complex converters.

(I have just realized inspect.getargspec doesn't work with int, float and maybe others. Why? In any case, this would complicate the implementation.)
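A minimal sketch of that signature-sniffing adaptation, using a hypothetical wrap_hook helper that is not part of attrs; builtins like int may not expose a signature, hence the fallback:

import inspect

def wrap_hook(hook):
    # Adapt ``hook`` to a uniform (inst, attribute, value) interface.
    try:
        n_params = len(inspect.signature(hook).parameters)
    except (TypeError, ValueError):
        # Builtins such as ``int`` may not expose a signature at all,
        # so assume they take just the value (converter-style).
        n_params = 1

    if n_params == 1:
        return lambda inst, attribute, value: hook(value)
    # Otherwise pass the full trio, validator-style.
    return hook

# wrap_hook(int)(None, None, "3") -> 3
# wrap_hook(positive_int) is returned unchanged.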

For backward compatibility, validator=... and convert=... would obviously have to keep working for a year, but this is easy I think.

hynek (Member Author) commented Feb 18, 2017

So I was thinking (dangerous!).

You're obviously right that the only difference between a validator and a converter is the number of arguments they receive.

As for getargspec (and inspect.signature is the same way): it’s just a try/except. Passing extra arguments is the special case.

But wouldn't it make sense to allow a whole pipeline whose steps feed into each other?

e.g.

@attr.s
class C:
    git_ref = attr.ib(pipe=[str])
    @git_ref.pipe
    def shorten(self, val):
        return val[:6]

Kind of like structlog’s processors. I like composable callables.
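Ignoring the self/attribute plumbing, the composition itself is trivial; a sketch, where run_pipe is illustrative only and not an attrs API:

from functools import reduce

def run_pipe(steps, value):
    # Each step's return value becomes the next step's input.
    return reduce(lambda acc, step: step(acc), steps, value)

run_pipe([str, lambda s: s[:6]], "deadbeefcafe")  # -> 'deadbe'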

Opinions? @glyph? :D

glyph (Contributor) commented Feb 18, 2017

This strikes me as a bridge too far :-). A full data-flow abstraction in your attribute definitions? I shudder to think of the confusion.

Also, if you really needed this, you could compose it out of simpler pieces. For example:

@attr.s
class C:
    refify = something_other_than_attrs.pipe()
    refify.pipe(str)
    @refify.pipe
    def shorten(self, val):
        return val[:6]
    git_ref = attr.ib(convert=refify)

However, this seems like a stand-in for something much more complex, which I can't fully understand whether it'd be necessary or not. Consider the much simpler:

@attr.s
class C:
    git_ref = attr.ib(convert=lambda x: str(x)[:6])

Not only is this considerably shorter, but it has the additional advantage of having a meaningful traceback that points at the offending line if one of its stages were to fail. The str in the attr.ib definition won't show up. Of course, this isn't a dealbreaker; I .addCallback(dumps) all the time, and that has the same problem. But, all things considered, it's worse than having some actual logic you can inspect; callables composed with syntax have additional tooling not available to callables composed via other more indirect means.

hynek (Member Author) commented Feb 18, 2017

I shudder to think of the confusion.

The conclusion is – of course – an integrated e-mail client!

hynek (Member Author) commented Feb 20, 2017

I think both of you are right. Having a single (i.e. non-list) “smart” pipe argument makes a lot of sense. I'm not sure I'd want to kick out validator, especially now that it's more useful since it takes lists (composability and everything) and doesn't depend on return values.


All in all that would just mean that we’d rename convert to pipe, allow the usage as a decorator, and be smart about the arguments of the callables.


At that point I don't think the rename is worth it, but usage as a decorator gives nice parity with validator; OTOH, we should add the smart-arguments feature to validators too, to have parity the other way.

hynek mentioned this issue Apr 14, 2017
hynek removed this from the 17.1 milestone May 10, 2017
hynek (Member Author) commented May 10, 2017

Removing milestone because this is more thinking/bikeshedding than I’m willing to tolerate for 17.1.

altendky (Contributor):

Re: #404 (comment)

I like the idea of a general hook with which the user can describe conversion/validation/…. But what here lets me take a class I've defined that validates and converts inputs using the proposed approach, and also create an instance of it without those being enforced? Or is the "Sometimes it's OK to write some code" suggestion that I should use other techniques to create separate classes? Or… I feel like I may have missed something that is obvious to others.

hynek (Member Author) commented Jul 29, 2021

OK, I've done more dangerous thinking, also with regard to #655 and #709.

I think the concept of a pipe is the way to go, but I think we've been thinking from the wrong side.

What we/the people want:

  • validators that take 1 or 3 arguments and whose return value is ignored
  • converters that take 1 or 3 arguments and whose return value is used

Currently we're lacking validators with 1 arg and converters with 3 args and everything is somewhat overlapping.

It is a matrix, but it's all about adapting interfaces.

So how about this:

def ensure_positive_int(value):
    if not value >= 1:
        raise ValueError("int not positive")

def do_it_all(inst, att, value):
    x = int(value)
    if not x >= 1:
        raise ValueError("int not positive")

    return x


@attr.define
class C:
    x: int = attr.field(pipe=[attr.Converter(int), attr.Validator(ensure_positive_int)])

# Same:
@attr.define
class C:
    x: int = attr.field(pipe=[do_it_all])

The idea here is that:

  • By default, the pipe function gets all three arguments and the return value is passed on.
  • If wrapped in Converter, it gets only the value and the return value is passed on.
  • If wrapped in Validator, it gets only the value and the return value is ignored (i.e. the wrapper passes the value through if the validator doesn't raise an error); see the sketch right after this list.
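Neither Converter nor Validator exists in attrs in this form; a minimal sketch of the wrapper semantics described above could look like this:

class Converter:
    # Wraps a 1-arg callable; the pipe calls the wrapper with
    # (inst, att, value) and its return value is passed on.
    def __init__(self, func):
        self.func = func

    def __call__(self, inst, att, value):
        return self.func(value)


class Validator:
    # Wraps a 1-arg callable whose return value is ignored; the original
    # value is passed through if no exception is raised.
    def __init__(self, func):
        self.func = func

    def __call__(self, inst, att, value):
        self.func(value)
        return value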

Obviously, there's also room for us to optimize the __init__ code, since we know what those wrappers mean. E.g. if there are two converters, it could write it as self.x = conv1(conv2(x)) or something.

One thing I like about this approach is that it makes the order explicit. Currently converters run before validators but that's something you have to know.

It might be a bit more verbose than deducing the intent by looking at parameters, but I find it makes everything clearer.


OPINIONS?

glyph (Contributor) commented Jul 29, 2021

I find the intricate mad-genius reasoning here very appealing, but I do feel like I should ask:

is pipe= actually a good way to do this composition? If validation or conversion fails, I might rather see a traceback coming out of do_it_all than out of the guts of some optimized generated code that's just calling int.

If instead, we simply had, say:

@attr.define
class C:
    x: int = attr.field(
        process=lambda inst, att, value: attr.validate(
            inst, att, attr.convert(inst, att, value, int), attr.greater_than(0)
        )
    )

at first blush this might seem uglier, but for elaborate pipelines you still get a regular traceback, composition works by functions calling other functions, tracebacks in Sentry and similar tools get all the info they'd usually get, and attr.convert and attr.validate can make use of inst and att to give exception messages with more details.
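To make that concrete, one hedged sketch of what such helpers could do with inst and att; attr.convert, attr.validate, and attr.greater_than are hypothetical names from this comment, and the bodies below are assumptions, not attrs code:

def convert(inst, att, value, converter):
    try:
        return converter(value)
    except (TypeError, ValueError) as e:
        raise ValueError(
            f"{type(inst).__name__}.{att.name}: cannot convert {value!r} ({e})"
        ) from e

def greater_than(bound):
    # Returns a classic three-argument attrs-style validator.
    def check(inst, att, value):
        if value <= bound:
            raise ValueError(
                f"{type(inst).__name__}.{att.name} must be > {bound}, got {value!r}"
            )
    return check

def validate(inst, att, value, validator):
    # Runs the validator and passes the value through unchanged.
    validator(inst, att, value)
    return value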

glyph (Contributor) commented Jul 29, 2021

(not to mention that Mypy could be made to work on this much more easily than trying to type-match the adjacent elements in a sequence)

wsanchez:

@hynek I'm having a hard time understanding why a list of steps is better than a function with those steps in it. What is that adding?

@glyph Your example with the lambda is pretty hard to read, FWIW. I'm on my fourth blush and I can't make myself like it.

glyph (Contributor) commented Jul 30, 2021

@wsanchez here's a less deliberately obfuscatory version:

def check_x(inst: object, att: Attribute, value: object) -> int:
    converted = attr.convert(inst, att, value, int)
    is_positive_integer = attr.greater_than(0)
    validated = attr.validate(inst, att, converted, is_positive_integer)
    return validated

@attr.define
class C:
    x: int = attr.field(process=check_x)

Part of my point here though is exactly this: if you're doing a pipe=list([step, step, step]), then you have a weird inline expression and that's it. I wrote a big funky lambda to illustrate that you could still cram everything into a single streamlined expression if you want to, but you can also semantically equivalently break it out into a more readable structure, like this.

glyph (Contributor) commented Jul 30, 2021

Alternately, method-style:

@attr.define
class C:
    x: int = attr.field()

    @attr.processor(x)
    def _process_x(self, att: Attribute, value: object) -> int:
        converted = attr.convert(self, att, value, int)
        is_positive_integer = attr.greater_than(0)
        validated = attr.validate(self, att, converted, is_positive_integer)
        return validated

glyph (Contributor) commented Jul 30, 2021

Riffing some more; maybe it would be cleaner to have the things as methods on Attribute:

@attr.define
class C:
    x: int = attr.field()

    @attr.processor(x)
    def _process_x(self, att: Attribute, value: object) -> int:
        converted = att.convert(self, value, int)
        is_positive_integer = attr.greater_than(0)
        validated = att.validate(self, converted, is_positive_integer)
        return validated

hynek (Member Author) commented Jul 30, 2021

@hynek I'm having a hard time understanding why a list of steps is better than a function with those steps in it. What is that adding?

Since we'd have only one type of processor, you'd need more than one of them if you want to combine a stock validator with a stock converter. For custom processors, a list makes no sense of course.

@glyph's check_x in #146 (comment) is supposed to be the natural form. The question is how you re-use/combine existing converters and validators. That's something people have been asking for forever, so there should be a good story for it.

Obviously we could say processor=attr.pipe(Converter(int), Validator(bigger_than_whatever)) but people tend to love syntactic sugar. ;)
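For what it's worth, a minimal sketch of such an attr.pipe composition helper, under the assumption from this thread that every processor takes (inst, att, value) and returns the new value; neither the name nor the semantics are attrs API:

def pipe(*processors):
    def run(inst, att, value):
        # Thread the value through each processor in order; each return
        # value becomes the next processor's input.
        for processor in processors:
            value = processor(inst, att, value)
        return value
    return run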

hynek (Member Author) commented Jul 30, 2021

Random thought before I forget it: since we have on_setattr, a good name would be on_init I reckon.

zhiltsov-max:

Re: #655 (comment)

Having read this thread, I personally think that the idea from #146 (comment) is worth considering. It is also similar to what I was thinking of in #655 (comment).

The general idea of piping also looks nice and attractive to me, as does the more generalized view of converters and validators. The point I'd want to make is that a good interface is crucial for this to become widespread. The solution with x = attr.field(pipe=[callable1, callable2]) looks like the most terse and appealing one to me right now. However, I don't think I like the word "pipe" here; it is a bit unintuitive and highlights an implementation detail instead of the higher-level idea. Maybe accepting/introducing a constructor argument is what we are trying to describe?

BTW, the word "pipe" makes me think of a possible syntactic-sugar alias using the | operator.

processor=attr.pipe(Converter(int), Validator(bigger_than_whatever))
attr.field(pipe=[attr.Converter(int), attr.Validator(ensure_positive_int)])

These look a bit like typing exercises, so I'm more in favor of argspec analysis and automatic deduction, if a single fixed function interface doesn't work. Moreover, that would allow integrating and reusing existing conversion-like functions as-is.
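To illustrate the | idea: a hypothetical Step wrapper (not an attrs API) whose __or__ chains callables left to right:

class Step:
    def __init__(self, func):
        self.func = func

    def __call__(self, value):
        return self.func(value)

    def __or__(self, other):
        # Chaining two steps yields a step that runs them left to right.
        return Step(lambda value: other(self(value)))


to_hex = Step(int) | Step(hex)
to_hex("255")  # -> '0xff'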
