Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed #27910: Added a ChoiceEnum class for use in field choices #11223

Closed
wants to merge 4 commits into from

Conversation

shaib
Copy link
Member

@shaib shaib commented Apr 13, 2019

The class can serve as a base class for user enums, supporting
translatable human-readable names, or names automatically inferred
from the enum member name.

A ChoiceIntEnum is also probided.

docs/ref/models/fields.txt Outdated Show resolved Hide resolved
docs/ref/models/fields.txt Outdated Show resolved Hide resolved

Since the case where the enum values need to be integers is extremely
common, Django provides a ``ChoiceIntEnum`` class (inheriting int and
``ChoiceEnum``).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would expect the case where enum values are strings to be similarly common. Perhaps we should provide ChoiceStrEnum too.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may. The real argument, actually, was mirroring the stdlib's Enum and IntEnum. More opinions welcome.

@shaib shaib force-pushed the ticket_27910_enums branch 4 times, most recently from d6d47f1 to 64125b8 Compare April 14, 2019 14:35
@jrief
Copy link
Contributor

jrief commented Apr 25, 2019

Hello Shai,
thanks for showing me this implementation during the sprints on DjangoCon2019.
I just played around with it, because I once implemented something similar.
What do you think about this implementation:

class ChoiceEnumMeta(EnumMeta):
    def __new__(metacls, classname, bases, classdict):
        labels = {}
        for key in classdict._member_names:
            source_value = classdict[key]
            if isinstance(source_value, (list, tuple)):
                try:
                    val, labels[key] = source_value
                except ValueError:
                    raise ValueError("Invalid ChoiceEnum member '{}'".format(key))
            else:
                val = source_value
                labels[key] = key.replace("_", " ").title()
            # Use dict.__setitem__() to suppress defenses against
            # double assignment in enum's classdict
            dict.__setitem__(classdict, key, val)
        cls = super(ChoiceEnumMeta, metacls).__new__(metacls, classname, bases, classdict)
        for key, label in labels.items():
            getattr(cls, key).label = label
        return cls

    @property
    def choices(cls):
        return [(k.value, k.label) for k in cls]

Reason for this change proposal is, that I somehow dislike storing redundant information together within an enum field. In your implementation, choices would be stored side-by-side with the enum members. Here the label instead is stored together with the enum member. This imo is somehow cleaner.
The second advantage would be, that it doesn't require an OrderedDict, I can just use a normal one.

We could of course rename our special label field to getattr(cls, key)._label to emphasize that it's something private, but I don't see any serious need for this.

What do you think of it?

Copy link
Member

@ngnpope ngnpope left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @shaib.

This looks really cool and would neatly replace the rubbishy way that I've done this in the past.

I've had an initial pass at reviewing this, but some improvements came to mind:

  • It would be nice to have a values and labels property on the class to access a list of each.
  • I would then rename display to label to match the new labels as displays sounds off...
  • I think that everything in ChoiceEnum can be pushed into the metaclass instead.
  • Defining ChoiceStrEnum will make using it easier, akin to ChoiceIntEnum.
  • We should enforce unique mapping of name to values via enum.unique().

I have whipped up an improved version for consideration: https://gist.github.com/pope1ni/f4bf33bb408f95cbf8791b3b0332b002

We will also need to think how this works with blank=True automatically adding a blank value and possible some other cases, e.g. system checks, etc., but none of these things are insurmountable problems.

django/db/models/fields/__init__.py Outdated Show resolved Hide resolved
django/db/models/fields/__init__.py Outdated Show resolved Hide resolved
django/db/models/fields/__init__.py Outdated Show resolved Hide resolved
django/db/models/fields/__init__.py Outdated Show resolved Hide resolved
django/db/models/fields/__init__.py Outdated Show resolved Hide resolved


class ChoiceIntEnum(int, ChoiceEnum):
"""Included for symmetry with IntEnum"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would remove this docstring. The documentation also lacks an example of usage which I see is in the docstring above.

docs/ref/models/fields.txt Outdated Show resolved Hide resolved
return self._choices[self]

@classmethod
def validate(cls, value, message=_("Invalid choice")):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the use of this .validate() method documented anywhere? What is it used by?

@@ -96,36 +96,42 @@ and the second element is the human-readable name. For example::
('SR', 'Senior'),
)

Generally, it's best to define choices inside a model class, and to
define a suitably-named constant for each value::
Django provides a special Enum type you can subclass to define this iterable
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... a special ... - it already provides more than one.

We should probably use :class:`django.db.models.ChoiceEnum` , etc. and define some proper documentation sections for these new types to aid discovery.

try:
cls(value)
except ValueError:
raise exceptions.ValidationError(message)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does the message need to be passed into the class method? (Maybe there is a good reason I've missed.)
I also think we want to state which choice enum and what value caused this to happen, e.g.

@classmethod
def validate(cls, value):
    try:
        cls(value)
    except ValueError as e:
        raise exceptions.ValidationError('Invalid choice: %s.%s' % (cls.__name__, value)) from e

Copy link
Contributor

@auvipy auvipy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great work! don't forget to rebase after addressing the review comments.

django/db/models/fields/__init__.py Outdated Show resolved Hide resolved
docs/ref/models/fields.txt Outdated Show resolved Hide resolved
@Krolken
Copy link

Krolken commented May 21, 2019

I really like this implementation. So much clearer when you have lots of options.

I tried it out and I think it could be improved in regards of using it in querysets.

Now just using the

Student.objects.filter(year_in_school=YearInSchool.FRESHMAN)

would result in a
query with ...WHERE year_in_school = YearInSchool.FRESHMAN and not ...WHERE year_in_school = FR

Of course you can make sure to use YearInSchool.FRESHMAN.value to get the correct query but the the previous will not throw an error.

In previous implementation I have seen the way to deal with this is to add:

def __str__(self):
   return self.value

Works for me using string enums, but not sure how this would work on integer enums.

Also 'FR' == YearInSchool.FRESHMAN >> False. Since the query will not return an Enum it would be nice to be able to compare without remembering the .value

Not really sure on what is the correct way of handling it, I just know it might introduce lots of mistyping and errors when people forget the .value in a comparison or queryset filter.

@LilyFoote
Copy link
Contributor

Perhaps the exact lookup can be made enum-aware.

@Krolken
Copy link

Krolken commented May 21, 2019

Yeah. that would be cool, but wouldn't it introduce a much large change. Using the ChoiceEnum is still optional.
Since there is a new class introduced maybe it is better to implement it on the class instead of patching the queryset api.

I fiddled a little bit more on my ChoiceEnum class and added:

    def __str__(self):
        return self.value

    def __eq__(self, other):
        if isinstance(other, self.__class__):
            return super().__eq__(other)
        else:
            return self.value == other

which made it work i a nice way. I can still compare Enums to Enums but if it is not an enum I am comparing to the value of the enum.

YearInSchool.FRESHMAN == YearInSchool.FRESHMAN >> True
YearInSchool.FRESHMAN == YearInSchool.JUNIOR >> False
YearInSchool.FRESHMAN == 'FR' >> True
YearInSchool.FRESHMAN == 'SE' >> False

@jrief
Copy link
Contributor

jrief commented May 21, 2019

@Krolken

Why not

    def __str__(self):
        return str(self.label)

This in my opinion is somehow more meaningful.

@shaib
Copy link
Member Author

shaib commented May 21, 2019

Hi all,

In the last few weeks, I have been unable to give this PR and discussion the attention they deserve, and I'm not sure when I'll be able to do so. So if anyone wants to claim ownership and push this forward, go ahead.

On today's discussion, though, I'd like to add that my favorite tenet of the Zen, one which I find grossly under-appreciated, is "In the face of ambiguity, resist the temptation to guess". What @Krolken and @jrief together point out is, that in the case of a ChoiceEnum based on str, conversion to string is ambiguous (it could be for display or for use as value, and these generally yield different results).

I think the regrettable, but only choice we have is to make __str__() raise an exception with a message describing the problem. A user might solve the problem locally, by adding a .value or .name or .label as they find appropriate; or they can resolve the ambiguity in a more global way, by subclassing the Enum class (if it turns out that there is more to this than overriding __str__() than maybe we should even provide these subclasses ourselves). But I don't think the framework should make this decision.

@auvipy
Copy link
Contributor

auvipy commented May 22, 2019

@shaib are you open to accepting PR on your branch?

@shaib
Copy link
Member Author

shaib commented May 22, 2019

@auvipy in principle, of course. In practice, can't promise anything.

@jrief
Copy link
Contributor

jrief commented May 22, 2019

I think the regrettable, but only choice we have is to make str() raise an exception with a message describing the problem. A user might solve the problem locally, by adding a .value or .name or .label

As far as I understand Python, the __str__()-method shall be used to render a human readable version of an object, whereas __repr__() shall be used to provide enough information to reconstruct that object. So returning label in __str__() and value in __repr__() would be the appropriate use-case.

@Krolken
Copy link

Krolken commented May 22, 2019

If the ChoiceEnum needs to be subclassed for usage I don't see why it should be part of the codebase.
Then it could just be left as a tip for developers on how to structure their choices.

Am I right in assuming what we want with the ChoiceEnum is to get away from writing:

class Student(models.Model):
    FRESHMAN = 'FR'
    SOPHOMORE = 'SO'
    JUNIOR = 'JR'
    SENIOR = 'SR'
   YEAR_IN_SCHOOL_CHOICES = (
        (FREHSMAN, _('Freshman')),
        (SOPHOMORE, _('Sophomore')),
        (JUNIOR, _('Junior')),
       (SENIOR, _('Senior'))
)

   years_in_school = modesl.Charfield(max_length=20, choices=YEAR_IN_SCHOOL_CHOICES)

#####

Student.objects.filter(year_in_school=Student.FRESHMAN)   # Here I need to know that FRESHMAN is one of the choices.

The benefit I saw with this implementation was that I can define clear sets of choices to be used across several models, classes and functions. And I could provide a nice, translatable string for the Django Admin.
I don't see myself using the labels in a template since the values returned in querys are not Enums. Maybe in forms, but is that not handled already by ModelForm if the choices are defined?

@jrief
Copy link
Contributor

jrief commented May 22, 2019

If the ChoiceEnum needs to be subclassed for usage I don't see why it should be part of the codebase.
Then it could just be left as a tip for developers on how to structure their choices.

To prevent boilerplate?!

@shaib
Copy link
Member Author

shaib commented May 27, 2019

@Krolken:

Now just using the Student.objects.filter(year_in_school=YearInSchool.FRESHMAN) would result in a query with ...WHERE year_in_school = YearInSchool.FRESHMAN and not ...WHERE year_in_school = FR`

I believe this comment is mistaken, and I just added tests that prove it. That is, the clauses you quote are what you see when you ask Django for a string representation of the query, but the actual queries executed in the database are the correct ones (note that if you send ...WHERE year_in_school = FR directly to the database you'll get an error because FR is undefined). Likewise for comparisons. It is only the explicit conversion to str which doesn't do what you expect. Thus, the premise that

[...] the ChoiceEnum needs to be subclassed for usage

doesn't hold even for str-based ChoiceEnums, not to mention that there was never an issue with any other type. Equality testing works fine as well (this is also shown by the tests).

@jrief:

As far as I understand Python, the __str__()-method shall be used to render a human readable version of an object,

That is generally correct, but the objects under discussion are str instances. You are suggesting that we take a str instance, apply str() to it, and get a different string. I find that unacceptable. We are encountering an ambiguity between str, the type of human-readable text, and str, a type of machine-readable data. This is a property of enum.Enum, by the way; ChoiceEnum just inherited it.

So, the way I see it, we can do one of several things:

  • We could copy Python's example and avoid defining explicitly a str-based ChoiceEnum. This will allow us to tell ourselves we didn't introduce ambiguity, and most users will be fine, but once in every while a user will call str() on an an enum value and introduce a subtle bug in their code. We will be able to play "not guilty", because we didn't give them a str-based ChoiceEnum.
  • We could take @jrief's suggestion and make str return a label, and I believe this will cause similar subtle bugs, only this time we'll no longer even be able to play "not guilty".
  • We could go the way @Krolken suggests, define ChoiceStrEnum and force str() to return the value (only for ChoiceStrEnum). This would make ChoiceStrEnum different from other Enums in a surprising way.
  • We could take another variant of @Krolken's suggestion, and force str() to return the value on all ChoiceEnums. This would make all ChoiceEnums different from other Enums in a surprising way.
  • Or we could take my suggestion: define ChoiceStrEnum and make str() raise an exception (only for ChoiceStrEnum). This would be a surprise, but one that's easy to see and handle.

Further comments welcome, and I think I'll take this back to the developers' list as well.

@jrief
Copy link
Contributor

jrief commented May 27, 2019

@shaib

That is generally correct, but the objects under discussion are str instances. You are suggesting that we take a str instance, apply str() to it, and get a different string.

I suggest to return the label as the human-readable string. This is in contrast to Python's enum which simply doesn't distinguish between a machine- and a human readable value.

The reason I proposed

    def __str__(self):
        return str(self.label)

is, that self.label could be a lazy object, typically a translated string. Therefore, in some edge cases, label must be forced to text using str(self.label). (I found that out the hard way, because I'm already using this implementation in one of my projects).

Anyway, I also would implement

    def __repr__(self):
        return self.value  # or maybe wrapped inside '<...>'

@Krolken
Copy link

Krolken commented Jun 12, 2019

@shaib I went over my implementation and realised that I was not inheriting from str in my ChoiceEnum. After I changed that it worked well in querysets.

I also wrote some application testing around it and found that messing with the Enum implementation lead to headaches as it stops doing what I expect it to do.

I am all for your implementation :)

@django django deleted a comment from shaib Jun 25, 2019
shaib and others added 3 commits July 4, 2019 09:47
The class can serve as a base class for user enums, supporting
translatable human-readable names, or names automatically inferred
from the enum member name.

A ChoiceIntEnum is also provided.

Thanks Ian Foote for review.
@felixxm
Copy link
Member

felixxm commented Jul 4, 2019

@shaib Thanks for this feature 🚀

I rebased from master, added ChoiceStrEnum, fixed most of small comments, pushed some minor edits and moved ChoiceEnum & co. to a separate file enums.py. I think that mostly docs & tests are missing e.g. docs/ref/enums.py.

@pope1ni Would you like to push it forward? 💚 It seems that you have ideas how to improve it and we are quite close 🏃‍♂️ .

@ngnpope
Copy link
Member

ngnpope commented Jul 4, 2019

@felixxm Would be happy to, yes.

@felixxm
Copy link
Member

felixxm commented Jul 4, 2019

@pope1ni Great 🚀 Feel-free to update this patch in your PR.

@ngnpope
Copy link
Member

ngnpope commented Jul 4, 2019

@felixmm I've opened #11540. Just a few things I need to look at still.

@felixxm
Copy link
Member

felixxm commented Jul 5, 2019

Updated in #11540.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
8 participants