Can we autoformat the Python code? #4104

cclauss · 2018-12-09T11:31:06Z

PyLint is very picky about import order, line lengths, indentation, etc.

Could we pick a tool(s) like autopep8, yapf, black, isort that we could run on our PRs before submission so that there is not so much manual rework required to placate PyLint?

My person preference is black because you can't mess with it.

Any color you want as long as it is black.

qstokkink · 2018-12-10T07:59:13Z

Personally I believe programmers should be in charge of styling code to be readable and tools should tell them when they go too far, instead of the tools dictating the style. I also believe that sometimes you need to ignore "correct" formatting/style.

Both programmers and automatic tools make mistakes though. I have seen autopep code that made my eyes water, but the same goes for humans.

cclauss · 2018-12-10T11:48:33Z

There is Goolge's golang fmt philosophy that says a lot of programmer time and code review attention goes into code formatting. They have established one format, language-wide and it is normal operations to put fmt into precommit hooks and continuous integration tools so that programmer attention is focused on what the code does instead of on changing how it looks. Their yapf was an effort to do the same for Python and black has an even stricter focus. A lot of programmer time and effort is being sunk into placating linters.

qstokkink · 2018-12-10T12:08:08Z

I understand where you're coming from, but I disagree with the "linter is always right" idea. Basically this: https://dev9.com/blog-posts/2015/1/the-myth-of-developer-productivity .

Humans make the best code for other humans and tools are only there to point where the human might have had a few too many beers during a late night programming session. If there is ever a conflict between tool and human, human should investigate the cause and decide who is right. Only in the most trivial cases should we adopt what the machine poops out.

ichorid · 2019-02-28T13:38:31Z

I would argue that many previous generations of programmers on Tribler project have never heard about linting... Personally, I don't like to spend my finite human life in this world fighting lint errors, and then fighting other developers on should these lint errors be fixed or not. And as my current job here is refactoring, which is basically repaying technical debt of the previous generations, I would definitely like some of this debt to be repaid by a machine. Because that is what programming is about - automating boring tasks.

Therefore, I vote to define some clear (and not too strict) PEP8-based rules for the Tribler project and then run autopep8 on the whole Tribler tree once.

After that, every developer's work will become much easier, as instead of fixing pylint errors by hand the developer will only have to hit Ctl-Alt-L once in their PyCharm (or whatever they use) and have the code formatted automatically without fear of touching lines that they have not really edited.

qstokkink · 2019-02-28T13:59:36Z

We already autopep8'd the entire codebase once https://github.com/Tribler/tribler/search?q=autopep8&type=Issues

Our life has not gotten any easier.

Also, read https://github.com/Tribler/tribler/blob/devel/doc/contributing.rst

ichorid · 2019-02-28T15:27:38Z

I always press the "Reformat code" button in PyCharm before committing stuff. Why do I then get the lint and flake8 errors?

synctext · 2019-02-28T16:09:24Z

Would anybody like to take the tool 'black' for a spin with our codebase?

ichorid · 2019-02-28T17:04:07Z

@synctext , I've tried that. The result really depends on what kind of code you format with it. For @qstokkink's crypto code it produce awful results. For the older TorrentChecker it works wonders.

ichorid · 2019-02-28T17:06:57Z

I propose the following approach:
if something can be checked by a machine, it should be automatically fixed by it.

So, I'll try to get our lint specifications in line with autopep8+isort results, and vice-versa.

ichorid · 2019-02-28T17:36:19Z

See and judge for yourself:

Original:

from __future__ import absolute_import
from __future__ import division

from cryptography.hazmat.primitives.asymmetric.rsa import _modinv


def format_polynomial(a, b, c):
    """
    Formats a polynomial cx^2 + bx + a into non-zero/non-one form.

    Ex. '0x^2 + 1x + 2' becomes 'x + 2'

    :param a: x^0 coefficient
    :param b: x^1 coefficient
    :param c: x^2 coefficient
    :returns: pretty format of polynomial
    """
    out = ''
    for (v, s) in [(a, ''), (b, 'x'), (c, 'x^2')]:
        if v:
            fmt_v = '' if abs(v) == 1 and s != '' else str(abs(v))
            out += (' + ' if out else '') + fmt_v + s
    if not out:
        out = '0'
    return out


class FP2Value(object):
    """
    Defines a rational value (a + bx + cx^2)/(aC + bCx + cCx^2)(mod 1 + x + x^2, mod p).
    """

    def __init__(self, mod, a=0, b=0, c=0, aC=1, bC=0, cC=0):
        """
        Intialize a value mod 'mod' of two quadratic polynomials divided by each other modulo (x^2 + x + 1).

        :param mod: the modulus
        :param a: the coefficient of 1
        :param b: the coefficient of x
        :param c: the coefficient of x^2
        :param aC: (the coefficient of 1) ^ -1
        :param bC: (the coefficient of x) ^ -1
        :param cC: (the coefficient of x) ^ -2
        """
        self.mod = mod
        self.a, self.b, self.c, self.aC, self.bC, self.cC = a % mod, b % mod, c % mod, aC % mod, bC % mod, cC % mod

    def __str__(self):
        """
        Format this value as a string.
        """
        numerator = format_polynomial(self.a, self.b, self.c)
        denominator = format_polynomial(self.aC, self.bC, self.cC)
        if denominator == '1':
            return numerator
        else:
            return "(%s)/(%s)" % (numerator, denominator)

    def __add__(self, other):
        """
        Add this value to another value and return a new FP2Value.
        """
        assert self.mod == other.mod

        a = self.aC * other.a - self.cC * other.a + self.a * other.aC - self.c * other.aC - self.bC * other.b + \
            self.cC * other.b - self.aC * other.c + self.bC * other.c - self.a * other.cC + self.b * other.cC
        b = self.bC * other.a - self.cC * other.a + self.b * other.aC - self.c * other.aC + self.aC * other.b - \
            self.bC * other.b + self.a  * other.bC - self.b * other.bC - self.aC * other.c + self.cC * other.c - \
            self.a * other.cC + self.c  * other.cC
        aC = self.aC * other.aC - self.cC * other.aC - self.bC * other.bC + \
             self.cC * other.bC - self.aC * other.cC + self.bC * other.cC
        bC = self.bC * other.aC - self.cC * other.aC + self.aC * other.bC - \
             self.bC * other.bC - self.aC * other.cC + self.cC * other.cC
        return FP2Value(self.mod, a=a, b=b, aC=aC, bC=bC)

    def __sub__(self, other):
        """
        Subtract another value from this value and return a new FP2Value.
        """
        assert self.mod == other.mod
        a = -self.aC * other.a + self.cC * other.a + self.a * other.aC - self.c * other.aC + self.bC * other.b - \
            self.cC * other.b - self.b * other.bC + self.c * other.bC + self.aC * other.c - self.bC * other.c - \
            self.a * other.cC + self.b * other.cC
        b = -self.bC * other.a + self.cC * other.a + self.b * other.aC - self.c * other.aC - self.aC * other.b + \
            self.bC * other.b + self.a * other.bC - self.b * other.bC + self.aC * other.c - self.cC * other.c - \
            self.a * other.cC + self.c * other.cC
        aC = self.aC * other.aC - self.cC * other.aC - self.bC * other.bC + \
             self.cC * other.bC - self.aC * other.cC + self.bC * other.cC
        bC = self.bC * other.aC - self.cC * other.aC + self.aC * other.bC - \
             self.bC * other.bC - self.aC * other.cC + self.cC * other.cC
        return FP2Value(self.mod, a=a, b=b, aC=aC, bC=bC)

    def __mul__(self, other):
        """
        Multiply this value with another value and return a new FP2Value.
        """
        assert self.mod == other.mod
        a = self.a * other.a - self.c * other.a - self.b * other.b + \
            self.c * other.b - self.a * other.c + self.b * other.c
        b = self.b * other.a - self.c * other.a + self.a * other.b - \
            self.b * other.b - self.a * other.c + self.c * other.c
        aC = self.aC * other.aC - self.cC * other.aC - self.bC * other.bC + \
             self.cC * other.bC - self.aC * other.cC + self.bC * other.cC
        bC = self.bC * other.aC - self.cC * other.aC + self.aC * other.bC - \
             self.bC * other.bC - self.aC * other.cC + self.cC * other.cC
        return FP2Value(self.mod, a=a, b=b, aC=aC, bC=bC)

    def __floordiv__(self, other):
        """
        Divide this value by another value and return a new FP2Value.
        """
        assert self.mod == other.mod
        a = self.a * other.aC - self.c * other.aC - self.b * other.bC + \
            self.c * other.bC - self.a * other.cC + self.b * other.cC
        b = self.b * other.aC - self.c * other.aC + self.a * other.bC - \
            self.b * other.bC - self.a * other.cC + self.c * other.cC
        aC = self.aC * other.a - self.cC * other.a - self.bC * other.b + \
             self.cC * other.b - self.aC * other.c + self.bC * other.c
        bC = self.bC * other.a - self.cC * other.a + self.aC * other.b - \
             self.bC * other.b - self.aC * other.c + self.cC * other.c
        return FP2Value(self.mod, a=a, b=b, aC=aC, bC=bC)

    def __eq__(self, other):
        """
        Check equality with another value.
        """
        if not isinstance(other, FP2Value):
            return False
        divd = (self // other).normalize()
        return all([divd.a == divd.aC, divd.b == divd.bC, divd.c == divd.cC])

    def __hash__(self):
        """
        Equality is not trivial. We hash everything to 0 for the full equality check.
        """
        return 0

    def intpow(self, power):
        """
        Raise this value by a given power (int).

        :param power: the power to raise this value by
        :type power: int
        """
        n = power
        R = FP2Value(self.mod, 1)
        U = self
        while n > 0:
            if (n % 2) == 1:
                R *= U
            U *= U
            n = n // 2
        return R

    def normalize(self):
        """
        Normalize to aC = 1: this is the best human-readable form.

        Ex. '20/4' becomes '5'
            '4 + 4x/2 + 2x' becomes '2 + 2x/1 + x'
        """
        mp = _modinv(self.aC % self.mod, self.mod)
        if mp > 0:
            a = (self.a * mp) % self.mod
            b = (self.b * mp) % self.mod
            c = (self.c * mp) % self.mod
            aC = 1
            bC = (self.bC * mp) % self.mod
            cC = (self.cC * mp) % self.mod
            return FP2Value(self.mod, a, b, c, aC, bC, cC)
        return FP2Value(self.mod, self.a, self.b, self.c, self.aC, self.bC, self.cC)

    def inverse(self):
        """
        Return the inverse of this value.
        """
        return FP2Value(self.mod, a=self.aC, b=self.bC, c=self.cC, aC=self.a, bC=self.b, cC=self.c)

    def wp_nominator(self):
        """
        Return the '1' and 'x' coefficients as a new value.
        """
        return FP2Value(self.mod, self.a, self.b)

    def wp_denom_inverse(self):
        """
        Return the '1^-1' and 'x^-1' coefficients modular inverse.
        """
        iq = FP2Value(self.mod, self.aC * self.aC - self.aC * self.bC + self.bC * self.bC)
        a = FP2Value(self.mod, self.aC - self.bC) // iq
        b = FP2Value(self.mod, -self.bC) // iq
        return FP2Value(self.mod, a.normalize().a, b.normalize().a)

Black-formatted

from __future__ import absolute_import
from __future__ import division

from cryptography.hazmat.primitives.asymmetric.rsa import _modinv


def format_polynomial(a, b, c):
    """
    Formats a polynomial cx^2 + bx + a into non-zero/non-one form.

    Ex. '0x^2 + 1x + 2' becomes 'x + 2'

    :param a: x^0 coefficient
    :param b: x^1 coefficient
    :param c: x^2 coefficient
    :returns: pretty format of polynomial
    """
    out = ""
    for (v, s) in [(a, ""), (b, "x"), (c, "x^2")]:
        if v:
            fmt_v = "" if abs(v) == 1 and s != "" else str(abs(v))
            out += (" + " if out else "") + fmt_v + s
    if not out:
        out = "0"
    return out


class FP2Value(object):
    """
    Defines a rational value (a + bx + cx^2)/(aC + bCx + cCx^2)(mod 1 + x + x^2, mod p).
    """

    def __init__(self, mod, a=0, b=0, c=0, aC=1, bC=0, cC=0):
        """
        Intialize a value mod 'mod' of two quadratic polynomials divided by each other modulo (x^2 + x + 1).

        :param mod: the modulus
        :param a: the coefficient of 1
        :param b: the coefficient of x
        :param c: the coefficient of x^2
        :param aC: (the coefficient of 1) ^ -1
        :param bC: (the coefficient of x) ^ -1
        :param cC: (the coefficient of x) ^ -2
        """
        self.mod = mod
        self.a, self.b, self.c, self.aC, self.bC, self.cC = (
            a % mod,
            b % mod,
            c % mod,
            aC % mod,
            bC % mod,
            cC % mod,
        )

    def __str__(self):
        """
        Format this value as a string.
        """
        numerator = format_polynomial(self.a, self.b, self.c)
        denominator = format_polynomial(self.aC, self.bC, self.cC)
        if denominator == "1":
            return numerator
        else:
            return "(%s)/(%s)" % (numerator, denominator)

    def __add__(self, other):
        """
        Add this value to another value and return a new FP2Value.
        """
        assert self.mod == other.mod

        a = (
            self.aC * other.a
            - self.cC * other.a
            + self.a * other.aC
            - self.c * other.aC
            - self.bC * other.b
            + self.cC * other.b
            - self.aC * other.c
            + self.bC * other.c
            - self.a * other.cC
            + self.b * other.cC
        )
        b = (
            self.bC * other.a
            - self.cC * other.a
            + self.b * other.aC
            - self.c * other.aC
            + self.aC * other.b
            - self.bC * other.b
            + self.a * other.bC
            - self.b * other.bC
            - self.aC * other.c
            + self.cC * other.c
            - self.a * other.cC
            + self.c * other.cC
        )
        aC = (
            self.aC * other.aC
            - self.cC * other.aC
            - self.bC * other.bC
            + self.cC * other.bC
            - self.aC * other.cC
            + self.bC * other.cC
        )
        bC = (
            self.bC * other.aC
            - self.cC * other.aC
            + self.aC * other.bC
            - self.bC * other.bC
            - self.aC * other.cC
            + self.cC * other.cC
        )
        return FP2Value(self.mod, a=a, b=b, aC=aC, bC=bC)

    def __sub__(self, other):
        """
        Subtract another value from this value and return a new FP2Value.
        """
        assert self.mod == other.mod
        a = (
            -self.aC * other.a
            + self.cC * other.a
            + self.a * other.aC
            - self.c * other.aC
            + self.bC * other.b
            - self.cC * other.b
            - self.b * other.bC
            + self.c * other.bC
            + self.aC * other.c
            - self.bC * other.c
            - self.a * other.cC
            + self.b * other.cC
        )
        b = (
            -self.bC * other.a
            + self.cC * other.a
            + self.b * other.aC
            - self.c * other.aC
            - self.aC * other.b
            + self.bC * other.b
            + self.a * other.bC
            - self.b * other.bC
            + self.aC * other.c
            - self.cC * other.c
            - self.a * other.cC
            + self.c * other.cC
        )
        aC = (
            self.aC * other.aC
            - self.cC * other.aC
            - self.bC * other.bC
            + self.cC * other.bC
            - self.aC * other.cC
            + self.bC * other.cC
        )
        bC = (
            self.bC * other.aC
            - self.cC * other.aC
            + self.aC * other.bC
            - self.bC * other.bC
            - self.aC * other.cC
            + self.cC * other.cC
        )
        return FP2Value(self.mod, a=a, b=b, aC=aC, bC=bC)

    def __mul__(self, other):
        """
        Multiply this value with another value and return a new FP2Value.
        """
        assert self.mod == other.mod
        a = (
            self.a * other.a
            - self.c * other.a
            - self.b * other.b
            + self.c * other.b
            - self.a * other.c
            + self.b * other.c
        )
        b = (
            self.b * other.a
            - self.c * other.a
            + self.a * other.b
            - self.b * other.b
            - self.a * other.c
            + self.c * other.c
        )
        aC = (
            self.aC * other.aC
            - self.cC * other.aC
            - self.bC * other.bC
            + self.cC * other.bC
            - self.aC * other.cC
            + self.bC * other.cC
        )
        bC = (
            self.bC * other.aC
            - self.cC * other.aC
            + self.aC * other.bC
            - self.bC * other.bC
            - self.aC * other.cC
            + self.cC * other.cC
        )
        return FP2Value(self.mod, a=a, b=b, aC=aC, bC=bC)

    def __floordiv__(self, other):
        """
        Divide this value by another value and return a new FP2Value.
        """
        assert self.mod == other.mod
        a = (
            self.a * other.aC
            - self.c * other.aC
            - self.b * other.bC
            + self.c * other.bC
            - self.a * other.cC
            + self.b * other.cC
        )
        b = (
            self.b * other.aC
            - self.c * other.aC
            + self.a * other.bC
            - self.b * other.bC
            - self.a * other.cC
            + self.c * other.cC
        )
        aC = (
            self.aC * other.a
            - self.cC * other.a
            - self.bC * other.b
            + self.cC * other.b
            - self.aC * other.c
            + self.bC * other.c
        )
        bC = (
            self.bC * other.a
            - self.cC * other.a
            + self.aC * other.b
            - self.bC * other.b
            - self.aC * other.c
            + self.cC * other.c
        )
        return FP2Value(self.mod, a=a, b=b, aC=aC, bC=bC)

    def __eq__(self, other):
        """
        Check equality with another value.
        """
        if not isinstance(other, FP2Value):
            return False
        divd = (self // other).normalize()
        return all([divd.a == divd.aC, divd.b == divd.bC, divd.c == divd.cC])

    def __hash__(self):
        """
        Equality is not trivial. We hash everything to 0 for the full equality check.
        """
        return 0

    def intpow(self, power):
        """
        Raise this value by a given power (int).

        :param power: the power to raise this value by
        :type power: int
        """
        n = power
        R = FP2Value(self.mod, 1)
        U = self
        while n > 0:
            if (n % 2) == 1:
                R *= U
            U *= U
            n = n // 2
        return R

    def normalize(self):
        """
        Normalize to aC = 1: this is the best human-readable form.

        Ex. '20/4' becomes '5'
            '4 + 4x/2 + 2x' becomes '2 + 2x/1 + x'
        """
        mp = _modinv(self.aC % self.mod, self.mod)
        if mp > 0:
            a = (self.a * mp) % self.mod
            b = (self.b * mp) % self.mod
            c = (self.c * mp) % self.mod
            aC = 1
            bC = (self.bC * mp) % self.mod
            cC = (self.cC * mp) % self.mod
            return FP2Value(self.mod, a, b, c, aC, bC, cC)
        return FP2Value(
            self.mod, self.a, self.b, self.c, self.aC, self.bC, self.cC
        )

    def inverse(self):
        """
        Return the inverse of this value.
        """
        return FP2Value(
            self.mod,
            a=self.aC,
            b=self.bC,
            c=self.cC,
            aC=self.a,
            bC=self.b,
            cC=self.c,
        )

    def wp_nominator(self):
        """
        Return the '1' and 'x' coefficients as a new value.
        """
        return FP2Value(self.mod, self.a, self.b)

    def wp_denom_inverse(self):
        """
        Return the '1^-1' and 'x^-1' coefficients modular inverse.
        """
        iq = FP2Value(
            self.mod, self.aC * self.aC - self.aC * self.bC + self.bC * self.bC
        )
        a = FP2Value(self.mod, self.aC - self.bC) // iq
        b = FP2Value(self.mod, -self.bC) // iq
        return FP2Value(self.mod, a.normalize().a, b.normalize().a)

ichorid · 2019-02-28T17:37:12Z

Personally, I find the Black-formatted code more readable, at least in this case.

cclauss · 2019-02-28T18:10:15Z

Autoformatting is for consistency. I am not a fan of sometime we do this and other times we do that.

synctext · 2019-03-01T08:47:40Z

@cclauss Agreed. We can hopefully convergence towards a consistent approach. We want the code base to have same look and feel everywhere. Guess such esthetics are difficult and deeply personal. Would an optional "reformatter" work?

Or should we first focus exclusively on slaying the Python3 dragon?

qstokkink · 2019-03-01T09:09:35Z

    def __add__(self, other):
        """
        Add this value to another value and return a new FP2Value.
        """
        assert self.mod == other.mod

        a = self.aC * other.a - self.cC * other.a + self.a * other.aC - self.c * other.aC - self.bC * other.b + \
            self.cC * other.b - self.aC * other.c + self.bC * other.c - self.a * other.cC + self.b * other.cC
        b = self.bC * other.a - self.cC * other.a + self.b * other.aC - self.c * other.aC + self.aC * other.b - \
            self.bC * other.b + self.a  * other.bC - self.b * other.bC - self.aC * other.c + self.cC * other.c - \
            self.a * other.cC + self.c  * other.cC
        aC = self.aC * other.aC - self.cC * other.aC - self.bC * other.bC + \
             self.cC * other.bC - self.aC * other.cC + self.bC * other.cC
        bC = self.bC * other.aC - self.cC * other.aC + self.aC * other.bC - \
             self.bC * other.bC - self.aC * other.cC + self.cC * other.cC
        return FP2Value(self.mod, a=a, b=b, aC=aC, bC=bC)

You can see that this design and this implementation are the same.

    def __add__(self, other):
        """
        Add this value to another value and return a new FP2Value.
        """
        assert self.mod == other.mod

        a = (
            self.aC * other.a
            - self.cC * other.a
            + self.a * other.aC
            - self.c * other.aC
            - self.bC * other.b
            + self.cC * other.b
            - self.aC * other.c
            + self.bC * other.c
            - self.a * other.cC
            + self.b * other.cC
        )
        b = (
            self.bC * other.a
            - self.cC * other.a
            + self.b * other.aC
            - self.c * other.aC
            + self.aC * other.b
            - self.bC * other.b
            + self.a * other.bC
            - self.b * other.bC
            - self.aC * other.c
            + self.cC * other.c
            - self.a * other.cC
            + self.c * other.cC
        )
        aC = (
            self.aC * other.aC
            - self.cC * other.aC
            - self.bC * other.bC
            + self.cC * other.bC
            - self.aC * other.cC
            + self.bC * other.cC
        )
        bC = (
            self.bC * other.aC
            - self.cC * other.aC
            + self.aC * other.bC
            - self.bC * other.bC
            - self.aC * other.cC
            + self.cC * other.cC
        )
        return FP2Value(self.mod, a=a, b=b, aC=aC, bC=bC)

This does not even fit on the same screen. This means the auto formatting is actively bad and the original human formatting was much better (not just because I wrote it of course 😉 )

qstokkink · 2019-03-01T09:32:44Z

Do note, just to reiterate, I'm not saying you can never use auto formatting. I'm only arguing that our checks should allow for the 2/10 cases where human formatting is better than auto formatting.

ichorid · 2019-03-01T14:53:59Z

@qstokkink , black allows the programmer to define regions that are not to be touched by the formatter, like this:

# fmt: off
custom_formatting = [
    0,  1,  2,
    3,  4,  5,
    6,  7,  8,
]
# fmt: on
regular_formatting = [0, 1, 2, 3, 4, 5, 6, 7, 8]

qstokkink · 2019-03-02T06:50:49Z

What would the previous example look like in that case in black then?

qstokkink · 2019-03-02T07:17:18Z

Also, what exactly is the proposed workflow? Having to fight my own branch with force pushes is going to be a pain in the behind. Polluting the repository with automatic style commits is also not a good idea imo.

cclauss · 2019-03-02T07:36:14Z

# fmt: off
def __add__(self, other):
        """
        Add this value to another value and return a new FP2Value.
        """
        assert self.mod == other.mod

        a = (self.aC * other.a - self.cC * other.a + self.a * other.aC - self.c * other.aC - self.bC * other.b +
             self.cC * other.b - self.aC * other.c + self.bC * other.c - self.a * other.cC + self.b * other.cC)
        b = (self.bC * other.a - self.cC * other.a + self.b * other.aC - self.c * other.aC + self.aC * other.b -
             self.bC * other.b + self.a  * other.bC - self.b * other.bC - self.aC * other.c + self.cC * other.c -
             self.a * other.cC + self.c  * other.cC)
        aC = (self.aC * other.aC - self.cC * other.aC - self.bC * other.bC +
              self.cC * other.bC - self.aC * other.cC + self.bC * other.cC)
        bC = (self.bC * other.aC - self.cC * other.aC + self.aC * other.bC -
              self.bC * other.bC - self.aC * other.cC + self.cC * other.cC)
        return FP2Value(self.mod, a=a, b=b, aC=aC, bC=bC)
# fmt: on

;-) Backslashes are a bad idea in Python code because one whitespace to the right of the backslash breaks the script on a change that is invisible to the reader. See PEP8.

A pre commit hook is provided by the black team.

https://github.com/ambv/black#version-control-integration

qstokkink · 2019-03-04T08:27:16Z

;-) Backslashes are a bad idea in Python code because one whitespace to the right of the backslash breaks the script on a change that is invisible to the reader. See PEP8.

Not to worry, this is an old version of the code. This has been fixed to adhere to flake8 recently.

Back to some fundamental issues: I'm absolutely against performance tools/tests/linters having a footprint in our codebase. This is bad design, as this implies there are special rules for tools, which should be giving you an objective measure of your code (instead of programmers fooling the tool's heuristics into giving a good score). Passing heuristics != good code. This is also in line with:

Autoformatting is for consistency. I am not a fan of sometime we do this and other times we do that.

Having to sometimes switch away from your auto formatter means the auto formatter is no good (either generally or due to bad configuration). This implies that we either need to configure black to suit our needs or use an auto formatter which is a bit less aggressive. Tweaking the auto formatter to be just right is a lot of work though. Maybe we should offer this up as a B.Sc. thesis topic?

cclauss · 2019-03-04T08:59:38Z

we either need to configure black to suit our needs or use an auto formatter which is a bit less aggressive. Tweaking the auto formatter to be just right is a lot of work though.

Black and go.fmt are both opinionated and intentionally not tweakable. They do not have command line options that allow the rules to be changed.

The Black docs start with:

By using Black, you agree to cede control over minutiae of hand-formatting. In return, Black gives you speed, determinism, and freedom from pycodestyle nagging about formatting. You will save time and mental energy for more important matters.

The blog that introduced go.fmt says:

Mechanical transformation is invaluable when working with large code bases, as it is both more comprehensive and less error prone than making wide-sweeping changes by hand. Indeed, when working at scale (like we do at Google) it isn't practical to make these kinds of changes manually.

qstokkink · 2019-03-04T09:32:08Z

Then we shouldn't use those tools. Bad code style is not a problem to be solved, Bad code style is a symptom of unmaintainable and/or unreadable code.

ichorid · 2019-03-04T10:47:54Z

Then we shouldn't use those tools. Bad code style is not a problem to be solved, Bad code style is a symptom of unmaintainable and/or unreadable code.

Unfortunately for me, my job here is exactly that: transforming large swathes of unmaintainable and/or unreadable code into something that is both readable and maintainable. And as I am physically unable to rewrite each offending file (even glancing through it took 2 workweeks!), I often end up moving and/or modifying large portions of existing files. And then I have to go through exactly that painful process that scared @qstokkink so much:

Having to fight my own branch with force pushes ...

@cclauss seems to understand that pain because, as myself, he faces whole codebases.

As there is no consensus on this, seems like I'll have to develop my own pre-commit hooks that fit our style checkers.

qstokkink · 2019-03-04T11:17:42Z

transforming large swathes of unmaintainable and/or unreadable code into something that is both readable and maintainable

If you're at that point, an auto formatter is not what you need anyway. That requires manual inspection and thorough refactoring, which is, indeed, not a trivial task.

ichorid · 2019-03-04T12:09:42Z

@qstokkink, the autoformatter is precisely what I need because when I move some code around, Flake8 considers this code 'touched' and starts to bug me on its formatting inconsistencies. And when I add/remove some modules and hit c-alt-L in Pycharm to optimize and align the changed imports, Flake8 complains that the imports are in the wrong order, grouped wrong or whatever.
I believe you don't encounter this kind of problems often because you mainly work on your personal, quite mature codebase, making small, incremental changes. Alas, most of the time, I am deprived of that luxury...

qstokkink · 2019-03-04T13:40:53Z

That's fine then right? Just apply your formatter for the files you touched.

ichorid · 2019-03-04T13:48:49Z

Yeah, kinda. Except for the problem of bigger-than-should-be diffs. But if everyone would use this workflow, the diffs will be normal. Because someone else would already applied the formatting to the entire file.

cclauss · 2019-03-04T15:13:36Z

@ichorid Do you have a pointer to your workflow? I am interested to adopt it.

ichorid · 2019-03-04T16:00:29Z

@cclauss , my workflow was really simple: just code and hit c-alt-L in Pycharm for it to align everything for you, then commit. It does not work for me now, because while I was working on a separate branch the rest of the team adopted this Flake8 checkers that are more picky about PEP8 than Pycharm's code reformat feature cares. I'm sure your workflow is much more sophisticated ))

Now, I propose the following workflow: autoformat the whole file you changed on pre-commit hook with something like a combination of isort and black. As long as everyone follows this workflow, the probability of merge conflicts, etc. will be minimized. And everyone could just forget about formatting at all and concentrate on the code.

ichorid · 2019-08-03T19:56:46Z

I'm working on implementing pre-commit hooks for checking our style requirements and automatically enforcing them with black and isort. These tools are not entirely compatible with our current style standards in some minor details. To ensure compatibility, we'll have to either modify the tools, or slightly change our style standards.
At the moment, I found the following incompatibilities:

E203 check in flake8 is not PEP8 compliant, and since black enforces PEP8, it triggers our flake8 checker. See black documentation on the issue. I propose removing E203 from our flake8 checks.
isort has no option to enforce cryptography-style "newline between 3rd party libraries" imports that is checked by flake8-import-order plugin that we use. There is an alternative flake8-isort plugin that uses isort itself for the checks. This ensures compatiblity between isort-sorted imports and flake8 checker. I propose switching to flake8-isort so we can start applying imports formatting automatically.

cclauss · 2019-08-03T20:31:06Z

My vote would be that you change your local style guidelines to conform with black formatting.

ichorid · 2019-08-04T09:26:48Z

Here is one example of pre-commit hooks used by ReadTheDocs project.

ichorid · 2019-08-04T15:37:08Z

Relevant links:

isort settings;
flake8-import-order plugin that we currently use with cryptography style;
autoflake tool that removes unused imports and variables (pre-commit hook)
flake8-isort plugin for flake8 that uses isort to check for imports order.
Short guide on setting up pre-commit for use with Python.

ichorid · 2019-08-04T15:52:40Z

My current pre-commit setup. Tested today, works perfectly (except for our flake8-import-order settings, of course..)

.pre-commit-config.yaml

repos:
-   repo: https://github.com/Tribler/mirrors-autoflake
    rev: v1.1
    hooks:
      - id: autoflake
        args: ['--in-place', '--remove-all-unused-imports', '--remove-unused-variable']

-   repo: https://github.com/Tribler/isort
    rev: 4.3.21-2-tribler
    hooks:
    - id: isort

-   repo: https://github.com/ambv/black
    rev: 19.3b0
    hooks:
    - id: black

-   repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v2.2.3
    hooks:
    -  id: flake8

.isort.cfg

[settings]
order_by_type = False
multi_line_output=3
case_sensitive = True
include_trailing_comma=True
force_grid_wrap=0
use_parentheses = True
combine_as_imports=True
force_sort_within_sections=True
line_length=118
known_future_library=future
known_first_party=Tribler,TriblerGUI
known_third_party=pony,twisted,six,anydex,ipv8,libtorrent,lz4,PyQt5,zope

.flake8

[flake8]
jobs = auto 
max-line-length = 120 
#select=E
ignore=E203,W503

pyproject.toml

[tool.black]
line-length = 120
py27 = true
skip-string-normalization = true
include = '\.pyi?$'
exclude = '''
/(
    \.git
  | \.venv
  | \.idea 
  | snap
  | doc
)/
'''

To try it out, create the aforementioned files in your Tribler root repository, then do

pip install flake8 black isort autoflake pre-commit
pre-commit  install

To try it out:

pre-commit run -v

Note that complete compliance with flake8 cryptography style requires modified isort.

synctext · 2019-08-04T16:52:40Z

Good initiative. Please organise a demo meeting for the DevTeam when everybody is back.

ichorid · 2019-08-13T17:51:37Z

C0330 is wrong on pylint, we should disable it. See psf/black#48

ichorid · 2019-09-04T09:41:36Z

After a thorough discussion between developers, we came to a consensus on applying Black and autoformatting our codebase:

We will add pre-commit hooks configuration to our codebase and developer guidelines;
The pre-commit hooks will contain flake8 checks, custom imports sorter based on isort, as indicated above, and Black formatter;
Initially, Black formatting will only be applied to the MetadataStore submodule and TriblerGUI;
After @egbertbouman's AsyncIO-related refactoring will be merged, we will consider applying Black formatting to the rest of the codebase;
Chosen parts of Tribler can be excluded from Black formatting at that point if owners of the respective code parts choose to do so.

devos50 added the type: enhancement label Dec 9, 2018

devos50 added this to the Backlog milestone Dec 9, 2018

cclauss closed this as completed Dec 10, 2018

ichorid reopened this Feb 28, 2019

ichorid self-assigned this Feb 28, 2019

ichorid modified the milestones: Backlog, V7.4: Libtorrent wrapper refactor Feb 28, 2019

cclauss closed this as completed Mar 4, 2019

ichorid reopened this Aug 3, 2019

ichorid added the component: infrastructure label Aug 13, 2019

This was referenced Aug 14, 2019

Fix false broken db warning in logger #4751

Merged

READY: Nested channels refactoring #4720

Closed

Add flake8/black/isort pre-commit hooks #4760

Merged

devos50 closed this as completed in #4760 Sep 5, 2019

ichorid mentioned this issue Jul 21, 2020

Feature/i18n #5479

Merged

ichorid mentioned this issue Nov 20, 2020

Add sentry #5735

Merged

3 tasks

Can we autoformat the Python code? #4104

Can we autoformat the Python code? #4104

Comments

cclauss commented Dec 9, 2018 • edited

qstokkink commented Dec 10, 2018

cclauss commented Dec 10, 2018 • edited

qstokkink commented Dec 10, 2018

ichorid commented Feb 28, 2019 • edited

qstokkink commented Feb 28, 2019 • edited

ichorid commented Feb 28, 2019

synctext commented Feb 28, 2019

ichorid commented Feb 28, 2019

ichorid commented Feb 28, 2019

ichorid commented Feb 28, 2019

ichorid commented Feb 28, 2019

cclauss commented Feb 28, 2019 • edited

synctext commented Mar 1, 2019 • edited

qstokkink commented Mar 1, 2019 • edited

qstokkink commented Mar 1, 2019

ichorid commented Mar 1, 2019

qstokkink commented Mar 2, 2019

qstokkink commented Mar 2, 2019

cclauss commented Mar 2, 2019 • edited

qstokkink commented Mar 4, 2019

cclauss commented Mar 4, 2019

qstokkink commented Mar 4, 2019

ichorid commented Mar 4, 2019

qstokkink commented Mar 4, 2019

ichorid commented Mar 4, 2019 • edited

qstokkink commented Mar 4, 2019

ichorid commented Mar 4, 2019

cclauss commented Mar 4, 2019

ichorid commented Mar 4, 2019 • edited

ichorid commented Aug 3, 2019

cclauss commented Aug 3, 2019

ichorid commented Aug 4, 2019

ichorid commented Aug 4, 2019

ichorid commented Aug 4, 2019 • edited

synctext commented Aug 4, 2019

ichorid commented Aug 13, 2019

ichorid commented Sep 4, 2019

cclauss commented Dec 9, 2018 •

edited

cclauss commented Dec 10, 2018 •

edited

ichorid commented Feb 28, 2019 •

edited

qstokkink commented Feb 28, 2019 •

edited

cclauss commented Feb 28, 2019 •

edited

synctext commented Mar 1, 2019 •

edited

qstokkink commented Mar 1, 2019 •

edited

cclauss commented Mar 2, 2019 •

edited

ichorid commented Mar 4, 2019 •

edited

ichorid commented Mar 4, 2019 •

edited

ichorid commented Aug 4, 2019 •

edited