Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update .clang-format to match Zcash style #6678

Draft
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

sellout
Copy link
Contributor

@sellout sellout commented May 19, 2023

This first matches the documented style in doc/developer-notes.md. Then it
attempts to minimize the diff when applied to src/zcash. I don’t think this is
the ideal style, but we can iterate from here.

In the initial commit there are some comments indicating changes to discuss.
These have been applied in later commits in this PR for comparison, but can be
dropped.

This doesn’t necessitate any changes to `examine` call sites, but the `match`
wrapper around the lambdas can now be elided.

This is useful down the road if we use `clang-format` or something else where
specifying custom pretty-printing for individual functions is difficult.

What now looks like

    examine(meh, match {
        [](const Foo& foo) { … },
        [](const Bar& bar) { … },
    });

would, under `clang-format`, likely end up similar to

    examine(
        meh,
        match {
            [](const Foo& foo) { … },
            [](const Bar& bar) { … },
        });

(which takes two extra lines, plus an additional indentation) but now can
instead be

    examine(
        meh,
        [](const Foo& foo) { … },
        [](const Bar& bar) { … });

which fits in the same space, with the same indentation as our current approach.
@sellout
Copy link
Contributor Author

sellout commented May 19, 2023

We could alternatively leave the top-level .clang-format alone (well, fix it for more recent Clang, and probably improve its Bitcoin-style compatibility) and then add our own .clang-format to src/zcash and potentially other directories where we expect our style to dominate.

But personally I think we should just have our style defined at the top level, and then have a script (and linter) that only applies it to files where we don’t need to maintain backportability.

Also converts one straggling `std::visit` to `examine`.
This first matches the documented style in doc/developer-notes.md. Then it
attempts to minimize the diff when applied to src/zcash. I don’t think this is
the ideal style, but we can iterate from here.

There are some comments indicating initial changes to discuss.
This directory is easy, since it’s unequivocally ours. So it should both inform our style, be
relatively easy to maintain our style, and won’t cause conflicts with backports.
# - minimize diff-sensitivity to things like changing function names (e.g., don’t want whitespace
# diffs on all the parameter lines)

# NB: If clang-format can’t parse this file (or doesn’t understand any of the values, it will format
Copy link
Contributor

@daira daira May 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# NB: If clang-format can’t parse this file (or doesn’t understand any of the values, it will format
# NB: If clang-format can’t parse this file or doesn’t understand any of the values, it will format

AccessModifierOffset: -4
AlignEscapedNewlinesLeft: true
AlignAfterOpenBracket: AlwaysBreak
Copy link
Contributor

@daira daira May 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

@daira daira May 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't use bitfields enough for it to matter, but I like AlignConsecutiveBitFields: Consecutive and BitFieldColonSpacing: Both.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer AlignAfterOpenBracket: BlockIndent.

This would definitely be a large diff from the current style, so it should be postponed, but I also don’t like that style, so we can discuss later 😁

AlignTrailingComments: true
AllowAllArgumentsOnNextLine: false
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is not consistent with our current style in many places.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’ll check this. But … despite the name being Allow, I think making this true causes one-arg-per-line calls to be collapsed into a single line rather than left alone, so making it true initially might be a bigger diff.

Maybe commenting out the option for now (with a comment as to why) would keep things formatted as-is until we make the explicit decision?

AlignTrailingComments: true
AllowAllArgumentsOnNextLine: false
AllowAllConstructorInitializersOnNextLine: true
Copy link
Contributor

@daira daira May 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This option is deprecated, and since we set the replacement option PackConstructorInitializers, unnecessary.

Suggested change
AllowAllConstructorInitializersOnNextLine: true

AlignTrailingComments: true
AllowAllArgumentsOnNextLine: false
AllowAllConstructorInitializersOnNextLine: true
AllowAllParametersOfDeclarationOnNextLine: false
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is also not consistent with our current style.

@@ -30,17 +30,17 @@ AllowAllParametersOfDeclarationOnNextLine: false
AllowShortBlocksOnASingleLine: Empty
AllowShortCaseLabelsOnASingleLine: true
AllowShortFunctionsOnASingleLine: Inline
AllowShortIfStatementsOnASingleLine: true # I’d prefer false
AllowShortIfStatementsOnASingleLine: false
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AllowShortCaseLabelsOnASingleLine: true
AllowShortFunctionsOnASingleLine: Inline
AllowShortIfStatementsOnASingleLine: true # I’d prefer false
AllowShortLambdasOnASingleLine: All
AllowShortLoopsOnASingleLine: false
AlwaysBreakBeforeMultilineStrings: false
AlwaysBreakTemplateDeclarations: true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider AlwaysBreakTemplateDeclarations: MultiLine (but I have no strong objection to Yes/true).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, we don't currently use concepts, but we might as well set BreakBeforeConceptDeclarations: true for consistency.

AllowShortLoopsOnASingleLine: false
AlwaysBreakBeforeMultilineStrings: false
AlwaysBreakTemplateDeclarations: true
BinPackArguments: false
BinPackParameters: false
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AllowShortLambdasOnASingleLine: All
AllowShortLoopsOnASingleLine: false
AlwaysBreakBeforeMultilineStrings: false
AlwaysBreakTemplateDeclarations: true
BinPackArguments: false
BinPackParameters: false
BreakBeforeBinaryOperators: None # I’d prefer NonAssignment
BreakBeforeBinaryOperators: NonAssignment
Copy link
Contributor

@daira daira May 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍
NonAssignment is better, because None results in having to scan to a different column on each line for the operator, in multiline expressions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also set AlignOperands: AlignAfterOperator which works well with this choice.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One of my preferences with style (which I violate constantly anyway, but an auto-formatter helps me not violate) is that changes to the code should cause minimal formatting changes on other lines1.

Things like AlignAfterOperator here violate that. To use the example from clang-format, changing the name of the variable from aaa to aaab would reformat the second line of

int aaa = bbbbbbbbbbbbbbb
        + ccccccccccccccc;

but not of

int aaa = bbbbbbbbbbbbbbb
    + ccccccccccccccc;
int aaa =
    bbbbbbbbbbbbbbb + ccccccccccccccc;

etc.

Footnotes

  1. I miss having semantic diff like I did with lisp, where line breaks, indentation, etc. are ignored (if they’re not syntactically significant) and diffs are done by token/expression rather than by char/line.

BinPackParameters: false
BreakBeforeBinaryOperators: false
BreakBeforeBinaryOperators: None # I’d prefer NonAssignment
BreakBeforeBraces: Linux
Copy link
Contributor

@daira daira May 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider

Suggested change
BreakBeforeBraces: Linux
BreakBeforeBraces: Custom
BraceWrapping:
AfterCaseLabel: true
AfterClass: true
AfterControlStatement: MultiLine
AfterEnum: true
AfterFunction: true
AfterNamespace: true
AfterStruct: true
AfterUnion: true
AfterExternBlock: false
BeforeCatch: false
BeforeElse: false
BeforeLambdaBody: false
BeforeWhile: false
SplitEmptyFunction: true
SplitEmptyRecord: true
SplitEmptyNamespace: true

AfterCaseLabel: true reflects the fact that braced blocks aren't special in case branches, they just create a scope as they would anywhere else, and should be formatted as such. This is a rare case because braces usually aren't used around case branches in our code, but I am opinionated about it: where they are used they should stand out (since they don't do what people might think, in particular they don't prevent fallthough).

AfterControlStatement: MultiLine makes it clearer where the boundary between a multiline condition and the body is, while being concise in other cases.

If AfterFunction etc. are set to true, then setting SplitEmptyFunction etc. to true makes sense for consistency.

AfterExternBlock: false reflects our current style for extern "C" {.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On second thoughts, the effect of SplitEmptyFunction: true on the verbosity of empty constructors might justify setting SplitEmptyFunction: false. (Empty functions are rare at top level but not in constructors.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AfterCaseLabel: true

Yes, I particularly like this one.

On second thoughts, the effect of SplitEmptyFunction: true on the verbosity of empty constructors might justify setting SplitEmptyFunction: false. (Empty functions are rare at top level but not in constructors.)

There are some settings that can be set differently for “inline” definitions (in the class declaration), like AllowShortFunctionsOnASingleLine: Inline, so that might allow this to be true while not affecting the constructor case. But I don’t have a strong opinion on true/false here.

BreakBeforeBraces: Linux
BreakBeforeTernaryOperators: true
BreakConstructorInitializers: AfterColon # I’d prefer BeforeColon
BreakConstructorInitializers: BeforeColon
Copy link
Contributor

@daira daira May 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no strong preference, but if this is BeforeColon, then BreakInheritanceList should also be BeforeColon for consistency.

Suggested change
BreakConstructorInitializers: BeforeColon
BreakConstructorInitializers: BeforeColon
BreakInheritanceList: BeforeColon

BreakBeforeBraces: Linux
BreakBeforeTernaryOperators: true
BreakConstructorInitializers: AfterColon # I’d prefer BeforeColon
BreakConstructorInitializers: BeforeColon
ColumnLimit: 100 # match rustfmt, if we don’t define this, then we don’t get Wadler
Copy link
Contributor

@daira daira May 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lol at the comment. (I guess this whole PR/review is a case of Wadler's law.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heh, true enough! (although hopefully having a GH check for this would minimize that in future)

Here I meant Wadler-Leijen pretty-printing, as opposed to … the other style that I can never remember the name of. Basically, setting a line length here avoids the need for “hints” to the formatter that tell it whether it should keep, say, an argument list on one line (regardless of length), or do one-line-per-argument (if there is a user-introduced newline somewhere in the arglist).

We don’t actually get Wadler-style from clang-format, but this gets it closer.

BreakBeforeTernaryOperators: true
BreakConstructorInitializers: AfterColon # I’d prefer BeforeColon
ColumnLimit: 100 # match rustfmt, if we don’t define this, then we don’t get Wadler
CommentPragmas: '^ IWYU pragma:'
ConstructorInitializerAllOnOneLineOrOnePerLine: false
Copy link
Contributor

@daira daira May 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This option is deprecated, and since we set the replacement option PackConstructorInitializers, unnecessary.

Suggested change
ConstructorInitializerAllOnOneLineOrOnePerLine: false

BreakBeforeTernaryOperators: true
BreakConstructorInitializers: AfterColon # I’d prefer BeforeColon
ColumnLimit: 100 # match rustfmt, if we don’t define this, then we don’t get Wadler
CommentPragmas: '^ IWYU pragma:'
ConstructorInitializerAllOnOneLineOrOnePerLine: false
ConstructorInitializerIndentWidth: 4
ContinuationIndentWidth: 4
Copy link
Contributor

@daira daira May 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer

Suggested change
ContinuationIndentWidth: 4
BracedInitializerIndentWidth: 4
ContinuationIndentWidth: 8

which is consistent with the style we've started using in recent code, e.g.

if (inputs.has_value()) {
for (const auto& utxo : inputs.value().utxos) {
SignatureData sigdata;
ProduceSignature(
DummySignatureCreator(&wallet),
utxo.tx->vout[utxo.i].scriptPubKey,
sigdata,
consensusBranchId);
vin.emplace_back(COutPoint(utxo.tx->GetHash(), utxo.i), sigdata.scriptSig);
}
sproutInputCount = inputs.value().sproutNoteEntries.size();
saplingInputCount = inputs.value().saplingNoteEntries.size();
orchardInputCount = inputs.value().orchardNoteMetadata.size();
}
size_t logicalActionCount = CalculateLogicalActionCount(
vin,
vout,
std::max(sproutInputCount, sproutOutputCount),
saplingInputCount,
PadCount(saplingOutputCount),
PadCount(std::max(orchardInputCount, orchardOutputCount)));

(Move BracedInitializerIndentWidth to be in alphabetical order. It's needed because 8 characters is good for a continuation indent but too much for initializers.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’ll have to check this one. The annoying thing is that I don’t think I saw anything like ArgumentListIndentWith, which is the only place (I think) that we use 8-space indents. ContinuationIndentWidth affects a lot more cases (although probably still a smaller diff than changing all the multi-line arg lists).

}
}, payment.address);
[&](const libzcash::SaplingPaymentAddress&) { ++saplingOutputCount; },
[&](const libzcash::OrchardRawAddress&) { ++orchardOutputCount; });
Copy link
Contributor

@daira daira May 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that AlignAfterOpenBracket: BlockIndent (as I suggest here) would make this:

        examine(
            payment.address,
            [&](const CKeyID& addr) {
                vout.emplace_back(payment.amount, GetScriptForDestination(addr));
            },
            [&](const CScriptID& addr) {
                vout.emplace_back(payment.amount, GetScriptForDestination(addr));
            },
            [&](const libzcash::SaplingPaymentAddress&) { ++saplingOutputCount; },
            [&](const libzcash::OrchardRawAddress&) { ++orchardOutputCount; }
        );

which I much prefer (and this doesn't have the problem that WhitespaceSensitiveMacros: ['examine'] has that the body won't be formatted).

ConstructorInitializerAllOnOneLineOrOnePerLine: false
ConstructorInitializerIndentWidth: 4
ContinuationIndentWidth: 4
Cpp11BracedListStyle: true
DerivePointerAlignment: false
DisableFormat: false
IndentCaseLabels: false
DisableFormat: false
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider

EmptyLineAfterAccessModifier: Never
EmptyLineBeforeAccessModifier: Always

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider

FixNamespaceComments: true

Copy link
Contributor

@daira daira May 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider

ForEachMacros: ['BOOST_FOREACH', 'BOOST_REVERSE_FOREACH']

or alternatively, remove all uses of those macros (there are only three uses).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FixNamespaceComments: true

Hrmm, I thought I had that … maybe in the later commit, because here it only introduces diffs, since we haven’t been consistent thus far.

ForEachMacros: ['BOOST_FOREACH', 'BOOST_REVERSE_FOREACH']

Yeah, I would add this with a comment to remove it once #4818 is fixed.

Comment on lines +52 to +59
IncludeBlocks: Regroup
IncludeCategories:
- Regex: '<[^/.]+>'
Priority: 3
- Regex: '<.*>'
Priority: 2
- Regex: '.*'
Priority: 1
Copy link
Contributor

@daira daira May 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#includes should never be automatically reordered by a formatter; I am adamant about that. Unlike reordering Rust imports, reordering #includes can change behaviour. (I'm not the only person who thinks this.)

Suggested change
IncludeBlocks: Regroup
IncludeCategories:
- Regex: '<[^/.]+>'
Priority: 3
- Regex: '<.*>'
Priority: 2
- Regex: '.*'
Priority: 1
SortIncludes: Never

(Move SortIncludes to be in alphabetical order.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, this is probably the one I have the strongest feelings on, too 😆 (But since it’s big diffs, and because of the significance of include ordering, it should be in a separate PR anyway, so it’ll be removed and subject to further scrutiny).

Yes, until C++20 modules, interdependent includes is a serious problem. However, I think automatic ordering minimizes the problem.

First, the ordering, once applied, stays consistent. This isn’t going to shuffle includes on commits after the style is applied.

Second, it enforces “safer” default ordering

  • the “main” header for the .cpp is first, assuring that it’s self-contained (which we did not do consistently); and
  • the remaining headers are ordered “inside out”, reducing the chance that chained includes in third-party or system headers affect our local headers.

This is still pretty coarse – we could, say, order libzcash headers after other local headers (even without making more whitespace-separated groups) to reduce influence there, as well. Hell, we could order third-party headers so that any inter-dependency between them is better isolated (but hopefully they’ve done a good job already).

And finally, if there is any important ordering, it currently hasn’t been explained anywhere. Having it auto-formatted would require a // clang-format off to prevent important orderings (that deviate from the standard ones) from getting messed up. And // clang-format off should require an explanatory comment. Without the // clang-format off, the reviewer wouldn’t know that there was an important header ordering that should be documented.

Priority: 2
- Regex: '.*'
Priority: 1
IndentCaseLabels: true
IndentFunctionDeclarationAfterType: false
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider

IndentPPDirectives: AfterHash

IndentFunctionDeclarationAfterType: false
IndentWidth: 4
IndentWidth: 4
InsertBraces: true
Copy link
Contributor

@daira daira May 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 despite this warning:

Setting this option to true could lead to incorrect code formatting due to clang-format’s lack of complete semantic information. As such, extra care should be taken to review code changes made by this option.

(What's the difference between this and #include ordering? Semantic changes due to this are reviewable locally, whereas semantic changes due to reordering #includes are dependent on the content of the included files. Also, the latter may be –and often are– dependent on the platform and on implementation details of headers that can change.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider

InsertNewlineAtEOF: true

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider

InsertTrailingCommas: true

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider

InsertNewlineAtEOF: true

I think this option is added in Clang 16.

KeepEmptyLinesAtTheStartOfBlocks: false
Language: Cpp
Language: Cpp
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider

LineEnding: LF

PenaltyBreakString: 1000
PenaltyExcessCharacter: 1000000
PenaltyReturnTypeOnItsOwnLine: 200
PackConstructorInitializers: NextLine
Copy link
Contributor

@daira daira May 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
PackConstructorInitializers: NextLine
PackConstructorInitializers: NextLineOnly

This is consistent with BreakConstructorInitializers: BeforeColon.

PenaltyBreakOpenParenthesis: 1
# PenaltyBreakString: 1000
# PenaltyExcessCharacter: 1000000
PenaltyReturnTypeOnItsOwnLine: 0
PointerAlignment: Left
Copy link
Contributor

@daira daira May 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are not consistent on this, but I do prefer Left.

$ grep -IR --include='*.cpp' 'char [*]' src |wc --lines
116
$ grep -IR --include='*.cpp' 'char[*]' src |wc --lines
259

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider

SortUsingDeclarations: Lexicographic

(This doesn't have the same problem as SortIncludes.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider

SpaceAfterCStyleCast: true
SpaceBeforeCaseColon: false
SpaceBeforeInheritanceColon: true
SpaceBeforeRangeBasedForLoopColon: true
SpaceBeforeSquareBrackets: false
SpaceInEmptyBlock: false
SpacesInConditionalStatement: false

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PointerAlignment: Left

I’m fine either way, but since we’re here, I’ll state that I prefer Right for a reason

foo *x;

to me reads like “*x is a foo” (but this doesn’t generalize to C++ where foo &x should not be read as “&x is a foo”).

The related battle that I have never won (but which would maybe make the preceding argument stronger) is that

const foo *x;

should be written

foo const *x;

because in every other case, const comes after what it affects, e.g.

bool foo(int bar) const {}
foo * const *y;

In particular, const foo * const *z; is confusing to read as opposed to foo const * const *z;.

But I’m not interested in fighting for it, happy to accept Left.

SpacesBeforeTrailingComments: 1
SpacesInAngles: false
SpacesInContainerLiterals: true
SpacesInAngles: false
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
SpacesInAngles: false
SpacesInAngles: Never

Comment on lines +94 to +96
# This is a bit of a hack to get clang-format to let us do what we want with examine, even though
# it’s not a macro.
WhitespaceSensitiveMacros: ['examine']
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be deleted if we use AlignAfterOpenBracket: BlockIndent, I think. Not sure how disruptive that would be.

Comment on lines +115 to 119
if ((t == typecode)
|| (std::holds_alternative<CKeyID>(r) && std::holds_alternative<CScriptID>(receiver))
|| (std::holds_alternative<CScriptID>(r) && std::holds_alternative<CKeyID>(receiver))) {
return false;
}
Copy link
Contributor

@daira daira May 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would become

        if ((t == typecode)
            || (std::holds_alternative<CKeyID>(r) && std::holds_alternative<CScriptID>(receiver))
            || (std::holds_alternative<CScriptID>(r) && std::holds_alternative<CKeyID>(receiver)))
        {
            return false;
        }

with AfterControlStatement: MultiLine as suggested here. I like that better.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yeah, I like that a lot.

Comment on lines +191 to +193
if (nu5Active) {
result = addr;
}
Copy link
Contributor

@daira daira May 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer

            [&](const OrchardRawAddress& addr) {
                if (nu5Active) { result = addr; }
            },

if possible. Try and see whether AllowShortIfStatementsOnASingleLine: WithoutElse does that without too many side effects.

Also why is the indent for a lambda body only 2?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also why is the indent for a lambda body only 2?

Ah, just an accidental manual formatting because in this commit clang-format is simply ignoring anything between the parens of examine (WhitespaceSensitiveMacros).

Comment on lines -347 to +383
t_bytes.has_value() ? t_bytes.value().data() : nullptr,
sapling_bytes.has_value() ? sapling_bytes.value().data() : nullptr,
orchard_bytes.has_value() ? orchard_bytes.value().data() : nullptr);
t_bytes.has_value() ? t_bytes.value().data() : nullptr,
sapling_bytes.has_value() ? sapling_bytes.value().data() : nullptr,
orchard_bytes.has_value() ? orchard_bytes.value().data() : nullptr);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a case where ContinuationIndentWidth: 8 (as suggested here) would preserve the current formatting.

Comment on lines +192 to +195
UnifiedFullViewingKey(const UnifiedFullViewingKey& key)
: inner(unified_full_viewing_key_clone(key.inner.get()), unified_full_viewing_key_free)
{
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we set SplitEmptyFunction: false as suggested here, then I think this becomes

    UnifiedFullViewingKey(const UnifiedFullViewingKey& key)
        : inner(unified_full_viewing_key_clone(key.inner.get()), unified_full_viewing_key_free)
    {}

which is less verbose, while still avoiding the issue that the {} is easily missed at the end of the line.

Comment on lines +98 to 100
return 32 + // left
32 + // right
parents.size() * 32; // parents
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is nice.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is, but this is another of those cases that offends my desire to not have formatting affect lines that haven’t been changed. However, I think this is another option where I couldn’t get it to leave the formatting as-is (aligned when we do it, and unaligned otherwise), and I think aligning overall made the diff smaller.

typedef libzcash::IncrementalMerkleTree<INCREMENTAL_MERKLE_TREE_DEPTH_TESTING, libzcash::SHA256Compress> SproutTestingMerkleTree;
typedef libzcash::IncrementalMerkleTree<INCREMENTAL_MERKLE_TREE_DEPTH, libzcash::SHA256Compress>
SproutMerkleTree;
typedef libzcash::
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer a penalty for breaking here, but I don't think there's an option for that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might be able to reduce the penalty for breaking between < > or something and get it not break here that way. I definitely set one penalty to 0 (and penalties are unsigned) to allow a break where clang-format didn’t initially. So, penalties clearly don’t all default to 0.

throw std::invalid_argument("nonsensical vpub_old value");
}
#include <fstream>
#include <memory>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is another disadvantage of reordering headers, it can lead to confusing diffs (and we do need to fully review the clang-format changes).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed that we need to fully review these changes.

However, once the formatter is applied, these should remain consistent, so future diffs should not have this confusion (and in the interest of initially minimizing diffs and having discussion afterward, any ordering of includes should definitely be outside of this PR).

Comment on lines +38 to +39
)
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer ) {, but I don't think there's an option for that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I don’t think there’s an option that doesn’t also affect parameter lists without a dangling ). In this case (unless we do AlignAfterOpenBracket: BlockIndent) the comment could be moved above the parameter, or could become Doxygen in the header instead of being a comment here at all.

Comment on lines -72 to +63
// Balance must be sensical
if (inputs[i].note.value() > MAX_MONEY) {
throw std::invalid_argument("nonsensical input note value");
// The tree must witness the correct element
if (inputs[i].note.cm() != inputs[i].witness.element()) {
throw std::invalid_argument("witness of wrong element for joinsplit input");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See how the diff got completely out-of-sync here because of the header changes? Blech.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn’t because of the header changes, but because we don’t indent namespace blocks (but this one file had been). If you ignore whitespace in the diff, this whole section disappears.

const ed25519::VerificationKey& joinSplitPubKey)
{
const unsigned char personalization[blake2b::PERSONALBYTES] =
{'Z', 'c', 'a', 's', 'h', 'C', 'o', 'm', 'p', 'u', 't', 'e', 'h', 'S', 'i', 'g'};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unnecessary, but I'll live with it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could use a string literal and cast to unsigned here instead of the explicit array?

Comment on lines +163 to 174
plaintext.begin(),
NULL,
NULL,
ciphertext.begin(),
SAPLING_ENCCIPHERTEXT_SIZE,
NULL,
0,
cipher_nonce,
K)
!= 0) {
return std::nullopt;
}
Copy link
Contributor

@daira daira May 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks a bit odd, although not objectionably so. I think that with my suggested options it would end up as

    if (crypto_aead_chacha20poly1305_ietf_decrypt(
                plaintext.begin(),
                NULL,
                NULL,
                ciphertext.begin(),
                SAPLING_ENCCIPHERTEXT_SIZE,
                NULL,
                0,
                cipher_nonce,
                K
        ) != 0) {
        return std::nullopt;
    }

Comment on lines +173 to +175
a.g_A == b.g_A && a.g_A_prime == b.g_A_prime && a.g_B == b.g_B
&& a.g_B_prime == b.g_B_prime && a.g_C == b.g_C && a.g_C_prime == b.g_C_prime
&& a.g_K == b.g_K && a.g_H == b.g_H);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is worse, but I guess it's fine.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dislike that this keeps some operators of the same precedence on the same line but not others. Maybe there’s an option to always break operators of matching precedence (if they’re too long for the line).

Comment on lines +181 to +184
static constexpr size_t GROTH_PROOF_SIZE =
(48 + // π_A
96 + // π_B
48); // π_C
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should remove the parens here, no? Seems like there must be an option for that.

Comment on lines +21 to +25
#define ZC_ZIP225_ORCHARD_BASE_SIZE \
(ZC_ZIP225_ORCHARD_NUM_ACTIONS_SIZE + ZC_ZIP225_ORCHARD_FLAGS_SIZE \
+ ZC_ZIP225_ORCHARD_VALUE_BALANCE_SIZE + ZC_ZIP225_ORCHARD_ANCHOR_SIZE \
+ ZC_ZIP225_ORCHARD_SIZE_PROOFS_BASE_SIZE + ZC_ZIP225_ORCHARD_PROOF_BASE_SIZE \
+ ZC_ZIP225_ORCHARD_BINDING_SIG_SIZE)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very good!

seed.resize(64);
return zip339_phrase_to_seed(language, mnemonic.c_str(), (uint8_t (*)[64])seed.data());
return zip339_phrase_to_seed(language, mnemonic.c_str(), (uint8_t(*)[64])seed.data());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure why this isn't (uint8_t(*)[64]) seed.data() (with the space), when a space is added e.g. at line 78 in this file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see a space removed at line 78.

Pretty sure I set the config to not have a space after a cast. But with some more -W changes, we should end up getting rid of C-style casts altogether.


READWRITE(language0);
READWRITE(mnemonic);
}
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch :-)

Comment on lines +99 to +101
return (
a.ak < b.ak || (a.ak == b.ak && a.nk < b.nk)
|| (a.ak == b.ak && a.nk == b.nk && a.ovk < b.ovk));
Copy link
Contributor

@daira daira May 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would have preferred

        return (a.ak < b.ak
                || (a.ak == b.ak && a.nk < b.nk)
                || (a.ak == b.ak && a.nk == b.nk && a.ovk < b.ovk));

although I guess that requires more semantic knowledge of C++ (because the reason I prefer it is that each line is a separate argument of the logical || with three arguments, and that requires knowing that || is associative probably). Anyway it's not a big deal.

Sapling = 0x02,
Orchard = 0x03
};
enum class ReceiverType : uint32_t { P2PKH = 0x00, P2SH = 0x01, Sapling = 0x02, Orchard = 0x03 };
Copy link
Contributor

@daira daira May 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I'd prefer enum items always on separate lines. I think that's AllowShortEnumsOnASingleLine: false.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed.

AllowShortBlocksOnASingleLine: false
AllowShortFunctionsOnASingleLine: All
AllowShortBlocksOnASingleLine: Empty
AllowShortCaseLabelsOnASingleLine: true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider

AllowShortEnumsOnASingleLine: false

Comment on lines +132 to +133
throw std::runtime_error("SaplingDiversifiableFullViewingKey::DefaultAddress(): No valid "
"diversifiers out of 2^88!");
Copy link
Contributor

@daira daira May 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer

        throw std::runtime_error(
                "SaplingDiversifiableFullViewingKey::DefaultAddress(): No valid diversifiers out of 2^88!"
        );

(i.e. don't automatically break strings).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a few options around this.

I think we can tell it to not automatically break strings, we can also change the penalties around breaking strings and/or allowing lines longer than the specified max. Probably some tweaking involved to get what we want.

E.g., I think not breaking strings means that it’ll unite all manually-broken strings – which may be good, then we explicitly + when we want to concat them?

throw std::runtime_error(std::string(__func__) + ": diversifier index overflow.");;
if (!j.increment()) {
throw std::runtime_error(std::string(__func__) + ": diversifier index overflow.");
};
Copy link
Contributor

@daira daira May 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

8519bca in #6641 will force getting rid of the extraneous ;.

Copy link
Contributor

@daira daira left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, made some suggestions.

Copy link
Contributor Author

@sellout sellout left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just noticed I hadn’t submitted this response, so here it is.

I don’t have strong preferences with most of your suggestions. Number 1 for me is just some automated formatting so I never have to think about it. But I think the first pass of this PR should drop my last commit (adding my preferences) and integrate more of your suggestions here.

Basically, step 1 should be what my first commit message says – match our documented style, then minimize the diffs1 on top of that. Then we can start making explicit changes to our style, in PRs that only change a few interconnected clang-format options at a time.

I’ve gone through your review cursorily, mostly just to capture my thoughts on your suggestions, but again, most of the things that trigger discussion should probably be deferred (unless it’s clearly just a gap/inconsistency in the existing style).

Footnotes

  1. But we should capture more than just libzcash when figuring out those diffs – draw a clearer line between “backportable” files and ones where we control the formatting. Ideally adding a check like we have for rustfmt.

AllowShortLambdasOnASingleLine: All
AllowShortLoopsOnASingleLine: false
AlwaysBreakBeforeMultilineStrings: false
AlwaysBreakTemplateDeclarations: true
BinPackArguments: false
BinPackParameters: false
BreakBeforeBinaryOperators: None # I’d prefer NonAssignment
BreakBeforeBinaryOperators: NonAssignment
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One of my preferences with style (which I violate constantly anyway, but an auto-formatter helps me not violate) is that changes to the code should cause minimal formatting changes on other lines1.

Things like AlignAfterOperator here violate that. To use the example from clang-format, changing the name of the variable from aaa to aaab would reformat the second line of

int aaa = bbbbbbbbbbbbbbb
        + ccccccccccccccc;

but not of

int aaa = bbbbbbbbbbbbbbb
    + ccccccccccccccc;
int aaa =
    bbbbbbbbbbbbbbb + ccccccccccccccc;

etc.

Footnotes

  1. I miss having semantic diff like I did with lisp, where line breaks, indentation, etc. are ignored (if they’re not syntactically significant) and diffs are done by token/expression rather than by char/line.

BreakBeforeBraces: Linux
BreakBeforeTernaryOperators: true
BreakConstructorInitializers: AfterColon # I’d prefer BeforeColon
BreakConstructorInitializers: BeforeColon
ColumnLimit: 100 # match rustfmt, if we don’t define this, then we don’t get Wadler
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heh, true enough! (although hopefully having a GH check for this would minimize that in future)

Here I meant Wadler-Leijen pretty-printing, as opposed to … the other style that I can never remember the name of. Basically, setting a line length here avoids the need for “hints” to the formatter that tell it whether it should keep, say, an argument list on one line (regardless of length), or do one-line-per-argument (if there is a user-introduced newline somewhere in the arglist).

We don’t actually get Wadler-style from clang-format, but this gets it closer.

ConstructorInitializerAllOnOneLineOrOnePerLine: false
ConstructorInitializerIndentWidth: 4
ContinuationIndentWidth: 4
Cpp11BracedListStyle: true
DerivePointerAlignment: false
DisableFormat: false
IndentCaseLabels: false
DisableFormat: false
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FixNamespaceComments: true

Hrmm, I thought I had that … maybe in the later commit, because here it only introduces diffs, since we haven’t been consistent thus far.

ForEachMacros: ['BOOST_FOREACH', 'BOOST_REVERSE_FOREACH']

Yeah, I would add this with a comment to remove it once #4818 is fixed.

PenaltyBreakOpenParenthesis: 1
# PenaltyBreakString: 1000
# PenaltyExcessCharacter: 1000000
PenaltyReturnTypeOnItsOwnLine: 0
PointerAlignment: Left
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PointerAlignment: Left

I’m fine either way, but since we’re here, I’ll state that I prefer Right for a reason

foo *x;

to me reads like “*x is a foo” (but this doesn’t generalize to C++ where foo &x should not be read as “&x is a foo”).

The related battle that I have never won (but which would maybe make the preceding argument stronger) is that

const foo *x;

should be written

foo const *x;

because in every other case, const comes after what it affects, e.g.

bool foo(int bar) const {}
foo * const *y;

In particular, const foo * const *z; is confusing to read as opposed to foo const * const *z;.

But I’m not interested in fighting for it, happy to accept Left.

Comment on lines +191 to +193
if (nu5Active) {
result = addr;
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also why is the indent for a lambda body only 2?

Ah, just an accidental manual formatting because in this commit clang-format is simply ignoring anything between the parens of examine (WhitespaceSensitiveMacros).

seed.resize(64);
return zip339_phrase_to_seed(language, mnemonic.c_str(), (uint8_t (*)[64])seed.data());
return zip339_phrase_to_seed(language, mnemonic.c_str(), (uint8_t(*)[64])seed.data());
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see a space removed at line 78.

Pretty sure I set the config to not have a space after a cast. But with some more -W changes, we should end up getting rid of C-style casts altogether.

Sapling = 0x02,
Orchard = 0x03
};
enum class ReceiverType : uint32_t { P2PKH = 0x00, P2SH = 0x01, Sapling = 0x02, Orchard = 0x03 };
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed.

Comment on lines +132 to +133
throw std::runtime_error("SaplingDiversifiableFullViewingKey::DefaultAddress(): No valid "
"diversifiers out of 2^88!");
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a few options around this.

I think we can tell it to not automatically break strings, we can also change the penalties around breaking strings and/or allowing lines longer than the specified max. Probably some tweaking involved to get what we want.

E.g., I think not breaking strings means that it’ll unite all manually-broken strings – which may be good, then we explicitly + when we want to concat them?

Comment on lines +52 to +59
IncludeBlocks: Regroup
IncludeCategories:
- Regex: '<[^/.]+>'
Priority: 3
- Regex: '<.*>'
Priority: 2
- Regex: '.*'
Priority: 1
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, this is probably the one I have the strongest feelings on, too 😆 (But since it’s big diffs, and because of the significance of include ordering, it should be in a separate PR anyway, so it’ll be removed and subject to further scrutiny).

Yes, until C++20 modules, interdependent includes is a serious problem. However, I think automatic ordering minimizes the problem.

First, the ordering, once applied, stays consistent. This isn’t going to shuffle includes on commits after the style is applied.

Second, it enforces “safer” default ordering

  • the “main” header for the .cpp is first, assuring that it’s self-contained (which we did not do consistently); and
  • the remaining headers are ordered “inside out”, reducing the chance that chained includes in third-party or system headers affect our local headers.

This is still pretty coarse – we could, say, order libzcash headers after other local headers (even without making more whitespace-separated groups) to reduce influence there, as well. Hell, we could order third-party headers so that any inter-dependency between them is better isolated (but hopefully they’ve done a good job already).

And finally, if there is any important ordering, it currently hasn’t been explained anywhere. Having it auto-formatted would require a // clang-format off to prevent important orderings (that deviate from the standard ones) from getting messed up. And // clang-format off should require an explanatory comment. Without the // clang-format off, the reviewer wouldn’t know that there was an important header ordering that should be documented.

IndentFunctionDeclarationAfterType: false
IndentWidth: 4
IndentWidth: 4
InsertBraces: true
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider

InsertNewlineAtEOF: true

I think this option is added in Clang 16.

@daira daira self-assigned this Jun 9, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants