odb: A couple internal improvements to body parsing/rendering #100

Open
wants to merge 9 commits into
base: main

rwe
Contributor

@rwe rwe commented Sep 29, 2021

A few miscellaneous fixes related to Commit.body and Tree.body:

  • Fix: space-indented continuation lines in commit header values were handled on parsing, but not restored on construction.
  • Fix: Commit header key/value parsing used a bare split(maxsplit=1), which splits on any run of whitespace and so incorrectly drops leading whitespace from the value. It now explicitly calls .split(b" ", maxsplit=1) to designate exactly one space character as the separator. This is consistent with how git internally parses those values.
  • Style/performance: Previously, bodies were constructed by repeatedly appending to an immutable bytestring. This costs O(n^2) in copying, since each append copies everything accumulated so far. (Some Python runtimes apparently detect this if the variable isn't used for anything else, but as far as I can find it's still frowned upon.) Now, the body parts are yielded from a generator, which is passed to b"".join(…) to construct the contiguous bytestring all at once, giving the internals more leeway for buffer handling.
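
To illustrate the second fix, here is a minimal sketch of the difference between the two split calls. The helper name `split_header` and the sample header line are hypothetical and are not taken from this branch; only the two `bytes.split` call forms come from the description above.

```python
from typing import Tuple


def split_header(line: bytes) -> Tuple[bytes, bytes]:
    """Split a header line into (key, value) at the first space only.

    Hypothetical helper for illustration; not the code in this PR.
    """
    key, value = line.split(b" ", maxsplit=1)
    return key, value


# A made-up header whose value starts with a space (note the two spaces).
line = b"custom  value with leading space"

# Whitespace-agnostic split: the run of spaces is treated as one separator,
# so the leading space of the value is lost.
assert line.split(maxsplit=1) == [b"custom", b"value with leading space"]

# Explicit single-space separator: the value keeps its leading space.
assert split_header(line) == (b"custom", b" value with leading space")
```

With the explicit separator, the key and value can be joined back with a single b" " to reproduce the original line byte for byte.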

@rwe
Contributor Author

rwe commented Oct 6, 2021

I've rebased this to cover the gpgsig changes along with a couple tiny related cleanups in that area.

@rwe
Contributor Author

rwe commented Jan 10, 2022

This branch has been good to go for a few months, but I've pushed up a rebase to fix a mypy checking issue caused by a conflict with the recently added sh_run utility function.

The space-indented header value continuation lines were handled on
parsing, but not restored on construction.
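
As a rough sketch of what that round trip involves, assuming the usual commit header layout in which each continuation line of a multi-line value is prefixed with a single space: the names `parse_headers`, `header_parts`, and `render_headers` and the sample data are hypothetical, not this branch's actual code. The rendering side also yields its parts from a generator and joins them once with `b"".join`.

```python
from typing import Iterable, Iterator, List, Tuple

Header = Tuple[bytes, bytes]


def parse_headers(raw: bytes) -> List[Header]:
    """Parse a header block, folding space-indented continuation lines."""
    headers: List[Header] = []
    for line in raw.split(b"\n"):
        if not line:
            break  # an empty line ends the header block
        if line.startswith(b" ") and headers:
            # Continuation: drop the one-space indent, append to the last value.
            key, value = headers[-1]
            headers[-1] = (key, value + b"\n" + line[1:])
        else:
            key, value = line.split(b" ", maxsplit=1)
            headers.append((key, value))
    return headers


def header_parts(headers: Iterable[Header]) -> Iterator[bytes]:
    """Yield the byte fragments of a header block, one piece at a time."""
    for key, value in headers:
        yield key
        yield b" "
        # Restore the one-space indent on every continuation line.
        yield value.replace(b"\n", b"\n ")
        yield b"\n"


def render_headers(headers: Iterable[Header]) -> bytes:
    """Build the header block with a single join instead of repeated +=."""
    return b"".join(header_parts(headers))


# Made-up sample: a two-header block with a multi-line gpgsig-style value.
raw = b"tree abc\ngpgsig -----BEGIN\n \n data\n -----END\n"
assert render_headers(parse_headers(raw)) == raw
```

Without the `replace` step in `header_parts`, the multi-line value would be written back with unindented lines, and a later parse would misread them as new headers.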

Note that the comment "Strip CR…" is the same description as in the
git/gpg-interface.c C source.

Although `sign_buffer` is (roughly) the corresponding C function name,
that refers to *mutating* a buffer rather than returning a signature
header value.