Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How to replicate signatures from other implementations / verify correctness? #46

Closed
thejohnfreeman opened this issue Jan 15, 2020 · 17 comments
Labels

Comments

@thejohnfreeman
Copy link

I'm looking for a Python library for ECDSA. I need to match the output of projects in other languages using different libraries for ECDSA. I took some known (key, message, signature) "test vectors" from this ECDSA library in Go and was able to replicate them (or at least the first and fourth) with elliptic in JavaScript, to verify that they work the same. However, I can't seem to do the same with fastecdsa.

How would you write a function that, given two 32-byte bytes values for both the key and the message digest, returns a DER-encoded bytes signature? I tried this (paraphrased):

from fastecdsa import curve, ecdsa

def sign(private_key_bytes: bytes, message_digest_bytes: bytes) -> bytes:
    private_key = int.from_bytes(private_key_bytes, byteorder='big')
    r, s = ecdsa.sign(
        message_digest_bytes.hex(), private_key, curve=curve.secp256k1, prehashed=True
    )
    r_bytes = r.to_bytes(32, byteorder='big')
    s_bytes = s.to_bytes(32, byteorder='big')
    # Hacky hard-coded DER encoding that works for the two cases I'm testing.
    return b'\x30\x44' + b'\x02\x20' + r_bytes + b'\x02\x20' + s_bytes

You can see it failing on the two test vectors I copied from the Go project.

@AntonKueltz
Copy link
Owner

Have you tried using the fastecdsa.encoding.der.DEREncoder class, specifically the encode_signature(r, s) method? The resulting byte encoding should conform with RFC2459.

I'm also not sure message_hash_bytes.hex() is correct, I guess it depends on if the go test vector you're using is a hex encoding of bytes (this would be my guess) or if it's a utf-8 encoded string that happens to only be characters that are also valid hex values. If it's the former you could do binascii.unhexlify(hexstring) to get the raw bytes.

@thejohnfreeman
Copy link
Author

thejohnfreeman commented Jan 16, 2020

Good point on the encoder. I missed that, but added it now. Now my sign function looks like this:

def sign(private_key_bytes: bytes, message_hash_bytes: bytes) -> bytes:
    private_key = int.from_bytes(private_key_bytes, byteorder='big')
    r, s = ecdsa.sign(
        message_hash_bytes.hex(),
        private_key,
        curve=curve.secp256k1,
        prehashed=True,
    )
    return DEREncoder.encode_signature(r, s)

The reason for the call to .hex() is because I peeked behind the curtain to see how the prehashed parameter is handled. When False, the sign method passes a .hexdigest() to the C FFI, so I figured I should match that. If I try to pass the bytes directly, I get a type error:

msg = b'\x9eWU\xec/2\x8c\xc8cZUA]\x0e\x9a\t\xc2\xb6\xf2\xc9\xb04<\x94_\xbb\xfe\x08$zL\xbe'
d = 224606400986705160...1487958639159536557, curve = secp256k1
hashfunc = <built-in function openssl_sha256>, prehashed = True

    def sign(msg: MsgTypes, d: int, curve: Curve = P256, hashfunc=sha256, prehashed: bool = False):
        """Sign a message using the elliptic curve digital signature algorithm.
    
        The elliptic curve signature algorithm is described in full in FIPS 186-4 Section 6. Please
        refer to http://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.186-4.pdf for more information.
    
        Args:
            |  msg (str|bytes|bytearray): A message to be signed.
            |  d (int): The ECDSA private key of the signer.
            |  curve (fastecdsa.curve.Curve): The curve to be used to sign the message.
            |  hashfunc (_hashlib.HASH): The hash function used to compress the message.
        """
        # generate a deterministic nonce per RFC6979
        rfc6979 = RFC6979(msg, d, curve.q, hashfunc)
        k = rfc6979.gen_nonce()
    
        # add a prehash option
        if not prehashed:
            hashed = hashfunc(msg_bytes(msg)).hexdigest()
        else:
            hashed = msg
    
        r, s = _ecdsa.sign(
            hashed,
            str(d),
            str(k),
            str(curve.p),
            str(curve.a),
            str(curve.b),
            str(curve.q),
            str(curve.gx),
>           str(curve.gy)
        )

E       TypeError: argument 1 must be str, not bytes
../../../virtualenv/python3.7-dev/lib/python3.7/site-packages/fastecdsa/ecdsa.py:48: TypeError

@AntonKueltz
Copy link
Owner

This looks like a legitimate bug with bytes being passed as pre hashed messages. I’ll need to take a look at the C bindings, they should really be operating on bytes and not strings. Will try to work the issue this weekend.

@AntonKueltz
Copy link
Owner

AntonKueltz commented Jan 18, 2020

I fixed the pre-hashing issue but there's still an issue with the DER encoder generating bad ASN.1 so I'll keep this open. I'll have to look at the implementation, I think that code was from a PR. The tests for the implementation pass but they may not actually be valid, there's some odd corner cases around the most significant bit of the most significant byte being 1 which seem to be trying to handle signed integers (but we only work with unsigned integers since all negative numbers have an equivalent positive number in the field we're working in)

On a side note, I strongly dislike exposing pre-hashing in the interface because it confuses the interface massively. What I mean is it's not clear what the function expects for a pre-hashed message... is it bytes? a string? hex encoded (like most hash digests are commonly shown)? an integer? It's not clear because ECDSA is not defined for pre-hashed messages. I may actually end up removing support for this because it's confusing and doesn't have many legitimate uses (just sign the message and indicate the hash algorithm).

@thejohnfreeman
Copy link
Author

From what I've read, DER encodes all integers as signed integers; it does not distinguish. So if one wants to encode an unsigned integer with a most significant 1 bit, one must prefix it with a zero byte, extending its length by 1.

I'm fine if the prehashed option is removed, as long as the hashfunc is overridable. In some places, I use a custom IdentityHash that returns the "message" unchanged, so I can still pass a digest to the signing function.

@AntonKueltz
Copy link
Owner

Sorry for the late reply. You're correct, unsigned integers need a leading 0 bit. I'll look into making sure the DER encoding is actually conforming to test vectors. These should be part of the test suite to begin with, seems like the overall encoding tested against known values got missed as a case when initially adding DER encoding.

Overriding hashfunc could be tricky since it is also used by the nonce generation. At a minimum an identity hash function would need to implement the same interface as all the other hash functions in hashlib so as to not have issues with undefined methods being called.

@thejohnfreeman
Copy link
Author

Indeed. You can see my IdentityHash here.

@AntonKueltz
Copy link
Owner

AntonKueltz commented Feb 20, 2020

I probably should have checked this a lot earlier - The root of the issue is that the raw r and s values of the signatures differ between the libraries. So the mismatch is occurring pre encoding. Do you know if the libraries you're using are using RFC6979 to generate nonces and using sha256 as the hash for the HMAC in that algorithm?

@thejohnfreeman
Copy link
Author

The libraries I've been testing are fastecdsa, ecdsa, and cryptography (and I have eyes on starkbank-ecdsa and ecpy). You can see the uniform interface I built for each as test fixtures, and the tests using those fixtures.

ecdsa says it uses RFC6979 for the method sign_deterministic, which is the method I call. Reading the source, it uses the same hash function for both the message digest and the nonce, but I need to use a custom hash for the digest (first half of SHA-512). I guess I'll have to call sign_digest_deterministic with a digest I compute for myself, passing SHA-256 for the nonce hash to match fastecdsa. Is it possible for fastecdsa to use the same hash for both? I don't believe RFC6979 mandates SHA-256, though its example does match the behavior of ecdsa:

HMAC [RFC2104] is a construction of a Message Authentication Code
using a hash function and a secret key. Here, we use HMAC with the
same hash function H as the one used to process the input message
prior to signature generation or verification.

cryptography doesn't mention RFC6979 or determinism at all.

@thejohnfreeman
Copy link
Author

I added a test after changing my call to sign_digest_deterministic with SHA-256, and it still fails to match fastecdsa.

@AntonKueltz
Copy link
Owner

Still looking into this, haven’t had too much success root causing.

@AntonKueltz
Copy link
Owner

Finally figured this one out, issue is with prehashed values. RFC6979 indicates that the message passed to it should be hashed, but prehashed ECDSA interprets this as meaning only if the message is prehashed, so the step isn't applied for prehashed signs.

This ended up breaking everything since k is generated "incorrectly", and everything afterwards is wrong as a result.

@AntonKueltz
Copy link
Owner

Fixed in 033ab59. Closing this out, let me know if you continue to have issues.

@thejohnfreeman
Copy link
Author

I install fastecdsa through PyPI; will you be publishing a new version with that change?

@AntonKueltz
Copy link
Owner

It's going to be a bit until the next release (I don't have access to the machine with my signing keys for the time being). But yes, this will be published to pypi in the next release.

@AntonKueltz
Copy link
Owner

AntonKueltz commented Apr 9, 2020

@thejohnfreeman v2.1.1 is now available via pypi with this fix. Sorry it took a while to get this diagnosed and implemented,.

@thejohnfreeman
Copy link
Author

I can confirm v2.1.2 let me replicate signatures. Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants