Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

about serialization of bn254 curve point #109

Open
nanne007 opened this issue Dec 2, 2023 · 10 comments
Open

about serialization of bn254 curve point #109

nanne007 opened this issue Dec 2, 2023 · 10 comments
Labels
discussion Discussion that may not have an action T-design Type: Discussion over API/implementation design and/or research associated

Comments

@nanne007
Copy link

nanne007 commented Dec 2, 2023

Recently I'm working on a project which involved both halo2curves and arkworks.rs.
I found that the serialization of bn254 curve point like G1 are different in these two projects.
when serializing, halo2curves add the negative sign as the 6th bit of last bytes, and point of infinity as the 7th bit. see

fn to_bytes(&self) -> Self::Repr {

while arkworks.rs does the opposite(6th for point of infinity, 7th for negative sign). see https://github.com/arkworks-rs/algebra/blob/860a986360a1deb19a4d06b991a1a700d34b1298/ec/src/models/short_weierstrass/serialization_flags.rs#L7

I had dealt the subtle difference in my project.
However, I wonder if there exists a common standard for (de)serialization Field and Curve Point?
If there isn't, I wonder how could we change the situation, and unify the serialization in many other zk projects and make proof parsing more portable?

@nanne007
Copy link
Author

nanne007 commented Dec 2, 2023

related discussion on arkworks arkworks-rs/algebra#708

@CPerezz
Copy link
Member

CPerezz commented Dec 4, 2023

Ohhh! That's really unfortunate!
@kilic was the original creator of the bn256 inclusion PR. So he might know better about this specific nuance.

I agree we should make it unified such that we don't bring extra trouble to the usears.

Do you have any specific formatting proposal BTW @nanne007 ?

@nanne007
Copy link
Author

nanne007 commented Dec 4, 2023

Thank you for the reply!
Currently dont have any particular proposal in mind.

In fact, the serialization(compressed version) of bn254 in both projects are very similar except the bit sign.
I think if we can agree on this, maybe a common format of bn254 can be derived from the two projects.

The zcash serialization of bls12-381 sets a good standard here. https://www.ietf.org/archive/id/draft-irtf-cfrg-pairing-friendly-curves-11.html#name-zcash-serialization-format-

But for bn254, it can be harder, as there're already many different impls used on production.
apart from these two projects, ethereum uses big endian encoding for Field, see https://eips.ethereum.org/EIPS/eip-196#encoding .
This can be solved by add an option about le or be when calling serialization function like to_bytes.
However, adding other option for the bit sign of a curve point's negative of y and point of infinity seems a little inconvenient.

@CPerezz
Copy link
Member

CPerezz commented Dec 4, 2023

But for bn254, it can be harder, as there're already many different impls used on production.
apart from these two projects, ethereum uses big endian encoding for Field, see https://eips.ethereum.org/EIPS/eip-196#encoding .

I've stopped myself from ranting against BE. Just letting you know I hate this... It should've all been LE.

On the rest, I agree, we can solve this on this way or another. I think we can flip the bit order in the compressed form such that the two ecosystems match together.
For the general formatting what we can do is offer a general serialization impl which has arbitrary endianness based on an option as you say and sign compression (it's the default IMO).

For the rest, we can define an SerializedExt trait or whatever where we can implement extra serialization options which are more optimal for certain use cases each.

Any idea how could we formalize this a bit more? We'll be happy to jump into a call, clarify this and update it!
At the end, a smooth transition from one ecosystem to the other would be the most desired thing to have!

@guangyuz
Copy link

guangyuz commented Dec 5, 2023

Unifying the formatting makes sense for the community. Add @huyuncong , maybe he knows someone from arkworks who can take a look.

@CPerezz
Copy link
Member

CPerezz commented Dec 5, 2023

Maybe we can try summoning @Pratyush and he can orient us. Also curious to know his thoughts.

@mratsim
Copy link
Contributor

mratsim commented Dec 11, 2023

When implementing curve serialization, I have seen 3 different cases:

I've stopped myself from ranting against BE. Just letting you know I hate this... It should've all been LE.

BigInts/Fields are always serialized in BE, it's also natural when hex debugging. See I2OSP and OS2IP from RSA spec: https://www.rfc-editor.org/rfc/rfc8017.html#section-4

Additionally, if needed, Fp2 and Fp12 serializations might need standardization as well.

For Fp2, Zcash serializes extension field first, i.e. the i element then the base field.

For Fp12, the serialization should be canonical instead of being tied to a specific towering Fp2 -> Fp6 -> Fp12 vs Fp2 -> Fp4 -> Fp12. See discussion in supranational/blst#101 (comment)

@davidnevadoc
Copy link
Contributor

davidnevadoc commented Dec 11, 2023

There is an extra case, the curves that have no extra bits for flags, like secp256k1. For this curve we are adding an extra byte to hold the flags in both the compressed an uncompressed format. Bitcoin's implementation does this as well:
https://github.com/bitcoin-core/secp256k1/blob/77af1da9f631fa622fb5b5895fd27be431432368/src/eckey_impl.h#L37

I found a couple more issues.
The infinity flag is the 7th bit in the compressed form:

let is_inf = Choice::from(tmp[[< $name _COMPRESSED_SIZE >] - 1] >> 7);

But it is the 6th in the uncompressed form:
let infinity_flag_set = Choice::from((bytes[[< $name _UNCOMPRESSED_SIZE >] - 1] >> 6) & 1);

The way of determining the sign seems not to be agreed upon either. In fact, we are not getting the sign but the parity of the y-coordinate.

@mratsim
Copy link
Contributor

mratsim commented Dec 11, 2023

The way of determining the sign seems not to be agreed upon either. In fact, we are not getting the sign but the parity of the y-coordinate.

Note that "sign" is a convention that splits a cyclic group in 2 halves.

When standardizing hash-to-curve, the "sign" was changed to mean x mod 2:

@nanne007
Copy link
Author

nanne007 commented Dec 12, 2023

and should notice that sign of bn254 curve in halo2curves and arkworks have different meaning

in arkworks:

the negative sign always means the bigger one, see the code:

https://github.com/arkworks-rs/algebra/blob/ee0a4078b67037b4c4d22d330113a5d7c28fd6ce/ec/src/models/short_weierstrass/affine.rs#L160

while in halo2curves:
sign is the smallest bit of Y, as y and -y have different even-old parity, so use last bit can distinguish them.
see the code:

https://github.com/privacy-scaling-explorations/halo2curves/blob/a3f15e4106c8ba999ac958ff95aa543eb76adfba/src/derive/curve.rs#L145C35-L145C35

That means one canno distinguish current y is the smaller one or the bigger one based on the sign in halo2curves impl.

@CPerezz CPerezz added discussion Discussion that may not have an action T-design Type: Discussion over API/implementation design and/or research associated labels Dec 15, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
discussion Discussion that may not have an action T-design Type: Discussion over API/implementation design and/or research associated
Projects
None yet
Development

No branches or pull requests

5 participants