Generate p and q of exactly half the number of bits of N #98

Open
joostrijneveld opened this issue Sep 21, 2017 · 8 comments

Comments

@joostrijneveld
Contributor

joostrijneveld commented Sep 21, 2017

Currently, several functions in rsa.key have an accurate flag that signifies that the number of bits
should be precisely adhered to. In particular, newkeys and gen_keys have such a flag, which defaults to True. In the respective documentation, I read the following fragments:

:param accurate: when True, ``n`` will have exactly the number of bits you
    asked for. However, this makes key generation much slower. When False,
    ``n`` may have slightly less bits.
:param nbits: the total number of bits in ``p`` and ``q``. Both ``p`` and
    ``q`` will use ``nbits/2`` bits.

When I call gen_keys(2048), I am presented with an n of 2048 bits, as expected. However, I receive a p of 1088 bits, and a q of 960 bits (these values are constant across runs; I'll get to that later).

When I then inspect the internals of gen_keys, I notice that it's calling find_p_q(nbits // 2, getprime_func, accurate) (which, for my call, comes down to find_p_q(1024, rsa.prime.getprime, True)). This function carries the following documentation, none of which promises a p and q of 1024 bits, although the nbits parameter does imply it:

The resulting p * q has exactly 2 * nbits bits, and the returned p and q
will not be equal.
:param nbits: the number of bits in each of p and q.
:param accurate: whether to enable accurate mode or not.

I think it's important to:

  • explicitly and clearly document the behavior that users can expect;
  • offer a way to generate p and q of exactly half the size of the modulus, for compatibility with other implementations. In particular, hardware-based implementations that rely on private keys in CRT form often have this constraint;
    • do this in a secure way, i.e. still choosing primes that are far enough apart.

Note that while PKCS#1 does not require p and q to have a bit-length of half the bit-length of the modulus, FIPS 186-4 does require it (Appendix B.3.1, top of page 53) by requiring \sqrt{2} \cdot 2^{n/2 - 1} \le p \le 2^{n/2} - 1.

Upon closer inspection, it quickly becomes clear why p and q have these lengths, as find_p_q contains the following snippet:

# Make sure that p and q aren't too close or the factoring programs can
# factor n.
shift = nbits // 16
pbits = nbits + shift
qbits = nbits - shift

Combining this with the fact that rsa.prime.getprime simply applies rejection sampling to find a prime of exactly nbits quickly explains why I'm seeing 1088-bit and 960-bit primes.
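For reference, this is easy to reproduce (a minimal sketch assuming the rsa.newkeys() API and the p/q attributes on the returned private key; the expected bit lengths follow from the shift above):

import rsa

(pub, priv) = rsa.newkeys(2048)
# find_p_q is called with nbits // 2 = 1024, so shift = 1024 // 16 = 64.
print(priv.n.bit_length())  # 2048
print(priv.p.bit_length())  # 1088 (1024 + 64)
print(priv.q.bit_length())  # 960  (1024 - 64)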

However, as the comment notes, simply sampling p and q to be 1024 bits each could result in them being very close together (although I'm not yet sure of the precise requirements and/or statistical properties). I realize this issue is not as trivial as it may sound based on the title. Maybe FIPS 186-4 provides sufficient handholds to come to a secure and bit-accurate generation of p and q, though.

If you have ideas about what the API should look like (should the FIPS 186-4 behavior just be default for accurate=True?), let me know - I may spend a moment to try to address this.

@sybrenstuvel
Owner

Good comments, thanks.

@RichardThiessen

RichardThiessen commented Oct 8, 2017

# Make sure that p and q aren't too close or the factoring programs can
# factor n.

This is a misconception. Current state of the art suggests choosing P and Q at random from the range of n/2 bit primes. This guarantees with overwhelming probability that P and Q will be "far apart enough" for the key to be strong.

FIPS 186-4 has additional requirements (the range and distance checks are sketched in code after this list):

  • provable primes (keysize<=1024)
  • p and q must be strong primes (large prime factors in (p-1),(p+1),(q-1),(q+1))
  • abs(p-q)>2**(keysize/2-100)
  • 2**(keysize/2)-1 > p > 2**(keysize/2)/sqrt(2) (for q as well)
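
To make the last two requirements concrete, here's a minimal sketch of a checker (a hypothetical helper, not python-rsa code; it assumes an even nlen and uses math.isqrt to avoid floats):

from math import isqrt

def fips_range_and_distance_ok(p, q, nlen):
    # Range and |p - q| distance criteria from FIPS 186-4 for an nlen-bit modulus.
    half = nlen // 2
    # Smallest integer strictly greater than sqrt(2) * 2**(half - 1), for even nlen.
    lower = isqrt(2 ** (nlen - 1)) + 1
    upper = 2 ** half - 1
    in_range = lower <= p <= upper and lower <= q <= upper
    far_apart = abs(p - q) > 2 ** (half - 100)
    return in_range and far_apart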

The strong prime and provable prime requirements are not really necessary. To quote a 1999 paper:

The gist of this paper is thus that strong primes offer little protection beyond that offered by random primes. On the other hand, aside from the additional effort required to generate them, there seems to be no technical reason not to use strong primes if one so desires. But we see no rationale for requiring the use of strong primes in public key RSA standards. We note that the conclusions in this paper reflect the current state of knowledge about factoring. It is entirely possible that future developments might change these conclusions in either direction. A new factoring algorithm might again make strong primes seem desirable or it might make strong primes seem particularly dangerous.

This hasn't changed. So implementing the additional checks isn't necessary. (OpenSSL doesn't do them)

On the other hand, the constraint used by FIPS 186-4 that primes lie between 2^(n/2)/sqrt(2) and 2^(n/2) is clever. It ensures the resulting modulus fits exactly in the key bitsize with no additional checks or logic. This would greatly simplify the key generation logic.
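
A quick sanity check of that claim (a hedged sketch: if both primes lie in [ceil(2**k / sqrt(2)), 2**k - 1], the product always has exactly 2*k bits):

from math import isqrt

for k in (8, 16, 32, 512, 1024):
    lo = isqrt(2 ** (2 * k - 1)) + 1  # smallest integer > 2**k / sqrt(2)
    hi = 2 ** k - 1
    # The smallest and largest possible moduli both have exactly 2*k bits,
    # so any p, q in [lo, hi] give a 2*k-bit modulus.
    assert (lo * lo).bit_length() == 2 * k
    assert (hi * hi).bit_length() == 2 * k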

I have working code with fixes for this and performance improvements for key generation. Unfortunately, this is my first git contribution, so I'm still figuring out the workflow.

@sybrenstuvel
Owner

@RichardThiessen thanks for the FIPS 186-4 reference.

I'm working on an implementation. It's working pretty well, except that the multiplication by sqrt(2) is problematic: for 4096-bit keys it's no longer possible to convert the numbers to floats.
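
For illustration (hedged: just a demonstration of the float limitation, not library code), the lower bound for a 4096-bit key is sqrt(2) * 2**2047, and an integer that large can no longer be converted to a double:

import math

# Fine for 2048-bit keys, although only the top ~53 bits are meaningful:
print(int(2 ** 1023 * math.sqrt(2)))

try:
    int(2 ** 2047 * math.sqrt(2))  # needed for 4096-bit keys
except OverflowError as ex:
    print(ex)  # "int too large to convert to float"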

sybrenstuvel added a commit that referenced this issue Aug 4, 2019
@sybrenstuvel
Owner

7c2468e on branch https://github.com/sybrenstuvel/python-rsa/tree/issue-98-fips-186-4-prime-selection implements the FIPS 186-4 recommendation for lower & upper bounds on p & q, and on a minimum distance |p-q|.

@joostrijneveld and @RichardThiessen I'd appreciate your eyes on this before it's pushed to master, as it's of course really at the core of the library. If you can, please test & give me some feedback here.

@joostrijneveld
Contributor Author

I'm afraid I'm currently not in a position where I can contribute or comment, but I'll gladly have a look in a couple weeks (i.e. mid-September). I've set a reminder :)

@sybrenstuvel
Owner

Thanks for the heads-up! I'll see about releasing 4.1 before that time then, and then this can go into 4.2.

@RichardThiessen

RichardThiessen commented Aug 6, 2019

The trick for doing approximate multiplication or division of big integers by sqrt(2) (or other constants) is to turn sqrt(2) into a fraction a / 2**n such that a**2 is very close to 2 * (2**n)**2. (Note: see the code snippet at the end for the actual constant derivation.)

Then one simply multiplies by one of the two integers (a, 2**n) and divides by the other, and the result is very close to the correct value.

There are a few caveats, but they can mostly be ignored:

  • The number of bits in the constant determines the accuracy of the result: a 128-bit fractional constant applied to a 1024-bit number gives an integer error on the order of 2**(1024-128), so the top ~128 bits are probably correct and the lower ~900 are garbage.
  • The true constant is not an integer; two integer values bracket it, and which one is used determines the sign of the error.

Neither of these really matters, though. Since the space of primes we choose from is very large, the precision of the constant (in bits) determines the error, and a 128-bit constant with its error in the wrong direction leaves only a tiny slice of the resulting range outside the correct range. Still, it's good practice to at least pay attention to the second point above and use the right rounding on the constant depending on the desired direction of the error.

Here's a snippet of find_p_q() from my local, heavily modified branch. That should be all you need.

# Constraints implemented are from FIPS 186-4, appendix B.3.1.2:
# http://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.186-4.pdf#page=62

# Primes are chosen to be between (2**nbits - 1) and ceil(2**nbits / sqrt(2)).
# This ensures that regardless of which primes are chosen, the final modulus
# will be between (2**(2*nbits) - 1) and (2**(2*nbits - 1) + 1).

maximum = 2 ** nbits  # up to, but not including
# Multiply by a fractional representation of 1/sqrt(2) (rounded up) for the minimum.
minimum = (maximum * 0xb504f333f9de6484597d89b3754abea0) >> 128
minimum += 1  # add 1 in case nbits is very small

# Code for generating the above constant:
# divisor = 2 ** 128  # fraction divisor
# target = divisor ** 2 // 2
# increment = divisor
# value = 0
# while increment:
#     value += increment
#     if value ** 2 > target:
#         value -= increment
#     increment //= 2
# value += 1  # round up
# print(hex(value))

Note that the constant and the result are rounded up to push the result closer to 2**nbits, so we're definitely within the target range. I think it also works on small numbers. (Checks.) Yup, looks like it does work on small numbers (i.e. result = ceil(n / sqrt(2)) for n < ~2**128).

For testing a hypothetical divide_bigint_by_sqrt2() function, you can check that n**2 ~= 2 * (f(n)**2), as well as the direction of the error between the two terms of the equality.
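
For example, a minimal test sketch along those lines (assuming a hypothetical divide_bigint_by_sqrt2() built from the rounded-up 128-bit constant above; the error tolerance is deliberately loose):

SQRT2_INV = 0xb504f333f9de6484597d89b3754abea0  # ~ 2**128 / sqrt(2), rounded up

def divide_bigint_by_sqrt2(n):
    # Fixed-point multiply by 1/sqrt(2), rounded up (as in the snippet above).
    return ((n * SQRT2_INV) >> 128) + 1

def check(n):
    f = divide_bigint_by_sqrt2(n)
    # Direction of the error: rounding up means 2 * f**2 must exceed n**2 ...
    assert 2 * f * f > n * n
    # ... while the relative error stays around 2**-120 (with generous slack).
    assert 2 * f * f <= n * n + (n * n >> 120) + 8 * n + 8

for k in (8, 64, 512, 1024, 2048, 4096):
    check(2 ** k)
    check(2 ** k - 1)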

RichardThiessen added a commit to RichardThiessen/python-rsa that referenced this issue Jun 4, 2023
@RichardThiessen

This is about the smallest change I could make to fix the issue. The _bigint_divide_by_sqrt_2 function is generic to keep things simple; there's no more commented-out code to generate a constant.

Also worth noting: I refactored the code in parallel.py so it no longer re-implements what's already in prime.py. This simplifies things a bit in the future if there's reason to swap out the underlying getprime function for benchmarking purposes.
