
Set reachable depth for generate #3145

Status: Open, 4 commits to be merged into base branch develop


ekaf
Contributor

@ekaf ekaf commented Apr 21, 2023

Fix #3072: when a grammar is recursive, the default generation depth (sys.maxsize) is far beyond what Python's recursion limit allows.

import sys
print(sys.maxsize)

9223372036854775807

print(sys.getrecursionlimit())
1000

This PR therefore adds a check for whether the grammar is recursive; if it is, the default "depth" is lowered to a safe value.
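One way to picture a recursion check is as cycle detection over the "nonterminal mentions nonterminal" graph. The sketch below is only an illustration and is not the code in this PR: `is_recursive` is a hypothetical helper, and the grammar is assumed to be a plain dict mapping each nonterminal to a list of right-hand sides, rather than an NLTK `CFG` object.

```python
def is_recursive(productions):
    """Return True if any nonterminal can (directly or indirectly) derive itself.

    `productions` is an assumed, simplified representation:
    {nonterminal: [tuple_of_symbols, ...]}. A symbol counts as a
    nonterminal only if it appears as a key of the dict.
    """
    # Build the dependency graph: lhs -> nonterminals used on its right-hand sides.
    graph = {
        lhs: {sym for rhs in rhss for sym in rhs if sym in productions}
        for lhs, rhss in productions.items()
    }

    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, finished
    color = dict.fromkeys(graph, WHITE)

    def visit(node):
        color[node] = GREY
        for nxt in graph[node]:
            if color[nxt] == GREY:  # back edge: a cycle, so the grammar is recursive
                return True
            if color[nxt] == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color[node] == WHITE and visit(node) for node in graph)


# The grammar from #3072, S -> 'a' S | (empty), in the assumed dict form.
recursive_grammar = {"S": [("a", "S"), ()]}
flat_grammar = {"S": [("NP", "VP")], "NP": [("the", "dog")], "VP": [("barks",)]}

print(is_recursive(recursive_grammar))
print(is_recursive(flat_grammar))
```

The depth-first search marks nodes on the current path as grey, so both direct recursion (S mentions S) and indirect recursion (A mentions B, B mentions A) are detected.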
Because generate_all() and generate_one() call each other (indirect recursion), the depth cannot exceed one third of the recursion limit. In addition, Python 3 has the undocumented peculiarity that the recursion depth actually reachable is 3 less than sys.getrecursionlimit().
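The arithmetic described here can be sketched as follows. This is a minimal illustration, not the code in this PR: `safe_generation_depth`, `frames_per_level`, and `unreachable_frames` are hypothetical names, and both constants (one third of the limit, 3 unreachable frames) are simply taken from the description above.

```python
import sys


def safe_generation_depth(limit=None, frames_per_level=3, unreachable_frames=3):
    """Hypothetical helper: estimate an upper bound for the generation depth.

    Assumptions (from the PR description, not verified here):
    - each level of generation depth costs `frames_per_level` stack frames;
    - the last `unreachable_frames` frames below the limit cannot be used.
    """
    if limit is None:
        limit = sys.getrecursionlimit()
    reachable = limit - unreachable_frames
    # Never return less than 1, so that a depth is always usable.
    return max(1, reachable // frames_per_level)


# With a limit of 1000: (1000 - 3) // 3 == 332
print(safe_generation_depth(1000))
# With whatever limit the current interpreter is configured for:
print(safe_generation_depth())
```

The result is best read as an upper bound: frames already on the stack when generation starts reduce the depth that is reachable in practice.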

With this PR, generating from the grammar in #3072 no longer raises any error:

from nltk.grammar import CFG
from nltk.parse.generate import generate

G = CFG.fromstring("""
S -> 'a' S | 
""")

gen = generate(G)
out = next(gen)
print(len(out))

329

print(''.join(out))

aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa

@ekaf
Contributor Author

ekaf commented Apr 22, 2023

Here's a small experiment to verify the unintuitive behaviour of the recursion limit in Python 3:

from sys import setrecursionlimit
setrecursionlimit(10)

def recurse(n):
    print(n)
    recurse(n+1)

recurse(1)

1
2
3
4
5
6
7

After printing 7, Python 3 raises a RecursionError. But with Python 2, the same code is able to reach 9.
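The same experiment can be run without printing every step, by catching the RecursionError and reporting the deepest call reached. This is a sketch under assumptions: `reachable_depth` is a hypothetical helper, and the gap between the limit and the reached depth may depend on the interpreter version and on how many frames are already on the stack, so no particular value is claimed here.

```python
import sys


def reachable_depth(limit):
    """Hypothetical helper: deepest value of n reached by a self-recursive
    function when the recursion limit is temporarily set to `limit`."""
    old_limit = sys.getrecursionlimit()
    sys.setrecursionlimit(limit)
    deepest = 0

    def recurse(n):
        nonlocal deepest
        deepest = n          # record how far we got before recursing further
        recurse(n + 1)

    try:
        recurse(1)
    except RecursionError:
        pass                 # expected: this is the point where the limit is hit
    finally:
        sys.setrecursionlimit(old_limit)  # always restore the original limit
    return deepest


limit = 500
reached = reachable_depth(limit)
print(f"limit={limit}, reached={reached}, gap={limit - reached}")
```

Running it from the top level of a script and from inside a nested call should show how the gap grows with the number of frames already in use.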

@stevenbird
Member

I suggest dropping the recursion check and setting the parameter regardless of whether the grammar is recursive.

@ekaf
Contributor Author

ekaf commented Dec 23, 2023

Strangely, CI failed only with Python 3.11 on macos-latest.


Successfully merging this pull request may close these issues.

generate gives infinite recursion for recursive grammars
2 participants