Compute TokenList.value dynamically (v2) #710

Open: 3 commits to be merged into base branch master

living180
Contributor

This PR supersedes #623. The meat of the PR is the same: fix the remaining portion of issue #621 by making TokenList.value a dynamically-computed property rather than an attribute. This avoids the quadratic runtime behavior that occurred due to recomputing TokenList.value each time TokenList.group_tokens() was called with extend=True.
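To make the quadratic behavior concrete, here is a minimal sketch that only counts how many characters get joined. `EagerList`, `LazyList`, `extend`, and `chars_joined` are hypothetical stand-ins invented for illustration, not sqlparse classes; the assumption is that each grouping step with extend=True rebuilds the full string of the group.

```python
class EagerList:
    """Hypothetical stand-in: `value` is an attribute rebuilt on every extend."""

    def __init__(self):
        self.tokens = []
        self.value = ""
        self.chars_joined = 0  # bookkeeping for this demo only

    def extend(self, text):
        self.tokens.append(text)
        # Re-derive the whole string after each step, as a static attribute must.
        self.value = "".join(self.tokens)
        self.chars_joined += len(self.value)


class LazyList:
    """Hypothetical stand-in: `value` is a property computed only when read."""

    def __init__(self):
        self.tokens = []
        self.chars_joined = 0  # bookkeeping for this demo only

    def extend(self, text):
        self.tokens.append(text)  # no string work while grouping

    @property
    def value(self):
        joined = "".join(self.tokens)
        self.chars_joined += len(joined)
        return joined


def work(cls, n):
    """Extend a list n times with a 2-character chunk, read value once."""
    lst = cls()
    for _ in range(n):
        lst.extend("x,")
    _ = lst.value
    return lst.chars_joined


if __name__ == "__main__":
    for n in (1_000, 2_000):
        print(n, work(EagerList, n), work(LazyList, n))
```

Doubling `n` roughly quadruples the eager count (it is `n * (n + 1)` here) while the lazy count only doubles (`2 * n`), which is the shape of the slowdown described above.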

The previous PR #623 had some rather awkward hacks related to stripping comments, but I found that I could avoid those by simply tweaking the comment stripping process to strip comments from a token list before stripping any sublists, making this PR much simpler.

Avoid stripping T.Comment tokens contained within an sql.Comment before
stripping the sql.Comment itself.  Now an sql.Comment token will be
stripped first along with any contained T.Comment tokens.
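The ordering described in this commit message can be sketched on a toy tree. `Leaf`, `Group`, `strip_comments`, and `render` are hypothetical names made up for this sketch, and the real filter in sqlparse/filters/others.py does more (it may, for example, leave whitespace behind); the only point illustrated is "strip at the current level first, then descend".

```python
from dataclasses import dataclass, field


@dataclass
class Leaf:
    text: str
    is_comment: bool = False  # True models a T.Comment-like leaf token


@dataclass
class Group:
    children: list = field(default_factory=list)
    is_comment: bool = False  # True models a grouped comment (like sql.Comment)


def strip_comments(group):
    # 1. Drop comment nodes at this level first. A comment group goes away
    #    whole, together with any comment leaves inside it.
    group.children = [c for c in group.children if not c.is_comment]
    # 2. Only then descend into the sublists that survived step 1.
    for child in group.children:
        if isinstance(child, Group):
            strip_comments(child)
    return group


def render(node):
    """Concatenate the text of all leaves under node."""
    if isinstance(node, Leaf):
        return node.text
    return "".join(render(c) for c in node.children)


if __name__ == "__main__":
    tree = Group([
        Leaf("select "),
        Group([Leaf("/* a */", True), Leaf(" "), Leaf("-- b\n", True)],
              is_comment=True),
        Group([Leaf("1"), Leaf("/* c */", True)]),
    ])
    print(repr(render(strip_comments(tree))))
```

Because the comment group is removed before any recursion happens, its inner comment leaves are never visited individually, so no partially stripped comment group is ever left behind.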

codecov bot commented Mar 28, 2023

Codecov Report

Patch coverage: 100.00% and project coverage change: +0.01 🎉

Comparison is base (fc76056) 96.95% compared to head (d2ab15c) 96.97%.

❗ Current head d2ab15c differs from pull request most recent head ff4f391. Consider uploading reports for the commit ff4f391 to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #710      +/-   ##
==========================================
+ Coverage   96.95%   96.97%   +0.01%     
==========================================
  Files          20       20              
  Lines        1545     1555      +10     
==========================================
+ Hits         1498     1508      +10     
  Misses         47       47              
Impacted Files              Coverage Δ
sqlparse/filters/others.py  98.79% <100.00%> (ø)
sqlparse/sql.py             97.68% <100.00%> (+0.06%) ⬆️


living180 force-pushed the dynamic_tokenlist_value_v2 branch 3 times, most recently from f47bc2c to de63e50, on March 30, 2023 10:08
Rename Token to TokenBase and make it a superclass for TokenList and a
new Token class.  Move some of the functionality of TokenBase into Token
and TokenList.  This will make it easier to maintain separate
functionality for Token versus TokenList.
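A rough sketch of the class layout this commit message describes, for readers who have not opened the diff. Only the three class names come from the commit message; the attributes and method bodies below are assumptions for illustration and are much smaller than what sqlparse/sql.py contains.

```python
class TokenBase:
    """Behaviour shared by leaf tokens and token groups (illustrative only)."""

    is_group = False

    def __init__(self):
        self.parent = None

    def __str__(self):
        # Both subclasses expose `value`, so the base class can rely on it.
        return self.value


class Token(TokenBase):
    """A leaf token: its text is fixed at construction time."""

    def __init__(self, ttype, value):
        super().__init__()
        self.ttype = ttype
        self.value = str(value)


class TokenList(TokenBase):
    """A group of tokens; list-specific logic lives here, not in Token."""

    is_group = True

    def __init__(self, tokens=None):
        super().__init__()
        self.tokens = list(tokens or [])
        for tok in self.tokens:
            tok.parent = self
        # In this sketch the value is computed once, at construction.
        self.value = "".join(str(tok) for tok in self.tokens)


if __name__ == "__main__":
    stmt = TokenList([Token("Keyword", "select"),
                      Token("Whitespace", " "),
                      Token("Number", "1")])
    print(str(stmt), stmt.is_group, stmt.tokens[0].is_group)
```

With the two subclasses separated, a change to how a group derives its `value` can be made in `TokenList` alone without touching leaf tokens.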
The fact that a new value was being computed each time
TokenList.group_tokens() was called caused supra-linear runtime when
token grouping was enabled.

Address by making TokenList.value a dynamically-computed property rather
than a static attribute.
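A minimal sketch of the property-based approach, assuming a much reduced `Token` and `TokenList`. The `group_tokens(start, end)` signature below is simplified for illustration and is not the signature used in sqlparse.

```python
class Token:
    """Reduced leaf token holding only its text (illustrative)."""

    def __init__(self, value):
        self.value = value


class TokenList:
    """Reduced token group whose value is derived on demand (illustrative)."""

    def __init__(self, tokens=None):
        self.tokens = list(tokens or [])

    @property
    def value(self):
        # Built only when someone asks for it, so it always reflects the
        # current tokens and costs nothing during grouping.
        return "".join(tok.value for tok in self.tokens)

    def group_tokens(self, start, end):
        """Wrap tokens[start:end + 1] in a nested TokenList and return it."""
        group = TokenList(self.tokens[start:end + 1])
        self.tokens[start:end + 1] = [group]
        return group


if __name__ == "__main__":
    outer = TokenList([Token("a"), Token(","), Token("b")])
    inner = outer.group_tokens(0, 2)
    inner.tokens.append(Token(",c"))  # mutate the group after creating it
    print(outer.value)
```

Grouping here only rearranges list entries. No string is rebuilt until `value` is read, and a later mutation of a nested group is still visible through the outer list.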

sdether commented Aug 11, 2023

This is just the fix I need. I ran into problems parsing SQL with ~50k ID IN clauses: with 0.4.4 it takes a bit over 6 minutes to parse, and with this patch it takes only 6 seconds!


rumbin commented Nov 20, 2023

Any progress here?
