Improved JavaScript regex recognition for extracting JS messages #791

Open
wants to merge 3 commits into
base: master

gitaarik
Contributor

@gitaarik gitaarik commented Jun 9, 2021

In a JS library that I use, there was a regex with a / inside a character class (like [a-z]). The Babel JS lexer treated this / as the end of the regex, but JavaScript allows an unescaped / as an ordinary character within a character class, where it does not close the regex. I updated the lexer's regex so that it covers this case.

This could also possibly fix these issues, or set the stage for further improvements regarding these issues:

How do I add unit tests for this? I haven't found clear documentation about this.
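To make the character-class case concrete, here is a minimal sketch in Python. Both patterns and their names (`NAIVE_REGEX_LITERAL`, `CLASS_AWARE_REGEX_LITERAL`) are hypothetical illustrations written for this example; they are not the patterns used in Babel's lexer or in this pull request.

```python
import re

# Naive pattern (hypothetical): the first unescaped "/" after the opening
# one is taken as the end of the regex literal.
NAIVE_REGEX_LITERAL = re.compile(r"/(?:\\.|[^/\\\n])+/[a-zA-Z]*")

# Class-aware pattern (hypothetical): a "[...]" character class is consumed
# as a unit, so a "/" inside it cannot end the literal.
CLASS_AWARE_REGEX_LITERAL = re.compile(
    r"""
    /                                 # opening slash
    (?:
        \\.                           # an escaped character
      | \[ (?: \\. | [^\]\\\n] )* \]  # a character class; a slash inside is literal
      | [^/\\\[\n]                    # any other character that cannot end the literal
    )+
    /                                 # closing slash
    [a-zA-Z]*                         # optional flags
    """,
    re.VERBOSE,
)

source = "var re = /[a-z/]+/g;"
print(NAIVE_REGEX_LITERAL.search(source).group())        # /[a-z/
print(CLASS_AWARE_REGEX_LITERAL.search(source).group())  # /[a-z/]+/g
```

The naive pattern stops at the slash inside the brackets, while the class-aware one reads the whole literal including its flag.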

@gitaarik
Contributor Author

Hi @akx, have you had time to look at this? It would be nice if this could be merged.

Member

@akx akx left a comment

Beyond the other comments, I'd appreciate tests showing how this improves parsing for the files previously accepted :)

babel/__init__.py (outdated, resolved)
CHANGES (outdated, resolved)
CHANGES (outdated, resolved)
CHANGES (outdated, resolved)
@gitaarik
Contributor Author

Hey @akx, could you please give me some guidance on how and where I should add unit tests for this? I haven't found clear documentation about this.

@gitaarik gitaarik force-pushed the javascript-lexer-regex-parse-fix branch from c9baa67 to be48522 Compare February 19, 2022 19:16
@gitaarik gitaarik force-pushed the javascript-lexer-regex-parse-fix branch from be48522 to fdcac82 Compare February 19, 2022 21:14
@gitaarik
Contributor Author

@akx Got it now, I wasn't familiar with tox, but now I am, a little.

I added the unit test:

https://github.com/gitaarik/babel/blob/javascript-lexer-regex-parse-fix/tests/messages/test_js_extract.py#L156-L182

I would really appreciate it if this could be merged soon and a new release made. Then the build process of our project could be greatly simplified :).

@codecov

codecov bot commented Feb 23, 2022

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (65de3dc) 89.82% compared to head (b37389a) 89.82%.

Additional details and impacted files
@@           Coverage Diff           @@
##           master     #791   +/-   ##
=======================================
  Coverage   89.82%   89.82%           
=======================================
  Files          25       25           
  Lines        4391     4391           
=======================================
  Hits         3944     3944           
  Misses        447      447           


@gitaarik
Contributor Author

gitaarik commented Feb 24, 2022

I don't completely understand this Codecov result, or what, if anything, is wrong.

LucasLefevre added a commit to odoo/o-spreadsheet that referenced this pull request Jun 15, 2022
There's an issue in odoo when extracting translatable source terms
from the bundled library file `o_spreadsheet.js`. Some terms are not
extracted.

Babel doesn't correctly tokenize the regex /"/
As a result, all following tokens are wrong and source terms are no longer
exported after this point.

It seems to be known that the babel js lexer isn't perfect.
python-babel/babel#467
python-babel/babel#616
python-babel/babel#640
python-babel/babel#791
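
As a rough illustration of why one mis-tokenized regex can affect everything after it, here is a toy scanner in Python. It is not Babel's lexer: `STRING`, `REGEX`, and `string_literals` are hypothetical names made up for this sketch, and the `_t(...)` calls in the sample source are only an assumed example of translatable terms.

```python
import re

# Hypothetical token patterns for a toy scanner (not Babel's lexer).
STRING = r'"(?:\\.|[^"\\\n])*"'
REGEX = r'/(?:\\.|\[(?:\\.|[^\]\\\n])*\]|[^/\\\[\n])+/[a-zA-Z]*'


def string_literals(source, regex_aware):
    """Return the double-quoted string literals the toy scanner sees."""
    alternatives = [STRING]
    if regex_aware:
        # Try regex literals first so their contents are consumed as one token.
        alternatives.insert(0, REGEX)
    scanner = re.compile("|".join(f"({alt})" for alt in alternatives))
    return [m.group() for m in scanner.finditer(source) if m.group().startswith('"')]


source = 'var re = /"/; _t("Hello"); _t("World");'
print(string_literals(source, regex_aware=False))  # ['"/; _t("', '"); _t("']
print(string_literals(source, regex_aware=True))   # ['"Hello"', '"World"']
```

Without regex awareness, the quote inside /"/ is read as the start of a string, so every later string boundary is shifted and the real terms are lost. Note that this toy version would misread division operators as regex starts; a real lexer needs context to tell the two apart.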
robodoo pushed a commit to odoo/o-spreadsheet that referenced this pull request Jun 24, 2022
closes #1444

Signed-off-by: Pierre Rousseau (pro) <pro@odoo.com>
fw-bot pushed a commit to odoo/o-spreadsheet that referenced this pull request Jun 24, 2022
X-original-commit: df307ed
robodoo pushed a commit to odoo/o-spreadsheet that referenced this pull request Jun 24, 2022
closes #1461

X-original-commit: df307ed
Signed-off-by: Pierre Rousseau (pro) <pro@odoo.com>
Signed-off-by: Lucas Lefèvre (lul) <lul@odoo.com>
LucasLefevre added a commit to odoo/o-spreadsheet that referenced this pull request Jun 24, 2022
robodoo pushed a commit to odoo/o-spreadsheet that referenced this pull request Jun 24, 2022
closes #1462

Signed-off-by: Pierre Rousseau (pro) <pro@odoo.com>
Signed-off-by: Lucas Lefèvre (lul) <lul@odoo.com>
fw-bot pushed a commit to odoo/o-spreadsheet that referenced this pull request Jun 24, 2022
X-original-commit: d711ad2
fw-bot pushed a commit to odoo/o-spreadsheet that referenced this pull request Jun 24, 2022
X-original-commit: d711ad2
fw-bot pushed a commit to odoo/o-spreadsheet that referenced this pull request Jun 24, 2022
X-original-commit: d711ad2
robodoo pushed a commit to odoo/o-spreadsheet that referenced this pull request Jun 24, 2022
closes #1465

X-original-commit: d711ad2
Signed-off-by: Pierre Rousseau (pro) <pro@odoo.com>
Signed-off-by: Lucas Lefèvre (lul) <lul@odoo.com>
robodoo pushed a commit to odoo/o-spreadsheet that referenced this pull request Jun 24, 2022
closes #1464

X-original-commit: d711ad2
Signed-off-by: Pierre Rousseau (pro) <pro@odoo.com>
Signed-off-by: Lucas Lefèvre (lul) <lul@odoo.com>
robodoo pushed a commit to odoo/o-spreadsheet that referenced this pull request Jun 24, 2022
closes #1463

X-original-commit: d711ad2
Signed-off-by: Pierre Rousseau (pro) <pro@odoo.com>
Signed-off-by: Lucas Lefèvre (lul) <lul@odoo.com>
@michaeldg

@akx Is what you requested done now? Can this be merged?

@gitaarik gitaarik requested a review from akx November 17, 2023 14:54