Add check for potential misuse of unicode #749

CarliJoy · 2021-11-04T10:18:44Z

Is your feature request related to a problem? Please describe.
Several possible misuses of Unicode characters in source code were described recently.
See PEP 672 for a description.

Describe the solution you'd like
It would be nice to have some Bandit rules that can be configured:

An optional filter that enforces ASCII-only content (excluding \u[0-9a-f]+, \b, \r, \x1A, \x1B) in all files
An optional filter that enforces ASCII-only filenames
A filter that looks for potentially bad Unicode characters and, of course, for \u[0-9a-f]+, \b, \r, \x1A, \x1B
A filter that prevents using look-alike characters from different language groups in variable, class, or function names
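
To make the ASCII-only and look-alike filters more concrete, here is a minimal standalone sketch. It does not use Bandit's plugin API, and all function names are hypothetical. It also assumes that the first word of a character's Unicode name (LATIN, CYRILLIC, GREEK, ...) is a good enough stand-in for its script, since the standard library does not expose the script property.

```python
import ast
import unicodedata
from pathlib import PurePath


def non_ascii_positions(text):
    """Return (line, column, char) for every non-ASCII character in text."""
    hits = []
    for lineno, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 127:
                hits.append((lineno, col, ch))
    return hits


def has_ascii_filename(path):
    """Return True if the last path component contains only ASCII characters."""
    return PurePath(path).name.isascii()


def scripts_of(identifier):
    """Approximate the set of scripts used by the letters of an identifier.

    Assumption: the first word of the Unicode character name is used
    as the script; digits and "_" are ignored.
    """
    scripts = set()
    for ch in identifier:
        if ch.isascii():
            if ch.isalpha():
                scripts.add("LATIN")
            continue
        scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts


def mixed_script_identifiers(source):
    """Return the sorted identifiers in source that mix more than one script."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            names.add(node.id)
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            names.add(node.name)
        elif isinstance(node, ast.arg):
            names.add(node.arg)
    return sorted(name for name in names if len(scripts_of(name)) > 1)
```

This heuristic only flags identifiers that mix scripts. A name written entirely in Cyrillic look-alikes would pass, so a real rule would probably need a confusables table as well.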

Describe alternatives you've considered
See linked PEP.
The content of the filters is of course up for debate.


Lucas-C · 2021-11-09T10:54:19Z

The vulnerability is detailed here: http://trojansource.codes

adversaries can attack the encoding of source code files to inject vulnerabilities
The trick is to use Unicode control characters to reorder tokens in source code at the encoding level.

Extract from the PDF white paper, section "VII. F - Defenses":

The simplest defense is to ban the use of text directionality control characters
If an application wishes to print text that requires Bidi overrides, developers can generate those characters using escape sequences rather than embedding potentially dangerous characters into source code.
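
As a rough illustration of that defense, the sketch below scans text for the directionality control characters and shows how they could be rewritten as escape sequences. The function names are hypothetical, this is not the code from any PR, and the character set is my assumption of the relevant overrides, embeddings, and isolates (U+202A to U+202E and U+2066 to U+2069).

```python
# Bidirectional control characters, written as escapes so that this
# file itself contains none of them.
BIDI_CONTROLS = {
    "\u202a": "LRE",
    "\u202b": "RLE",
    "\u202c": "PDF",
    "\u202d": "LRO",
    "\u202e": "RLO",
    "\u2066": "LRI",
    "\u2067": "RLI",
    "\u2068": "FSI",
    "\u2069": "PDI",
}


def find_bidi_controls(text):
    """Return (line, column, abbreviation) for each Bidi control character."""
    hits = []
    for lineno, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in BIDI_CONTROLS:
                hits.append((lineno, col, BIDI_CONTROLS[ch]))
    return hits


def escape_bidi_controls(text):
    """Replace each Bidi control character with a literal \\uXXXX escape.

    Caveat: the result only keeps its meaning inside non-raw string
    literals; elsewhere the escape text is not valid Python.
    """
    return "".join(
        f"\\u{ord(ch):04x}" if ch in BIDI_CONTROLS else ch for ch in text
    )
```

A check built this way would report any hit as a finding, and the escaping helper is only a suggestion for what a fix inside a string literal could look like.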

I'm willing to work on a PR if the maintainers at @PyCQA approve this feature.

CarliJoy added the "enhancement" (New feature or request) label Nov 4, 2021

CarliJoy mentioned this issue Nov 9, 2021

[Trojan Source Attacks] New feature: ban the use of text directionality control characters #750

Closed

twoertwein mentioned this issue Nov 11, 2021

Replace Bidi overrides with escape characters psf/black#2595

Open

Lucas-C linked a pull request Nov 16, 2021 that will close this issue

New check: B113: TrojanSource - Bidirectional control characters #757

Open
