[Trojan Source Attacks] New feature: ban the use of text directionality control characters #750

Closed
Lucas-C opened this issue Nov 4, 2021 · 4 comments
Labels
enhancement New feature or request

Comments

@Lucas-C

Lucas-C commented Nov 4, 2021

Is your feature request related to a problem? Please describe.
The vulnerability is detailed here: https://trojansource.codes

adversaries can attack the encoding of source code files to inject vulnerabilities
The trick is to use Unicode control characters to reorder tokens in source code at the encoding level.

Describe the solution you'd like
Could a new check be added to bandit to detect those characters?
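
To make the request concrete, here is a minimal sketch of what such a check could look like. It is a standalone function, not a bandit plugin: the names `BIDI_CONTROLS` and `find_bidi_controls` are hypothetical, and the character list is assumed to be the nine directionality controls named in the Trojan Source paper.

```python
# Assumption: the nine Unicode directionality controls named in the paper.
# They are written as escapes so this file itself stays free of them.
BIDI_CONTROLS = {
    "\u202a": "LRE",
    "\u202b": "RLE",
    "\u202c": "PDF",
    "\u202d": "LRO",
    "\u202e": "RLO",
    "\u2066": "LRI",
    "\u2067": "RLI",
    "\u2068": "FSI",
    "\u2069": "PDI",
}


def find_bidi_controls(source):
    """Return a (line, column, name) tuple for each bidi control in source.

    Lines and columns are 1-based; columns count code points, not bytes.
    """
    findings = []
    for lineno, line in enumerate(source.split("\n"), start=1):
        for col, char in enumerate(line, start=1):
            if char in BIDI_CONTROLS:
                findings.append((lineno, col, BIDI_CONTROLS[char]))
    return findings
```

Reading each file as UTF-8 and passing its text to `find_bidi_controls` would be enough to fail a CI job on any hit. How this would map onto bandit's own plugin interface is left open here.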

Describe alternatives you've considered
Using a language-agnostic linter that detects this vulnerability,
but I do not know of any existing one so far.

Additional context

@Lucas-C Lucas-C added the enhancement New feature or request label Nov 4, 2021
@kleph

kleph commented Nov 4, 2021

Maybe the check could be a bit less strict and, as the article mentions, flag only unterminated bidirectional control characters?

I haven't used those types of characters yet, but I presume they are useful for right-to-left languages. I assume banning all control characters would prevent writing comments in those languages?

It's probably harder to implement, though.

Apart from that, I think it's a great idea to implement those checks in tools, so as not to rely on the human eye!
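
For what it's worth, the less strict variant could be sketched roughly as below. This is a simplification and not the full Unicode Bidirectional Algorithm: it assumes that embeddings and overrides are closed by PDF, that isolates are closed by PDI, and that anything still open at the end of a line is suspicious. The function name `unterminated_bidi_lines` is hypothetical.

```python
# Openers, written as escapes so the source stays free of raw controls.
EMBEDDING_OPENERS = {"\u202a", "\u202b", "\u202d", "\u202e"}  # LRE, RLE, LRO, RLO
ISOLATE_OPENERS = {"\u2066", "\u2067", "\u2068"}  # LRI, RLI, FSI
PDF = "\u202c"  # closes an embedding or override
PDI = "\u2069"  # closes an isolate


def unterminated_bidi_lines(source):
    """Return 1-based numbers of lines that end with a bidi control still open.

    Simplification: embeddings and isolates are counted separately, and a
    stray closer with nothing open is ignored.
    """
    flagged = []
    for lineno, line in enumerate(source.split("\n"), start=1):
        open_embeddings = 0
        open_isolates = 0
        for char in line:
            if char in EMBEDDING_OPENERS:
                open_embeddings += 1
            elif char in ISOLATE_OPENERS:
                open_isolates += 1
            elif char == PDF and open_embeddings:
                open_embeddings -= 1
            elif char == PDI and open_isolates:
                open_isolates -= 1
        if open_embeddings or open_isolates:
            flagged.append(lineno)
    return flagged
```

With this variant, a right-to-left comment that closes every control it opens would pass, while a line that leaves an override open would be reported.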

@Lucas-C
Author

Lucas-C commented Nov 5, 2021

It is recommended as part of the PDF white paper, section "VII. F - Defenses":

The simplest defense is to ban the use of text directionality control characters
If an application wishes to print text that requires Bidi overrides, developers can generate those characters using escape sequences rather than embedding potentially dangerous characters into source code.
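
In Python, that second recommendation could look like the sketch below: the source file contains only ASCII, and the control characters exist only in the string at runtime. The helper `force_rtl` is a hypothetical name used for illustration.

```python
# Named escapes keep the source ASCII-only and self-documenting.
RLO = "\N{RIGHT-TO-LEFT OVERRIDE}"  # U+202E
PDF = "\N{POP DIRECTIONAL FORMATTING}"  # U+202C


def force_rtl(text):
    """Wrap text in an override so it is rendered right to left."""
    return f"{RLO}{text}{PDF}"


label = force_rtl("abc")
```

A check that bans the raw characters would accept this file, since the characters only appear as escape sequences.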

@CarliJoy

CarliJoy commented Nov 9, 2021

Please note #749, where I already opened an issue on the same topic.
In particular, have a look at the ticket linked there, which explains the issue in detail for Python!

@Lucas-C
Author

Lucas-C commented Nov 9, 2021

Indeed, those issues are duplicates.
I'm closing this as it came after yours @CarliJoy

@Lucas-C Lucas-C closed this as completed Nov 9, 2021