Resolve XSS vulnerability in local Wordnet browser #3096

Merged
merged 1 commit into from
Dec 28, 2022

Conversation

tomaarsen
Member

Hello!

Pull Request overview

  • Resolve Cross Site Scripting vulnerability in nltk.app.wordnet_app. This only affected users of this browser interface to Wordnet, and not other users of Wordnet. If the following image is not familiar to you, then you are not affected:

image

Details

Whenever an unknown path was requested from the local server, e.g. http://localhost:8000/<script>alert(1)</script>.html, the wordnet app would try to find a file called <script>alert(1)</script>.html, fail to do so, and respond with an error page saying "Internal error: Path for static page '<script>alert(1)</script>' is unknown". However, this response was served as HTML, so the browser would execute the script.
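
To make the mechanism concrete, here is a minimal sketch of the reflection pattern described above. It is not the actual wordnet_app code: `error_message`, `render_unsafe` and `render_escaped` are hypothetical helpers, and only the error text is taken from the report. The escaped variant is included purely for contrast; it is a common alternative mitigation, not the approach taken in this PR.

```python
import html


def error_message(path: str) -> str:
    # Hypothetical helper that mirrors the error text quoted above.
    return f"Internal error: Path for static page '{path}' is unknown"


def render_unsafe(path: str) -> str:
    # The raw path is reflected into an HTML body, so any markup in the
    # path is still live markup when the browser parses the page.
    return f"<html><body>{error_message(path)}</body></html>"


def render_escaped(path: str) -> str:
    # html.escape replaces <, >, & and quotes with HTML entities, so the
    # path is displayed as text instead of being parsed as markup.
    return f"<html><body>{html.escape(error_message(path))}</body></html>"


payload = "<script>alert(1)</script>"
print(render_unsafe(payload))
print(render_escaped(payload))
```

The first printed page still contains a literal script element, while the second contains only entity-encoded text.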

I don't believe that there is a real attack vector here, as the pages that are normally seen are directly from Wordnet, so one of the Wordnet URLs would need to be modified into a malicious link. That said, there is no reason not to fix this.

Reproducing

The wordnet browser app can be started like so:

import nltk
nltk.app.wordnet()

Then, browsing to http://localhost:8000/<script>alert(1)</script>.html would cause the following popup to appear:
image

The fix

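By setting the Content-Type header to text/plain when an unknown path is used, we make the browser display the error message as plain text instead of parsing it as HTML, which prevents any code from being executed.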
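
As a rough illustration of this idea, the sketch below uses Python's standard `http.server` module. It is not the code from this PR: `KNOWN_PAGES`, `build_response`, `Handler` and `serve` are hypothetical names, and the 404 status code is an assumption rather than what the wordnet app necessarily returns.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical stand-in for the pages the app knows how to serve.
KNOWN_PAGES = {"/index.html": "<html><body>Hello</body></html>"}


def build_response(path):
    """Return (status, content type, body) for a requested path."""
    page = KNOWN_PAGES.get(path)
    if page is not None:
        return 200, "text/html; charset=utf-8", page.encode("utf-8")
    # Unknown path: the message still echoes the path, but it is served
    # as text/plain, so the browser shows it verbatim and runs nothing.
    message = f"Internal error: Path for static page '{path}' is unknown"
    return 404, "text/plain; charset=utf-8", message.encode("utf-8")


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, content_type, body = build_response(self.path)
        self.send_response(status)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    # Blocks until interrupted; call serve() manually to try it out.
    HTTPServer(("localhost", port), Handler).serve_forever()
```

Calling `serve()` and requesting an unknown path should show the error message as literal text, with the script tag visible but inert.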

After the fix

When running the reproduction code, we now see:
image
And no popup.

This vulnerability was disclosed according to our security policy, and we are thankful for that.

  • Tom Aarsen

By setting the Content-type to text/plain when an unknown path is used.
@github-actions github-actions bot removed the wordnet label Dec 28, 2022
@tomaarsen tomaarsen merged commit c8cedf1 into nltk:develop Dec 28, 2022
@tomaarsen tomaarsen deleted the vuln/wordnet_app_xss branch December 28, 2022 13:58
@arademaker

Where can I find the tutorial/doc about how to use the wordnet app?

@tomaarsen
Member Author

There's no tutorial on it, and the only documentation is this: https://www.nltk.org/api/nltk.app.wordnet_app.html
But honestly, that's quite vague and doesn't really match my understanding of the wordnet app. I would just run

import nltk
nltk.app.wordnet()

And mess around in the web browser window that should automatically pop up. There's a help button in the interface.

@kylemcmearty

Hey is this fix going into version 3.9?

@tomaarsen
Member Author

We're still discussing this internally. We will either:

  • include this fix in NLTK 3.9, or
  • remove wordnet_app from NLTK altogether for version 3.9 onwards.

One thing is certain: once NLTK 3.9 is released, it should not have this vulnerability.

@arademaker

arademaker commented Jan 13, 2023

Thank you, @tomaarsen. I could finally see the wordnet app in action with nltk.app.wordnet() in the Python prompt. I noticed that the example of searching for multiple words on the help page needs to be fixed. The current word field is also not enabled.

@tomaarsen
Member Author

There may still be some small issues with it. Steven and I have discussed potentially deprecating it and moving it to nltk_contrib instead of paying the maintenance cost for it.
