[extractor/generic] Decode unicode-escaped embed URLs #5919

Merged
pukkandan merged 2 commits into yt-dlp:master on Jan 2, 2023


bashonly (Member) commented on Jan 2, 2023

When the generic extractor finds an embedded video URL in a block of JavaScript, the URL can contain Unicode escape sequences. These need to be decoded first; otherwise urllib.parse parses the URL incorrectly and can raise an idna codec error because the resulting host/netloc or path is too long.
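To illustrate the problem described above, here is a minimal sketch, not the code merged in this PR. The helper name `decode_unicode_escapes` and the example URL are assumptions made up for the illustration. It shows how a URL copied out of a JavaScript string literal is misparsed until its `\uXXXX` sequences are decoded.

```python
import re
from urllib.parse import urlparse

# Hypothetical helper for illustration only; not the code merged in this PR.
_UNICODE_ESCAPE_RE = re.compile(r'\\u([0-9a-fA-F]{4})')


def decode_unicode_escapes(url):
    """Replace each literal \\uXXXX sequence with the character it encodes."""
    return _UNICODE_ESCAPE_RE.sub(lambda m: chr(int(m.group(1), 16)), url)


# Made-up URL as it might appear inside a JavaScript string literal.
raw_url = r'https:\u002F\u002Fexample.com\u002Fvideos\u002Fclip.mp4'
decoded_url = decode_unicode_escapes(raw_url)

# Without decoding there is no literal "//", so the whole remainder
# ends up in the path and the netloc is empty.
print(urlparse(raw_url).netloc)
# After decoding, the host is recognised.
print(urlparse(decoded_url).netloc)  # example.com
```

This sketch only handles four-digit `\uXXXX` escapes and does not combine surrogate pairs; it is meant to show the parsing difference, not to cover every escape form that JavaScript allows.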

Fixes #5854

Before submitting a pull request make sure you have:

In order to be accepted and merged into yt-dlp each piece of code must be in public domain or released under Unlicense. Check one of the following options:

  • I am the original author of this code and I am willing to release it under Unlicense
  • I am not the original author of this code but it is in public domain or released under Unlicense (provide reliable evidence)

What is the purpose of your pull request?

bashonly added the "bug" (Bug that is not site-specific) label on Jan 2, 2023
pukkandan merged commit 05997b6 into yt-dlp:master on Jan 2, 2023
bashonly deleted the fix/generic-unicode branch on January 3, 2023 17:46