Unicode URLs not normalised · Issue #1 · Sentynel/talk · GitHub

Unicode URLs not normalised #1


Open
Sentynel opened this issue Jan 1, 2021 · 1 comment
Labels
bug Something isn't working

Comments

Sentynel (Owner) commented Jan 1, 2021

Some stories have Unicode in their URLs:
https://www.angrymetalguy.com/medico-peste-%D7%91-the-black-bile-review/
You would expect browsers to treat this consistently, but they don't! Most send the representation above, with uppercase percent-escapes, but some lowercase the percent-escapes, and some just hand over the raw Unicode directly. (Annoyingly, I don't know which browsers do which; I can only see the database entries the scraper has created for all three forms.)

Needs testing on the latest upstream version, and reporting upstream if the problem exists there.
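
To make the three forms concrete, here is a minimal Python sketch of one way they could be collapsed into a single canonical URL. It is an illustration only, not the fix used in this project or upstream: the name `normalise_url` is hypothetical, and the sketch assumes the host name is plain ASCII, so only existing percent-escapes are uppercased and raw non-ASCII characters are percent-encoded as UTF-8.

```python
import re
from urllib.parse import quote

# Matches an existing percent-escape with hex digits in either case.
_ESCAPE = re.compile(r"%[0-9a-fA-F]{2}")
# Matches runs of raw non-ASCII characters.
_NON_ASCII = re.compile(r"[^\x00-\x7f]+")


def normalise_url(url: str) -> str:
    """Hypothetical helper: map equivalent URL spellings to one form."""
    # Uppercase the hex digits of escapes that are already present.
    url = _ESCAPE.sub(lambda m: m.group(0).upper(), url)
    # Percent-encode raw non-ASCII characters as UTF-8 (quote() emits
    # uppercase hex). ASCII characters are left untouched.
    return _NON_ASCII.sub(lambda m: quote(m.group(0)), url)


variants = [
    # Uppercase percent-escapes (the form most browsers send).
    "https://www.angrymetalguy.com/medico-peste-%D7%91-the-black-bile-review/",
    # Lowercase percent-escapes.
    "https://www.angrymetalguy.com/medico-peste-%d7%91-the-black-bile-review/",
    # Raw Unicode (U+05D1, written as an escape to keep this file ASCII).
    "https://www.angrymetalguy.com/medico-peste-\u05d1-the-black-bile-review/",
]

# All three spellings should collapse to the same key.
assert len({normalise_url(u) for u in variants}) == 1
```

This sketch deliberately avoids decoding and re-encoding the whole URL, since that could turn an escaped reserved character such as `%2F` into a literal `/` and change the meaning of the path.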

Sentynel added the bug label Jan 1, 2021
Sentynel (Owner, Author) commented Jan 1, 2021

Reported upstream: coralproject#3358
