8000 Remove utf8_encode() by DaveRandom · Pull Request #5 · mnapoli/imapi · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Remove utf8_encode() #5

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 8, 2017
Merged

Conversation

DaveRandom
Copy link
Contributor
@DaveRandom DaveRandom commented Mar 30, 2017

This definitely shouldn't be here, and I'm fairly certain is the root cause of mnapoli/externals#11 - the example linked there is a classic UTF-8 double-encoding issue.

The character set of the message is decoded on retrieval, since no decoder is passed to the parser explicitly the default is used, the default decoder always returns valid UTF-8.

Also ftr utf8_encode() and utf8_decode() are almost completely useless (they only make sense when the other relevant charset is ISO-8859-1) and should probably be deprecated in PHP.

@DaveRandom
Copy link
Contributor Author

Note that I've not actually been able to test and verify this, but I'm 99.999999% confident it will fix the linked externals.io issue.

@@ -29,8 +29,8 @@ public function create(string $mailbox, Horde_Imap_Client_Data_Fetch $hordeEmail
// Parse the message body
$parser = new Parser();
$parser->setText($hordeEmail->getFullMsg());
$htmlContent = utf8_encode((string) $parser->getMessageBody('html'));
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code change requires a test case.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😢 I'm terribly ashamed to admit I didn't write tests in this package (testing something calling IMAP would mean mocking a lot of stuff which defeats the purpose of the tests, or putting in a lot of work into a setup with Vagrant + IMAP server). It was the kind of thing I left for later (this package is more of a prototype than a complete solution)

I'm tempted to try it the old way: merge it and see what happens :D

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

merge it and see what happens :D

3b44c5

Copy link
Contributor Author
@DaveRandom DaveRandom Apr 3, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mnapoli Since the symptom is demonstrably a double-encoding of ISO-8859-1 -> UTF-8 and that is precisely what utf8_encode() would do here, I am very confident that this would fix the problem.

Just as a concrete example of this being what is happening:

https://externals.io/thread/787#email-14713

Sara Golemon a écrit :

In ISO-8859-1:

'Ã' === "\xC3"
'©' === "\xA9"

In UTF-8:

'é' === "\xC3\xA9"

If you are ever having encoding issues and you see things that start with à it's a pretty good indicator that this is what is happening, quite a few commonly used non-ASCII chars start with this byte when UTF-8 encoded :-)

@mnapoli
Copy link
Owner
mnapoli 8000 commented Apr 8, 2017

Let's go! Thank you @DaveRandom

@mnapoli mnapoli merged commit 96263f6 into mnapoli:master Apr 8, 2017
mnapoli added a commit to mnapoli/externals that referenced this pull request Apr 8, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants
0