8000 Remove utf8_encode() by DaveRandom · Pull Request #5 · mnapoli/imapi · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Remove utf8_encode() #5

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 8, 2017
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 2 additions & 2 deletions src/EmailFactory.php
Original file line number Diff line number Diff line change
Expand Up @@ -29,8 +29,8 @@ public function create(string $mailbox, Horde_Imap_Client_Data_Fetch $hordeEmail
// Parse the message body
$parser = new Parser();
$parser->setText($hordeEmail->getFullMsg());
$htmlContent = utf8_encode((string) $parser->getMessageBody('html'));
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code change requires a test case.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😢 I'm terribly ashamed to admit I didn't write tests in this package (testing something calling IMAP would mean mocking a lot of stuff which defeats the purpose of the tests, or putting in a lot of work into a setup with Vagrant + IMAP server). It was the kind of thing I left for later (this package is more of a prototype than a complete solution)

I'm tempted to try it the old way: merge it and see what happens :D

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

merge it and see what happens :D

3b44c5

Copy link
Contributor Author
@DaveRandom DaveRandom Apr 3, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mnapoli Since the symptom is demonstrably a double-encoding of ISO-8859-1 -> UTF-8 and that is precisely what utf8_encode() would do here, I am very confident that this would fix the problem.

Just as a concrete example of this being what is happening:

https://externals.io/thread/787#email-14713

Sara Golemon a écrit :

In ISO-8859-1:

'Ã' === "\xC3"
'©' === "\xA9"

In UTF-8:

'é' === "\xC3\xA9"

If you are ever having encoding issues and you see things that start with à it's a pretty good indicator that this is what is happening, quite a few commonly used non-ASCII chars start with this byte when UTF-8 encoded :-)

$textContent = utf8_encode((string) $parser->getMessageBody('text'));
$htmlContent = (string) $parser->getMessageBody('html');
$textContent = (string) $parser->getMessageBody('text');

// Filter HTML body to have only safe HTML
$htmlContent = trim($this->htmlFilter->purify($htmlContent));
Expand Down
0