-
Notifications
You must be signed in to change notification settings - Fork 555
Escape non-printable characters in JSON output #1949
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1949 +/- ##
==========================================
+ Coverage 70.94% 71.12% +0.18%
==========================================
Files 658 658
Lines 29031 29072 +41
Branches 3154 3158 +4
==========================================
+ Hits 20595 20677 +82
+ Misses 6649 6602 -47
- Partials 1787 1793 +6 ☔ View full report in Codecov by Sentry. |
@riemass what about the JOSN reader? Do we already support the |
I didn't think of it, good catch. I'll update the PR once ready. |
@Neverlord question regarding the |
Using local encoding would go against the spec (see 8.1):
If a user wants to render a received JSON string in local encoding, they should to the (most likely lossy) conversion themselves.
The way I read it, the latter syntax is only used if a character exceeds a single 16 bit character in UTF-16. Either way, I think the output the API should produce needs to be UTF-8 according to the spec. |
libcaf_core/caf/detail/json.cpp
Outdated
10000 | if (std::distance(i, e) > 6 && i[1] == '\\' && i[2] == 'u') { | |
i += 2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't the iterator be over the \
at this point? Then it would be i[0]
holding \
and i[1]
holding u
.
If i[2]
would hold u
, then it should be i += 3
, so something is off. Also wonder why no unit test catches this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The iterator is in fact pointing to the last character of the last sequence (before \
). This is done so that the ++i
below the switch statement can be executed properly (even if there is no second surrogate).
The read_ascii
function uses ++c
instead of c++
so that i+=2
is in fact correct. I see the confusion here, I'll refactor.
Fixes #1937.