Replace wchar_t string decoding implementation with a uint32_t-based one #555
Codecov Report
@@ Coverage Diff @@
## main #555 +/- ##
==========================================
+ Coverage 91.81% 91.84% +0.03%
==========================================
Files 6 6
Lines 1856 1852 -4
==========================================
- Hits 1704 1701 -3
+ Misses 152 151 -1
Continue to review full report at Codecov.
This fixes character handling on platforms with 16-bit wchar_t (notably, Windows), which was broken (in different ways) on both CPython and PyPy. Fixes ultrajson#552
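To illustrate why a 16-bit unit is not enough here, the following is a minimal sketch in Python, not the code from this PR. It shows that a code point outside the Basic Multilingual Plane occupies two 16-bit code units (a surrogate pair), whereas a 32-bit integer holds it directly. The helper names `utf16_units` and `combine_surrogates` are hypothetical and exist only for this example.

```python
def utf16_units(text: str) -> list:
    """Return the 16-bit code units that a 16-bit wide-character buffer would hold."""
    data = text.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]


def combine_surrogates(high: int, low: int) -> int:
    """Combine a UTF-16 surrogate pair into a single code point (fits in 32 bits)."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


emoji = "\U0001F600"                      # a code point above U+FFFF
units = utf16_units(emoji)                # two 16-bit units: a surrogate pair
code_point = combine_surrogates(*units)   # one value, as a 32-bit buffer would store it
```

With 32-bit elements, every decoded character is one buffer entry, so no platform-dependent pairing step is needed when the string is handed back to Python.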
eb9c5c1 to bc7bdff
Nice. I was expecting a replacement of all strings to be a much bigger, scarier looking change set.
Yeah, much of the code essentially assumed 32-bit ints already for proper operation, so not many changes were needed at all. Also, just realised I forgot about the benchmarks. Some quick tests right now indicate that it's very marginally faster than the previous code by a couple per cent or so.
…nt32_t`-based one" Backport ultrajson/ultrajson#555
…nt32_t`-based one" (#67) Backport ultrajson/ultrajson#555
…nt32_t`-based one" (explosion#67) Backport ultrajson/ultrajson#555
This fixes character handling on platforms with 16-bit wchar_t (notably, Windows), which was broken (in different ways) on both CPython and PyPy. Fixes #552
Remarks:
- On the Py_UCS4 == JSUINT32 check magic, see the comments on "Surrogates fix fails tests with PyPy on Windows" (#552).
- PyUnicode_FromWideChar does some extra work compared to PyUnicode_FromKindAndData (mostly due to surrogate handling). On 16-bit wchar_t platforms, the larger buffer size might have some impact, though; I don't think I'll be able to run comparisons for that.
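The buffer-size remark can be made concrete with a rough sketch. This is plain Python that only counts bytes, not a benchmark and not the decoder itself, and the helper name `buffer_bytes` is hypothetical. It compares how many bytes a decoded string takes with 2-byte units versus 4-byte units.

```python
def buffer_bytes(text: str, unit_size: int) -> int:
    """Bytes needed to hold `text` in a buffer of 2-byte or 4-byte code units."""
    encoding = {2: "utf-16-le", 4: "utf-32-le"}[unit_size]
    return len(text.encode(encoding))


bmp_only = "héllo wörld"        # 11 characters, all below U+10000
with_emoji = "a\U0001F600"      # one BMP character plus one non-BMP character

bmp_16 = buffer_bytes(bmp_only, 2)      # 2 bytes per character
bmp_32 = buffer_bytes(bmp_only, 4)      # 4 bytes per character
emoji_16 = buffer_bytes(with_emoji, 2)  # the emoji takes a surrogate pair
emoji_32 = buffer_bytes(with_emoji, 4)
```

For text that stays within the BMP, the 32-bit buffer is twice the size of a 16-bit one; the gap narrows as more characters need surrogate pairs. Whether that matters in practice would need measuring on an affected platform.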