Fix handling of surrogates on encoding #530

JustAnotherArchivist · 2022-04-17T03:31:37Z

This allows surrogates anywhere in the input, compatible with the json module from the standard library.

This also refactors two interfaces:

- The `PyUnicode` to `char*` conversion is moved into its own function, separated from the `JSONTypeContext` handling, so it can be reused for other things in the future (e.g. indentation and separators) which don't have a type context.
- Converting the `char*` output to a Python string with surrogates intact requires the string length for `PyUnicode_Decode` & Co. While `strlen` could be used, the length is already known inside the encoder, so the encoder function now also takes an extra `size_t` pointer argument to return that and no longer NUL-terminates the string. This also permits output that contains NUL bytes (even though that would be invalid JSON), e.g. if an object's `__json__` method return value were to contain them.
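
For reference, the standard-library behaviour this PR aims to match can be sketched in pure Python. This is only an illustration using `json` and the `surrogatepass` error handler; it does not call ujson, and the variable names are made up for the example.

```python
import json

lone = "\ud83d"  # a high surrogate without its low partner

# The standard library's json module accepts a lone surrogate.
escaped = json.dumps(lone)                  # escaped as \ud83d
raw = json.dumps(lone, ensure_ascii=False)  # surrogate kept as-is

# Strict UTF-8 rejects lone surrogates; "surrogatepass" lets them through.
try:
    raw.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False
encoded = raw.encode("utf-8", "surrogatepass")

# A bytes object carries its own length, so decoding round-trips the
# surrogate and also tolerates embedded NUL bytes (strlen in C would not).
roundtrip = encoded.decode("utf-8", "surrogatepass")
with_nul = b'"a\x00b"'.decode("utf-8", "surrogatepass")

print(escaped, strict_ok, encoded, len(with_nul))
```

The last two lines mirror why the encoder now returns an explicit length instead of relying on NUL termination.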

Fixes #156
Fixes #447
Fixes #537
Supersedes #284

codecov-commenter · 2022-04-17T03:32:55Z

Codecov Report

Merging #530 (59aa3bf) into main (b300d64) will decrease coverage by 0.07%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main     #530      +/-   ##
==========================================
- Coverage   91.76%   91.68%   -0.08%     
==========================================
  Files           6        6              
  Lines        1821     1828       +7     
==========================================
+ Hits         1671     1676       +5     
- Misses        150      152       +2

| Impacted Files | Coverage Δ | |
|---|---|---|
| lib/ultrajsonenc.c | `85.78% <100.00%> (-0.30%)` | ⬇️ |
| python/objToJSON.c | `90.21% <100.00%> (-0.22%)` | ⬇️ |
| tests/test_ujson.py | `99.61% <100.00%> (+<0.01%)` | ⬆️ |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

JustAnotherArchivist · 2022-04-17T03:38:32Z

Regarding the change of the JSON_EncodeObject interface, I figured that this is the cleanest way to implement it. However, if it's better avoided (is that considered public API? I'm not really sure...), the information is also available in the encoder struct, so it could be calculated from that in objToJSON instead.

Erotemic · 2022-04-21T21:19:42Z

@JustAnotherArchivist I ran benchmarks on this branch and on main to compare them.

Trying to compare those numbers really made me wish I had an easy way to dump the numbers from specific runs into a file and then reload them, so I could compare across versions using t-tests instead of trying to look at these pointwise measurements. I ran each benchmark 4 times. Here are the results from the main branch:

#########
# Before: main - b3f8754c8a0c743e9f80c06472ec7a7adc96f438

|                                                                               | ujson      | nujson     | orjson     | simplejson | json       |
|-------------------------------------------------------------------------------|-----------:|-----------:|-----------:|-----------:|-----------:|
| Array with 256 doubles                                                        |            |            |            |            |            |
| encode                                                                        |     27,517 |      8,628 |    146,771 |      6,223 |      6,508 |
| decode                                                                        |     51,582 |     74,452 |     80,940 |     18,849 |     20,600 |
| Array with 256 UTF-8 strings                                                  |            |            |            |            |            |
| encode                                                                        |      6,343 |      5,106 |     29,019 |      5,143 |      5,437 |
| decode                                                                        |      3,200 |      3,222 |      2,096 |        650 |      2,814 |
| Array with 256 strings                                                        |            |            |            |            |            |
| encode                                                                        |     81,984 |     55,767 |    151,685 |     30,407 |     44,763 |
| decode                                                                        |     47,109 |     45,275 |     64,663 |     59,936 |     62,081 |
| Medium complex object                                                         |            |            |            |            |            |
| encode                                                                        |     21,927 |     23,676 |     68,060 |      8,082 |      9,093 |
| decode                                                                        |     19,358 |     20,629 |     26,938 |     11,895 |     15,046 |
| Array with 256 True values                                                    |            |            |            |            |            |
| encode                                                                        |    231,129 |    198,916 |    714,383 |    132,045 |    139,363 |
| decode                                                                        |    383,955 |    310,780 |    398,080 |    250,025 |    266,290 |
| Array with 256 dict{string, int} pairs                                        |            |            |            |            |            |
| encode                                                                        |     25,301 |     26,959 |     96,947 |      6,824 |     12,689 |
| decode                                                                        |     22,374 |     20,400 |     27,277 |     13,244 |     17,475 |
| Dict with 256 arrays with 256 dict{string, int} pairs                         |            |            |            |            |            |
| encode                                                                        |         86 |         93 |        237 |         20 |         43 |
| decode                                                                        |         56 |         54 |         57 |         33 |         42 |
| Dict with 256 arrays with 256 dict{string, int} pairs, outputting sorted keys |            |            |            |            |            |
| encode                                                                        |         82 |         68 |            |         15 |         44 |
| Complex object                                                                |            |            |            |            |            |
| encode                                                                        |        874 |      1,052 |            |        749 |        768 |
| decode                                                                        |        779 |        762 |            |        293 |        531 |


- CPython 3.9.9 (main, Jan  6 2022, 18:33:12) [GCC 10.3.0]
- ujson        : 5.2.1.dev14

|                                                                               | ujson      |
|-------------------------------------------------------------------------------|-----------:|
| Array with 256 doubles                                                        |            |
| encode                                                                        |     32,143 |
| decode                                                                        |     53,294 |
| Array with 256 UTF-8 strings                                                  |            |
| encode                                                                        |      6,551 |
| decode                                                                        |      3,488 |
| Array with 256 strings                                                        |            |
| encode                                                                        |     84,152 |
| decode                                                                        |     44,547 |
| Medium complex object                                                         |            |
| encode                                                                        |     20,955 |
| decode                                                                        |     19,723 |
| Array with 256 True values                                                    |            |
| encode                                                                        |    222,522 |
| decode                                                                        |    388,101 |
| Array with 256 dict{string, int} pairs                                        |            |
| encode                                                                        |     23,855 |
| decode                                                                        |     22,189 |
| Dict with 256 arrays with 256 dict{string, int} pairs                         |            |
| encode                                                                        |         90 |
| decode                                                                        |         51 |
| Dict with 256 arrays with 256 dict{string, int} pairs, outputting sorted keys |            |
| encode                                                                        |         73 |
| Complex object                                                                |            |
| encode                                                                        |        770 |
| decode                                                                        |        762 |


|                                                                               | ujson      |
|-------------------------------------------------------------------------------|-----------:|
| Array with 256 doubles                                                        |            |
| encode                                                                        |     38,800 |
| decode                                                                        |     55,194 |
| Array with 256 UTF-8 strings                                                  |            |
| encode                                                                        |      6,370 |
| decode                                                                        |      3,460 |
| Array with 256 strings                                                        |            |
| encode                                                                        |     78,901 |
| decode                                                                        |     43,462 |
| Medium complex object                                                         |            |
| encode                                                                        |     22,154 |
| decode                                                                        |     20,283 |
| Array with 256 True values                                                    |            |
| encode                                                                        |    228,851 |
| decode                                                                        |    402,941 |
| Array with 256 dict{string, int} pairs                                        |            |
| encode                                                                        |     24,935 |
| decode                                                                        |     23,396 |
| Dict with 256 arrays with 256 dict{string, int} pairs                         |            |
| encode                                                                        |         91 |
| decode                                                                        |         52 |
| Dict with 256 arrays with 256 dict{string, int} pairs, outputting sorted keys |            |
| encode                                                                        |         78 |
| Complex object                                                                |            |
| encode                                                                        |        826 |
| decode                                                                        |        753 |


|                                                                               | ujson      |
|-------------------------------------------------------------------------------|-----------:|
| Array with 256 doubles                                                        |            |
| encode                                                                        |     31,242 |
| decode                                                                        |     52,362 |
| Array with 256 UTF-8 strings                                                  |            |
| encode                                                                        |      6,217 |
| decode                                                                        |      3,581 |
| Array with 256 strings                                                        |            |
| encode                                                                        |     82,842 |
| decode                                                                        |     46,859 |
| Medium complex object                                                         |            |
| encode                                                                        |     20,831 |
| decode                                                                        |     20,506 |
| Array with 256 True values                                                    |            |
| encode                                                                        |    220,400 |
| decode                                                                        |    405,073 |
| Array with 256 dict{string, int} pairs                                        |            |
| encode                                                                        |     25,198 |
| decode                                                                        |     22,576 |
| Dict with 256 arrays with 256 dict{string, int} pairs                         |            |
| encode                                                                        |         93 |
| decode                                                                        |         57 |
| Dict with 256 arrays with 256 dict{string, int} pairs, outputting sorted keys |            |
| encode                                                                        |         85 |
| Complex object                                                                |            |
| encode                                                                        |        852 |
| decode                                                                        |        708 |

And here are the results from this branch:

########
# After: c9df7129c1dbd94f333c0ebd9de8470142aa092d

|                                                                               | ujson      | nujson     | orjson     | simplejson | json       |
|-------------------------------------------------------------------------------|-----------:|-----------:|-----------:|-----------:|-----------:|
| Array with 256 doubles                                                        |            |            |            |            |            |
| encode                                                                        |     38,501 |      9,889 |    141,171 |      7,040 |      6,792 |
| decode                                                                        |     52,256 |     80,021 |     85,377 |     19,820 |     19,733 |
| Array with 256 UTF-8 strings                                                  |            |            |            |            |            |
| encode                                                                        |      5,817 |      5,453 |     28,334 |      5,393 |      5,704 |
| decode                                                                        |      3,600 |      3,509 |      2,195 |        715 |      3,171 |
| Array with 256 strings                                                        |            |            |            |            |            |
| encode                                                                        |     85,067 |     54,707 |    143,638 |     30,560 |     42,362 |
| decode                                                                        |     45,363 |     41,422 |     67,748 |     58,435 |     61,740 |
| Medium complex object                                                         |            |            |            |            |            |
| encode                                                                        |     18,818 |     24,514 |     69,272 |      8,066 |     10,271 |
| decode                                                                        |     19,737 |     21,578 |     28,814 |     11,917 |     15,318 |
| Array with 256 True values                                                    |            |            |            |            |            |
| encode                                                                        |    230,512 |    201,239 |    744,935 |    135,794 |    139,228 |
| decode                                                                        |    385,795 |    294,987 |    398,714 |    239,698 |    266,018 |
| Array with 256 dict{string, int} pairs                                        |            |            |            |            |            |
| encode                                                                        |     22,762 |     14,233 |     66,452 |      6,463 |     12,255 |
| decode                                                                        |     22,778 |     22,681 |     27,304 |     11,938 |     18,121 |
| Dict with 256 arrays with 256 dict{string, int} pairs                         |            |            |            |            |            |
| encode                                                                        |         85 |         92 |        333 |         21 |         45 |
| decode                                                                        |         53 |         52 |         53 |         37 |         44 |
| Dict with 256 arrays with 256 dict{string, int} pairs, outputting sorted keys |            |            |            |            |            |
| encode                                                                        |         72 |         71 |            |         16 |         45 |
| Complex object                                                                |            |            |            |            |            |
| encode                                                                        |        806 |      1,036 |            |        711 |        763 |
| decode                                                                        |        754 |        759 |            |        280 |        505 |



|                                                                               | ujson      |
|-------------------------------------------------------------------------------|-----------:|
| Array with 256 doubles                                                        |            |
| encode                                                                        |     41,131 |
| decode                                                                        |     55,802 |
| Array with 256 UTF-8 strings                                                  |            |
| encode                                                                        |      5,655 |
| decode                                                                        |      3,403 |
| Array with 256 strings                                                        |            |
| encode                                                                        |     85,681 |
| decode                                                                        |     47,392 |
| Medium complex object                                                         |            |
| encode                                                                        |     19,911 |
| decode                                                                        |     19,984 |
| Array with 256 True values                                                    |            |
| encode                                                                        |    227,750 |
| decode                                                                        |    394,301 |
| Array with 256 dict{string, int} pairs                                        |            |
| encode                                                                        |     24,164 |
| decode                                                                        |     23,753 |
| Dict with 256 arrays with 256 dict{string, int} pairs                         |            |
| encode                                                                        |         86 |
| decode                                                                        |         47 |
| Dict with 256 arrays with 256 dict{string, int} pairs, outputting sorted keys |            |
| encode                                                                        |         68 |
| Complex object                                                                |            |
| encode                                                                        |        792 |
| decode                                                                        |        725 |


|                                                                               | ujson      |
|-------------------------------------------------------------------------------|-----------:|
| Array with 256 doubles                                                        |            |
| encode                                                                        |     38,121 |
| decode                                                                        |     54,035 |
| Array with 256 UTF-8 strings                                                  |            |
| encode                                                                        |      5,180 |
| decode                                                                        |      3,581 |
| Array with 256 strings                                                        |            |
| encode                                                                        |     84,799 |
| decode                                                                        |     45,156 |
| Medium complex object                                                         |            |
| encode                                                                        |     17,860 |
| decode                                                                        |     21,003 |
| Array with 256 True values                                                    |            |
| encode                                                                        |    214,279 |
| decode                                                                        |    369,041 |
| Array with 256 dict{string, int} pairs                                        |            |
| encode                                                                        |     23,489 |
| decode                                                                        |     23,781 |
| Dict with 256 arrays with 256 dict{string, int} pairs                         |            |
| encode                                                                        |         87 |
| decode                                                                        |         52 |
| Dict with 256 arrays with 256 dict{string, int} pairs, outputting sorted keys |            |
| encode                                                                        |         64 |
| Complex object                                                                |            |
| encode                                                                        |        761 |
| decode                                                                        |        724 |


|                                                                               | ujson      |
|-------------------------------------------------------------------------------|-----------:|
| Array with 256 doubles                                                        |            |
| encode                                                                        |     38,605 |
| decode                                                                        |     53,182 |
| Array with 256 UTF-8 strings                                                  |            |
| encode                                                                        |      5,304 |
| decode                                                                        |      3,605 |
| Array with 256 strings                                                        |            |
| encode                                                                        |     81,199 |
| decode                                                                        |     45,518 |
| Medium complex object                                                         |            |
| encode                                                                        |     19,501 |
| decode                                                                        |     21,424 |
| Array with 256 True values                                                    |            |
| encode                                                                        |    226,025 |
| decode                                                                        |    371,919 |
| Array with 256 dict{string, int} pairs                                        |            |
| encode                                                                        |     22,941 |
| decode                                                                        |     24,784 |
| Dict with 256 arrays with 256 dict{string, int} pairs                         |            |
| encode                                                                        |         88 |
| decode                                                                        |         57 |
| Dict with 256 arrays with 256 dict{string, int} pairs, outputting sorted keys |            |
| encode                                                                        |         77 |
| Complex object                                                                |            |
| encode                                                                        |        832 |
| decode                                                                        |        741 |

Overall, I think there might be a slight performance regression here, but it's hard to be sure without the t-test. The "Array with 256 UTF-8" test does seem to be ~16% slower.
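
As a rough check of that figure, the four ujson encode results for "Array with 256 UTF-8 strings" from the tables above can be compared directly. This is a minimal sketch that assumes the numbers are throughput (higher is better); it computes a Welch t statistic by hand with the standard library and deliberately stops short of a p-value, since four runs per side is very little data.

```python
import math
import statistics

# ujson encode results for "Array with 256 UTF-8 strings", four runs each,
# copied from the tables above (assumed to be throughput, higher is better).
before = [6343, 6551, 6370, 6217]  # main
after = [5817, 5655, 5180, 5304]   # this branch

mean_before = statistics.mean(before)
mean_after = statistics.mean(after)

# The same gap expressed two ways: lost throughput vs. extra time per call.
throughput_drop = 1 - mean_after / mean_before
time_increase = mean_before / mean_after - 1

# Welch's t statistic (unequal variances), using sample variances.
std_err = math.sqrt(
    statistics.variance(before) / len(before)
    + statistics.variance(after) / len(after)
)
welch_t = (mean_before - mean_after) / std_err

print(f"throughput drop: {throughput_drop:.1%}")
print(f"time per call increase: {time_increase:.1%}")
print(f"Welch t statistic: {welch_t:.2f}")
```

Under that assumption the "~16% slower" figure corresponds to time per call; measured as throughput the drop is closer to 14%.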

Erotemic · 2022-05-29T23:27:07Z

So it does seem that there is a very slight, but measurable performance regression. I suppose this is to be expected given that this is doing strictly more than the previous code. I'm not sure if there is any way around it. But I did 104 paired t-tests and got that (impl_version=5.2.1.dev51 without this PR) performs better than (impl_version=5.2.1.dev9 with this PR) on average (p=0.00001616).

Here is an example plot:

Might need to run it a few more times to buff up the stats because I'm getting a high std on the times for this PR. It's probably a fluke, but more stats will help figure that out.

…code_AsEncodedString

JustAnotherArchivist · 2022-05-30T06:41:27Z

@Erotemic Thank you! It's good that you see a difference between the two because there should be one in the code so far as it does extra work (the encoding string comparison, in particular). :-)
The array of dict loads graph is funny. Over an order of magnitude of stdev for that? That's insane...

I've also run my own benchmarks tonight and noticed that I appear to have introduced a memory leak here (which, of course, would also have a performance impact via constant allocations). I don't immediately see where that might be coming from though. It only occurs on non-ASCII characters, so I'm pretty sure it has to be related to my use of PyUnicode_AsEncodedString. But PyUnicodeToUTF8 stores the bytes object pointer in the context's newObj member (by passing that to PyUnicodeToUTF8Raw), which should get XDECREF'd in Object_endTypeContext. An easy way to reproduce is this, which peaks at 481 MB RSS for me (reduce the number to get a smaller leak, naturally) compared to ~10 MB RSS with main or an ASCII character:

python3 -c 'import ujson,timeit; o = "‽"; timeit.timeit("ujson.dumps(o)", number = 10000000, globals = globals())'

~~I haven't analysed it yet, only just saw it minutes ago really, so it might be something totally obvious. If anyone sees something, please let me know, else I'll throw some tooling at it tomorrow.~~

Edit: Never mind, I realised what's going on. I need to pass the newObj pointer in by reference of course, not by value, else it's never overwritten.
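
The one-liner above can also be turned into a small reusable check. The sketch below is an assumption-laden helper, not part of the PR: `peak_rss_growth` is a hypothetical name, it relies on the Unix-only `resource` module, and it uses `json.dumps` as a stand-in so it runs without ujson installed (swap in `ujson.dumps` to test a build).

```python
import json
import resource
import timeit


def peak_rss_growth(func, number):
    """Return how much the process's peak RSS grew while calling func.

    ru_maxrss is reported in KiB on Linux and in bytes on macOS, so the
    result is only comparable between runs on the same platform.
    """
    before = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    timeit.timeit(func, number=number)
    after = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    return after - before


obj = "‽"  # non-ASCII, the case that triggered the leak
# json.dumps is a stand-in here; replace it with ujson.dumps to test a build.
growth = peak_rss_growth(lambda: json.dumps(obj), number=100_000)
print(f"peak RSS growth: {growth}")
```

A leak-free encoder should show growth that stays roughly flat as `number` increases, whereas a leak grows with it.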

Erotemic · 2022-05-30T15:09:23Z

> Over an order of magnitude of stdev for that? That's insane...

One of my FOSS goals to accomplish on my vacation was to get nice benchmarks for this. I went a bit overboard and started writing a very general benchmark and analysis framework (#542), and I scrambled a bit on Sunday night to get something out the door for this PR. As such, I wouldn't be surprised if I just didn't run enough iterations or use robust enough outlier rejection. Next chance I get to work on this, I'll run a more thorough set of measurements. Another clue that this could be the case is that Python's json also has a high std for some values of size, but not others.

I could also see a memory leak messing with cache efficiency, so perhaps now that that's fixed the high std will go away.

…leak

Erotemic · 2022-05-31T01:54:37Z

I ran the stats again 10 times on 5.2.1.dev9, and found that the std did not go away. In fact it got smoother, indicating that it might be a real effect (although I'm not sure I can explain the behavior of Python's json in the dumps variant of that test).

However, when updating the branch to the latest (which is now 5.3.1.dev3), the issue does seem to have gone away.

With the fix to the memory leak, the benchmarks are much closer. The openskill win probs give this branch a 33.5% and main a 37.4%.

Overall they are very close:

mean_time        count      mean       std           min       25%       50%       75%       max
impl_version                                                                                    
5.3.1.dev3    247192.0  0.000334  0.001044  3.670000e-07  0.000001  0.000010  0.000144  0.008951
5.2.1.dev51   157304.0  0.000335  0.001056  3.705000e-07  0.000001  0.000010  0.000142  0.009026

	('json', '2.0.9')	('ujson', '5.2.1.dev51')	('ujson', '5.2.1.dev9')	('ujson', '5.3.1.dev3')
('Array of Dict[str, int]', 'dumps')	13,465.78	24,483.13	24,079.77	24,448.54
('Array of Dict[str, int]', 'loads')	19,205.54	24,485.10	23,134.04	24,175.13
('Array with True values', 'dumps')	143,011.13	247,006.81	239,535.62	246,471.54
('Array with True values', 'loads')	265,125.40	374,553.88	363,526.20	369,392.79
('Array with UTF-8 strings', 'dumps')	6,098.56	8,523.53	7,072.90	8,275.22
('Array with UTF-8 strings', 'loads')	3,555.35	3,828.23	3,830.45	3,833.94
('Array with doubles', 'dumps')	7,808.38	41,552.05	41,808.82	41,069.89
('Array with doubles', 'loads')	22,633.05	57,433.73	56,682.77	57,067.74
('Complex object', 'dumps')	800.46	1,153.45	1,110.43	1,159.12
('Complex object', 'loads')	556.07	815.78	807.11	813.17

JustAnotherArchivist · 2022-05-31T02:09:22Z

@Erotemic Thank you! This does match what I'm seeing in my own benchmarks as well: very similar timings and no statistically significant differences, though I'm not very confident in my statistics, so I'll exclude that here.

I wrote my own small script rather than using the existing benchmarks for that. I'm specifically focusing only on dumps involving strings since anything else should be unaffected. I'm comparing main, this PR with encoding = "utf-8", and this PR with encoding = NULL; the middle one is mostly included because I wondered what the performance impact of that string comparison is. I built ujson in debug and production mode for all of these (though I doubt that matters). I'm running the tests on Python 3.10.1 and Debian using timeit.repeat (i.e. many runs in each measurement). The number of repetitions is determined dynamically such that one measurement takes roughly 0.03 seconds, and then that's repeated 100 times. I'm testing strings of different lengths and ASCII/UTF-8 on their own, in lists, in dicts, and in dicts with sort_keys = True, 70 tests in total. That should cover everything of relevance to this PR, I think, but if anyone has additional suggestions, please let me know.

Code

Warning: poor and dirty code ahead.

Building from three branches in the repo: main (b300d64), fix-encode-surrogates (59aa3bf), and fix-encode-surrogates-with-string-comp (9b9af1a plus cherry-pick 59aa3bf)

for branch in main fix-encode-surrogates fix-encode-surrogates-with-string-comp
do
	git switch ${branch} &&
	make clean && make build &&
	mv -nv ujson.cpython-310-x86_64-linux-gnu.so ujson.cpython-310-x86_64-linux-gnu.so_${branch}_debug &&
	make clean && make build-prod &&
	mv -nv ujson.cpython-310-x86_64-linux-gnu.so ujson.cpython-310-x86_64-linux-gnu.so_${branch}_prod
done

with this simple Makefile:

clean:
	rm -rf build/ ujson.cpython-*.so

build:
	CFLAGS='-DDEBUG' python3 setup.py develop

build-prod:
	python3 setup.py develop

This results in six .so files:

ujson.cpython-310-x86_64-linux-gnu.so_fix-encode-surrogates_debug
ujson.cpython-310-x86_64-linux-gnu.so_fix-encode-surrogates_prod
ujson.cpython-310-x86_64-linux-gnu.so_fix-encode-surrogates-with-string-comp_debug
ujson.cpython-310-x86_64-linux-gnu.so_fix-encode-surrogates-with-string-comp_prod
ujson.cpython-310-x86_64-linux-gnu.so_main_debug
ujson.cpython-310-x86_64-linux-gnu.so_main_prod

Test driver script (./test):

#!/bin/bash
set -e

strings=(
	'""'
	'"a"'
	'"a" * 10'
	'"a" * 1000'
	'"‽"'
	'"‽" * 10'
	'"‽" * 1000'
)

os=("${strings[@]}")
for string in "${strings[@]}"
do
	os+=(
		"[${string}]"
		"[${string} for i in range(10)]"
		"[${string} for i in range(100)]"
		"{${string}: ${string}}"
		"{${string} + str(i): ${string} for i in range(10)}"
		"{${string} + str(i): ${string} for i in range(100)}"
	)
done

for o in "${os[@]}"
do
	num=
	for extra in '' sort
	do
		for so in ujson.cpython-310-x86_64-linux-gnu.so_*
		do
			echo "${so} ${o} ${extra}" >&2
			ln -sf "${so}" ujson.cpython-310-x86_64-linux-gnu.so

			if [[ -z "${num}" ]]
			then
				num=$(python3 test.py --num "${o}")
				echo "Repeats: ${num}" >&2
			fi

			python3 test.py "${so}" "${o}" "${num}" ${extra}
		done
		if [[ "${o}" != '{'* ]]
		then
			break
		fi
	done
done
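
As a sanity check on the "70 tests in total" figure, the enumeration done by the driver script above can be mirrored in a few lines of Python. This is only a sketch of the counting logic; the function name build_cases is made up for illustration and nothing here is part of the benchmark itself.

```python
STRINGS = ['""', '"a"', '"a" * 10', '"a" * 1000', '"‽"', '"‽" * 10', '"‽" * 1000']


def build_cases(strings):
    """Return (expression, sorted) pairs the same way the bash driver enumerates them."""
    objects = list(strings)  # the bare strings are tested on their own first
    for s in strings:
        objects += [
            f"[{s}]",
            f"[{s} for i in range(10)]",
            f"[{s} for i in range(100)]",
            f"{{{s}: {s}}}",
            f"{{{s} + str(i): {s} for i in range(10)}}",
            f"{{{s} + str(i): {s} for i in range(100)}}",
        ]
    cases = []
    for o in objects:
        cases.append((o, False))
        # Only dict expressions get a second run with sort_keys = True.
        if o.startswith("{"):
            cases.append((o, True))
    return cases


cases = build_cases(STRINGS)
print(len(cases), "cases per build,", sum(1 for _, s in cases if s), "of them sorted")
```

That is 7 bare strings plus 7 × 6 containers, i.e. 49 objects, and the 21 dict objects are each measured a second time with sorting, which gives 70 cases for each of the six builds.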

Actual test code:

import sys
import timeit
import ujson

o = eval(sys.argv[2])
stmt = 'ujson.dumps(o)'
if sys.argv[4:]:
	stmt = 'ujson.dumps(o, sort_keys = True)'
t = timeit.Timer(stmt, globals = globals())

if sys.argv[1] == '--num':
	num = None
	def cb(number, timeTaken):
		global num
		num = number
	t.autorange(cb)
	num = num // 10
	print(num)
	sys.exit(0)
else:
	num = int(sys.argv[3])
out = t.repeat(repeat = 100, number = num)
print(ujson.dumps({'so': sys.argv[1], 'o': sys.argv[2], 'sorted': bool(sys.argv[4:]), 'stmt': stmt, 'num': num, 'times': out}))

Aggregation script:

import collections
import json
import math
import sys


UNITS = ['s', 'ms', 'μs']
def format_time(t):
	for unit in UNITS:
		if t >= 1:
			return f'{t:.3f} {unit}'
		t *= 1000
	return f'{t:.3f} ns'


names = {
	'ujson.cpython-310-x86_64-linux-gnu.so_fix-encode-surrogates_debug': 'PR dbg',
	'ujson.cpython-310-x86_64-linux-gnu.so_fix-encode-surrogates_prod': 'PR',
	'ujson.cpython-310-x86_64-linux-gnu.so_fix-encode-surrogates-with-string-comp_debug': 'PR utf-8 dbg',
	'ujson.cpython-310-x86_64-linux-gnu.so_fix-encode-surrogates-with-string-comp_prod': 'PR utf-8',
	'ujson.cpython-310-x86_64-linux-gnu.so_main_debug': 'main dbg',
	'ujson.cpython-310-x86_64-linux-gnu.so_main_prod': 'main',
}


objs = [json.loads(line) for line in sys.stdin]
nums = collections.defaultdict(set)
for obj in objs:
	nums[(obj['o'], obj['sorted'])].add(obj['num'])
assert all(len(s) == 1 for s in nums.values()), nums
nums = {k: v.pop() for k, v in nums.items()}

results = collections.defaultdict(dict)
for obj in objs:
	results[(obj['o'], obj['sorted'])][names[obj['so']]] = [t / nums[(obj['o'], obj['sorted'])] for t in obj['times']]
assert all(sorted(v.keys()) == sorted(list(results.values())[0].keys()) for v in results.values())
table = []
sos = sorted(list(results.values())[0].keys())
table.append(['object', 'repetitions'] + sos)
for o, sorted_ in sorted(results.keys()):
	objStr = f'`{o}`' + (' (sorted)' if sorted_ else '')
	num = nums[(o, sorted_)]
	row = [objStr, num]
	for so in sos:
		times = results[(o, sorted_)][so]
		avg = sum(times) / len(times)
		std = math.sqrt(sum((t - avg) ** 2 for t in times) / (len(times) - 1))
		row.append(f'{format_time(avg)} ± {format_time(std)}')
	table.append(row)

for row in table:
	print(' | '.join(map(str, row)))

It isn't pretty, but it gets the job done. :-)

Detailed results

Times are mean ± standard deviation across the 100 timings of calls to ujson.dumps (divided by the repetitions, so it's the time for one call).
main = b300d64, PR = 59aa3bf, PR utf-8 = 59aa3bf but with encoding = "utf-8" in the PyUnicode_AsEncodedString calls (9b9af1a plus cherry-pick 59aa3bf); 'dbg' = -DDEBUG builds

object	repetitions	PR	PR dbg	PR utf-8	PR utf-8 dbg	main	main dbg
`""`	100000	330.962 ns ± 21.579 ns	331.388 ns ± 26.562 ns	330.559 ns ± 9.437 ns	327.031 ns ± 12.451 ns	357.985 ns ± 66.177 ns	343.491 ns ± 35.661 ns
`"a"`	100000	331.411 ns ± 8.811 ns	331.438 ns ± 19.186 ns	335.556 ns ± 25.898 ns	334.828 ns ± 23.093 ns	341.940 ns ± 28.286 ns	342.782 ns ± 13.895 ns
`"a" * 10`	100000	359.705 ns ± 23.576 ns	374.189 ns ± 43.659 ns	361.411 ns ± 10.400 ns	373.530 ns ± 64.270 ns	367.340 ns ± 29.009 ns	408.382 ns ± 46.798 ns
`"a" * 1000`	10000	2.062 μs ± 64.108 ns	2.670 μs ± 272.481 ns	2.003 μs ± 123.380 ns	2.570 μs ± 73.442 ns	2.098 μs ± 70.130 ns	2.581 μs ± 163.473 ns
`"‽"`	100000	388.676 ns ± 18.003 ns	388.502 ns ± 24.181 ns	404.346 ns ± 26.557 ns	404.844 ns ± 29.161 ns	407.890 ns ± 58.940 ns	388.106 ns ± 19.613 ns
`"‽" * 10`	50000	555.224 ns ± 108.290 ns	514.468 ns ± 58.516 ns	507.981 ns ± 36.753 ns	507.013 ns ± 17.370 ns	527.522 ns ± 74.904 ns	513.023 ns ± 75.479 ns
`"‽" * 1000`	5000	7.865 μs ± 1.002 μs	8.810 μs ± 1.401 μs	7.348 μs ± 342.788 ns	8.477 μs ± 1.059 μs	7.496 μs ± 502.910 ns	8.811 μs ± 1.379 μs
`["" for i in range(10)]`	50000	808.299 ns ± 38.030 ns	844.152 ns ± 58.747 ns	843.949 ns ± 70.050 ns	818.540 ns ± 20.382 ns	823.642 ns ± 57.387 ns	815.330 ns ± 17.406 ns
`["" for i in range(100)]`	5000	4.864 μs ± 262.122 ns	4.689 μs ± 156.891 ns	4.979 μs ± 464.768 ns	4.806 μs ± 264.168 ns	4.840 μs ± 553.113 ns	4.978 μs ± 602.529 ns
`[""]`	50000	464.924 ns ± 44.260 ns	445.698 ns ± 16.507 ns	460.880 ns ± 64.904 ns	498.537 ns ± 101.675 ns	553.022 ns ± 153.658 ns	465.797 ns ± 74.576 ns
`["a" * 10 for i in range(10)]`	20000	1.009 μs ± 117.170 ns	1.068 μs ± 66.220 ns	977.184 ns ± 30.245 ns	1.039 μs ± 79.822 ns	1.010 μs ± 78.960 ns	1.057 μs ± 52.710 ns
`["a" * 10 for i in range(100)]`	5000	6.323 μs ± 576.793 ns	6.747 μs ± 179.390 ns	6.347 μs ± 362.491 ns	6.722 μs ± 260.656 ns	6.508 μs ± 276.877 ns	6.738 μs ± 362.473 ns
`["a" * 1000 for i in range(10)]`	1000	15.841 μs ± 563.196 ns	21.570 μs ± 773.005 ns	15.783 μs ± 1.702 μs	21.437 μs ± 761.532 ns	16.631 μs ± 717.250 ns	21.774 μs ± 1.680 μs
`["a" * 1000 for i in range(100)]`	100	160.445 μs ± 9.618 μs	251.155 μs ± 16.864 μs	166.693 μs ± 18.840 μs	247.247 μs ± 18.211 μs	165.248 μs ± 5.803 μs	248.047 μs ± 16.170 μs
`["a" * 1000]`	10000	1.985 μs ± 70.743 ns	2.537 μs ± 114.215 ns	1.962 μs ± 43.358 ns	2.536 μs ± 129.308 ns	2.101 μs ± 99.484 ns	2.612 μs ± 258.642 ns
`["a" * 10]`	50000	462.762 ns ± 51.515 ns	460.432 ns ± 48.403 ns	488.346 ns ± 37.602 ns	450.428 ns ± 39.787 ns	472.680 ns ± 69.874 ns	466.314 ns ± 34.472 ns
`["a" for i in range(10)]`	50000	868.938 ns ± 39.353 ns	883.671 ns ± 80.923 ns	844.851 ns ± 26.965 ns	860.837 ns ± 47.408 ns	859.650 ns ± 59.082 ns	860.623 ns ± 60.249 ns
`["a" for i in range(100)]`	5000	5.006 μs ± 280.968 ns	5.146 μs ± 627.172 ns	4.993 μs ± 239.469 ns	4.880 μs ± 266.675 ns	5.051 μs ± 412.666 ns	4.951 μs ± 234.397 ns
`["a"]`	50000	421.045 ns ± 20.659 ns	429.624 ns ± 16.557 ns	431.644 ns ± 70.713 ns	434.711 ns ± 40.802 ns	433.016 ns ± 37.992 ns	437.467 ns ± 37.961 ns
`["‽" * 10 for i in range(10)]`	10000	2.008 μs ± 93.952 ns	2.231 μs ± 378.806 ns	2.158 μs ± 71.979 ns	2.355 μs ± 265.526 ns	2.200 μs ± 412.928 ns	2.022 μs ± 33.960 ns
`["‽" * 10 for i in range(100)]`	2000	16.535 μs ± 1.415 μs	16.848 μs ± 1.895 μs	17.333 μs ± 1.563 μs	18.998 μs ± 1.420 μs	16.014 μs ± 775.730 ns	15.583 μs ± 474.698 ns
`["‽" * 1000 for i in range(10)]`	500	68.734 μs ± 4.193 μs	74.115 μs ± 1.556 μs	68.692 μs ± 3.590 μs	75.018 μs ± 5.948 μs	69.124 μs ± 1.685 μs	75.870 μs ± 2.469 μs
`["‽" * 1000 for i in range(100)]`	20	671.081 μs ± 33.647 μs	1.082 ms ± 99.842 μs	664.693 μs ± 59.960 μs	1.068 ms ± 50.550 μs	700.504 μs ± 81.766 μs	1.184 ms ± 282.350 μs
`["‽" * 1000]`	5000	6.970 μs ± 187.767 ns	7.594 μs ± 179.000 ns	6.993 μs ± 195.717 ns	7.652 μs ± 271.062 ns	7.017 μs ± 162.788 ns	7.661 μs ± 153.459 ns
`["‽" * 10]`	50000	661.926 ns ± 111.926 ns	637.064 ns ± 124.779 ns	658.298 ns ± 104.951 ns	657.599 ns ± 147.319 ns	576.827 ns ± 119.654 ns	566.266 ns ± 43.490 ns
`["‽" for i in range(10)]`	20000	1.290 μs ± 120.439 ns	1.313 μs ± 36.971 ns	1.456 μs ± 130.919 ns	1.570 μs ± 259.890 ns	1.285 μs ± 49.978 ns	1.261 μs ± 60.233 ns
`["‽" for i in range(100)]`	5000	9.176 μs ± 566.832 ns	10.385 μs ± 1.521 μs	10.248 μs ± 871.610 ns	11.451 μs ± 928.792 ns	8.983 μs ± 175.036 ns	9.132 μs ± 404.129 ns
`["‽"]`	50000	459.685 ns ± 19.125 ns	460.470 ns ± 31.992 ns	494.480 ns ± 46.169 ns	487.892 ns ± 15.926 ns	457.412 ns ± 8.236 ns	470.631 ns ± 66.596 ns
`{"" + str(i): "" for i in range(10)}`	20000	1.448 μs ± 178.279 ns	1.504 μs ± 105.600 ns	1.645 μs ± 82.541 ns	1.571 μs ± 105.794 ns	1.779 μs ± 357.363 ns	1.495 μs ± 244.853 ns
`{"" + str(i): "" for i in range(10)}` (sorted)	20000	2.340 μs ± 113.971 ns	2.577 μs ± 520.136 ns	2.590 μs ± 412.858 ns	2.588 μs ± 490.597 ns	2.268 μs ± 85.880 ns	2.635 μs ± 441.124 ns
`{"" + str(i): "" for i in range(100)}`	2000	12.661 μs ± 3.115 μs	13.987 μs ± 4.396 μs	13.162 μs ± 455.384 ns	13.837 μs ± 1.201 μs	11.597 μs ± 1.415 μs	12.076 μs ± 841.905 ns
`{"" + str(i): "" for i in range(100)}` (sorted)	2000	17.813 μs ± 1.518 μs	18.737 μs ± 1.219 μs	19.775 μs ± 4.103 μs	19.637 μs ± 1.113 μs	17.280 μs ± 699.506 ns	18.892 μs ± 4.041 μs
`{"": ""}`	50000	512.216 ns ± 49.006 ns	514.969 ns ± 32.391 ns	535.441 ns ± 40.912 ns	540.447 ns ± 50.803 ns	533.904 ns ± 25.071 ns	510.613 ns ± 24.115 ns
`{"": ""}` (sorted)	50000	992.358 ns ± 68.370 ns	1.030 μs ± 83.488 ns	1.006 μs ± 71.661 ns	1.064 μs ± 73.835 ns	1.045 μs ± 70.550 ns	1.002 μs ± 46.819 ns
`{"a" * 10 + str(i): "a" * 10 for i in range(10)}`	10000	2.032 μs ± 74.509 ns	2.031 μs ± 83.832 ns	2.189 μs ± 125.064 ns	2.225 μs ± 85.969 ns	2.013 μs ± 195.430 ns	2.066 μs ± 176.608 ns
`{"a" * 10 + str(i): "a" * 10 for i in range(10)}` (sorted)	10000	2.948 μs ± 185.798 ns	3.209 μs ± 272.913 ns	3.044 μs ± 305.726 ns	3.176 μs ± 345.496 ns	2.862 μs ± 145.853 ns	2.999 μs ± 141.117 ns
`{"a" * 10 + str(i): "a" * 10 for i in range(100)}`	2000	15.642 μs ± 1.255 μs	16.746 μs ± 574.249 ns	16.764 μs ± 545.844 ns	19.249 μs ± 1.664 μs	15.364 μs ± 1.745 μs	15.931 μs ± 347.373 ns
`{"a" * 10 + str(i): "a" * 10 for i in range(100)}` (sorted)	2000	21.560 μs ± 1.867 μs	22.631 μs ± 1.682 μs	22.990 μs ± 2.037 μs	24.288 μs ± 2.178 μs	21.980 μs ± 2.179 μs	23.018 μs ± 2.295 μs
`{"a" * 1000 + str(i): "a" * 1000 for i in range(10)}`	500	31.836 μs ± 1.909 μs	43.113 μs ± 1.143 μs	32.363 μs ± 2.925 μs	43.804 μs ± 3.606 μs	34.213 μs ± 4.043 μs	43.522 μs ± 1.748 μs
`{"a" * 1000 + str(i): "a" * 1000 for i in range(10)}` (sorted)	500	32.976 μs ± 822.254 ns	45.163 μs ± 4.429 μs	33.829 μs ± 3.523 μs	44.067 μs ± 945.833 ns	33.695 μs ± 3.187 μs	45.067 μs ± 3.510 μs
`{"a" * 1000 + str(i): "a" * 1000 for i in range(100)}`	50	323.762 μs ± 33.111 μs	525.391 μs ± 25.537 μs	339.531 μs ± 55.616 μs	581.457 μs ± 102.544 μs	327.819 μs ± 10.353 μs	540.379 μs ± 37.670 μs
`{"a" * 1000 + str(i): "a" * 1000 for i in range(100)}` (sorted)	50	457.764 μs ± 11.453 μs	578.741 μs ± 39.808 μs	456.773 μs ± 36.551 μs	570.697 μs ± 12.694 μs	462.546 μs ± 35.306 μs	586.032 μs ± 29.981 μs
`{"a" * 1000: "a" * 1000}`	5000	3.638 μs ± 111.291 ns	4.780 μs ± 260.507 ns	4.117 μs ± 938.619 ns	4.763 μs ± 137.732 ns	3.693 μs ± 407.699 ns	5.153 μs ± 872.270 ns
`{"a" * 1000: "a" * 1000}` (sorted)	5000	4.126 μs ± 168.282 ns	5.230 μs ± 165.105 ns	4.216 μs ± 200.370 ns	5.314 μs ± 360.869 ns	4.113 μs ± 98.996 ns	5.481 μs ± 178.557 ns
`{"a" * 10: "a" * 10}`	50000	559.787 ns ± 23.744 ns	630.646 ns ± 60.463 ns	658.801 ns ± 178.613 ns	596.396 ns ± 44.788 ns	591.999 ns ± 20.885 ns	573.709 ns ± 44.375 ns
`{"a" * 10: "a" * 10}` (sorted)	50000	1.069 μs ± 19.611 ns	1.090 μs ± 129.267 ns	1.079 μs ± 111.368 ns	1.090 μs ± 63.895 ns	1.087 μs ± 82.415 ns	1.101 μs ± 179.307 ns
`{"a" + str(i): "a" for i in range(10)}`	20000	1.595 μs ± 93.374 ns	1.756 μs ± 134.943 ns	1.843 μs ± 221.463 ns	1.868 μs ± 207.812 ns	1.707 μs ± 50.682 ns	1.844 μs ± 381.616 ns
`{"a" + str(i): "a" for i in range(10)}` (sorted)	20000	2.438 μs ± 62.248 ns	2.589 μs ± 148.294 ns	2.872 μs ± 537.920 ns	2.732 μs ± 204.015 ns	2.665 μs ± 557.911 ns	2.666 μs ± 563.152 ns
`{"a" + str(i): "a" for i in range(100)}`	2000	12.981 μs ± 1.230 μs	13.194 μs ± 481.495 ns	14.152 μs ± 1.295 μs	15.162 μs ± 1.900 μs	12.064 μs ± 1.040 μs	12.340 μs ± 347.782 ns
`{"a" + str(i): "a" for i in range(100)}` (sorted)	2000	17.664 μs ± 1.506 μs	18.241 μs ± 653.215 ns	20.911 μs ± 4.320 μs	19.302 μs ± 1.055 μs	18.069 μs ± 1.945 μs	18.367 μs ± 917.251 ns
`{"a": "a"}`	50000	597.640 ns ± 25.139 ns	498.800 ns ± 18.347 ns	611.569 ns ± 165.416 ns	602.315 ns ± 179.341 ns	494.121 ns ± 16.147 ns	560.490 ns ± 150.525 ns
`{"a": "a"}` (sorted)	50000	1.047 μs ± 146.205 ns	974.692 ns ± 32.447 ns	1.077 μs ± 94.959 ns	1.123 μs ± 42.643 ns	1.058 μs ± 113.001 ns	1.025 μs ± 90.689 ns
`{"‽" * 10 + str(i): "‽" * 10 for i in range(10)}`	10000	3.592 μs ± 108.392 ns	3.898 μs ± 474.092 ns	4.633 μs ± 1.578 μs	4.179 μs ± 174.091 ns	3.625 μs ± 246.603 ns	3.966 μs ± 464.331 ns
`{"‽" * 10 + str(i): "‽" * 10 for i in range(10)}` (sorted)	10000	5.227 μs ± 1.030 μs	5.808 μs ± 1.307 μs	5.329 μs ± 932.538 ns	5.700 μs ± 1.296 μs	4.628 μs ± 277.763 ns	5.035 μs ± 912.219 ns
`{"‽" * 10 + str(i): "‽" * 10 for i in range(100)}`	1000	31.508 μs ± 1.176 μs	32.719 μs ± 1.511 μs	33.604 μs ± 904.663 ns	35.264 μs ± 2.036 μs	31.745 μs ± 2.288 μs	32.682 μs ± 3.127 μs
`{"‽" * 10 + str(i): "‽" * 10 for i in range(100)}` (sorted)	1000	41.031 μs ± 850.535 ns	44.331 μs ± 8.925 μs	43.623 μs ± 988.271 ns	45.266 μs ± 3.913 μs	40.777 μs ± 1.192 μs	41.681 μs ± 2.454 μs
`{"‽" * 1000 + str(i): "‽" * 1000 for i in range(10)}`	100	139.376 μs ± 8.547 μs	216.094 μs ± 39.039 μs	138.553 μs ± 11.026 μs	200.044 μs ± 9.055 μs	140.704 μs ± 14.005 μs	199.116 μs ± 10.668 μs
`{"‽" * 1000 + str(i): "‽" * 1000 for i in range(10)}` (sorted)	100	223.860 μs ± 45.564 μs	228.023 μs ± 12.153 μs	237.516 μs ± 53.341 μs	272.447 μs ± 56.124 μs	261.851 μs ± 64.149 μs	291.881 μs ± 55.386 μs
`{"‽" * 1000 + str(i): "‽" * 1000 for i in range(100)}`	10	1.397 ms ± 121.731 μs	2.262 ms ± 160.086 μs	1.424 ms ± 210.156 μs	2.206 ms ± 117.800 μs	1.456 ms ± 279.166 μs	2.352 ms ± 391.154 μs
`{"‽" * 1000 + str(i): "‽" * 1000 for i in range(100)}` (sorted)	10	1.566 ms ± 175.532 μs	2.716 ms ± 511.366 μs	2.353 ms ± 863.657 μs	2.512 ms ± 118.144 μs	2.159 ms ± 464.589 μs	3.093 ms ± 790.857 μs
`{"‽" * 1000: "‽" * 1000}`	2000	13.441 μs ± 344.561 ns	14.856 μs ± 601.978 ns	13.384 μs ± 257.500 ns	14.919 μs ± 798.129 ns	13.713 μs ± 563.873 ns	14.943 μs ± 767.904 ns
`{"‽" * 1000: "‽" * 1000}` (sorted)	2000	14.193 μs ± 1.279 μs	15.375 μs ± 710.797 ns	14.075 μs ± 370.828 ns	15.560 μs ± 1.219 μs	16.631 μs ± 3.379 μs	17.480 μs ± 2.910 μs
`{"‽" * 10: "‽" * 10}`	50000	733.941 ns ± 27.110 ns	754.108 ns ± 45.325 ns	758.309 ns ± 27.925 ns	791.072 ns ± 60.025 ns	732.935 ns ± 49.456 ns	809.213 ns ± 47.058 ns
`{"‽" * 10: "‽" * 10}` (sorted)	50000	1.276 μs ± 91.393 ns	1.259 μs ± 118.681 ns	1.273 μs ± 35.804 ns	1.411 μs ± 236.450 ns	1.240 μs ± 47.544 ns	1.317 μs ± 83.957 ns
`{"‽" + str(i): "‽" for i in range(10)}`	10000	2.209 μs ± 51.946 ns	2.319 μs ± 42.217 ns	2.428 μs ± 32.124 ns	2.561 μs ± 97.825 ns	2.267 μs ± 274.004 ns	2.251 μs ± 55.288 ns
`{"‽" + str(i): "‽" for i in range(10)}` (sorted)	10000	3.425 μs ± 346.904 ns	3.515 μs ± 361.567 ns	3.584 μs ± 302.316 ns	3.608 μs ± 164.488 ns	3.280 μs ± 289.558 ns	3.339 μs ± 445.816 ns
`{"‽" + str(i): "‽" for i in range(100)}`	2000	19.128 μs ± 985.401 ns	19.783 μs ± 1.300 μs	21.647 μs ± 1.043 μs	22.034 μs ± 668.883 ns	18.926 μs ± 1.698 μs	19.609 μs ± 1.630 μs
`{"‽" + str(i): "‽" for i in range(100)}` (sorted)	2000	26.560 μs ± 3.053 μs	27.238 μs ± 1.672 μs	29.051 μs ± 2.024 μs	31.536 μs ± 5.686 μs	25.624 μs ± 534.098 ns	26.127 μs ± 1.471 μs
`{"‽": "‽"}`	50000	586.382 ns ± 23.397 ns	614.275 ns ± 19.908 ns	612.504 ns ± 11.466 ns	730.181 ns ± 153.693 ns	638.338 ns ± 39.827 ns	624.343 ns ± 18.022 ns
`{"‽": "‽"}` (sorted)	50000	1.061 μs ± 35.487 ns	1.131 μs ± 68.909 ns	1.140 μs ± 212.915 ns	1.134 μs ± 111.423 ns	1.059 μs ± 29.087 ns	1.049 μs ± 27.367 ns

The raw output of ./test is available here: https://transfer.archivete.am/F77fw/ujson-pr530-benchmark.jsonl

hugovk · 2022-06-01T15:13:36Z

Thanks!

JustAnotherArchivist marked this pull request as draft April 17, 2022 03:42

JustAnotherArchivist force-pushed the fix-encode-surrogates branch from abc7a7d to c9df712 Compare April 17, 2022 03:49

JustAnotherArchivist marked this pull request as ready for review April 17, 2022 03:50

JustAnotherArchivist mentioned this pull request Apr 17, 2022

Allow str and None values for indent #518

Open

Erotemic reviewed Apr 20, 2022

lib/ultrajsonenc.c (resolved)

python/objToJSON.c (outdated, resolved)

This was referenced May 18, 2022

A crash issue happens during fuzzing test #537

Closed

Generalize benchmarks #532

Closed

JustAnotherArchivist marked this pull request as draft May 29, 2022 05:40

JustAnotherArchivist added 2 commits May 30, 2022 01:58

Switch to NULL encoding (= UTF-8) to avoid string comparison in PyUnicode_AsEncodedString

98321fa

JustAnotherArchivist force-pushed the fix-encode-surrogates branch from c9df712 to 98321fa Compare May 30, 2022 06:41

Fix bytesObj not getting assigned and DECREFd, resulting in a memory leak

59aa3bf

JustAnotherArchivist marked this pull request as ready for review May 31, 2022 03:53

bwoodsend approved these changes May 31, 2022

hugovk added the changelog: Fixed For any bug fixes label Jun 1, 2022

hugovk merged commit 66bb6e0 into ultrajson:main Jun 1, 2022

hugovk mentioned this pull request Jun 1, 2022

Fix handling of surrogate pseudocharacters under Python 3. #284

Closed

sync-by-unito bot mentioned this pull request Jul 11, 2022

Bump ujson from 4.3.0 to 5.4.0 in /sample-projects/streaming-audio/FastAPI/live-transcription-fastapi deepgram/deepgram-python-sdk#28

Closed
