Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pre-allocate capacity for maps in json! macro. #876

Draft
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

zachs18
Copy link

@zachs18 zachs18 commented Apr 5, 2022

Attempt to resolve #810.

This implementation works by expanding the contents of the map twice, first producing a capacity value, then actually inserting the elements.

Because it expands the contents of the map twice, when encountering invalid syntax, it prints some error messages twice, which may be unwanted.

@zachs18
Copy link
Author

zachs18 commented Apr 6, 2022

Actually, I think I could get rid of the duplicated error messages by just returning 0 in the capacity counting part of the macro when a syntax error has occurred, and just letting the actual parsing part of the macro print the actual error message.

@zachs18
Copy link
Author

zachs18 commented Apr 7, 2022

The latest commit makes it so error messages are printed only once (like before) by just returning 0 if an error occurs in the capacity-counting part of the macro.

Considering that large enough invocations of this macro are more likely to hit the recursion limit than to cause compilation performance issues, I'm guessing performance is not a big issue here.

But anyway, i also did a small performance test.
My results were:

With features=["preserve_order"], debug compilation time went up from ~5 seconds to ~9 seconds, and release compilation time went up from ~20 seconds to ~25 seconds. Runtime did not appear to change significantly, and memory usage was decreased by more than half, from 736801 (bytes allocated as reported by valgrind) for the master branch, to 348933 for this pull request.

Without features=["preserve_order"], debug compilation time went up from ~5 seconds to ~8.5 seconds, and release compilation time went up from ~15 seconds to ~18 seconds. Runtime did not appear to change significantly, and memory usage was 909965 (bytes allocated as reported by valgrind) for all versions (debug/release, master/pull).

I used the following program with only serde_json as a direct dependency, and did cargo build; touch src/main.rs; time cargo build to get closer to accurate timings for just compiling the performance test code.

#![recursion_limit = "16384"]
use serde_json::json;
fn main() {
    let _m = include!("../test.json");
}

Where ../test.json is a file containing json!{ /* ~27 KiB of json */ }, as generated by this Python script with parameters size=1024 and max_depth=16:

import json
import random

ALPHABET: str = "abcdefghijklmnopqrstuvwxyz"

def make_random_key(size: int) -> str:
    return ''.join(random.choices(ALPHABET, k=size))

def make_random_map(size: int, max_depth: int) -> dict:
    if size <= 1 or max_depth <= 1:
        return {make_random_key(4): None for _ in range(size)}
    else:
        map = {}
        count: int = random.randint(1, size)
        each: int = size // count
        leftover: int = size % count
        for i in range(count):
            map[make_random_key(4)] = make_random_map(each + (1 if i < leftover else 0), max_depth-1)
        return map

def make_random_json(size: int, max_depth: int) -> str:
    return json.dumps(make_random_map(size, max_depth))

if __name__ == "__main__":
    import sys
    size = int(sys.argv[1])
    max_depth = int(sys.argv[2])
    print("json!{", make_random_json(size, max_depth), "}")

Attempt to address issue serde-rs#810.

This implementation works by expanding the contents of the map twice,
first to produce a capacity value, then actually inserting the elements.

Because it expands the contents of the map twice, when encountering invalid syntax,
it may print error messages twice, which may be unwanted.
…rt errors.

Since these errors are already reported by the part of the macro that actually parses the map,
it is unnecessary to report them twice, so the capacity-counting part can just return 0 if
it encounters an error.
@zachs18 zachs18 marked this pull request as draft October 30, 2022 00:50
@zachs18
Copy link
Author

zachs18 commented Oct 30, 2022

I've rebased onto current master. When I get a chance I'll re-do the performance test. Also, looking back, it'd probably be a good idea to only do this with features=["preserve_order"], since otherwise it doesn't actually improve memory usage at runtime. I'll also do that later.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

Make json macro pre-allocate capacity for maps
1 participant