
The strings are not optimized on nodejs #85

Open
nikils opened this issue Jan 16, 2015 · 3 comments

Comments


nikils commented Jan 16, 2015

Hi,

The strings in json5 are formed using character-by-character concatenation, which results in a large amount of memory being needed to store each one in Node.js. V8 uses 'first' and 'second' pointers to store concatenated strings (cons strings). Since the parser concatenates character by character, V8 ends up storing a large tree instead of a single flat string. One workaround is to call string.charAt(0) on the resulting string, which makes V8 flatten the tree into a single string.

Here is the workaround we are using.

var json5 = require('json5');

// Parse JSON5 text and flatten all strings in the result;
// fall back to `def` (or {}) if parsing fails.
function parse_json5(text, def) {
    var json;
    try {
        json = json5.parse(text);
        v8optimize(json);
    } catch (ex) {
        json = def === undefined ? {} : def;
    }

    return json;
}

// Walk the parsed value and touch every string so V8 flattens
// the cons-string tree into a single flat string.
function v8optimize(jsonObj) {
    var tempStr, key;

    if (!jsonObj) {
        return;
    }

    switch (typeof jsonObj) {
        case 'string':
            if (jsonObj.length > 0) {
                // Accessing a character forces V8 to flatten the string.
                tempStr = jsonObj.charAt(0);
            }
            break;
        case 'object':
            if (Array.isArray(jsonObj)) {
                for (key = 0; key < jsonObj.length; key++) {
                    v8optimize(jsonObj[key]);
                }
            } else {
                for (key in jsonObj) {
                    if (jsonObj.hasOwnProperty(key)) {
                        v8optimize(key);
                        v8optimize(jsonObj[key]);
                    }
                }
            }
            break;
    }
}

[screenshot: Chrome dev tools heap snapshot of the parsed json5 string]

The screenshot shows how a small string (25 bytes) ends up using 1284 bytes instead.
You can use node-heapdump (https://github.com/bnoordhuis/node-heapdump) to take a heap snapshot in Node.js and analyze it in the Chrome dev tools.
Here is the code to generate the heap snapshot:

var json5 = require('json5');
var heapdump = require('heapdump');

global.test = json5.parse('{ foo: "hello world"}');
heapdump.writeSnapshot('/tmp/' + Date.now() + '.heapsnapshot', function (err, filename) {
    console.log(err);
    console.log('saved');
    console.log(global.test);
});

aseemk (Member) commented Jan 17, 2015

Fascinating! Thank you.

Does anyone else have any thoughts on this? Should we simply do this on all objects we return? Or only if, say, the input string is over a certain size?

Or perhaps we could do this inline as part of the parse process, e.g. on every closing quote.
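
For illustration, here is a minimal sketch of what flattening at the closing quote could look like (the readString helper and its source/pos/quote arguments are hypothetical, not JSON5's actual internals, and escape handling is omitted):

// Hypothetical lexer helper: builds a string token char by char,
// then flattens the resulting cons string before returning it.
function readString(source, pos, quote) {
    var result = '';
    pos++; // skip the opening quote
    while (pos < source.length && source.charAt(pos) !== quote) {
        result += source.charAt(pos); // char-by-char += builds a cons-string tree
        pos++;
    }
    // Touching a character forces V8 to flatten the tree into one flat string.
    if (result.length > 0) {
        result.charAt(0);
    }
    return { value: result, nextPos: pos + 1 };
}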

rlidwka (Contributor) commented Jan 19, 2015

My thoughts on this are pretty skeptical, and the posted workaround isn't a very good one.

However, char-by-char concatenation in general isn't efficient. Maybe instead of var str = ''; str += 'x'; return str we could do var arr = []; arr.push('x'); return arr.join('') or something.

Also, any benchmarks?
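
For reference, a minimal sketch of that buffer-and-join approach (hypothetical code, not JSON5's actual parser):

// Collect characters into an array and join once at the end,
// instead of growing a string with += on every character.
function buildString(chars) {
    var buffer = [];
    for (var i = 0; i < chars.length; i++) {
        buffer.push(chars[i]);
    }
    return buffer.join(''); // produces a single flat string in one step
}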

bnjmnt4n (Contributor) commented:

Array#join is faster in IE6/7, but slower in modern browsers. Using += causes modern browsers to store references to the strings being concatenated; the whole string is only joined when it is actually used, unlike Array#join, which joins the strings immediately. That said, char-by-char concatenation is extremely slow, and there should be a better way to do it.
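
For anyone who wants to measure, here is a rough micro-benchmark sketch; the numbers will vary by engine and string length, so treat it only as a starting point:

// Rough micro-benchmark: char-by-char += vs. Array#join.
var ITERATIONS = 1000;
var LENGTH = 10000;

function viaConcat() {
    var s = '';
    for (var i = 0; i < LENGTH; i++) { s += 'x'; }
    return s;
}

function viaJoin() {
    var arr = [];
    for (var i = 0; i < LENGTH; i++) { arr.push('x'); }
    return arr.join('');
}

function time(label, fn) {
    var start = Date.now();
    for (var i = 0; i < ITERATIONS; i++) { fn(); }
    console.log(label + ': ' + (Date.now() - start) + ' ms');
}

time('+= concat', viaConcat);
time('Array#join', viaJoin);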

@jordanbtucker jordanbtucker self-assigned this Sep 20, 2017
@jordanbtucker jordanbtucker added this to the v3.0.0 milestone Feb 23, 2022