
speedup array work by using push instead of concat #185

Closed · wants to merge 4 commits

Conversation

@elidoran (Contributor)

The current implementation creates "throw away" (intermediary) arrays via two concat() calls on an empty initial array.

It could instead be obj[key] = [].concat(obj[key], val);, which passes both values to a single concat() call. That is only marginally faster.

The fastest way is to use push() when the existing value is already an array, and to create a new array containing both values otherwise. When testing with 10 million iterations, this approach is an order of magnitude faster than the two methods above, and it avoids creating a lot of "throw away" arrays.

The Travis CI failure is due to new linting flagging unnecessary escapes; I fixed those in a separate PR, #184.
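For illustration, the three approaches described above can be sketched like this (a standalone sketch, not the actual PR diff; the `obj` name just mirrors the real call site):

```javascript
// Approach 1 (current): two concat() calls, each allocating an intermediate array.
var obj = { a: 1 };
obj.a = [].concat(obj.a).concat(2);   // obj.a is now [1, 2]

// Approach 2: a single concat() call with both arguments.
obj = { a: 1 };
obj.a = [].concat(obj.a, 2);          // obj.a is now [1, 2]

// Approach 3 (this PR): push() onto an existing array, or build a new pair.
obj = { a: 1 };
if (Array.isArray(obj.a)) {
    obj.a.push(2);                    // mutate in place, no throw-away arrays
} else {
    obj.a = [obj.a, 2];               // first collision: create the array once
}
```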

@ljharb (Owner) commented Dec 17, 2016


Which benchmarks are you using to determine "faster"? Because of the JIT in JS, if it's "intuition" or "theory", it's invariably a wrong assumption.

Separately, since nobody has 10 million query string values, that's an irrelevant comparison. I'd need to see proof that it was actually faster on practical, real-world test cases before wanting to use mutation merely for performance.

- obj[key] = [].concat(obj[key]).concat(val);
+ var cur = obj[key];
+ if (Array.isArray(cur)) {
+     cur.push(val);
@ljharb (Owner)

it's also worth noting that this behaves differently when val is an array or not, because of the concat behavior. Since the decoder can be user-provided, it becomes very important to retain that auto-spreading behavior (as well as the lack of mutation).
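To illustrate the behavioral difference being pointed out: concat() spreads an array argument one level, while push() appends it as a single element. A standalone sketch:

```javascript
var prev = ['a'];

// concat auto-spreads an array `val` into individual elements.
var viaConcat = [].concat(prev).concat(['b', 'c']);   // ['a', 'b', 'c']

// push appends the array itself as one nested element.
var viaPush = prev.slice();   // copy so `prev` is not mutated
viaPush.push(['b', 'c']);     // ['a', ['b', 'c']]
```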

@elidoran (Contributor, Author)


I spent time setting up performance-testing runs and comparing four different implementations to see which ones work better; I didn't imagine it :) I also don't expect you to take my word for it. I figured you'd run some tests yourself to see, and I can share some simple perf-testing scripts if you'd like. I didn't get into the details in the PR because, when contributing to a new project, I'm never sure whether the maintainer is still active or cares about PRs :) So I throw some details in there to see how they respond.

I spotted this section because concat() is one of the things on my mental list to look out for: performance testing often shows that avoiding it is faster.

I think running the tests many times is relevant because it lets the runtime optimizations kick in and gives more realistic results. A server will eventually run for a while and parse millions of query strings.

I tested the implementations by giving them these input permutations:

  1. two non-array values
  2. two array values
  3. array and non-array
  4. non-array and array

The implementations I tested:

  1. the current one: [].concat(obj[key]).concat(val)

  2. a variant: [].concat(obj[key], val)

  3. a crazy one: Array.prototype.concat(obj[key], val) (It’s supposed to be less than ideal, and it is)

  4. the one I submitted

  5. and the one which allows val to be an array:

    var cur = obj[key];
    if (Array.isArray(cur)) {
        if (Array.isArray(val)) {
            cur.push.apply(cur, val);
        } else {
            cur.push(val);
        }
    } else if (Array.isArray(val)) {
        val.unshift(cur);
        obj[key] = val;
    } else {
        obj[key] = [cur, val];
    }

I left it the simpler way, expecting val to always be a non-array, because val is pulled from a string via string.slice() right above there. I didn't look at the optional decoder and think "hmm, they may return an array here." That's my mistake.

I'd be happy to change the PR to use that implementation, if you're interested, and I can provide some performance testing as well.
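For reference, the fifth variant wrapped in a function (the `combine` name here is hypothetical) behaves like this across the permutations:

```javascript
// Hypothetical wrapper around the fifth variant above; mutates arrays in place.
function combine(obj, key, val) {
    var cur = obj[key];
    if (Array.isArray(cur)) {
        if (Array.isArray(val)) {
            cur.push.apply(cur, val);   // array + array: append each element
        } else {
            cur.push(val);              // array + scalar: append one element
        }
    } else if (Array.isArray(val)) {
        val.unshift(cur);               // scalar + array: prepend the scalar
        obj[key] = val;
    } else {
        obj[key] = [cur, val];          // scalar + scalar: build a pair
    }
}

var o1 = { k: 1 };
combine(o1, 'k', 2);            // o1.k is now [1, 2]

var o2 = { k: [1] };
combine(o2, 'k', [2, 3]);       // o2.k is now [1, 2, 3]
```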

@ljharb (Owner) commented Dec 17, 2016


Thanks, I'm glad to hear you're being thorough :-)

So, to reiterate, my concerns are:

  1. First and foremost, the code should be clear; "fast" is the least important consideration.
  2. Observable mutations must be avoided, and non-observable mutations should be avoided when possible (for clarity).
  3. If there is a compelling performance case for non-observable mutation, then we can do that, but it needs to be done cautiously, with comments and metrics detailing why the mutation was permitted.

One way to mitigate the code-clarity concern is to move the current implementation into a helper function first, leaving the main function clean, and then optimize the performance of that helper. In other words, we stick [].concat(obj[key]).concat(val) in a function, and then swap out the implementation of that function for testing. That also means that even your 5th implementation, which is more complex and larger, could still live in an internally tested abstraction, away from the main function.
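That refactor might look something like the sketch below (module layout and names are illustrative, not the actual qs source):

```javascript
// utils.js: the combining logic lives behind a small, swappable helper.
var utils = {
    combine: function (a, b) {
        return [].concat(a, b);   // same observable behavior as the chained concat
    }
};

// In the parser, the call site stays a clean one-liner:
//   obj[key] = utils.combine(obj[key], val);
// Later, the body of utils.combine can be replaced with a push-based version
// (and benchmarked in isolation) without touching the parser.
var result = utils.combine('a', ['b', 'c']);   // ['a', 'b', 'c']
```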

Please do update the PR; it's exceedingly rare that a performance claim is actually worth optimizing for when evidence is requested, so it'd be fun to see this one bear out :-)

@elidoran (Contributor, Author)


I don’t consider speed the least important consideration. If code is very clear but takes forever, it’s of no use.

I agree about clarity: code should be as readable as possible. JavaScript has a tough time with that; all the extra braces and parentheses make it messy. Also, this code lacks comments, so I didn’t add my own, though otherwise I would document what it’s doing. I think comments are required for clarity, and I added a few to the new version of this.

I’m wary of mutation, but not wholly opposed to its use where appropriate. If the array came from outside the library, then “defensive copying” at the start would be a good idea. In this case, the qs library creates the array itself, uses it repeatedly, and then hands it back to the user afterwards. So it’s okay for it to mutate the arrays it creates for itself while processing the query string.

I’m also a fan of moving code blocks out into their own functions; monolithic functions are tough to deal with. I moved this one out to the already-require()’d “utils” module, and I also altered its structure for greater readability.

I’m providing the benchmark info via a separate PR, #187, so that it isn’t part of a code-change PR.

Below is an image of the results I get when running the benchmark. The column headers show the types of the two inputs given to the implementations, and the first column is the row headers showing which implementation each row is. The numbers are the operations per second reported by Benchmark.js, and the ± figures show the variance between runs.

[image: qs-arraycombine-benchmark]
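A rough standalone timing harness in the same spirit (this is a plain loop, not the actual Benchmark.js suite from #187; the iteration count is arbitrary):

```javascript
// Time each strategy over many iterations so JIT optimizations kick in.
function timeIt(label, fn) {
    var start = Date.now();
    for (var i = 0; i < 1e6; i += 1) { fn(); }
    console.log(label + ': ' + (Date.now() - start) + 'ms');
}

timeIt('chained concat', function () {
    var obj = { k: 1 };
    obj.k = [].concat(obj.k).concat(2);
});

timeIt('push / new array', function () {
    var obj = { k: 1 };
    obj.k = Array.isArray(obj.k) ? (obj.k.push(2), obj.k) : [obj.k, 2];
});
```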


// mutate `a` to append `b` and then return it
if (firstIsArray) {
    secondIsArray ? a.push.apply(a, b) : a.push(b);
    return a;
}
@ljharb (Owner)

It's probably a good idea to first merge a utils.combine that uses concat into master, and then have the optimization be a separate PR.

@elidoran elidoran mentioned this pull request Dec 22, 2016
@elidoran elidoran closed this Dec 22, 2016