
preserve order, make variadic and handle falsy values in concat #1436

Merged

hargasinski merged 6 commits into master from concat-order on Jun 23, 2017

Conversation

@hargasinski (Collaborator) commented Jun 12, 2017

This is based on the discussion in #1430. It adds order preservation to the async.concat functions.

I may have changed our internal structure a bit too much, though, by removing internal/concat and internal/doSeries. This doesn't actually change our public API, so I feel like it should be okay. If it's too much, let me know and I'll change the PR to only modify the tests and internal/concat. Thanks!

Otherwise, I still want to add a couple more tests before merging (for sparse results, leaving the original collection untouched, etc.). Edit: tests added.
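Conceptually, the behaviour being added can be sketched as follows. This is not the PR's actual code, just a minimal synchronous model of the final aggregation step layered on top of mapLimit's ordered results:

```javascript
// Minimal sketch (not the PR's code) of the aggregation the PR adds.
// mapLimit already returns results in input order; holes mark iteratees
// that produced no value, while wrapped `{result: ...}` entries preserve
// falsy results such as 0, '' or null.
function aggregate(mapResults) {
    var result = [];
    for (var i = 0; i < mapResults.length; i++) {
        if (mapResults[i]) {
            // concat flattens arrays and appends scalars, keeping order
            result = result.concat(mapResults[i].result);
        }
    }
    return result;
}

console.log(aggregate([{result: [1, 2]}, undefined, {result: 0}])); // [1, 2, 0]
```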

@hargasinski hargasinski changed the title [WIP] preserve order in concat preserve order, make variadic and handle falsy values in concat Jun 13, 2017
@hargasinski (Collaborator, author) commented Jun 13, 2017

As a side note, the performance decrease is pretty significant:

$ node perf/benchmark.js --grep concat
Latest tag is  v2.4.1
Comparing v2.4.1 with current on Node v7.10.0
--------------------------------------
concat(10) v2.4.1 x 74,243 ops/sec ±1.04% (28 runs sampled), 0.0135ms per run
concat(10) current x 52,714 ops/sec ±1.35% (31 runs sampled), 0.0190ms per run
v2.4.1 is faster
--------------------------------------
concat(300) v2.4.1 x 4,029 ops/sec ±2.68% (30 runs sampled), 0.248ms per run
concat(300) current x 2,748 ops/sec ±1.29% (32 runs sampled), 0.364ms per run
v2.4.1 is faster
--------------------------------------
concat(10000) v2.4.1 x 9.86 ops/sec ±2.20% (19 runs sampled), 101ms per run
concat(10000) current x 8.69 ops/sec ±3.27% (17 runs sampled), 115ms per run
v2.4.1 is faster
--------------------------------------
v2.4.1 faster overall (102ms total vs. 115ms total)
v2.4.1 won more benchmarks (3 vs. 0)

Should I change the slice and _concat.apply(...) calls back to the original implementation to recover some of the performance?

@@ -19,4 +19,4 @@ import doSeries from './internal/doSeries';
* containing the concatenated results of the `iteratee` function. Invoked with
* (err, results).
*/
export default doSeries(concat);
Collaborator:

is this the last use of doSeries?

Collaborator Author:

Yup. From the history, it looks like all of the other uses of doSeries were converted to doLimit(<function_name>Limit, 1) when the *Series functions were implemented in terms of *Limit here.
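For reference, the doLimit pattern mentioned here is roughly the following (a from-memory sketch, not the exact source):

```javascript
// Roughly what internal/doLimit does: derive a *Series function from the
// corresponding *Limit function by pinning the concurrency limit to 1.
function doLimit(fn, limit) {
    return function (iterable, iteratee, callback) {
        return fn(iterable, limit, iteratee, callback);
    };
}

// e.g. a mapSeries could then be defined as doLimit(mapLimit, 1)
```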

Collaborator:

Maybe it's time to delete it, then!

@@ -330,7 +330,7 @@ describe('groupBy', function() {
});

it('handles empty object', function(done) {
async.groupByLimit({}, 2, function(val, next) {
async.groupBySeries({}, function(val, next) {
Collaborator:

why this test change?

Collaborator Author:

This test was a dupe of the one on L208. It's under the groupBySeries describe block, so I assume I just forgot to update a copy-and-pasted test when I originally added the groupBy tests.

I was using the groupBy tests as a reference for some of the edge cases we usually test.

@@ -0,0 +1,32 @@
var slice = require('../lib/internal/slice').default;
Collaborator:

are these test changes related to the PR?

Collaborator Author:

Yes(ish), as concat now uses internal/slice. I wanted to make sure the behaviour concat relies on isn't broken by accident. Plus, slice is used in a few of our functions, so it's good to have some tests for it, similar to the DLL ones.
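For context, the internal/slice helper under test is roughly the following (a from-memory sketch of async's helper, which copies an array-like into a real array instead of calling Array.prototype.slice on it):

```javascript
// Sketch of async's internal/slice: copies an array-like object
// (e.g. `arguments`) into a real array, starting at index `start`.
function slice(arrayLike, start) {
    start = start | 0;
    var newLen = Math.max(arrayLike.length - start, 0);
    var newArr = Array(newLen);
    for (var idx = 0; idx < newLen; idx++) {
        newArr[idx] = arrayLike[start + idx];
    }
    return newArr;
}
```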

@hargasinski (Collaborator, author) commented Jun 14, 2017

For what it's worth, here is an alternate implementation that handles falsy values and preserves order (though it isn't variadic):

// (imports assumed, matching async's usual internal layout)
import noop from 'lodash/noop';
import wrapAsync from './internal/wrapAsync';
import mapLimit from './mapLimit';

export default function(coll, limit, iteratee, callback) {
    callback = callback || noop;
    var _iteratee = wrapAsync(iteratee);
    mapLimit(coll, limit, function(val, callback) {
        _iteratee(val, function(err, arr) {
            if (err) return callback(err);
            else if (arguments.length > 1) return callback(null, {result: arr});
            return callback(null);
        });
    }, function(err, mapResults) {
        var result = [];
        for (var i = 0; i < mapResults.length; i++) {
            if (mapResults[i]) {
                result = result.concat(mapResults[i].result);
            }
        }
        return callback(err, result);
    });
}

but the performance increase isn't as big as I expected:

$ node perf/benchmark.js -g concat
Latest tag is  v2.4.1
Comparing v2.4.1 with current on Node v7.10.0
--------------------------------------
concat(10) v2.4.1 x 72,131 ops/sec ±0.79% (26 runs sampled), 0.0139ms per run
concat(10) current x 56,546 ops/sec ±1.28% (32 runs sampled), 0.0177ms per run
v2.4.1 is faster
--------------------------------------
concat(300) v2.4.1 x 3,939 ops/sec ±2.57% (30 runs sampled), 0.254ms per run
concat(300) current x 3,157 ops/sec ±1.67% (31 runs sampled), 0.317ms per run
v2.4.1 is faster
--------------------------------------
concat(10000) v2.4.1 x 9.70 ops/sec ±2.15% (19 runs sampled), 103ms per run
concat(10000) current x 8.66 ops/sec ±3.29% (17 runs sampled), 115ms per run
v2.4.1 is faster
--------------------------------------
v2.4.1 faster overall (103ms total vs. 116ms total)
v2.4.1 won more benchmarks (3 vs. 0)

whereas a similar implementation using push instead of concat appears to give even better performance than before, at least for large arrays. I could look into it more if you want.

// (imports assumed, matching async's usual internal layout)
import noop from 'lodash/noop';
import wrapAsync from './internal/wrapAsync';
import mapLimit from './mapLimit';

export default function(coll, limit, iteratee, callback) {
    callback = callback || noop;
    var _iteratee = wrapAsync(iteratee);
    mapLimit(coll, limit, function(val, callback) {
        _iteratee(val, function(err, arr) {
            if (err) return callback(err);
            else if (arguments.length > 1) return callback(null, {result: arr});
            return callback(null);
        });
    }, function(err, mapResults) {
        var result = [];
        for (var i = 0; i < mapResults.length; i++) {
            if (mapResults[i]) {
                var val = mapResults[i].result;
                if (Array.isArray(val)) {
                    for (var j = 0; j < val.length; j++) {
                        result.push(val[j]);
                    }
                } else {
                    result.push(val);
                }
            }
        }

        return callback(err, result);
    });
}
$ node perf/benchmark.js -g concat
Latest tag is  v2.4.1
Comparing v2.4.1 with current on Node v7.10.0
--------------------------------------
concat(10) v2.4.1 x 73,431 ops/sec ±0.81% (29 runs sampled), 0.0136ms per run
concat(10) current x 63,043 ops/sec ±1.22% (31 runs sampled), 0.0159ms per run
v2.4.1 is faster
--------------------------------------
concat(300) v2.4.1 x 4,192 ops/sec ±2.71% (31 runs sampled), 0.239ms per run
concat(300) current x 4,661 ops/sec ±1.38% (32 runs sampled), 0.215ms per run
current is faster
--------------------------------------
concat(10000) v2.4.1 x 9.55 ops/sec ±3.80% (18 runs sampled), 105ms per run
concat(10000) current x 35.78 ops/sec ±5.31% (14 runs sampled), 28.0ms per run
current is faster
--------------------------------------
current faster overall (28.2ms total vs. 105ms total)
current won more benchmarks (2 vs. 1)
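The gap makes sense: `result = result.concat(chunk)` allocates a brand-new array on every iteration, so flattening n chunks costs quadratic copying, while `push` appends in place with amortized O(1) per element. A minimal illustration of the two strategies (not the PR's code):

```javascript
// Flattening by repeated concat: each call copies `result` into a new array.
function flattenWithConcat(chunks) {
    var result = [];
    for (var i = 0; i < chunks.length; i++) {
        result = result.concat(chunks[i]);
    }
    return result;
}

// Flattening by push: appends into the same array, no intermediate copies.
function flattenWithPush(chunks) {
    var result = [];
    for (var i = 0; i < chunks.length; i++) {
        var chunk = chunks[i];
        if (Array.isArray(chunk)) {
            for (var j = 0; j < chunk.length; j++) result.push(chunk[j]);
        } else {
            result.push(chunk);
        }
    }
    return result;
}
```

Both produce the same output; only the allocation pattern differs.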

var _concat = Array.prototype.concat;
for (var i = 0; i < mapResults.length; i++) {
    if (mapResults[i]) {
        result = _concat.apply(result, mapResults[i]);
Collaborator Author:

One note I just realized about this implementation is that it would break for ArrayLike objects:

var result = {0: 'foo', 1: 'bar', 2: 'baz', length: 3};
// previously
[].concat(result); // [{0: 'foo', 1: 'bar', 2: 'baz', length: 3}]
// now
Array.prototype.concat.apply([], args); // ['foo', 'bar', 'baz']

Nevermind, sorry, this will actually run

Array.prototype.concat.apply([], [args]); // [{0: 'foo', 1: 'bar', 2: 'baz', length: 3}]

keeping the previous behaviour.

@aearly (Collaborator) commented Jun 14, 2017

I've been really busy this past week, will review soon!

@aearly (Collaborator) left a review:

Needs some minor cleanup, but good otherwise!

        });
    }, function(err, mapResults) {
        var result = [];
        var _concat = Array.prototype.concat;
Collaborator:

You can move this var to the top level scope.

Collaborator Author:

Good catch, thank you!


expect(result).to.be.an('array').that.is.empty;
});

setTimeout(function() {
Collaborator:

Is there another way to test this besides expecting it to be a certain way after a timeout? This kind of thing leads to flaky tests.

Collaborator Author:

Yup, fixed, thank you! It now uses async.setImmediate to wait arr.length event-loop cycles before checking the result.

The *Limit versions of parallel, each, map and groupBy use a test similar to the one here before the fix. When I get a chance, I'll look into updating them to use async.setImmediate as well.
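The deterministic pattern described here can be sketched with plain setImmediate (a hypothetical helper for illustration, not the test suite's exact code):

```javascript
// Recurse through setImmediate n times so the assertion runs after exactly
// n event-loop cycles, instead of guessing a wall-clock setTimeout delay
// that can make the test flaky on a slow machine.
function afterNCycles(n, callback) {
    if (n <= 0) return callback();
    setImmediate(function () {
        afterNCycles(n - 1, callback);
    });
}

// usage: afterNCycles(arr.length, function () { /* assert on result */ });
```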

@hargasinski hargasinski merged commit e9b2855 into master Jun 23, 2017
@hargasinski hargasinski deleted the concat-order branch June 23, 2017 04:21
@hargasinski (Collaborator, author):

Thanks! Is there anything else that needs to be done for v2.5?

@aearly (Collaborator) commented Jun 23, 2017

Nope, just updating the changelog and such. Want to do that? I can publish when you're ready.

@hargasinski (Collaborator, author):
@aearly I updated the changelog, I believe the make script handles everything else. Thanks! Sorry that it took so long.

@aearly (Collaborator) commented Jun 25, 2017

Excellent, just published v2.5.0! Thanks!
