Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[#1348] initial groupBy implementation #1364

Merged
merged 2 commits into from Feb 27, 2017
Merged

Conversation

hargasinski
Copy link
Collaborator

My attempt at implementing groupBy for #1348 (and I guess #214). This is my first feature implementation for async, so feedback is greatly appreciated.

I may have been a little excessive with the testing, as I ported a lot of the generic tests (like ensuring groupByLimit follows the concurrency limit) from mocha_test/map.js, while including a few specific groupBy tests. Also, the example I used for the docs isn't that great as I couldn't really come up with anything concise but illustrative. I'd like to change it before merging, if anyone has any ideas.

Copy link
Collaborator

@aearly aearly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, LGTM. The implementation is pretty simple. 👍

* @param {Function} [callback] - A callback which is called when all `iteratee`
* functions have finished, or an error occurs. Result is an `Object` whoses
* properties are arrays of values which returned the corresponding key.
* @example
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a decent example.

export default doParallelLimit(function(eachFn, coll, iteratee, callback) {
callback = callback || noop;
coll = coll || [];
var result = {};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about having result = Object.create(null)? Then instead of hasOwnProperty.call(...) down below, you could just do key in result.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer to avoid using Object.create(null) as the behaviour can not be emulated consistently in older environments. I prefer using standard objects + hasOwn (hasOwn tends to be faster anyway last time I checked)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't mind using Object.create(null) internally, but I'd also prefer to avoid using it for objects returned to the user. The user might use something like result.hasOwnProperty(anImportantKey) somewhere later in the code, which would result in a TypeError.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Object.create(null) is ES5, which is our minimum required version as of 2.0. @hargasinski brings up a good point though, best not to use it because of that.

});
});

it('with reflect', function(done) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the reflect tests are useful here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I'll remove them, thanks!

lib/groupBy.js Outdated
* correspond to the values passed to the `iteratee` callback. Note: Since this
* function applies the `iteratee` to each item in parallel, there is no
* guarantee that the `iteratee` functions will complete in order and there is
* no guarantee that grouped items will appear in the same order as in `coll`.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reconsidering about groupBy again I question whether its better implemented synchronously after all the asynchronous work has been resolved, as making the order fixed in an asynchronous processing flow would make things much more complicated

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to make sure I understand, do you mean kinda like ./internal/filter.js where it asynchronously transforms the values, but synchronously filters them afterwards? I could look into that tomorrow.

}, val * 25);
}

context('groupBy', function() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we use context anywhere else?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I was looking at the mapValues tests when I added that. We use it in the compose and queue tests too.

It's more personal preference than anything, but I feel like it makes the test results easier to read and it forces them to be more organized:
image

@hargasinski
Copy link
Collaborator Author

Updated the PR to use a synchronous groupBy implementation.

/cc @megawac @aearly

var hasOwnProperty = Object.prototype.hasOwnProperty;

for (var i = 0; i < mapResults.length; i++) {
if (mapResults[i]) {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code style question: is there a preference for:

if (!mapResults[i]) continue;

// do something ...

versus

if (mapResults[i]) {
    // do something ...
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the continue style, but that's just like... ...my opinion, man.

@aearly
Copy link
Collaborator

aearly commented Feb 25, 2017

BTW, the CI failures for this were bogus, so I'd say this is 👍 to merge.

@hargasinski
Copy link
Collaborator Author

I wanted to make sure @megawac is good with the implementation before merging.

@hargasinski hargasinski merged commit bdc3d81 into caolan:master Feb 27, 2017
@hargasinski hargasinski deleted the groupBy branch February 27, 2017 05:05
@hargasinski
Copy link
Collaborator Author

@aearly @megawac should we publish v2.2.0?

@aearly
Copy link
Collaborator

aearly commented Mar 24, 2017

Yep, there's a couple other enhancements ready too.

@aearly
Copy link
Collaborator

aearly commented Mar 24, 2017

@hargasinski I'll let you do the honors, otherwise I'll do it later this evening.

@hargasinski
Copy link
Collaborator Author

hargasinski commented Mar 25, 2017

@aearly thank you, but I don't have permission to publish async (I'm not listed as a collaborator on npm). You'll have to do it 😅. I did update the changelog for v2.2.0 though.

@aearly
Copy link
Collaborator

aearly commented Mar 25, 2017

Ok, published! Thanks for writing the changelog.

@Kikobeats
Copy link
Contributor

Kikobeats commented Mar 25, 2017

Just to know, there is no documentation for this? or was not generated 🤔

@aearly
Copy link
Collaborator

aearly commented Mar 25, 2017

There was an error publishing docs. Are we using the gh-pages branch, or the docs/ folder now? Either way, make publish-docs needs to be fixed.

@hargasinski
Copy link
Collaborator Author

hargasinski commented Mar 25, 2017

I just updated the gh-pages, I'll look into make publish-docs now. Thanks for catching this @Kikobeats.

Edit: we're still using the gh-pages branch.

@hargasinski
Copy link
Collaborator Author

@aearly rerunning npm install; npm install gh-pages-deploy -g appears to have resolved the issue for me.

I couldn't figure out what was causing the exact issue (it might have been a bug in an older version of duplexify they were using), so I looked into alternatives just in case. The simplest one I found is git-directory-deploy. It's similar to gh-pages-deploy, though it has two downsides: it doesn't confirm the commands beforehand, and it appends the given message to previous commit's message. The command would be: git-directory-deploy --directory docs --branch gh-pages --message "gh-pages update". We could also use git directly as suggested by this gist. The downside to that is it requires tracking the docs folder.

For now, if updating gh-pages-deploy works, I'd suggest we stick to that, as it appears to be the best option.

@aearly
Copy link
Collaborator

aearly commented Mar 28, 2017

Thanks for looking in to this. It appears there was an incompatibility with Node 6.3 streams that was making it fail: mafintosh/duplexify@acf662f (I think I last published with Node 6.2)

Updating it fixes things. I say we keep with gh-pages-deploy with this issue resolved.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants