Issue 105 exclude candidates from cluster #6448
Hi @wetneb, |
I have solved the problem!
Looking great! I will try to have a look soon. In the meantime tagging @cooperzoe too, as this involves changes to the interface. @cooperzoe what do you think of the screenshot above, where the "merge" checkbox has been moved to the first column and the bullet points have been turned into checkboxes? See the original discussion in #105 to motivate those changes. (P.S.: @zyadtaha I changed your issue description so that it includes "Fixes #105" instead of a more complicated link to the issue. This syntax is required for GitHub to properly link the pull request to the issue on the right-hand side) |
I took the time to try it out and I really like it! I noticed that you took care of updating the cluster size and row counts depending on which choices in a cluster are selected. And the "Browse this cluster" link too. While I appreciate the care, I wonder if it's really the ideal behavior. For those reasons, I'd vote for a simpler option: keep the "Browse this cluster", "Cluster size" and "Row count" features static (always encompassing all choices in the cluster, without adapting to their checkboxes). But that's just my opinion! |
@wetneb I am OK with dynamic changes based on the checkboxes for each cluster value, and I like that idea. What was your worry about dynamic updates: a user unchecking values one by one, or changing everything at once with the Select All / Deselect All buttons? Also, unchecking a few values in a single cluster should be fast enough, I think, even if that cluster has 200k values, yeah? The label "Cluster size" never made sense to me and often confused folks in training; maybe it's better to relabel it "Cluster value count". The feedback I'd heard was that they thought "size" referred to the length of the cluster strings. Other than that, no further opinion.
I am not worried by performance at all, just about usability. If my comment above does not make it clear, it might be easier to understand my concern by trying out the current prototype. Just run clustering, use the right-hand side filters and see what happens. |
Hi @wetneb, |
@zyadtaha I'll just double-check with others if they agree, to make sure I don't make you do unnecessary work. @thadguidry do you agree with making the cluster fields static again or should I describe my concerns with them being dynamic in another form (with screenshots?) @ostephens @Critic-A any thoughts? |
@wetneb screenshots please. Env. not in a good state (harddrive again) |
@thadguidry here is my explanation, with added screenshots. Any clearer? Consider the state in which we arrive just after having computed clusters, where no clusters will be selected. The "row count" and "cluster size" columns contain only zeros, which is not super informative. This is confusing since the facet histograms do show that some clusters match the filters I have set up. For it to work, you'd need to first select all clusters, which sort of implies that you consider them all good without having looked at them, in a sense. |
@wetneb Hmm, it's almost like we shouldn't show any data in the range sliders until at least 1 cluster is selected? If I look at the descriptions above the range sliders...they say "... in cluster". That almost leads me to believe that maybe we word that as "...in selected clusters" above the range sliders? Dunno. |
To me, requiring clusters to be selected for them to appear in the filters goes against the purpose of those filters, which is (in my opinion) to help you review subsets of clusters and mark them as mergeable. That's why I am proposing to keep the "Browse this cluster", "Cluster size" and "Row count" features static, independent of the selection. |
Ah, those sliders always help you find clusters to merge and not change or modify them, right? In that case, I also agree with you. |
@zyadtaha then we could indeed go for keeping the "Browse this cluster", "Cluster size" and "Row count" features static. |
@wetneb The "Browse this cluster", "Cluster size" and "Row count" features are now static. |
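For readers following the thread, the agreed "static" behavior can be sketched roughly as below. This is only an illustration under assumed names: `Choice`, `ClusterChoices` and `clusterSummary` are hypothetical and are not taken from the OpenRefine code base. The point is that the summary ignores the per-candidate checkbox state entirely.

```typescript
// Hypothetical data model for illustration; not OpenRefine's actual types.
interface Choice {
  value: string;    // candidate cell value in the cluster
  rowCount: number; // number of rows containing this value
  checked: boolean; // state of the per-candidate checkbox
}

interface ClusterChoices {
  choices: Choice[];
}

// Static summary: always covers every choice in the cluster,
// whether or not its checkbox is ticked.
function clusterSummary(cluster: ClusterChoices): { size: number; rowCount: number } {
  let rowCount = 0;
  for (const choice of cluster.choices) {
    rowCount += choice.rowCount;
  }
  return { size: cluster.choices.length, rowCount };
}
```

With this shape, unticking a candidate changes what gets merged but leaves "Cluster size" and "Row count" untouched, so the right-hand side filters keep matching the same clusters.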
Hi @wetneb, |
I haven't reviewed the code changes super closely yet, but testing this interactively it looks like we have a small regression in this use case:
- run clustering
- tick the checkbox of one cluster (which ticks the checkboxes in all its values)
- use the first slider on the right-hand side to change the filtering settings
-> with this new code, no clusters are returned, whereas the histogram in the filters indicates that some should be shown.
I have checked that this bug isn't present on the master branch (clusters are shown after following this workflow).
The dataset on which I tried this can be downloaded here (on the "Producteur" column).
Hi @wetneb, |
It works as expected on my side!
I only have a couple of nitpicks on the code left.
Perhaps it could be worth having a new Cypress test case to demonstrate the functionality, if you feel like it.
Hi @wetneb, I made the changes you suggested and added a test case with Cypress. Thanks for your feedback! |
Thanks a lot!
Congratulations @zyadtaha for fixing such an old issue! Your contribution will be appreciated by a lot of users :) |
Fixes #105
Changes proposed in this pull request:
- "Export clusters", "Merge selected & re-cluster" and "Merge selected & Close" will include only checked candidates in the cluster.
- "Cluster size" and "Row count" will change dynamically.
- When the "Browse this cluster" button is clicked, the unchecked candidates will not be included.
- […] "Browse this cluster" button is clicked and no candidates are checked.
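As a rough illustration of the first point in the list above (only checked candidates take part in a merge), here is a minimal sketch. The names `Candidate`, `MergeEdit` and `buildMergeEdit` are assumptions made for this example and do not come from the pull request itself; the actual implementation may be structured quite differently.

```typescript
// Hypothetical types for illustration; not the pull request's actual code.
interface Candidate {
  value: string;    // candidate cell value shown in the cluster
  checked: boolean; // whether its checkbox is ticked
}

interface MergeEdit {
  from: string[]; // values to be replaced
  to: string;     // new cell value for the cluster
}

// Build a merge edit from the checked candidates only.
// Returns null when nothing is checked, so the caller can skip the cluster.
function buildMergeEdit(candidates: Candidate[], newValue: string): MergeEdit | null {
  const from = candidates.filter((c) => c.checked).map((c) => c.value);
  return from.length > 0 ? { from, to: newValue } : null;
}
```

Returning `null` for an empty selection is a design choice of this sketch: it keeps the "nothing checked" case explicit instead of producing an edit that replaces no values.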