Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sort(A) and sort(A, d => d) are not equivalent #217

Closed
Fil opened this issue Jun 23, 2021 · 5 comments · Fixed by #227
Closed

sort(A) and sort(A, d => d) are not equivalent #217

Fil opened this issue Jun 23, 2021 · 5 comments · Fixed by #227
Labels
documentation Improvements or additions to docs question Further information is requested

Comments

@Fil
Copy link
Member

Fil commented Jun 23, 2021

in the way they handle non-comparable values:

const A = [1, undefined, -1];
d3.sort(A); // [-1, 1, undefined]
d3.sort(A, d => d); // [1, undefined, -1]

not sure if this should be considered a bug or not, but it can be surprising. The basic caveat is that having an undefined value (and after #210, a null), will throw a wrench into sorting—maybe it's something to add to the documentation instead.

@Fil Fil added documentation Improvements or additions to docs question Further information is requested labels Jun 23, 2021
@mbostock
Copy link
Member

Definitely a bug.

I’m also wondering now if ascending and descending should consider all non-comparable values as after comparable values, rather than having undefined behavior.

@Fil
Copy link
Member Author

Fil commented Jun 25, 2021

If we change ascending to:

function ascending(a, b) {
  if (a !== a) a = null; // NaN to null
  if (b !== b) b = null;
  return a == null ? b == null ? NaN : 1
    : b == null ? -1
    : a < b ? -1
    : a > b ? 1
    : a >= b ? 0
    : NaN;
}

there are consequences on greatest and greatestIndex (which must not select the undefined, so should use ascendingDefined as default comparator).

@mbostock
Copy link
Member

I was thinking that we should introducing d3.ascendingDefined and d3.descendingDefined (both of which put undefined values at the end, i.e., as the greatest values), and have d3.sort default to using d3.ascendingDefined instead of d3.ascending. I think there are valid use cases for both. We’d presumably use Plot’s implementation, unless we think it needs changing?

https://github.com/observablehq/plot/blob/ccb9c9a60700dd38096eca6b6ae658f8af0e038a/src/defined.js#L3-L13

(One question is that Number.isNaN will detect NaN directly, but won’t detect an object that coerces to NaN, such as an invalid Date. So maybe we should do a self comparison test !(x >= x) instead?)

@mbostock
Copy link
Member

mbostock commented Aug 4, 2021

In the accessor case (sort(objects, d => d.foo)) and if we consider the identity case equivalent (sort(values, d => d)), then I think we can change d3.sort to move null, NaN and undefined values to the end of the array as a generalization of array.sort’s behavior.

However, I’m not sure how to do that in the comparator case (sort(objects, (a, b) => a.foo - b.foo)) since the comparator doesn’t give us enough information: it doesn’t tell us which of a or b is undefined, so we don’t know which one to move to the end. I suppose there might be some way we could try to infer this information, but I think maybe we leave the behavior of d3.sort as-is in the comparator case, and focus on fixing the accessor and one-argument identity case.

@mbostock
Copy link
Member

compare(x, x) !== 0 is enough to tell us that we should move x to the end. But this requires a pass before sorting.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to docs question Further information is requested
2 participants