ENH: stats: add array API-support #20544

Open
13 of 69 tasks
mdhaber opened this issue Apr 20, 2024 · 0 comments
Labels
array types Items related to array API support and input array validation (see gh-18286) enhancement A new feature or improvement scipy.stats

Comments

@mdhaber
Contributor

mdhaber commented Apr 20, 2024

Towards gh-18867

This issue tracks progress toward the addition of array-API support to scipy.stats functions. The functions listed below look ready for conversion, and I'd be happy to review PRs for them. Priority, balancing the ease and importance of the task, is roughly in the order listed.
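For anyone picking up one of these conversions, here is a minimal sketch of what "array-API support" tends to mean in practice: look up the namespace of the input and call only functions from that namespace. The helper `get_namespace` and the function `sem_sketch` are hypothetical names made up for this illustration, not SciPy's internal utilities, and the fallback to NumPy is an assumption so the snippet also runs on NumPy versions whose arrays lack `__array_namespace__`.

```python
import numpy as np


def get_namespace(x):
    """Return the array namespace of `x` (hypothetical helper, not SciPy's)."""
    if hasattr(x, "__array_namespace__"):
        return x.__array_namespace__()
    # Assumption: fall back to NumPy for lists and for older NumPy arrays.
    return np


def sem_sketch(x, axis=0):
    """Standard error of the mean along `axis`, using only namespace calls."""
    xp = get_namespace(x)
    x = xp.asarray(x)
    n = x.shape[axis]
    mean = xp.mean(x, axis=axis, keepdims=True)
    # Sample variance with n - 1 in the denominator.
    var = xp.sum((x - mean) ** 2, axis=axis) / (n - 1)
    return xp.sqrt(var / n)


result = sem_sketch(np.asarray([1.0, 2.0, 3.0, 4.0]))
```

The point is only that the body never refers to `np` directly, so the same code could in principle accept arrays from another compliant library.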

After that:

  • add N-D support to _array_api.cov; consider making it public if array API won't offer it
  • linregress: add axis and array API support
  • ks_2samp: consider natively vectorizing, then adding array API support
  • bartlett: consider natively vectorizing, then adding array API support
  • levene: consider natively vectorizing, then adding array API support
  • anderson_ksamp: might be able to vectorize, then add array API support
  • wasserstein_distance: consider natively vectorizing, then adding array API support
  • energy_distance: consider natively vectorizing, then adding array API support
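To illustrate what "natively vectorizing" and "add axis" could look like for an item such as linregress, here is a rough sketch that computes least-squares slopes and intercepts along an axis with reductions instead of a Python loop over slices. The function name `slope_intercept_sketch` is hypothetical, it uses NumPy directly for brevity, and it covers only the slope and intercept, not the other quantities linregress returns.

```python
import numpy as np


def slope_intercept_sketch(x, y, axis=-1):
    """Least-squares slope and intercept of y on x along `axis` (sketch)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xm = np.mean(x, axis=axis, keepdims=True)
    ym = np.mean(y, axis=axis, keepdims=True)
    # Sums of squares and cross-products, reduced along the working axis.
    ssxm = np.sum((x - xm) ** 2, axis=axis)
    ssxym = np.sum((x - xm) * (y - ym), axis=axis)
    slope = ssxym / ssxm
    intercept = np.squeeze(ym, axis=axis) - slope * np.squeeze(xm, axis=axis)
    return slope, intercept


x = np.asarray([[0.0, 1.0, 2.0], [0.0, 1.0, 2.0]])
y = np.asarray([[1.0, 3.0, 5.0], [2.0, 2.0, 2.0]])
slope, intercept = slope_intercept_sketch(x, y, axis=-1)
```

Because every step is a reduction or an elementwise operation, each row is fitted independently in one pass, which is the property the listed functions would need before array API support becomes straightforward.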

I'd like to implement the following using _masked_array (gh-20363):

  • tmean
  • tvar
  • tmin
  • tmax
  • tstd
  • tsem
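As a rough picture of why a masking approach suits the trimmed statistics, the sketch below computes a trimmed mean by building a boolean mask of in-range values and reducing over it. This is not the `_masked_array` machinery from gh-20363; `trimmed_mean_sketch` and its `lower`/`upper` parameters are hypothetical, and both limits are assumed to be inclusive.

```python
import numpy as np


def trimmed_mean_sketch(x, lower, upper, axis=-1):
    """Mean of the values within [lower, upper] along `axis` (sketch)."""
    x = np.asarray(x, dtype=float)
    keep = (x >= lower) & (x <= upper)
    # Masked-out values contribute zero to the sum and nothing to the count.
    total = np.sum(np.where(keep, x, 0.0), axis=axis)
    count = np.sum(keep, axis=axis)
    return total / count


data = np.asarray([[1.0, 2.0, 3.0, 100.0], [5.0, 6.0, 7.0, 8.0]])
means = trimmed_mean_sketch(data, lower=0.0, upper=10.0)
```

The same mask could feed the variance, minimum, and maximum variants, which is presumably what makes a shared masked-array layer attractive here.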

These functions are held up by rankdata (possibly among other things), which is itself in need of improved array-API support. See gh-20639.

  • kendalltau
  • mannwhitneyu
  • wilcoxon
  • kruskal
  • cramervonmises_2samp
  • friedmanchisquare
  • brunnermunzel
  • ansari
  • fligner
  • mood
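For context on the rankdata dependency, here is a small 1-D sketch of ranking with ties assigned their average rank. It is an illustration only, under the assumption of non-empty 1-D input without NaNs; `rank_average_sketch` is a hypothetical name, and the sketch leans on NumPy-specific features (`np.bincount`, indexed assignment), so it is not itself portable across array libraries.

```python
import numpy as np


def rank_average_sketch(x):
    """Ranks of 1-D `x`, with tied values given their average rank (sketch)."""
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, kind="stable")
    sorted_x = x[order]
    # Flag the first element of each run of equal values.
    is_new = np.concatenate([[True], sorted_x[1:] != sorted_x[:-1]])
    group = np.cumsum(is_new) - 1      # index of the tie group per element
    counts = np.bincount(group)        # size of each tie group
    ends = np.cumsum(counts)           # last 1-based ordinal rank per group
    starts = ends - counts + 1         # first 1-based ordinal rank per group
    average = (starts + ends) / 2.0
    ranks = np.empty(x.shape[0])
    ranks[order] = average[group]      # scatter back to the original order
    return ranks


ranks = rank_average_sketch([10.0, 20.0, 20.0, 30.0])
```

Each of the rank-based tests listed above needs something equivalent to this step along an axis, which is why they wait on rankdata.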

These functions need median, quantile, or similar, either directly or via iqr. See data-apis/array-api#795.

  • iqr
  • siegelslopes
  • theilslopes
  • median_test
  • median_abs_deviation
  • epps_singleton_2samp
  • levene (optional)
  • fligner (optional)
  • sen_seasonal_slopes
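In the meantime, a median can be approximated with tools that array libraries generally do offer, namely a sort along an axis followed by picking the middle element or elements. The sketch below shows the idea with NumPy; `median_via_sort_sketch` is a hypothetical name, and the sketch assumes there are no NaNs and that the axis is non-empty.

```python
import numpy as np


def median_via_sort_sketch(x, axis=-1):
    """Median along `axis` computed from a full sort (sketch)."""
    x = np.sort(np.asarray(x, dtype=float), axis=axis)
    n = x.shape[axis]
    # For odd n both indices coincide; for even n they are the two
    # middle elements, whose mean is the median.
    lower_mid = np.take(x, (n - 1) // 2, axis=axis)
    upper_mid = np.take(x, n // 2, axis=axis)
    return (lower_mid + upper_mid) / 2.0


odd = median_via_sort_sketch(np.asarray([[3.0, 1.0, 2.0], [4.0, 1.0, 3.0]]))
even = median_via_sort_sketch(np.asarray([4.0, 1.0, 3.0, 2.0]))
```

A full sort costs more than a selection algorithm, so this is a stopgap rather than a substitute for a proper median or quantile function in the standard.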

I am not interested in supporting the following functions: bayes_mvs, mvsdist, the frequency statistics, weightedtau, somersd or other tabular methods, multiscale_graphcorr.

Many other functions are not listed here because they really need special function support to be useful.

I wrote the following functions, so I'd prefer to do the upgrades on those personally.
