Is this project still being maintained? #1272

Open
DeliciousHair opened this issue Mar 20, 2023 · 16 comments

Comments

@DeliciousHair

I'm just looking at the activity level in terms of PRs being merged, wondering if this project is still a thing?

@nilgoyette
Collaborator

Last time I heard about this problem, both maintainers were too busy. IIRC, one is a teacher and has several projects going on, and the other has just finished his PhD and started working. xd009642 was added as a maintainer in 2020.

See this issue for more information.

@bluss
Member

bluss commented May 3, 2023

I'd like to give away more of "my role" since I don't have the bandwidth unfortunately. That means bringing on permanent maintainers.

I might want to keep working on lower-level stuff, as I have with matrixmultiply now, and possibly on numerical SIMD for other BLAS-like operations that could benefit ndarray.

I guess it's a bit of a pickle now that the organization is bigger than just one repo but low activity across the board. I can't necessarily do everything without asking others.

The status of the code is "not great" in terms of how easy it is to maintain and change (I am the one who knows most of the internals, there is some lack of abstractions for internals, and there is a lot of unsafe code that works only because of careful contributors, so it is easy to mess up).

What do @termoshtt @nilgoyette @adamreichold @jturner314 @LukeMathWalker think about this? What's the direction for ndarray (there's a lot that can be done - modernisation using const generics)? Are there other projects that we should emulate? Or that have made ndarray redundant?

@ZuseZ4
Contributor

ZuseZ4 commented May 3, 2023

@bluss Since you mentioned matrixmultiply and SIMD in the other post, did you see the work from sarah (faer-rs)?
She does seem to have extremely competitive performance when compared to OpenBLAS, Eigen and matrixmultiply.
I generally liked the design of ndarray a touch more than that of nalgebra due to the data I work with.
If I can add a wish for the direction of ndarray, I'd love to see ndarray using faer for more of its operations.

@antimora

antimora commented May 3, 2023

@bluss Thanks for the update, because the community has been worried about the maintenance of such an important Rust library. Many projects rely on its existence and can't find any drop-in replacement. I hope the ndarray creators and maintainers can come up with a long-term solution. I am sure there are people who would be happy, at the very least, to review PRs.

@DeliciousHair
Author

@bluss good to see this is not abandoned! I also really appreciate the lack-of-bandwidth problem, suffering from it myself regularly.

The issue of how to bring on "permanent" maintainers is an ongoing problem though, at least for open-source projects like this, as they often live and die by the bandwidth (or interest) of a small handful of people. Given that this project does not seem to have the corporate backing that tends to address this particular problem via financial incentives, one possible option is that maintenance is handled by a committee with membership that can change. This would require first setting up a contributing guideline and a code of conduct that would enable said committee to exist, but it may allow work to actually progress without a time-poor bottleneck in the mix.

Just my two cents, really happy to see this conversation happening :-)

@adamreichold
Collaborator

What do you think about this?

I think bringing in more people to share the load is a good idea. It can still fail as volunteers sometimes just do not have any time to contribute. For example, we do have multiple active maintainers who continue the work independently at PyO3. But currently, our active phases almost never overlap which makes small changes slow and it often feels impossible to obtain the necessary consensus for large changes.

As for actually doing it, I see two options: You give access to some people you are able to trust somewhat and let it run, living with the likely but hopefully temporary breakage that results. Or you increase your time investment for a while to actively guide new people into reviewing PRs and making releases, but I am not sure if that is possible at all.

What's the direction for ndarray (there's a lot that can be done - modernisation using const generics)?

For me personally, with my rust-numpy hat on, this crate is mainly a fundamental data structure for scientific computing and hence I would - also in consideration of the bandwidth issue - prefer if it would focus on that. So yes, modernisation but also simplification, i.e. trying to move even more towards an ecosystem model like ndarray-stats and ndarray-linalg where most operations live outside of ndarray itself. If the whole can of worms of accelerated subroutines via BLAS et al. could be simplified or moved into separate add-on crates, that would be great as well.

Are there other projects that we should emulate? Or that have made ndarray redundant?

I do not know of any with the same "fundamental data structure" focus as ndarray.

Is the NumFOCUS organisation something you could see yourself contacting and asking for (monetary) support? Would money alone actually solve anything?

@DeliciousHair
Author

DeliciousHair commented May 4, 2023

Would money alone actually solve anything?

I think the thing it does help with is that funding enables somebody to justify prioritizing their time to doing maintenance should there be conflicting pressures on them as well. I mean it's far from a perfect solution, but life is expensive so unless one can afford to volunteer their time to an open-source software project (and many people can and do, don't get me wrong) then if there is demand X that pays the rent vs. really-interesting-project Y, then X will usually win. A financial incentive simply helps to level this field a bit.

As for directions / applications / focus, it occurs to me that a selling point that could be used to attract some funding (I don't know how any of this stuff works, it's outside my realm of experience) is that these shiny new Rust implementations of ubiquitous Python libraries do have massive market appeal. Just look at how hot polars has become, partially because it stomps all over pandas in many ways. Thus, rather than taking the view that ndarray is a sort of numpy-for-rust, in light of projects like faer and rapl, it may be better to view the ecosystem as the new Rust-backed NumPy, assuming said ecosystem would include some Python wrappers of course.

Dispensing with BLAS/LAPACK and gaining out-of-the-box parallelization has immeasurable value to many industries and use-cases after all.

@adamreichold
Collaborator

I think the thing it does help with is that funding enables somebody to justify prioritizing their time to doing maintenance should there be conflicting pressures on them as well. I mean it's far from a perfect solution, but life is expensive so unless one can afford to volunteer their time to an open-source software project (and many people can and do, don't get me wrong) then if there is demand X that pays the rent vs. really-interesting-project Y, then X will usually win. A financial incentive simply helps to level this field a bit.

I do not disagree, but I would like to add that this reasoning is limited to situations where one works on a project basis. If you have a steady job and obligations to a family, funding for individual projects does not change how much time one has for FOSS work.

@nilgoyette
Collaborator

The status of the code is "not great" in terms of how easy it is to maintain and change (I am the one who knows most of the internals, there is some lack of abstractions for internals, and there is a lot of unsafe code that works only because of careful contributors, so it is easy to mess up).

That's my main problem with this crate if I'm going to help maintain it. When I open the internals, I don't understand what I'm reading. I'm usually able to add a method and whatnot, but I don't feel knowledgeable enough for "more complex" stuff.

  • "lack of abstractions for internals" @bluss Is there a list somewhere? Did you have ideas on how to solve them but never had the time to do it? Can you share those ideas?
  • IIRC, @jturner314 had a PoC for a new iteration management. I don't remember the details, but he claimed that it was, at least, faster. This is super interesting. Can we know the status of this project? Once we have more details, maybe someone will be able to finish it?

So yes, modernisation but also simplification, i.e. trying to move even more towards an ecosystem model like ndarray-stats and ndarray-linalg where most operations live outside of ndarray itself.

This is an excellent idea and this is already what's going on. I created ndarray-ndimage for that reason. ndarray should probably be kept "small" and clean, then let others build on it.

ndarray is a sort of numpy-for-rust

This is exactly my opinion. I don't think ndarray is redundant. nalgebra certainly shares some features, but I do not use them in the same ways nor with the same goals. At least for now, ndarray has a reason to live!

@antimora

antimora commented May 4, 2023

@nilgoyette

This is exactly my opinion. I don't think ndarray is redundant. nalgebra certainly shares some features, but I do not use them in the same ways nor with the same goals. At least for now, ndarray has a reason to live!

Just to highlight the importance of this library: we use ndarray as one of the backends for the Burn deep learning framework.

@jturner314
Member

Personally, now that I'm no longer a student, am working full-time, and have more responsibilities, I have less time and energy to devote to FOSS. And, unfortunately, I don't have much need for ndarray at work, so it's hard to justify spending time on it at work. (I still use ndarray for a few things, such as interacting with NumPy and conveniently parallelizing things in some cases, but nalgebra is usually a better fit for the things I'm working on now.)

I do think that an n-dimensional array type is very important; while nalgebra is nicely polished, the 1-D and 2-D vectors and matrices provided by nalgebra don't satisfy all use cases.

It would be great to bring on more people to take over the maintenance. I'd also be happy to move my ndarray-npy crate to the rust-ndarray organization; I haven't had the energy to really maintain it properly by myself.

As far as improvements go, I think that it would be possible to simplify ndarray's internals and public API by taking advantage of const generics and GATs and reworking the API in terms of traits (instead of the generic ArrayBase type). By simplifying the implementation (using better abstractions which enforce correctness) and making the public API easier to use, I'd hope that more people would use and contribute to ndarray.

I have some ideas for how to update the internals and API using traits, GATs, and const generics, but I doubt I'll find the time to implement it all myself. If someone is interested in working on it, I'd be willing to chat about it.
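
To make the const generics part of this concrete, here is a minimal sketch of the general idea, not the design discussed above and not ndarray's actual API. The `Array` type and its `from_elem` and `get` methods are hypothetical names chosen for illustration. The point is only that the number of dimensions `N` becomes a compile-time parameter, so shapes, strides, and indices are plain `[usize; N]` arrays and a wrong-rank index fails to compile.

```rust
/// Hypothetical, simplified array type (not ndarray's `ArrayBase`).
/// The dimensionality `N` is a const generic parameter.
struct Array<T, const N: usize> {
    data: Vec<T>,
    shape: [usize; N],
    strides: [usize; N],
}

impl<T: Clone, const N: usize> Array<T, N> {
    /// Builds a row-major (standard layout) array filled with `elem`.
    fn from_elem(shape: [usize; N], elem: T) -> Self {
        let mut strides = [0usize; N];
        let mut len = 1;
        // The last axis has stride 1; each earlier axis strides over
        // everything to its right.
        for ax in (0..N).rev() {
            strides[ax] = len;
            len *= shape[ax];
        }
        Array { data: vec![elem; len], shape, strides }
    }
}

impl<T, const N: usize> Array<T, N> {
    /// Bounds-checked element access. The index must have exactly `N`
    /// components, which the compiler enforces.
    fn get(&self, index: [usize; N]) -> Option<&T> {
        let mut offset = 0;
        for ((&i, &len), &stride) in index.iter().zip(&self.shape).zip(&self.strides) {
            if i >= len {
                return None;
            }
            offset += i * stride;
        }
        self.data.get(offset)
    }
}
```

With this sketch, `Array::from_elem([2, 3], 0i32)` has strides `[3, 1]`, `get([1, 2])` returns the last element, and `get([1, 2, 0])` is rejected at compile time because the index has the wrong rank. Dynamic-rank arrays, views, and the trait-based API mentioned above are outside what this sketch covers.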

IIRC, @jturner314 had a PoC for a new iteration management. I don't remember the details, but he claimed that it was, at least, faster. This is super interesting. Can we know the status of this project? Once we have more details, maybe someone will be able to finish it?

Yeah, ndarray currently has optimal iteration only in a few special cases (all arrays standard layout or all Fortran layout). By reordering iteration to be as close to memory order as possible, flattening nested loops where possible, and tiling where necessary, it would be possible to improve the iteration performance over arrays of dissimilar or non-standard layouts. I put together an initial prototype of the reordering and loop-flattening pieces at https://github.com/jturner314/nditer. I also worked on automatic tiling but haven't pushed that to the repo. The primary thing that blocked me from finalizing that project was testing -- the code is complicated in some places, so I really wanted to implement proptest support for generating and testing with arrays of arbitrary shapes and layouts. I didn't get a chance to finish it. Another way to improve performance would be to take better advantage of SIMD, but I didn't work on that.
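
As a rough illustration of the reordering and loop-flattening steps described above (not the code in the nditer repository, and not ndarray's API), the sketch below works on a plain shape and stride list. The helper names `memory_order` and `flatten_axes` are hypothetical, and the sketch assumes non-zero strides and ignores tiling and multiple arrays.

```rust
use std::cmp::Reverse;

/// Hypothetical helper: returns the axes ordered from outermost to innermost
/// loop so that the innermost loop walks the smallest stride, i.e. iteration
/// follows memory order as closely as possible.
fn memory_order(strides: &[isize]) -> Vec<usize> {
    let mut axes: Vec<usize> = (0..strides.len()).collect();
    axes.sort_by_key(|&ax| Reverse(strides[ax].abs()));
    axes
}

/// Hypothetical helper: walks the axes in the given loop order and merges an
/// inner axis into the loop before it whenever the outer stride equals
/// `inner_len * inner_stride`, so two nested loops collapse into one.
/// Returns the remaining loops as `(length, stride)` pairs, outermost first.
fn flatten_axes(shape: &[usize], strides: &[isize], order: &[usize]) -> Vec<(usize, isize)> {
    let mut loops: Vec<(usize, isize)> = Vec::new();
    for &ax in order {
        let (len, stride) = (shape[ax], strides[ax]);
        if let Some(outer) = loops.last_mut() {
            if outer.1 == len as isize * stride {
                // The two axes are contiguous with each other: merge them.
                *outer = (outer.0 * len, stride);
                continue;
            }
        }
        loops.push((len, stride));
    }
    loops
}
```

For a Fortran-layout array of shape `[2, 3, 4]` with element strides `[1, 2, 6]`, `memory_order` gives `[2, 1, 0]` and `flatten_axes` collapses the three loops into a single loop of 24 elements with stride 1. For a `[2, 3]` view with strides `[6, 1]` (rows of a wider array), nothing can be merged and two loops remain.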

@bluss
Member

bluss commented May 13, 2023

Great input from everyone. I wasn't fully aware of faer-rs, no, so thanks for the pointer.

I would like to invite those participating in the discussion here to become collaborators in ndarray.
It's unfortunately not realistic for me to take on a greater responsibility now, so that is not going to be the outcome of the discussion, even though one could reasonably wish for it. I want to leave the way open for others to develop ndarray without having me as gatekeeper.

Can I for example ask @adamreichold, are you interested? Do you have any contacts that are?

@adamreichold
Collaborator

Can I for example ask @adamreichold, are you interested? Do you have any contacts that are?

Took me a while to consider the commitment but yes, I am interested. I would be glad if I could help with maintenance and eventually further development.

I do think my own time budget and my inexperience in maintaining this particular project imply that I could not immediately tackle any large changes. On the contrary, in the beginning I would deliberately limit myself to building and packaging issues and reviewing contributions, with the aim of producing point releases and hopefully eventually a 0.16.0 release. Ideally, I will be able to learn enough to do more in the future.

(I also do not want to give a wrong impression, I do not consider myself well-networked and have few contacts beyond direct collaboration via the FOSS projects. I will ask the one acquaintance who I think could be in a position to contribute though.)

@nilgoyette
Collaborator

I would like to invite those participating in the discussion here to become collaborators in ndarray.

I find myself in jturner's situation (less/no more ndarray at work, for a while), but I really love ndarray and I can at least

  • check the issues page and answer, when I'm able.
    • We currently have 213 of those. Can I (we) clean them up? I offer to read all of them and close those that are no longer relevant. Can I have the right to do so?
  • try to review PRs when I understand what's going on

@bluss
Member

bluss commented May 16, 2023

Awesome, I've added you on this repo, but there is more admin to do - the whole org - which we will get to

@DeliciousHair
Author

DeliciousHair commented Jun 5, 2023

Thought I'd chime in here that I'd be happy to put my hand up to volunteer for some sort of maintainer / reviewer status. At the moment I'm also trying to contribute to rapl so I've at least got my mind in the correct linear algebra / tensor space to be thinking about this. Work schedule is a bit up and down, rather "up" at the moment so free time is at a premium and contributions will be slim for the next month or so. However, I do have enough availability to do reviews most any time, and am happy to participate in any planning where my input may be of value.

7 participants