Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: Re-separate the openadas api into its own package #374

Open
je-cook opened this issue Aug 5, 2022 · 4 comments
Open

Proposal: Re-separate the openadas api into its own package #374

je-cook opened this issue Aug 5, 2022 · 4 comments

Comments

@je-cook
Copy link

je-cook commented Aug 5, 2022

Would it be possible to re-separate the openadas API package from the core of CHERAB?

In the Bluemira[1] and PROCESS[2] teams we're looking for an alternative to getting the user to download data from openadas directly (as you have already implemented).

As I understand it from a conversation with Carine the reason for the original combining was to ease installation hurdles. I think if the below is added to your core setup.py file the user would see no difference (https://pip.pypa.io/en/stable/topics/vcs-support/):

# setup.py

setup(...
      install_requires=[
      ...
      "openadas @git+https://github.com/<repo location>.git",
      ]

)

Secondly and this is maybe a bit more cheeky could the API not have a hard dependency on CHERAB's core?
From a quick look at the API it doesnt look like any of the imports are unavoidable with some work (for instance the unit conversion I would suggest Pint). It also seems like an easier dependency flow than the slightly circular situation that could arise if they were separated in their current form.

We would be happy to help with the separation where we can.

[1] https://github.com/Fusion-Power-Plant-Framework/bluemira
[2] https://ccfe.ukaea.uk/resources/process/

@Mateasek
Copy link
Member

Mateasek commented Aug 5, 2022

Hi @je-cook,

If I'm not mistaken, Cherab was always meant to be able to support multiple (even custom made) atomic databases. The core code dependencies are there to set the interface. OpenADAS is just the default option and one of possible sources of atomic data. Breaking the dependency would require redefining the whole ideology behind the described system. There are dependencies on Raysect also. If what I described is true I'm not sure what you propose would be that simple and a correct direction for the Cherab project.

@je-cook
Copy link
Author

je-cook commented Aug 5, 2022

Thanks for the quick reply.

Supporting multiple atomic databases seem like a great thing to have! The separation I am talking about is the level below that. Specifically, parsing and downloading of the data. The only hurdle I saw in all the folders besides the raysect imports in your rates folder was your Element and Isotope class which reminded me of something like Mendeleev.

Possibly a direction would be a generic library that has interfaces to lots of atomic databases, I imagine that would be more work however 🙂

How do the Raysect interpolators that are the imports in the rates folder differ from for instance the Scipy interpolators (https://docs.scipy.org/doc/scipy/tutorial/interpolate.html)? Naiively they seem to have a similar level of configurability.

As I said before I didnt expect this necessarily to be straight forward but as all three of us are using data from OpenADAS it seemed a good idea to pool the resources.

@jacklovell
Copy link
Member

Thanks for the proposal @je-cook. Certainly worth further discussion. A couple of points:

As I understand it from a conversation with Carine the reason for the original combining was to ease installation hurdles. I think if the below is added to your core setup.py file the user would see no difference (https://pip.pypa.io/en/stable/topics/vcs-support/):

The installation issues are not related to the availability of the cherab-openadas sub-package. It's actually a bug in Cython which prevents packages which cimport Cherab modules from seeing those modules at build time: cython/cython#2918. A fix for this is in the alpha builds for Cython 3.0 but was not backported to the 0.x series. Until Cython 3.0 is released and widely available there is a high probability that anybody trying to install Cherab from source will run into this problem. This was the initial motivation for moving the cherab-openadas submodule into core (well, that and the fact that cherab-core requires at least one atomic data source to work at all so it makes sense to provide this one as a built-in).

The only hurdle I saw in all the folders besides the raysect imports in your rates folder was your Element and Isotope class which reminded me of something like Mendeleev.

The Element and Isotope classes are quite deeply ingrained into the framework, not just in core but in other submodules like cherab-solps too. Re-writing the framework to pull in an external library dependency and break backwards compatibility is not something I think we want to commit to.

How do the Raysect interpolators that are the imports in the rates folder differ from for instance the Scipy interpolators (https://docs.scipy.org/doc/scipy/tutorial/interpolate.html)? Naiively they seem to have a similar level of configurability.

The Scipy interpolators are optimised for a single call with large arrays of data. The Raysect interpolators are optimised for many calls each with a single data value, which follows the computational paradigm used by the ray tracer. Plenty of new users to Cherab have made their own interpolators using the Scipy ones and absolutely destroyed the performance of the framework. They should not be used for observations with Cherab.

There is a part of the cherab-openadas package which could be pulled out and usefully generalised: the part which reads the mish mash of ADF files and converts them into a well-defined JSON representation. Having that well-defined representation and a couple of reasonably-generic parsers would potentially reduce the hurdles for individual codes to incorporate the data: one could take an example parser and only need to make small modifications to match whatever representation is taken for those codes. I'd be more comfortable about limiting the scope of a package which aims to be generic, to only do one small thing and do that well. One could easily spend more time making a library compatible with 3 codes than you'd spend writing the same thing 3 times for each code (replace 3 with as appropriate).

@je-cook
Copy link
Author

je-cook commented Aug 8, 2022

Thanks for the clarifications @jacklovell really useful to know, and I agree with the generalisation comment, it was more whimsical than serious!

There is a part of the cherab-openadas package which could be pulled out and usefully generalised: the part which reads the mish mash of ADF files and converts them into a well-defined JSON representation.

I think this is the best way forward, I will drop a message separately so we can have a meeting about the finer details.

Thanks for the consideration out of the blue, I will leave this open until we get round to implementing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants