Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OpenADAS: Expand ADF15 headers #298

Open
Mateasek opened this issue May 31, 2021 · 7 comments
Open

OpenADAS: Expand ADF15 headers #298

Mateasek opened this issue May 31, 2021 · 7 comments

Comments

@Mateasek
Copy link
Member

Recently I had to add an ADF15 file sent to me from ADAS which has a header the current parser doesn't recognise. I changed the adf15 parsing slightly to make it cover more versions without needing to touch the currently used regexp. PR #297 shows what I did. I tested it by populating a new repository, but I'm not sure the code should look like this. If we have to add more versions in future the code will become quite long. Would that be a problem? We could also define the headers in a separate file. Any opinions?

@vsnever
Copy link
Member

vsnever commented Jun 1, 2021

Hi @Mateasek,
I think the way you defined the headers in #297 is fine. There are not so many PEC versions in ADAS.

It is likely that the same problem may occur with the files for hydrogen and hydrogen-like ions, so I think that pec_index_header should be updated in _scrape_metadata_hydrogen() and _scrape_metadata_hydrogen_like() as well.

Unfortunately, the headers are just the tip of the iceberg. It will be more difficult to make a parser that supports different formats of electronic configuration strings, #256.

@Mateasek
Copy link
Member Author

Mateasek commented Jun 1, 2021

It is likely that the same problem may occur with the files for hydrogen and hydrogen-like ions, so I think that pec_index_header should be updated in _scrape_metadata_hydrogen() and _scrape_metadata_hydrogen_like() as well.

I thought I would leave it to moment when we "discover" a new version of the header with the other methods. The problem is really in the comment section of the data file. I see that this problem is deeper than I expected. I will wait a bit more for somebody else to express their opinions and then I will make a regular PR.

@roni-maenpaa
Copy link

Hi @Mateasek and @vsnever

I think as an interim solution this is quite useful - one runs into this issue when parsing, for example, "adf15/pec96#n/pec96#n_vsu#n3.dat".

I tested PR #297 (now closed?) and it worked well for me. :)

@Mateasek
Copy link
Member Author

Hi @roni-maenpaa

thanks for the information. I'm happy that it helps, but I think adding more and more regex definitions and cases is not the proper way forward. We will probably have to go some other way, since the problematic part of the data files is assumed by ADAS to be loosely defined comment section.

One idea was to avoid parsing the problematic file part let users define provide the necessary info themselves along providing a set of examples how to parse the most common versions of the headers. Taking care of the parsing inside cherab-openadas is quite tedious and never ending job...

@roni-maenpaa
Copy link

Thanks a lot for the response, @Mateasek! I see that this issue is indeed quite tricky.

I had another question regarding the handling of OpenADAS-data in Cherab which may be related, so I will link the issue I raised here just in case: #426

@jacklovell
Copy link
Member

How about letting the user pass in a custom metadata_parser function object:

def parse_adf15(element, change, adf_file_path, metadata_parser=None):
...

When the value is None make a best-effort guess at a suitable comment block parser based on the ones we have available at the moment (hydrogen, hydrogen-like, full): I would also vote for adding a parser for PEC40 etc. to the default set as it covers quite a lot of useful transitions (virtually all the Argon PECs, for example). These parsers would become public functions rather than private adf15.py functions.

End users then have the option of providing their own custom parser for a particular file format, which removes the burden on Cherab to cover all possible metadata formats and instead restrict ourselves to only common/popular formats. We could then welcome pull requests to add additional parser functions if the community ends up finding them generally useful.

An additional demo showing how to write a parser for an example unsupported format (this could be PEC40 for example) should also be provided: at that point we would have provided all the tools required to use any ADF15 file.

@Mateasek
Copy link
Member Author

Mateasek commented May 20, 2024

I would go even further. Lets specify the format of the data the users have to provide to the parsing functions to get the problematic information in (e.g. wavelengths). Then lets have a pool of ready parsers providing the right data format as an example of what the users have to do themselves.

What I mean is lets specify data format on the input and provide some examples of how to get the data.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants