Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update information for omdb in providers.json #53

Merged
merged 2 commits into from
Jan 5, 2021

Conversation

rartino
Copy link
Contributor

@rartino rartino commented Jan 5, 2021

Now httk has support for hosting OPTIMADE v1.0.0, which we are setting up for hosting omdb. This PR updates the omdb info accordingly to point to our index meta-database.

@rartino rartino requested a review from CasperWA January 5, 2021 15:58
Copy link
Member

@CasperWA CasperWA left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems to all check out, cheers @rartino 👍

@CasperWA CasperWA merged commit 5ffde13 into Materials-Consortia:master Jan 5, 2021
@CasperWA
Copy link
Member

CasperWA commented Jan 5, 2021

As a note, I have just tested this out in my client, and it correctly identifies your production database as a potentially valid database to be used for the client, but then after some more probing it deems it invalid due to the missing versioned URL, etc.

@rartino rartino deleted the update_of_omdb branch January 6, 2021 10:21
@rartino
Copy link
Contributor Author

rartino commented Jan 6, 2021

@CasperWA All parts of the setup wasn't in place yet. They are now.

(I'm trying to run the optimade-python-tools validator on http://optimade.openmaterialsdb.se/, but I didn't yet quite figure it out. The optimade-python-tools README.md says: "It also contains a server validator tool, which may be called from the shell or used as a GitHub Action from optimade-validator-action.", but where is the shell-script? Is there any documentation on how to run the validator?)

@CasperWA
Copy link
Member

CasperWA commented Jan 6, 2021

@CasperWA All parts of the setup wasn't in place yet. They are now.

(I'm trying to run the optimade-python-tools validator on http://optimade.openmaterialsdb.se/, but I didn't yet quite figure it out. The optimade-python-tools README.md says: "It also contains a server validator tool, which may be called from the shell or used as a GitHub Action from optimade-validator-action.", but where is the shell-script? Is there any documentation on how to run the validator?)

Maybe it isn't described well enough in the documentation (https://www.optimade.org/optimade-python-tools/) actually, just looking at the table of contents, it seems there isn't a dedicated page for the validator yet. For shame.

In any case.
What you want to do is this:

Ensure you have a Python 3.6+ environment.

$ pip install -U optimade
Installing optimade from PyPI ...
$ optimade-validator http://optimade.openmaterialsdb.se
Validating ...

To understand what options you can pass to the validator, you can use the common -h/--help option.

Note: If you want the "state-of-the-art" validator, i.e., the development version 😅 you can either git clone the optimade-python-tools repository down locally and install via pip install -U -e /path/to/optimade-python-tools or install it directly from GitHub via pip install -U git+https://github.com/Materials-Consortia/optimade-python-tools.git@master#egg=optimade.

@CasperWA
Copy link
Member

CasperWA commented Jan 6, 2021

Also, as a side note. Do you have anywhere I can post issues for your implementation?
I cannot seem to search in your database, try, e.g., http://optimade.openmaterialsdb.se/v1.0.0/structures?filter=NOT+structure_features+HAS+ANY+%22assemblies%22&sort=id&response_format=json&page_limit=10&page_offset=0&page_number=1

I receive a 500 Internal Server Error response with the error detail: "optimade_filter_to_httk_recurse() got multiple values for keyword argument 'recursion'".
As a positive it correctly returns errors in the OPTIMADE way :)

@rartino
Copy link
Contributor Author

rartino commented Jan 15, 2021

@CasperWA I've found and corrected a few bugs since the version you looked at this.

The implementation is now in the httk devel branch, so please just file any remaining oddities you find as issues here: https://github.com/httk/httk

I get 18 out of 22 from the validator, but it apparently experiences an internal failure, and I don't quite understand the error messages I get (one error it reports sends in a very odd syntax for the response_fields, but perhaps it is some issue with my info formatting...). I probably should file these as issues with the validator.

@ml-evs
Copy link
Member

ml-evs commented Jan 16, 2021

Hey @rartino, just took a quick look at this, it seems like you're hitting every branch on the way down in terms of triggering validator errors!

The internal error is being caused by your single-entry endpoint, which MUST (from memory) contain the single entry at the top-level, not as a list with 1 entry. Somehow the validator is accommodating for this failure, and then is using all of the next-level keys as the returned properties to test in other entries... It also looks like you're missing the data_returned field in your meta responses, which is causing the internal failure.

@rartino
Copy link
Contributor Author

rartino commented Jan 16, 2021

@ml-evs Thanks for taking a look and giving feedback!

Good find with the single entry endpoint, I'll fix that. (But I guess also the validator needs a fix :-) )

Isn't the data_returned optional on SHOULD level? It isn't super easy to produce it in my implementation at the moment. I will probably get it implemented rather soon, but databases working with huge datasets can have a very difficult time producing an exact count of the number of hits of a query. See, e.g., google's hit count.

@ml-evs
Copy link
Member

ml-evs commented Jan 16, 2021

Good find with the single entry endpoint, I'll fix that. (But I guess also the validator needs a fix :-) )

Unfortunately I think this is caused by a recent change that made the validator more forgiving on the single entry endpoint where someone had forgotten to include meta (but wanted to validate the rest), clearly we still need to find the right balance! For now I'll just add an explicit check and error message rather than this weird response field behaviour.

Isn't the data_returned optional on SHOULD level?

Looks like it is! I'll push a fix today and release v0.12.8 so it doesn't catch anyone else out.

@ml-evs
Copy link
Member

ml-evs commented Jan 16, 2021

Hey @rartino, have nailed down the problem to pagination: when you request response fields from the OMDB, the pagination link mangles the response fields:

e.g. http://optimade.openmaterialsdb.se/structures?response_fields=dimension_types

has links->next of http://optimade.openmaterialsdb.se/structures?response_fields=%5B%27dimension_types%27%2C+%27type%27%2C+%27id%27%5D&page_limit=50&page_offset=50&response_format=json&unknown_response_fields=%5B%5D.

@rartino
Copy link
Contributor Author

rartino commented Jan 18, 2021

@ml-evs and @CasperWA Thanks for the help. Now the httk implementation validates cleanly (but given the number of bugs I've corrected the last couple of days I am sure there are more left...).

@ml-evs
Copy link
Member

ml-evs commented Jan 18, 2021

We're just about to fix the data_available and data_returned comments you have, so thanks for reporting that!

@rartino
Copy link
Contributor Author

rartino commented Jan 18, 2021

@ml-evs Right, I also meant to say in my last comment that I have now implemented both data_available and data_returned which you may need for updating the OPTIMADE paper table.

(Note: the omdb database available via OPTIMADE is presently only at ~ 10% of our full database, as these changes required rebuilding the database and we are chewing through that. However, I'm of course fine with whatever numbers you get when you run your script as long as they are > 0 :-) )

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants