Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensuring that links point to index meta-dbs #34

Merged

Conversation

giovannipizzi
Copy link
Contributor

@giovannipizzi giovannipizzi commented Jun 11, 2020

In particular:

Also, this PR probably supersedes PR #18
Mentioning also #23 as here we are providing an easy way to host an index Meta-DB, if you don't want to host it yourself.

Fixes #36 : Updating providers.json with the updated links resource object schema added to the spec in Materials-Consortia/OPTIMADE#273.

- Explicitly stated in the README the requirement that this list
  only allows pointing to Index meta-DBs
- Add Index meta-DBs for COD and TCOD
- Add instructions on how to host a new Index meta-DB
- Set to `null` the `base_url` of the example entry (otherwise it
  would not pass validation)
- Added new pytest that validates that all `base_url`s in the main
  `providers.json` file are indeed Index meta-DBs, and that
  at least these Index meta-DBs have a valid syntax
@giovannipizzi
Copy link
Contributor Author

giovannipizzi commented Jun 11, 2020

I create a draft PR so I can start to get feedback.

This is still draft because:

Also removing the now useless symlinks that, in my opinion,
create confusion (I was under the impression that they were honored,
but they are not - I think we used earlier, before Netlify,
when serving with GitHub Pages).
@giovannipizzi
Copy link
Contributor Author

Just for reference, for people who want to check live the results using Netlify preview feature:

BTW: Thanks @rartino for pointing out this preview feature! Not only I discovered a small issue, but I could also simplify further the instructions to create a new meta-db: now the redirects are parametrised, so there is no need to create symlinks or specify new redirects. I also took the occasion to remove symlinks from the repo: I think they are confusing, as they are not really honoured by netlify, that instead uses the _redirects file. So I think it's better to just use that and remove all symlinks.

@rartino
Copy link
Contributor

rartino commented Jun 11, 2020

I suppose we can remove the symlinks... But they were there because with them, the repository also works correctly in github pages, except for the serving json files with the wrong mime-type.

But I guess some people may find it a good thing that this way of serving the repositry with that flaw stops working. (I was holding out for github pages to deliver some form of solution to that limitation some day...)

@giovannipizzi
Copy link
Contributor Author

@rartino yes, I see. I think the best would be to open an issue to say that we might want to go back to GitHub pages, and in that case we should remove _redirects and re-instate the symlinks. And stating that at the moment we cannot solve this because of GitHub pages limitations. So the repo stays cleaner (and it's easier for people to be consistent when creating a new static index meta-database). What do you think?

src/index-metadbs/cod/v1/links.json Outdated Show resolved Hide resolved
src/index-metadbs/tcod/v1/links.json Outdated Show resolved Hide resolved
@rartino
Copy link
Contributor

rartino commented Jun 11, 2020

Also: I agree with your suggestion. Do you open the issue, just so that the notes about what needs to be done to re-instate github compatibility are kept. (You know best what you removed...)

giovannipizzi and others added 2 commits June 11, 2020 18:19
Co-authored-by: Rickard Armiento <gitcommits@armiento.net>
@rartino
Copy link
Contributor

rartino commented Jun 11, 2020

Sorry about this; but my first suggestion had slashes at the end of the OPTIMADE providers URLs (both homepage and base_url). I tried to quickly edit it, but apparently you picked it up anyway.

@giovannipizzi
Copy link
Contributor Author

@rartino: done, issue open: #35

Also thanks for the providers link. I didn't focus on fully converting to the changed suggested in Materials-Consortia/OPTIMADE#273, but this is a good point. As a note: we'll need to adapt the type and link_type also of the main file before merging.

@giovannipizzi
Copy link
Contributor Author

BTW: this issue is ready for review. The failing test is due to something that is being fixed in the nomad implementation (pinging @markus1978 - as soon as it's redeployed, please let me know so I re-run the tests)

src/index-metadbs/tcod/v1/links.json Outdated Show resolved Hide resolved
src/index-metadbs/cod/v1/links.json Outdated Show resolved Hide resolved
src/index-metadbs/exmpl/v1/links.json Outdated Show resolved Hide resolved
giovannipizzi and others added 3 commits June 11, 2020 20:41
Co-authored-by: Rickard Armiento <gitcommits@armiento.net>
Also adding flag that allows to allow/disallow v0.x endpoints
These have been adapted in the OPTIMADE specs in
Materials-Consortia/OPTIMADE#273
Copy link
Member

@CasperWA CasperWA left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great @giovannipizzi !

I haven't checked the test_providers.py, but as the only file.

.github/workflows/validate.yml Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
_redirects Show resolved Hide resolved
src/index-metadbs/README.md Outdated Show resolved Hide resolved
src/index-metadbs/README.md Outdated Show resolved Hide resolved
src/index-metadbs/exmpl/v1/links.json Outdated Show resolved Hide resolved
src/index-metadbs/exmpl/v1/links.json Outdated Show resolved Hide resolved
src/index-metadbs/tcod/v1/info.json Outdated Show resolved Hide resolved
src/index-metadbs/tcod/v1/links.json Outdated Show resolved Hide resolved
src/index-metadbs/tcod/v1/links.json Outdated Show resolved Hide resolved
@giovannipizzi
Copy link
Contributor Author

Not your fault @CasperWA actually thanks for reviewing! I just saw 273 merged so I decided to update directly this.

BTW: I think this is ready for a final review and possibly merge.
The only remaining thing is the failing tests on nomad.

I see three options:

  • we wait for Markus to deploy the fix
  • we merge anyway (the test is not blocking), even if I don't like this approach too much (we might miss other problems)
  • we temporarily remove nomad's url and we put it back as soon as it is valid

I don't like approach 3, so I suggest approach 1; but if there are no other issues I'd like to merge this by tomorrow (Friday) evening before the end the workshop (also because this is blocking for e.g. #23) so maybe if the deployment is not ready, we continue tomorrow with approach 3?

@markus1978 do you know how long it will take for the fix to be in production? What is the opinion of others?

Also, @sauliusg it would be good to have your review on this since I am "touching" the COD and TCOD entries (even if I am not really changing anything, except adding the Index Meta-DB).

README.md Show resolved Hide resolved
@CasperWA
Copy link
Member

The only remaining thing is the failing tests on nomad.

I think @markus1978 has an intention of having the Index Meta-Database hosted here as well?

@giovannipizzi
Copy link
Contributor Author

I think @markus1978 told me in private that now they have their own meta-db so he's happy to use his own, and that he would have fixed the syntax error in his response

@rartino
Copy link
Contributor

rartino commented Jun 11, 2020

How would we normally handle this situation? As the providers list grows, there surely is going to be situations when we want to merge changes while someones database is offline. We probably should only mark as 'fail' if the database we are trying to add is the breaking one and only issue warnings for other failing databases. (Is it possible in travis to see which database is being added somehow?...)

And then we probably should set up some other external system that checks the list periodically and alerts us if some database has been failing for weeks - in which case we should have a protocol for sending out a warning, and then removing it.

@CasperWA
Copy link
Member

How would we normally handle this situation? As the providers list grows, there surely is going to be situations when we want to merge changes while someones database is offline. We probably should only mark as 'fail' if the database we are trying to add is the breaking one and only issue warnings for other failing databases. (Is it possible in travis to see which database is being added somehow?...)

And then we probably should set up some other external system that checks the list periodically and alerts us if some database has been failing for weeks - in which case we should have a protocol for sending out a warning, and then removing it.

Yeah, this feels like something that is at a warning level and not failing/error level. I.e., these things shouldn't fail our CI tests.

I agree that it would be great to test the incoming providers, however, I don't know how easy that is to do technically, and I don't know that it even at that point should be a failure criterium? I would still consider this as a warning.

Instead we should probably have a "dashboard" of a sort, where it's possible to see the current list of responding providers ("current" here being on a weekly basis or similar?).

@giovannipizzi
Copy link
Contributor Author

Fully agreed, was thinking the same thing.
I still think it valuable to have the tests run at PR time, so one at least can check manually if there is a problem with the new additions (I don't think it's obvious how to check only the 'new' ones).
And this should anyway be a periodic check and produce some rendered output with badges, i.e. the dashboard idea of Casper. E.g. we do something similar in our plugin registry using GitHub pages. I'll think if there is something similar that can be done here. But I'd say that this is for a different PR.

If we accept that these errors should be checked manually but not be blocking (I am ok, and we can revisit later), I will now convert this PR from a draft to a real PR, so it can be reviewed and merged.

@giovannipizzi giovannipizzi marked this pull request as ready for review June 12, 2020 06:55
@rartino
Copy link
Contributor

rartino commented Jun 12, 2020

I know this is last minute, but should we add a field date_added? Then it is easy to only report an error if the one with the highest data_added breaks. It also seems useful metadata going forward.

This was discussed and found not a good idea. Maybe we can do something similar by letting the validator look at the git history.

@markus1978
Copy link
Contributor

I think I fixed the nomad issues that failed the tests. Is there a way to re-run the checks without a new commit?

@giovannipizzi
Copy link
Contributor Author

@markus1978 yes, I just rerun it. That problem went away, thanks!
But now we have this error:

[ERROR] data -> attributes -> available_api_versions -> 0 -> url
92
E           [ERROR]   url MUST be a versioned base URL. It is: http://repository.nomad-coe.eu/v0.8/optimade/index (type=value_error)

I think that the specs require that in the /info endpoint here, instead, you point to the versioned API (see example here). See specs here.

@giovannipizzi
Copy link
Contributor Author

@CasperWA is this ok with you now?

@giovannipizzi
Copy link
Contributor Author

Regarding the dashboard, see the independent PR #39

@markus1978
Copy link
Contributor

Ok, I am going to change the url under available versions. But, if I remember correctly @CasperWA instructing me the other way around? But I might also just be solidly confused, since it feels that I have changed urls from versioned to un-versioned and back again about a dozen times by now.

@giovannipizzi
Copy link
Contributor Author

:-) I think it's unversioned when you link to another base_url, but versioned when you declare your own URL in /info (please correct me if I'm wrong)

Copy link
Member

@CasperWA CasperWA left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should use the IndexInfoResponse for the /info endpoints of Index Meta-Databases. These are slightly different, adding in the relationships key where one can provide a default database/links resource.

Concerning the suggested version value changes, I am not completely sure if they are needed according to the spec, so I would suggest to check that before blindly accepting the changes :)

Otherwise fine by me, thanks @giovannipizzi !

README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
src/index-metadbs/cod/v1/info.json Outdated Show resolved Hide resolved
src/index-metadbs/cod/v1/info.json Outdated Show resolved Hide resolved
src/index-metadbs/exmpl/v1/info.json Outdated Show resolved Hide resolved
src/index-metadbs/tcod/v1/info.json Outdated Show resolved Hide resolved
src/index-metadbs/tcod/v1/info.json Outdated Show resolved Hide resolved
tests/test_providers.py Outdated Show resolved Hide resolved
tests/test_providers.py Outdated Show resolved Hide resolved
tests/test_providers.py Outdated Show resolved Hide resolved
@CasperWA
Copy link
Member

Ok, I am going to change the url under available versions. But, if I remember correctly @CasperWA instructing me the other way around? But I might also just be solidly confused, since it feels that I have changed urls from versioned to un-versioned and back again about a dozen times by now.

In the available_api_versions it should indeed be versioned base URLs (as far as I remember from the spec).

@markus1978
Copy link
Contributor

All good, eventually I will get it; you just have to be patient with me ;-). Should be fixed now.

@giovannipizzi
Copy link
Contributor Author

Thanks a lot @markus1978 - all tests pass now! :-D

Co-authored-by: Casper Welzel Andersen <43357585+CasperWA@users.noreply.github.com>
@giovannipizzi
Copy link
Contributor Author

Thanks @CasperWA . Merged all your changes.
However now I have a bunch of failures:

[ERROR] PROBLEMS DETECTED!
[ERROR] 
[ERROR] Provider "cod": http://providers.optimade.org/index-metadbs/cod/v1/info endpoint has problems during validation.
[ERROR] Error message:
[ERROR] 1 validation error for IndexInfoResponse
[ERROR] data -> relationships -> default -> data -> type
[ERROR]   unexpected value; permitted: 'child' (type=value_error.const; given=links; permitted=['child'])
[ERROR] 
[ERROR] Provider "exmpl": http://providers.optimade.org/index-metadbs/exmpl/v1/info endpoint has problems during validation.
[ERROR] Error message:
[ERROR] 1 validation error for IndexInfoResponse
[ERROR] data -> relationships -> default -> data -> type
[ERROR]   unexpected value; permitted: 'child' (type=value_error.const; given=links; permitted=['child'])
[ERROR] 
[ERROR] Provider "nmd": https://repository.nomad-coe.eu/v0.8/optimade/index/v0/info endpoint has problems during validation.
[ERROR] Error message:
[ERROR] 1 validation error for IndexInfoResponse
[ERROR] data -> relationships
[ERROR]   field required (type=value_error.missing)
[ERROR] 
[ERROR] Provider "tcod": http://providers.optimade.org/index-metadbs/tcod/v1/info endpoint has problems during validation.
[ERROR] Error message:
[ERROR] 1 validation error for IndexInfoResponse
[ERROR] data -> relationships -> default -> data -> type
[ERROR]   unexpected value; permitted: 'child' (type=value_error.const; given=links; permitted=['child'])

For those of cod, cod, exmpl, it's probably because the optimade-python-tools have not yet been released with the fix? When will this happen?

For nomad, is the type required in the end?
See also one of the points of Materials-Consortia/optimade-python-tools#332

Copy link
Contributor

@rartino rartino left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

How clear are we in the specification about in which fields you are supposed to prefix the version number with 'v' and not? I feel we are widely inconsistent with this. (And IMO it had been better to just have the 'v' on the URL and nowhere else.)

@giovannipizzi giovannipizzi dismissed CasperWA’s stale review June 17, 2020 15:39

I think I addressed all requests by Casper so I'm dismissing this

@giovannipizzi giovannipizzi merged commit a96d424 into Materials-Consortia:master Jun 17, 2020
@giovannipizzi giovannipizzi deleted the fix_index_metadbs branch June 17, 2020 15:39
@giovannipizzi
Copy link
Contributor Author

@rartino I'm not 100% sure that this is currently specified, I didn't check - what is in here should be correct as it is used by the client. Indeed we should probably open an issue in the main repo if this is not clear, or it's confusing

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants