-
-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Wikibase: proxy manifest requests #6136
base: master
Are you sure you want to change the base?
Conversation
extensions/wikibase/src/org/openrefine/wikibase/commands/FetchManifestCommand.java
Dismissed
Show dismissed
Hide dismissed
I think it makes sense to fetch the manifest from the backend instead of in the frontend. GitHub identifies a security issue in the fact that we are fetching a user-supplied URL. I guess it makes sense but I am not clear on what the security implications really are. Perhaps the server could serve a maliciously crafted JSON payload which would exploit a vulnerability of our JSON parser, say (like there were similar attacks with exponential expansion of XML documents). But obviously we already do plenty of backend-side fetching of user-supplied URLs so any security issues with that would also apply there. |
I think that #3109 is a natural follow up, however, my intent for now is only to enable the manifest hosting use-case.
I share your view, an exploit here would already effect other parts of the codebase. Just proxying it is probably the safest thing we can do? |
Yes, we should not be forwarding any more information by making this request from the backend (no added GET/POST parameters, no cookies because you're using a fresh OkHTTP instance), so I think that's fine. |
Maybe a user-agent string? |
I agree with @wetneb that some type of restriction/registration of the target domain or payload would be useful. The fact that the current security posture is poor doesn't mean that we shouldn't try to minimize, or at least not increase, the attack surface. Letting the user pick from a set of configured URLs would like resolve the security warning, but if that's not possible, validating the content (a la #3109) would be next best. What issue # is this associated with? The PR template should have included a placeholder for the issue #. |
In what way would this increase the attack surface? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After having a fresh look at this, I think this looks safe enough as it stands. @tfmorris do you have more thoughts about in which way this would be unsafe?
response.getWriter().write(res.body().string()); | ||
response.setContentType("application/json"); | ||
response.setStatus(200); | ||
response.setCharacterEncoding("UTF-8"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be worth cleaning up this part to just go through our generic utilities to return JSON from a command, which would probably involve parsing the JSON before (which should be fine?)
I think writing the response body before setting the encoding can cause encoding issues (if I remember correctly).
I've lost the thread here. I'm happy to let you two agree on the best solution. |
@Abbe98 I'm assuming you won't have time to work on this, so I'd just make a few cleaning tweaks and merge it unless you have concerns? |
I am actually tempted to make this more generic, rather than more specific, turning it into a endpoint to fetch JSON from arbitrary third-party sources and return it to the frontend. That would be a way to eliminate all uses of JSONP in the frontend, which would be a clear security win. (I think @Abbe98 has suggested something similar in the past.) We could still have some safeguards to limit the size of the data downloaded (to make sure we don't choke up the backend) and ensure that the JSON parsing is suitable for adversarial payloads (which should be the default with Jackson, but it's always worth checking). |
This makes the backend proxy requests for Wikibase manifests to allow them to be hosted in MediaWiki itself(removing the CORS requirement).
MediaWiki hosted manifest example(Wikidata): https://www.wikidata.org/wiki/User:Abbe98/openrefine-manifest.json?action=raw
Setting this to draft for now but I would like to hear what others think about this approach before working out the details.