This repository has been archived by the owner on Dec 6, 2022. It is now read-only.

Consideration for the availability of files attached to old revisions #655

Open
rooby opened this issue Mar 2, 2018 · 3 comments

@rooby

rooby commented Mar 2, 2018

Drupal has always had a potential issue with files attached to old revisions of content.

The problem is that public files attached to old revisions of content remain available by direct link and can still be found through search engines.

This is potentially a big problem for government sites if the site is misconfigured or misused (either accidentally or maliciously) because you might have policy documents or other legal documents that are out of date but are still accessible on the website by anyone.

The technicalities of the problem are:

When revisions are enabled, each individual revision of a node (or other revision-enabled entity) counts as its own file usage. So if you replace a file on a file field with a new version of that file, the old version is retained because it is still used on the earlier revision, and it needs to be there in case the user wants to revert to that revision at some point.

Files in the public file system are always directly accessible if you know the URL. Generally people aren't required to guess the URL because Google or other search engines will find it for them. So if you have public files on an old revision they will stick around until that revision is deleted (and the file isn't used anywhere else on the site).
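To make the usage counting concrete, here is a minimal sketch in Python. It is a toy model of the behaviour described above, not Drupal's actual file usage API; the class name, method names and file URIs are all made up for illustration.

```python
from collections import defaultdict


class FileUsage:
    """Toy model of per-revision file usage (hypothetical, not Drupal's API)."""

    def __init__(self):
        # Maps a file URI to the set of (entity_id, revision_id) pairs using it.
        self._usage = defaultdict(set)

    def add(self, uri, entity_id, revision_id):
        """Record that one revision of an entity references the file."""
        self._usage[uri].add((entity_id, revision_id))

    def delete_revision(self, entity_id, revision_id):
        """Remove one revision's usages and return the files nothing uses any more."""
        orphaned = []
        for uri in list(self._usage):
            self._usage[uri].discard((entity_id, revision_id))
            if not self._usage[uri]:
                del self._usage[uri]
                orphaned.append(uri)
        return orphaned

    def is_retained(self, uri):
        """A file stays on disk while at least one revision still uses it."""
        return uri in self._usage


# Revision 1 of node 10 used the old document, revision 2 uses the new one.
usage = FileUsage()
usage.add("public://policy-v1.pdf", entity_id=10, revision_id=1)
usage.add("public://policy-v2.pdf", entity_id=10, revision_id=2)
```

In this model the old document is still retained after the file field is updated, and only becomes removable once revision 1 is deleted and no other usage remains.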

With the private file system, Drupal checks that the user has access to view the file. For files attached to nodes, this means that node access rules are checked, and if the user doesn't have access to view unpublished revisions then they can't access the file. Since anonymous users will almost never have access to unpublished revisions, this solves the problem (although it does add a performance overhead when loading the file for uncached requests).

So any files where it might be an issue if a user can access old versions of that file should use the private file system.
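The difference between the two file systems can be sketched roughly as follows. This is a simplified Python model written for illustration only: the `FileRecord` type, its fields and the `can_download` function are assumptions, and real Drupal access checking involves considerably more than this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileRecord:
    """Hypothetical record of a file and the revision state it is attached to."""
    uri: str
    on_published_revision: bool


def can_download(file, can_view_unpublished):
    """Return True if a user may fetch the file by its direct URL."""
    scheme = file.uri.split("://", 1)[0]
    if scheme == "public":
        # Public files are served directly, with no access check at all.
        return True
    if scheme == "private":
        # Private files go through an access check on the owning content.
        return file.on_published_revision or can_view_unpublished
    # Unknown schemes are denied in this sketch.
    return False


old_public = FileRecord("public://policy-v1.pdf", on_published_revision=False)
old_private = FileRecord("private://policy-v1.pdf", on_published_revision=False)
```

Here an anonymous user (`can_view_unpublished=False`) can still download `old_public` but is refused `old_private`, which is the behaviour the private file system is relied on for.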

So if the site builder is unaware of this behaviour (or just forgets about it when building a particular site) there could end up being a problem.

The next issue is that it is not necessarily easy to get an ideal configuration with the current version of GovCMS SaaS.

Unless all the content editors of a site are very competent and knowledgeable about how the system works, it would be best to avoid any possible confusion that might arise from allowing them to select public/private each time they upload a file. Plus, even if they are competent and do know about it, it's not hard to make a mistake and select the wrong one. It's also more user friendly if the system can work it out for them.

It's easy to have a document field that specifically uses private files but it's not so easy to handle files added via the WYSIWYG. In that case you either make all files going into the WYSIWYG private, which is not ideal because of performance implications, or you allow the uploader to choose per file, which is not ideal because it is error prone and not user friendly.

Another possible solution is to be able to set the scheme per file type, so for example all documents are private but all images are not. I don't think it's currently possible in GovCMS to do that though. It also leaves open the possibility of unwanted images being visible from old revisions.
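As a rough idea of what choosing the scheme per file type could look like, here is a small Python sketch. It is purely illustrative and does not correspond to any existing GovCMS or Drupal setting; the mapping, the default and the function name are assumptions.

```python
import mimetypes

# Hypothetical policy: images stay public, everything else goes private.
SCHEME_BY_MAJOR_TYPE = {
    "image": "public",
    "application": "private",
    "text": "private",
}
DEFAULT_SCHEME = "private"  # Fall back to the safer option for unknown types.


def scheme_for(filename):
    """Pick a file scheme from the guessed MIME type of the upload."""
    mime, _encoding = mimetypes.guess_type(filename)
    major = mime.split("/", 1)[0] if mime else None
    return SCHEME_BY_MAJOR_TYPE.get(major, DEFAULT_SCHEME)
```

With this policy `scheme_for("policy.pdf")` gives `"private"` and `scheme_for("banner.png")` gives `"public"`, so the editor never has to choose. Defaulting to private for unrecognised types trades some performance for not exposing a file by accident.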

Probably the safest way to handle it is to have all content related files in the private file system, so long as the performance trade-off is considered acceptable, which it might be due to the caching layers we have in place.

Whatever is the preferred approach, this is an issue that site builders need to be aware of and I think it would be good for GovCMS to have a recommendation as to how it is to be set up and to set sensible defaults.

@tobybellwood
Contributor

Thanks @rooby - dealing with how best to employ private FS is something we're currently grappling with.

@samhighley

This is also an issue for us

@rooby
Author

rooby commented Mar 15, 2018

I ran into a related issue that means private files are not really useful on GovCMS in its current form.

GovCMS is geared towards sites that are primarily used by anonymous users; however, the paranoia module will not allow you to give the 'view private files' permission to the anonymous or authenticated role.

Related drupal.org issues:
https://www.drupal.org/project/file_entity/issues/2953137
https://www.drupal.org/project/paranoia/issues/2953239
