Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Paginated requests for the file list #13915

Closed
PVince81 opened this issue Feb 5, 2015 · 32 comments
Closed

Paginated requests for the file list #13915

PVince81 opened this issue Feb 5, 2015 · 32 comments

Comments

@PVince81
Copy link
Contributor

PVince81 commented Feb 5, 2015

At some point we'll move the files web UI to WebDAV #12353

We might want to introduce real paginated requests but will have to somehow extend the WebDAV protocol in some way.

Please note that the current list.php doesn't support pagination. It still loads the whole list but the JS code will cut it into pages.

One problem with pagination is always about updates. If between the call to the first page, a file has been inserted in the first page, then the call for the second page will be shifted and missing that file.

@DeepDiver1975 @icewind1991

@DeepDiver1975
Copy link
Member

One problem with pagination is always about updates. If between the call to the first page, a file has been inserted in the first page, then the call for the second page will be shifted and missing that file.

Once we have a solution for #983 - this should no longer be an issue because we know the content did change and we can invalidate the pagination.

@PVince81
Copy link
Contributor Author

PVince81 commented Feb 5, 2015

Hmmm true, nice idea 😄

@oparoz oparoz added this to the 8.2-next milestone May 21, 2015
@PVince81
Copy link
Contributor Author

research topic => 9.0

@PVince81 PVince81 modified the milestones: 9.0-next, 8.2-current Sep 21, 2015
@PVince81
Copy link
Contributor Author

So far there has been no need for paginated requests, the current solution seems to work fine.

Close ? (can be reopened if the need arises)

@PVince81
Copy link
Contributor Author

PVince81 commented Jun 7, 2016

Reopening to reconsider for 9.2.

One idea is to use the REPORT method using pagination like for the comments, and using the PROPFIND-like response from the "tags filter" section.

On the JS side, more work will be required because some parts of the code require the whole list to be known upfront, for example for detecting duplicate file names.

CC @butonic

@PVince81 PVince81 modified the milestones: 9.2-next, 9.0 Jun 7, 2016
@PVince81 PVince81 reopened this Jun 7, 2016
@butonic
Copy link
Member

butonic commented Jun 7, 2016

For reference: the comments implementation for REPORT LIMIT and OFFSET filters is in https://github.com/owncloud/core/blob/master/apps/dav/lib/Comments/CommentsPlugin.php#L49

@butonic
Copy link
Member

butonic commented Jun 7, 2016

Simply using multiple ajax requests to load the whole list may not be the solution because the sheer number of files might be too much to handle in a browser:

We are experiencing issues with infinity scroll with a lot of files (> +/- 5000). The browser keeps loading & after a while, a timeout encours and the files are not loaded. We looked into this and noticed that ownCloud is loading all files with a single ajax request instead of requesting the nessesary files per scroll event.

If we up the timeout, we still have the issue that the browser cannot hold all those files in it's memory and hangs & that the initial loading time if very high.

@PVince81
Copy link
Contributor Author

PVince81 commented Jun 7, 2016

Also note that with pagination and multiple ajax per page, there is a risk of missing files in case new files have appeared in the list. Now thinking of it, we might be able to use the Webdav precondition headers to specify the last known folder etag. In case the list changed between two paginated calls, we need to find a way to recover from it.

@PVince81
Copy link
Contributor Author

@DeepDiver1975 pointed out that if we had a different data structure where everything the user views is on a single table, filtering and pagination would be more accurate and faster.

@PVince81
Copy link
Contributor Author

... because currently the result of whatever the user views is pulled from different tables, especially when dealing with mount points. So it's not a simple flat list.

@PVince81
Copy link
Contributor Author

PVince81 commented Oct 7, 2016

would need adding pagination stuff to all layers: #6103

I also thought about #4936 a bit. If we keep etag history (webdav sync?!) it means it should still be possible to query an older version of the file list and paginate on that. Well, but that only works for the full file list, not for search.
For search we'd still need to cache the results somewhere.

@felixheidecke
Copy link
Contributor

I'll try do give pagination a shot using JS. Since I'm fairly new to OC, I don't know how to talk to the API just yet, but virgin eyes are always good :-)

Is there a way to talk JSON with/to the API?

@PVince81
Copy link
Contributor Author

Have a look at the PROPFIND calls in the network console after browsing into a folder in the web UI.

It seems that Webdav should be extensible enough to allow us to make it read JSON and output JSON. But I'm not sure whether SabreDAV is extensible enough. We'll likely need adding layers there.

@DeepDiver1975
Copy link
Member

Main problem still is that we currently need to generate the full file list on the server and that list can be paginated.

Short term idea: cache the generated file list for a short time frame and reuse this on subsequent calls.

@butonic
Copy link
Member

butonic commented Mar 15, 2017

The problem is less showing the list, rather than fetching it from the server. XML serialization of 10000 entries alone takes a a few seconds. Also pagination can be passed down to the db layer, allowing filecache queries to be paginated as well.

@PVince81
Copy link
Contributor Author

Also pagination can be passed down to the db layer, allowing filecache queries to be paginated as well.

some day... not now... because as we already know the filecache doesn't contain all entries we want to return, these need to be combined with mount points... pagination becomes quite complex then.

@guruz
Copy link
Contributor

guruz commented Mar 16, 2017

XML serialization of 10000 entries alone takes a a few seconds.

We can improve this, because the current implementation is suboptimal (all in memory instead of streaming) see #14531 (comment) and owncloud/client#3111 (comment)

@butonic
Copy link
Member

butonic commented Mar 16, 2017

but then the client has to support streaming as well ... otherwise we still have to wait until the whole response has arrived. that is not how jquery ajax requests work. And if we were to implement comet-style long polling we would quickly leave the WebDAV spec.

@guruz
Copy link
Contributor

guruz commented Mar 16, 2017

In any case with streaming you should still get a speed improvement as you can save on all the in-memory huge tree overhead and just stream out the XML. I think nginx in its default configuration is even tempted to spool the reply to a file before sending to the client socket :[

For real streamed reading:
For the desktop client: owncloud/client#3111
For web: There is a way to iteratively get the data (responseText or so) in the onprogress or in some onreadystatechanged with current browsers. You don't need long polling or whatever for this.
Maybe https://fetch.spec.whatwg.org/#fetch-api supports it even nicer.

@guruz
Copy link
Contributor

guruz commented Apr 4, 2017

Linking @PVince81 's other newer thoughts: #26840

@PVince81
Copy link
Contributor Author

PVince81 commented Apr 7, 2017

Currently the only thing I think we can offer:

Server side

  • extend FilesReportPlugin REPORT method to support returning a PROPFIND without searching (with Depth), with "offset" and "limit".
  • fetch all files from the View API (which goes to DB, mount providers, etc), then cut out the results that we need based on "offset" and "limit"
  • format PROPFIND response as a response to the REPORT method
  • add summary attributes in the REPORT response, required to be able to render summaries in UI when not all results were returned:
    • total result count
    • total size (the UI can't calculate it any more since we only have a subset of the results)

Web UI side

  • replace PROPFIND calls with REPORT for the files JS API if an offset/limit was provided
  • make FileList._nextPage actually fetch the next page from server and render it
  • adjust summary to use the new attributes
  • rework the whole "select all" + action logic because we don't have all results any more
    • "delete all" action needs its own custom API, can't iterate over all results with Webdav DELETE any more
    • "move all" action needs new custom API
    • ... ?
  • detecting if a file/folder name already exists at creation time in the list needs a new approach (since we don't have all names any more)
  • PUT upload with file conflict: can't rely on client-side upload any more to save a request, but fortunately the server-side conflict check already works
  • ... ? (pretty sure there's a lot of other stuff that hangs to the assumption that we have the whole list)

I think I'm more happy to provide the server part but I'm a bit scared how this will scar the legacy FileList code even more to cram in this new pagination, considering that a lot of code rely on the assumption that the whole list is available.

@guruz
Copy link
Contributor

guruz commented Apr 10, 2017

Now thinking of it, we might be able to use the Webdav precondition headers to specify the last known folder etag. In case the list changed between two paginated calls, we need to find a way to recover from it.

👍 We might re-use this for the sync client.

But in general I'm still disagreeing with doing multiple requests per page. Everyone is trying to go away from multiple requests for latency/mobile reasoins (e.g. CSS sprites, HTTP2, bundling, (...)) and you want to introduce it :-(

Just saying..

@PVince81
Copy link
Contributor Author

Multiple request is only if you actually scroll down.

I wouldn't mind giving up all the pagination stuff if "fetching 1000 entries in one go" is the way to go. Because pagination is complicated especially when the crappy FS layer cannot support it.

@PVince81
Copy link
Contributor Author

moving to triage for rescheduling

@CorneeldH
Copy link

What is the status of this issue? Since searching in mobile clients depends on it, it would be really helpful for usability.

@butonic
Copy link
Member

butonic commented Mar 14, 2019

new effort in owncloud/web#116

@DeepDiver1975 DeepDiver1975 removed their assignment Sep 16, 2020
@micbar
Copy link
Contributor

micbar commented Sep 16, 2021

this is about API pagination, not about the frontend.

This case is also valid for ocis because webDAV will also be the file access protocol.
The webDAV spec doesn't support filtering and pagination on PROPFIND request.

@github-actions
Copy link

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 10 days if no further activity occurs. Thank you for your contributions.

@github-actions
Copy link

This issue has been automatically closed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

9 participants