
Blueprint Virtualization

Adi Dahiya edited this page Feb 15, 2024 · 2 revisions

This document is an RFC proposing virtualized components in Blueprint in order to improve their performance.

Last updated: July 2022

Context

Most (possibly all) of Blueprint’s components are not virtualized, and this affects performance. This RFC aims to improve the performance of Blueprint’s components when they display large amounts of data (list, grid, timeline, tree, ...) by using a virtualization library called react-window.

A similar discussion took place in the past about react-virtualized, the bulkier predecessor of react-window. Given the improvements made in react-window (in particular, the size of the dependency decreased by an order of magnitude), I think it is time to revisit this.

What is virtualization ?

BLUF : rendering only what is visible

A virtualized component (such as a list) renders only the elements that are currently visible and removes them once they are no longer visible, replacing them with new ones. It calculates which items are visible inside the area where the list is displayed (the viewport). This process is called “windowing”, and it is different from the “virtual DOM” used by React (which creates DOM elements only when React renders elements). Windowing is part of React’s performance optimization guide.

Figures (from blogpost):

  • Non-virtualized list [image]
  • Virtualized list = windowing to (strictly) what the user can see [image]
  • Virtualized list with buffers = windowing to what the user can see, plus a few extra items before and after to handle scrolling without lag [image]

A few concepts :

  • Windowing : limiting the objects that are part of the virtual DOM that React renders into real DOM nodes. DOM nodes that exit the “window” are recycled, or immediately replaced with newer elements as the user scrolls down the list. This keeps the number of rendered elements bounded by the size of the window. (from here)
  • Buffer : if we render exactly what the user can see, there may be a delay before new objects are rendered while scrolling. Hence “buffers” pre-render a few items just before and after the currently displayed ones (still far fewer objects than the full list).
  • Virtual DOM : faster to iterate on (because nothing needs to be drawn on screen). The difference with the previous virtual DOM is computed so that only “what is needed” changes in the real DOM.
  • Real DOM : slower, because it needs to be drawn.
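
The windowing and buffer concepts above can be illustrated with a small calculation. This is only a sketch under simplifying assumptions (fixed row height, vertical scrolling); the function name getRenderedRange and its parameters are hypothetical and are not react-window’s API.

```typescript
// Hypothetical helper for illustration only (not part of react-window).
interface RenderedRange {
  startIndex: number; // first row to render (inclusive)
  stopIndex: number;  // last row to render (inclusive); -1 when nothing is rendered
}

function getRenderedRange(
  scrollTop: number,        // current scroll offset of the viewport, in px
  viewportHeight: number,   // visible height of the list, in px
  rowHeight: number,        // fixed height of every row, in px
  rowCount: number,         // total number of rows in the data
  overscanCount: number = 0 // buffer rows rendered before and after the window
): RenderedRange {
  if (rowCount <= 0 || rowHeight <= 0 || viewportHeight <= 0) {
    return { startIndex: 0, stopIndex: -1 };
  }
  // Rows that intersect the viewport (the "window").
  const firstVisible = Math.min(rowCount - 1, Math.floor(scrollTop / rowHeight));
  const lastVisible = Math.ceil((scrollTop + viewportHeight) / rowHeight) - 1;
  // Widen the window by the buffer, then clamp to the bounds of the data.
  return {
    startIndex: Math.max(0, firstVisible - overscanCount),
    stopIndex: Math.min(rowCount - 1, lastVisible + overscanCount),
  };
}
```

With 1000 rows of 30 px in a 240 px viewport, getRenderedRange(0, 240, 30, 1000) covers rows 0 to 7, so 8 rows are rendered instead of 1000. Adding a buffer of 2 rows after scrolling to 3000 px gives rows 98 to 109.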

Example: showing 1000 rows in a frame that displays only 8 rows at a time (from blogpost)

  • Non-virtualized : all rows are materialized as DOM elements [image]
  • Page display [image]
  • Virtualized : more nesting, but only 8 rows are materialized as DOM elements, changing as the user scrolls (like a “rolling window”) [image]

Pros

✅ Improved performance (FPS) (without vs with)

Cons

❌ The width and height of the list (or its container) and the height of each row (in the container) must be known. There are helper classes to compute them, though.

❌ It is a library, so it adds a dependency.
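
The size requirement exists because the component has to map a scroll position to row indices without rendering the rows first. The sketch below shows one common way to do this for rows of varying heights: cumulative offsets plus a binary search. The names buildOffsets and findRowAtOffset are hypothetical, and this is not a description of how react-window implements it.

```typescript
// Hypothetical helpers for illustration only.

// offsets[i] is the top edge of row i; the last entry is the total height.
function buildOffsets(rowHeights: number[]): number[] {
  const offsets: number[] = [0];
  for (const height of rowHeights) {
    offsets.push(offsets[offsets.length - 1] + height);
  }
  return offsets;
}

// Binary search for the row that contains the given scroll offset.
// Returns -1 for an empty list; offsets past the end map to the last row.
function findRowAtOffset(offsets: number[], scrollTop: number): number {
  if (offsets.length < 2) {
    return -1;
  }
  let low = 0;
  let high = offsets.length - 2; // index of the last row
  while (low < high) {
    const mid = Math.floor((low + high + 1) / 2);
    if (offsets[mid] <= scrollTop) {
      low = mid;      // row mid starts at or before scrollTop
    } else {
      high = mid - 1; // row mid starts after scrollTop
    }
  }
  return low;
}
```

For row heights [20, 40, 10, 30], the offsets are [0, 20, 60, 70, 100], and a scroll offset of 65 px falls in row 2. Without known heights, none of this can be computed up front, which is why the sizes have to be provided or measured.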

React virtualized or React window ?

BLUF: react window is “better” than react virtualized.

React virtualized (23k GH stars, 950k downloads per week, 2.27 MB) and react window (12k GH stars, 1M downloads per week, 800 KB) were both created by the same author (Brian Vaughn). React window is the second iteration. According to the author :

  • It’s smaller
  • It’s faster
  • It has fewer unnecessary features and APIs
  • The documentation is more user-friendly

One number :

Adding a react-virtualized list to a CRA project increases the (gzipped) build size by ~33.5 KB.

Adding a react-window list to a CRA project increases the (gzipped) build size by <2 KB.

Both are MIT licensed, greenlighted by IP Ninjas. See RTFM.

From Comparison Page on Github

Virtualization vs Paging (vs lazy-loading) ?

BLUF : Regardless of whether paging can be a solution, the frontend should not be an order of magnitude worse than what someone building a front-end from scratch can get.

Vocabulary :

  • lazy-loading : loading some data only when needed (e.g. a picture only when visible by the user)
  • Infinite scrolling : adding new DOM nodes as the user scrolls near the end of the already loaded list. This still populates the DOM with many nodes if the user scrolls a lot.
  • Chunking or Paging : getting data from the back end in blocks of n rows/objects instead of all rows/objects at once.
  • Paging (front-end) : rendering the data in pages instead of rendering all the information at once. Less stress on the DOM. The user needs to click a “next page” or a “page index” to load more data.
  • Virtualization (in this context): a way to display a lot of unpaginated data without significant performance impact.
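
To make the front-end paging entry concrete, here is a minimal sketch. The helper names getPage and getPageCount are hypothetical. Unlike windowing, the rendered slice changes only when the user explicitly picks a page, not as they scroll.

```typescript
// Hypothetical helpers for illustration only.

// Returns the items to render for a zero-based page index.
function getPage<T>(items: readonly T[], pageIndex: number, pageSize: number): T[] {
  const start = pageIndex * pageSize;
  return items.slice(start, start + pageSize);
}

// Number of pages needed to show all items.
function getPageCount(itemCount: number, pageSize: number): number {
  return Math.ceil(itemCount / pageSize);
}
```

For example, 95 items with a page size of 10 need 10 pages, and the last page (index 9) contains only 5 items.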

Infinite scrolling can be achieved by using the “infinite-loader” package for react-window (react-window-infinite-loader). Such an implementation benefits from the virtualized list: it does not just “add nodes” but renders only the visible nodes (plus some before and after, as a buffer). A callback has to be provided to “load more” items (see example there). The number of nodes that are “not visible” but “pre-loaded” is configurable (overscanCount).

Pagination is an alternative to virtualized components. When only a few rows are loaded from the backend, the volume of data to display is much lower, so virtualization is less important. However, if more data is rendered than is shown to the user, virtualization is still useful. In addition, data-intensive widgets (e.g. zoomable timelines, tree tables, ...) cannot be paged. On the other hand, paging makes it easier for users to reach “far away” data by jumping to a given page instead of scrolling all the way down to it.

There is also a question of scale : if 1k nodes in a tree widget (or grid, or table, ...) are a problem (an order of magnitude slower than other applications using virtualization), this needs to be fixed, regardless of whether paging is an alternative. Fixing it will also put less strain on the user’s machine.
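
The “load more” callback mechanism can be sketched independently of any library. In the sketch below, the parameter names isItemLoaded, loadMoreItems and threshold echo the props of react-window-infinite-loader as I understand them, but the two helper functions are hypothetical and only illustrate the idea: look at the visible range plus a margin, and request whatever is not loaded yet.

```typescript
// Hypothetical helpers for illustration only.
type IndexRange = [number, number]; // [startIndex, stopIndex], both inclusive

// Collects contiguous runs of unloaded items between startIndex and stopIndex.
function findUnloadedRanges(
  isItemLoaded: (index: number) => boolean,
  startIndex: number,
  stopIndex: number
): IndexRange[] {
  const ranges: IndexRange[] = [];
  let rangeStart: number | null = null;
  for (let i = startIndex; i <= stopIndex; i++) {
    if (!isItemLoaded(i)) {
      if (rangeStart === null) rangeStart = i; // a new unloaded run begins
    } else if (rangeStart !== null) {
      ranges.push([rangeStart, i - 1]);        // the run ended at the previous item
      rangeStart = null;
    }
  }
  if (rangeStart !== null) ranges.push([rangeStart, stopIndex]);
  return ranges;
}

// Calls loadMoreItems for every unloaded run near the visible range.
function requestMissingItems(
  visibleStart: number,
  visibleStop: number,
  itemCount: number,
  threshold: number, // how many rows ahead of the visible range to pre-load
  isItemLoaded: (index: number) => boolean,
  loadMoreItems: (startIndex: number, stopIndex: number) => void
): void {
  const start = Math.max(0, visibleStart - threshold);
  const stop = Math.min(itemCount - 1, visibleStop + threshold);
  for (const [from, to] of findUnloadedRanges(isItemLoaded, start, stop)) {
    loadMoreItems(from, to);
  }
}
```

If items 0 to 19 are loaded out of 100 and rows 10 to 17 are visible with a threshold of 5, the only request issued is for items 20 to 22.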

Why do we need virtualization ?

BLUF : blueprint components can be (sometimes, depending on which ones with which data) crazy slow

  • Tree with default tree widget in Slate with 9k nodes [Screen Recording 2024-02-15 at 8 43 22 AM]
  • Tree in CodeSandbox in Slate (with a third party library) with 13k items [Screen Recording 2024-02-15 at 8 44 00 AM]
  • Virtualized tree based on react-window with 110k items (sandbox) [Screen Recording 2024-02-15 at 8 44 32 AM]

Current state

Some components outside core Blueprint have been built using windowing/virtualization. (Internal links removed here, see Quip for details)

@blueprintjs/table has a custom implementation of virtualized row rendering. It was created years ago when these third-party virtualization libraries were less mature, and it has not been updated much since. It’s probably worth migrating the table component to use react-window to reduce bugs and maintenance burden.

Multiple places could benefit from virtualized components : better performance for the Slate tree, hierarchical widget for Workshop, hierarchical tree for Quiver, tree in Contour, ...

Potential Contributors

Note : This is a list of potential helpers/reviewers/relevant people involved in the initiative.

  • Adi Dahiya
  • Vincent Falconieri
  • Robert Fidler

Steps/Roadmap

This is still rough and will be updated as we go.

For blueprint in general :

  • Create a package (tutorial) named something like “blueprint-virtualized” and create the virtualized components there.

For tree component :

  • Linearize the underlying tree data in memory, then virtualize the rendering over the resulting flat list.
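
One possible shape for the linearization step is sketched below. The TreeNode interface is a hypothetical, simplified stand-in loosely modeled on Blueprint’s tree node data (I am assuming isExpanded and childNodes fields), and flattenVisibleNodes is an illustrative name, not an existing Blueprint function. The output is a flat array of rows that a virtualized list could render by index.

```typescript
// Hypothetical, simplified node shape (assumption, not Blueprint's exact type).
interface TreeNode {
  id: string;
  label: string;
  isExpanded?: boolean;
  childNodes?: TreeNode[];
}

// One entry per row that is currently reachable through expanded ancestors.
interface FlatRow {
  id: string;
  label: string;
  depth: number;        // used to indent the row when rendering
  hasChildren: boolean; // used to decide whether to show an expand caret
}

function flattenVisibleNodes(roots: TreeNode[]): FlatRow[] {
  const rows: FlatRow[] = [];
  // Explicit stack instead of recursion, so very deep trees do not overflow the call stack.
  const stack: Array<{ node: TreeNode; depth: number }> = [];
  for (let i = roots.length - 1; i >= 0; i--) {
    stack.push({ node: roots[i], depth: 0 });
  }
  while (stack.length > 0) {
    const { node, depth } = stack.pop()!;
    const children = node.childNodes ?? [];
    rows.push({ id: node.id, label: node.label, depth, hasChildren: children.length > 0 });
    if (node.isExpanded) {
      // Push in reverse so children are popped in their original order.
      for (let i = children.length - 1; i >= 0; i--) {
        stack.push({ node: children[i], depth: depth + 1 });
      }
    }
  }
  return rows;
}
```

Children of collapsed nodes are skipped entirely, so the flat list only needs to be recomputed when a node is expanded or collapsed, or when the data changes.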

Other solutions considered

  • react-vtree is a tree component built on react-window. It seems like a viable way to get a performant tree (we can style it to match Blueprint), but the library is somewhat under-maintained (see issue #83).

Issue Links

Libraries on Github

Blog posts
