Anthony Williams edited this page Jun 26, 2017 · 1 revision

Loading Datasets

When a request is made to ETEngine, the initial graph and dataset state must be loaded. This process begins in Gql::Gql with a call to Etsource::Loader#dataset. The loader caches its results: if the dataset has been loaded before, the cached value is returned and we proceed no further. Should the cache be empty, the request to load the dataset is passed through to Etsource::Dataset::Import.
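The cache-then-import behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the ETEngine code: SketchLoader, its import_count counter, and the lambda standing in for the importer are all hypothetical names introduced for the example.

```ruby
# Hypothetical stand-in for Etsource::Loader; not the ETEngine implementation.
class SketchLoader
  attr_reader :import_count

  # importer: any callable which takes an area code and returns a dataset.
  def initialize(importer)
    @importer = importer
    @cache = {}
    @import_count = 0
  end

  # Returns the cached dataset for the area, importing it only on a miss.
  def dataset(area_code)
    @cache.fetch(area_code) do
      @import_count += 1
      @cache[area_code] = @importer.call(area_code)
    end
  end
end

# The lambda is a placeholder for the real import step.
loader = SketchLoader.new(->(area) { { area: area, loaded: true } })
first  = loader.dataset(:nl) # cache miss: the importer runs
second = loader.dataset(:nl) # cache hit: the importer is skipped
```

In this sketch the second call returns the very same object as the first, and the importer is only invoked once per area code.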

Etsource::Dataset::Import

Etsource::Dataset::Import is responsible for taking data from the computed Refinery graph and the Atlas::Dataset, and transforming it into the structure used by ETEngine: a Qernel::Dataset.

Etsource::Dataset::Import.new(:nl).import
# => #<Qernel::Dataset>

Everything in this class is related to extracting data from Atlas and Refinery, and making it "fit" the expectations of ETEngine. The graph information (demands of nodes, shares of edges and slots) is loaded through Import#load_graph_dataset. This method calls AtlasLoader to compute the Refinery graph and serialize the results to a MessagePack file which is shared among Rails processes.

This class also handles loading the carrier data and time curves, along with the region data (simply Atlas::Dataset#to_hash).

Etsource::AtlasLoader

AtlasLoader is responsible for loading a Refinery graph (through Atlas) and then triggering calculation of the graph. When complete, Atlas::FullExporter converts the graph data to a Hash which can then be serialized into a MessagePack file.

Note that AtlasLoader does not care whether it is loading a "full" or "derived" dataset; Atlas takes care of the differences between the two, and AtlasLoader simply calls Atlas::Runner#calculate and #refinery_graph.

There are two versions of the AtlasLoader:

  • PreCalculated is used when the config.yml file defines etsource_lazy_load_dataset to be false. This is the case in staging and production environments. This loader will compute the results for every dataset defined by ETSource when the application loads, and save the results to MessagePack files.

  • Lazy is used when config.yml defines etsource_lazy_load_dataset to be true, as is typically the case in development. This loader will not pre-compute any data when the application starts, but will load the dataset information the first time a scenario is loaded which references a dataset.

    For example, when the application first starts, no MessagePack data will be present in the tmp directory. If you now choose to load an NL scenario, there will be a short delay while the dataset information is loaded and calculated. Shortly thereafter, MessagePack data for NL will be saved (and re-used for any subsequent request).
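The difference between the two loaders can be illustrated with a small sketch. Everything here is an assumption made for the example: the class names, the dataset_for method, the calculations counter and the ".bin" file names are hypothetical, and Ruby's built-in Marshal is swapped in for MessagePack so that the sketch runs without any gems.

```ruby
require 'tmpdir'

# Hypothetical sketch; class and method names are illustrative only.
class SketchLazyLoader
  attr_reader :calculations

  # The block stands in for the (expensive) graph calculation.
  def initialize(dir, &calculate)
    @dir = dir
    @calculate = calculate
    @calculations = 0
  end

  # Calculates and saves the data on first use; reads the file afterwards.
  def dataset_for(area_code)
    path = File.join(@dir, "#{area_code}.bin")

    unless File.exist?(path)
      @calculations += 1
      # Marshal stands in for MessagePack in this sketch.
      File.binwrite(path, Marshal.dump(@calculate.call(area_code)))
    end

    Marshal.load(File.binread(path))
  end
end

class SketchPreCalculatedLoader < SketchLazyLoader
  # Computes every known dataset up front, as would happen at boot.
  def initialize(dir, area_codes, &calculate)
    super(dir, &calculate)
    area_codes.each { |code| dataset_for(code) }
  end
end

# Temporary directories are not removed here, to keep the sketch short.
lazy_dir = Dir.mktmpdir
lazy = SketchLazyLoader.new(lazy_dir) { |code| { 'area' => code.to_s } }
files_before = Dir.children(lazy_dir).length # nothing computed at start
lazy.dataset_for(:nl)                        # first request: calculate, save
lazy.dataset_for(:nl)                        # later requests: read the file

eager_dir = Dir.mktmpdir
eager = SketchPreCalculatedLoader.new(eager_dir, %i[nl de]) do |code|
  { 'area' => code.to_s }
end
```

The lazy variant pays the calculation cost on the first request for each dataset, while the pre-calculated variant pays it for all datasets when it is constructed.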

Gql::Gql

Once the data has been loaded (or read from the cache), Gql::Gql:

  1. Clones the dataset (once for the present, once for the future),
  2. Applies initializer inputs (derived regions only),
  3. Scales the graphs (scaled scenarios only),
  4. Applies user inputs,
  5. Finally, calculates the present and future graphs.

At this point queries can be run, and results returned to the user.
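The order of the preparation steps listed above can be sketched as follows. This is only an illustration of the sequence: prepare_graphs, its keyword arguments, and the flat Hash used as a "graph" are hypothetical, the calculation is reduced to a simple sum, and the sketch assumes each step is applied to both graphs, which may not match what ETEngine does.

```ruby
# Hypothetical sketch: names and data shapes are illustrative, not ETEngine's API.
def prepare_graphs(dataset, user_inputs:, initializer_inputs: nil, scale: nil)
  # 1. Clone the dataset, once for the present and once for the future.
  graphs = {
    present: Marshal.load(Marshal.dump(dataset)),
    future: Marshal.load(Marshal.dump(dataset))
  }

  graphs.each_value do |graph|
    # 2. Initializer inputs (passed only for derived regions).
    graph.merge!(initializer_inputs) if initializer_inputs
    # 3. Scaling (passed only for scaled scenarios).
    graph.transform_values! { |value| value * scale } if scale
    # 4. User inputs.
    graph.merge!(user_inputs)
    # 5. "Calculation" is reduced to summing the values in this sketch.
    graph[:total] = graph.values.sum
  end

  graphs
end

source = { households: 100.0, industry: 50.0 }
graphs = prepare_graphs(source, user_inputs: { industry: 40.0 }, scale: 0.5)
```

Because each graph is a deep copy, changes made while preparing the present and future graphs do not leak back into the cached dataset.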