Automatic data refetches

Pierre Segonne edited this page Feb 1, 2024 · 4 revisions

Electricity Maps strives to organise and expose real-time, granular electricity data to the world.

To do so, we constantly execute the collection of parsers built by our community to recover the most up-to-date electricity data available for the hundreds of data sources we rely on.

Why automatically refetch our data?

Unfortunately, the data published in real-time by our sources is not always completely accurate. For example:

  • The power production breakdown can contain a large portion of unknown production which is later reallocated to the respective production modes.
  • The reported production per mode can be incomplete and only consolidated later.

This explains why some data sources publish two different datasets: a real-time dataset and a consolidated dataset (see, for example, RTE in France).

Real-time data is also prone to outliers. Erroneous electricity data can be reported by the sources and only corrected after these data points have been flagged to the source.

Overall, this means that data originally published in real-time will diverge from the consolidated data as time passes.

This is why Electricity Maps automatically refetches data from its sources after a set amount of time.

How is the data refetched?

We have developed an internal set of tools that automatically refetches data once a day. Each run covers four 48h periods: the current day, one week in the past, one month in the past, and three months in the past.

[Figure: Data Quality - Chart for public documentation]

All parsers are executed with the set of target_datetime values needed to refetch 48h of data for each period, where available.
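As a rough illustration of the schedule described above, the sketch below computes a set of target_datetime values for each of the four refetch periods. It is not the internal tooling: the function name `refetch_target_datetimes`, the 30-day and 90-day approximations of "a month" and "three months", the choice to end each 48h period at the offset date, and the assumption that one parser call covers 24h ending at its target_datetime are all hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Length of each refetched period, as stated in the text.
WINDOW = timedelta(hours=48)

# Assumption: "a month" is taken as 30 days and "three months" as 90 days.
OFFSETS = {
    "current day": timedelta(days=0),
    "one week": timedelta(days=7),
    "one month": timedelta(days=30),
    "three months": timedelta(days=90),
}

# Assumption: a single parser call returns 24h of data ending at target_datetime.
COVERAGE_PER_CALL = timedelta(hours=24)


def refetch_target_datetimes(now: datetime) -> dict[str, list[datetime]]:
    """Return, for each refetch period, the target_datetime values that
    together cover a 48h window ending at `now` minus the period's offset."""
    calls_per_window = WINDOW // COVERAGE_PER_CALL  # 2 calls under the assumption above
    targets = {}
    for label, offset in OFFSETS.items():
        window_end = now - offset
        targets[label] = [
            window_end - i * COVERAGE_PER_CALL for i in range(calls_per_window)
        ]
    return targets


if __name__ == "__main__":
    run_time = datetime(2024, 2, 1, tzinfo=timezone.utc)
    for label, datetimes in refetch_target_datetimes(run_time).items():
        print(label, [dt.isoformat() for dt in datetimes])
```

Under these assumptions, a daily run would call every parser once per returned target_datetime. A parser that returns more or less than 24h per call would need a different `COVERAGE_PER_CALL`.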

We only refetch up to three months into the past because statistical institutions typically have three months to consolidate their data. Beyond three months, changes to the reported data are therefore unlikely.

How much does the data change over time?

At this point in time, we cannot provide an easy-to-digest overview of how much the data changes over time due to our automatic data refetches.
