
Data quality improvements

Thomas Leese edited this page May 18, 2022 · 1 revision

The quality of the data associated with our moves depends on what our suppliers send us. At the moment, it's difficult for suppliers to correct data once it's been sent to us, and we've recorded our options for improving this. In the meantime, we occasionally have to import data from the suppliers.

Importing data

There are a number of Rake tasks in the API for importing CSV files from suppliers. These tasks correct data in our database where the suppliers can't correct it themselves.

Running the import

  1. Check that the CSV file provided by the supplier matches the expected format. It can be helpful to run dos2unix on the file to ensure it uses Unix line endings.
  2. Copy the file onto a pod using kubectl cp.
  3. Import the data using bundle exec rake import:...:...[filename.csv]
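
The three steps above can be sketched as a small shell helper. This is only an illustration, not a script that exists in the API: the function names, the pod name passed in, and the /tmp destination on the pod are all assumptions, and kubectl is assumed to already point at the right cluster and namespace. Setting DRY_RUN=1 prints the commands without running them.

```shell
#!/bin/sh
# Sketch only: run() and import_supplier_csv() are hypothetical helpers,
# and /tmp on the pod is an assumed destination directory.

# Print each command, then run it unless DRY_RUN=1.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-0}" != "1" ]; then
    "$@"
  fi
}

import_supplier_csv() {
  pod="$1"    # name of a pod running the API (assumed to be known)
  task="$2"   # full task name, e.g. import:journeys_missing_vehicle:serco
  file="$3"   # local path to the CSV provided by the supplier
  remote="/tmp/$(basename "$file")"

  # Step 1: convert the file to Unix line endings in place.
  run dos2unix "$file"
  # Step 2: copy the file onto the pod.
  run kubectl cp "$file" "${pod}:${remote}"
  # Step 3: run the Rake task on the pod. The task name and its argument
  # are passed as a single word so no shell expands the square brackets.
  run kubectl exec "$pod" -- bundle exec rake "${task}[${remote}]"
}
```

For example, `DRY_RUN=1 import_supplier_csv <pod> import:journeys_missing_vehicle:serco vehicles.csv` prints the three commands so they can be reviewed before running them for real. If the Rake command is typed by hand in an interactive shell such as zsh, quoting the task and its argument avoids the brackets being treated as a glob pattern.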

List of tasks

📝  Code

To see what format each import task expects, look at the file linked above.

import:cancel_or_reject_journeys:serco
Cancels or rejects any journeys which have been left in progress.
import:delete_events:serco
Removes events from the database. Note that this is not reversible.
import:events_incorrect_occurred_at:serco
Fixes the occurred_at date on events where it is incorrect.
import:journeys_missing_vehicle:serco
Sets vehicle information on any journeys that are missing it.
import:journeys_without_ending_state:serco
Sets the state of journeys which have an ending event but are in the wrong state.
import:missing_journey_ending_events:serco
Adds a final ending event to journeys which are missing one.
import:missing_journey_start_events:serco
Adds a start event to journeys which are missing one and updates the state.
import:missing_move_ending_events:serco
Adds a final ending event to moves which are missing one.
import:missing_move_start_events:serco
Adds a start event to moves which are missing one and updates the state.
import:moves_without_ending_state:serco
Sets the state of moves which have an ending event but are in the wrong state.
import:moves_without_to_location:serco
Sets the destination location on moves which are missing one.