Tracking intermittent failures over time project

Josh Matthews edited this page Feb 16, 2017 · 3 revisions

Track intermittent test failures over time

Background information: Servo has a large number of automated tests that fail intermittently. Our continuous integration ignores these failures, but we do not track how often they occur. The goal of this work is to record each occurrence and expose the data in a way that lets Servo developers identify and fix the most frequent failures.

Tracking issue: https://github.com/servo/servo/issues/15602 (please ask questions here)

Initial steps:

  • email the mozilla.dev.servo mailing list (be sure to subscribe to it first!) introducing your group and asking any necessary questions
  • build a Flask service for storing intermittent failure occurrences
    • use a JSON file to store information
    • recording a new occurrence requires the test file, platform, test machine (builder), and related GitHub pull request number
    • allow querying the stored results given a particular test file name
    • our existing known intermittent issue tracker is a straightforward example of a simple Flask service
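
As a rough illustration of the storage part of the steps above, here is a minimal sketch of a JSON-file-backed store using only the Python standard library. The class name `FailureStore`, the field names, and the `timestamp` field are assumptions made for this example, not part of the project description; the timestamp is included only because later steps call for date-range queries. The Flask routes themselves are left out: they would be thin wrappers that parse a request and call `record` or `query`.

```python
import json
import os
import tempfile
from datetime import datetime, timezone


class FailureStore:
    """Append-only list of intermittent failure records kept in one JSON file.

    Hypothetical helper for illustration; names and fields are assumptions.
    """

    def __init__(self, path):
        self.path = path

    def _load(self):
        # A missing file simply means nothing has been recorded yet.
        if not os.path.exists(self.path):
            return []
        with open(self.path, "r", encoding="utf-8") as f:
            return json.load(f)

    def _save(self, records):
        # Write to a temporary file in the same directory, then swap it in,
        # so an interrupted write cannot leave a half-written JSON file.
        directory = os.path.dirname(os.path.abspath(self.path))
        fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(records, f, indent=2)
        os.replace(tmp_path, self.path)

    def record(self, test_file, platform, builder, pull_request, timestamp=None):
        """Store one occurrence; all four descriptive fields are required."""
        if timestamp is None:
            # Assumption: occurrences are stamped in UTC, ISO 8601 format.
            timestamp = datetime.now(timezone.utc).isoformat()
        entry = {
            "test_file": test_file,
            "platform": platform,
            "builder": builder,
            "pull_request": pull_request,
            "timestamp": timestamp,
        }
        records = self._load()
        records.append(entry)
        self._save(records)
        return entry

    def query(self, test_file):
        """Return every stored occurrence for the given test file name."""
        return [r for r in self._load() if r["test_file"] == test_file]


if __name__ == "__main__":
    # Placeholder values only; none of these refer to real tests or builders.
    with tempfile.TemporaryDirectory() as tmp:
        store = FailureStore(os.path.join(tmp, "failures.json"))
        store.record("example/test.html", "linux", "example-builder", 1)
        print(store.query("example/test.html"))
```

Rewriting the whole file on every insert is fine for a small prototype, but it does not protect against two requests writing at once; a real deployment would need locking or a database.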

Subsequent steps:

  • add the ability to query the service by a date range, to find out which failures were most frequent
  • build an HTML front-end to the service that queries it using JS and reports the results in a useful manner (linking to GitHub, sorting, etc.)
  • make the filter-intermittents command record a separate failure for each intermittent failure encountered
  • propagate the required information for recording failures in saltfs
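
The date-range query in the first step above could be sketched as follows. This is only an illustration under stated assumptions: each stored record is taken to be a dictionary with a `test_file` name and an ISO 8601 `timestamp` string that includes a UTC offset, and the function name `most_frequent_failures` is hypothetical. The range is half-open, so `start` is included and `end` is excluded.

```python
from collections import Counter
from datetime import datetime, timezone


def most_frequent_failures(records, start, end, limit=10):
    """Count failures per test file within [start, end), most frequent first.

    `records` is an iterable of dicts with "test_file" and "timestamp" keys
    (an assumed record shape); `start` and `end` are timezone-aware datetimes.
    """
    counts = Counter(
        r["test_file"]
        for r in records
        if start <= datetime.fromisoformat(r["timestamp"]) < end
    )
    # most_common sorts by descending count.
    return counts.most_common(limit)


if __name__ == "__main__":
    # Placeholder data for illustration only.
    sample = [
        {"test_file": "a.html", "timestamp": "2017-02-01T10:00:00+00:00"},
        {"test_file": "b.html", "timestamp": "2017-02-02T10:00:00+00:00"},
        {"test_file": "a.html", "timestamp": "2017-02-03T10:00:00+00:00"},
    ]
    february = (
        datetime(2017, 2, 1, tzinfo=timezone.utc),
        datetime(2017, 3, 1, tzinfo=timezone.utc),
    )
    print(most_frequent_failures(sample, *february))
```

An HTML front-end could request this ranking from the service and render each entry as a row, leaving sorting and linking to the JS side.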