Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Calculate and report quartiles in performance results #60950

Merged
merged 6 commits into from May 15, 2024

Conversation

jsnajdr
Copy link
Member

@jsnajdr jsnajdr commented Apr 22, 2024

Inspired by #60509, I'm adding a calculation of 25% and 75% quartiles of the measurements. Each measurement is a set of 10+ samples, so let's extract more data than a single number, i.e., the average or the median.

Knowing the quartiles gives us some idea about the variance, i.e., the error of the measurements. Knowing that is important for determining if a change in a metric in a given PR is significant or not.

Also using medians consistently everywhere instead of averages. Medians are more stable in presence of outliers, i.e., measurements where the process fails in a big way, i.e., the error is larger than a gaussian noise.

Let's see how it works.

@jsnajdr jsnajdr added the [Type] Performance Related to performance efforts label Apr 22, 2024
@jsnajdr jsnajdr self-assigned this Apr 22, 2024
Copy link

github-actions bot commented Apr 22, 2024

Size Change: +54 B (0%)

Total Size: 1.75 MB

Filename Size Change
build/edit-post/index.min.js 14.6 kB +4 B (+0.03%)
build/edit-site/index.min.js 220 kB -6 B (0%)
build/editor/index.min.js 85.9 kB +56 B (+0.07%)
ℹ️ View Unchanged
Filename Size
build/a11y/index.min.js 955 B
build/annotations/index.min.js 2.27 kB
build/api-fetch/index.min.js 2.32 kB
build/autop/index.min.js 2.1 kB
build/blob/index.min.js 578 B
build/block-directory/index.min.js 7.26 kB
build/block-directory/style-rtl.css 1.03 kB
build/block-directory/style.css 1.03 kB
build/block-editor/content-rtl.css 4.57 kB
build/block-editor/content.css 4.57 kB
build/block-editor/default-editor-styles-rtl.css 395 B
build/block-editor/default-editor-styles.css 395 B
build/block-editor/index.min.js 259 kB
build/block-editor/style-rtl.css 15.5 kB
build/block-editor/style.css 15.5 kB
build/block-library/blocks/archives/editor-rtl.css 61 B
build/block-library/blocks/archives/editor.css 60 B
build/block-library/blocks/archives/style-rtl.css 90 B
build/block-library/blocks/archives/style.css 90 B
build/block-library/blocks/audio/editor-rtl.css 150 B
build/block-library/blocks/audio/editor.css 150 B
build/block-library/blocks/audio/style-rtl.css 122 B
build/block-library/blocks/audio/style.css 122 B
build/block-library/blocks/audio/theme-rtl.css 133 B
build/block-library/blocks/audio/theme.css 133 B
build/block-library/blocks/avatar/editor-rtl.css 116 B
build/block-library/blocks/avatar/editor.css 116 B
build/block-library/blocks/avatar/style-rtl.css 104 B
build/block-library/blocks/avatar/style.css 104 B
build/block-library/blocks/block/editor-rtl.css 277 B
build/block-library/blocks/block/editor.css 277 B
build/block-library/blocks/button/editor-rtl.css 415 B
build/block-library/blocks/button/editor.css 414 B
build/block-library/blocks/button/style-rtl.css 627 B
build/block-library/blocks/button/style.css 626 B
build/block-library/blocks/buttons/editor-rtl.css 337 B
build/block-library/blocks/buttons/editor.css 337 B
build/block-library/blocks/buttons/style-rtl.css 332 B
build/block-library/blocks/buttons/style.css 332 B
build/block-library/blocks/calendar/style-rtl.css 239 B
build/block-library/blocks/calendar/style.css 239 B
build/block-library/blocks/categories/editor-rtl.css 113 B
build/block-library/blocks/categories/editor.css 112 B
build/block-library/blocks/categories/style-rtl.css 124 B
build/block-library/blocks/categories/style.css 124 B
build/block-library/blocks/code/editor-rtl.css 53 B
build/block-library/blocks/code/editor.css 53 B
build/block-library/blocks/code/style-rtl.css 121 B
build/block-library/blocks/code/style.css 121 B
build/block-library/blocks/code/theme-rtl.css 124 B
build/block-library/blocks/code/theme.css 124 B
build/block-library/blocks/columns/editor-rtl.css 108 B
build/block-library/blocks/columns/editor.css 108 B
build/block-library/blocks/columns/style-rtl.css 421 B
build/block-library/blocks/columns/style.css 421 B
build/block-library/blocks/comment-author-avatar/editor-rtl.css 125 B
build/block-library/blocks/comment-author-avatar/editor.css 125 B
build/block-library/blocks/comment-content/style-rtl.css 92 B
build/block-library/blocks/comment-content/style.css 92 B
build/block-library/blocks/comment-template/style-rtl.css 199 B
build/block-library/blocks/comment-template/style.css 198 B
build/block-library/blocks/comments-pagination-numbers/editor-rtl.css 123 B
build/block-library/blocks/comments-pagination-numbers/editor.css 121 B
build/block-library/blocks/comments-pagination/editor-rtl.css 222 B
build/block-library/blocks/comments-pagination/editor.css 209 B
build/block-library/blocks/comments-pagination/style-rtl.css 235 B
build/block-library/blocks/comments-pagination/style.css 231 B
build/block-library/blocks/comments-title/editor-rtl.css 75 B
build/block-library/blocks/comments-title/editor.css 75 B
build/block-library/blocks/comments/editor-rtl.css 840 B
build/block-library/blocks/comments/editor.css 839 B
build/block-library/blocks/comments/style-rtl.css 637 B
build/block-library/blocks/comments/style.css 636 B
build/block-library/blocks/cover/editor-rtl.css 671 B
build/block-library/blocks/cover/editor.css 674 B
build/block-library/blocks/cover/style-rtl.css 1.7 kB
build/block-library/blocks/cover/style.css 1.69 kB
build/block-library/blocks/details/editor-rtl.css 65 B
build/block-library/blocks/details/editor.css 65 B
build/block-library/blocks/details/style-rtl.css 86 B
build/block-library/blocks/details/style.css 86 B
build/block-library/blocks/embed/editor-rtl.css 312 B
build/block-library/blocks/embed/editor.css 312 B
build/block-library/blocks/embed/style-rtl.css 410 B
build/block-library/blocks/embed/style.css 410 B
build/block-library/blocks/embed/theme-rtl.css 133 B
build/block-library/blocks/embed/theme.css 133 B
build/block-library/blocks/file/editor-rtl.css 326 B
build/block-library/blocks/file/editor.css 327 B
build/block-library/blocks/file/style-rtl.css 280 B
build/block-library/blocks/file/style.css 281 B
build/block-library/blocks/file/view.min.js 324 B
build/block-library/blocks/footnotes/style-rtl.css 201 B
build/block-library/blocks/footnotes/style.css 199 B
build/block-library/blocks/form-input/editor-rtl.css 227 B
build/block-library/blocks/form-input/editor.css 227 B
build/block-library/blocks/form-input/style-rtl.css 343 B
build/block-library/blocks/form-input/style.css 343 B
build/block-library/blocks/form-submission-notification/editor-rtl.css 340 B
build/block-library/blocks/form-submission-notification/editor.css 340 B
build/block-library/blocks/form-submit-button/style-rtl.css 69 B
build/block-library/blocks/form-submit-button/style.css 69 B
build/block-library/blocks/form/view.min.js 471 B
build/block-library/blocks/freeform/editor-rtl.css 2.61 kB
build/block-library/blocks/freeform/editor.css 2.61 kB
build/block-library/blocks/gallery/editor-rtl.css 956 B
build/block-library/blocks/gallery/editor.css 960 B
build/block-library/blocks/gallery/style-rtl.css 1.72 kB
build/block-library/blocks/gallery/style.css 1.72 kB
build/block-library/blocks/gallery/theme-rtl.css 108 B
build/block-library/blocks/gallery/theme.css 108 B
build/block-library/blocks/group/editor-rtl.css 403 B
build/block-library/blocks/group/editor.css 403 B
build/block-library/blocks/group/style-rtl.css 103 B
build/block-library/blocks/group/style.css 103 B
build/block-library/blocks/group/theme-rtl.css 78 B
build/block-library/blocks/group/theme.css 78 B
build/block-library/blocks/heading/style-rtl.css 189 B
build/block-library/blocks/heading/style.css 189 B
build/block-library/blocks/html/editor-rtl.css 336 B
build/block-library/blocks/html/editor.css 337 B
build/block-library/blocks/image/editor-rtl.css 891 B
build/block-library/blocks/image/editor.css 891 B
build/block-library/blocks/image/style-rtl.css 1.6 kB
build/block-library/blocks/image/style.css 1.59 kB
build/block-library/blocks/image/theme-rtl.css 133 B
build/block-library/blocks/image/theme.css 133 B
build/block-library/blocks/image/view.min.js 1.54 kB
build/block-library/blocks/latest-comments/style-rtl.css 357 B
build/block-library/blocks/latest-comments/style.css 357 B
build/block-library/blocks/latest-posts/editor-rtl.css 213 B
build/block-library/blocks/latest-posts/editor.css 212 B
build/block-library/blocks/latest-posts/style-rtl.css 478 B
build/block-library/blocks/latest-posts/style.css 478 B
build/block-library/blocks/list/style-rtl.css 88 B
build/block-library/blocks/list/style.css 88 B
build/block-library/blocks/media-text/editor-rtl.css 306 B
build/block-library/blocks/media-text/editor.css 305 B
build/block-library/blocks/media-text/style-rtl.css 505 B
build/block-library/blocks/media-text/style.css 503 B
build/block-library/blocks/more/editor-rtl.css 431 B
build/block-library/blocks/more/editor.css 431 B
build/block-library/blocks/navigation-link/editor-rtl.css 668 B
build/block-library/blocks/navigation-link/editor.css 669 B
build/block-library/blocks/navigation-link/style-rtl.css 193 B
build/block-library/blocks/navigation-link/style.css 192 B
build/block-library/blocks/navigation-submenu/editor-rtl.css 296 B
build/block-library/blocks/navigation-submenu/editor.css 295 B
build/block-library/blocks/navigation/editor-rtl.css 2.26 kB
build/block-library/blocks/navigation/editor.css 2.26 kB
build/block-library/blocks/navigation/style-rtl.css 2.26 kB
build/block-library/blocks/navigation/style.css 2.25 kB
build/block-library/blocks/navigation/view.min.js 1.03 kB
build/block-library/blocks/nextpage/editor-rtl.css 395 B
build/block-library/blocks/nextpage/editor.css 395 B
build/block-library/blocks/page-list/editor-rtl.css 377 B
build/block-library/blocks/page-list/editor.css 377 B
build/block-library/blocks/page-list/style-rtl.css 175 B
build/block-library/blocks/page-list/style.css 175 B
build/block-library/blocks/paragraph/editor-rtl.css 235 B
build/block-library/blocks/paragraph/editor.css 235 B
build/block-library/blocks/paragraph/style-rtl.css 335 B
build/block-library/blocks/paragraph/style.css 335 B
build/block-library/blocks/post-author/style-rtl.css 175 B
build/block-library/blocks/post-author/style.css 176 B
build/block-library/blocks/post-comments-form/editor-rtl.css 96 B
build/block-library/blocks/post-comments-form/editor.css 96 B
build/block-library/blocks/post-comments-form/style-rtl.css 508 B
build/block-library/blocks/post-comments-form/style.css 508 B
build/block-library/blocks/post-content/editor-rtl.css 74 B
build/block-library/blocks/post-content/editor.css 74 B
build/block-library/blocks/post-date/style-rtl.css 61 B
build/block-library/blocks/post-date/style.css 61 B
build/block-library/blocks/post-excerpt/editor-rtl.css 71 B
build/block-library/blocks/post-excerpt/editor.css 71 B
build/block-library/blocks/post-excerpt/style-rtl.css 141 B
build/block-library/blocks/post-excerpt/style.css 141 B
build/block-library/blocks/post-featured-image/editor-rtl.css 734 B
build/block-library/blocks/post-featured-image/editor.css 732 B
build/block-library/blocks/post-featured-image/style-rtl.css 342 B
build/block-library/blocks/post-featured-image/style.css 342 B
build/block-library/blocks/post-navigation-link/style-rtl.css 215 B
build/block-library/blocks/post-navigation-link/style.css 214 B
build/block-library/blocks/post-template/editor-rtl.css 99 B
build/block-library/blocks/post-template/editor.css 98 B
build/block-library/blocks/post-template/style-rtl.css 397 B
build/block-library/blocks/post-template/style.css 396 B
build/block-library/blocks/post-terms/style-rtl.css 96 B
build/block-library/blocks/post-terms/style.css 96 B
build/block-library/blocks/post-time-to-read/style-rtl.css 69 B
build/block-library/blocks/post-time-to-read/style.css 69 B
build/block-library/blocks/post-title/style-rtl.css 100 B
build/block-library/blocks/post-title/style.css 100 B
build/block-library/blocks/preformatted/style-rtl.css 125 B
build/block-library/blocks/preformatted/style.css 125 B
build/block-library/blocks/pullquote/editor-rtl.css 135 B
build/block-library/blocks/pullquote/editor.css 135 B
build/block-library/blocks/pullquote/style-rtl.css 354 B
build/block-library/blocks/pullquote/style.css 353 B
build/block-library/blocks/pullquote/theme-rtl.css 174 B
build/block-library/blocks/pullquote/theme.css 174 B
build/block-library/blocks/query-pagination-numbers/editor-rtl.css 122 B
build/block-library/blocks/query-pagination-numbers/editor.css 121 B
build/block-library/blocks/query-pagination/editor-rtl.css 221 B
build/block-library/blocks/query-pagination/editor.css 211 B
build/block-library/blocks/query-pagination/style-rtl.css 288 B
build/block-library/blocks/query-pagination/style.css 284 B
build/block-library/blocks/query-title/style-rtl.css 63 B
build/block-library/blocks/query-title/style.css 63 B
build/block-library/blocks/query/editor-rtl.css 486 B
build/block-library/blocks/query/editor.css 486 B
build/block-library/blocks/query/view.min.js 958 B
build/block-library/blocks/quote/style-rtl.css 237 B
build/block-library/blocks/quote/style.css 237 B
build/block-library/blocks/quote/theme-rtl.css 233 B
build/block-library/blocks/quote/theme.css 235 B
build/block-library/blocks/read-more/style-rtl.css 140 B
build/block-library/blocks/read-more/style.css 140 B
build/block-library/blocks/rss/editor-rtl.css 101 B
build/block-library/blocks/rss/editor.css 101 B
build/block-library/blocks/rss/style-rtl.css 289 B
build/block-library/blocks/rss/style.css 288 B
build/block-library/blocks/search/editor-rtl.css 184 B
build/block-library/blocks/search/editor.css 184 B
build/block-library/blocks/search/style-rtl.css 690 B
build/block-library/blocks/search/style.css 689 B
build/block-library/blocks/search/theme-rtl.css 114 B
build/block-library/blocks/search/theme.css 114 B
build/block-library/blocks/search/view.min.js 478 B
build/block-library/blocks/separator/editor-rtl.css 146 B
build/block-library/blocks/separator/editor.css 146 B
build/block-library/blocks/separator/style-rtl.css 239 B
build/block-library/blocks/separator/style.css 239 B
build/block-library/blocks/separator/theme-rtl.css 194 B
build/block-library/blocks/separator/theme.css 194 B
build/block-library/blocks/shortcode/editor-rtl.css 286 B
build/block-library/blocks/shortcode/editor.css 286 B
build/block-library/blocks/site-logo/editor-rtl.css 805 B
build/block-library/blocks/site-logo/editor.css 805 B
build/block-library/blocks/site-logo/style-rtl.css 204 B
build/block-library/blocks/site-logo/style.css 204 B
build/block-library/blocks/site-tagline/editor-rtl.css 86 B
build/block-library/blocks/site-tagline/editor.css 86 B
build/block-library/blocks/site-title/editor-rtl.css 116 B
build/block-library/blocks/site-title/editor.css 116 B
build/block-library/blocks/site-title/style-rtl.css 57 B
build/block-library/blocks/site-title/style.css 57 B
build/block-library/blocks/social-link/editor-rtl.css 324 B
build/block-library/blocks/social-link/editor.css 324 B
build/block-library/blocks/social-links/editor-rtl.css 676 B
build/block-library/blocks/social-links/editor.css 675 B
build/block-library/blocks/social-links/style-rtl.css 1.48 kB
build/block-library/blocks/social-links/style.css 1.48 kB
build/block-library/blocks/spacer/editor-rtl.css 350 B
build/block-library/blocks/spacer/editor.css 350 B
build/block-library/blocks/spacer/style-rtl.css 48 B
build/block-library/blocks/spacer/style.css 48 B
build/block-library/blocks/table/editor-rtl.css 395 B
build/block-library/blocks/table/editor.css 395 B
build/block-library/blocks/table/style-rtl.css 639 B
build/block-library/blocks/table/style.css 639 B
build/block-library/blocks/table/theme-rtl.css 152 B
build/block-library/blocks/table/theme.css 152 B
build/block-library/blocks/tag-cloud/style-rtl.css 251 B
build/block-library/blocks/tag-cloud/style.css 253 B
build/block-library/blocks/template-part/editor-rtl.css 393 B
build/block-library/blocks/template-part/editor.css 393 B
build/block-library/blocks/template-part/theme-rtl.css 107 B
build/block-library/blocks/template-part/theme.css 107 B
build/block-library/blocks/term-description/style-rtl.css 111 B
build/block-library/blocks/term-description/style.css 111 B
build/block-library/blocks/text-columns/editor-rtl.css 95 B
build/block-library/blocks/text-columns/editor.css 95 B
build/block-library/blocks/text-columns/style-rtl.css 166 B
build/block-library/blocks/text-columns/style.css 166 B
build/block-library/blocks/verse/style-rtl.css 99 B
build/block-library/blocks/verse/style.css 99 B
build/block-library/blocks/video/editor-rtl.css 552 B
build/block-library/blocks/video/editor.css 555 B
build/block-library/blocks/video/style-rtl.css 185 B
build/block-library/blocks/video/style.css 185 B
build/block-library/blocks/video/theme-rtl.css 133 B
build/block-library/blocks/video/theme.css 133 B
build/block-library/classic-rtl.css 179 B
build/block-library/classic.css 179 B
build/block-library/common-rtl.css 1.11 kB
build/block-library/common.css 1.11 kB
build/block-library/editor-elements-rtl.css 75 B
build/block-library/editor-elements.css 75 B
build/block-library/editor-rtl.css 12.2 kB
build/block-library/editor.css 12.2 kB
build/block-library/elements-rtl.css 54 B
build/block-library/elements.css 54 B
build/block-library/index.min.js 217 kB
build/block-library/reset-rtl.css 472 B
build/block-library/reset.css 472 B
build/block-library/style-rtl.css 14.8 kB
build/block-library/style.css 14.8 kB
build/block-library/theme-rtl.css 707 B
build/block-library/theme.css 713 B
build/block-serialization-default-parser/index.min.js 1.12 kB
build/block-serialization-spec-parser/index.min.js 2.87 kB
build/blocks/index.min.js 51.7 kB
build/commands/index.min.js 15.1 kB
build/commands/style-rtl.css 953 B
build/commands/style.css 951 B
build/components/index.min.js 222 kB
build/components/style-rtl.css 12 kB
build/components/style.css 12 kB
build/compose/index.min.js 12.8 kB
build/core-commands/index.min.js 2.81 kB
build/core-data/index.min.js 72.5 kB
build/customize-widgets/index.min.js 10.9 kB
build/customize-widgets/style-rtl.css 1.36 kB
build/customize-widgets/style.css 1.36 kB
build/data-controls/index.min.js 640 B
build/data/index.min.js 9 kB
build/date/index.min.js 17.9 kB
build/deprecated/index.min.js 451 B
build/dom-ready/index.min.js 324 B
build/dom/index.min.js 4.65 kB
build/edit-post/classic-rtl.css 578 B
build/edit-post/classic.css 578 B
build/edit-post/style-rtl.css 2.68 kB
build/edit-post/style.css 2.68 kB
build/edit-site/style-rtl.css 12.9 kB
build/edit-site/style.css 12.9 kB
build/edit-widgets/index.min.js 17.5 kB
build/edit-widgets/style-rtl.css 4.18 kB
build/edit-widgets/style.css 4.18 kB
build/editor/style-rtl.css 8.39 kB
build/editor/style.css 8.4 kB
build/element/index.min.js 4.83 kB
build/escape-html/index.min.js 537 B
build/format-library/index.min.js 8.07 kB
build/format-library/style-rtl.css 493 B
build/format-library/style.css 492 B
build/hooks/index.min.js 1.55 kB
build/html-entities/index.min.js 448 B
build/i18n/index.min.js 3.58 kB
build/interactivity/debug.min.js 16.3 kB
build/interactivity/file.min.js 447 B
build/interactivity/image.min.js 1.67 kB
build/interactivity/index.min.js 13 kB
build/interactivity/navigation.min.js 1.17 kB
build/interactivity/query.min.js 740 B
build/interactivity/router.min.js 2.79 kB
build/interactivity/search.min.js 618 B
build/is-shallow-equal/index.min.js 527 B
build/keyboard-shortcuts/index.min.js 1.3 kB
build/keycodes/index.min.js 1.46 kB
build/list-reusable-blocks/index.min.js 2.11 kB
build/list-reusable-blocks/style-rtl.css 851 B
build/list-reusable-blocks/style.css 851 B
build/media-utils/index.min.js 2.92 kB
build/modules/importmap-polyfill.min.js 12.2 kB
build/notices/index.min.js 948 B
build/nux/index.min.js 1.57 kB
build/nux/style-rtl.css 748 B
build/nux/style.css 744 B
build/patterns/index.min.js 6.46 kB
build/patterns/style-rtl.css 595 B
build/patterns/style.css 595 B
build/plugins/index.min.js 1.8 kB
build/preferences-persistence/index.min.js 2.06 kB
build/preferences/index.min.js 2.85 kB
build/preferences/style-rtl.css 710 B
build/preferences/style.css 712 B
build/primitives/index.min.js 809 B
build/priority-queue/index.min.js 1.52 kB
build/private-apis/index.min.js 1 kB
build/react-i18n/index.min.js 623 B
build/react-refresh-entry/index.min.js 9.47 kB
build/react-refresh-runtime/index.min.js 6.78 kB
build/redux-routine/index.min.js 2.7 kB
build/reusable-blocks/index.min.js 2.7 kB
build/reusable-blocks/style-rtl.css 256 B
build/reusable-blocks/style.css 256 B
build/rich-text/index.min.js 10.1 kB
build/router/index.min.js 1.93 kB
build/server-side-render/index.min.js 1.96 kB
build/shortcode/index.min.js 1.39 kB
build/style-engine/index.min.js 2.02 kB
build/token-list/index.min.js 582 B
build/url/index.min.js 3.74 kB
build/vendors/inert-polyfill.min.js 2.48 kB
build/vendors/react-dom.min.js 41.7 kB
build/vendors/react.min.js 4.03 kB
build/viewport/index.min.js 957 B
build/warning/index.min.js 249 B
build/widgets/index.min.js 7.11 kB
build/widgets/style-rtl.css 1.17 kB
build/widgets/style.css 1.17 kB
build/wordcount/index.min.js 1.02 kB

compressed-size-action

( sum( array ) / array.length ) ** 2
);
}

export function median( array ) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In some cases we also seems to be calculating the median, for others the average 🤔

@@ -72,26 +92,46 @@ export function curateResults(
results: WPRawPerformanceResults
): WPPerformanceResults {
const output = {
timeToFirstByte: median( results.timeToFirstByte ),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will change the reporting from median to average, is that intentional?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In some cases we also seems to be calculating the median, for others the average

It seems to me like a pointless inconsistency. The metrics measured don't look fundamentally different from each other. That why I'm trying to unify the stats to use only one methods.

Are you worried that changing it would invalidate the stats history? I think in our case the medians and averages should be approx the same, any difference is just a random noise. I'll check if one of them is systematically higher than the other one.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm just saying this would be reverting #54157, but maybe that's ok

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we remove that, maybe we should also remove the median function etc.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this would be reverting #54157

That's true, we'll have to deal with that. The reason why median is needed to "stabilize the metric" is that there are outliers and that median is less sensitive to them than average.

This is a real-world example of an inserterHover measurement with 20 data points:

[2.60, 3.09, 2.57, 2.62, 2.21, 2.98, 2.34, 3.34, 2.37, 3.00, 2.14, 2.16, 2.07, 2.22, 2.12, 2.27, 2.66, 2.20, 6.89, 2.99]

Almost all measurements are nicely grouped around 2.5, in a ±0.5 range, but there is one outlier at 6.8:

inserter-data

The presence of this outlier moves the median from 2.37 to 2.47, by 0.1. But the average is more sensitive, it's moved from 2.52 to 2.74, by 0.22.

When there are more outliers than one, then the stats are even more distorted.

We can solve this by detecting and removing the outliers. By arguing that when a certain result is too far from all others, it means that the measurement instrument broke down and produced a garbage result. Without outliers, neither the median and the average are distorted, and are closer to each other.

My ultimate goal here is to do some significance test for each metric and produce a result like:

on trunk       the inserterHover metric is 2.5±0.5ms
on this branch the inserterHover metric is 1.5±0.5ms
with 95% confidence this is a real improvement, not a random noise

This enables automated reporting in PR where the tool can confidently say "out of 100 metrics I checked, your PR improves this one but another two are significantly worse"

All the common significance form accept averages and variances as inputs. Not medians and quartiles. That's why I'm trying to move to averages.

@jsnajdr jsnajdr force-pushed the add/perf-results-variance branch from c024213 to e4aa308 Compare May 10, 2024 08:45
@jsnajdr jsnajdr changed the title Calculate and report variance in performance results Calculate and report quartiles in performance results May 10, 2024
@jsnajdr jsnajdr marked this pull request as ready for review May 13, 2024 12:41
Copy link

github-actions bot commented May 13, 2024

The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the props-bot label.

If you're merging code through a pull request on GitHub, copy and paste the following into the bottom of the merge commit message.

Co-authored-by: jsnajdr <jsnajdr@git.wordpress.org>
Co-authored-by: ellatrix <ellatrix@git.wordpress.org>
Co-authored-by: youknowriad <youknowriad@git.wordpress.org>

To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook.

@jsnajdr
Copy link
Member Author

jsnajdr commented May 13, 2024

So, at the end this PR does two things:

  1. Switch all statistics to median. Until now, we were using a random mixture of average and median. In other words we are following up to Performance: Use median instead of average to stabilize First Block metric #54157, and "stabilizing" all metrics. Median is more stable in our case than average because we have outlier measurements, and median is less distorted by them than the average.
  2. Report the quartile ranges in the result tables.

The result table now looks like this:

┌────────────────────────┬───────────────────────────────┬───────────────────────────────┬──────────┐
│ (index)                │ 0a755e7                       │ trunk                         │ % Change │
├────────────────────────┼───────────────────────────────┼───────────────────────────────┼──────────┤
│ timeToFirstByte        │ '49.6 ms +7.96% -4.54% (16)'  │ '50.25 ms +9.85% -3.48% (16)' │ '-1.29%' │
│ largestContentfulPaint │ '103.7 ms +2.22% -5.06% (16)' │ '102.7 ms +3.12% -1.9% (16)'  │ '0.97%'  │
│ lcpMinusTtfb           │ '52.4 ms +3.05% -4.39% (16)'  │ '52.05 ms +1.34% -3.17% (16)' │ '0.67%'  │
└────────────────────────┴───────────────────────────────┴───────────────────────────────┴──────────┘

It reports the median (50% quartile), how many percent are the 75% and 25% quartiles larger or smaller than the median, and the number of measurements (16 in the example).

The main value of this is that we can compare the "% change" against another branch, and the variability of the measurements on one branch. So, if the reported change is -1.29% while the measurements themselves vary by 5-9%, we can see that the reported change is probably not real, it's noise.

A nice next step would be if we could report the quartiles also to codevitals (in log-performance-results) and if the database and the /api/log endpoint were prepared for data that contain not only a single value, but a { q25, q50, q75 } triple. Then we could draw a chart that includes the error ranges.

@youknowriad
Copy link
Contributor

Is the (16) number really helpful in the table? Maybe it's just adding noise?

I wanted to check the summary but it seems the performance job is failing in the last run.

@jsnajdr jsnajdr force-pushed the add/perf-results-variance branch from 2510c52 to cd450ae Compare May 15, 2024 06:51
@jsnajdr
Copy link
Member Author

jsnajdr commented May 15, 2024

seems the performance job is failing in the last run.

I probably broke it when resolving the conflicts with #61450. Both PR touch the same code, printing the results table. I'm working on fixing it.

Is the (16) number really helpful in the table?

It's temporarily helpful, as I wanted to confirm I'm using all the samples from all test rounds. And it's useful to record the number of samples permanently in the codevitals database because statistical significance depends on number of samples: the precision of the measurement grows proportionally to $\sqrt{n}$. But we don't need to print it in the results table, I'll remove it.

@youknowriad
Copy link
Contributor

Thanks

@jsnajdr
Copy link
Member Author

jsnajdr commented May 15, 2024

I wanted to check the summary but it seems the performance job is failing in the last run.

@youknowriad it's all green now, you can check how the summary looks like.

When logging the data to codevitals' /api/log endpoint, we're still sending the single median value. Do you have any thoughts about how to change the database and the rest of the service if we wanted to store all the q25, q50 and q75 values?

@youknowriad
Copy link
Contributor

@jsnajdr Looks good.

For codevitals, I haven't thought about it much but I have nothing against it. Feel free to open an issue or actually work on it https://github.com/youknowriad/codevitals I'm just thinking that we should make it optional as not all projects might have it...

@jsnajdr jsnajdr merged commit 5b2454b into trunk May 15, 2024
62 checks passed
@jsnajdr jsnajdr deleted the add/perf-results-variance branch May 15, 2024 11:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[Type] Performance Related to performance efforts
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants