New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Automatically update metric plots for in-progress runs #2099 #5017
Conversation
@cedkoffeto Thanks for the contribution! The DCO check failed. Please sign off your commits by following the instructions here: https://github.com/mlflow/mlflow/runs/4132478673. See https://github.com/mlflow/mlflow/blob/master/CONTRIBUTING.rst#sign-your-work for more details. |
Signed-off-by: Cedric Koffeto cedkoffeto@gmail.com Signed-off-by: cedric koffeto <cedkoffeto@gmail.com>
Signed-off-by: cedric koffeto <cedkoffeto@gmail.com>
Hi @cedkoffeto, thanks for the PR! I'll review it soon :) |
@harupy Glad to be able to help 😉 |
@cedkoffeto btw could you take a screen record of how the metric plot automatically gets updated? |
Here it is @harupy Enregistrement.de.l.ecran.2021-11-08.a.03.09.24.mp4 |
Thanks for the screen recording. It looks like the entire plot gets rendered. I'm investigating how we can only update the lines like this: only-update-lines.mov |
Here's my attempt: https://github.com/harupy/mlflow/tree/5017-harupy. The implementation is almost the same as yours. In my implementation, I don't call this.setState((prevState) => ({
historyRequestIds: [...prevState.historyRequestIds, ...requestIds],
})); The message showing up at the top is just for demo purposes. auto-plot-update.movPython script I used:import time
import numpy as np
import mlflow
with mlflow.start_run() as run:
print(
"URL:",
r"http://localhost:3000/#/metric/metric1?runs=[%22<<< RUN_ID >>>%22]&experiment=0&plot_metric_keys=[%22metric1%22]&plot_layout={%22autosize%22:true,%22xaxis%22:{},%22yaxis%22:{}}&x_axis=relative&y_axis_scale=linear&line_smoothness=1&show_point=true&deselected_curves=[]&last_linear_y_axis_range=[]".replace(
"<<< RUN_ID >>>", run.info.run_id
),
)
for epoch in range(1, 10):
print(epoch)
mlflow.log_metric(key="metric1", value=epoch * np.log(epoch), step=epoch)
mlflow.log_metric(key="metric2", value=(1 / epoch) * np.log(epoch), step=epoch)
time.sleep(3) |
@cedkoffeto @harupy Awesome stuff! Does the proposal from https://github.com/harupy/mlflow/tree/5017-harupy also preserve plot customizations and zoom? |
Hi @harupy, |
Thanks @dbczumar |
@dbczumar Yep, it does. Here's a quick demo. auto-plot-update-customization.mov |
mlflow/server/js/src/experiment-tracking/components/MetricsPlotPanel.js
Outdated
Show resolved
Hide resolved
mlflow/server/js/src/experiment-tracking/components/MetricsPlotPanel.js
Outdated
Show resolved
Hide resolved
mlflow/server/js/src/experiment-tracking/components/MetricsPlotPanel.js
Outdated
Show resolved
Hide resolved
Hi @harupy, any update? |
Hi @cedkoffeto, sorry for the late reply. We internally discussed this feature. Here's our latest prototype: automatic-metric-plot-update.mov |
Hi @harupy, that's great! |
Signed-off-by: harupy <hkawamura0130@gmail.com>
@cedkoffeto I pushed some commits to update the PR. |
Signed-off-by: harupy <hkawamura0130@gmail.com>
Signed-off-by: harupy <hkawamura0130@gmail.com>
// Stop polling when the polling duration exceeds this value | ||
export const METRICS_PLOT_POLLING_DURATION_MS = 3600 * 1000; // 1 hour |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should only stop polling when there's no new data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dbczumar I remember we discussed that we should set an appropriate polling threshold when a run never ends, but not setting such a threshold sounds ok to me because runs end in most cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Offline discussion: check the timestamp of the last metric, and if it's more than 1 week, then we won't refresh.
@@ -22,10 +22,16 @@ import { getUUID } from '../../common/utils/ActionUtils'; | |||
export const CHART_TYPE_LINE = 'line'; | |||
export const CHART_TYPE_BAR = 'bar'; | |||
|
|||
// Polling interval | |||
export const METRICS_PLOT_POLLING_INTERVAL_MS = 5000; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we increase this to 10 seconds? 5 seems aggressive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, what happens if the refresh fails? Does the page crash?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we increase this to 10 seconds? 5 seems aggressive.
Sure!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, what happens if the refresh fails? Does the page crash?
Let me test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when-request-fails.mov
- The page doesn't crash.
- The page keeps polling.
|
Signed-off-by: harupy <hkawamura0130@gmail.com>
Signed-off-by: harupy <hkawamura0130@gmail.com>
Signed-off-by: harupy <hkawamura0130@gmail.com>
Signed-off-by: harupy <hkawamura0130@gmail.com>
export const METRICS_PLOT_POLLING_INTERVAL_MS = 10 * 1000; // 10 seconds | ||
// A run is considered as 'hanging' if its status is 'RUNNING' but its latest metric was logged | ||
// prior to this threshold. The metrics plot doesn't automatically update hanging runs. | ||
export const METRICS_PLOT_HANGING_RUN_THRESHOLD_MS = 3600 * 24 * 7 * 1000; // 1 week |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does "hanging" make sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep!
@@ -611,6 +611,10 @@ | |||
"defaultMessage": "Registered Models", | |||
"description": "Text for registered model link in the title for model comparison page" | |||
}, | |||
"UEDu0c": { | |||
"defaultMessage": "MLflow UI automatically fetches metric histories for active runs and updates the metrics plot with a {interval} second interval.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Included interval
so a user doesn't need to guess or measure how long the interval is.
{ key: 'metric_1', value: 100, step: 2, timestamp: 1556662044000 }, | ||
{ key: 'metric_1', value: 50, step: 1, timestamp: 1556662043000 }, | ||
{ key: 'metric_1', value: 100, step: 2, timestamp: now }, | ||
{ key: 'metric_1', value: 50, step: 1, timestamp: now - 1 }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replaced hardcoded timestamps with now
to prevent the metrics plot from considering these runs as hanging.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Awesome work, @cedkoffeto, @harupy ! Thank you so much for this contribution, @cedkoffeto!
Signed-off-by: harupy <hkawamura0130@gmail.com>
Thanks! It was a pleasure :) |
Signed-off-by: Cedric Koffeto cedkoffeto@gmail.com
What changes are proposed in this pull request?
setInterval function added in metricsPlotPanel component which retrieves execution status and metrics history from API every 8 seconds to update the plot and automatically stops when all executions are complete. closes #2099
How is this patch tested?
It can be tested with a unit test to check if the plot is actually updated
Release Notes
Is this a user-facing change?
Automatically update metric plots for in-progress runs
What component(s), interfaces, languages, and integrations does this PR affect?
Components
area/artifacts
: Artifact stores and artifact loggingarea/build
: Build and test infrastructure for MLflowarea/docs
: MLflow documentation pagesarea/examples
: Example codearea/model-registry
: Model Registry service, APIs, and the fluent client calls for Model Registryarea/models
: MLmodel format, model serialization/deserialization, flavorsarea/projects
: MLproject format, project running backendsarea/scoring
: MLflow Model server, model deployment tools, Spark UDFsarea/server-infra
: MLflow Tracking server backendarea/tracking
: Tracking Service, tracking client APIs, autologgingInterface
area/uiux
: Front-end, user experience, plotting, JavaScript, JavaScript dev serverarea/docker
: Docker use across MLflow's components, such as MLflow Projects and MLflow Modelsarea/sqlalchemy
: Use of SQLAlchemy in the Tracking Service or Model Registryarea/windows
: Windows supportLanguage
language/r
: R APIs and clientslanguage/java
: Java APIs and clientslanguage/new
: Proposals for new client languagesIntegrations
integrations/azure
: Azure and Azure ML integrationsintegrations/sagemaker
: SageMaker integrationsintegrations/databricks
: Databricks integrationsHow should the PR be classified in the release notes? Choose one:
rn/breaking-change
- The PR will be mentioned in the "Breaking Changes" sectionrn/none
- No description will be included. The PR will be mentioned only by the PR number in the "Small Bugfixes and Documentation Updates" sectionrn/feature
- A new user-facing feature worth mentioning in the release notesrn/bug-fix
- A user-facing bug fix worth mentioning in the release notesrn/documentation
- A user-facing documentation change worth mentioning in the release notes