Add server option for serving only artifacts and proxied serving mode #5045
Conversation
Signed-off-by: Ben Wilson <benjamin.wilson@databricks.com>
mlflow/server/handlers.py
Outdated
    return wrapper


def _disable_mlflow_artifacts_only(func):
- def _disable_mlflow_artifacts_only(func):
+ def _disable_if_artifacts_only(func):
Can we also rename this function to indicate that the decorated endpoint is disabled if --artifacts-only is specified?
Much clearer decorator name. Changed.
@BenWilson2 It seems it's not changed yet :)
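For readers following the rename discussion, here is a minimal sketch of what a decorator named `_disable_if_artifacts_only` could look like. It is not the code from this PR: the environment variable name, the `(message, status)` return shape, and the 503 status code are all assumptions made for illustration, and the sketch avoids Flask so it runs on its own.

```python
import os
from functools import wraps

# Hypothetical environment variable name, used only for this sketch.
ARTIFACTS_ONLY_ENV_VAR = "EXAMPLE_SERVER_ARTIFACTS_ONLY"


def _disable_if_artifacts_only(func):
    """Short-circuit the wrapped handler when the server runs in artifacts-only mode."""

    @wraps(func)
    def wrapper(*args, **kwargs):
        if os.environ.get(ARTIFACTS_ONLY_ENV_VAR):
            # Assumed response shape: (message, HTTP status). 503 is an assumption.
            return (
                f"Endpoint {func.__name__} is disabled because the server is "
                "running with --artifacts-only.",
                503,
            )
        return func(*args, **kwargs)

    return wrapper


@_disable_if_artifacts_only
def list_experiments():
    # Stand-in for a tracking endpoint handler.
    return ("[]", 200)
```

With the variable unset, `list_experiments()` runs normally; once it is set, the wrapper returns the "disabled" message instead of calling the handler, which is the behavior the new name is meant to signal.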
mlflow/cli.py
Outdated
"false",
"false",
- "false",
- "false",
+ False,
+ False,
Should these be False?
It definitely should. I forgot that Python treats any non-empty string as truthy, so the strings "true" and "false" both evaluate to True in a boolean context.
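To illustrate why a string default such as "false" is risky, the short check below shows Python's truthiness rule for strings. The `parse_bool` helper is hypothetical and only shows one way to convert textual values explicitly; it is not part of this PR.

```python
# Any non-empty string is truthy, regardless of what it spells.
assert bool("false") is True
assert bool("False") is True
assert bool("") is False


def parse_bool(text):
    """Hypothetical helper: map common textual spellings to a real bool."""
    normalized = text.strip().lower()
    if normalized in {"true", "1", "yes"}:
        return True
    if normalized in {"false", "0", "no"}:
        return False
    raise ValueError(f"Not a boolean string: {text!r}")
```

Using the real `False` as the default avoids the problem entirely, since no conversion is needed.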
mlflow/cli.py
Outdated
"by routing these requests to the storage location that is specified by "
"'--artifact-destination' directly through a proxy. The default location that "
"these requests are served from is a local './mlartifacts' directory which can be "
"overridden via '--artifact-destination' arguments. "
- "overridden via '--artifact-destination' arguments. "
+ "overridden via '--artifacts-destination' argument. "
nit
fixed
@BenWilson2 This is awesome! One point of feedback I noticed while manual QAing is that, when --serve-artifacts is specified, --default-artifact-root still defaults to ./mlruns. For artifact serving, it should default to an mlflow-artifacts:/ URI. Is support for mlflow-artifacts URIs coming in a separate PR? If so, let's merge that one first.
Once we change the default behavior, can we update the documentation for the --default-artifact-root option? Even for the existing behavior with file stores, the mlflow server --help output is pretty confusing and has some English syntax errors:
--default-artifact-root URI Local or S3 URI to store
artifacts, for new experiments.
Note that this flag does not
impact already-created
experiments. Default: Within
file store, if a file:/ URI is
provided. If a sql backend is
used, then this option is
required.
Force-pushed from 09b84eb to d83c7a3
mlflow/cli.py
Outdated
    else:
        default_artifact_root = DEFAULT_LOCAL_FILE_AND_ARTIFACT_PATH
    default_artifact_root = resolve_default_artifact_root(
        serve_artifacts, default_artifact_root, backend_store_uri, True
- serve_artifacts, default_artifact_root, backend_store_uri, True
+ serve_artifacts, default_artifact_root, backend_store_uri, resolve_to_local=True
@BenWilson2 Can we use a keyword argument here?
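As a rough illustration of why the keyword form reads better, here is a sketch of a function with the same name and argument order as the call above. The body is an assumption pieced together from this thread (the two default constants and the `--serve-artifacts` behavior), not the PR's implementation; the keyword-only marker `*` and the simplistic "is the backend store local" check are additions for the example.

```python
from urllib.parse import urlparse

DEFAULT_LOCAL_FILE_AND_ARTIFACT_PATH = "./mlruns"
DEFAULT_ARTIFACTS_URI = "mlflow-artifacts:/"


def resolve_default_artifact_root(
    serve_artifacts, default_artifact_root, backend_store_uri, *, resolve_to_local=False
):
    """Sketch only: pick an artifact root when the user did not supply one."""
    if default_artifact_root:
        # An explicit value always wins.
        return default_artifact_root
    if serve_artifacts:
        return DEFAULT_ARTIFACTS_URI
    # Assumed heuristic: no scheme or a file scheme means a local file store.
    if urlparse(backend_store_uri).scheme in ("", "file"):
        return backend_store_uri
    if resolve_to_local:
        return DEFAULT_LOCAL_FILE_AND_ARTIFACT_PATH
    raise ValueError("default artifact root is required for a non-local backend store")
```

With the `*` in the signature, a bare trailing `True` raises `TypeError`, so every call site has to spell out `resolve_to_local=True` and the meaning of the flag is visible where it is used.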
mlflow/server/handlers.py
Outdated
(
    f"Endpoint: {request.url_rule} disabled due to the mlflow server running "
    "without `--serve-artifacts`. To enable artifacts server functionaltiy, "
    "run `mlflow server` with `--serve-artfiacts`"
- "run `mlflow server` with `--serve-artfiacts`"
+ "run `mlflow server` with `--serve-artifacts`"
typo
mlflow/server/handlers.py
Outdated
return Response(
    (
        f"Endpoint: {request.url_rule} disabled due to the mlflow server running "
        "without `--serve-artifacts`. To enable artifacts server functionaltiy, "
- "without `--serve-artifacts`. To enable artifacts server functionaltiy, "
+ "without `--serve-artifacts`. To enable artifacts server functionality, "
typo
@BenWilson2 Can we merge the master branch to include changes made by #5070?
Is this line missing a slash after resolved = f"{base_url}{track_parse.path}/{uri_parse.path.lstrip('/')}"
the parsed
@BenWilson2 My suggested code was incorrect. I think we need a slash before resolved = f"{base_url}/{track_parse.path}{uri_parse.path.lstrip('/')}"
Ah, I get it! Added fixes for this and added fixture params to validate in the test suite.
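The slash question above comes down to joining three path pieces without doubling or dropping separators. Below is a small sketch of one way to do that, so a tracking URI with or without a trailing slash resolves to the same URL. The function name `resolve_artifact_url` and the `API_PREFIX` value are assumptions for illustration; this is not `MlflowArtifactsRepository.resolve_uri`.

```python
from urllib.parse import urlparse

# Assumed REST prefix for the artifact endpoints; treat it as a placeholder.
API_PREFIX = "/api/2.0/mlflow-artifacts/artifacts"


def resolve_artifact_url(tracking_uri, artifact_uri, api_prefix=API_PREFIX):
    """Sketch: map an mlflow-artifacts URI onto the tracking server's HTTP URL."""
    track = urlparse(tracking_uri)
    artifact = urlparse(artifact_uri)
    base = f"{track.scheme}://{track.netloc}"
    # Strip slashes from each segment, then join the non-empty ones with a single "/".
    segments = [track.path, api_prefix, artifact.path]
    path = "/".join(s.strip("/") for s in segments if s.strip("/"))
    return f"{base}/{path}"
```

Normalizing every segment before joining means the result does not depend on whether the caller wrote `http://localhost:5000` or `http://localhost:5000/`, which is the case the extra fixture params were added to cover.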
@pytest.fixture(
    scope="module", autouse=True, params=["http://localhost:5000", "http://localhost:5000/"]
)
This doubles the number of tests. Can we just test that both http://localhost:5000 and http://localhost:5000/ can be resolved correctly?
definitely!
Created another fixture and used an explicit single test for validating alternate uri naming convention
f"mlflow-artifacts://myhostname:4242{base_path}/hostport",
f"http://myhostname:4242{base_url}{base_path}/hostport",
"http://myhostname:4242",
Can we parametrize this test?
@pytest.mark.parametrize("tracking_uri", ["http://localhost:5000", "http://localhost:5000/"])
@pytest.mark.parametrize("artifact_uri, resolved_uri", [(...), (...)])
def test_xxx_yyy(tracking_uri, artifact_uri, resolved_uri):
    assert MlflowArtifactsRepository.resolve_uri(tracking_uri, artifact_uri, resolved_uri) == ...
definitely. Added.
f"mlflow-artifacts://myhostname:4242{base_path}/hostport",
f"http://myhostname:4242{base_url}{base_path}/hostport",
"http://myhostname:4242",
- f"mlflow-artifacts://myhostname:4242{base_path}/hostport",
- f"http://myhostname:4242{base_url}{base_path}/hostport",
- "http://myhostname:4242",
+ f"mlflow-artifacts://myhostname:4242{base_path}/hostport",
+ "http://myhostname:4242",
+ f"http://myhostname:4242{base_url}{base_path}/hostport",
Can we fix the order?
I like the approach from your last comment better. Switched to that cleaner, easier-to-read implementation.
LGTM!
@@ -10,5 +10,6 @@
# Also used as default location for artifacts, when not provided, in non local file based backends
# (eg MySQL)
DEFAULT_LOCAL_FILE_AND_ARTIFACT_PATH = "./mlruns"
DEFAULT_ARTIFACTS_URI = "mlflow-artifacts:/"
Can we add an inline comment explaining what this is used for?
certainly!
mlflow/cli.py
Outdated
f"By default, data will be logged to the {DEFAULT_ARTIFACTS_URI} uri proxy if "
"the --serve-artifacts option is enabled. Otherwise, the default location will "
f"be {DEFAULT_LOCAL_FILE_AND_ARTIFACT_PATH}.",
- f"By default, data will be logged to the {DEFAULT_ARTIFACTS_URI} uri proxy if "
- "the --serve-artifacts option is enabled. Otherwise, the default location will "
- f"be {DEFAULT_LOCAL_FILE_AND_ARTIFACT_PATH}.",
+ f"If the --serve-artifacts option is specified, the default artifact root is {DEFAULT_ARTIFACTS_URI}. "
+ f"Otherwise, the default artifact root is {DEFAULT_LOCAL_FILE_AND_ARTIFACT_PATH}.",
LGTM with 2 small docs comments. Thanks so much, @BenWilson2 !
What changes are proposed in this pull request?
Add the flags:
--serve-artifacts to enable starting an MLflow server for artifact serving purposes
--artifacts-only to enable artifact serving as an exclusive option to an MLflow server instance, disabling all other endpoints in the tracking service (/api/2.0/mlflow/* disabled, /api/2.0/mlflow-artifacts/* enabled only).
This is a follow-on to 4946.
How is this patch tested?
unit tests
Does this PR change the documentation?
ci/circleci: build_doc check. If it's successful, proceed to the next step, otherwise fix it.
Details on the right to open the job page of CircleCI.
Artifacts tab.
docs/build/html/index.html.
Release Notes
Is this a user-facing change?
(Details in 1-2 sentences. You can just refer to another PR with a description if this PR is part of a larger change.)
Add the capability to enable or disable mlflow-artifacts functionality for the mlflow server, as well as the ability to restrict an mlflow server instance exclusively to artifact handling.
What component(s), interfaces, languages, and integrations does this PR affect?
Components
area/artifacts: Artifact stores and artifact logging
area/build: Build and test infrastructure for MLflow
area/docs: MLflow documentation pages
area/examples: Example code
area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
area/models: MLmodel format, model serialization/deserialization, flavors
area/projects: MLproject format, project running backends
area/scoring: MLflow Model server, model deployment tools, Spark UDFs
area/server-infra: MLflow Tracking server backend
area/tracking: Tracking Service, tracking client APIs, autologging
Interface
area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
area/docker: Docker use across MLflow's components, such as MLflow Projects and MLflow Models
area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
area/windows: Windows support
Language
language/r: R APIs and clients
language/java: Java APIs and clients
language/new: Proposals for new client languages
Integrations
integrations/azure: Azure and Azure ML integrations
integrations/sagemaker: SageMaker integrations
integrations/databricks: Databricks integrations
How should the PR be classified in the release notes? Choose one:
rn/breaking-change - The PR will be mentioned in the "Breaking Changes" section
rn/none - No description will be included. The PR will be mentioned only by the PR number in the "Small Bugfixes and Documentation Updates" section
rn/feature - A new user-facing feature worth mentioning in the release notes
rn/bug-fix - A user-facing bug fix worth mentioning in the release notes
rn/documentation - A user-facing documentation change worth mentioning in the release notes