[ML-19342] Fix cross-workspace job source link on databricks #5174
LGTM! @liangz1 Can we update this test case?
mlflow/tests/tracking/context/test_databricks_job_context.py
Lines 21 to 42 in bf63c5d
def test_databricks_job_run_context_tags():
    patch_job_id = mock.patch("mlflow.utils.databricks_utils.get_job_id")
    patch_job_run_id = mock.patch("mlflow.utils.databricks_utils.get_job_run_id")
    patch_job_type = mock.patch("mlflow.utils.databricks_utils.get_job_type")
    patch_webapp_url = mock.patch("mlflow.utils.databricks_utils.get_webapp_url")
    with multi_context(patch_job_id, patch_job_run_id, patch_job_type, patch_webapp_url) as (
        job_id_mock,
        job_run_id_mock,
        job_type_mock,
        webapp_url_mock,
    ):
        assert DatabricksJobRunContext().tags() == {
            MLFLOW_SOURCE_NAME: "jobs/{job_id}/run/{job_run_id}".format(
                job_id=job_id_mock.return_value, job_run_id=job_run_id_mock.return_value
            ),
            MLFLOW_SOURCE_TYPE: SourceType.to_string(SourceType.JOB),
            MLFLOW_DATABRICKS_JOB_ID: job_id_mock.return_value,
            MLFLOW_DATABRICKS_JOB_RUN_ID: job_run_id_mock.return_value,
            MLFLOW_DATABRICKS_JOB_TYPE: job_type_mock.return_value,
            MLFLOW_DATABRICKS_WEBAPP_URL: webapp_url_mock.return_value,
        }
Signed-off-by: Liang Zhang <liang.zhang@databricks.com>
if workspace_url is not None:
    tags[MLFLOW_DATABRICKS_WORKSPACE_URL] = workspace_url
    tags[MLFLOW_DATABRICKS_WORKSPACE_ID] = workspace_id
Nit: What if workspace_url is None but workspace_id is not None, and vice versa?
Yes, it makes sense to not assume that they are dependent on each other.
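A minimal sketch of what treating the two values independently could look like. The helper name `add_workspace_tags` is hypothetical, and the tag key strings are assumptions made so the example is self-contained; the actual constants are the `MLFLOW_DATABRICKS_WORKSPACE_URL` and `MLFLOW_DATABRICKS_WORKSPACE_ID` names used in the diff.

```python
from typing import Dict, Optional

# Assumed values for illustration only; the PR uses constants with these names.
MLFLOW_DATABRICKS_WORKSPACE_URL = "mlflow.databricks.workspaceURL"
MLFLOW_DATABRICKS_WORKSPACE_ID = "mlflow.databricks.workspaceID"


def add_workspace_tags(
    tags: Dict[str, str],
    workspace_url: Optional[str],
    workspace_id: Optional[str],
) -> Dict[str, str]:
    """Hypothetical helper: set each workspace tag only if its own value is known."""
    # Check each value separately so a missing URL does not drop a known ID,
    # and a missing ID is never written as None.
    if workspace_url is not None:
        tags[MLFLOW_DATABRICKS_WORKSPACE_URL] = workspace_url
    if workspace_id is not None:
        tags[MLFLOW_DATABRICKS_WORKSPACE_ID] = workspace_id
    return tags
```

With this shape, a run that knows only its workspace ID still logs the ID tag, and no tag is ever set to None.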
re-LGTM with one comment: https://github.com/mlflow/mlflow/pull/5174/files#r774380093
Signed-off-by: Liang Zhang liang.zhang@databricks.com
What changes are proposed in this pull request?
Cross-workspace job source links and notebook source links are incorrect because they do not use the URL and ID of the workspace where the job/notebook was executed, but instead use the URL and ID of the workspace displaying the run. This PR fixes them by using the correct workspace URL and workspace ID logged with the run.
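The selection logic described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: `resolve_workspace` is a hypothetical name, the tag key strings are assumed, and the fallback to the current workspace when a tag is missing is also an assumption.

```python
from typing import Dict, Optional, Tuple

# Assumed tag keys for illustration; not taken from this PR's diff.
WORKSPACE_URL_TAG = "mlflow.databricks.workspaceURL"
WORKSPACE_ID_TAG = "mlflow.databricks.workspaceID"


def resolve_workspace(
    run_tags: Dict[str, str],
    current_url: Optional[str],
    current_id: Optional[str],
) -> Tuple[Optional[str], Optional[str]]:
    """Hypothetical helper: prefer the workspace logged on the run over the current one."""
    # Each value falls back on its own, so a run with only one of the two
    # tags still uses the logged value for that one.
    url = run_tags.get(WORKSPACE_URL_TAG, current_url)
    workspace_id = run_tags.get(WORKSPACE_ID_TAG, current_id)
    return url, workspace_id
```

The source link would then be built from the returned pair rather than from the workspace that happens to be rendering the run.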
How is this patch tested?
Unit test: existing test fixed.
Does this PR change the documentation?
1. Check the status of the ci/circleci: build_doc check. If it's successful, proceed to the next step, otherwise fix it.
2. Click Details on the right to open the job page of CircleCI.
3. Click the Artifacts tab.
4. Click docs/build/html/index.html.
Release Notes
Is this a user-facing change?
(Details in 1-2 sentences. You can just refer to another PR with a description if this PR is part of a larger change.)
What component(s), interfaces, languages, and integrations does this PR affect?
Components
area/artifacts: Artifact stores and artifact logging
area/build: Build and test infrastructure for MLflow
area/docs: MLflow documentation pages
area/examples: Example code
area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
area/models: MLmodel format, model serialization/deserialization, flavors
area/projects: MLproject format, project running backends
area/scoring: MLflow Model server, model deployment tools, Spark UDFs
area/server-infra: MLflow Tracking server backend
area/tracking: Tracking Service, tracking client APIs, autologging
Interface
area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
area/docker: Docker use across MLflow's components, such as MLflow Projects and MLflow Models
area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
area/windows: Windows support
Language
language/r: R APIs and clients
language/java: Java APIs and clients
language/new: Proposals for new client languages
Integrations
integrations/azure: Azure and Azure ML integrations
integrations/sagemaker: SageMaker integrations
integrations/databricks: Databricks integrations
How should the PR be classified in the release notes? Choose one:
rn/breaking-change - The PR will be mentioned in the "Breaking Changes" section
rn/none - No description will be included. The PR will be mentioned only by the PR number in the "Small Bugfixes and Documentation Updates" section
rn/feature - A new user-facing feature worth mentioning in the release notes
rn/bug-fix - A user-facing bug fix worth mentioning in the release notes
rn/documentation - A user-facing documentation change worth mentioning in the release notes