New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Parse subdirectory with quotes #5117
Conversation
mlflow/projects/utils.py
Outdated
if subdirectory and "." in subdirectory: | ||
raise ExecutionException("'.' is not allowed in project subdirectory paths.") | ||
return parsed_uri, subdirectory | ||
|
||
|
||
def _fix_quote(uri): | ||
# Remove quotes and, if necessary, add new ones | ||
return quote(uri.strip("'\"")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dinaldoap Thanks for the PR! Could you tell us why we need quote
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@harupy Thanks for the feedback! You question made me think in a better solution after reviewing mlflow's code. Since parsed_uri and subdirectory aren't used in command lines, they don't need to be quoted again. Besides that, I observed that removing quotes from uri instead of parsed_uri and subdirectory will allow "mlflow run" to work with quoted URIs without subdirectories. For example, mlflow run "'exam ples/sklearn_elasticnet_wine/'".
I will rewrite my commits in my branch and force push them in a minute.
mlflow/projects/utils.py
Outdated
if subdirectory and "." in subdirectory: | ||
raise ExecutionException("'.' is not allowed in project subdirectory paths.") | ||
return parsed_uri, subdirectory | ||
|
||
|
||
def _remove_quotes(uri): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def _remove_quotes(uri): | |
def _strip_quotes(uri): |
nit: can we rename this function to indicate it only removes leading and trailing quotes? remove
sounds like it removes all quotes in a given string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the feedback! function _remove_quotes
renamed to _strip_quotes
. I've already done another force push.
Signed-off-by: dinaldoap <dinaldoap@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
fix #5114 |
Signed-off-by: dinaldoap dinaldoap@gmail.com
What changes are proposed in this pull request?
Currently, mlflow.projects._parse_subdirectory splits uri by the character '#' and leaves unpaired quotes in parsed_uri and subdirectory. For example, uri = 'parsed_uri#subdirectory' becomes parsed_uri='parsed_uri and subdirectory=subdirectory'.
The proposed solution is to remove quotes from parsed_uri and subdirectory and add them back only when necessary.
How is this patch tested?
The test is located in tests/projects/test_utils.py, function test_parse_subdirectory.
Does this PR change the documentation?
ci/circleci: build_doc
check. If it's successful, proceed to thenext step, otherwise fix it.
Details
on the right to open the job page of CircleCI.Artifacts
tab.docs/build/html/index.html
.Release Notes
Is this a user-facing change?
(Details in 1-2 sentences. You can just refer to another PR with a description if this PR is part of a larger change.)
Fix mlflow run to work with quoted URIs.
What component(s), interfaces, languages, and integrations does this PR affect?
Components
area/artifacts
: Artifact stores and artifact loggingarea/build
: Build and test infrastructure for MLflowarea/docs
: MLflow documentation pagesarea/examples
: Example codearea/model-registry
: Model Registry service, APIs, and the fluent client calls for Model Registryarea/models
: MLmodel format, model serialization/deserialization, flavorsarea/projects
: MLproject format, project running backendsarea/scoring
: MLflow Model server, model deployment tools, Spark UDFsarea/server-infra
: MLflow Tracking server backendarea/tracking
: Tracking Service, tracking client APIs, autologgingInterface
area/uiux
: Front-end, user experience, plotting, JavaScript, JavaScript dev serverarea/docker
: Docker use across MLflow's components, such as MLflow Projects and MLflow Modelsarea/sqlalchemy
: Use of SQLAlchemy in the Tracking Service or Model Registryarea/windows
: Windows supportLanguage
language/r
: R APIs and clientslanguage/java
: Java APIs and clientslanguage/new
: Proposals for new client languagesIntegrations
integrations/azure
: Azure and Azure ML integrationsintegrations/sagemaker
: SageMaker integrationsintegrations/databricks
: Databricks integrationsHow should the PR be classified in the release notes? Choose one:
rn/breaking-change
- The PR will be mentioned in the "Breaking Changes" sectionrn/none
- No description will be included. The PR will be mentioned only by the PR number in the "Small Bugfixes and Documentation Updates" sectionrn/feature
- A new user-facing feature worth mentioning in the release notesrn/bug-fix
- A user-facing bug fix worth mentioning in the release notesrn/documentation
- A user-facing documentation change worth mentioning in the release notes