[BUG]: On windows _create_docker_build_ctx fails #4813

Closed
6 of 23 tasks
bjacobs1 opened this issue Sep 16, 2021 · 2 comments
Labels
  • area/build: Build and test infrastructure for MLflow
  • area/docker: Docker use anywhere, such as MLprojects and MLmodels
  • area/examples: Example code
  • area/projects: MLproject format, project running backends
  • area/windows: Issue is unique to windows.
  • bug: Something isn't working

Comments

@bjacobs1


Willingness to contribute

The MLflow Community encourages bug fix contributions. Would you or another member of your organization be willing to contribute a fix for this bug to the MLflow code base?

  • Yes. I can contribute a fix for this bug independently.
  • Yes. I would be willing to contribute a fix for this bug with guidance from the MLflow community.
  • No. I cannot contribute a bug fix at this time.

System information

  • Have I written custom code (as opposed to using a stock example script provided in MLflow): y
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Windows, Docker Desktop
  • MLflow installed from (source or binary): poetry installed
  • MLflow version (run mlflow --version): 1.20.2
  • Python version: 3.8.10
  • npm version, if running the dev UI:
  • Exact command to reproduce: From a poetry shell: mlflow run .

Describe the problem

When trying to build a project with a local Docker environment, I get a PermissionError ([WinError 5] Access is denied), possibly for the reason described here:
https://stackoverflow.com/questions/37830326/how-to-avoid-windowserror-error-5-access-is-denied

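Changing line 117 of mlflow/projects/docker.py from
shutil.rmtree(directory)
to
shutil.rmtree(directory, ignore_errors=True)
resolves the issue. I am not sure that is desirable, though, since ignore_errors=True suppresses every deletion error and can leave the temporary directory behind.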
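
To illustrate why ignore_errors=True is a blunt instrument, here is a minimal sketch. It uses a nonexistent path as a stand-in for the Windows permission failure (an assumption, made so the snippet behaves the same on any OS), and the helper name cleanup_outcome is hypothetical, not part of MLflow.

```python
import os
import shutil
import tempfile


def cleanup_outcome(directory, ignore_errors):
    """Report whether shutil.rmtree surfaced an error for `directory`."""
    try:
        shutil.rmtree(directory, ignore_errors=ignore_errors)
    except OSError as exc:
        return "raised " + type(exc).__name__
    return "no error reported"


# A path that does not exist stands in for the failing cleanup (assumption).
parent = tempfile.mkdtemp()
missing = os.path.join(parent, "does-not-exist")

print(cleanup_outcome(missing, ignore_errors=False))  # raised FileNotFoundError
print(cleanup_outcome(missing, ignore_errors=True))   # no error reported

os.rmdir(parent)  # remove the scratch directory created above
```

With ignore_errors=True the caller cannot tell a successful cleanup from a failed one, which is the concern raised above.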

Code to reproduce issue

Given the nature of the problem, this should occur with any project that uses a docker environment.

Other info / logs

(mlflow-experiments-00bVs_pK-py3.8) PS C:\Users\jacobss\gitlab\test-mlflow-project> mlflow run .
Traceback (most recent call last):
  File "C:\Users\jacobss\.pyenv\pyenv-win\versions\3.8.10\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\jacobss\.pyenv\pyenv-win\versions\3.8.10\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\jacobss\AppData\Local\pypoetry\Cache\virtualenvs\mlflow-experiments-00bVs_pK-py3.8\Scripts\mlflow.exe\__main__.py", line 7, in <module>
  File "c:\users\jacobss\appdata\local\pypoetry\cache\virtualenvs\mlflow-experiments-00bvs_pk-py3.8\lib\site-packages\click\core.py", line 1137, in __call__
    return self.main(*args, **kwargs)
  File "c:\users\jacobss\appdata\local\pypoetry\cache\virtualenvs\mlflow-experiments-00bvs_pk-py3.8\lib\site-packages\click\core.py", line 1062, in main
    rv = self.invoke(ctx)
  File "c:\users\jacobss\appdata\local\pypoetry\cache\virtualenvs\mlflow-experiments-00bvs_pk-py3.8\lib\site-packages\click\core.py", line 1668, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "c:\users\jacobss\appdata\local\pypoetry\cache\virtualenvs\mlflow-experiments-00bvs_pk-py3.8\lib\site-packages\click\core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "c:\users\jacobss\appdata\local\pypoetry\cache\virtualenvs\mlflow-experiments-00bvs_pk-py3.8\lib\site-packages\click\core.py", line 763, in invoke
    return _callback(*args, **kwargs)
  File "c:\users\jacobss\appdata\local\pypoetry\cache\virtualenvs\mlflow-experiments-00bvs_pk-py3.8\lib\site-packages\mlflow\cli.py", line 168, in run
    projects.run(
  File "c:\users\jacobss\appdata\local\pypoetry\cache\virtualenvs\mlflow-experiments-00bvs_pk-py3.8\lib\site-packages\mlflow\projects\__init__.py", line 293, in run
    submitted_run_obj = _run(
  File "c:\users\jacobss\appdata\local\pypoetry\cache\virtualenvs\mlflow-experiments-00bvs_pk-py3.8\lib\site-packages\mlflow\projects\__init__.py", line 92, in _run
    submitted_run = backend.run(
  File "c:\users\jacobss\appdata\local\pypoetry\cache\virtualenvs\mlflow-experiments-00bvs_pk-py3.8\lib\site-packages\mlflow\projects\backend\local.py", line 72, in run
    image = build_docker_image(
  File "c:\users\jacobss\appdata\local\pypoetry\cache\virtualenvs\mlflow-experiments-00bvs_pk-py3.8\lib\site-packages\mlflow\projects\docker.py", line 65, in build_docker_image
    build_ctx_path = _create_docker_build_ctx(work_dir, dockerfile)
  File "c:\users\jacobss\appdata\local\pypoetry\cache\virtualenvs\mlflow-experiments-00bvs_pk-py3.8\lib\site-packages\mlflow\projects\docker.py", line 117, in _create_docker_build_ctx
    shutil.rmtree(directory)
  File "C:\Users\jacobss\.pyenv\pyenv-win\versions\3.8.10\lib\shutil.py", line 740, in rmtree
    return _rmtree_unsafe(path, onerror)
  File "C:\Users\jacobss\.pyenv\pyenv-win\versions\3.8.10\lib\shutil.py", line 613, in _rmtree_unsafe
    _rmtree_unsafe(fullname, onerror)
  File "C:\Users\jacobss\.pyenv\pyenv-win\versions\3.8.10\lib\shutil.py", line 613, in _rmtree_unsafe
    _rmtree_unsafe(fullname, onerror)
  File "C:\Users\jacobss\.pyenv\pyenv-win\versions\3.8.10\lib\shutil.py", line 613, in _rmtree_unsafe
    _rmtree_unsafe(fullname, onerror)
  [Previous line repeated 1 more time]
  File "C:\Users\jacobss\.pyenv\pyenv-win\versions\3.8.10\lib\shutil.py", line 618, in _rmtree_unsafe
    onerror(os.unlink, fullname, sys.exc_info())
  File "C:\Users\jacobss\.pyenv\pyenv-win\versions\3.8.10\lib\shutil.py", line 616, in _rmtree_unsafe
    os.unlink(fullname)
PermissionError: [WinError 5] Access is denied: 'C:\Users\jacobss\AppData\Local\Temp\tmpcqiipe5p\mlflow-project-contents\.git\objects\06\de731bbe9e84c9c988e3710b397bdcc68f94c6'

What component(s), interfaces, languages, and integrations does this bug affect?

Components

  • area/artifacts: Artifact stores and artifact logging
  • area/build: Build and test infrastructure for MLflow
  • area/docs: MLflow documentation pages
  • area/examples: Example code
  • area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
  • area/models: MLmodel format, model serialization/deserialization, flavors
  • area/projects: MLproject format, project running backends
  • area/scoring: MLflow Model server, model deployment tools, Spark UDFs
  • area/server-infra: MLflow Tracking server backend
  • area/tracking: Tracking Service, tracking client APIs, autologging

Interface

  • area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
  • area/docker: Docker use across MLflow's components, such as MLflow Projects and MLflow Models
  • area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
  • area/windows: Windows support

Language

  • language/r: R APIs and clients
  • language/java: Java APIs and clients
  • language/new: Proposals for new client languages

Integrations

  • integrations/azure: Azure and Azure ML integrations
  • integrations/sagemaker: SageMaker integrations
  • integrations/databricks: Databricks integrations
@bjacobs1 bjacobs1 added the bug Something isn't working label Sep 16, 2021
@github-actions github-actions bot added area/build Build and test infrastructure for MLflow area/docker Docker use anywhere, such as MLprojects and MLmodels area/examples Example code area/projects MLproject format, project running backends area/windows Issue is unique to windows. labels Sep 16, 2021
@MrKriss

MrKriss commented Sep 23, 2021

I can confirm this also happens on my Windows system when using Docker environments with MLflow Projects. It also appears to share the same underlying cause as issue #4603.

With a bit more digging, the root cause seems to be that Python raises an error when it tries to delete a file that is marked read-only on Windows. Git sets this read-only flag on files in the .git/ folder by default. So if a project uses git and has its MLproject file at the repository root, MLflow copies the whole .git/ folder to a temp directory in preparation for copying it into the base Docker image during the build. MLflow is then unable to delete these files during cleanup afterward, and so it errors.
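
A minimal reproduction sketch of that behaviour, independent of MLflow. The function name and the file layout are hypothetical, chosen only to resemble the path in the traceback. On Windows the read-only file is expected to make shutil.rmtree raise PermissionError; on POSIX systems deleting a file depends on the directory's permissions rather than the file's, so the same tree is removed without error.

```python
import os
import shutil
import stat
import tempfile


def try_remove_readonly_tree():
    """Build a small tree with one read-only file and try to delete it."""
    root = tempfile.mkdtemp()
    objects_dir = os.path.join(root, ".git", "objects", "06")
    os.makedirs(objects_dir)
    blob = os.path.join(objects_dir, "blob")  # hypothetical file name
    with open(blob, "w") as handle:
        handle.write("data")
    os.chmod(blob, stat.S_IREAD)  # mark the file read-only

    try:
        shutil.rmtree(root)
        return "removed"
    except PermissionError:
        # Clean up after the failed attempt so no temp files are left behind.
        os.chmod(blob, stat.S_IWRITE)
        shutil.rmtree(root)
        return "PermissionError"


print(try_remove_readonly_tree())
```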

One fix would be to handle the error by clearing the read-only flag and retrying the deletion, as detailed here: https://stackoverflow.com/questions/4829043/how-to-remove-read-only-attrib-directory-with-python-in-windows
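
A sketch of that approach, assuming the cleanup only needs to cope with read-only files. The function names are hypothetical and this is not necessarily what the MLflow fix does. The hook clears the read-only flag and retries whichever call failed, which matches the os.unlink failure in the traceback.

```python
import os
import shutil
import stat
import sys
import tempfile


def _make_writable_and_retry(func, path, _error):
    """Error hook: clear the read-only flag, then retry the failed call."""
    os.chmod(path, stat.S_IWRITE)
    func(path)


def remove_tree_with_readonly_files(directory):
    """Delete `directory`, retrying entries that fail because they are read-only."""
    # Python 3.12 added `onexc` and deprecated `onerror`; both hooks are
    # called with (function, path, error information).
    if sys.version_info >= (3, 12):
        shutil.rmtree(directory, onexc=_make_writable_and_retry)
    else:
        shutil.rmtree(directory, onerror=_make_writable_and_retry)


# Usage: a tree containing one read-only file is removed completely.
root = tempfile.mkdtemp()
nested = os.path.join(root, ".git", "objects")
os.makedirs(nested)
blob = os.path.join(nested, "blob")  # hypothetical file name
open(blob, "w").close()
os.chmod(blob, stat.S_IREAD)

remove_tree_with_readonly_files(root)
print(os.path.exists(root))  # False
```

Unlike ignore_errors=True, a failure that the retry cannot resolve still propagates to the caller.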

Alternatively, folders such as .git/ could be excluded from the initial copy operation. They are likely already listed in a .dockerignore file to keep them out of the Docker image in the final docker build step.
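
A sketch of the exclusion idea using only the standard library, assuming the project is copied with shutil.copytree. The helper name copy_project_without_git is hypothetical, and how MLflow actually performs the copy is not shown in this thread.

```python
import os
import shutil
import tempfile


def copy_project_without_git(src_dir, dst_dir):
    """Copy a project tree, skipping any entry named .git."""
    return shutil.copytree(
        src_dir, dst_dir, ignore=shutil.ignore_patterns(".git")
    )


# Usage: a project with a .git folder and an MLproject file at its root.
src = tempfile.mkdtemp()
os.makedirs(os.path.join(src, ".git", "objects"))
open(os.path.join(src, "MLproject"), "w").close()

# copytree creates the destination, so it must not exist yet.
dst = os.path.join(tempfile.mkdtemp(), "mlflow-project-contents")
copy_project_without_git(src, dst)
print(sorted(os.listdir(dst)))  # ['MLproject']
```

Because the read-only git objects never reach the temp directory, the later cleanup has nothing it cannot delete.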

@harupy
Member

harupy commented Dec 7, 2021

fixed by #4604

@harupy harupy closed this as completed Dec 7, 2021