v0.13.2 Cannot from accelerate import Accelerator in Windows since module 'signal' has no attribute 'SIGKILL' #817

Closed
4 tasks done
T-Atlas opened this issue Nov 3, 2022 · 7 comments · Fixed by #828

@T-Atlas

T-Atlas commented Nov 3, 2022

System Info

accelerate v0.13.2 (works normally with v0.12.0)
Windows 11
Python 3.9.13
torch 1.13.0+gpu

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • One of the scripts in the examples/ folder of Accelerate or an officially supported no_trainer script in the examples folder of the transformers repo (such as run_no_trainer_glue.py)
  • My own task or dataset (give details below)

Reproduction

from accelerate import Accelerator

Expected behavior

(d2l) PS C:\Users\LianJunhong\Desktop\eat_pytorch_in_20_days> accelerate env
NOTE: Redirects are currently not supported in Windows or MacOs.
Traceback (most recent call last):
  File "D:\Miniconda3\envs\d2l\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "D:\Miniconda3\envs\d2l\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "D:\Miniconda3\envs\d2l\Scripts\accelerate.exe\__main__.py", line 4, in <module>
  File "D:\Miniconda3\envs\d2l\lib\site-packages\accelerate\__init__.py", line 7, in <module>
    from .accelerator import Accelerator
  File "D:\Miniconda3\envs\d2l\lib\site-packages\accelerate\accelerator.py", line 27, in <module>
    from .checkpointing import load_accelerator_state, load_custom_state, save_accelerator_state, save_custom_state
  File "D:\Miniconda3\envs\d2l\lib\site-packages\accelerate\checkpointing.py", line 24, in <module>
    from .utils import (
  File "D:\Miniconda3\envs\d2l\lib\site-packages\accelerate\utils\__init__.py", line 96, in <module>
    from .launch import PrepareForLaunch, _filter_args, get_launch_prefix
  File "D:\Miniconda3\envs\d2l\lib\site-packages\accelerate\utils\launch.py", line 25, in <module>
    import torch.distributed.run as distrib_run
  File "D:\Miniconda3\envs\d2l\lib\site-packages\torch\distributed\run.py", line 386, in <module>
    from torch.distributed.launcher.api import LaunchConfig, elastic_launch
  File "D:\Miniconda3\envs\d2l\lib\site-packages\torch\distributed\launcher\__init__.py", line 10, in <module>
    from torch.distributed.launcher.api import (  # noqa: F401
  File "D:\Miniconda3\envs\d2l\lib\site-packages\torch\distributed\launcher\api.py", line 15, in <module>
    from torch.distributed.elastic.agent.server.api import WorkerSpec
  File "D:\Miniconda3\envs\d2l\lib\site-packages\torch\distributed\elastic\agent\server\__init__.py", line 40, in <module>
    from .local_elastic_agent import TORCHELASTIC_ENABLE_FILE_TIMER, TORCHELASTIC_TIMER_FILE
  File "D:\Miniconda3\envs\d2l\lib\site-packages\torch\distributed\elastic\agent\server\local_elastic_agent.py", line 19, in <module>
    import torch.distributed.elastic.timer as timer
  File "D:\Miniconda3\envs\d2l\lib\site-packages\torch\distributed\elastic\timer\__init__.py", line 44, in <module>
    from .file_based_local_timer import FileTimerClient, FileTimerServer, FileTimerRequest  # noqa: F401
  File "D:\Miniconda3\envs\d2l\lib\site-packages\torch\distributed\elastic\timer\file_based_local_timer.py", line 63, in <module>
    class FileTimerClient(TimerClient):
  File "D:\Miniconda3\envs\d2l\lib\site-packages\torch\distributed\elastic\timer\file_based_local_timer.py", line 81, in FileTimerClient
    def __init__(self, file_path: str, signal=signal.SIGKILL) -> None:
AttributeError: module 'signal' has no attribute 'SIGKILL'
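
The last frame fails at import time because a default argument is evaluated when the `def` statement runs, and Python only defines `signal.SIGKILL` on Unix. The following is a minimal sketch of guarding such a default, not the fix PyTorch or Accelerate actually shipped; `TimerClientSketch` and `DEFAULT_KILL_SIGNAL` are hypothetical names, and falling back to `SIGTERM` is an assumption made for illustration.

```python
import signal

# signal.SIGKILL is only defined on Unix; signal.SIGTERM exists on every platform.
# Assumption: SIGTERM is an acceptable fallback for this illustration.
DEFAULT_KILL_SIGNAL = getattr(signal, "SIGKILL", signal.SIGTERM)


class TimerClientSketch:
    """Hypothetical stand-in for a class whose __init__ has a signal default."""

    def __init__(self, file_path: str, kill_signal: int = DEFAULT_KILL_SIGNAL) -> None:
        # The default above was resolved once, when the class body executed,
        # so importing this module no longer raises AttributeError on Windows.
        self.file_path = file_path
        self.kill_signal = kill_signal


client = TimerClientSketch("timer.pipe")
# Prints SIGKILL on Unix and SIGTERM where SIGKILL is unavailable.
print(signal.Signals(client.kill_signal).name)
```

Resolving the attribute with `getattr` keeps the module importable on every platform; whether `SIGTERM` is a suitable substitute depends on what the caller needs the signal for.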
@sgugger
Collaborator

sgugger commented Nov 3, 2022

This comes from PyTorch directly; you can follow the issue and its resolution here. In the meantime, you should downgrade PyTorch to a version < 1.13.
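
A small sketch of checking whether an environment matches the combination reported in this thread (Windows with torch 1.13.0). `parse_release` and `matches_reported_failure` are hypothetical helpers, and the check deliberately covers only the exact version reported here; it makes no claim about other releases.

```python
import re


def parse_release(version: str) -> tuple:
    """Turn a string such as '1.13.0+cu117' into (1, 13, 0)."""
    public = version.split("+", 1)[0]  # drop the local build suffix, e.g. '+cu117'
    numbers = []
    for piece in public.split(".")[:3]:
        match = re.match(r"\d+", piece)  # keep only the leading digits of each part
        numbers.append(int(match.group()) if match else 0)
    while len(numbers) < 3:  # pad short versions such as '1.13'
        numbers.append(0)
    return tuple(numbers)


def matches_reported_failure(platform: str, torch_version: str) -> bool:
    """True only for the combination reported in this thread."""
    return platform.startswith("win") and parse_release(torch_version) == (1, 13, 0)


print(matches_reported_failure("win32", "1.13.0+cu117"))  # True
print(matches_reported_failure("win32", "1.12.1"))        # False
print(matches_reported_failure("linux", "1.13.0"))        # False
```

In a real environment the two arguments would be `sys.platform` and `torch.__version__`.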

@T-Atlas
Author

T-Atlas commented Nov 4, 2022

> This comes from PyTorch directly; you can follow the issue and its resolution here. In the meantime, you should downgrade PyTorch to a version < 1.13.

Thanks! I will follow up that issue

@versus666jzx

Same issue

  • python3.10
  • PyTorch 1.13.0+gpu
  • Win 11
File "C:\Users\Me\venv\lib\site-packages\torch\distributed\elastic\timer\file_based_local_timer.py", line 81, in FileTimerClient
    def __init__(self, file_path: str, signal=signal.SIGKILL) -> None:
AttributeError: module 'signal' has no attribute 'SIGKILL'. Did you mean: 'SIGILL'?

Downgrading torch to 1.12.1 solves this.

@T-Atlas
Author

T-Atlas commented Nov 8, 2022

> Same issue
>
>   • python3.10
>   • PyTorch 1.13.0+gpu
>   • Win 11
> File "C:\Users\Me\venv\lib\site-packages\torch\distributed\elastic\timer\file_based_local_timer.py", line 81, in FileTimerClient
>     def __init__(self, file_path: str, signal=signal.SIGKILL) -> None:
> AttributeError: module 'signal' has no attribute 'SIGKILL'. Did you mean: 'SIGILL'?
>
> Downgrading torch to 1.12.1 solves this.

I think so, but if you need PyTorch 1.13.0, you can use accelerate v0.12.0 instead.

@muellerzr
Collaborator

@T-Atlas or @versus666jzx can you verify that retrying and installing accelerate with pip install git+https://github.com/huggingface/accelerate fixes your issues? Thanks!

@T-Atlas
Author

T-Atlas commented Nov 8, 2022

@muellerzr Bug fixed with accelerate-0.14.0.dev0. Thanks!

@versus666jzx

@muellerzr I confirm that the bug has been fixed. Thanks a lot!

  • torch - 1.13.0+cu117
  • accelerate - 0.14.0.dev0
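
For anyone confirming the same fix, here is a minimal sketch for listing the installed versions without importing either package, so it works even while the import itself still fails. `installed_version` is a hypothetical helper built on the standard-library `importlib.metadata` module (Python 3.8+).

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed distribution version, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Reads package metadata only; neither torch nor accelerate is imported.
for name in ("torch", "accelerate"):
    print(name, "-", installed_version(name) or "not installed")
```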
