Error when trying to install horovod #3306

Closed
n-balla opened this issue Dec 8, 2021 · 8 comments

n-balla commented Dec 8, 2021

Hello,
I've been working on this for a while, but haven't been able to figure it out yet. I would appreciate your help.

I am still not able to install horovod, although I did install all the requirements specified in the documentation.

I am getting this long error:

Installing collected packages: psutil, cloudpickle, cffi, horovod
Running setup.py install for psutil ... error
ERROR: Command errored out with exit status 1:
command: 'C:\Users\Munira\AppData\Local\Programs\Python\Python310\python.exe' -u -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\Users\Munira\AppData\Local\Temp\pip-install-cqd15u3\psutil_f9f6130583a240c5833aaefb1d9da5e6\setup.py'"'"'; file='"'"'C:\Users\Munira\AppData\Local\Temp\pip-install-cqd15u3\psutil_f9f6130583a240c5833aaefb1d9da5e6\setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(file) if os.path.exists(file) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' install --record 'C:\Users\Munira\AppData\Local\Temp\pip-record-o4cdua9y\install-record.txt' --single-version-externally-managed --compile --install-headers 'C:\Users\Munira\AppData\Local\Programs\Python\Python310\Include\psutil'
cwd: C:\Users\Munira\AppData\Local\Temp\pip-install-cqd15u3\psutil_f9f6130583a240c5833aaefb1d9da5e6
Complete output (38 lines):
running install
running build
running build_py
creating build
creating build\lib.win32-3.10
creating build\lib.win32-3.10\psutil
copying psutil\_common.py -> build\lib.win32-3.10\psutil
copying psutil\_compat.py -> build\lib.win32-3.10\psutil
copying psutil\_psaix.py -> build\lib.win32-3.10\psutil
copying psutil\_psbsd.py -> build\lib.win32-3.10\psutil
copying psutil\_pslinux.py -> build\lib.win32-3.10\psutil
copying psutil\_psosx.py -> build\lib.win32-3.10\psutil
copying psutil\_psposix.py -> build\lib.win32-3.10\psutil
copying psutil\_pssunos.py -> build\lib.win32-3.10\psutil
copying psutil\_pswindows.py -> build\lib.win32-3.10\psutil
copying psutil\__init__.py -> build\lib.win32-3.10\psutil
creating build\lib.win32-3.10\psutil\tests
copying psutil\tests\runner.py -> build\lib.win32-3.10\psutil\tests
copying psutil\tests\test_aix.py -> build\lib.win32-3.10\psutil\tests
copying psutil\tests\test_bsd.py -> build\lib.win32-3.10\psutil\tests
copying psutil\tests\test_connections.py -> build\lib.win32-3.10\psutil\tests
copying psutil\tests\test_contracts.py -> build\lib.win32-3.10\psutil\tests
copying psutil\tests\test_linux.py -> build\lib.win32-3.10\psutil\tests
copying psutil\tests\test_memleaks.py -> build\lib.win32-3.10\psutil\tests
copying psutil\tests\test_misc.py -> build\lib.win32-3.10\psutil\tests
copying psutil\tests\test_osx.py -> build\lib.win32-3.10\psutil\tests
copying psutil\tests\test_posix.py -> build\lib.win32-3.10\psutil\tests
copying psutil\tests\test_process.py -> build\lib.win32-3.10\psutil\tests
copying psutil\tests\test_sunos.py -> build\lib.win32-3.10\psutil\tests
copying psutil\tests\test_system.py -> build\lib.win32-3.10\psutil\tests
copying psutil\tests\test_testutils.py -> build\lib.win32-3.10\psutil\tests
copying psutil\tests\test_unicode.py -> build\lib.win32-3.10\psutil\tests
copying psutil\tests\test_windows.py -> build\lib.win32-3.10\psutil\tests
copying psutil\tests\__init__.py -> build\lib.win32-3.10\psutil\tests
copying psutil\tests\__main__.py -> build\lib.win32-3.10\psutil\tests
running build_ext
building 'psutil._psutil_windows' extension
error: Microsoft Visual C++ 14.0 or greater is required. Get it with "Microsoft C++ Build Tools": https://visualstudio.microsoft.com/visual-cpp-build-tools/
----------------------------------------
ERROR: Command errored out with exit status 1: 'C:\Users\Munira\AppData\Local\Programs\Python\Python310\python.exe' -u -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\Users\Munira\AppData\Local\Temp\pip-install-_cqd15u3\psutil_f9f6130583a240c5833aaefb1d9da5e6\setup.py'"'"'; file='"'"'C:\Users\Munira\AppData\Local\Temp\pip-install-_cqd15u3\psutil_f9f6130583a240c5833aaefb1d9da5e6\setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(file) if os.path.exists(file) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' install --record 'C:\Users\Munira\AppData\Local\Temp\pip-record-o4cdua9y\install-record.txt' --single-version-externally-managed --compile --install-headers 'C:\Users\Munira\AppData\Local\Programs\Python\Python310\Include\psutil' Check the logs for full command output.
WARNING: You are using pip version 21.2.4; however, version 21.3.1 is available.
You should consider upgrading via the 'C:\Users\Munira\AppData\Local\Programs\Python\Python310\python.exe -m pip install --upgrade pip' command.

I tried reinstalling the most recent g++, but I still get the same error.

Can anyone help please?

Thank you,
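
In a long pip build log like the one above, the line that matters is usually the one the build tool itself printed, not pip's wrapper message. A minimal sketch for pulling those lines out, assuming (as in the logs in this thread) that the build tool and compiler write a lowercase `error:` while pip's wrapper writes an uppercase `ERROR:`. The function name `find_root_errors` is hypothetical, and the match is a heuristic, not a pip feature.

```python
def find_root_errors(log_text):
    """Return the stripped log lines that contain a lowercase 'error:'.

    Heuristic only: it matches lines such as
    'error: Microsoft Visual C++ 14.0 or greater is required.' and
    'cuda_util.cc:34:5: error: ...', and skips pip's 'ERROR: ...' wrapper.
    """
    return [
        line.strip()
        for line in log_text.splitlines()
        if "error:" in line  # case-sensitive on purpose
    ]


if __name__ == "__main__":
    sample = (
        "running build_ext\n"
        "building 'psutil._psutil_windows' extension\n"
        "error: Microsoft Visual C++ 14.0 or greater is required.\n"
        "ERROR: Command errored out with exit status 1\n"
    )
    for line in find_root_errors(sample):
        print(line)
```

Applied to the log above, this would single out the missing Microsoft Visual C++ build tools as the reason the psutil build stopped.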

Author

n-balla commented Dec 8, 2021

My device specifications:
Processor: Intel(R) Core(TM) i9-7920X CPU @ 2.90GHz 2.90 GHz
Installed RAM: 32.0 GB (31.9 GB usable)
System type: 64-bit operating system, x64-based processor

Author

n-balla commented Dec 8, 2021

@tgaddair I would really appreciate your help.

Also, I am very new to this. I've been reading the documentation for the past few days, but I still need some help. I need to set up a mini distributed environment on a single machine (using Horovod with TensorFlow). I wrote a new compression class that I need to test.

Thanks a lot!
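
Before a distributed setup is available, a compression class can be exercised on its own with a round-trip check. The sketch below assumes the custom class follows a `compress` / `decompress` pair that passes a context object between the two calls; that shape, the class name `Fp16ListCompressor`, and the helper `max_roundtrip_error` are all assumptions made for illustration. It uses plain Python lists and the standard `struct` module instead of TensorFlow tensors, so it runs without Horovod installed.

```python
import struct


class Fp16ListCompressor:
    """Hypothetical stand-in for a custom compressor (not Horovod code).

    Works on plain lists of floats so it can run without TensorFlow.
    """

    @staticmethod
    def compress(values):
        # Pack each float as IEEE 754 half precision (struct format 'e').
        packed = struct.pack(f"<{len(values)}e", *values)
        ctx = len(values)  # context needed to undo the compression
        return packed, ctx

    @staticmethod
    def decompress(packed, ctx):
        # Unpack ctx half-precision values back into Python floats.
        return list(struct.unpack(f"<{ctx}e", packed))


def max_roundtrip_error(compressor, values):
    """Compress, decompress, and return the largest absolute difference."""
    packed, ctx = compressor.compress(values)
    restored = compressor.decompress(packed, ctx)
    return max((abs(a - b) for a, b in zip(values, restored)), default=0.0)


if __name__ == "__main__":
    print(max_roundtrip_error(Fp16ListCompressor, [0.5, 1.0, -2.25, 0.1]))
```

A check like this separates bugs in the compression logic from problems with the distributed setup itself.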

Collaborator

maxhgerlach commented Dec 8, 2021

Hi @n-balla, you can't use Horovod on Windows.

Edit: Maybe WSL could work, but I don't have any experience there.
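
A quick way to tell which of these situations a given Python interpreter is in is sketched below. It assumes the widely used heuristic that a WSL kernel release string contains "microsoft"; this is a convention rather than an official API, and the function name `horovod_platform_hint` is hypothetical. It only reports the platform. Whether Horovod actually works under WSL is not confirmed in this thread.

```python
import platform


def horovod_platform_hint(system=None, release=None):
    """Classify the current platform as 'windows', 'wsl', or 'other'.

    Arguments default to the values reported by the platform module;
    they can be passed explicitly to try the function on other inputs.
    """
    system = platform.system() if system is None else system
    release = platform.release() if release is None else release
    if system == "Windows":
        return "windows"  # native Windows Python, as in the log above
    if system == "Linux" and "microsoft" in release.lower():
        return "wsl"  # heuristic: WSL kernels mention Microsoft
    return "other"


if __name__ == "__main__":
    print(horovod_platform_hint())
```

The paths in the original log (`C:\Users\...\Python310\python.exe`) show a native Windows interpreter, which this function would report as `windows`.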


floatshadow commented Dec 12, 2021

@n-balla @maxhgerlach I get a similar error on Debian 11 with PyTorch.
I installed PyTorch from source, and the NCCL version is 2.10.3.
At first I just ran

HOROVOD_GPU_OPERATIONS=NCCL pip install --no-cache-dir horovod

and got the same error.

After adding the path to the NCCL library, I ran

HOROVOD_NCCL_HOME=/path/to/nccl HOROVOD_GPU_OPERATIONS=NCCL pip install --no-cache-dir horovod

which returns this error:

[ 92%] Building CXX object horovod/torch/CMakeFiles/pytorch.dir/cuda_util.cc.o
cd /tmp/pip-install-atu6vpea/horovod_6ea8f5b920194ed48c773c423bbc73c0/build/temp.linux-x86_64-3.9/RelWithDebInfo/horovod/torch && /usr/bin/c++ -DEIGEN_MPL2_ONLY=1 -DHAVE_CUDA=1 -DHAVE_GLOO=1 -DHAVE_GPU=1 -DHAVE_NCCL=1 -DHAVE_NVTX=1 -DHOROVOD_GPU_ALLGATHER=78 -DHOROVOD_GPU_ALLREDUCE=78 -DHOROVOD_GPU_ALLTOALL=78 -DHOROVOD_GPU_BROADCAST=78 -DTORCH_API_INCLUDE_EXTENSION_H=1 -DTORCH_VERSION=1011000000 -Dpytorch_EXPORTS -I/tmp/pip-install-atu6vpea/horovod_6ea8f5b920194ed48c773c423bbc73c0/third_party/HTTPRequest/include -I/tmp/pip-install-atu6vpea/horovod_6ea8f5b920194ed48c773c423bbc73c0/third_party/boost/assert/include -I/tmp/pip-install-atu6vpea/horovod_6ea8f5b920194ed48c773c423bbc73c0/third_party/boost/config/include -I/tmp/pip-install-atu6vpea/horovod_6ea8f5b920194ed48c773c423bbc73c0/third_party/boost/core/include -I/tmp/pip-install-atu6vpea/horovod_6ea8f5b920194ed48c773c423bbc73c0/third_party/boost/detail/include -I/tmp/pip-install-atu6vpea/horovod_6ea8f5b920194ed48c773c423bbc73c0/third_party/boost/iterator/include -I/tmp/pip-install-atu6vpea/horovod_6ea8f5b920194ed48c773c423bbc73c0/third_party/boost/lockfree/include -I/tmp/pip-install-atu6vpea/horovod_6ea8f5b920194ed48c773c423bbc73c0/third_party/boost/mpl/include -I/tmp/pip-install-atu6vpea/horovod_6ea8f5b920194ed48c773c423bbc73c0/third_party/boost/parameter/include -I/tmp/pip-install-atu6vpea/horovod_6ea8f5b920194ed48c773c423bbc73c0/third_party/boost/predef/include -I/tmp/pip-install-atu6vpea/horovod_6ea8f5b920194ed48c773c423bbc73c0/third_party/boost/preprocessor/include -I/tmp/pip-install-atu6vpea/horovod_6ea8f5b920194ed48c773c423bbc73c0/third_party/boost/static_assert/include -I/tmp/pip-install-atu6vpea/horovod_6ea8f5b920194ed48c773c423bbc73c0/third_party/boost/type_traits/include -I/tmp/pip-install-atu6vpea/horovod_6ea8f5b920194ed48c773c423bbc73c0/third_party/boost/utility/include -I/tmp/pip-install-atu6vpea/horovod_6ea8f5b920194ed48c773c423bbc73c0/third_party/lbfgs/include 
-I/tmp/pip-install-atu6vpea/horovod_6ea8f5b920194ed48c773c423bbc73c0/third_party/gloo -I/tmp/pip-install-atu6vpea/horovod_6ea8f5b920194ed48c773c423bbc73c0/third_party/eigen -I/tmp/pip-install-atu6vpea/horovod_6ea8f5b920194ed48c773c423bbc73c0/third_party/flatbuffers/include -isystem /home/zsy/pytorch/build/nccl/include -isystem /home/zsy/miniconda3/envs/torch/lib/python3.9/site-packages/torch/include -isystem /home/zsy/miniconda3/envs/torch/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -isystem /home/zsy/miniconda3/envs/torch/lib/python3.9/site-packages/torch/include/TH -isystem /home/zsy/miniconda3/envs/torch/lib/python3.9/site-packages/torch/include/THC -isystem /home/zsy/miniconda3/envs/torch/include/python3.9 -D_GLIBCXX_USE_CXX11_ABI=1 -pthread -fPIC -Wall -ftree-vectorize -mf16c -mavx -mfma -O3 -g -DNDEBUG -fPIC -std=c++14 -MD -MT horovod/torch/CMakeFiles/pytorch.dir/cuda_util.cc.o -MF CMakeFiles/pytorch.dir/cuda_util.cc.o.d -o CMakeFiles/pytorch.dir/cuda_util.cc.o -c /tmp/pip-install-atu6vpea/horovod_6ea8f5b920194ed48c773c423bbc73c0/horovod/torch/cuda_util.cc
/tmp/pip-install-atu6vpea/horovod_6ea8f5b920194ed48c773c423bbc73c0/horovod/torch/cuda_util.cc: In constructor ‘horovod::torch::with_device::with_device(int)’:
/tmp/pip-install-atu6vpea/horovod_6ea8f5b920194ed48c773c423bbc73c0/horovod/torch/cuda_util.cc:34:5: error: ‘THCudaCheck’ was not declared in this scope
34 | THCudaCheck(cudaGetDevice(&restore_device_));
| ^~~~~~~~~~~
/tmp/pip-install-atu6vpea/horovod_6ea8f5b920194ed48c773c423bbc73c0/horovod/torch/cuda_util.cc: In destructor ‘horovod::torch::with_device::~with_device()’:
/tmp/pip-install-atu6vpea/horovod_6ea8f5b920194ed48c773c423bbc73c0/horovod/torch/cuda_util.cc:46:5: error: ‘THCudaCheck’ was not declared in this scope
46 | THCudaCheck(cudaSetDevice(restore_device_));
| ^~~~~~~~~~~
gmake[2]: *** [horovod/torch/CMakeFiles/pytorch.dir/build.make:510: horovod/torch/CMakeFiles/pytorch.dir/cuda_util.cc.o] Error 1
gmake[2]: Leaving directory '/tmp/pip-install-atu6vpea/horovod_6ea8f5b920194ed48c773c423bbc73c0/build/temp.linux-x86_64-3.9/RelWithDebInfo'
gmake[1]: *** [CMakeFiles/Makefile2:300: horovod/torch/CMakeFiles/pytorch.dir/all] Error 2
gmake[1]: Leaving directory '/tmp/pip-install-atu6vpea/horovod_6ea8f5b920194ed48c773c423bbc73c0/build/temp.linux-x86_64-3.9/RelWithDebInfo'
gmake: *** [Makefile:136: all] Error 2

@maxhgerlach
Collaborator

That's a very different error message and doesn't match this thread. It also does not look related to NCCL. Is this with PyTorch master or with a released version? We might have a build problem with nightly torch at the moment.
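
The compile command in the log carries a hint about the torch version the build detected: `-DTORCH_VERSION=1011000000`. The sketch below decodes it under the assumption that the value packs major, minor, and patch into consecutive three-digit decimal fields. That layout is inferred from the shape of the number and is not confirmed anywhere in this thread; `decode_torch_version` is a hypothetical helper.

```python
def decode_torch_version(define_value):
    """Split a packed integer into (major, minor, patch).

    Assumed layout (not confirmed in this thread):
    major * 10**9 + minor * 10**6 + patch * 10**3.
    """
    major = define_value // 10**9
    minor = (define_value // 10**6) % 1000
    patch = (define_value // 10**3) % 1000
    return major, minor, patch


if __name__ == "__main__":
    # Value copied from the -DTORCH_VERSION flag in the build log above.
    print(decode_torch_version(1011000000))
```

If the assumed layout holds, the value in the log corresponds to 1.11.0, which would be consistent with a development checkout of PyTorch and not with a tagged release.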

@floatshadow

@maxhgerlach Many thanks :)
I have two Tesla K40m GPUs for profiling different multi-GPU strategies on models.
Since compute capability (CC) 3.5 is not supported by the official PyTorch binaries, we compile PyTorch from source with conda.

I cloned the official GitHub repo last month and built the released 1.10 version with gcc 10.2.1.

@maxhgerlach
Collaborator

@floatshadow, I just took another look at this. You might want to try again with the current master version of Horovod. The dependence on THCudaCheck was removed in November (to fix build issues with torch): #3242
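
A sketch of how the install command from earlier in the thread could be adapted to build from master. The environment variables and the `--no-cache-dir` flag come from the commands posted above. The `git+https://github.com/horovod/horovod.git` target is an assumption (the thread does not say how to install from master), `/path/to/nccl` is the placeholder from the original command, and both helper names are hypothetical. The code only assembles and prints the command; it does not run pip.

```python
import shlex


def build_horovod_install(nccl_home=None, from_master=False):
    """Return (env, argv) for a Horovod pip install with NCCL enabled."""
    env = {"HOROVOD_GPU_OPERATIONS": "NCCL"}
    if nccl_home:
        env["HOROVOD_NCCL_HOME"] = nccl_home
    # Assumption: master can be installed straight from the Git repository.
    target = (
        "git+https://github.com/horovod/horovod.git" if from_master else "horovod"
    )
    argv = ["pip", "install", "--no-cache-dir", target]
    return env, argv


def as_shell_line(env, argv):
    """Render env and argv as a single copy-pasteable shell line."""
    prefix = " ".join(f"{key}={shlex.quote(value)}" for key, value in env.items())
    return f"{prefix} {shlex.join(argv)}"


if __name__ == "__main__":
    env, argv = build_horovod_install(nccl_home="/path/to/nccl", from_master=True)
    print(as_shell_line(env, argv))
```

Keeping `--no-cache-dir` matters here because a cached wheel from the failed release build could otherwise be reused.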


stale bot commented Mar 6, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale bot added the wontfix label on Mar 6, 2022
stale bot closed this as completed on Mar 13, 2022