tensorflow==1.15.2 package on PyPI cannot find GPU devices #36476

Closed
velovix opened this issue Feb 4, 2020 · 7 comments
Assignees
Labels
stat:awaiting response Status - Awaiting response from author subtype: ubuntu/linux Ubuntu/Linux Build/Installation Issues TF 1.15 for issues seen on TF 1.15 type:build/install Build and install issues

Comments

@velovix

velovix commented Feb 4, 2020

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 18.04 (in Docker)
  • TensorFlow installed from (source or binary): Binary
  • TensorFlow version: 1.15.2
  • Python version: 3.6
  • Installed using virtualenv? pip? conda?: pip
  • CUDA/cuDNN version: 10.0
  • GPU model and memory: GTX 1080 Ti 12 GB

Describe the problem

When using the tensorflow package at version 1.15.2, TensorFlow does not find my GPU. The tensorflow-gpu package works, but I was under the impression that from 1.15.0 onward, the tensorflow package would be capable of running with or without a GPU.

Looking at PyPI, I notice there isn't a tensorflow-cpu package released for 1.15.2. Has the project switched back to using the old model where only tensorflow-gpu has GPU support?

Provide the exact sequence of commands / steps that you executed before running into the problem

  1. pip3 install tensorflow==1.15.2
  2. Run the following script:
    from tensorflow.python.client import device_lib
    print(device_lib.list_local_devices())
  3. Observe that CPU devices are found, but GPU devices are not
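
One way to narrow down step 3 is to separate "the binary has no CUDA support" from "the binary has CUDA support but sees no device". The sketch below is an illustration added under assumptions, not part of the original report: diagnose and diagnose_installed_tensorflow are hypothetical helper names, while tf.test.is_built_with_cuda() and device_lib.list_local_devices() are existing TensorFlow calls.

```python
def diagnose(built_with_cuda, device_types):
    """Classify a TensorFlow install from two observations.

    built_with_cuda: the value returned by tf.test.is_built_with_cuda()
    device_types: device_type strings reported by list_local_devices()
    """
    if "GPU" in device_types:
        return "gpu-visible"
    if not built_with_cuda:
        return "cpu-only-build"
    return "cuda-build-but-no-gpu-visible"


def diagnose_installed_tensorflow():
    """Gather both observations from the installed TensorFlow.

    TensorFlow is imported lazily so that diagnose() stays usable
    (and testable) on machines without TensorFlow installed.
    """
    import tensorflow as tf
    from tensorflow.python.client import device_lib

    device_types = [d.device_type for d in device_lib.list_local_devices()]
    return diagnose(tf.test.is_built_with_cuda(), device_types)
```

Calling print(diagnose_installed_tensorflow()) inside the container would then give one of three labels. A result of cpu-only-build would point at the wheel itself rather than at the Docker or driver setup.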

Any other info / logs

I've been testing this in a Docker container that uses NVIDIA's CUDA images. I have nvidia-docker2 installed and can confirm that the tensorflow 1.15.0 package is capable of finding the GPU device.

FROM nvidia/cuda:10.0-cudnn7-runtime-ubuntu18.04

RUN apt-get update && apt-get install -y python3-pip
RUN pip3 install --upgrade pip
RUN pip3 install tensorflow==1.15.2

ENTRYPOINT ["python3", "-c", "from tensorflow.python.client import device_lib; print(device_lib.list_local_devices())"]

I get the following logs as a result:

2020-02-04 22:29:15.945515: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-02-04 22:29:15.971253: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3393255000 Hz
2020-02-04 22:29:15.972730: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x45f0770 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-02-04 22:29:15.972756: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 14247052760952852178
, name: "/device:XLA_CPU:0"
device_type: "XLA_CPU"
memory_limit: 17179869184
locality {
}
incarnation: 3258632196886034744
physical_device_desc: "device: XLA_CPU device"
]
@zxydi1992

Just found the same issue.

@gadagashwini-zz gadagashwini-zz added the TF 1.15 for issues seen on TF 1.15 label Feb 5, 2020
@gadagashwini-zz
Contributor

@velovix, TensorFlow 1.15.2 has both GPU and CPU support. Please take a look at the gist. To see available GPU devices, use

import tensorflow as tf
print("Num GPUs Available: ", len(tf.config.experimental.list_physical_devices('GPU')))

Thanks!

@gadagashwini-zz gadagashwini-zz added the stat:awaiting response Status - Awaiting response from author label Feb 5, 2020
@velovix
Author

velovix commented Feb 5, 2020

@gadagashwini

Using the gist you linked, if I restart the runtime and run all steps, I get no GPUs detected.

Num GPUs Available:  0

I don't know a lot about Colab, so I'm not sure why that would be.

When testing locally using list_physical_devices instead of list_local_devices, I'm still not able to detect GPUs with 1.15.2.

I still suspect that tensorflow==1.15.2 was not built with GPU support, though. The wheel file sizes look suspect to me:

Wheel Name                                                   File Size
tensorflow-1.15.0-cp37-cp37m-manylinux2010_x86_64.whl        394M
tensorflow_gpu-1.15.0-cp37-cp37m-manylinux2010_x86_64.whl    393M
tensorflow-1.15.2-cp36-cp36m-manylinux2010_x86_64.whl        106M
tensorflow_gpu-1.15.2-cp37-cp37m-manylinux2010_x86_64.whl    392M

If we compare file sizes between tensorflow==1.15.0 and tensorflow-gpu==1.15.0, they're very similar. However, you can see that tensorflow==1.15.2 is significantly smaller than tensorflow-gpu==1.15.2. This is admittedly unscientific, but does suggest to me that something is going on.
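
The size comparison can be made explicit with a few lines of Python. This is only a rough sketch: the sizes are the ones listed in the table above, and the 0.5 threshold is an arbitrary assumption chosen for illustration, not a documented rule about TensorFlow wheels.

```python
# Wheel sizes in MB, copied from the table in this comment.
WHEEL_SIZES_MB = {
    "1.15.0": {"tensorflow": 394, "tensorflow_gpu": 393},
    "1.15.2": {"tensorflow": 106, "tensorflow_gpu": 392},
}

# Assumed threshold: flag the tensorflow wheel if it is less than
# half the size of the tensorflow_gpu wheel for the same version.
SUSPICIOUS_RATIO = 0.5


def size_ratio(version):
    """Size of the tensorflow wheel relative to the tensorflow_gpu wheel."""
    sizes = WHEEL_SIZES_MB[version]
    return sizes["tensorflow"] / sizes["tensorflow_gpu"]


def looks_cpu_only(version):
    """Heuristic only: a much smaller wheel hints at missing GPU kernels."""
    return size_ratio(version) < SUSPICIOUS_RATIO


for version in sorted(WHEEL_SIZES_MB):
    print(version, round(size_ratio(version), 2), looks_cpu_only(version))
```

With these numbers the two 1.15.0 wheels are nearly the same size, while the tensorflow 1.15.2 wheel is roughly a quarter of its tensorflow_gpu counterpart, so only 1.15.2 is flagged.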

@gadagashwini-zz gadagashwini-zz removed the stat:awaiting response Status - Awaiting response from author label Feb 6, 2020
@gadagashwini-zz
Contributor

@velovix, on Colab the runtime type must be changed to GPU before using tensorflow-gpu 1.15.
According to the TensorFlow release notes for 1.15.2, the installation package has not changed; it is the same as in TF 1.15. Please take a look at the screenshot:
Screenshot from 2020-02-06 12-26-10
Thanks

@gadagashwini-zz gadagashwini-zz added the stat:awaiting response Status - Awaiting response from author label Feb 6, 2020
@mihaimaruseac
Collaborator

Duplicate of #36347

Since the single package for 1.15 was done via a quick workaround, we were not able to replicate it for the patch release, so we reverted to dual pip packages (tensorflow and tensorflow-gpu). Please see the duplicate issue for more details.
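
For anyone who wants to check which of the two pip packages an environment actually has, a minimal sketch follows. It relies only on what this thread states (1.15.0 shipped a single GPU-capable package, 1.15.2 went back to dual packages) and makes no claim about other versions. installed_version and gpu_support_hint are hypothetical helper names; importlib.metadata is in the standard library from Python 3.8, and the importlib_metadata backport is assumed to be installed on Python 3.6/3.7.

```python
try:
    from importlib import metadata  # standard library on Python 3.8+
except ImportError:
    import importlib_metadata as metadata  # backport for Python 3.6/3.7


def installed_version(distribution):
    """Return the installed version of a pip distribution, or None if absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


def gpu_support_hint(tensorflow_version, tensorflow_gpu_version):
    """Map installed versions to a hint, using only facts from this thread."""
    if tensorflow_gpu_version is not None:
        return "tensorflow-gpu is installed"
    if tensorflow_version is None:
        return "no TensorFlow distribution found"
    if tensorflow_version == "1.15.0":
        return "tensorflow 1.15.0: single package with GPU support"
    if tensorflow_version == "1.15.2":
        return "tensorflow 1.15.2: CPU only, install tensorflow-gpu==1.15.2"
    return "not covered by this thread"


print(gpu_support_hint(installed_version("tensorflow"),
                       installed_version("tensorflow-gpu")))
```

The hint function takes plain version strings so it can be exercised without TensorFlow installed; only the final print inspects the current environment.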

@velovix
Author

velovix commented Feb 6, 2020

Thank you @mihaimaruseac for the pointer! We'll just use tensorflow-gpu instead going forward.

Would it be reasonable to add this to the patch notes for the 1.15.2 release on GitHub? That's where I looked for breaking changes and didn't see anything about this.

@mihaimaruseac
Collaborator

That makes sense. I have edited the release notes.

@gadagashwini-zz gadagashwini-zz added subtype: ubuntu/linux Ubuntu/Linux Build/Installation Issues type:build/install Build and install issues labels Feb 7, 2020
tacazares pushed a commit to MiraldiLab/maxATAC that referenced this issue Jun 4, 2021
I updated tensorflow to the latest version because we could not detect a GPU. See this post for more information. tensorflow/tensorflow#36476
4 participants