How to enable VkFFTBackend when building itk 5.3.0 on windows #64
Comments
Montage might benefit from doing FFTs on the GPU, but the GPU needs to be more powerful than the CPU, and the tiles need to be big enough to justify the time it takes to transfer them to the GPU. The only sure way to know is to try it. @Leengit or @tbirdso might be able to help with building, but you need to describe the problem better, e.g. provide the error message. |
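The trade-off described above can be written as a simple break-even check. This is only a sketch: the function name `gpu_fft_pays_off` is hypothetical, the model ignores everything except one upload, one download, and the FFT itself, and the numbers are illustrative placeholders rather than measurements from any real CPU or GPU.

```python
def gpu_fft_pays_off(tile_bytes, cpu_fft_s, gpu_fft_s, bus_bytes_per_s):
    """Return True if upload + GPU FFT + download beats the CPU FFT.

    All inputs are assumptions supplied by the caller; measure them on
    your own hardware before drawing conclusions.
    """
    transfer_s = 2 * tile_bytes / bus_bytes_per_s  # host -> device and back
    return gpu_fft_s + transfer_s < cpu_fft_s


# Illustrative numbers only (not benchmarks).
tile_bytes = 2048 * 2048 * 4  # one 2048x2048 tile of 32-bit samples

# Slow CPU FFT: the transfer cost is easily amortized.
print(gpu_fft_pays_off(tile_bytes, cpu_fft_s=0.050, gpu_fft_s=0.005, bus_bytes_per_s=8e9))

# CPU FFT nearly as fast as the GPU FFT: the transfer cost dominates.
print(gpu_fft_pays_off(tile_bytes, cpu_fft_s=0.006, gpu_fft_s=0.005, bus_bytes_per_s=8e9))
```

With these placeholder inputs the first call returns `True` and the second `False`, which is the "try it and see" situation in miniature: the answer flips depending on how the measured CPU time compares with GPU time plus transfer.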
Thanks for the reply, here is my CMake config.
Sorry for the Chinese in the message, FYI: |
Hi @zhusihan-python, a few thoughts:
Does ITK 5.3.0 compile successfully for you when
To compile ITKVkFFTBackend as a remote module alongside ITK, set the |
Hi Tom (@tbirdso), I have already built ITK 5.3.0, and ITK 5.3.0 with ITK_USE_MKL ON, successfully on Windows 10. However, compilation fails with ITK_USE_CUFFTW ON, with FFTWD FFTWF ITK_USE_FFTWF_DEFAULT ON, or with ITK_USE_CUFFTW FFTWD FFTWF ITK_USE_FFTWF_DEFAULT ON. I want to compare the performance of ITK Montage's CompleteMontage with the default FFT, MKL FFT, and CUDA FFT backends. By default, Module_VkFFTBackend did not show up in the configure step, so I tried moving this part from Modules/Remote/CmakeLists to the root CmakeLists
or adding an entry by hand |
Try setting |
I found that CMake's advanced mode shows the remote module entries, so I enabled Module_VkFFTBackend and Module_Montage and removed ITK_WRAP_PYTHON,
Full log: build_vkfft_nopy.txt |
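For reference, the same option set can be passed on the command line instead of toggling entries in the CMake GUI. The sketch below only assembles and prints the command. The helper name `itk_configure_command` and both directory names are placeholders, and the thread does not confirm that these three options alone give a working build. The `-S`, `-B`, and `-D<name>:BOOL=<value>` forms are standard CMake syntax.

```python
import shlex


def itk_configure_command(source_dir, build_dir, options):
    """Build a `cmake` configure command line from a dict of boolean cache options."""
    cmd = ["cmake", "-S", source_dir, "-B", build_dir]
    for name, enabled in options.items():
        cmd.append(f"-D{name}:BOOL={'ON' if enabled else 'OFF'}")
    return cmd


# Option names are the ones mentioned in this thread.
options = {
    "Module_VkFFTBackend": True,
    "Module_Montage": True,
    "ITK_WRAP_PYTHON": False,
}

# "ITK-5.3.0" and "ITK-build" are placeholder paths; use your own.
cmd = itk_configure_command("ITK-5.3.0", "ITK-build", options)
print(shlex.join(cmd))
```

Configuring into a fresh, empty build directory avoids stale cache entries from earlier attempts.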
Is this InsightSoftwareConsortium/ITKMontage#214 reappearing? Does the error persist if you do a clean build, in a new directory? |
Yes. I compiled several times, each in a separate destination directory from a separate copy of the source code. The difference is that last time, in the issue you referred to, I did not set cuFFT ON. |
The error message looks similar (FFT image filter) but is not exactly the same. Maybe it has the same cause. |
Hi @zhusihan-python , for your use case would it be reasonable to install and use prebuilt ITKVkFFTBackend OpenCL Python packages instead of building them yourself, similar to the workaround in InsightSoftwareConsortium/ITKMontage#214?
$ python -m pip install itk==5.3.0 itk-montage itk-vkfft
|
Hi @tbirdso, I can use the 5.2.1 CUDA version for now. I was just curious about the speed of 5.3.0 with the VkFFT backend. I hope I can try it in a following release. |
ITKPython has a big startup cost. Just loading all the relevant DLLs takes ~10 seconds. But most of the computation is done in C++ libraries. I wonder whether you have a relatively slow graphics card, which would make the Python+vkfft slower. @tbirdso is there a way to disable vkFFT via environment variable or something similar? That way @zhusihan-python could compare C++ CPU vs Python CPU, and test my theory of a slow graphics card. |
The Python ITK I tested is 5.3.0 installed from PyPI. I think the VkFFT backend is not enabled by default, right? |
Correct, it is not enabled by default. Could it be that the Python version is single-threaded? If you look at CPU usage when you run, does the Python version use all CPU cores? |
Then I don't know why the Python variant is that much slower. |
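One quick way to check which of the relevant packages are present in the active Python environment is to query the installed distributions with the standard library. This is a sketch: the distribution names are taken from the pip command quoted earlier in the thread, `installed_version` is a hypothetical helper, and a package being installed does not by itself prove that the GPU backend is actually used at run time.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Distribution names as used in the pip command earlier in this thread.
for name in ("itk", "itk-montage", "itk-vkfft"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```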
Hi @zhusihan-python , there are a couple of things that might be happening here. Echoing @dzenanz 's note, ITK uses lazy library loading and as a result the first ITK filter execution in a script can seem to take much longer to execute. To account for this we can force libraries to load before timing later executions.
import itk
itk.auto_progress(2)
itk.TileMergeImageFilter # Forces underlying libraries to load
itk.auto_progress(0)
...
# your code here
As part of the step above ITK will load any modules that define FFT implementations. If you have installed
Would you please confirm that the GPU is in use when
There is a possibility that the overhead of moving images to GPU outweighs the advantage of FFT computation for your data. In our benchmarking we focused primarily on the performance of convolution of large 3D kernels compared with large 3D images. While your data represents large 2D images, it might be the case that they are not large enough to benefit from VkFFT GPU acceleration. Would you please comment on what CPU and GPU you are using to run the script? |
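To separate the one-time loading cost from steady-state performance, a small timing harness can report the first call and later calls separately. This is a generic sketch using only the standard library: `time_runs` and `stand_in_workload` are hypothetical names, and the workload is a placeholder to be replaced by the actual montage call.

```python
import time


def time_runs(func, repeats=3):
    """Call func once as a warm-up, then time `repeats` further calls.

    Returns (warmup_seconds, [seconds_per_later_call, ...]).
    """
    start = time.perf_counter()
    func()
    warmup_s = time.perf_counter() - start

    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        timings.append(time.perf_counter() - start)
    return warmup_s, timings


def stand_in_workload():
    # Placeholder for the real work (e.g. the montage registration call).
    return sum(i * i for i in range(100_000))


warmup_s, timings = time_runs(stand_in_workload)
print(f"first call: {warmup_s:.4f} s, best of later calls: {min(timings):.4f} s")
```

If the first call is much slower than the later ones, the difference is mostly start-up cost rather than FFT performance, and only the later timings should be compared across backends.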
The GPU in this PC is a 1050 Ti, and I only installed itk without vkfft from PyPI. I will try auto_progress as you suggest tomorrow and upload the results here. |
Running SimpleMontage with the itk-vkfft-0.2.0 backend produces an error:
|
@zhusihan-python Thanks for the error printout. We can examine the error code definitions as defined in the OpenCL header From VkFFT we see that From @dzenanz, maybe you could comment on how ITKMontage filters schedule FFTs for tile inputs and whether there is room for optimization in that scheduling? |
Here is the scheduling logic: |
The JPG image type in ITK is <class 'itk.itkImagePython.itkImageRGBUC2'> |
Testing the same JPG image type on another PC with 64 GB of RAM and 8 GB of GPU memory also produces errors, but this time clCreateContext returned -5, which differs from last time:
|
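For decoding the numeric code, the first few error values defined in the standard OpenCL header (`CL/cl.h`) can be kept in a small lookup table. The table below covers only the low-numbered runtime errors, and `describe_cl_error` is a hypothetical helper, not part of ITK or VkFFT.

```python
# Low-numbered error codes as defined in the OpenCL header CL/cl.h.
OPENCL_ERRORS = {
    0: "CL_SUCCESS",
    -1: "CL_DEVICE_NOT_FOUND",
    -2: "CL_DEVICE_NOT_AVAILABLE",
    -3: "CL_COMPILER_NOT_AVAILABLE",
    -4: "CL_MEM_OBJECT_ALLOCATION_FAILURE",
    -5: "CL_OUT_OF_RESOURCES",
    -6: "CL_OUT_OF_HOST_MEMORY",
}


def describe_cl_error(code):
    """Map an OpenCL return code to its symbolic name (partial table)."""
    return OPENCL_ERRORS.get(code, f"unknown OpenCL error {code}")


print(describe_cl_error(-5))  # prints "CL_OUT_OF_RESOURCES"
```

The -5 returned by clCreateContext therefore corresponds to `CL_OUT_OF_RESOURCES`, meaning the device could not allocate the resources the call required.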
Hi @zhusihan-python , it looks like the GPU may be running out of dedicated memory, hence the failures. Unfortunately, unless @dzenanz has additional thoughts on the operations of ITKMontage I don't have intuition here on where optimizations may need to take place to allow your processing to proceed. My advice at this point is to open an issue on the main ITK repository for visibility in regards to your |
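A rough way to sanity-check the out-of-memory hypothesis is to estimate the size of a single FFT buffer for one tile. This is a back-of-the-envelope sketch only: it assumes single-precision complex samples (two 4-byte components each), uses a hypothetical tile size, and ignores padding, any temporary buffers VkFFT may allocate, and how many tiles ITKMontage keeps resident at once. The function name is hypothetical as well.

```python
def complex_fft_buffer_bytes(width, height, bytes_per_component=4):
    """Bytes for one full complex buffer of width x height samples.

    Assumes two components (real, imaginary) per sample; padding and
    library-internal scratch space are not modelled.
    """
    return width * height * 2 * bytes_per_component


# Hypothetical tile size; substitute the real tile dimensions.
tile_w, tile_h = 4096, 4096
one_buffer = complex_fft_buffer_bytes(tile_w, tile_h)
print(f"one complex buffer: {one_buffer / 2**20:.1f} MiB")  # prints 128.0 MiB
```

Multiplying this figure by the number of buffers alive at the same time gives a lower bound to compare against the dedicated memory of the GPU.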
First easy thing to try is reducing |
Hello, I'm trying to build ITK 5.3.0 with ITK_USE_CUFFTW ON on Windows, but I'm running into problems.
Is VkFFTBackend a substitute for cuFFT? If yes, how do I enable VkFFTBackend in ITK? Will ITK Montage benefit from setting VkFFTBackend ON?
My environment: Windows 10, CMake 3.26.1, CUDA v12.1