OpenCL: Blender 2.67b and OpenCL is working better

Sunday, June 2, 2013

I just updated to the new Blender 2.67b and found that OpenCL support has changed for the better. The last time I checked, in the previous version of Blender, it was not possible to select the CPU as the compute device. Now it is. It's even possible to use a combination of CPU and GPU. Take a look at the next picture:

I can use the Intel Core i5 and/or the AMD Radeon graphics card as the compute device. This is nice.

You can see the Intel Core i5 listed twice. The reason is that I have two OpenCL implementations installed: one from Intel and one from AMD. Sadly, the list doesn't show which entry comes from AMD and which from Intel, but most users will not have that problem.
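
One way to tell the two entries apart outside Blender is to ask OpenCL which platform each device belongs to. The following is only a sketch: it assumes the third-party `pyopencl` package is installed (it does not ship with Blender), and `label_devices` and `query_opencl` are hypothetical helper names chosen for illustration.

```python
def label_devices(pairs):
    """Turn (platform_name, device_name) pairs into unambiguous labels."""
    return ["{} [{}]".format(device.strip(), platform.strip())
            for platform, device in pairs]


def query_opencl():
    """List (platform, device) name pairs; needs the pyopencl package."""
    import pyopencl as cl  # assumption: pyopencl is installed
    return [(platform.name, device.name)
            for platform in cl.get_platforms()
            for device in platform.get_devices()]


if __name__ == "__main__":
    try:
        pairs = query_opencl()
    except Exception as exc:  # pyopencl missing or no OpenCL runtime found
        print("Could not query OpenCL:", exc)
    else:
        for label in label_devices(pairs):
            print(label)
```

With two CPU implementations installed, the same CPU name should then appear once per platform, each followed by the platform name in brackets.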

What if we try to run Cycles on OpenCL? Let's start with the Intel Core i5 and Intel's OpenCL implementation. In the console we get the following output:

Compiling OpenCL kernel ...
OpenCL kernel build output:
Compilation started
In file included from <built-in>:132:
<command line>:2:36: warning: ISO C99 requires whitespace after the macro name
Compilation done
Linking started
Linking done
Kernel <kernel_ocl_path_trace> was not vectorized
Kernel <kernel_ocl_tonemap> was successfully vectorized
Done.
Kernel compilation finished in 17.80s.

You can see that the Cycles code base is quite massive: compilation takes 17.8 s. The people who wrote Cycles put a lot of work into this code. Rendering the default cube takes 2.9 s.

What about AMD's implementation of the CPU backend? The console output is less verbose in this case:

Compiling OpenCL kernel ...
Kernel compilation finished in 5.13s.

Rendering takes 2.2 s, which is strange: AMD's implementation takes less time than Intel's implementation on Intel's own CPU! We're using AMD APP 1214.3 and Intel SDK 2013. But I'm not alone here. Phoronix found similar results: http://www.phoronix.com/scan.php?page=article&item=amd_intel_openclsdk&num=1 .

If we select the pure CPU implementation instead of OpenCL, rendering takes 1.8 s. It seems that some work could still be done to optimize the whole thing. In my opinion the pure CPU implementation is not needed any more; the OpenCL implementation is enough. For machines that don't have OpenCL preinstalled, a default OpenCL implementation could be bundled with Blender.
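
To put the three render times side by side, here is a small sketch that divides each one by the pure CPU time. The numbers are the ones measured in this post (a single run of the default cube each); the dictionary labels and the `slowdown_vs` helper are names made up for this example.

```python
# Render times for the default cube, in seconds, as measured in this post.
render_times = {
    "Intel OpenCL (CPU)": 2.9,
    "AMD OpenCL (CPU)": 2.2,
    "Pure CPU": 1.8,
}


def slowdown_vs(baseline, times):
    """Return each backend's render time divided by the baseline's time."""
    base = times[baseline]
    return {name: round(t / base, 2) for name, t in times.items()}


ratios = slowdown_vs("Pure CPU", render_times)
for name, ratio in ratios.items():
    print("{}: {:.2f}x".format(name, ratio))
```

In this one measurement, Intel's OpenCL implementation is about 1.61 times and AMD's about 1.22 times the pure CPU render time.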

I noticed that Blender caches the built OpenCL kernels. Good work! On the next start of Blender, the first render is significantly faster.
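
I have not looked at how Blender implements this cache. A common approach, sketched below under that caveat, is to key the compiled binary on a hash of the kernel source plus the device and driver strings, so that a change in any of them forces a rebuild. All names here (`cache_key`, `load_or_build`, the `.clbin` suffix) are hypothetical, and the "compiler" is a stand-in function.

```python
import hashlib
import os
import tempfile


def cache_key(source, device_name, driver_version):
    """Hash everything that can change the compiled binary."""
    h = hashlib.sha256()
    for part in (source, device_name, driver_version):
        h.update(part.encode("utf-8"))
        h.update(b"\0")  # separator so ("ab", "c") and ("a", "bc") differ
    return h.hexdigest()


def load_or_build(cache_dir, source, device_name, driver_version, build):
    """Return (binary, hit): the cached binary if present, else build and store."""
    name = cache_key(source, device_name, driver_version) + ".clbin"
    path = os.path.join(cache_dir, name)
    if os.path.exists(path):
        with open(path, "rb") as f:
            return f.read(), True   # cache hit: no compilation needed
    binary = build(source)          # slow path: the real compile step
    with open(path, "wb") as f:
        f.write(binary)
    return binary, False            # cache miss


if __name__ == "__main__":
    def fake_build(src):
        # Stand-in for the real OpenCL compiler.
        return b"BINARY:" + src.encode("utf-8")

    with tempfile.TemporaryDirectory() as d:
        for attempt in (1, 2):
            _, hit = load_or_build(d, "kernel void k() {}", "Core i5", "1214.3", fake_build)
            print("attempt", attempt, "cache hit:", hit)
```

Including the driver version in the key matters here because the same source compiled by the Intel and AMD implementations yields different binaries.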

What about the GPU? At first we get a lot of trivial warnings, which can be ignored:

Compiling OpenCL kernel ...
OpenCL kernel build output:
"/tmp/OCLawnF6S.cl", line 16307: warning: double-precision constant is
          represented as single-precision constant because double is not
          enabled
        float phi = M_2PI_F * randv;
                    ^

"/tmp/OCLawnF6S.cl", line 16323: warning: double-precision constant is
          represented as single-precision constant because double is not
          enabled
        float phi = M_2PI_F * randv;
                    ^

"/tmp/OCLawnF6S.cl", line 16337: warning: double-precision constant is
          represented as single-precision constant because double is not
          enabled
        float phi = M_2PI_F*u2;
                    ^

"/tmp/OCLawnF6S.cl", line 22875: warning: double-precision constant is
          represented as single-precision constant because double is not
          enabled
                float phi = M_2PI_F * randu;
                            ^

"/tmp/OCLawnF6S.cl", line 23165: warning: double-precision constant is
          represented as single-precision constant because double is not
          enabled
                float phiM = M_2PI_F * randv;
                             ^

"/tmp/OCLawnF6S.cl", line 23394: warning: double-precision constant is
          represented as single-precision constant because double is not
          enabled
                float phiM = M_2PI_F * randv;
                             ^

"/tmp/OCLawnF6S.cl", line 24051: warning: double-precision constant is
          represented as single-precision constant because double is not
          enabled
                float phi = M_2PI_F * randu;
                            ^

"/tmp/OCLawnF6S.cl", line 24427: warning: double-precision constant is
          represented as single-precision constant because double is not
          enabled
        const float tolerance = 1e-8;
                                ^

"/tmp/OCLawnF6S.cl", line 24497: warning: double-precision constant is
          represented as single-precision constant because double is not
          enabled
        return ss->alpha_*(1.0f/M_4PI_F)*(Rdr + Rdv);
                                ^

"/tmp/OCLawnF6S.cl", line 26172: warning: double-precision constant is
          represented as single-precision constant because double is not
          enabled
                return atan2f(y, x) / M_2PI_F + 0.5f;
                                      ^

Error:E013:Insufficient Private Resources! 

OpenCL build failed: errors in console

But at the end we get:

Error:E013:Insufficient Private Resources!
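
With this many warnings, the one line that matters is easy to miss. Here is a minimal sketch of a filter that drops the warning entries from a log shaped like the AMD output above and keeps everything else. The regular expression is an assumption based only on the log lines shown in this post; other compilers format their warnings differently.

```python
import re

# Matches the first line of a warning entry in the AMD-style log above, e.g.
# "/tmp/OCLawnF6S.cl", line 16307: warning: double-precision constant is
WARNING_START = re.compile(r'^".*", line \d+: warning:')


def strip_warnings(log_text):
    """Drop warning entries (up to the next blank line) and keep the rest."""
    kept = []
    in_warning = False
    for line in log_text.splitlines():
        if WARNING_START.match(line):
            in_warning = True
            continue
        if in_warning:
            if line.strip() == "":
                in_warning = False  # a blank line ends the warning entry
            continue
        if line.strip():
            kept.append(line.rstrip())
    return kept


sample = '''Compiling OpenCL kernel ...
OpenCL kernel build output:
"/tmp/OCLawnF6S.cl", line 24427: warning: double-precision constant is
          represented as single-precision constant because double is not
          enabled
        const float tolerance = 1e-8;
                                ^

Error:E013:Insufficient Private Resources!

OpenCL build failed: errors in console
'''

for line in strip_warnings(sample):
    print(line)
```

On the shortened sample, only the two header lines, the E013 error and the final "build failed" line remain.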

It looks like our GPU, an AMD Radeon 5470, is too low-end. But at least it compiles all the code. It would be nice to get Cycles working on low-end GPUs, but on further thought it's not worth the effort. Serious Blender users will use better GPUs anyway.

The question is why it doesn't work. Is there too little local memory? Or is the program too complex? Since the error mentions private resources, I think the Cycles program is too complex for our GPU: it uses too many registers, or the program is too long. Maybe splitting Cycles into several smaller kernels would help. To find the exact problem, one would need to use KernelAnalyzer from AMD APP and try to compile the kernel for all GPUs.

26 comments:

Unknown, June 2, 2013 at 7:12 AM
Hi, I downloaded Blender 2.67b. I have an Intel i7-3770 and an ATI Radeon 7950 GPU. I don't have the option to switch from CPU computing to GPU. Can I do something to render with my GPU? I know Cycles doesn't work well with OpenCL.

Mach, June 2, 2013 at 9:38 AM
You need to set the environment variable CYCLES_OPENCL_TEST=true

Unknown, June 3, 2013 at 2:50 AM
Thanks for the answer! But how? By changing a file in the Cycles/kernel folder?

Mach, June 3, 2013 at 11:32 AM
One way on Windows is described here: http://support.microsoft.com/kb/310519 .

On Linux you can type into the console: CYCLES_OPENCL_TEST=true blender

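The same thing can be done from a small launcher script on either system. This is only a sketch: the variable name CYCLES_OPENCL_TEST comes from the comment above, while `blender_env` and `launch_blender` are hypothetical helper names, and it assumes the `blender` executable is on the PATH.

```python
import os
import subprocess


def blender_env(base_env=None):
    """Copy the environment and switch on the experimental OpenCL option."""
    env = dict(os.environ if base_env is None else base_env)
    env["CYCLES_OPENCL_TEST"] = "true"
    return env


def launch_blender(executable="blender"):
    """Start Blender with the variable set (assumes it is on the PATH)."""
    return subprocess.Popen([executable], env=blender_env())


# launch_blender()  # uncomment to start Blender with OpenCL devices enabled
```

The variable is set only for the launched process, so the system-wide environment stays untouched.
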
Unknown, June 3, 2013 at 12:01 PM
I'm on Windows. I did that, and the GPU Compute option appeared. However, rendering on the GPU doesn't work:
OpenCL build failed: errors in console.
Thanks for your help

Unknown, June 3, 2013 at 12:26 PM
I have the latest official drivers; I'm downloading the beta now. I don't know where Cycles puts my error files, I'm looking...
In the official Blender version I had just the option "Tahiti" under OpenCL. I tried the latest two versions I found on Graphicall, and there the CPU+GPU option is available, but it doesn't work anyway.

Unknown, June 3, 2013 at 2:02 PM
Yes, works. This is the error: http://postimg.org/image/9r2p8fcw1/

Mach, June 4, 2013 at 10:46 AM
Really strange. I have Blender 2.67.57180 installed; 57180 is the revision number.

If you look at the preview versions at http://builder.blender.org/download/, it looks like the Windows build is a little behind the Linux one. The latest revision is blender-2.67-r57165-win64.zip

Germano, June 4, 2013 at 8:24 PM
I could only render with OpenCL on the CPU, using the build "multiview-blender-2.67-r57177-win64.zip" found at this site: http://builder.blender.org/download/
It is interesting to try.

Sergio, June 9, 2013 at 9:55 AM
Please keep exploring Cycles and OpenCL. I'm following the development closely and you bring some interesting insights.
Thanks!

stargeizer, July 30, 2013 at 9:45 PM
The problem with AMD hardware is the compiler, as stated by an AMD representative here:

"I can guarantee you that active work is happening here. Fixing cycles involves work at both OpenCL compiler level and also in layers beneath it.

The work is pretty involved and will take a fair amount of time.

I have not got any timelines from AMD engineers. But it looks like this is going to take a while.

Please bear with us."

This is from http://devgurus.amd.com/message/1287979#1287979, and the date is March 2013. AFAIK they are still working on it, so for now there is no news about it.

Brecht Van Lommel has stated that splitting a complex megakernel like Cycles means quite a lot of work and time. So either way, people wanting to use Cycles on AMD hardware will have to wait quite some time.

Unknown, September 10, 2013 at 7:27 PM
Blender on OSX Lion:
--------------------

Compiling OpenCL kernel ...
OpenCL error (ATI Radeon HD 6750M): OpenCL Warning : clBuildProgram failed: could not build program for 0x1021b00 (ATI Radeon HD 6750M) (err:-2)
OpenCL error (ATI Radeon HD 6750M): [CL_BUILD_ERROR] : OpenCL Build Error : Compiler build log:
:18459:10: error: initializing '__float4 *' with an expression of type '__attribute__((address_space(1))) __float4 *' changes address space of pointer
float4 *in = (__global float4*)(buffer + index*kernel_data.film.pass_stride);
^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
:37588:8: warning: unused variable 'ray_t'
float ray_t = 0.0f;
^

OpenCL kernel build output:
:18459:10: error: initializing '__float4 *' with an expression of type '__attribute__((address_space(1))) __float4 *' changes address space of pointer
float4 *in = (__global float4*)(buffer + index*kernel_data.film.pass_stride);
^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
:37588:8: warning: unused variable 'ray_t'
float ray_t = 0.0f;
^

OpenCL build failed: errors in console

Unknown, February 28, 2014 at 6:40 AM
Under Linux with the Catalyst driver and an AMD Tahiti GPU (Radeon HD 7970) it works!!! This is great news... By the way, it already works with Blender 2.69...

Unknown, March 11, 2014 at 4:50 PM
I can confirm it too. An AMD HD 7950 works fine, but when using the GPU and CPU together I get bugs in my renderings.

Phil Hendry, March 28, 2015 at 11:42 PM
I'm trying OpenCL on a Yoga 2 Pro laptop, which has an Intel HD Graphics 4400 GPU, but I'm getting the compilation error shown below. Is there any way to turn on verbose logging, since there's not much information here? Is it even worth trying?

C:\Program Files\Blender Foundation\Blender>blender.exe
Read new prefs: C:\Users\Philip\AppData\Roaming\Blender Foundation\Blender\2.73\config\userpref.blend
found bundled python: C:\Program Files\Blender Foundation\Blender\2.73\python
Imported multifiles
Device init succes
Compiling OpenCL kernel ...
OpenCL error (Intel(R) HD Graphics 4400): Build program failure.
OpenCL kernel build output:
:19472:35: warning: double precision constant requires cl_khr_fp64, casting to single precision
:1846:26: note: expanded from here
:19666:14: warning: double precision constant requires cl_khr_fp64, casting to single precision
:1846:26: note: expanded from here
:19700:14: warning: double precision constant requires cl_khr_fp64, casting to single precision
:1846:26: note: expanded from here
:19717:14: warning: double precision constant requires cl_khr_fp64, casting to single precision
:1846:26: note: expanded from here
:19732:14: warning: double precision constant requires cl_khr_fp64, casting to single precision
:1846:26: note: expanded from here
:31562:15: warning: double precision constant requires cl_khr_fp64, casting to single precision
:1846:26: note: expanded from here
:31815:21: warning: double precision constant requires cl_khr_fp64, casting to single precision
:1846:26: note: expanded from here
:31891:21: warning: double precision constant requires cl_khr_fp64, casting to single precision
:1846:26: note: expanded from here
:33111:10: warning: double precision constant requires cl_khr_fp64, casting to single precision
:1846:26: note: expanded from here
:35879:25: warning: double precision constant requires cl_khr_fp64, casting to single precision
:1846:26: note: expanded from here
:36315:58: warning: double precision constant requires cl_khr_fp64, casting to single precision
:1846:26: note: expanded from here
:39463:32: warning: double precision constant requires cl_khr_fp64, casting to single precision
:1846:26: note: expanded from here
:41640:31: warning: double precision constant requires cl_khr_fp64, casting to single precision
:1846:26: note: expanded from here
:41674:35: warning: double precision constant requires cl_khr_fp64, casting to single precision
:1846:26: note: expanded from here
:41759:12: warning: double precision constant requires cl_khr_fp64, casting to single precision
:1846:26: note: expanded from here
:41832:12: warning: double precision constant requires cl_khr_fp64, casting to single precision
:1846:26: note: expanded from here
fcl build 1 succeeded.
fcl build 2 succeeded.
Error: internal error.

OpenCL build failed: errors in console
Error: OpenCL build failed: errors in console
