[radeonhd] Re: Updated R5xx 3D Programming Guide

Tue Apr 8 15:36:28 PDT 2008

On Tuesday, 08.04.2008 at 19:39 +0200, Jerome Glisse wrote:

> 
> A GPU is not a CPU; programming one is far more complex than programming a
> CPU, because you have to handle things that a CPU does for you. For instance,
> you have to set up how data is routed to the GPU.
True, that's still the main problem I see.

>  Also, GPUs are not intended to run usual programs, i.e. programs with
> if, switch, jump, and similar instructions. GPUs can handle such
> instructions, but you're often limited in the nesting depth you can have
> (for instance no more than 16 nested for or if statements).
Well, a normal CPU doesn't have any conditional statements like "if,
switch, while" etc. either, just jz, jnz and so on at the assembly level.
The conditional jumps described in the R5xx guide aren't all that
different (Section 7.6 Flow control, and 7.6.3.1 in particular).
I'm pretty sure this is the trickiest part, since ideally a conditional
jump should only occur when either all or none of the stream units want
to jump; otherwise the GPU will mask off a lot of units.
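
To make the masking cost concrete, here is a toy model in Python. It is
only a sketch of lockstep execution in general, not of how the R5xx
actually schedules anything: the function name, the per-side cost
parameters and the lane count are all made up for illustration.

```python
def run_branch(conds, then_cost=1, else_cost=1):
    """Model one if/else executed in lockstep by several stream units.

    conds holds one boolean per unit (lane). Returns a tuple
    (cycles, idle_slots), where idle_slots counts lane-cycles spent
    masked off. Hypothetical model, not R5xx behaviour.
    """
    lanes = len(conds)
    taken = sum(1 for c in conds if c)
    cycles = 0
    idle = 0
    if taken > 0:
        # At least one lane wants the 'then' side, so every lane sits
        # through it; lanes that did not take the branch are masked.
        cycles += then_cost
        idle += (lanes - taken) * then_cost
    if taken < lanes:
        # Likewise for the 'else' side.
        cycles += else_cost
        idle += taken * else_cost
    return cycles, idle


uniform = run_branch([True, True, True, True])
divergent = run_branch([True, False, True, False])
print(uniform, divergent)
```

When all lanes agree, only one side of the branch is executed and nothing
is masked; when they disagree, both sides run and half of the lane-cycles
in this example are wasted.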

>  I think
> a good analogy is the Cell processor, where there is one unit for the OS
> that is just like a CPU, while the stream units look like a GPU.

I don't have any knowledge of the Cell platform, so I don't know.

> So any application that wants to use a GPU efficiently needs to be split
> between the core application, which would run on a "normal" CPU, and a
> specific part intended to run on the GPU.

Right.

>  This part needs to be designed with GPU specifics
> in mind. This is why I don't think a compiler, at least in the sense you
> seem to think of it, is of any use with a GPU.

I don't think so; otherwise NVIDIA's CUDA would be useless as well.
Being able to send a program to the GPU will be useful for benchmarking,
as I already pointed out.
A compiler might be overdoing it a bit for the time being. Assembling the
opcodes into a binary seems like the obvious first step, followed by
loading the program into GPU memory and starting it.
Or, to be more specific:
a normal CPU program creates a buffer and loads the GPU binary into it,
then sends the GPU the instruction to load and execute the program.
The GPU should be able to write some results into a buffer in main
memory.
Once we get that running, it would be possible to integrate that
assembler and loader into LLVM, and that in turn into Mesa.
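
A minimal sketch of that assemble, load, execute, read-back flow, written
in Python against a fake device. Everything here is an assumption for
illustration: the mnemonics, the opcode numbers, the 4-byte instruction
format and the FakeGpu class are invented and have nothing to do with the
real R5xx encodings or any driver interface.

```python
import struct

# Hypothetical opcode table, invented for this sketch (not R5xx encodings).
OPCODES = {"LOADI": 0x01, "ADD": 0x02, "STORE": 0x03, "END": 0xFF}


def assemble(source):
    """Turn 'MNEMONIC a b' lines into fixed 4-byte instruction words."""
    binary = bytearray()
    for line in source.strip().splitlines():
        parts = line.split()
        op = OPCODES[parts[0]]
        # Pad missing operands with zeros so every word has two operands.
        args = [int(p) for p in parts[1:]] + [0, 0]
        binary += struct.pack("<BBh", op, args[0], args[1])
    return bytes(binary)


class FakeGpu:
    """Stand-in for the device: program memory plus a few registers."""

    def __init__(self):
        self.program = b""
        self.regs = [0] * 8

    def load(self, binary):
        # The "upload the binary into GPU memory" step.
        self.program = binary

    def execute(self, results):
        # Decode and run each 4-byte word; results models a buffer in
        # main memory that the device writes back into.
        for offset in range(0, len(self.program), 4):
            op, a, b = struct.unpack_from("<BBh", self.program, offset)
            if op == OPCODES["LOADI"]:
                self.regs[a] = b              # reg[a] = immediate b
            elif op == OPCODES["ADD"]:
                self.regs[a] += self.regs[b]  # reg[a] += reg[b]
            elif op == OPCODES["STORE"]:
                results[b] = self.regs[a]     # results[b] = reg[a]
            elif op == OPCODES["END"]:
                break


# Host side: assemble, load, run, then read the result buffer.
program = assemble("""
LOADI 0 40
LOADI 1 2
ADD 0 1
STORE 0 0
END
""")
gpu = FakeGpu()
gpu.load(program)
results = [0] * 4
gpu.execute(results)
print(results[0])
```

On real hardware the load and execute steps would go through the driver
and the command stream rather than a Python object, but the split between
host program, binary and result buffer would look roughly like this.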

>  I am confident that through Gallium
> we will be able to offer a sane API that lets applications properly use the
> horsepower of the GPU. Note that a similar problem exists with multi-CPU: you
> can't really take advantage of multiple CPUs with just a specific compiler;
> your application has to be designed for multi-CPU.

Yes, I know of the threading issues involved in multi-CPU coding.
Obviously the compiler can't refactor your program to make efficient use
of multiple cores. But without a compiler you can't write a program
either.
> 
> Cheers,
> Jerome Glisse <glisse at freedesktop.org>

Greetings,
Syren Baran