Insane performance results or not?

Nicolai Hähnle nhaehnle at gmail.com
Wed Feb 10 11:51:58 PST 2010


On Wednesday 10 February 2010 15:12:27, Pauli Nieminen wrote:
> I did some testing of how fast my system can move data to VRAM/GTT and I
> got some very interesting results:
>
> (II) RADEON(0): BENCH: copy 3129344 bytes to vram took 78595us,
> resulting in 39Mbps
> (II) RADEON(0): BENCH: copy 3129344 bytes to gtt took 11411us,
> resulting in 274Mbps
> (II) RADEON(0): BENCH: copy 3129344 bytes to gtt took 8431us,
> resulting in 371Mbps
> (II) RADEON(0): BENCH: copy 3129344 bytes to vram took 75773us,
> resulting in 41Mbps
> (II) RADEON(0): BENCH: copy 3129344 gtt to vram took 3143us, resulting
> in 995Mbps

Why do the two copies to GTT differ by such a huge amount performance-wise? Is 
this simply caching? What does the picture look like for smaller / larger 
transfers?
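
To separate a caching effect from a real difference, one option is to repeat the same copy several times at each size and compare the first run against the later ones. The following is only a rough sketch in C: it uses plain memcpy() between ordinary buffers as a stand-in for the actual GTT/VRAM copy, and the names time_copy_us() and repeat_copy() are made up for illustration, not taken from the benchmark patch.

```c
#include <stddef.h>
#include <string.h>
#include <sys/time.h>

/* Time a single copy of `bytes` bytes and return the elapsed microseconds.
 * memcpy() is only a stand-in for the real upload path (assumption). */
long time_copy_us(void *dst, const void *src, size_t bytes)
{
    struct timeval start, end;

    gettimeofday(&start, NULL);
    memcpy(dst, src, bytes);
    gettimeofday(&end, NULL);

    return (end.tv_sec - start.tv_sec) * 1000000L
         + (end.tv_usec - start.tv_usec);
}

/* Repeat the same copy `runs` times and store each timing in out_us[],
 * so that a slow first run (cold caches, page faults) can be told apart
 * from the steady-state runs that follow it. */
void repeat_copy(void *dst, const void *src, size_t bytes,
                 int runs, long *out_us)
{
    for (int i = 0; i < runs; i++)
        out_us[i] = time_copy_us(dst, src, bytes);
}
```

Calling repeat_copy() for a range of sizes, for example from a few KiB up to several times the 3129344 bytes used above, would show whether the gap between the two GTT numbers shrinks after the first run or changes with the transfer size.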

>
>
> So direct writes to VRAM run at only 40 megabytes per second.
> That is insanely slow. I hope we don't hit that kind of limit anywhere
> in any code.
>
> I did check in /proc/mtrr that VRAM is marked write-combining (WC).
> Still, it is surprisingly slow.
>
> But the most insane result is that the CPU can only write to GTT at a
> maximum of 371 MB/s while the GPU can do GTT to VRAM at 995 MB/s. Even
> more insane is that I was nearly sure my memory can't operate that fast,
> yet when the code that checks the VRAM content runs, everything is
> correctly in VRAM! What did the GPU/AGP do to cheat that much? Is there
> some error in my test case?

Try putting the gettimeofday() *after* the radeon_bo_map(). I wouldn't be 
surprised if the seemingly fast GTT -> VRAM copy is simply because the 
transfer hasn't actually finished. Mapping VRAM forces a wait for the transfer 
to finish.
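
To illustrate what I mean, here is a minimal sketch in C of timing an asynchronous transfer with the wait inside the timed region. The submit and wait_done callbacks are hypothetical placeholders for whatever queues the copy and whatever blocks until it completes (in the benchmark that would be the blit and the radeon_bo_map() call); none of these helper names come from the actual patch.

```c
#include <stddef.h>
#include <sys/time.h>

/* Hypothetical callback type standing in for the real driver calls. */
typedef void (*bench_fn)(void *ctx);

/* Microseconds elapsed between two gettimeofday() samples. */
long elapsed_us(const struct timeval *start, const struct timeval *end)
{
    return (end->tv_sec - start->tv_sec) * 1000000L
         + (end->tv_usec - start->tv_usec);
}

/* Bytes per microsecond is the same as 10^6 bytes per second, i.e. MB/s. */
double mb_per_s(size_t bytes, long us)
{
    return us > 0 ? (double)bytes / (double)us : 0.0;
}

/* Time a transfer that is only queued by submit() and finishes later.
 * The second timestamp is taken after wait_done() returns, so the result
 * covers the whole transfer and not just the cost of queuing it. */
long time_transfer_us(bench_fn submit, bench_fn wait_done, void *ctx)
{
    struct timeval start, end;

    gettimeofday(&start, NULL);
    submit(ctx);     /* queue the copy (returns immediately) */
    wait_done(ctx);  /* block until the copy has really finished */
    gettimeofday(&end, NULL);

    return elapsed_us(&start, &end);
}
```

As a sanity check on the units, mb_per_s(3129344, 78595) comes out at roughly 39.8, which matches the 39 reported for the first VRAM copy. That suggests the figures in the log are megabytes per second rather than megabits.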

In any case, this is a very interesting experiment that I'll try to replicate 
if I can get around to it.

cu,
Nicolai


>
> The system is:
> Athlon mobility XP at 2.1 GHz
> 333 MHz DDR memory
> AGP 8x bus to a Mobility Radeon 9200.
>
> Of course, the most important part of this message is the attachments.
> The diff file contains the benchmark code. The other attachment is the
> shell script that I used to run the test. It sets my CPU to performance
> mode to make sure that CPU frequency changes won't affect the results.
>
> Pauli





More information about the xorg-driver-ati mailing list