Insane performance results or not?

Michel Dänzer michel at daenzer.net
Thu Feb 11 06:18:40 PST 2010


On Wed, 2010-02-10 at 23:19 +0200, Pauli Nieminen wrote: 
> On Wed, Feb 10, 2010 at 7:44 PM, Roland Scheidegger
> <sroland at tungstengraphics.com> wrote:
> > On 10.02.2010 15:12, Pauli Nieminen wrote:
> >> Hi!
> >>
> >> I made some testing how fast my system can move data to VRAM/GTT and I
> >> got very interesting results:
> >>
> >> (II) RADEON(0): BENCH: copy 3129344 bytes to vram took 78595us,
> >> resulting in 39Mbps
> > "Mbps" is a bit confusing I can only read that as Megabit per second
> > which is even more pathetic :-)
> >
> 
> MBps then. I never remember to write big B for byte :/
> 
> >> (II) RADEON(0): BENCH: copy 3129344 bytes to gtt took 11411us,
> >> resulting in 274Mbps
> >> (II) RADEON(0): BENCH: copy 3129344 bytes to gtt took 8431us,
> >> resulting in 371Mbps
> >> (II) RADEON(0): BENCH: copy 3129344 bytes to vram took 75773us,
> >> resulting in 41Mbps
> >> (II) RADEON(0): BENCH: copy 3129344 gtt to vram took 3143us, resulting
> >> in 995Mbps
> >>
> >>
> >> So direct write to VRAM operates only at 40 mega bytes per second.
> >> That is insanely slow. I hope we won't hit that kind of limit anywhere
> >> in any code.
> >>
> >> I did check that VRAM is WC cached in /proc/mtrr. But still it is
> >> surprisingly slow.
> > Isn't that something which fast writes should help with, an option we
> > never really got to work? I agree though it's really bad.
> > In any case I think it would be interesting to repeat those tests on
> > pci/pcie cards.
> >
> 
> Fast writes might be a solution, but how reliable are they? How good
> are they performance-wise?

While fast writes might help, colour me sceptical about the lack of them
explaining the slowness. I suspect you're not actually getting
write-combining for some reason (if PAT is enabled, have you tried
disabling it?).
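
For reference, the figures in the quoted log appear to be bytes divided by
microseconds, truncated to an integer. Bytes per microsecond is the same as
10^6 bytes per second, so the unit is MB/s rather than Mbit/s. A minimal
sketch of that arithmetic follows; the helper name throughput_mb_per_s is
made up for illustration and is not taken from the driver.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not from the radeon driver.
 * bytes / microseconds == 10^6 bytes per second == MB/s.
 * Integer division truncates, which matches the logged values.
 */
static uint64_t throughput_mb_per_s(uint64_t bytes, uint64_t usec)
{
    if (usec == 0)
        return 0; /* avoid division by zero on a too-short measurement */
    return bytes / usec;
}
```

With the logged inputs, 3129344 bytes in 78595us truncates to 39, and the
gtt to vram copy (3129344 bytes in 3143us) truncates to 995.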


> Also change to memcpy instead of memmoves pushes speeds to 440-450 for
> first gtt copy and 470-490 for second copy. But still my CPU is slow
> when GPU has to first be notified about copy and then it has to signal
> back about work being done.
> 
> So memmove is also very slow for writes to WC cached GTT.

I think it should be safe to change the X driver to use memcpy in
RADEONCopySwap().
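
The reason this should be safe is that memcpy only requires that source and
destination do not overlap, which I would expect to hold for an upload from
system memory into a VRAM/GTT mapping. A minimal sketch of that condition
follows, assuming plain byte copies with no swapping; regions_overlap and
copy_prefer_memcpy are hypothetical names and this is not the actual
RADEONCopySwap() code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: nonzero if [a, a+len) and [b, b+len) share a byte. */
static int regions_overlap(const void *a, const void *b, size_t len)
{
    uintptr_t pa = (uintptr_t)a;
    uintptr_t pb = (uintptr_t)b;

    return pa < pb ? (pb - pa) < len : (pa - pb) < len;
}

/*
 * Hypothetical wrapper: use memcpy when the regions are disjoint and
 * fall back to memmove only when they actually overlap.
 */
static void *copy_prefer_memcpy(void *dst, const void *src, size_t len)
{
    if (regions_overlap(dst, src, len))
        return memmove(dst, src, len);
    return memcpy(dst, src, len);
}
```

If the caller can guarantee disjoint buffers, the check is unnecessary and a
direct memcpy call is enough.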


-- 
Earthling Michel Dänzer           |                http://www.vmware.com
Libre software enthusiast         |          Debian, X and DRI developer


More information about the xorg-driver-ati mailing list