Insane performance results or not?

Pauli Nieminen suokkos at gmail.com
Thu Feb 11 07:31:39 PST 2010


2010/2/11 Michel Dänzer <michel at daenzer.net>:
> On Wed, 2010-02-10 at 23:19 +0200, Pauli Nieminen wrote:
>> On Wed, Feb 10, 2010 at 7:44 PM, Roland Scheidegger
>> <sroland at tungstengraphics.com> wrote:
>> > On 10.02.2010 15:12, Pauli Nieminen wrote:
>> >> Hi!
>> >>
>> >> I did some testing of how fast my system can move data to VRAM/GTT and I
>> >> got very interesting results:
>> >>
>> >> (II) RADEON(0): BENCH: copy 3129344 bytes to vram took 78595us,
>> >> resulting in 39Mbps
>> > "Mbps" is a bit confusing; I can only read that as megabits per second,
>> > which is even more pathetic :-)
>> >
>>
>> MBps then. I never remember to write big B for byte :/
>>
>> >> (II) RADEON(0): BENCH: copy 3129344 bytes to gtt took 11411us,
>> >> resulting in 274Mbps
>> >> (II) RADEON(0): BENCH: copy 3129344 bytes to gtt took 8431us,
>> >> resulting in 371Mbps
>> >> (II) RADEON(0): BENCH: copy 3129344 bytes to vram took 75773us,
>> >> resulting in 41Mbps
>> >> (II) RADEON(0): BENCH: copy 3129344 gtt to vram took 3143us, resulting
>> >> in 995Mbps
>> >>
>> >>
>> >> So direct writes to VRAM run at only about 40 megabytes per second.
>> >> That is insanely slow. I hope we won't hit that kind of limit anywhere
>> >> in any code.
>> >>
>> >> I did check that VRAM is WC cached in /proc/mtrr. But still it is
>> >> surprisingly slow.
>> > Isn't that something which fast writes should help with, an option we
>> > never really got to work? I agree though it's really bad.
>> > In any case I think it would be interesting to repeat those tests on
>> > pci/pcie cards.
>> >
>>
>> Fast writes might be a solution, but how reliable are they? How good
>> are they performance-wise?
>
> While fast writes might help, colour me sceptical about the lack of them
> explaining the slowness. I suspect you're not actually getting
> write-combining for some reason (if PAT is enabled, have you tried
> disabling it?).
>

Booting with nopat doesn't work correctly for me :/ There is clearly some
memory left in UC state, and most applications run very slowly.
The performance test then shows only 25-35 MB/s to VRAM, while GTT
speed is the same as with PAT.
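
For reference, the figures in the BENCH lines follow from the byte count and
the elapsed time: bytes per microsecond is the same as decimal megabytes per
second. The sketch below is only an illustration; the function name
bench_mb_per_s is hypothetical, and it is an assumption that the benchmark
patch computes the value this way. Plain integer division does reproduce the
quoted numbers (39, 274, 371, 41, 995).

```c
#include <stdint.h>

/*
 * Hypothetical helper, not taken from the driver.
 * bytes / microseconds == (bytes / 10^6) / (usec / 10^6) == MB/s (decimal).
 */
uint64_t bench_mb_per_s(uint64_t bytes, uint64_t usec)
{
    if (usec == 0)
        return 0;           /* too fast to measure; avoid dividing by zero */
    return bytes / usec;    /* integer division truncates toward zero */
}
```

For example, bench_mb_per_s(3129344, 78595) gives 39, which is the first
VRAM result above.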

>
>> Also, changing from memmove to memcpy pushes speeds to 440-450 MB/s for
>> the first gtt copy and 470-490 MB/s for the second copy. But my CPU is
>> still slow when the GPU first has to be notified about the copy and then
>> has to signal back that the work is done.
>>
>> So memmove is also very slow for writes to WC cached GTT.
>
> I think it should be safe to change the X driver to use memcpy in
> RADEONCopySwap().
>

To be safe, there could be a test for overlapping areas.
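
A minimal sketch of such a test, assuming a flat address space where pointers
can be compared after conversion to uintptr_t. The names regions_overlap and
copy_checked are hypothetical and are not part of the driver; the idea is
only to use memcpy when the two ranges are disjoint and fall back to memmove
when they are not.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Return 1 if [a, a+len) and [b, b+len) share at least one byte.
 * Assumes a flat address space, so the distance between the two
 * start addresses can be computed on uintptr_t values.
 */
int regions_overlap(const void *a, const void *b, size_t len)
{
    uintptr_t pa = (uintptr_t)a;
    uintptr_t pb = (uintptr_t)b;

    if (len == 0)
        return 0;
    /* Equal-length ranges overlap iff their starts are closer than len. */
    return (pa > pb ? pa - pb : pb - pa) < len;
}

/* Hypothetical wrapper: memcpy when the ranges are disjoint, else memmove. */
void *copy_checked(void *dst, const void *src, size_t len)
{
    if (regions_overlap(dst, src, len))
        return memmove(dst, src, len);
    return memcpy(dst, src, len);
}
```

The check costs one subtraction and one comparison per call, so it should
not matter next to the copy itself.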

>
> --
> Earthling Michel Dänzer           |                http://www.vmware.com
> Libre software enthusiast         |          Debian, X and DRI developer
>

