[PATCH] fb: Fix memcpy abuse

Soeren Sandmann sandmann at cs.au.dk
Sat Apr 30 05:39:42 PDT 2011


Siarhei Siamashka <siarhei.siamashka at gmail.com> writes:

>>  1            2                 3                 4           Operation
>> ------   ---------------   ---------------   ---------------   ------------
>> 258000   269000 (  1.04)   544000 (  2.11)   552000 (  2.14)   Copy 10x10
>>  21300    23000 (  1.08)    43700 (  2.05)    47100 (  2.21)   Copy 100x100
>>   960      962 (  1.00)     1990 (  2.09)     1990 (  2.07)   Copy 500x500
>>
>> So it's a modest performance hit, but correctness demands it, and it's
>> probably worth keeping the 2x speedup from having the fast path in the
>> first place.
>
> In my opinion, still the best solution is to do this stuff in pixman,
> because it has its own SIMD optimized code and can do a lot better job
> than memcpy/memmove (for example, it can be improved to use MOVNTx
> instructions for x86 when scrolling large areas exceeding L2/L3 cache
> size, and this optimization can't be easily done by glibc if
> memcpy/memmove is used separately for each individual scanline). I did
> some experiments with improving scrolling performance when using
> non-hardware accelerated framebuffer earlier and it showed a really
> major speedup on ARM:
> http://lists.x.org/archives/xorg-devel/2009-November/003536.html

If I remember correctly, the main objection I had to the overlapped blt
patch was that it inlined the C fallback into all the SIMD versions
instead of just calling down through the delegate.

Other than that, I agree that doing this in pixman would be best. In
fact, I think it would make sense to have a full CopyArea implementation
in pixman that would also handle rasterops and planemasks etc, but
simply supporting overlapping in pixman_blt() is useful too.

>> -    if (alu == GXcopy && pm == FB_ALLONES && !reverse &&
>> +    careful = !((srcLine < dstLine && srcLine + width * (bpp/8) > dstLine) ||
>> +                (dstLine < srcLine && dstLine + width * (bpp/8) > srcLine)) ||
>> +              (bpp % 8);
>> +
>> +    if (alu == GXcopy && pm == FB_ALLONES && !careful &&
>>             !(srcX & 7) && !(dstX & 7) && !(width & 7)) {
>>         int i;
>>         CARD8 *src = (CARD8 *) srcLine;
>>         CARD8 *dst = (CARD8 *) dstLine;
>> -
>> +
>>         srcStride *= sizeof(FbBits);
>>         dstStride *= sizeof(FbBits);
>>         width >>= 3;
>
> What is the story with the 'reverse' variable? Does it have no use
> anymore and needs to be purged later?

It's an argument to the function that indicates whether each scanline
should be copied back-to-front. In the non-overlapping case, it can be
ignored, but it is used in other places in fbBlt().


Soren


More information about the xorg-devel mailing list