[PATCH] fb: Fix memcpy abuse
Soeren Sandmann
sandmann at cs.au.dk
Sat Apr 30 05:39:42 PDT 2011
Siarhei Siamashka <siarhei.siamashka at gmail.com> writes:
>> 1 2 3 4 Operation
>> ------ --------------- --------------- --------------- ------------
>> 258000 269000 ( 1.04) 544000 ( 2.11) 552000 ( 2.14) Copy 10x10
>> 21300 23000 ( 1.08) 43700 ( 2.05) 47100 ( 2.21) Copy 100x100
>> 960 962 ( 1.00) 1990 ( 2.09) 1990 ( 2.07) Copy 500x500
>>
>> So it's a modest performance hit, but correctness demands it, and it's
>> probably worth keeping the 2x speedup from having the fast path in the
>> first place.
>
> In my opinion, still the best solution is to do this stuff in pixman,
> because it has its own SIMD optimized code and can do a lot better job
> than memcpy/memmove (for example, it can be improved to use MOVNTx
> instructions for x86 when scrolling large areas exceeding L2/L3 cache
> size, and this optimization can't be easily done by glibc if
> memcpy/memmove is used separately for each individual scanline). I did
> some experiments with improving scrolling performance when using
> non-hardware accelerated framebuffer earlier and it showed a really
> major speedup on ARM:
> http://lists.x.org/archives/xorg-devel/2009-November/003536.html
If I remember correctly, the main objection I had to the overlapped blt
patch was that it inlined the C fallback into all the SIMD versions
instead of just calling down through the delegate.
Other than that, I agree that doing this in pixman would be best. In
fact, I think it would make sense to have a full CopyArea implementation
in pixman that would also handle rasterops and planemasks etc, but
simply supporting overlapping in pixman_blt() is useful too.
>> - if (alu == GXcopy && pm == FB_ALLONES && !reverse &&
>> + careful = !((srcLine < dstLine && srcLine + width * (bpp/8) > dstLine) ||
>> + (dstLine < srcLine && dstLine + width * (bpp/8) > srcLine)) ||
>> + (bpp % 8);
>> +
>> + if (alu == GXcopy && pm == FB_ALLONES && !careful &&
>> !(srcX & 7) && !(dstX & 7) && !(width & 7)) {
>> int i;
>> CARD8 *src = (CARD8 *) srcLine;
>> CARD8 *dst = (CARD8 *) dstLine;
>> -
>> +
>> srcStride *= sizeof(FbBits);
>> dstStride *= sizeof(FbBits);
>> width >>= 3;
>
> What is the story with the 'reverse' variable? Does it have no use
> anymore and needs to be purged later?
It's an argument to the function that indicates whether each scanline
should be copied back-to-front. In the non-overlapping case, it can be
ignored, but it is used in other places in fbBlt().
Soren
More information about the xorg-devel
mailing list