An iwmmxt optimised shadowUpdateRotate16_270YX, with problems.

Siarhei Siamashka siarhei.siamashka at gmail.com
Fri Aug 26 05:24:24 PDT 2011


On Fri, Aug 26, 2011 at 2:21 PM, Marko Katić <dromede at gmail.com> wrote:
> Hi there!
> I'm trying to optimise shadow copies from landscape oriented 16bit shadowfb
> to portrait 16bit fb proper. This is done in shadowUpdateRotate16_270YX
> which is located in miext/shadow/shrotpackYX.h. My optimisation is aimed at
> pxa27x and xscale3 arm processors only since it uses iwmmxt asm code. I
> guess it could easily be ported to x86 mmx code too.
> As I understand, the current implementation copies a single pixel at a time,
> like this:
>     *win = *sha++;
>     win += WINSTEPX(winStride);
> This also means that we're stepping over entire cachelines since every pixel
> of a single shadowfb line has to be copied to a new line of fb proper.
> My patch tries to copy 4x4 pixel blocks prerotated to portrait orientation.
> Basically, it takes 4 lines of shadowfb and divides it into 4x4 blocks.
> Then it rotates them and copies them to fb proper. This way, instead of
> copying a single pixel per fb proper line, it copies four. The rotation code
> is done in iwmmxt asm and takes about 0.9 instructions per pixel (assuming
> the 4x4 block is already in iwmmxt registers). 4x4 blocks imply that the
> rectangle to be copied is width and height aligned to 4 pixels. If not, the
> patch reduces the rectangle to proper alignment with single pixel copies for
> width and height.
> It doesn't really work and I can't find a reason why. The initial Xfbdev
> screen is looking fine, but when I start moving the pointer or windows, all
> I get is garbage.
> The patch was tested on kdrive 1.3.0.0 running in qemu and on a Zaurus
> C-1000.
> If anyone has any suggestions, please do tell.
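
For hunting down the garbage, a plain C reference of the 4x4 block scheme
described above may be handy for comparing against the asm output. This is
only a sketch under my own assumptions, not the code from the patch: the
function names are hypothetical, strides are counted in pixels, and the
rotation direction (source pixel (x, y) lands in destination row w - 1 - x,
column y) is assumed and may need flipping to match what
shadowUpdateRotate16_270YX really does.

```c
#include <stddef.h>
#include <stdint.h>

/* Rotate one 4x4 block: read 4 source rows, write 4 destination rows.
 * Assumed convention: source column 3 becomes destination row 0. */
static void
rotate_block_4x4 (uint16_t *dst, size_t dst_stride,
                  const uint16_t *src, size_t src_stride)
{
    uint16_t b[4][4];
    int i, j;

    for (j = 0; j < 4; j++)
        for (i = 0; i < 4; i++)
            b[j][i] = src[j * src_stride + i];

    /* each destination row gets 4 adjacent pixels in one go */
    for (i = 0; i < 4; i++)
        for (j = 0; j < 4; j++)
            dst[i * dst_stride + j] = b[j][3 - i];
}

/* Hypothetical reference: src is w x h, dst is h x w, strides in pixels */
void
rotate_270_16_blocks (uint16_t *dst, size_t dst_stride,
                      const uint16_t *src, size_t src_stride,
                      size_t w, size_t h)
{
    size_t wa = w & ~(size_t) 3;    /* width rounded down to 4 */
    size_t ha = h & ~(size_t) 3;    /* height rounded down to 4 */
    size_t x, y;

    /* aligned part, one 4x4 block at a time */
    for (y = 0; y < ha; y += 4)
        for (x = 0; x < wa; x += 4)
            rotate_block_4x4 (dst + (w - 4 - x) * dst_stride + y, dst_stride,
                              src + y * src_stride + x, src_stride);

    /* leftover right columns and bottom rows, one pixel at a time */
    for (y = 0; y < h; y++)
        for (x = (y < ha) ? wa : 0; x < w; x++)
            dst[(w - 1 - x) * dst_stride + y] = src[y * src_stride + x];
}
```

Running both versions on the same buffer with odd widths and heights, and
diffing the results, should show whether the block rotation or the leftover
handling is at fault.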

To get the best performance, it is important to take cache and TLB
misses into account. The 4x4 pixel blocks may be a bit too small.
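
To illustrate what I mean by larger blocks, here is a rough C sketch of a
tiled rotation with a tunable tile size. Everything in it is an assumption
for illustration: the names are made up, TILE = 16 is an arbitrary starting
point and not a measured value, strides are in pixels, and the rotation
direction (source pixel (x, y) goes to destination row w - 1 - x, column y)
may not match the xserver convention.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tile edge in pixels; a tuning knob, not a measured value */
#define TILE 16

/* Rotate one tw x th tile whose top-left source corner is (x0, y0) */
static void
rotate_tile_16 (uint16_t *dst, size_t dst_stride,
                const uint16_t *src, size_t src_stride,
                size_t w, size_t x0, size_t y0, size_t tw, size_t th)
{
    size_t x, y;

    for (x = x0; x < x0 + tw; x++)
    {
        /* one destination row segment is written sequentially ... */
        uint16_t *d = dst + (w - 1 - x) * dst_stride + y0;
        /* ... while the source is walked down one column of the tile */
        const uint16_t *s = src + y0 * src_stride + x;

        for (y = 0; y < th; y++)
        {
            d[y] = *s;
            s += src_stride;
        }
    }
}

/* src is w x h, dst is h x w; edge tiles are simply clipped */
void
rotate_270_16_tiled (uint16_t *dst, size_t dst_stride,
                     const uint16_t *src, size_t src_stride,
                     size_t w, size_t h)
{
    size_t x0, y0;

    for (y0 = 0; y0 < h; y0 += TILE)
    {
        size_t th = (h - y0 < TILE) ? h - y0 : TILE;

        for (x0 = 0; x0 < w; x0 += TILE)
        {
            size_t tw = (w - x0 < TILE) ? w - x0 : TILE;

            rotate_tile_16 (dst, dst_stride, src, src_stride,
                            w, x0, y0, tw, th);
        }
    }
}
```

The point is that each tile touches only TILE source lines and TILE
destination lines, so the working set stays small. The right value for TILE
would have to be found by benchmarking on the actual hardware.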

I think it may be interesting to integrate iwmmxt optimizations into
pixman and then use pixman for doing these rotations in xserver. Right
now there is a more or less cache-friendly C implementation for rotation
in pixman:
    http://cgit.freedesktop.org/pixman/tree/pixman/pixman-fast-path.c?id=pixman-0.22.2#n1620

Rotation is at least partially covered in the pixman test suite ('make
check'), so detecting and fixing the most obvious bugs could be a bit
easier than watching for image corruption in real use. Also the
'affine-test' program from the test suite can be used as an example of
doing rotations on memory buffers with pixman:
    http://cgit.freedesktop.org/pixman/tree/test/affine-test.c?id=pixman-0.22.2

-- 
Best regards,
Siarhei Siamashka


More information about the xorg-devel mailing list