Pixman Accessors

Wed Jun 10 00:37:11 PDT 2009

> The new scheme uses the accessor functions to move the data to host
> memory, then running whatever fast paths are available, then copying
> back to video memory.
> 
> I would guess that the new scheme is faster, but I don't really care
> if it isn't. The best you can hope for is "unusable, but benchmarks
> show this is faster" vs. "unusable".

If you're fetching, say, a whole scanline (or even a whole rect) using a
single function call, then that function can sensibly do a sanity check
beforehand to see whether a straight copy will work, or whether it has
to process the data.
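
A minimal sketch of that idea in C follows. It is not pixman's code:
the format enum, fetch_scanline() and expand_565() are hypothetical
names invented for illustration, and a8r8g8b8 is assumed to be the
working format. The check happens once per scanline, so the common
case collapses to a single memcpy().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical source formats; these are NOT pixman's format codes. */
typedef enum { FMT_A8R8G8B8, FMT_R5G6B5 } src_format_t;

/* Expand one r5g6b5 pixel to a8r8g8b8 with opaque alpha. */
static uint32_t expand_565(uint16_t p)
{
    uint32_t r = (p >> 11) & 0x1f;
    uint32_t g = (p >> 5) & 0x3f;
    uint32_t b = p & 0x1f;

    /* Replicate the top bits into the low bits to fill 8 bits. */
    r = (r << 3) | (r >> 2);
    g = (g << 2) | (g >> 4);
    b = (b << 3) | (b >> 2);
    return 0xff000000u | (r << 16) | (g << 8) | b;
}

/* Fetch `width` pixels from `src` into `dst` as a8r8g8b8. */
static void fetch_scanline(uint32_t *dst, const void *src,
                           src_format_t fmt, size_t width)
{
    /* Sanity check done once per call: if the source already matches
     * the working format, a straight copy is all that is needed. */
    if (fmt == FMT_A8R8G8B8) {
        memcpy(dst, src, width * sizeof(uint32_t));
        return;
    }

    /* Otherwise the data has to be processed pixel by pixel. */
    assert(fmt == FMT_R5G6B5);
    const uint16_t *s = src;
    for (size_t i = 0; i < width; i++)
        dst[i] = expand_565(s[i]);
}
```

With a per-pixel accessor the same format test would be paid for every
pixel; here it is paid once per call.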

And it can use a CPU-optimised copy.  On uncached memory, using
instructions that read large amounts at a time (SSE, AltiVec and NEON
can all read at least 16 bytes with one instruction) is far faster than
reading one pixel at a time, because every read starts a new bus
transaction even when the addresses are sequential.  Writing is less
critical provided write-combining is available: you just have to keep
up with the write buffer.
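
As a rough illustration of the read side, here is a sketch of a copy
that pulls 16 bytes per load using SSE2 intrinsics where the compiler
advertises them, and falls back to memcpy() elsewhere.
copy_from_video() is a hypothetical name, not a pixman function, and
the sketch assumes nothing about alignment or about how the video
memory was mapped; a real implementation would want to tune both.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Copy `bytes` bytes out of (possibly uncached) `src` into host
 * memory. Hypothetical helper for illustration only. */
static void copy_from_video(void *dst, const void *src, size_t bytes)
{
    uint8_t *d = dst;
    const uint8_t *s = src;

    assert(dst != NULL && src != NULL);

#if defined(__SSE2__)
    /* One 16-byte load per iteration instead of four 32-bit loads,
     * so a quarter as many separate reads hit the bus. */
    while (bytes >= 16) {
        __m128i v = _mm_loadu_si128((const __m128i *)s);
        _mm_storeu_si128((__m128i *)d, v);
        s += 16;
        d += 16;
        bytes -= 16;
    }
#endif

    /* Tail bytes, or the whole copy when SSE2 is not available. */
    memcpy(d, s, bytes);
}
```

The AltiVec and NEON versions would have the same shape with their own
load and store intrinsics.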

-- 
------
From: Jonathan Morton
      jonathan.morton at movial.com