[PATCH v2] fb: fix fast-path detection
Keith Packard
keithp at keithp.com
Tue Mar 25 08:28:57 PDT 2014
Daniel Kurtz <djkurtz at chromium.org> writes:
> Fix both of these by converting width from bits to FbBits, and let the
> pointer arithmetic convert to byte addresses.
Ok, I'm doing a bit more careful review this morning, and I think the
width parameter isn't quite right.
(Note that we can ignore bpp as fbBlt doesn't use it for computation at
all; it's purely a bit-copy function.). Imagine copying at 8-bpp two
pixels one place right:
dstLine = srcLine
srcX = 24
dstX = 32
width = 16
width >> FB_SHIFT = 0
srcLine < dstLine TRUE
&srcLine[width >> FB_SHIFT] > dstLine FALSE
dstLine < srcLine FALSE
&dstLine[width >> FB_SHIFT] > srcLine TRUE
You actually need to check whether there are any overlapping bits in the
copy. There are two ways to do this -- either add in the X offset when
computing the last address or by converting to byte addresses earlier
and using those. I think the byte address version is a lot simpler:
byteSrc = (uint8_t *) srcLine + (srcX >> 3);
byteDst = (uint8_t *) dstLine + (dstX >> 3);
byteWidth = width >> 3;
disjoint = (byteSrc + byteWidth <= byteDst) || (byteDst + byteWidth <= byteSrc);
The extra check for (bpp & 7) is not necessary as the memcpy case
already checks for byte alignment of source, dest and width.
For our test case above
dstLine = srcLine
srcX = 24
dstX = 32
width = 16
byteSrc = srcLine + 3
byteDst = dstLine + 4
byteWidth = 2
disjoint = (srcLine + 3 + 2 <= dstLine + 4) || (dstLine + 4 + 2 <= srcLine + 3)
= (srcLine + 5 <= srcLine + 4) || (srcLine + 6 <= srcLine + 3)
= FALSE || FALSE
= FALSE
However, if we copy two pixels right:
dstLine = srcLine
srcX = 24
dstX = 40
width = 16
byteSrc = srcLine + 3
byteDst = dstLine + 5
byteWidth = 2
disjoint = (srcLine + 3 + 2 <= dstLine + 5) || (dstLine + 5 + 2 <= srcLine + 3)
= (srcLine + 5 <= srcLine + 5) || (srcLine + 7 <= srcLine + 3)
= TRUE || FALSE
= TRUE
Or, if we copy one pixel left:
dstLine = srcLine
srcX = 32
dstX = 24
width = 16
byteSrc = srcLine + 4
byteDst = dstLine + 3
byteWidth = 2
disjoint = (srcLine + 4 + 2 <= dstLine + 3) || (dstLine + 3 + 2 <= srcLine + 4)
= (srcLine + 6 <= srcLine + 3) || (srcLine + 5 <= srcLine + 4)
= FALSE || FALSE
= FALSE
However, if we copy two pixels left:
dstLine = srcLine
srcX = 40
dstX = 24
width = 16
byteSrc = srcLine + 5
byteDst = dstLine + 3
byteWidth = 2
disjoint = (srcLine + 5 + 2 <= dstLine + 3) || (dstLine + 3 + 2 <= srcLine + 5)
= (srcLine + 7 <= srcLine + 3) || (srcLine + 5 <= srcLine + 5)
= FALSE || TRUE
= TRUE
Of course, we can only do this in bytes if the copy is byte aligned. The
current code already checks that with
!(srcX & 7) && !(dstX & 7) && !(width & 7)
So, our final code can look like this:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-fb-fix-fast-path-blt-detection.patch
Type: text/x-diff
Size: 5161 bytes
Desc: not available
URL: <http://lists.x.org/archives/xorg-devel/attachments/20140325/d461e2cf/attachment-0001.patch>
-------------- next part --------------
--
keith.packard at intel.com
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 810 bytes
Desc: not available
URL: <http://lists.x.org/archives/xorg-devel/attachments/20140325/d461e2cf/attachment-0001.sig>
More information about the xorg-devel
mailing list