[PATCH xf86-video-glint] No need for byteswapping in YV12 decoding on BE machines

Mark Kettenis mark.kettenis at xs4all.nl
Sat Dec 11 14:24:25 PST 2010


> Date: Sat, 11 Dec 2010 22:52:25 +0100 (CET)
> From: Mark Kettenis <mark.kettenis at xs4all.nl>
> 
> > From: Alan Hourihane <alanh at fairlite.co.uk>
> > Date: Thu, 09 Dec 2010 13:00:54 +0000
> > 
> > On Thu, 2010-12-09 at 13:41 +0100, Mark Kettenis wrote:
> > > > Might want to check the hardware manuals on this, but there might be a
> > > > byteswap bit someplace that does it in hardware. The PGX32 may do this
> > > > at BIOS init time, whereas a PC version may not. There is a bit for this
> > > > on the PM3, so there maybe something on PM2.
> > > 
> > > Well, obviously the PGX32 doesn't have a BIOS, but the Open Firmware
> > > Forth code on the card might indeed do something like that.  The
> > > OpenBSD kernel driver for this card also twiddles some of the
> > > endianness bits for the framebuffer windows to get things into the
> > > state expected by the glint driver.  The Linux framebuffer driver does
> > > something similar, and I'm pretty sure NetBSD has something like that
> > > as well.
> > > 
> > > I don't have access to hardware documentation.  Documentation was only
> > > ever available under NDA isn't it?  Do you expect there is a seperate
> > > bit to set the ednianness used by the YUV transformation hardware?
> > 
> > Possibly. I can't remember. But it's worth checking because the code
> > wouldn't have been added without good reason in the first place.
> 
> Did some further digging, and I think I've figured out what's going on
> here.
> 
> The way the code works is that the YV12 data is uploaded to the
> texture buffer which is then translated to RGB and copied to the
> framebuffer by the hardware.  Uploading the YV12 data to the texture
> buffer is done through the same aperture that maps the framebuffer.
> Therefore writes to the texture buffer undergo the same byte twisting
> as writes to the framebuffer.  This has some interesting consequences.
> 
> My PGX32 is a Permedia 2v and, by default, runs in 32bpp mode.  So
> pixels are 32 bits wide and therefore the glint driver configures the
> aperture to do byte swapping, such that they end up as little-endian
> RGBA values in the framebuffer.  Since the YV12 is handled as 32-bit
> wide units as well, they undergo the proper big-endian to
> little-endian transformation as well, and there is no need for
> CopyYV12() to do anything special.  That means CopyYV12LE() is doing
> the right thing.
> 
> However, if I tell X to run in 16bpp mode, the situation changes.
> Pixels are 16 bits wide and therefore the glint driver configures the
> aperture to do half-word swapping.  But since the YV12 data is handled
> as 32-bit wide units, it now ends up being mangled when written to the
> texture buffer.  Unsurprisingly this results in weird looking video as
> well.  Note that the CopyYV12BE() code doesn't fix this.
> 
> Running X in 8bpp mode is also still possible.  In that case, the
> aperture is configured not to do any byte swapping.  In this case the
> CopyYV12BE() code would do the right thing.  But to be honest, given
> the limited color space that 8bpp mode provides, it is hard to tell if
> it does.
> 
> So I guess some sort of byteswapping version of CopyYV12() will be
> necessary to fix the 16bpp.  Alternatively we could use the second
> aperture and set it up to always do byteswapping and use that to
> upload the YV12 data to the texture buffer.

The diff below (on top of my previous diff) does the the trick.  Not
sure if the 8bpp and 24bpp modes are worth supporting.  Video playback
on 8bpp looks absolutey afwul, and 24bpp is completely broken (crashes
the X server).

I'll fight with git and submit a complete diff if folks think this is a
sensible approach.


Index: src/pm2_video.c
===================================================================
RCS file: /cvs/xenocara/driver/xf86-video-glint/src/pm2_video.c,v
retrieving revision 1.3
diff -u -p -r1.3 pm2_video.c
--- src/pm2_video.c	5 Dec 2010 20:25:26 -0000	1.3
+++ src/pm2_video.c	11 Dec 2010 22:12:09 -0000
@@ -1722,6 +1722,56 @@ CopyYV12(CARD8 *Y, CARD32 *dst, int widt
     }
 }
 
+#if X_BYTE_ORDER == X_BIG_ENDIAN
+
+static void
+CopyYV12_16(CARD8 *Y, CARD32 *dst, int width, int height, int pitch)
+{
+    int Y_size = width * height;
+    CARD8 *V = Y + Y_size;
+    CARD8 *U = V + (Y_size >> 2);
+    int pad = (pitch >> 2) - (width >> 1);
+    int x;
+
+    width >>= 1;
+
+    for (height >>= 1; height > 0; height--) {
+	for (x = 0; x < width; Y += 2, x++)
+	    *dst++ = Y[1] + (V[x] << 8) + (Y[0] << 16) + (U[x] << 24);
+	dst += pad;
+	for (x = 0; x < width; Y += 2, x++)
+	    *dst++ = Y[1] + (V[x] << 8) + (Y[0] << 16) + (U[x] << 24);
+	dst += pad;
+	U += width;
+	V += width;
+    }
+}
+
+static void
+CopyYV12_8(CARD8 *Y, CARD32 *dst, int width, int height, int pitch)
+{
+    int Y_size = width * height;
+    CARD8 *V = Y + Y_size;
+    CARD8 *U = V + (Y_size >> 2);
+    int pad = (pitch >> 2) - (width >> 1);
+    int x;
+
+    width >>= 1;
+
+    for (height >>= 1; height > 0; height--) {
+	for (x = 0; x < width; Y += 2, x++)
+	    *dst++ = V[x] + (Y[1] << 8) + (U[x] << 16) + (Y[0] << 24);
+	dst += pad;
+	for (x = 0; x < width; Y += 2, x++)
+	    *dst++ = V[x] + (Y[1] << 8) + (U[x] << 16) + (Y[0] << 24);
+	dst += pad;
+	U += width;
+	V += width;
+    }
+}
+
+#endif
+
 static void
 CopyFlat(CARD8 *src, CARD8 *dst, int width, int height, int pitch)
 {
@@ -1818,8 +1868,24 @@ Permedia2PutImage(ScrnInfoPtr pScrn,
 
     switch (id) {
     case LE4CC('Y','V','1','2'):
-	CopyYV12(buf, (CARD32 *)((CARD8 *) pGlint->FbBase + pPPriv->BufferBase[0]),
-	    width, height, pPPriv->BufferStride);
+	switch(pGlint->HwBpp) {
+#if X_BYTE_ORDER == X_BIG_ENDIAN
+	case 8:
+	case 24:
+	    CopyYV12_8(buf, (CARD32 *)((CARD8 *) pGlint->FbBase + pPPriv->BufferBase[0]),
+		width, height, pPPriv->BufferStride);
+	    break;
+	case 15:
+	case 16:
+	    CopyYV12_16(buf, (CARD32 *)((CARD8 *) pGlint->FbBase + pPPriv->BufferBase[0]),
+		width, height, pPPriv->BufferStride);
+	    break;
+#endif
+	default:
+	    CopyYV12(buf, (CARD32 *)((CARD8 *) pGlint->FbBase + pPPriv->BufferBase[0]),
+		width, height, pPPriv->BufferStride);
+	    break;
+	}
 	PutYUV(pPPriv, pPPriv->BufferBase[0], FORMAT_YUYV, 1, 0);
 	break;
 


More information about the xorg-devel mailing list