[PATCH xf86-video-glint] No need for byteswapping in YV12 decoding on BE machines

Sat Dec 11 13:52:25 PST 2010

> From: Alan Hourihane <alanh at fairlite.co.uk>
> Date: Thu, 09 Dec 2010 13:00:54 +0000
> 
> On Thu, 2010-12-09 at 13:41 +0100, Mark Kettenis wrote:
> > > Might want to check the hardware manuals on this, but there might be a
> > > byteswap bit someplace that does it in hardware. The PGX32 may do this
> > > at BIOS init time, whereas a PC version may not. There is a bit for this
> > > on the PM3, so there may be something on PM2.
> > 
> > Well, obviously the PGX32 doesn't have a BIOS, but the Open Firmware
> > Forth code on the card might indeed do something like that.  The
> > OpenBSD kernel driver for this card also twiddles some of the
> > endianness bits for the framebuffer windows to get things into the
> > state expected by the glint driver.  The Linux framebuffer driver does
> > something similar, and I'm pretty sure NetBSD has something like that
> > as well.
> > 
> > I don't have access to hardware documentation.  Documentation was only
> > ever available under NDA, wasn't it?  Do you expect there is a separate
> > bit to set the endianness used by the YUV transformation hardware?
> 
> Possibly. I can't remember. But it's worth checking because the code
> wouldn't have been added without good reason in the first place.

Did some further digging, and I think I've figured out what's going on
here.

The way the code works is that the YV12 data is uploaded to the
texture buffer which is then translated to RGB and copied to the
framebuffer by the hardware.  Uploading the YV12 data to the texture
buffer is done through the same aperture that maps the framebuffer.
Therefore writes to the texture buffer undergo the same byte twisting
as writes to the framebuffer.  This has some interesting consequences.

My PGX32 is a Permedia 2v and, by default, runs in 32bpp mode.  So
pixels are 32 bits wide and therefore the glint driver configures the
aperture to do byte swapping, such that they end up as little-endian
RGBA values in the framebuffer.  Since the YV12 data is handled in
32-bit wide units too, it undergoes the same big-endian to
little-endian transformation, and there is no need for CopyYV12() to
do anything special.  That means CopyYV12LE() is doing the right
thing.
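
To make the 32bpp case concrete, here is a minimal sketch that models a
32-bit store from a big-endian CPU going through a byte-swapping
aperture.  It is not taken from the driver: pack_le() is a hypothetical
stand-in for the word CopyYV12LE() builds, and the texel layout
(Y0, U, Y1, V at ascending addresses) is an assumption.

```c
#include <stdint.h>

/* Hypothetical stand-in for the word the little-endian copy routine
 * builds.  Assumption: the converter wants Y0, U, Y1, V at ascending
 * byte addresses in texture memory. */
static uint32_t pack_le(uint8_t y0, uint8_t u, uint8_t y1, uint8_t v)
{
    return (uint32_t)y0 | ((uint32_t)u << 8) |
           ((uint32_t)y1 << 16) | ((uint32_t)v << 24);
}

/* Bytes a big-endian CPU presents for a 32-bit store, lowest address
 * first: most significant byte goes out first. */
static void store_be32(uint32_t w, uint8_t bus[4])
{
    bus[0] = (uint8_t)(w >> 24);
    bus[1] = (uint8_t)(w >> 16);
    bus[2] = (uint8_t)(w >> 8);
    bus[3] = (uint8_t)w;
}

/* Model of the byte-swapping aperture: reverse all four bytes. */
static void aperture_byteswap(const uint8_t in[4], uint8_t out[4])
{
    out[0] = in[3];
    out[1] = in[2];
    out[2] = in[1];
    out[3] = in[0];
}

/* Returns 1 if texture memory ends up holding Y0, U, Y1, V. */
static int yv12_ok_at_32bpp(uint8_t y0, uint8_t u, uint8_t y1, uint8_t v)
{
    uint8_t bus[4], mem[4];

    store_be32(pack_le(y0, u, y1, v), bus);
    aperture_byteswap(bus, mem);
    return mem[0] == y0 && mem[1] == u && mem[2] == y1 && mem[3] == v;
}
```

Under these assumptions yv12_ok_at_32bpp() returns 1 for any input,
because the swap turns the big-endian store into exactly the byte order
a little-endian CPU would have written.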

However, if I tell X to run in 16bpp mode, the situation changes.
Pixels are 16 bits wide and therefore the glint driver configures the
aperture to do half-word swapping.  But since the YV12 data is still
handled in 32-bit wide units, it now ends up mangled when written to
the texture buffer.  Unsurprisingly, this results in weird-looking
video as well.  Note that the CopyYV12BE() code doesn't fix this.
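
A small model of the 16bpp case shows why neither packing survives.
This is a sketch under assumptions, not driver code: "half-word
swapping" is assumed to mean swapping the two bytes inside each 16-bit
half of a 32-bit access, the wanted texel layout is assumed to be
Y0, U, Y1, V, and pack_le()/pack_be() are hypothetical stand-ins for
what the two copy routines build.

```c
#include <stdint.h>

/* Hypothetical stand-in for the little-endian packing. */
static uint32_t pack_le(uint8_t y0, uint8_t u, uint8_t y1, uint8_t v)
{
    return (uint32_t)y0 | ((uint32_t)u << 8) |
           ((uint32_t)y1 << 16) | ((uint32_t)v << 24);
}

/* Hypothetical stand-in for the big-endian packing. */
static uint32_t pack_be(uint8_t y0, uint8_t u, uint8_t y1, uint8_t v)
{
    return ((uint32_t)y0 << 24) | ((uint32_t)u << 16) |
           ((uint32_t)y1 << 8) | (uint32_t)v;
}

/* Model a 32-bit store from a big-endian CPU through an aperture that
 * (by assumption) swaps the two bytes inside each 16-bit half.
 * A big-endian store presents the bytes w>>24, w>>16, w>>8, w in
 * address order; the aperture then exchanges each adjacent pair. */
static void store_halfword_swapped(uint32_t w, uint8_t mem[4])
{
    mem[0] = (uint8_t)(w >> 16);
    mem[1] = (uint8_t)(w >> 24);
    mem[2] = (uint8_t)w;
    mem[3] = (uint8_t)(w >> 8);
}
```

In this model the little-endian packing lands in memory as
Y1, V, Y0, U and the big-endian packing as U, Y0, V, Y1, so both
orders differ from the assumed Y0, U, Y1, V.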

Running X in 8bpp mode is also still possible.  In that case, the
aperture is configured not to do any byte swapping, so the
CopyYV12BE() code would do the right thing.  But to be honest, given
the limited color space that 8bpp mode provides, it is hard to tell
whether it does.

So I guess some sort of byteswapping version of CopyYV12() will be
necessary to fix the 16bpp case.  Alternatively we could use the second
aperture and set it up to always do byteswapping and use that to
upload the YV12 data to the texture buffer.
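
As a rough illustration of the first option, the sketch below packs
each word so that it comes out right after half-word swapping.  It is
not a patch: the function names are hypothetical, the wanted texel
layout is assumed to be Y0, U, Y1, V, and the aperture is assumed to
swap the two bytes inside each 16-bit half.  If the hardware swaps
differently, the packing has to change accordingly.

```c
#include <stdint.h>

/* Hypothetical packing for the 16bpp case: build the little-endian
 * word, then exchange its 16-bit halves.  Together with the assumed
 * aperture swap this amounts to a full 32-bit byte reversal. */
static uint32_t pack_for_halfword_swap(uint8_t y0, uint8_t u,
                                       uint8_t y1, uint8_t v)
{
    uint32_t le = (uint32_t)y0 | ((uint32_t)u << 8) |
                  ((uint32_t)y1 << 16) | ((uint32_t)v << 24);

    return (le << 16) | (le >> 16);
}

/* Model a 32-bit store from a big-endian CPU through an aperture that
 * (by assumption) swaps the two bytes inside each 16-bit half. */
static void store_halfword_swapped(uint32_t w, uint8_t mem[4])
{
    mem[0] = (uint8_t)(w >> 16);
    mem[1] = (uint8_t)(w >> 24);
    mem[2] = (uint8_t)w;
    mem[3] = (uint8_t)(w >> 8);
}

/* Returns 1 if texture memory ends up holding Y0, U, Y1, V. */
static int yv12_ok_at_16bpp(uint8_t y0, uint8_t u, uint8_t y1, uint8_t v)
{
    uint8_t mem[4];

    store_halfword_swapped(pack_for_halfword_swap(y0, u, y1, v), mem);
    return mem[0] == y0 && mem[1] == u && mem[2] == y1 && mem[3] == v;
}
```

The second-aperture approach would avoid a depth-dependent copy
routine altogether, at the cost of setting up and mapping that
aperture; which is preferable probably depends on what else uses it.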

The strange thing is that for the Permedia 2 (as opposed to the
Permedia 2v), the code always clears the byte swap bits in the
aperture control registers.  So it would seem that CopyYV12BE() would
always do the right thing here.  However, without byte swapping,
"normal" framebuffer drawing operations in 16bpp and 32bpp mode will
not work.  So I'm fairly certain that Permedia 2 support on
big-endian machines is simply broken.