A stab at some EXA documentation

Wed Aug 3 16:43:22 PDT 2005

On Wed, 2005-08-03 at 14:59 -0700, Jesse Barnes wrote:
>   /**
>    * exaOffscreenAlloc - allocate offscreen memory
>    * @pScreen: current screen
>    * @size: size in bytes
>    * @align: alignment constraints
>    * @locked: allocate the memory as locked?
>    * @save: offscreen save routine
>    * @privData: private driver data for @save routine
>    *
>    * Allocate some offscreen memory from the device associated with @pScreen.
>    * @size and @align determine where and how large the section is, and
>    * @locked will determine whether the new memory should be freed later on or
>    * if it should be kept in card memory until freed explicitly.  @save and
>    * @privData are used to make room for the new allocation, if necessary.
>    *
>    * Returns NULL on failure, or a pointer to the new offscreen memory on
>    * success.
>    */
>   ExaOffscreenArea *exaOffscreenAlloc(ScreenPtr pScreen, int size, int align,
> 				      Bool locked, ExaOffscreenSaveProc save,
> 				      pointer privData);

Note that align currently must be a power of two.  This is something
that should get fixed, but it's the case at the moment, at least.
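
To make the power-of-two restriction concrete, here is a minimal sketch
of the check and of the rounding it allows.  The helper names are made
up for illustration and are not part of EXA:

```c
#include <assert.h>

/* Hypothetical helper: nonzero if v is a power of two (1, 2, 4, ...). */
static int is_power_of_two(unsigned long v)
{
    return v != 0 && (v & (v - 1)) == 0;
}

/* Hypothetical helper: round offset up to the next multiple of align.
 * The mask trick only works when align is a power of two. */
static unsigned long align_up(unsigned long offset, unsigned long align)
{
    assert(is_power_of_two(align));
    return (offset + align - 1) & ~(align - 1);
}
```

With align = 16, an offset of 100 rounds up to 112, and an offset that
is already aligned (say 128) is left alone.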

> 
>   /**
>    * exaOffscreenFree - free offscreen memory
>    * @pScreen: current screen
>    * @area: area to free
>    *
>    * Free some offscreen memory previously allocated with exaOffscreenAlloc,
>    * described by @area.
>    *
>    * [Returns a pointer to something.  The last area?]
>    */
>   ExaOffscreenArea *exaOffscreenFree(ScreenPtr pScreen, ExaOffscreenArea *area)

Yeah, the pointer is to the now-freed area.  It's used in the
kicking-out process.

>   /**
>    * exaInitCard - initialize EXA card structure
>    * @exa: card structure to initialize
>    * @sync: card needs sync?
>    * @memory_base: base of framebuffer memory

That should be "pointer to the beginning of framebuffer memory"

>    * @off_screen_base: start of offscreen memory

Let's say "offset to the first free byte of offscreen memory."

>    * @memory_size: size of memory

"size of framebuffer memory", i.e. the offscreen allocator will allocate
from (memory_base + off_screen_base) to (memory_base + memory_size).
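
As a rough sketch of that arithmetic (the struct and function names are
invented for the example, and I'm assuming the end of the range is
exclusive):

```c
#include <assert.h>

/* Hypothetical description of the region the allocator may use. */
typedef struct {
    unsigned char *start; /* first byte available for allocation */
    unsigned char *end;   /* one past the last usable byte (assumed) */
} OffscreenRange;

/* Derive the allocatable region from the three card parameters. */
static OffscreenRange offscreen_range(unsigned char *memory_base,
                                      unsigned long off_screen_base,
                                      unsigned long memory_size)
{
    OffscreenRange r;

    assert(off_screen_base <= memory_size);
    r.start = memory_base + off_screen_base;
    r.end = memory_base + memory_size;
    return r;
}

/* Number of bytes available to the offscreen allocator. */
static unsigned long offscreen_bytes(OffscreenRange r)
{
    return (unsigned long)(r.end - r.start);
}
```

So a 1024-byte framebuffer with off_screen_base = 256 leaves 768 bytes
for offscreen allocations.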

>    * @offscreen_byte_align: card alignment restriction, in bytes
>    * @offscreen_pitch: card pitch restriction, in bytes

"card pitch alignment restriction, in bytes."

>    * @flags: flags
>    * @max_x: maximum width of screen
>    * @max_y: maximum height of screen
>    *
>    * This is just a wrapper around the initialization of the EXA driver's card
>    * structure.
>    *
>    * [What is sync for?]

>    * The flags argument specifies what features the card supports.  Two flags
>    * are currently defined:
>    *   %EXA_OFFSCREEN_PIXMAPS - offscreen pixmaps are supported
>    *   %EXA_OFFSCREEN_ALIGN_POT - offscreen objects must be power of two
>    *     aligned

"offscreen objects must have a power-of-two alignment of their pitch"

>    *
>    * This routine is just a macro, and so it can't fail (unless it causes a
>    * compile failure).
>    *
>    * [This macro appears broken in the current tree, s/Sync/needsSync/.]
>    */

Yeah, definitely broken.

>   void exaInitCard(EXADriverPtr *exa, Bool sync, CARD8 *memory_base,
> 		   unsigned long off_screen_base, unsigned long memory_size,
> 		   int offscreen_byte_align, int offscreen_pitch, int flags,
> 		   int max_x, int max_y)
> 
>   /**
>    * exaMarkSync - mark a sync point
>    * @pScreen: current screen
>    *
>    * Record a marker for later synchronization.
>    *
>    * Private, core server use only?
>    */
>   void exaMarkSync(ScreenPtr pScreen)

Re: "Private...?": Not necessarily, but it still probably won't be used
by most drivers.  For example, if you fired off some sort of
acceleration as part of your XV support, you would need to call
exaMarkSync to tell EXA that the accelerator has been used.

>   /**
>    * exaWaitSync - wait for the last marker to complete
>    * @pScreen: current screen
>    *
>    * Wait until the device associated with @pScreen is done with the operation
>    * associated with the last exaMarkSync() call.
>    *
>    * Private, core server use only?
>    */
>   void exaWaitSync(ScreenPtr pScreen)

Re: "Private...?": No, it's a good way to ensure the accelerator is idle
without doing excess checks for idle.  Might be needed on video mode
changes, playing around in the framebuffer, etc.

>   /**
>    * exaOffscreenInit - initialize offscreen memory
>    * @pScreen: current screen
>    *
>    * Private, core server use only?
>    */
>   Bool exaOffscreenInit(ScreenPtr pScreen)

Yeah, that one's going to be exa internal, I'm pretty sure.

> Driver EXA routines
> -------------------
> EXA requires the addition of new routines to your driver's acceleration
> implementation.  The following structure defines the EXA acceleration API;
> some routines must be implemented in your driver, others are optional.
> 
>   typedef struct _ExaAccelInfo {
>   /**
>    * PrepareSolid - setup for solid fill
>    * @pPixmap: Pixmap destination
>    * @alu: raster operation
>    * @planemask: mask for fill
>    * @fg: foreground color
>    *
>    * Setup the card's engine for a solid fill operation to @pPixmap.
>    * @alu specifies the raster op for the fill, @planemask specifies an
>    * optional mask, and @fg specifies the foreground color for the fill.
>    *
>    * You can add additional fields to your driver record structure to store
>    * state needed by this routine, if necessary.
>    *
>    * Return TRUE for success, FALSE for failure.

Let's say "Returns TRUE if the driver can successfully accelerate
subsequent Solid requests for the given parameters.  FALSE results in
fallback rendering"
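
A minimal sketch of that contract, with stand-in types so it compiles
without the server headers.  The Demo* names, the counters, and the
"hardware only does GXcopy with a full planemask" rule are all
assumptions for the example, not anything EXA mandates:

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins so the sketch compiles without the X server headers. */
typedef int Bool;
typedef unsigned long Pixel;
#define TRUE 1
#define FALSE 0
#define GXcopy 0x3 /* same value as in X11/X.h */

/* Counters standing in for real rendering. */
typedef struct {
    int accelerated;
    int fallback;
} DemoStats;

/* Pretend hardware: only plain copies with a full planemask work. */
static Bool DemoPrepareSolid(int alu, Pixel planemask, Pixel fg)
{
    (void)fg;
    if (alu != GXcopy)
        return FALSE;
    if (planemask != ~(Pixel)0)
        return FALSE;
    return TRUE;
}

/* What a caller does with the result: one Prepare, then either a run
 * of accelerated Solid calls or software fallback for every rect. */
static void DemoFillRects(DemoStats *stats, int alu, Pixel planemask,
                          Pixel fg, int nrects)
{
    int i;

    assert(stats != NULL);
    if (!DemoPrepareSolid(alu, planemask, fg)) {
        stats->fallback += nrects;
        return;
    }
    for (i = 0; i < nrects; i++)
        stats->accelerated++; /* Solid() would go here */
    /* DoneSolid() would go here */
}
```

The point is that rejecting a request in Prepare is cheap and safe: the
rectangles still get drawn, just not by the accelerator.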

>    * Required.
>    */
>   Bool (*PrepareSolid)(PixmapPtr pPixmap, int alu, Pixel planemask,
>                        Pixel fg);
> 
>   /**
>    * Solid - solid fill operation
>    * @pPixmap: Pixmap destination
>    * @x1: source X coordinate
>    * @y1: source Y coordinate
>    * @x2: destination X coordinate
>    * @y2: destination Y coordinate

x1/y1 are the top-left coordinate, x2/y2 are the bottom-right.
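
Most hardware wants a position plus a size rather than two corners, so
a driver typically converts.  A small sketch, assuming x2/y2 are
exclusive (one past the last pixel filled); DemoRect and solid_to_rect
are made-up names:

```c
#include <assert.h>

/* Hypothetical position-plus-size form of a fill rectangle. */
typedef struct {
    int x, y, width, height;
} DemoRect;

/* Convert Solid's corner pair into position and size.
 * Assumes (x2, y2) is exclusive, so the size is a plain difference. */
static DemoRect solid_to_rect(int x1, int y1, int x2, int y2)
{
    DemoRect r;

    assert(x2 >= x1 && y2 >= y1);
    r.x = x1;
    r.y = y1;
    r.width = x2 - x1;
    r.height = y2 - y1;
    return r;
}
```

Under that assumption a call with (10, 20, 110, 70) describes a 100x50
fill whose top-left pixel is (10, 20).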

>    *
>    * Perform the fill as specified by PrepareSolid, from x1,y1 to x2,y2.  This
>    * is very similar to the XAA solid fill routine.
>    *
>    * Must not fail.
>    *
>    * Required.
>    */
>   void (*Solid)(PixmapPtr pPixmap, int x1, int y1, int x2, int y2);
> 
>   /**
>    * DoneSolid - finish a solid fill
>    * @pPixmap: Pixmap to finish
>    *
>    * Finish the solid fill done by the last Solid call.

"Finish the solid fills done in the preceding Solid calls."

>    * Must not fail.
>    *
>    * Required.
>    */
>   void (*DoneSolid)(PixmapPtr pPixmap);
> 
>   /**
>    * PrepareCopy - setup a copy operation
>    * @pSrcPixmap: source Pixmap
>    * @pDstPixmap: destination Pixmap
>    * @xdir: x direction for the copy
>    * @ydir: y direction for the copy
>    * @alu: raster operation
>    * @planemask: optional planemask for the copy
>    *
>    * Copy @pSrcPixmap to @pDstPixmap in the x and y direction specified,
>    * with the raster operation @alu.  @planemask specifies an optional
>    * planemask for the copy.
>    *
>    * You can add additional fields to your driver record structure to store
>    * state needed by this routine, if necessary.
>    *
>    * Return TRUE for success, FALSE for failure.

Let's say "Returns TRUE if the driver can successfully accelerate
subsequent Copy requests for the given parameters.  FALSE results in
fallback rendering"

>    * Required.
>    */
>   Bool (*PrepareCopy)(PixmapPtr pSrcPixmap, PixmapPtr pDstPixmap, int xdir,
>                       int ydir, int alu, Pixel planemask);
> 
>   /**
>    * Copy - perform a copy between two pixmaps
>    * @pDstPixmap: destination Pixmap
>    * @srcX: source X coordinate
>    * @srcY: source Y coordinate
>    * @dstX: destination X coordinate
>    * @dstY: destination Y coordinate
>    * @width: copy width
>    * @height: copy height
>    *
>    * Perform the copy set up by the previous PrepareCopy call, from
>    * (@srcX, @srcY) to (@dstX, @dstY) using @width and @height to
>    * determine the size of the copy.
>    *
>    * This is very similar to an XAA screen to screen copy.
>    *
>    * Must not fail.
>    *
>    * Required.
>    */
>   void (*Copy)(PixmapPtr pDstPixmap, int srcX, int srcY, int dstX, int dstY,
>                int width, int height);
> 
>   /**
>    * DoneCopy - finish a copy operation
>    * @pDstPixmap: Pixmap to complete
>    *
>    * Tear down the copy operation for @pDstPixmap, if necessary.
>    *
>    * Must not fail.
>    *
>    * Required.
>    */
>   void (*DoneCopy)(PixmapPtr pDstPixmap);
> 
>   /**
>    * CheckComposite - check to see if a composite operation is doable
>    * @op: composite operation
>    * @pSrcPicture: source Picture
>    * @pMaskPicture: mask Picture
>    * @pDstPicture: destination Picture
>    *
>    * Check to see if @pSrcPicture can be composited onto @pDstPicture with
>    * @pMaskPicture as a mask.
>    *
>    * Returns TRUE for success, FALSE for failure.
>    *
>    * Optional.

"Returns TRUE if it should be possible to accelerate the given operation
once the pixmap is migrated, or FALSE if it can't be accelerated.  While
not required, it can avoid migration and the resulting penalties for
operations not supported by the hardware."

>    */
>   Bool (*CheckComposite)(int op, PicturePtr pSrcPicture,
>                          PicturePtr pMaskPicture, PicturePtr pDstPicture);
> 
>   /**
>    * PrepareComposite - setup a composite operation
>    * @op: composite operation
>    * @pSrcPicture: source Picture
>    * @pMaskPicture: mask Picture
>    * @pDstPicture: destination Picture
>    * @pSrc: Pixmap source
>    * @pMask: Pixmap mask
>    * @pDst: Pixmap destination
>    *
>    * Setup the composite operation, @op, with the passed in parameters.
>    * [Need more detail here.]
>    *
>    * Must not fail?

May fail!  Sometimes a Prepare will happen on pixmaps that don't quite
fit the necessary rules for acceleration (the visible screen is an
important special case), in which case it also needs to return FALSE,
resulting in fallback rendering.

>    *
>    * Optional.
>    */
>   Bool (*PrepareComposite)(int op, PicturePtr pSrcPicture,
>                            PicturePtr pMaskPicture, PicturePtr pDstPicture,
>                            PixmapPtr pSrc, PixmapPtr pMask, PixmapPtr pDst);
> 
>   /**
>    * Composite - perform a composite operation
>    * @pDst: destination Pixmap
>    * @srcX: source X coordinate
>    * @srcY: source Y coordinate
>    * @maskX: X coordinate of mask
>    * @maskY: Y coordinate of mask
>    * @dstX: destination X coordinate
>    * @dstY: destination Y coordinate
>    * @width: operation width
>    * @height: operation height
>    *
>    * Perform the composite operation setup by the prior PrepareComposite call.

"Perform a composite operation set up by the last PrepareComposite call"

>    * Must not fail.
>    *
>    * Optional.
>    */
>   void (*Composite)(PixmapPtr pDst, int srcX, int srcY, int maskX, int maskY,
>                     int dstX, int dstY, int width, int height);
> 
>   /**
>    * DoneComposite - composite operation teardown
>    * @pDst: Pixmap in question
>    *
>    * Finish and teardown the composite operation performed by the last
>    * Composite call.
>    *
>    * Must not fail.
>    *
>    * Optional.
>    */
>   void (*DoneComposite)(PixmapPtr pDst);
> 
>   /**
>    * UploadToScreen - load memory into video RAM
>    * @pDst: destination Pixmap
>    * @src: source in system memory
>    * @src_pitch: width of source
>    *
>    * Copy system memory from @src to the PixmapPtr @pDst @src_pitch
>    * bytes at a time (unless the destination is narrower?).

The destination will always be wide enough.  UploadToScreen is used to
take pixmaps in system memory and put them in framebuffer, where
alignment restrictions will only increase the pitch.  It should only be
implemented as an accelerated operation, since syncing and memcpy() are
already done in the fallback (absent) case.
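
To show what the two pitches mean, here is the transfer written as a
plain CPU copy.  This is only an illustration of the data layout (a
real UploadToScreen would hand the rows to the hardware instead), and
copy_rows is a made-up name:

```c
#include <assert.h>
#include <string.h>

/* Copy height rows of row_bytes each.  Source rows are src_pitch bytes
 * apart and destination rows are dst_pitch bytes apart; the destination
 * pitch is usually the larger one because of alignment padding. */
static void copy_rows(unsigned char *dst, int dst_pitch,
                      const unsigned char *src, int src_pitch,
                      int row_bytes, int height)
{
    int y;

    assert(row_bytes <= src_pitch && row_bytes <= dst_pitch);
    for (y = 0; y < height; y++) {
        memcpy(dst, src, (size_t)row_bytes);
        dst += dst_pitch;
        src += src_pitch;
    }
}
```

Copying two 3-byte rows into a destination with a pitch of 8 puts the
second row at byte 8, and the padding bytes in between are untouched.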

>    * Return TRUE for success, FALSE for failure.
>    *
>    * Optional but recommended.
>    */
>   Bool (*UploadToScreen)(PixmapPtr pDst, char *src, int src_pitch);
> 
>   /**
>    * UploadToScratch - load memory into video RAM
>    * @pSrc: source Pixmap
>    * @pDst: destination Pixmap
>    *
>    * Just copy @pSrc to @pDst in the scratch area?  This could be
>    * offscreen memory or system memory in AGP space?
>    *
>    * Return TRUE for success, FALSE for failure.
>    *
>    * Optional but recommended.

Recommended in the absence of UploadToScreen, otherwise ignore.  Must
set up a space (likely reserved at startup) in framebuffer memory, write
the source pixmap (from system memory) into the area, and adjust
pDst->devKind to the pitch of the destination and pDst->devPrivate.ptr
to the pointer to the area.  The driver must take care of syncing of
UploadToScratch.  Only the data from the last UploadToScratch is valid
at any time.

>    */
>   Bool (*UploadToScratch)(PixmapPtr pSrc, PixmapPtr pDst);
> 
>   /**
>    * DownloadFromScreen - copy from video RAM to system memory
>    * @pSrc: source Pixmap
>    * @x: starting X coordinate in Pixmap
>    * @y: starting Y coordinate in Pixmap
>    * @w: copy width
>    * @h: copy height
>    * @dst: destination in system memory
>    * @dst_pitch: target width
>    *
>    * Just copy (x,y)->(x+w,y+h) from @pSrc to @dst using @dst_pitch
>    * width?
>    *
>    * Return TRUE for success, FALSE for failure.
>    *
>    * Optional but recommended.
>    */
>   Bool (*DownloadFromScreen)(PixmapPtr pSrc, int x, int y, int w, int h,
>                              char *dst, int dst_pitch);
> 
>   /**
>    * MarkSync - return a marker for later use by WaitMarker
>    * @pScreen: current screen
>    *
>    * Return a command marker for use by WaitMarker.  This is an optional
>    * optimization that can keep WaitMarker from having to idle the whole
>    * engine.
>    *
>    * Returns an integer command id marker.
>    *
>    * Optional.
>    */
>   int (*MarkSync)(ScreenPtr pScreen);
> 
>   /**
>    * WaitMarker - finish the last command
>    * @pScreen: current screen
>    * @marker: command marker
>    *
>    * Return after the command specified by @marker is done, or just idle
>    * the whole engine (the latter is your only option unless you implement
>    * MarkSync()).
>    *
>    * Must not fail.
>    *
>    * Required.
>    */
>   void (*WaitMarker)(ScreenPtr pScreen, int marker);
>   } ExaAccelInfoRec, *ExaAccelInfoPtr;
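
The marker idea can be sketched with a toy engine that just counts
commands.  Everything here (DemoEngine and the demo_* functions) is
invented for the example; a real driver would read a hardware counter
or fence instead:

```c
#include <assert.h>

/* Toy engine: commands are numbered in submission order. */
typedef struct {
    int submitted; /* commands queued so far */
    int completed; /* commands the engine has finished */
} DemoEngine;

/* Queue one command. */
static void demo_submit(DemoEngine *e)
{
    e->submitted++;
}

/* MarkSync analogue: the marker is the number of the last command. */
static int demo_mark_sync(const DemoEngine *e)
{
    return e->submitted;
}

/* Stand-in for the hardware making progress on one command. */
static void demo_retire_one(DemoEngine *e)
{
    if (e->completed < e->submitted)
        e->completed++;
}

/* WaitMarker analogue: wait only until the marked command is done,
 * not until the whole queue has drained. */
static void demo_wait_marker(DemoEngine *e, int marker)
{
    assert(marker <= e->submitted);
    while (e->completed < marker)
        demo_retire_one(e);
}
```

If a third command is queued after the marker is taken, waiting on the
marker returns once the first two are done and leaves the third
pending, which is the saving over idling the whole engine.
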
> 
> This is an extra, optional routine, used as a callback to the offscreen
> allocation function.
> 
>   /**
>    * ScratchSave - save the scratch area, or just throw it away
>    * @pScreen: ScreenPtr for this screen
>    * @area: offscreen area pointer to save
>    *
>    * This routine is responsible for saving the scratch area for later
>    * use, but can optionally just throw it away by setting the driver's
>    * exa_scratch field to NULL.  This is the routine that should be passed to
>    * exaOffscreenAlloc so it can save the scratch area if necessary.  It might
>    * be implemented as a copy from video RAM to AGP space, for example?
>    *
>    * Must not fail.
>    *
>    * Optional.
>    */
>   void ScratchSave(ScreenPtr pScreen, ExaOffscreenArea *area);

Where is "ScratchSave" from?

> 
> [Necessary structure description here.]
> 
> EXA driver fields
> -----------------
> Each driver record structure also needs a few additional fields if EXA
> support is to be used, e.g.:

Note that this isn't a recipe here.  For example, most cards can
probably set up the offsets/pitch/bpp once during Setup and then ignore
them for future drawing calls.

>   #if XF86EXA
>   /* Container structure, describes card and accel. hooks */
>   ExaDriverPtr EXADriverPtr;
> 
>   /* Optional fill parameters (see *Solid above) */
>   int fillPitch, fillBpp, fillXoffs;
>   CARD32 fillDstBase;
> 
>   /* Optional copy parameters (see *Copy above) */
>   int copySXoffs, copyDXoffs, copyBpp;
>   int copySPitch, copyDPitch;
>   CARD32 copySrcBase, copyDstBase;
>   int copyXdir, copyYdir;
> 
>   /* Offscreen scratch area pointer & accounting, you can implement this any
>    * way you want. */
>   ExaOffscreenArea *exa_scratch;
>   unsigned int exa_scratch_next;
> 
>   /* Not sure if these are used?
>   void (*ExaRenderCallback)(ScrnInfoPtr);
>   Time ExaRenderTime;

Nope, these last couple wouldn't exist.

> EXA initialization
> ------------------
> Your driver's AccelInit routine has to initialize the EXADriverPtr and
> exa_scratch fields if EXA support is enabled, with appropriate error
> handling (i.e.  NoAccel and NoXvideo should be set to true if EXA fails
> to initialize for whatever reason).
> 
> A few card-specific fields need to be initialized:
> 
>   EXADriverPtr->card.memoryBase = ? /* base of the framebuffer */
>   EXADriverPtr->card.memorySize = ? /* size of framebuffer */
>   EXADriverPtr->card.offScreenBase = ? /* start of offscreen memory */
>   /*
>    * Common calculation is (maybe this should be the default?)
>    * pScrn->virtualX * pScrn->virtualY * ((pScrn->bitsPerPixel + 7) / 8)
>    */

You've probably got alignment to do on your screen's pitch (virtualX),
so I'd guess not.
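
A sketch of what the aligned version might look like.  The function
names and the power-of-two pitch alignment are assumptions for the
example; the real restriction depends on your card:

```c
#include <assert.h>

/* Hypothetical helper: bytes per scanline once the pitch is rounded up
 * to pitch_align, which is assumed to be a power of two. */
static unsigned long screen_pitch(int virtual_x, int bits_per_pixel,
                                  unsigned long pitch_align)
{
    unsigned long bytes_pp = (unsigned long)((bits_per_pixel + 7) / 8);
    unsigned long pitch = (unsigned long)virtual_x * bytes_pp;

    assert(virtual_x > 0);
    assert(pitch_align != 0 && (pitch_align & (pitch_align - 1)) == 0);
    return (pitch + pitch_align - 1) & ~(pitch_align - 1);
}

/* Hypothetical helper: first byte past the visible screen, usable as
 * the offscreen base if nothing else sits in between. */
static unsigned long screen_end_offset(int virtual_x, int virtual_y,
                                       int bits_per_pixel,
                                       unsigned long pitch_align)
{
    return screen_pitch(virtual_x, bits_per_pixel, pitch_align) *
           (unsigned long)virtual_y;
}
```

For a 1400x1050 screen at 32bpp with a 256-byte pitch alignment, the
pitch is 5632 rather than 5600, so the screen ends at 5913600 bytes
instead of the 5880000 the unaligned formula gives.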

>   /* Alignment restrictions for accessing card memory */
>   EXADriverPtr->card.offscreenByteAlign = ?
>   EXADriverPtr->card.offscreenPitch = ?
> 
>   /* Max screen size supported by the card? */
>   EXADriverPtr->card.maxX = ?
>   EXADriverPtr->card.maxY = ?
> 
> The AccelInit routine also needs to make sure that there's enough
> offscreen memory for certain operations to function, like Xvideo, which
> needs at least as much offscreen memory as there is framebuffer memory?

I'd say that your Xvideo needs to adjust its maximum size advertised to
deal with however little offscreen memory you have available.

> [Is SiS broken here?  It just checks to see if *any* offscreen memory is
> present, and assumes that's enough.]
> 
> And of course all the callbacks you implemented as described above (with
> whatever names you've chosen):
> 
>   EXADriverPtr->accel.WaitMarker = WaitMarker;
> 
>   /* Solid fill & copy, the bare minimum */
>   EXADriverPtr->accel.PrepareSolid = PrepareSolid;
>   EXADriverPtr->accel.Solid = Solid;
>   EXADriverPtr->accel.DoneSolid = DoneSolid;
>   EXADriverPtr->accel.PrepareCopy = PrepareCopy;
>   EXADriverPtr->accel.Copy = Copy;
>   EXADriverPtr->accel.DoneCopy = DoneCopy;
> 
>   /* [How should composite be done?] */
> 
>   /* Upload, download to/from Screen, optional */
>   EXADriverPtr->accel.UploadToScreen = UploadToScreen;
>   EXADriverPtr->accel.DownloadFromScreen = DownloadFromScreen;
> 
> After setting up the above, AccelInit should call exaDriverInit and pass in
> the current Screen and the new EXADriverPtr that was just allocated and filled
> out (don't forget to check for errors as that routine can fail).  It should
> also allocate some offscreen memory for glyph data? using the
> exaOffscreenAlloc function, e.g.
> 
>   pDrv->exa_scratch = exaOffscreenAlloc(pScreen, 128 * 1024, 16, TRUE,
> 					ScratchSave, pDrv);
> 
>   /* Why is this required?  Shouldn't exaOffscreenAlloc take care of it? */
>   if(pDrv->exa_scratch) {
>     pDrv->exa_scratch_next = pDrv->exa_scratch->offset;
>     pDrv->EXADriverPtr->accel.UploadToScratch = UploadToScratch;
>   }

Only use a scratch area if you don't have working UploadToScreen.  Try
to get UploadToScreen working first -- scratch is a dirty hack.

> [How do we know how much to allocate for glyph data?]

You don't need much -- think of how big a line of text is, and that's
probably it.  256k is pretty generous, and will keep syncing to a
minimum.

> EXA teardown
> ------------
> At screen close time, EXA drivers should free any offscreen memory that
> was allocated, call exaDriverFini with their EXADriverPtr field, free
> that structure, and do any other necessary teardown.

I would be surprised if they actually did have to free their memory, but
I haven't checked.

Thanks for attacking this documentation -- it's annoying stuff to write!

-- 
Eric Anholt                                     eta at lclark.edu
http://people.freebsd.org/~anholt/              anholt at FreeBSD.org