New acceleration architecture

Zack Rusin zrusin at
Mon Jun 27 00:11:57 PDT 2005

On Sunday 26 June 2005 19:44, you wrote:
> On Sad, 2005-06-25 at 17:20, Zack Rusin wrote:
> > I'll be here to respond to any questions. If there's anything you think
> > is silly, I'll be more than happy to change it.
> >
> > I refuse to add acceleration hooks for low level primitives (e.g.
> > lines). At this day and age it really just doesn't make any sense.
> That seems silly. There are several cases where angled lines are
> horrendous worst cases for screen fetch/screen put operations especially
> as Xorg generally doesn't have DMA screen tile fetch so your performance
> for a fetch is a joke. Vertical lines in software also cause tlb
> thrashing. So yes that one does matter as any xtank player would
> understand 8)

Not that I'm doubting the greatness of xtank, but I do find tailoring an
acceleration architecture to the needs of corner cases a little weird. Sure,
we could add primitives for lines, rectangles, triangles and so on, but at
some point we have to draw the line and say "alright, that's enough". The
number of acceleration hooks in GDI+ is ridiculous, but then again the number
of drivers that implement them: 0. I'm drawing the line at the core X
primitives because between the two applications that still use them I'm sure
people will be able to take the hit. With Cairo and Arthur those hooks don't
give us anything at all, so they would be there only to satisfy xtank and
maybe some other games from the early nineties. The idea is that if your
whole desktop depends on software that is more than 10 years old then there's
a good chance that upgrading your X server to the latest and greatest is not
at the top of your todo list. And given that a similar architecture (without
the DMA down hooks) was in kdrive for a number of years and no one
complained, I think that we'll be fine.

> I assume btw you have a tile cache and block invalidation map/tree so
> that you don't generate unnecessary tile fetches, otherwise it's going
> to suck rocks on low end systems and damage overall system throughput
> not just graphics performance.

Yes it does, but it's explicitly tailored for XRender.
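
For readers who haven't seen one, the kind of tile cache with an invalidation
map being asked about can be sketched roughly as below. This is only a toy
under stated assumptions, not the Exa code: the names (TileCache,
tile_cache_invalidate, tile_cache_acquire), the tile size and the grid
dimensions are all made up for illustration, and the "fetch" is just a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TILE_SIZE 64   /* hypothetical tile edge, in pixels */
#define TILES_X   16   /* hypothetical grid width, in tiles */
#define TILES_Y   16   /* hypothetical grid height, in tiles */

typedef struct {
    bool valid[TILES_Y][TILES_X]; /* true: cached copy matches the screen */
    unsigned fetches;             /* number of simulated screen fetches */
} TileCache;

/* Start with every tile stale and no fetches recorded. */
static void tile_cache_init(TileCache *tc)
{
    memset(tc, 0, sizeof *tc);
}

/* Mark every tile touched by the rectangle [x, x+w) x [y, y+h) as stale. */
static void tile_cache_invalidate(TileCache *tc, int x, int y, int w, int h)
{
    /* Clip the rectangle to the area covered by the tile grid. */
    if (x < 0) { w += x; x = 0; }
    if (y < 0) { h += y; y = 0; }
    if (x + w > TILES_X * TILE_SIZE) w = TILES_X * TILE_SIZE - x;
    if (y + h > TILES_Y * TILE_SIZE) h = TILES_Y * TILE_SIZE - y;
    if (w <= 0 || h <= 0)
        return;

    int tx0 = x / TILE_SIZE, tx1 = (x + w - 1) / TILE_SIZE;
    int ty0 = y / TILE_SIZE, ty1 = (y + h - 1) / TILE_SIZE;

    for (int ty = ty0; ty <= ty1; ty++)
        for (int tx = tx0; tx <= tx1; tx++)
            tc->valid[ty][tx] = false;
}

/* Make a tile readable by software. Fetches only if the cached copy is
 * stale; returns true when a fetch was needed. */
static bool tile_cache_acquire(TileCache *tc, int tx, int ty)
{
    assert(tx >= 0 && tx < TILES_X && ty >= 0 && ty < TILES_Y);
    if (tc->valid[ty][tx])
        return false;
    /* A real driver would read the tile back from video memory here. */
    tc->fetches++;
    tc->valid[ty][tx] = true;
    return true;
}
```

The point of the map is that a second software access to an untouched tile
costs nothing, and a hardware write only invalidates the tiles it overlaps.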

> > I want to make sure the following things are very clear:
> > 1) Exa can coexist with XAA. You can keep code for both in your driver.
> Are both used together (eg XAA for lines, exa for other stuff. I'm
> assuming not

No, the aa stuff needs to replace the screen hooks so only one of them can do 
that. An interesting experiment could be trying to use XAA but with the 
memory manager from Exa. Keyword would be "could" as I don't foresee it 
making too much of a difference :)
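
To illustrate why the two can't both be active on one screen, here is a toy
model of a layer taking over a screen hook. Everything in it is hypothetical
(the Screen struct, fill_rect, accel_install and the stub fill functions are
invented for the example and are not the X server's real structures); it only
shows the "first one to claim the hooks wins" idea.

```c
#include <stddef.h>

typedef struct Screen Screen;
typedef void (*FillFn)(Screen *s, int x, int y, int w, int h);

struct Screen {
    FillFn fill_rect;        /* hypothetical screen hook */
    const char *accel_owner; /* layer that owns the hooks, NULL if none */
};

/* Stub fill routines standing in for the two acceleration layers. */
static void exa_fill(Screen *s, int x, int y, int w, int h)
{
    (void)s; (void)x; (void)y; (void)w; (void)h;
}

static void xaa_fill(Screen *s, int x, int y, int w, int h)
{
    (void)s; (void)x; (void)y; (void)w; (void)h;
}

/* Install an acceleration layer on the screen. Returns 1 on success and
 * 0 if another layer has already replaced the hooks. */
static int accel_install(Screen *s, const char *name, FillFn fill)
{
    if (s->accel_owner != NULL)
        return 0;
    s->fill_rect = fill;
    s->accel_owner = name;
    return 1;
}
```

A driver carrying both code paths would then pick one at screen init time,
for example from a config option, and call accel_install() once.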

> > 3) As everyone can see adding Exa support to a driver which already has
> > XAA support is trivial.
> Does the base exa include an exa->xaa translator for basic operations on
> older cards or is it just not practical ?

I thought about writing a script that would convert the XAA "solid fill" and
"screen to screen copy" hooks to their Exa counterparts, but I decided that
the work is simple enough that it shouldn't be an issue.

> > 4) Following the 7 steps I outlined above will speed up the common
> > desktop usage by quite a bit. Note that you don't have to be a driver
> You've measured this with one atypical card on one bus (AGP) or done
> analysis on hub based systems too ?

I'm using it :) Because of the memory manager in Exa the composite operation
no longer requires fetches all the time, which helps immensely and is
very noticeable.

> > 5) Implementing the download/upload/composite hooks will give us enough
> > power to have very fancy effects that will let us compete with
> > Microsoft/Apple desktops while we work on Xgl.
> If you have DMA and a DRI interface to grab tiles, otherwise it's going
> to hurt badly. It'll be better than now I'm sure but it'll merely "suck
> less", unless there is a good tile cache.

Well, if your "card" doesn't have DRI drivers and doesn't allow DMA transfers
then it's not really a desktop card :) Either way, like I said, that's fine:
those hooks are optional, and the memory manager will make sure that the
pixmaps live in the correct location and that composition is always optimal
for the hardware you have.
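
As a rough picture of what "pixmaps live in the correct location" can mean,
here is a minimal sketch of a score-based placement policy. It is an
assumption made for illustration, not a description of the Exa memory
manager: the Pixmap struct, SCORE_LIMIT and pixmap_note_use are made-up
names, and the migration itself is reduced to a counter.

```c
typedef enum { LOC_SYSTEM, LOC_VIDEO } PixmapLocation;

typedef struct {
    PixmapLocation loc;  /* where the pixmap currently lives */
    int score;           /* > 0 favours video memory, < 0 system memory */
    unsigned migrations; /* how many times the pixmap has been moved */
} Pixmap;

#define SCORE_LIMIT 4    /* hypothetical migration threshold */

/* Record one use of the pixmap and migrate it if the recent history
 * clearly favours the other location. */
static void pixmap_note_use(Pixmap *p, int accelerated)
{
    p->score += accelerated ? 1 : -1;
    if (p->score > SCORE_LIMIT)
        p->score = SCORE_LIMIT;
    if (p->score < -SCORE_LIMIT)
        p->score = -SCORE_LIMIT;

    if (p->loc == LOC_SYSTEM && p->score >= SCORE_LIMIT) {
        /* A real driver would upload the pixmap to video memory here. */
        p->loc = LOC_VIDEO;
        p->migrations++;
    } else if (p->loc == LOC_VIDEO && p->score <= -SCORE_LIMIT) {
        /* A real driver would download the pixmap to system memory here. */
        p->loc = LOC_SYSTEM;
        p->migrations++;
    }
}
```

The gap between the two thresholds is deliberate: a single software fallback
doesn't bounce a pixmap out of video memory, so it doesn't ping-pong across
the bus.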

> > 6) The code as presented in the snapshot will be checked in on Monday
> > morning/Sunday night, depending on the feedback I'm going to get.
> I'm watching with interest. The Voodoo2 has hardware assist for software
> compositing in 2D so this may help make better use of it. (Right now I
> don't boot the 3D engine as that's horrendously complex to use).

Yep. Voodoo also has an incredibly small max texture size, which won't do
wonders for you anyway. On all desktop video hardware released after 2000 I
would recommend using the 3D engine, which is not much harder than using the
2D engine with hostdata blits, plus you get filtering for free and have a
basis for a working XVideo using the 3D engine, which would be nice to have
for XComposite anyway. We were talking about this a little bit at the X Dev.
conference last week. It would be nice to have something in XVideo to let the
driver know that nothing will be composed on top of the video (meaning we're
going fullscreen) so that the driver can switch to the back-end scaler, and
otherwise use tmus.

