Dispatching and scheduling--basic questions

Tue Sep 16 12:01:55 PDT 2008

On Tue, 2008-09-16 at 20:03 +0300, Daniel Stone wrote:
> On Tue, Sep 16, 2008 at 10:10:20AM -0400, Adam Jackson wrote:
> > But from a strict performance standpoint, threading
> > just isn't a win.  Anything the X server's doing that takes material CPU
> > time is simply a bug.
> 
> Hmm.  Even enforcing fairness between clients? If you have a hostile
> client, you've already lost, but we have a lot of crap clients already
> (hello Gecko), so.  It would also presumably drop the mean/mode
> latencies while having pretty much no impact on the others: if you have
> one thread waiting on a GetImage and thus migration back to system
> memory, your other clients can still push their trivial rendering to the
> GPU and go back to sleeping.
> 
> I will admit that this paragraph has had no prior thought, and could
> probably be swiftly proven wrong.  YMMV.

I could believe a fairness argument here, but I'd like to see better
numbers first on how often clients block on the server, and what they're
waiting for when they do.

Project for anyone reading this thread: instrument the scheduler such
that when it punishes a client, it records both the last thing that
client was doing, and the number of clients now in the wait queue.  Dump
to log, run a desktop for a few days, then go do statistics.
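
A minimal sketch of that bookkeeping, written in Python for brevity rather
than the server's C.  Everything named here (PunishEvent, PunishLog, record,
summary) is hypothetical and not part of the X server; it only shows what
would be recorded per punish event and the sort of statistics you could pull
out of the log afterwards.

```python
from collections import Counter
from dataclasses import dataclass
from statistics import mean


@dataclass
class PunishEvent:
    """One scheduler punishment (hypothetical record layout)."""
    client_id: int
    last_request: str  # name of the last request the client issued
    waiting: int       # number of clients in the wait queue at that moment


class PunishLog:
    """Collects punish events and summarises them afterwards."""

    def __init__(self):
        self.events = []

    def record(self, client_id, last_request, waiting):
        # Would be called from wherever the scheduler punishes a client.
        self.events.append(PunishEvent(client_id, last_request, waiting))

    def summary(self):
        # Count of punishments per request type, plus mean wait-queue depth.
        by_request = Counter(e.last_request for e in self.events)
        avg_waiting = mean(e.waiting for e in self.events) if self.events else 0.0
        return by_request, avg_waiting
```

Calling record() at each punishment and summary() after a few days of use
would give a first answer to "what were they doing, and who else was waiting".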

> > [*] Except embedded stuff, but how often is that both multicore _and_
> > gpu-less.
> 
> Not really.  We're getting to the point of seeing multicore in consumer
> products, but the GPUs there are still too power-hungry to want to base
> a Render implementation on.  Of course, we're still pretty much in the
> first iteration of the current generation of those GPUs, so hopefully
> they can push the power envelope quite aggressively lower, but for a
> couple of years at least, we'll have multicore + effectively GPU-less,
> in platforms where latency is absolutely unacceptable.

ARM, you're so weird.

Well, okay, there are at least two tactics you could use here.  We could
go to aggressive threading like in MTX, but that's not a small
project and I think the ping-pong latency from bouncing the locks around
will offset any speed win from parallelising rendering.  You can
mitigate some of that by trying to keep clients pinned to threads and
hope the kernel pins threads to cores, but atoms and root window
properties and cliplist manipulation will still knock all your locks
around... so you might improve fairness, but at the cost of best-case
latency.

Or, we keep some long-lived rendering threads in pixman, and chunk
rendering up at the last instant.  I still contend that software
rendering is the only part of the server's life that should legitimately
take significant time.  If we're going to thread to solve that problem,
then keep the complexity there, not up in dispatch.
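
To make the shape of that concrete, here is a rough sketch in Python, not
pixman code.  The pool, over(), and composite() are all made-up names, the
image is a plain list of rows holding one 8-bit channel, and Python threads
won't actually run this in parallel; it only illustrates a long-lived pool
with the work split into row bands at the last moment, so the caller stays
single-threaded.

```python
from concurrent.futures import ThreadPoolExecutor

# Long-lived pool: created once, reused for every composite call.
_POOL = ThreadPoolExecutor(max_workers=4)


def over(src, dst, alpha):
    """Blend one 8-bit channel value: src over dst with a constant 8-bit alpha."""
    return (src * alpha + dst * (255 - alpha) + 127) // 255


def _blend_rows(src, dst, alpha, y0, y1):
    # Each worker touches only rows y0..y1-1, so no locking is needed.
    for y in range(y0, y1):
        s_row, d_row = src[y], dst[y]
        for x in range(len(d_row)):
            d_row[x] = over(s_row[x], d_row[x], alpha)


def composite(src, dst, alpha, bands=4):
    """Blend src over dst in place, split into horizontal bands."""
    height = len(dst)
    step = max(1, -(-height // bands))  # ceiling division
    futures = [
        _POOL.submit(_blend_rows, src, dst, alpha, y0, min(y0 + step, height))
        for y0 in range(0, height, step)
    ]
    for f in futures:
        f.result()  # wait for every band; re-raises any worker exception
```

The caller just calls composite() and blocks until it returns, which is the
point: dispatch never has to know the threads exist.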

Still, I'm kind of dismayed the GPU needs that much power.  All we need
is one texture unit.  I have to imagine the penalty for doing it in
software outweighs the additional idle current from a braindead alpha
blender...

- ajax