[PATCH] sna: avoid negative timeouts

Thu Oct 3 01:17:12 PDT 2013

On Thu, Oct 03, 2013 at 03:06:03AM -0500, Felipe Contreras wrote:
> On Thu, Oct 3, 2013 at 2:36 AM, Chris Wilson <chris at chris-wilson.co.uk> wrote:
> > On Thu, Oct 03, 2013 at 01:29:29AM -0500, Felipe Contreras wrote:
> >> It's nice to avoid X server crashes (by not passing negative values to
> >> select(3)).
> >>
> >> For more information:
> >> http://article.gmane.org/gmane.comp.freedesktop.xorg.devel/37388
> >>
> >> Signed-off-by: Felipe Contreras <felipe.contreras at gmail.com>
> >
> > Thanks for the patch, I pushed a slightly different version to consume
> > the expired flush timeout immediately.
> >
> > However, in order for this to happen something within the BlockHandler
> > must have run for at least 4ms. Which is itself very worrying - the two
> > candidates are the throttle or lock contention. (The throttle is also
> > prone to being influenced by a third party.) Such stalls are likely to
> > be noticeable as jitter or judder, so if you can spot their source
> > hopefully we can tackle the root cause.
> 
> What exactly do you mean by third party? I wouldn't notice if there
> was jitter because it's a loading screen, and there's absolutely no
> updates on the screen; it's static.

Other users of /dev/dri/card0. The most likely cause for the delay here
is waiting for the GPU to complete an operation or for waiting for
access to the GPU. 

> Do you mean you would like me to add debugging in
> sna_accel_block_handler() to figure exactly which block is taking too
> long to complete?

Right, if you do something like:

diff --git a/src/sna/sna_accel.c b/src/sna/sna_accel.c
index 21cd342..31ea1a3 100644
@@ -15931,6 +15931,8 @@ void sna_accel_close(struct sna *sna)
 
 void sna_accel_block_handler(struct sna *sna, struct timeval **tv)
 {
+	uint32_t duration[NUM_TIMERS] = { 0 };
+
 	sigtrap_assert();
 
 	if (sna->kgem.need_retire)
@@ -15947,19 +15949,28 @@ void sna_accel_block_handler(struct sna *sna, struct timeval **tv)
 	}
 
 restart:
-	if (sna_accel_do_flush(sna))
+	if (sna_accel_do_flush(sna)) {
+		duration[FLUSH_TIMER] = TIME;
 		sna_accel_flush(sna);
+		duration[FLUSH_TIMER] = TIME - duration[FLUSH_TIMER];
+	}
 	assert(sna_accel_scanout(sna) == NULL ||
 	       sna_accel_scanout(sna)->gpu_bo->exec == NULL ||
 	       sna->timer_active & (1<<(FLUSH_TIMER)));
 
-	if (sna_accel_do_throttle(sna))
+	if (sna_accel_do_throttle(sna)) {
+		duration[THROTTLE_TIMER] = TIME;
 		sna_accel_throttle(sna);
+		duration[THROTTLE_TIMER] = TIME - duration[THROTTLE_TIMER];
+	}
 	assert(!sna->kgem.need_retire ||
 	       sna->timer_active & (1<<(THROTTLE_TIMER)));
 
-	if (sna_accel_do_expire(sna))
+	if (sna_accel_do_expire(sna)) {
+		duration[EXPIRE_TIMER] = TIME;
 		sna_accel_expire(sna);
+		duration[EXPIRE_TIMER] = TIME - duration[EXPIRE_TIMER];
+	}
 	assert(!sna->kgem.need_expire ||
 	       sna->timer_active & (1<<(EXPIRE_TIMER)));
 
@@ -15981,8 +15992,14 @@ restart:
 		timeout = sna->timer_expire[FLUSH_TIMER] - TIME;
 		DBG(("%s: flush timer expires in %d [%d]\n",
 		     __FUNCTION__, timeout, sna->timer_expire[FLUSH_TIMER]));
-		if (timeout < 3)
+		if (timeout < 3) {
+			ErrorF("restarting timeout[%d] (%d, %d, %d)\n",
+			       timeout,
+			       duration[FLUSH_TIMER],
+			       duration[THROTTLE_TIMER],
+			       duration[EXPIRE_TIMER]);
 			goto restart;
+		}
 
 		if (*tv == NULL) {
 			*tv = &sna->timer_tv;

That will hopefully catch which path is consuming too much time
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre