Hi folks,<div><br></div><div>I think I've found a race condition with Timers in xorg-server. I found this in the context of using Timers within an Xinput module. As you may know, Xinput modules handle input on a signal handler which interrupts the main thread of execution.</div>
<div><br></div><div>Looking at DoTimer in xorg-server's os/WaitFor.c file with some annotations added:</div><div><br></div><div><div><font class="Apple-style-span" face="'courier new', monospace">static void</font></div>
<div><font class="Apple-style-span" face="'courier new', monospace">DoTimer(OsTimerPtr timer, CARD32 now, OsTimerPtr *prev)</font></div><div><font class="Apple-style-span" face="'courier new', monospace">{</font></div>
<div><font class="Apple-style-span" face="'courier new', monospace"> CARD32 newTime;</font></div><div><font class="Apple-style-span" face="'courier new', monospace"><br></font></div><div><font class="Apple-style-span" face="'courier new', monospace"><meta charset="utf-8"> // A</font></div>
<div><font class="Apple-style-span" face="'courier new', monospace"> *prev = timer->next; // remote this timer from the list</font></div><div><font class="Apple-style-span" face="'courier new', monospace"><meta charset="utf-8"> // B</font></div>
<div><font class="Apple-style-span" face="'courier new', monospace"> timer->next = NULL; // clean up </font><span class="Apple-style-span" style="font-family: 'courier new', monospace; ">this timer's </span><span class="Apple-style-span" style="font-family: 'courier new', monospace; ">next pointer</span></div>
<meta charset="utf-8"><div><font class="Apple-style-span" face="'courier new', monospace"><meta charset="utf-8"> // C</font></div><div><font class="Apple-style-span" face="'courier new', monospace"> newTime = (*timer->callback)(timer, now, timer->arg);</font></div>
<div><font class="Apple-style-span" face="'courier new', monospace"> // D</font></div><div><font class="Apple-style-span" face="'courier new', monospace"> if (newTime)</font></div><div><font class="Apple-style-span" face="'courier new', monospace"><span class="Apple-tab-span" style="white-space:pre">        </span>TimerSet(timer, 0, newTime, timer->callback, timer->arg);</font></div>
<div><font class="Apple-style-span" face="'courier new', monospace">}</font></div></div><div><br></div><div>Timer callbacks occur on the main thread, but an IO handler can interrupt at any time and schedule or cancel timers. Many timer handlers that worry about this will block SIGIO in their callback, so for now let's assume that no SIGIO will occur between C and D above. Think about the following sequence of events:</div>
<div><br></div><div><font class="Apple-style-span" face="'courier new', monospace">Legend:</font></div><div><font class="Apple-style-span" face="'courier new', monospace"><br></font></div><div><font class="Apple-style-span" face="'courier new', monospace">Main Thread (not indented)</font></div>
<div><font class="Apple-style-span" face="'courier new', monospace"> SIGIO handler (indented)</font></div><div><font class="Apple-style-span" face="'courier new', monospace"><br></font></div><div><font class="Apple-style-span" face="'courier new', monospace">Sequence of events:</font></div>
<div><font class="Apple-style-span" face="'courier new', monospace"><br></font></div><div><font class="Apple-style-span" face="'courier new', monospace">There is 1 live timer (timer #1), which is firing.</font></div>
<div><font class="Apple-style-span" face="'courier new', monospace">DoTimer() called.</font></div><div><font class="Apple-style-span" face="'courier new', monospace">Timer removed from list.</font></div><div>
<font class="Apple-style-span" face="'courier new', monospace">We are at point "B" above</font></div><div><font class="Apple-style-span" face="'courier new', monospace"> SIGIO occurs</font></div>
<div><font class="Apple-style-span" face="'courier new', monospace"> Timer #1 is cancelled (noop)</font></div><div><font class="Apple-style-span" face="'courier new', monospace"> Timer #1 is rescheduled - inserted into timers list</font></div>
<div><font class="Apple-style-span" face="'courier new', monospace"> SIGIO is complete</font></div><div><font class="Apple-style-span" face="'courier new', monospace"><br></font></div><div><font class="Apple-style-span" face="'courier new', monospace"> Another module's SIGIO fires</font></div>
<div><font class="Apple-style-span" face="'courier new', monospace"> Timer #2 is scheduled and goes at end of timers list</font></div><div><font class="Apple-style-span" face="'courier new', monospace"> SIGIO is complete</font></div>
<div><font class="Apple-style-span" face="'courier new', monospace"><br></font></div>(At this point, we have 2 timers in the timers list, the first of which is inside DoTimer, and we resume the main thread)<div><font class="Apple-style-span" face="'courier new', monospace"><br>
</font></div><div><font class="Apple-style-span" face="'courier new', monospace">Timer #1's next ptr set to NULL</font></div><div><br></div>(At this point, Timer #2 is lost from the timers list!)<div><br>I'm new to the xorg-server code base, but my naive inclination is that when reading and writing the timers linked list, SIGIO (and other signals?) should be blocked. In the example above, if we blocked SIGIO at point A and unblocked it at point C, this particular race condition would be solved, it seems. However, I suspect that there are other problematic cases in WaitFor.c.</div>
<div><br></div><div>Also, to be fair, I haven't seen this problem occur in a real system; I only noticed it by looking at the code.</div><div><br></div><div>Thanks,</div><div>-andrew</div>