sensing X connection failure/timeout

Xavier Toth txtoth at gmail.com
Tue Sep 29 11:08:50 PDT 2009


I'm displaying clients on a remote X server, and if I power off the X
server box or pull the network cable, some of the clients never
terminate. Some clients (fbpanel and gnome-screensaver in particular)
do terminate and write "Fatal IO error 110 (Connection timeout) on X
server  <ip address>" to stderr, but openbox, the window manager we
are using, never realizes that the server is gone. Looking at the
openbox main loop code:

void ob_main_loop_run(ObMainLoop *loop)
{
    XEvent e;
    struct timeval *wait;
    fd_set selset;
    GSList *it;
    int rc;

    loop->run = TRUE;
    loop->running = TRUE;

    while (loop->run) {
        if (loop->signal_fired) {
            guint i;
            sigset_t oldset;

            /* block signals so that we can do this without the data changing
               on us */
            sigprocmask(SIG_SETMASK, &all_signals_set, &oldset);

            for (i = 0; i < NUM_SIGNALS; ++i) {
                while (loop->signals_fired[i]) {
                    for (it = loop->signal_handlers[i];
                            it; it = g_slist_next(it)) {
                        ObMainLoopSignalHandlerType *h = it->data;
                        h->func(i, h->data);
                    }
                    loop->signals_fired[i]--;
                }
            }
            loop->signal_fired = FALSE;

            sigprocmask(SIG_SETMASK, &oldset, NULL);
        } else if (XPending(loop->display)) {
            do {
                XNextEvent(loop->display, &e);

                for (it = loop->x_handlers; it; it = g_slist_next(it)) {
                    ObMainLoopXHandlerType *h = it->data;
                    h->func(&e, h->data);
                }
            } while (XPending(loop->display) && loop->run);
        } else {
            /* this only runs if there were no x events received */

            timer_dispatch(loop, (GTimeVal**)&wait);

            selset = loop->fd_set;
            /* there is a small race condition here. if a signal occurs
               between this if() and the select() then we will not process
               the signal until 'wait' expires. possible solutions include
               using GStaticMutex, and having the signal handler set 'wait'
               to 0 */
            if (!loop->signal_fired)
                rc = select(loop->fd_max + 1, &selset, NULL, NULL, wait);

            /* handle the X events with highest priority */
            if (FD_ISSET(loop->fd_x, &selset))
                continue;

            g_hash_table_foreach(loop->fd_handlers,
                                 fd_handle_foreach, &selset);
        }
    }

    loop->running = FALSE;
}

I see that if there are no X events it does a select() on an fd_set
initialized as follows:

    loop->fd_x = ConnectionNumber(display);
    FD_ZERO(&loop->fd_set);
    FD_SET(loop->fd_x, &loop->fd_set);
    loop->fd_max = loop->fd_x;

with a wait of NULL (infinite), so when the server dies silently the
select() never returns.

What is the proper way to determine that a remote server connection
has failed or timed out?

Ted


