[PATCH xwayland] xwayland-shm: fortify fallocate against EINTR

Mon Apr 25 10:20:35 UTC 2016

On Mon, 25 Apr 2016 11:33:00 +0200
Marek Chalupa <mchqwerty at gmail.com> wrote:

> If posix_fallocate or ftruncate is interrupted by signal
> while working, we return -1 as fd and the allocation process
> returns BadAlloc error. That causes xwayland clients to abort
> with 'BadAlloc (insufficient resources for operation)'
> even when there's a lot of resources available.
> 
> Fix it by trying again when we get EINTR.
> 
> Signed-off-by: Marek Chalupa <mchqwerty at gmail.com>
> ---
>  hw/xwayland/xwayland-shm.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/xwayland/xwayland-shm.c b/hw/xwayland/xwayland-shm.c
> index e8545b3..c199e5e 100644
> --- a/hw/xwayland/xwayland-shm.c
> +++ b/hw/xwayland/xwayland-shm.c
> @@ -140,14 +140,20 @@ os_create_anonymous_file(off_t size)
>          return -1;
>  
>  #ifdef HAVE_POSIX_FALLOCATE
> -    ret = posix_fallocate(fd, 0, size);
> +    do {
> +        ret = posix_fallocate(fd, 0, size);
> +    } while (ret == EINTR);
> +
>      if (ret != 0) {
>          close(fd);
>          errno = ret;
>          return -1;
>      }
>  #else
> -    ret = ftruncate(fd, size);
> +    do {
> +        ret = ftruncate(fd, size);
> +    } while (ret == -1 && errno == EINTR);
> +
>      if (ret < 0) {
>          close(fd);
>          return -1;

Hi Marek,

Curious, how did you hit this case? And is the signal that interrupts
these calls usually the smart scheduler's SIGALRM?

I am asking because I have someone suffering from the EINTR issue, but
a simple restart like the one you implemented here results in an endless
loop. Every time, a new signal arrives before fallocate completes. It is
as if fallocate is not making any progress.

What is more curious is that the file is supposedly on a tmpfs, yet in
our case the 5 ms between signals is not enough to fallocate a full-HD
frame (8 MB). I am told it is a "Low powered NXP arm platform"; I do not
have access to it myself.

It may be the platform's fault that fallocate takes such a long time.
Another question is whether fallocate should make gradual progress or
not; if it does not, a simple restart will not work against a regular
timer signal. That makes me wonder whether, in case of EINTR, we should
fall back to fallocating a series of small chunks instead. But that
could also be nonsense and something else is broken, I just don't know.
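
To make the chunk idea concrete, here is a minimal sketch, not a tested
fix. The helper name fallocate_in_chunks and the 256 KiB chunk size are
assumptions made up for illustration; nothing like this exists in
xwayland-shm.c. Each chunk is restarted on EINTR the same way the patch
does it, on the theory that a small request has a better chance of
finishing between two timer signals.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>

/* Hypothetical chunk size, picked arbitrarily for this sketch. */
#define ALLOC_CHUNK_SIZE ((off_t) (256 * 1024))

/*
 * Allocate 'size' bytes for fd in pieces of at most ALLOC_CHUNK_SIZE.
 * Returns 0 on success or an errno value, like posix_fallocate().
 */
int
fallocate_in_chunks(int fd, off_t size)
{
    off_t done = 0;

    while (done < size) {
        off_t len = size - done;
        int ret;

        if (len > ALLOC_CHUNK_SIZE)
            len = ALLOC_CHUNK_SIZE;

        /* Restart only the current chunk when interrupted. */
        do {
            ret = posix_fallocate(fd, done, len);
        } while (ret == EINTR);

        if (ret != 0)
            return ret;

        /* Chunks already allocated are kept; move to the next one. */
        done += len;
    }

    return 0;
}
```

Whether this helps at all depends on the kernel being able to finish one
small chunk within a timer period, which is exactly the part I am unsure
about.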

Any ideas, anyone?

Should we be accounting for the possibility of an endless loop, or will
that never happen on a good platform?
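
If we do want to guard against the endless loop, one option is to cap
the number of restarts. The following is only a sketch under assumptions:
the helper name fallocate_bounded_retry and the limit of 100 are invented
for illustration, and I have no data on what a sensible limit would be.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>

/* Hypothetical retry limit, chosen arbitrarily for this sketch. */
#define MAX_EINTR_RETRIES 100

/*
 * Like the retry loop in the patch, but give up after a fixed number
 * of attempts. Returns 0 on success or an errno value; EINTR is
 * returned if every attempt was interrupted.
 */
int
fallocate_bounded_retry(int fd, off_t size)
{
    int tries = 0;
    int ret;

    do {
        ret = posix_fallocate(fd, 0, size);
    } while (ret == EINTR && ++tries < MAX_EINTR_RETRIES);

    return ret;
}
```

With a cap in place the caller would still fail with BadAlloc on a
platform that never completes the call, but at least the server would
not spin forever.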

Thanks,
pq