[Mesa-dev] [PATCH 1/1] i965: Make sure the shadow buffers have enough space
James Xiong
james.xiong at intel.com
Fri Apr 13 21:08:40 UTC 2018
On Fri, 13 Apr 2018 13:51:02 -0700
Kenneth Graunke <kenneth at whitecape.org> wrote:
> On Friday, April 13, 2018 1:35:45 PM PDT Kenneth Graunke wrote:
> > On Monday, April 9, 2018 4:06:16 PM PDT James Xiong wrote:
> > > From: "Xiong, James" <james.xiong at intel.com>
> > >
> > > On non-LLC platforms, we malloc shadow batch/state buffers
> > > of the same sizes as our batch/state buffers' GEM allocations.
> > > However, the buffer allocator reuses similar-sized GEM objects,
> > > so in some cases it returns a buffer larger than we asked for,
> > > and we end up with shadow buffers smaller than the GEM
> > > allocations. If we utilize the full size of the over-allocated
> > > batch/state buffers, we may wind up accessing beyond the bounds
> > > of the shadow buffers, causing a segmentation fault and/or
> > > memory corruption.
> >
> > Oh, good catch! We do indeed malloc too little if the bufmgr
> > rounds the allocation up to a larger bucket size.
Thank you for taking time to review, Kenneth.
> >
> > > A few examples:
> > >                        batch                    state
> > > case              request   bo  shadow    request   bo  shadow
> > > init          0      20K   20K    20K        16K   16K    16K
> > > grow_buffer   1      30K   32K    30K        24K   24K    24K
> > > grow_buffer   2      48K   48K    48K        36K   40K    36K
> > > grow_buffer   3      72K   80K    72K        60K   64K    60K
> > > grow_buffer   4     120K  128K   120K          -     -      -
> > >
> > > batch #1, #3, #4 and state #2, #3 are problematic. We can change
> > > the order: allocate the bo first, then allocate the shadow
> > > buffer using the bo's size, so that the shadow buffer has at
> > > least the size of the GEM allocation.
> > >
> > > Another problem: even though the state/batch buffer can grow,
> > > when checking whether it has run out of space, we always compare
> > > against the initial batch/state sizes. To utilize the entire
> > > buffers, compare against the actual sizes instead.
> >
> > This is actually intentional. Our goal is to flush batches when the
> > amount of commands or state reaches those thresholds. Occasionally,
> > we'll be in the middle of emitting a draw, and unable to stop. In
> > that case, we grow the batch and keep going. But after that, we're
> > beyond our original target, so we flush next time. We don't want
> > to grow without bounds...it's meant more for emergencies, or if
> > we've badly estimated the size of the draw call.
I am not sure I get it. Let me give an example: the state buffer
grows once from 16K to 24K in brw_state_batch(), and used_size
becomes 20K. Then brw_require_statebuffer_space(1024) is called to
ask for 1K of space. With the original logic, it compares the used
size against 16K and flushes the batch, even though the state buffer
still has 4K of space available?
> >
> > I've sent a simpler patch which I think should hopefully fix your
> > bug: https://patchwork.freedesktop.org/patch/217107/
>
> Lionel noticed that I botched that patch. Here's an actual one:
>
> https://patchwork.freedesktop.org/patch/217108/
Yes, it will fix the existing bug. However, the assumption there is
that the initial allocation size will NOT be rounded up, because it
happens to match a bucket size.
I am working on an optimization to improve memory usage (that's how
I found this bug), and under it that assumption no longer holds:
the bufmgr may return a buffer of the same or a larger size, whether
or not the request matches a bucket size. Anyway, I guess I can send
the fix later along with the optimization patches.