[Mesa-dev] [PATCH dri3proto v2] Add modifier/multi-plane requests, bump to v1.1

Fri Jul 28 07:44:42 UTC 2017

Hi Nicolai,
Trying to tackle the stride subthread in one go ...

On 25 July 2017 at 09:28, Nicolai Hähnle <nhaehnle at gmail.com> wrote:
> On 22.07.2017 14:00, Daniel Stone wrote:
>> On 21 July 2017 at 18:32, Michel Dänzer <michel at daenzer.net> wrote:
>>> We just ran into an issue which might mean that there's still something
>>> missing in this v2 proposal:
>>>
>>> The context is DRI3 PRIME render offloading of glxgears (not useful in
>>> practice, but bear with me). The display GPU is Raven Ridge, which
>>> requires
>>> that the stride even of linear textures is a multiple of 256 bytes. The
>>> renderer GPU is Polaris12, which still supports smaller alignment of the
>>> stride. With the glxgears window of width 300, the renderer GPU driver
>>> chooses a stride of 304 (* 4 / 256 = 4.75), whereas the display GPU would
>>> require 320 (* 4 / 256 = 5). This cannot work.
>>
>>
>> The obvious answer is just to increase padding on external/winsys
>> surfaces? Increasing it for all allocations would probably be a
>> non-starter, but winsys surfaces are rare enough that you could
>> probably afford to take the hit, I guess.
>>
>>> I see two basic approaches to solve this:
>>> [...]
>>> Maybe there are other possible approaches I'm missing? Other comments?
>>
>> I don't have any great solution off the top of my head, but I'd be
>> inclined to bundle stride in with placement. TTBOMK (from having
>> looked at radv), buffers shared cross-GPU also need to be allocated
>> from a separate externally-visible memory heap. And at the moment,
>> lacking placement information at allocation time (at least for EGL
>> allocations, via DRIImage), that happens via transparent migration at
>> import time I think. Placement restrictions would probably also
>> involve communicating base address alignment requirements.
>
> Stride isn't really in the same category as placement and base address
> alignment, though.
>
> Placement and base address alignment requirements can apply to all possible
> texture layouts, while the concept of stride is specific to linear layouts.

No, I don't think it is. Tiled layouts still have a stride: if you
look at i915 X/Y/Yf/Y_CCS/Yf_CCS (the latter two containing an
auxiliary compression/fast-clear buffer), iMX/etnaviv
tiled/supertiled, or VC4 T-tiled modifiers and how they're handled
both for DRIImage and KMS interchange, they all specify a stride which
is conceptually the same as linear, if you imagine linear to be 1x1
tiled.

Most tiling users accept any integer units of tiles (buffer width
aligned to tile width), but not always. The NV12MT format used by
Samsung media decoders (also shipped in some Qualcomm SoCs) is
particularly special, IIRC requiring height to be aligned to a
multiple of two tiles.

>> Given that, I'm fairly inclined to punt those until we have the grand
>> glorious allocator, rather than trying to add it to EGL/GBM
>> separately. The modifiers stuff was a fairly obvious augmentation -
>> EGL already had no-modifier format import but no query as to which
>> formats it would accept, and modifiers are a logical extension of
>> format - but adding the other restrictions is a bigger step forward.
>
> Perhaps a third option would be to encode the required pitch_in_bytes
> alignment as part of the modifier?
>
> So DRI3GetSupportedModifiers would return DRM_FORMAT_MOD_LINEAR_256B when
> the display GPU is a Raven Ridge.
>
> More generally, we could say that fourcc_mod_code(NONE, k) means that the
> pitch_in_bytes has to be a multiple of 2**k for e.g. k <= 31. Or if you
> prefer, we could have a stride requirement in elements or pixels instead of
> bytes.

I've been thinking this over for the past couple of days, and we could
make it work in clients. But we already have DRM_FORMAT_MOD_LINEAR
with a well-defined meaning, implemented in quite a few places. I
think special-casing linear to encode stride alignment is something
we'd regret in the long run, especially given that some hardware
_does_ have more strict alignment requirements for tiled modes than
just an integer number of tiles allocated.

AFAICT, both AMD and NVIDIA are both going to use a fair bit of the
tiling enum space to encode tile size as well as layout. If allocation
alignment requirements (in both dimensions) needed to be added to
that, it's entirely likely that there wouldn't be enough space and
you'd need to put it somewhere else than the modifier, in which case
we've not even really solved the problem.

It definitely seems attractive to kill two birds with one stone, but
I'd really much rather not conflate format description/advertisement,
and allocation restriction, into one enum. I'm still on the side of
saying that this is a problem modifiers do not solve, deferring to the
allocator we need anyway in order to determine things like placement
and global optimality (e.g. rotated scanout placing further
restrictions on allocation).

Cheers,
Daniel