[Intel-gfx] [PATCH] drm/i915/pxp: limit drm-errors or warnings on firmware API failures
Teres Alexis, Alan Previn
alan.previn.teres.alexis at intel.com
Thu Feb 2 17:11:06 UTC 2023
On Thu, 2023-02-02 at 08:43 +0000, Tvrtko Ursulin wrote:
>
> On 02/02/2023 08:13, Alan Previn wrote:
> > MESA driver is creating protected context on every driver handle
> > initialization to query caps bit for app. So when running CI tests,
> > they are observing hundreds of drm_errors when enabling PXP
> > in .config but using SOC or BIOS configuration that cannot support
> > PXP sessions.
> >
> > Update error handling codes to be more selective on which errors
> > are reported as drm_error vs drm_WARN_ONCE vs drm_debug.
> > Don't completely remove all FW error replies (at least keep them
> > but use drm_debug) or else customers that really need to know that
> > content protection failed won't be aware of it when debugging.
> >
> > Signed-off-by: Alan Previn <alan.previn.teres.alexis at intel.com>
>
> How does this relate to b762787bf767 ("drm/i915/pxp: Use drm_dbg if arb
> session failed due to fw version") which I thought was already fixing
> the drm_error spam caused by userspace probing?
>
Good question. That previous error was specific to a board that was using
an outdated firmware version that really needed to be upgraded.
At that point I wasn't aware of the fact that MESA was seeing
a high frequency of this failure tied to platform issues
(BIOS configuration / SOC fusing). Also, I believe that in the prior case
PXP was not enabled by default in the .config in all testing.
In this latest reported bug (I realized I forgot to include the bug no. for this
new patch - https://gitlab.freedesktop.org/drm/intel/-/issues/7706#note_1746952),
I was informed that PXP is being enabled by default and that there
was DUT hardware that was not PXP-capable (SOC fusing / BIOS config).
So with this patch, I am trying to balance between issues that are critical
but are root-caused by HW/platform gaps (louder drm-warn - but just ONCE)
vs other cases where the failure could also come from the hw/sw state machine
(which cannot be a WARN_ONCE message since it can occur due to runtime
operation events).
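To illustrate the split described above, here is a minimal userspace sketch, not the code from the patch. The enum names, the function name and the choice of a debug-level message for runtime failures are all assumptions made for illustration; the real driver would use the drm_WARN_ONCE / drm_dbg helpers rather than fprintf.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical failure classes; not the driver's actual categories. */
enum sketch_fail_class {
	SKETCH_FAIL_PLATFORM,	/* SOC fusing / BIOS config: will not change at runtime */
	SKETCH_FAIL_RUNTIME,	/* hw/sw state machine: can recur during normal operation */
};

/* Hypothetical log levels, returned so the caller can see what was emitted. */
enum sketch_log_level { SKETCH_LOG_NONE, SKETCH_LOG_DEBUG, SKETCH_LOG_WARN };

/*
 * Report a failure and return the level it was logged at.
 * Platform gaps warn loudly, but only the first time.
 * Runtime failures are logged at debug level every time (an assumption here).
 */
static enum sketch_log_level sketch_report_failure(enum sketch_fail_class cls)
{
	static bool warned_once;

	if (cls == SKETCH_FAIL_PLATFORM) {
		if (warned_once)
			return SKETCH_LOG_NONE;
		warned_once = true;
		fprintf(stderr, "WARN: PXP unsupported by platform configuration\n");
		return SKETCH_LOG_WARN;
	}

	fprintf(stderr, "debug: PXP firmware request failed at runtime\n");
	return SKETCH_LOG_DEBUG;
}
```

With this shape, userspace that probes repeatedly on a non-PXP-capable platform produces one warning in total, while runtime failures stay visible to anyone who turns debug output on.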
One thing to note: I am pushing for / waiting on our firmware team to give
their blessing on more fw-error-code to error-string translations that can be
allowed upstream, which is why I added "pxp_fw_err_to_string" and a single
"drm_dbg". That way, in future, we don't have to keep adding whole new lines of
code to multiple functions for each new error code translation, and can instead
just add the new err-code-to-string entry in a single location.
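A rough userspace sketch of that single-location idea follows. It is not the body of the function in the patch: the signature, the table layout, the error code values and the strings are all hypothetical placeholders, and snprintf stands in for the single drm_dbg call.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* One table entry per translated firmware error code. */
struct sketch_fw_err_entry {
	uint32_t code;
	const char *msg;
};

/* Hypothetical codes and strings; real values would come from the firmware team. */
static const struct sketch_fw_err_entry sketch_fw_errs[] = {
	{ 0x1001, "example: platform not PXP-capable" },
	{ 0x1002, "example: firmware version too old" },
};

/* Look up the string for a code; returns NULL when no translation exists. */
static const char *sketch_fw_err_to_string(uint32_t code)
{
	size_t i;

	for (i = 0; i < sizeof(sketch_fw_errs) / sizeof(sketch_fw_errs[0]); i++) {
		if (sketch_fw_errs[i].code == code)
			return sketch_fw_errs[i].msg;
	}
	return NULL;
}

/* Single formatting site shared by all callers (stand-in for one drm_dbg). */
static int sketch_format_fw_err(char *buf, size_t len, uint32_t code)
{
	const char *msg = sketch_fw_err_to_string(code);

	if (msg)
		return snprintf(buf, len, "PXP firmware failed with 0x%08x: %s",
				(unsigned int)code, msg);
	return snprintf(buf, len, "PXP firmware failed with 0x%08x",
			(unsigned int)code);
}
```

Adding a newly approved translation then means adding one line to the table, with no change to any of the callers.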
Note: I will re-rev with the bug id.