Intel driver - DRI profiling

Thu Mar 13 17:01:56 PDT 2008

2008/3/13 Lukas Hejtmanek <xhejtman at ics.muni.cz>:
> Hello,
>
>  I did some profiling of DRI and GLX with Intel driver (i965GM). I noticed that
>  huge amount of time is spent in drm_bo_vm_nopfn function (25%). Is this
>  expected? Why is this function called so often?
>
I'll try to explain it as well as I understand it myself.
The drm_bo_vm_nopfn (from here on called nopfn) is the function in the
kernel which handles page faults for clients accessing a buffer that
has had its pages removed from the client's VM [1,2,3]. Playing around
with buffer pages and changing caching (which is sometimes required)
is a rather costly operation. If I remember things correctly, not all
pages are mapped on create and the first drmBoMap call, and since
nopfn does not map all the pages at once, accessing the whole of a
large buffer results in several calls to it.
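
To put rough numbers on that, here is a toy calculation in Python. It
is only a sketch, not the real nopfn code: the 4096-byte page size and
the "one page mapped per fault" default are assumptions made for
illustration, and faults_to_touch is a made-up helper name.

```python
# Toy model only: this is NOT the real drm_bo_vm_nopfn code.
PAGE_SIZE = 4096  # assumed page size in bytes (typical on x86)


def faults_to_touch(buffer_bytes, pages_per_fault=1, page_size=PAGE_SIZE):
    """Fault-handler calls needed to touch every page of a buffer whose
    pages are all unmapped, if each call maps `pages_per_fault` pages."""
    pages = -(-buffer_bytes // page_size)   # ceiling division
    return -(-pages // pages_per_fault)     # ceiling division


if __name__ == "__main__":
    for size_kib in (4, 64, 1024, 4096):
        n = faults_to_touch(size_kib * 1024)
        print(f"{size_kib:5d} KiB buffer -> {n} fault(s)")
```

Under these assumptions a 4 MiB buffer needs 1024 trips through the
fault handler before all of it is accessible, which is the kind of
pattern that makes a fault handler show up high in a profile.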

>  Another question is, why glxgears do not eat 100% CPU on Intel? On nvidia
>  (binary drivers) they do.
>
CPU vs GPU imbalance: your CPU is feeding the GPU faster than it can
process the commands, and therefore the CPU sleeps while waiting for
the GPU to complete the rendering. I'm guessing you have only used the
nvidia driver on shiny high-powered GPUs :-). Try resizing the glxgears
window to a really small size (not too small, though) and see the CPU
usage go up.
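
The same argument as a back-of-the-envelope model in Python. This is a
sketch under simple assumptions: the CPU and GPU work on a frame in
lockstep, the CPU sleeps whenever it has to wait, and the millisecond
figures are invented for illustration, not measurements.

```python
def cpu_usage(cpu_ms_per_frame, gpu_ms_per_frame):
    """Fraction of wall-clock time the CPU is busy when it has to wait
    for the GPU to finish each frame (toy model, no pipelining)."""
    # The slower of the two sets how long a frame takes overall.
    frame_ms = max(cpu_ms_per_frame, gpu_ms_per_frame)
    return cpu_ms_per_frame / frame_ms


if __name__ == "__main__":
    # Invented numbers: large window, the GPU is the bottleneck.
    print(f"large window: {cpu_usage(1.0, 4.0):.0%} CPU")
    # Invented numbers: tiny window, the GPU finishes before the CPU.
    print(f"small window: {cpu_usage(1.0, 0.5):.0%} CPU")
```

With those made-up numbers the CPU is busy 25% of the time for the
large window and 100% for the small one.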

>  --
>  Lukáš Hejtmánek

I hope that enlightens you a bit at least.

Cheers Jakob.

[1] When mapping a buffer into a client's VM (or just plain memory) we
reserve a region of space in that client's memory that will be that
buffer. This region is quite clever in that you can pretty much always
access[2] it.

[2] Calling drmBoUnmap on a buffer does not unmap the pages from the
client's VM but instead releases a lock used for syncing.

[3] One of the conditions under which the pages are removed from the
client's VM is when a buffer has been moved from one memory region to
another, for example a texture upload from local memory (cached) to
TTM (uncached). TTM does not remap them but instead is lazy: most
buffers will be mapped once, moved, and then never accessed again;
texture creation is a good example.
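
Footnotes [2] and [3] can be put together in a small Python toy model.
It is a sketch only and not the real TTM data structures: ToyBuffer,
its method names, the "local"/"tt" region labels and the one page per
fault behaviour are all assumptions made for illustration.

```python
class ToyBuffer:
    """Toy stand-in for a buffer object; not the real TTM structure."""

    def __init__(self, num_pages, region="local"):
        self.num_pages = num_pages
        self.region = region
        self.mapped = set()   # page indices present in the client VM
        self.faults = 0       # how many times the fault handler ran
        self.locked = False   # stands in for the sync lock from [2]

    def map(self):
        # Stands in for drmBoMap: take the lock, map no pages up front.
        self.locked = True

    def unmap(self):
        # Stands in for drmBoUnmap [2]: drop the lock, keep the pages.
        self.locked = False

    def move(self, region):
        # As in [3]: a move drops every mapping and does not rebuild it.
        self.region = region
        self.mapped.clear()

    def touch(self, page):
        # CPU access: a missing page costs one fault that maps it.
        if page not in self.mapped:
            self.faults += 1
            self.mapped.add(page)

    def touch_all(self):
        for page in range(self.num_pages):
            self.touch(page)


if __name__ == "__main__":
    buf = ToyBuffer(num_pages=256)
    buf.map()
    buf.touch_all()
    print("after first access:", buf.faults, "faults")
    buf.unmap()
    buf.touch_all()
    print("after unmap + access:", buf.faults, "faults")
    buf.move("tt")
    buf.touch_all()
    print("after move + access:", buf.faults, "faults")
```

In this model the second pass after unmap costs no new faults, while
the pass after the move pays for every page again. A buffer that is
never touched by the CPU after being moved pays nothing, which is the
point of being lazy.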