Most efficient way to upload small coverage masks for radeonsi?

Clemens Eisserer linuxhippy at gmail.com
Sun Nov 6 16:04:54 UTC 2016


Hello,

Java performs antialiased rendering by rasterizing coverage masks (<=
64x64 pixels in size, PictStandardA8) on the client instead of sending
trapezoid lists to the X server. The masks are uploaded using XPutImage
and are then used as the mask for XRenderComposite.
On the plus side, this approach is *way* better than the
XGetImage+XPutImage round trip that was needed before XRender was used.
The downside is the very high per-primitive overhead: SNA copes with
this workload quite well, but glamor does not.

Here are some upload bandwidth results for 64x64 masks +
XRenderComposite (src=vram_pixmap, mask=uploaded_mask, dst=window):

Haswell Laptop:
XPutImage-Glamor: 80MB/s
ShmPixmap-Glamor: 177.5MB/s
XPutImage-SNA: 585MB/s
ShmPixmap-SNA: 4000MB/s

Mullins Netbook:
XPutImage-Glamor: 30MB/s
ShmPixmap-Glamor: 33MB/s
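
For scale, the bandwidth figures above can be converted into masks per
second and time per mask. The following C sketch is only a
back-of-the-envelope calculation. It assumes that every mask is a full
64x64 A8 mask (one byte per pixel, 4096 bytes) and that 1 MB means 10^6
bytes; neither is stated explicitly in the measurements, and the
function names are made up for illustration.

```c
#include <stdio.h>

/* Bytes in one A8 coverage mask: one byte per pixel (assumed full size). */
double mask_bytes(int width, int height)
{
    return (double)width * (double)height;
}

/* Masks per second implied by a bandwidth figure.
 * Assumption: 1 MB = 10^6 bytes. */
double masks_per_second(double mb_per_s, int width, int height)
{
    return mb_per_s * 1e6 / mask_bytes(width, height);
}

/* Average wall-clock time per mask (upload + composite), in microseconds. */
double usec_per_mask(double mb_per_s, int width, int height)
{
    return 1e6 / masks_per_second(mb_per_s, width, height);
}

/* Print one row for a measured configuration, using 64x64 masks. */
void print_row(const char *label, double mb_per_s)
{
    printf("%-18s %8.1f MB/s %10.0f masks/s %8.2f us/mask\n",
           label, mb_per_s,
           masks_per_second(mb_per_s, 64, 64),
           usec_per_mask(mb_per_s, 64, 64));
}
```

Calling print_row("XPutImage-Glamor", 80.0) and so on for each result
gives, under these assumptions, about 19,500 masks per second (roughly
51 us per mask) at 80MB/s, and about 7,300 masks per second (roughly
137 us per mask) at 30MB/s.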

My hope was that using ShmPixmaps would lower the driver overhead by
giving the driver a hint about how the data is intended to be used
(exactly once). The profile (at the end of this email) doesn't contain
any entries that look suspicious to me.


Any ideas on how to speed this up are very welcome.

Thank you in advance, Clemens


Profile

    SELF CUMULATIVE    FUNCTION
[   0,00%] [  92,31%]    [/usr/libexec/Xorg]
[   0,00%] [  10,31%]      In file [heap]
[   0,18%] [   5,37%]      ioctl
[   0,00%] [   4,68%]      r600_get_name
[   2,37%] [   2,40%]      surf_drm_to_winsys
[   2,37%] [   2,39%]      __memcpy_sse2_unaligned
[   2,01%] [   2,06%]      _int_malloc
[   0,08%] [   1,55%]      clock_gettime
[   1,49%] [   1,50%]      free
[   1,39%] [   1,41%]      si_draw_vbo
[   0,00%] [   1,33%]      util_blitter_get_next_surface_layer
[   1,25%] [   1,26%]      radeon_drm_cs_add_buffer
[   1,22%] [   1,23%]      radeon_winsys_surface_best
[   1,19%] [   1,21%]      radeon_winsys_surface_init
[   0,95%] [   0,96%]      r600_texture_create_object
[   0,94%] [   0,94%]      __GI_memset
[   0,93%] [   0,94%]      pthread_mutex_unlock
[   0,83%] [   0,88%]      set_tex_parameteri
[   0,83%] [   0,85%]      _int_free
[   0,80%] [   0,81%]      glamor_composite_clipped_region
[   0,04%] [   0,79%]      _mesa_Uniform1i
[   0,73%] [   0,74%]      unbind_texobj_from_image_units
[   0,63%] [   0,64%]      si_update_shaders
[   0,61%] [   0,61%]      si_make_texture_descriptor
[   0,60%] [   0,61%]      si_emit_framebuffer_state
[   0,59%] [   0,60%]      si_shader_select
[   0,59%] [   0,59%]      r600_texture_create
[   0,06%] [   0,55%]      RegionCreate
[   0,54%] [   0,55%]      hash_table_search
[   0,55%] [   0,55%]      st_choose_format
[   0,51%] [   0,53%]      __pthread_mutex_lock
[   0,00%] [   0,53%]      epoxy_glTexParameterfv_global_rewrite_ptr
[   0,50%] [   0,50%]      util_format_description
[   0,50%] [   0,50%]      _mesa_base_tex_format


More information about the xorg-driver-ati mailing list