Most efficient way to upload small coverage masks for radeonsi?
Clemens Eisserer
linuxhippy at gmail.com
Sun Nov 6 16:04:54 UTC 2016
Hello,
Java performs antialiased rendering by rasterizing coverage masks (<=
64x64 pixels in size, PictStandardA8) on the client instead of sending
trapezoid lists to the X server. The masks are uploaded using XPutImage
and are later used as the mask for XRenderComposite.
On the plus side, this approach is *way* better than the
XGetImage+XPutImage round trip that was needed before XRender was used.
The downside is the really high per-primitive overhead: SNA copes with
this workload quite well, but glamor does not.
Here are some upload bandwidth results for 64x64 masks +
XRenderComposite (src=vram_pixmap, mask=uploaded_mask, dst=window):
Haswell Laptop:
XPutImage-Glamor: 80MB/s
ShmPixmap-Glamor: 177.5MB/s
XPutImage-SNA: 585MB/s
ShmPixmap-SNA: 4000MB/s
Mullins Netbook:
XPutImage-Glamor: 30MB/s
ShmPixmap-Glamor: 33MB/s
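For scale, these figures can be converted into masks per second. The following is only a rough sketch under assumptions not stated above: that 1 MB means 10^6 bytes, that the bandwidth counts mask payload only (one byte per A8 pixel, no stride padding), and that every mask is the full 64x64 size. The names `MASK_BYTES`, `RESULTS_MB_PER_S` and `masks_per_second` are made up for this illustration.

```python
# Bytes in one full-size A8 coverage mask: 64 x 64 pixels, 1 byte per pixel.
MASK_BYTES = 64 * 64

# Upload bandwidth figures quoted above, in MB/s.
RESULTS_MB_PER_S = {
    "Haswell XPutImage-Glamor": 80.0,
    "Haswell ShmPixmap-Glamor": 177.5,
    "Haswell XPutImage-SNA": 585.0,
    "Haswell ShmPixmap-SNA": 4000.0,
    "Mullins XPutImage-Glamor": 30.0,
    "Mullins ShmPixmap-Glamor": 33.0,
}


def masks_per_second(mb_per_s, mask_bytes=MASK_BYTES, bytes_per_mb=1_000_000):
    """Convert a bandwidth in MB/s into uploaded masks per second.

    Assumes decimal megabytes and payload-only accounting (see lead-in).
    """
    return mb_per_s * bytes_per_mb / mask_bytes


if __name__ == "__main__":
    for name, bandwidth in RESULTS_MB_PER_S.items():
        rate = masks_per_second(bandwidth)
        # 1e6 / rate is the average time budget per mask in microseconds.
        print(f"{name:26s} {bandwidth:7.1f} MB/s  "
              f"{rate:10.0f} masks/s  {1e6 / rate:7.1f} us/mask")
```

Under these assumptions, 80 MB/s corresponds to roughly 19,500 mask uploads plus composites per second, and 30 MB/s to roughly 7,300, which is why the fixed per-primitive cost matters more than raw copy speed here.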
My hope was that using ShmPixmaps would help to lower the driver
overhead by giving the driver a hint about how the data is intended to
be used (exactly once). The profile (at the end of this email) doesn't
contain any entries that look suspicious to me.
Any ideas to speed this process up are highly welcome.
Thank you in advance, Clemens
Profile
SELF CUMULATIVE FUNCTION
[ 0,00%] [ 92,31%] [/usr/libexec/Xorg]
[ 0,00%] [ 10,31%] In file [heap]
[ 0,18%] [ 5,37%] ioctl
[ 0,00%] [ 4,68%] r600_get_name
[ 2,37%] [ 2,40%] surf_drm_to_winsys
[ 2,37%] [ 2,39%] __memcpy_sse2_unaligned
[ 2,01%] [ 2,06%] _int_malloc
[ 0,08%] [ 1,55%] clock_gettime
[ 1,49%] [ 1,50%] free
[ 1,39%] [ 1,41%] si_draw_vbo
[ 0,00%] [ 1,33%] util_blitter_get_next_surface_layer
[ 1,25%] [ 1,26%] radeon_drm_cs_add_buffer
[ 1,22%] [ 1,23%] radeon_winsys_surface_best
[ 1,19%] [ 1,21%] radeon_winsys_surface_init
[ 0,95%] [ 0,96%] r600_texture_create_object
[ 0,94%] [ 0,94%] __GI_memset
[ 0,93%] [ 0,94%] pthread_mutex_unlock
[ 0,83%] [ 0,88%] set_tex_parameteri
[ 0,83%] [ 0,85%] _int_free
[ 0,80%] [ 0,81%] glamor_composite_clipped_region
[ 0,04%] [ 0,79%] _mesa_Uniform1i
[ 0,73%] [ 0,74%] unbind_texobj_from_image_units
[ 0,63%] [ 0,64%] si_update_shaders
[ 0,61%] [ 0,61%] si_make_texture_descriptor
[ 0,60%] [ 0,61%] si_emit_framebuffer_state
[ 0,59%] [ 0,60%] si_shader_select
[ 0,59%] [ 0,59%] r600_texture_create
[ 0,06%] [ 0,55%] RegionCreate
[ 0,54%] [ 0,55%] hash_table_search
[ 0,55%] [ 0,55%] st_choose_format
[ 0,51%] [ 0,53%] __pthread_mutex_lock
[ 0,00%] [ 0,53%] epoxy_glTexParameterfv_global_rewrite_ptr
[ 0,50%] [ 0,50%] util_format_description
[ 0,50%] [ 0,50%] _mesa_base_tex_format