cairo performance and poulsbo driver

Johan Bilien jobi at via.ecp.fr
Fri Dec 18 00:10:42 PST 2009


On Wed, Dec 16, 2009, Johan Bilien wrote:
> Hi,
> 
> we are using the poulsbo driver, which was written by TG for Intel and is
> basically a wrapper around some Xpsb binary blob.
> 
> One of our main problems nowadays is 2D performance. I have been using
> the firefox-20090601 cairo trace as my benchmark, because our app is a
> Firefox-based browser.
> 
> Running cairo master (and pixman 0.16.2), the image backend is about 2.5
> times faster than the xlib one (by median time):
> 
> [ # ]  backend                        test  min(s) median(s) stddev.
> [  0]    image            firefox-20090601  110.229  119.423  4.00% 
> [  0]    xlib             firefox-20090601  297.572  298.082  0.82% 
> 
> So I thought that if I could make XRender use pixman instead of using
> the driver I might be able to get closer to the image backend
> performance.
> 
> I tried disabling either only the Composite hooks or all of EXA's hooks,
> but unfortunately I am still far from the image backend figures:
> 
> (Composite disabled)
> [ # ]  backend                         test   min(s) median(s) stddev. count
> [  0]     xlib             firefox-20090601  269.082  269.082   0.00% 1/1
> [  0]    image             firefox-20090601  101.397  101.397   0.00% 1/1
> 
> (all EXA hooks disabled)
> [ # ]  backend                         test   min(s) median(s) stddev. count
> [  0]     xlib             firefox-20090601  271.958  271.958   0.00% 1/1
> [  0]    image             firefox-20090601  100.829  100.829   0.00% 1/1
> 
> So I'm left wondering where the overhead of the xlib backend comes from.
> If I run sysprof (profile attached) while running the trace (in the
> Composite disabled case), I can see that pixman gets only 27.5% of the
> CPU time, while 39.4% is spent "in kernel". I vaguely suspect that
> this comes from moving pixmap data to and from VRAM, but really
> I have no idea.
> 
> Any idea on how to further investigate this?
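
As a quick check on the figures quoted above, the xlib/image ratio can be
computed from the median columns. This is only a sketch in Python: the
dictionary keys and the helper name `slowdown` are made up for illustration,
and the numbers are copied from the three tables.

```python
# Median times in seconds, copied from the three benchmark tables above.
# The configuration names are illustrative labels, not cairo-perf output.
MEDIANS = {
    "default": {"image": 119.423, "xlib": 298.082},
    "composite_disabled": {"image": 101.397, "xlib": 269.082},
    "all_exa_disabled": {"image": 100.829, "xlib": 271.958},
}


def slowdown(run):
    """Return how many times slower xlib is than image for one run."""
    return run["xlib"] / run["image"]


for name, run in MEDIANS.items():
    print(f"{name:20s} xlib/image = {slowdown(run):.2f}")
```

On these medians the ratio stays roughly between 2.5 and 2.7 in all three
configurations, so disabling the hooks does not close the gap.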

Profiling with oprofile shows that when running the firefox trace (xlib
backend, Composite EXA hook disabled), the CPU is idle much of the
time: the idle-loop symbols below account for roughly 40% of the samples.
Maybe it's waiting for the GPU a lot?

Here's the top of the profile:

samples  %        image name               app name                 symbol name
10082    33.9999  vmlinux-2.6.24-24-lpia   vmlinux-2.6.24-24-lpia   acpi_idle_enter_bm
3179     10.7207  vmlinux-2.6.24-24-lpia   vmlinux-2.6.24-24-lpia   get_page_from_freelist
1662      5.6048  vmlinux-2.6.24-24-lpia   vmlinux-2.6.24-24-lpia   acpi_idle_enter_simple
1317      4.4414  libpixman-1.so.0.16.2    libpixman-1.so.0.16.2    pixman_fill_sse2
1080      3.6421  libpixman-1.so.0.16.2    libpixman-1.so.0.16.2    bits_image_fetch_transformed
804       2.7114  libc-2.7.so              libc-2.7.so              (no symbols)
800       2.6979  Xorg                     Xorg                     (no symbols)
545       1.8379  vmlinux-2.6.24-24-lpia   vmlinux-2.6.24-24-lpia   native_safe_halt
497       1.6761  libpixman-1.so.0.16.2    libpixman-1.so.0.16.2    pixman_rasterize_edges
412       1.3894  libpixman-1.so.0.16.2    libpixman-1.so.0.16.2    pixman_blt_sse2
411       1.3860  vmlinux-2.6.24-24-lpia   vmlinux-2.6.24-24-lpia   do_page_fault
383       1.2916  vmlinux-2.6.24-24-lpia   vmlinux-2.6.24-24-lpia   kunmap_atomic
380       1.2815  libpixman-1.so.0.16.2    libpixman-1.so.0.16.2    fetch_pixel_x8r8g8b8
341       1.1500  libcairo.so.2.10905.0    libcairo.so.2.10905.0    _cairo_bentley_ottmann_tessellate_polygon
319       1.0758  libcairo-script-interpreter.so.2.10905.0 libcairo-script-interpreter.so.2.10905.0 _scan_file
239       0.8060  vmlinux-2.6.24-24-lpia   vmlinux-2.6.24-24-lpia   free_hot_cold_page
237       0.7992  libcairo-script-interpreter.so.2.10905.0 libcairo-script-interpreter.so.2.10905.0 csi_file_getc
224       0.7554  libcairo-script-interpreter.so.2.10905.0 libcairo-script-interpreter.so.2.10905.0 _csi_hash_table_lookup
192       0.6475  libz.so.1.2.3.3          libz.so.1.2.3.3          (no symbols)
142       0.4789  vmlinux-2.6.24-24-lpia   vmlinux-2.6.24-24-lpia   handle_mm_fault
116       0.3912  libX11.so.6.2.0          libX11.so.6.2.0          (no symbols)
108       0.3642  vmlinux-2.6.24-24-lpia   vmlinux-2.6.24-24-lpia   kmap_atomic_prot
102       0.3440  vmlinux-2.6.24-24-lpia   vmlinux-2.6.24-24-lpia   page_address
99        0.3339  vmlinux-2.6.24-24-lpia   vmlinux-2.6.24-24-lpia   finish_task_switch
94        0.3170  vmlinux-2.6.24-24-lpia   vmlinux-2.6.24-24-lpia   cpuidle_idle_call
85        0.2866  vmlinux-2.6.24-24-lpia   vmlinux-2.6.24-24-lpia   unmap_vmas
84        0.2833  libcairo-script-interpreter.so.2.10905.0 libcairo-script-interpreter.so.2.10905.0 _csi_parse_number
84        0.2833  libpixman-1.so.0.16.2    libpixman-1.so.0.16.2    sse2_composite_add_8888_8888
79        0.2664  vmlinux-2.6.24-24-lpia   vmlinux-2.6.24-24-lpia   release_pages
77        0.2597  libcairo-script-interpreter.so.2.10905.0 libcairo-script-interpreter.so.2.10905.0 token_end
76        0.2563  libxcb.so.1.0.0          libxcb.so.1.0.0          (no symbols)
75        0.2529  libfb.so                 libfb.so                 image_from_pict
75        0.2529  libpixman-1.so.0.16.2    libpixman-1.so.0.16.2    _pixman_run_fast_path
70        0.2361  libcairo.so.2.10905.0    libcairo.so.2.10905.0    _cairo_spline_decompose_into
70        0.2361  libpixman-1.so.0.16.2    libpixman-1.so.0.16.2    sse2_composite_over_n_8888_8888_ca
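
To make the "idle a lot" reading easier to check, the profile can be summed
per category. The sketch below is in Python and uses only a subset of the
rows above. Both the bucket names and the choice of which symbols count as
"idle" are my own assumptions, not something oprofile reports.

```python
from collections import defaultdict

# (percent, image, symbol) rows copied from the profile above (subset only).
ROWS = [
    (33.9999, "vmlinux", "acpi_idle_enter_bm"),
    (10.7207, "vmlinux", "get_page_from_freelist"),
    (5.6048, "vmlinux", "acpi_idle_enter_simple"),
    (4.4414, "libpixman", "pixman_fill_sse2"),
    (3.6421, "libpixman", "bits_image_fetch_transformed"),
    (1.8379, "vmlinux", "native_safe_halt"),
    (1.6761, "libpixman", "pixman_rasterize_edges"),
    (1.3894, "libpixman", "pixman_blt_sse2"),
    (1.3860, "vmlinux", "do_page_fault"),
    (1.2916, "vmlinux", "kunmap_atomic"),
    (1.2815, "libpixman", "fetch_pixel_x8r8g8b8"),
    (0.3170, "vmlinux", "cpuidle_idle_call"),
]

# Assumption: these kernel symbols all belong to the idle loop.
IDLE_SYMBOLS = {
    "acpi_idle_enter_bm",
    "acpi_idle_enter_simple",
    "native_safe_halt",
    "cpuidle_idle_call",
}


def bucket(image, symbol):
    """Assign one profile row to a coarse, hand-picked category."""
    if symbol in IDLE_SYMBOLS:
        return "idle"
    if image == "libpixman":
        return "pixman"
    if image == "vmlinux":
        return "kernel (non-idle)"
    return "other"


def summarize(rows):
    """Sum the sample percentages per category."""
    totals = defaultdict(float)
    for percent, image, symbol in rows:
        totals[bucket(image, symbol)] += percent
    return dict(totals)


for name, percent in sorted(summarize(ROWS).items(), key=lambda kv: -kv[1]):
    print(f"{name:18s} {percent:6.2f}%")
```

With these rows the idle bucket comes to a bit under 42% of all samples,
against about 12% for pixman and about 13% for the remaining kernel symbols,
most of which is page allocation and page fault handling.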



