Optimization idea: soft XvPutImage
Eeri.Kask at inf.tu-dresden.de
Mon Sep 22 04:55:27 PDT 2008
>> Even if you do not want to do stretch, I believe that the X Render
>> extension would require first copying the YUV data to a drawable and
>> then doing a drawable->drawable block transfer operation to do the
>> YUV transformation. In comparison, XvPutImage is a single call that
>> takes an XImage, which can be in shared memory, and would normally
>> be in YUV, and specifies the YUV->RGB conversion and stretch in a
>> single operation.
> As other people pointed out, XRender does allow arbitrary 3x3
> transformations of source images, but you are right that the XRender
> protocol would require putting the data in a drawable first.
To recapitulate, (1) YUV->RGB conversion combined with (2)
up-/downscaling is not a mere pixel-shuffling problem.
(1) is a linear color-space conversion, a 3x3 matrix multiplication
that one definitely does at most once per pixel;
(2) involves convolution, i.e. antialiasing or low-pass filtering,
which is probably the reason to decouple (1) and (2) (in the sense of
computational efficiency, not of software engineering).
If (2) is neglected, upscaling PAL to HDTV size results in garbage.
(For entertainment applications "linear interpolation" is good enough;
in science/medicine it often is not.)
Intel SSE2 intrinsics do a perfect job here; a speedup approaching an
order of magnitude can happen.
Once, in a test connecting 4 FireWire cameras concurrently at 640x480
frame size each and doing no image rescaling (i.e. basically only
YUV->RGB), the (Pentium 4) CPU usage easily doubled after nothing more
than light-mindedly injecting a "redundant" (i.e. not performing any
computation along the way) frame copy somewhere. :-)