Performance improvement to vga arbitration
Henry Zhao
henry.zhao at oracle.com
Wed Jun 9 23:02:43 PDT 2010
Proposal for improving vgaarb arbitration method
It appears that once a session is up, drivers in most cases perform
only non-legacy accesses. Non-legacy accesses do not need to block each
other; blocking arbitration is needed mostly for session initialization
and exit. To improve performance, we need to treat legacy and
non-legacy accesses differently, and allow non-legacy accesses to
proceed concurrently among devices without blocking each other.
Non-legacy access is assumed to be the default for operating functions
after initialization. Where legacy accesses are necessary for some of
them, drivers can redefine them on a per-function-group basis.
Here are some details:
(1) New lock for non-legacy access
Define another lock, vgadev->locks2 (locks2), for non-legacy access
locking, in addition to vgadev->locks (locks1), which is currently used
for legacy access locking.
Non-legacy access requests from a device that has no legacy access
decoding ability should always be honored without acquiring a lock.
Non-legacy access requests from a device that has legacy access
decoding ability need to acquire locks2 before proceeding.
A request for locks2 is blocked only when some other device already
holds locks1 (on the same resources). A request for locks1 is blocked
when some other device already holds locks1 or locks2 (on the same
resources). This means a request for locks2 should not be blocked just
because some other device already holds locks2 (on the same resources).
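As a rough illustration of the blocking rules above, here is a minimal
sketch. It is not the implemented patch: the struct, the SKETCH_RSRC_*
bits and the sketch_can_lock*() helpers are hypothetical names made up
for this example, and the real vgaarb code also tracks ownership and
wait queues, which are left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical resource bits used only in this sketch. */
#define SKETCH_RSRC_IO  0x1u
#define SKETCH_RSRC_MEM 0x2u

/* Hypothetical, simplified stand-in for struct vga_device. */
struct sketch_vga_device {
    bool decodes_legacy;  /* device has legacy decoding ability */
    unsigned int locks1;  /* resources held for legacy access */
    unsigned int locks2;  /* resources held for non-legacy access */
};

/*
 * locks1 request: blocked when any other device holds locks1 or locks2
 * on an overlapping resource.
 */
static bool sketch_can_lock1(const struct sketch_vga_device *devs,
                             size_t n, size_t self, unsigned int rsrc)
{
    for (size_t i = 0; i < n; i++) {
        if (i == self)
            continue;
        if ((devs[i].locks1 | devs[i].locks2) & rsrc)
            return false;
    }
    return true;
}

/*
 * locks2 request: always honored for a device without legacy decoding
 * ability; otherwise blocked only by another device's locks1 on an
 * overlapping resource. Another device's locks2 never blocks it.
 */
static bool sketch_can_lock2(const struct sketch_vga_device *devs,
                             size_t n, size_t self, unsigned int rsrc)
{
    if (!devs[self].decodes_legacy)
        return true;
    for (size_t i = 0; i < n; i++) {
        if (i == self)
            continue;
        if (devs[i].locks1 & rsrc)
            return false;
    }
    return true;
}
```

With this shape, two devices can hold locks2 on the same resource at
the same time, while a locks1 holder excludes everybody else on that
resource.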
Currently we have four defines for resource requests:
VGA_RSRC_LEGACY_IO
VGA_RSRC_LEGACY_MEM
VGA_RSRC_NORMAL_IO
VGA_RSRC_NORMAL_MEM
but only two strings for them, "io" and "mem". Add "IO" and "MEM" for
non-legacy accesses.
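To make the string mapping concrete, here is a small sketch. The
helper name sketch_rsrc_from_token() is hypothetical, and the numeric
values of the defines are an assumption (they are meant to mirror the
kernel's vgaarb header, but check the header rather than this mail).
Combined strings such as "io+mem" are not handled here.

```c
#include <string.h>

/* Assumed values; intended to mirror the kernel's vgaarb header. */
#define VGA_RSRC_LEGACY_IO  0x01u
#define VGA_RSRC_LEGACY_MEM 0x02u
#define VGA_RSRC_NORMAL_IO  0x04u
#define VGA_RSRC_NORMAL_MEM 0x08u

/*
 * Hypothetical helper: map a single resource string to its bit.
 * Lower case selects the legacy resource, upper case the non-legacy
 * one. Returns 0 for an unknown string.
 */
static unsigned int sketch_rsrc_from_token(const char *tok)
{
    if (strcmp(tok, "io") == 0)
        return VGA_RSRC_LEGACY_IO;
    if (strcmp(tok, "mem") == 0)
        return VGA_RSRC_LEGACY_MEM;
    if (strcmp(tok, "IO") == 0)
        return VGA_RSRC_NORMAL_IO;
    if (strcmp(tok, "MEM") == 0)
        return VGA_RSRC_NORMAL_MEM;
    return 0;
}
```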
(2) Function group based resource request
We need to distinguish between decoding ability and decoding request
(resource request). Decoding ability is still maintained in struct
vga_device of the kernel driver, as
unsigned int decodes;
and as a userland copy in dev->vgaarb_rsrc.
Currently every lock/unlock mechanism takes its resource request from
dev->vgaarb_rsrc, which is actually the decoding ability. In the new
design, however, this is only the case for xf86VGAarbiterLock() and
xf86VGAarbiterUnlock(), which run during session initialization and
exit. During normal operation, the resource request is determined by a
resource mask associated with each function.
Wrapping functions are grouped into MAX_VGAARB_OPS_MASK groups, with a
resource mask assigned to each group. The default mask is
VGA_RSRC_NORMAL_IO|VGA_RSRC_NORMAL_MEM
meaning non-legacy access, but drivers can redefine any of them. In the
extreme case, if a driver redefines all masks to
VGA_RSRC_NORMAL_IO|VGA_RSRC_NORMAL_MEM|
VGA_RSRC_LEGACY_IO|VGA_RSRC_LEGACY_MEM
we are back to the old arbitration algorithm.
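The per-group masks could look roughly like the sketch below. Only
MAX_VGAARB_OPS_MASK and the VGA_RSRC_* names come from the proposal;
the value 8, the numeric resource values, the struct and the sketch_*
helpers are assumptions made up for illustration.

```c
/* Assumed values; intended to mirror the kernel's vgaarb header. */
#define VGA_RSRC_LEGACY_IO  0x01u
#define VGA_RSRC_LEGACY_MEM 0x02u
#define VGA_RSRC_NORMAL_IO  0x04u
#define VGA_RSRC_NORMAL_MEM 0x08u

/* Hypothetical group count; the proposal does not state the number. */
#define MAX_VGAARB_OPS_MASK 8

#define SKETCH_DEFAULT_MASK (VGA_RSRC_NORMAL_IO | VGA_RSRC_NORMAL_MEM)
#define SKETCH_LEGACY_BITS  (VGA_RSRC_LEGACY_IO | VGA_RSRC_LEGACY_MEM)

/* Hypothetical per-driver table: one resource mask per function group. */
struct sketch_ops_masks {
    unsigned int mask[MAX_VGAARB_OPS_MASK];
};

/* Every group starts out as non-legacy access. */
static void sketch_masks_init(struct sketch_ops_masks *m)
{
    for (unsigned int i = 0; i < MAX_VGAARB_OPS_MASK; i++)
        m->mask[i] = SKETCH_DEFAULT_MASK;
}

/* A driver redefines the mask of one group; -1 on a bad group index. */
static int sketch_masks_set(struct sketch_ops_masks *m,
                            unsigned int group, unsigned int rsrc)
{
    if (group >= MAX_VGAARB_OPS_MASK)
        return -1;
    m->mask[group] = rsrc;
    return 0;
}

/* A group needs the blocking (locks1) path only if it asks for legacy. */
static int sketch_group_is_legacy(const struct sketch_ops_masks *m,
                                  unsigned int group)
{
    return (m->mask[group] & SKETCH_LEGACY_BITS) != 0;
}
```

A driver that never calls sketch_masks_set() stays entirely on the
non-blocking path; one that sets the legacy bits on every group gets
the old behavior.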
(3) Other changes
* pci_device_vgaarb_set_target() is called heavily. Currently it
involves two syscalls. These calls can be skipped if the device in
question is the same as in the previous call (recorded in
pci_sys->vga_target). This contributes a major performance improvement.
* OpenConsole()/CloseConsole() need to be protected by lock and unlock,
as they may access VGA registers. Further, OpenConsole()/CloseConsole()
are run only on a session with the primary device.
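The set_target shortcut amounts to a one-entry cache. The sketch below
shows the idea only: the sketch_* types and function are hypothetical
stand-ins, not the libpciaccess code, and the two syscalls are replaced
by a counter so that the effect is visible. It assumes dev is never
NULL.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the libpciaccess structures. */
struct sketch_pci_device {
    int id;
};

struct sketch_pci_sys {
    const struct sketch_pci_device *vga_target; /* last target set */
    unsigned int syscalls;                      /* syscalls issued */
};

/*
 * Set the arbiter target, skipping the work when the device is the
 * same as in the previous call. Assumes dev != NULL.
 */
static int sketch_set_target(struct sketch_pci_sys *sys,
                             const struct sketch_pci_device *dev)
{
    if (sys->vga_target == dev)
        return 0;           /* same target as last time: nothing to do */

    sys->syscalls += 2;     /* stand-in for the two real syscalls */
    sys->vga_target = dev;
    return 0;
}
```

On a single-card system, or when one driver issues a long run of
operations, every call after the first takes the early return.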
I am posting the design idea for comments.
(This has been implemented and tested on both Linux and Solaris systems.)
-Henry