Performance improvement to vga arbitration

Henry Zhao henry.zhao at
Wed Jun 9 23:02:43 PDT 2010

Proposal for improving vgaarb arbitration method

It appears that once a session is up, drivers in most cases perform only
non-legacy accesses. Non-legacy accesses do not need to block each other;
blocking arbitration is needed mostly for session initialization and exit.
To improve performance, we need to treat legacy and non-legacy accesses
differently, and allow non-legacy accesses from different devices to proceed
concurrently without blocking each other. Non-legacy access is assumed to be
the default for operating functions after initialization. If legacy accesses
are necessary for some of them, drivers can redefine them per function group.
Here are some details:

(1) New lock for non-legacy access

  Define another lock, vgadev->locks2 (locks2), for non-legacy access,
  in addition to vgadev->locks (locks1), which is currently used for
  legacy access.

  Non-legacy access requests from a device that does not have legacy access
  decoding ability should always be honored without needing to acquire a
  lock. Non-legacy access requests from a device that has legacy access
  decoding ability need to acquire locks2 before proceeding.

  A request for locks2 is blocked only when some other device already holds
  locks1 (on the same resources). A request for locks1 is blocked when some
  other device already holds locks1 or locks2 (on the same resources). This
  means a request for locks2 should not be blocked just because some other
  device already holds locks2 (on the same resources).
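
  To make the blocking rules above concrete, here is a minimal sketch in C.
  It is not the actual vgaarb code: the enum, the struct, and the function
  name request_blocks() are hypothetical and exist only for illustration.
  Only the names locks1/locks2 come from the proposal.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lock kinds, named after the proposal's locks1/locks2. */
enum lock_kind {
    LOCKS1_LEGACY,     /* locks1: legacy access */
    LOCKS2_NONLEGACY   /* locks2: non-legacy access */
};

/* Hypothetical summary of what OTHER devices currently hold
 * on the same resources. */
struct others_state {
    bool holds_locks1;
    bool holds_locks2;
};

/* Returns true when the request has to block.
 * decodes_legacy: whether the requesting device has legacy decoding ability. */
static bool request_blocks(enum lock_kind want, bool decodes_legacy,
                           struct others_state others)
{
    assert(want == LOCKS1_LEGACY || want == LOCKS2_NONLEGACY);

    if (want == LOCKS2_NONLEGACY) {
        /* A device without legacy decoding needs no lock at all. */
        if (!decodes_legacy)
            return false;
        /* locks2 holders never block each other; only locks1 does. */
        return others.holds_locks1;
    }

    /* locks1 is exclusive against both kinds of lock. */
    return others.holds_locks1 || others.holds_locks2;
}
```

  The point of the asymmetry is that any number of devices can hold locks2
  at the same time, while locks1 behaves like an exclusive lock.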

  Currently we have 4 defines for resource request:


  but only two strings for them, "io" and "mem". Add "IO" and "MEM" for
  non-legacy accesses.
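
  As a rough illustration of the four-string scheme, the mapping could look
  like the sketch below. The flag names, their values, and the function
  rsrc_from_string() are assumptions made for this example, not the real
  defines; only the four strings come from the proposal.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical resource flags; names and values are illustrative only. */
enum {
    RSRC_NONE       = 0x00,
    RSRC_LEGACY_IO  = 0x01,
    RSRC_LEGACY_MEM = 0x02,
    RSRC_NORMAL_IO  = 0x04,
    RSRC_NORMAL_MEM = 0x08
};

/* Map a resource string to a flag. Lower case selects legacy access,
 * upper case selects non-legacy access. strcmp() is case-sensitive,
 * so "io" and "IO" are distinct. Unknown strings map to RSRC_NONE. */
static unsigned int rsrc_from_string(const char *s)
{
    assert(s != NULL);

    if (strcmp(s, "io") == 0)
        return RSRC_LEGACY_IO;
    if (strcmp(s, "mem") == 0)
        return RSRC_LEGACY_MEM;
    if (strcmp(s, "IO") == 0)
        return RSRC_NORMAL_IO;
    if (strcmp(s, "MEM") == 0)
        return RSRC_NORMAL_MEM;
    return RSRC_NONE;
}
```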

(2) Function group based resource request

  We need to distinguish between decoding ability and decoding request
  (resource request). Decoding ability is still maintained in the driver's
  struct vga_device with

        unsigned int decodes;

  and a userland copy in dev->vgaarb_rsrc.

  Currently all locking/unlocking uses resource requests from
  dev->vgaarb_rsrc, which is actually the decoding ability. In the new design
  this is only the case for xf86VGAarbiterLock() and xf86VGAarbiterUnlock(),
  which run during session initialization and exit. During normal operation,
  the resource is determined by a resource mask associated with each function.

  Wrapping functions are grouped into MAX_VGAARB_OPS_MASK groups, with a
  resource mask assigned to each group. The default setting of each mask is
  non-legacy access, but drivers can redefine any of them. In the extreme
  case, if a driver redefines all masks to legacy access, we are back to the
  old arbitration algorithm.
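
  A minimal sketch of the per-group masks, assuming non-legacy access as the
  default, as stated in the introduction. Only MAX_VGAARB_OPS_MASK is a name
  from the proposal. Its value here, the flag names, struct ops_masks, and
  the three helpers are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical value; the proposal does not state the number of groups. */
#define MAX_VGAARB_OPS_MASK 8

/* Hypothetical resource flags, for illustration only. */
#define RSRC_LEGACY_IO   0x01u
#define RSRC_LEGACY_MEM  0x02u
#define RSRC_NORMAL_IO   0x04u
#define RSRC_NORMAL_MEM  0x08u
#define RSRC_LEGACY      (RSRC_LEGACY_IO | RSRC_LEGACY_MEM)
#define RSRC_NONLEGACY   (RSRC_NORMAL_IO | RSRC_NORMAL_MEM)

/* One resource mask per group of wrapping functions. */
struct ops_masks {
    unsigned int mask[MAX_VGAARB_OPS_MASK];
};

/* Every group starts out requesting non-legacy access. */
static void ops_masks_init(struct ops_masks *m)
{
    assert(m != NULL);
    for (int i = 0; i < MAX_VGAARB_OPS_MASK; i++)
        m->mask[i] = RSRC_NONLEGACY;
}

/* A driver overrides the mask of a single function group. */
static void ops_masks_set(struct ops_masks *m, int group, unsigned int rsrc)
{
    assert(m != NULL && group >= 0 && group < MAX_VGAARB_OPS_MASK);
    m->mask[group] = rsrc;
}

/* The wrapper for a function in `group` asks which resources to lock. */
static unsigned int ops_masks_get(const struct ops_masks *m, int group)
{
    assert(m != NULL && group >= 0 && group < MAX_VGAARB_OPS_MASK);
    return m->mask[group];
}
```

  A driver that needs legacy access only for one group would call
  ops_masks_set() for that group and leave the rest at the default.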

(3) Other changes

  * pci_device_vgaarb_set_target() is called heavily. Currently it involves
    two syscalls. These calls can be skipped if the device in question is
    the same as in the previous call (recorded in pci_sys->vga_target). This
    contributes a major performance improvement.

  * OpenConsole()/CloseConsole() need to be protected by lock and unlock, as
    they may perform VGA register accesses. Further,
    OpenConsole()/CloseConsole() are run only on a session with the primary
    device.
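
  The set_target caching in the first item can be sketched as follows. This
  is not the libpciaccess code: struct device, struct pci_sys_state, the
  syscall counter, and set_target() are hypothetical stand-ins. Only the
  field name vga_target mirrors pci_sys->vga_target.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a PCI device handle. */
struct device {
    int id;
};

/* Hypothetical stand-in for the global pci_sys state. */
struct pci_sys_state {
    const struct device *vga_target; /* device set by the previous call */
    int syscalls;                    /* counter, for illustration only */
};

/* Select `dev` as the arbitration target, skipping the work when it is
 * already the current target. */
static void set_target(struct pci_sys_state *sys, const struct device *dev)
{
    assert(sys != NULL && dev != NULL);

    if (sys->vga_target == dev)
        return;             /* same device as last time: nothing to do */

    sys->syscalls += 2;     /* stands in for the two syscalls */
    sys->vga_target = dev;  /* remember the target for the next call */
}
```

  With a single active device, every call after the first one returns
  without entering the kernel.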

I am posting the design idea for comments.

(This has been implemented and tested on both Linux and Solaris systems.)


More information about the xorg-devel mailing list