[PATCH] Allocate Xv buffers to GTT.
Michel Dänzer
michel at daenzer.net
Thu Feb 11 23:42:23 PST 2010
On Fri, 2010-02-12 at 01:36 +0200, Pauli Nieminen wrote:
> 2010/2/11 Michel Dänzer <michel at daenzer.net>:
> > On Wed, 2010-02-10 at 22:44 +0200, Pauli Nieminen wrote:
> >> KMS doesn't have acceleration for upload to VRAM. memcpy/memmove to VRAM
> >> directly is very slow (40M/s in benchmark), which causes visible problems
> >> with video.
> >>
> >> Allocating the video buffer in GTT gives good performance (350-450M/s)
> >> for the memmove operation. This is a nice performance boost for Xv under KMS.
> >>
> >> There is still a possibility to improve by adding a BITBLT transfer to VRAM
> >> which would handle tiling and endian swapping.
> >
> > What tiling? Byte swapping is done as part of copying to the BO, so I'm
> > not sure how an additional blit could improve anything.
> >
>
> I would think that a byte-swapping copy would be slow (unless Power has
> an instruction for that) [...]
Which it does - have you looked at RADEONCopySwap()? Also,
unfortunately, not all byte swapping bits work on all GPUs, in
particular on R300 generation GPUs most of them don't seem to work.
Even if there were follow-up work to do, there's no need to speculate
about that here.
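
For reference, a byte-swapping copy in plain C can be sketched roughly as
below. This is only an illustration under stated assumptions: it swaps
32-bit words, the names swap32() and copy_swap32() are hypothetical, and it
is not the driver's RADEONCopySwap(). On PowerPC the swap can map to the
byte-reversed load/store instructions (lwbrx/stwbrx), so such a copy should
not need to be much slower than a plain one.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the radeon driver): reverse the byte
 * order of a single 32-bit word. */
static uint32_t swap32(uint32_t v)
{
    return (v >> 24) |
           ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) |
           (v << 24);
}

/* Hypothetical helper: copy `count` 32-bit words from src to dst,
 * byte-swapping each word on the way. The swap happens as part of the
 * copy, so no second pass over the destination is needed. */
static void copy_swap32(uint32_t *dst, const uint32_t *src, size_t count)
{
    for (size_t i = 0; i < count; i++)
        dst[i] = swap32(src[i]);
}
```
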
> >> diff --git a/src/radeon_crtc.c b/src/radeon_crtc.c
> >> index 556b461..8384af1 100644
> >> --- a/src/radeon_crtc.c
> >> +++ b/src/radeon_crtc.c
> >> @@ -564,7 +564,7 @@ radeon_crtc_shadow_allocate (xf86CrtcPtr crtc, int width, int height)
> >> * setter for offscreen area locking in EXA currently. So, we just
> >> * allocate offscreen memory and fake up a pixmap header for it.
> >> */
> >> - rotate_offset = radeon_legacy_allocate_memory(pScrn, &radeon_crtc->crtc_rotate_mem, size, align);
> >> + rotate_offset = radeon_legacy_allocate_memory(pScrn, &radeon_crtc->crtc_rotate_mem, size, align, 0);
> >
> > This should probably be in VRAM.
> >
>
> If EXA is using upload to screen to copy the cursor to this memory
> location, then yes. I don't know how that stuff works.
Cursor? This is a rotated scanout buffer.
> >> @@ -3179,7 +3180,7 @@ RADEONAllocateSurface(
> >> pitch = ((w << 1) + 15) & ~15;
> >> size = pitch * h;
> >>
> >> - offset = radeon_legacy_allocate_memory(pScrn, &surface_memory, size, 64);
> >> + offset = radeon_legacy_allocate_memory(pScrn, &surface_memory, size, 64, 0);
> >> if (offset == 0)
> >> return BadAlloc;
> >
> > And this?
> >
>
> No. It would be better to allocate to GTT.
Why? These 'offscreen surfaces' are intended for direct access from
other PCI(e) devices. I don't think that can work in general if the
surfaces aren't in VRAM. (I'm doubtful about this stuff working at all
with KMS, but still :)
--
Earthling Michel Dänzer | http://www.vmware.com
Libre software enthusiast | Debian, X and DRI developer