[cairo] New ARMv7-A (NEON) optimisations for Pixman
Jeff Muizelaar
jeff at infidigm.net
Thu May 7 08:15:49 PDT 2009
On Thu, May 07, 2009 at 11:34:22AM +0000, Jonathan Morton wrote:
> One removes the #ifdef magic kludge from Ian's code, and replaces it
> with the autoconf test from my previous patch. Just a bit of cleanup.
>
> The other adds some basic NEON blitters for RGB565 framebuffers,
> covering SRC RGB565 and SRC xRGB8888. On our test hardware they get
> very close to maximum memory bandwidth.
>
> --
> ------
> From: Jonathan Morton
> jonathan.morton at movial.com
>
> From d972ecbb7c8c19589cbd133a7b2a3a900fa0856c Mon Sep 17 00:00:00 2001
> From: Jonathan Morton <jmorton at sd070.hel.movial.fi>
> Date: Thu, 7 May 2009 11:54:15 +0300
> Subject: [PATCH] Test USE_GCC_INLINE_ASM instead of USE_NEON_INLINE_ASM. The former is now Autoconf enabled, and does what it says on the tin.
Pushed.
> From f2a9ed3645013b6a95b92887f7a0577fd151f23d Mon Sep 17 00:00:00 2001
> From: Jonathan Morton <jmorton at sd070.hel.movial.fi>
> Date: Thu, 7 May 2009 12:20:02 +0300
> Subject: [PATCH] Add some NEON blitters for 16-bit framebuffers.
>
> ---
> pixman/pixman-arm-neon.c | 237 +++++++++++++++++++++++++++++++++++++++++++++-
> pixman/pixman-arm-neon.h | 30 ++++++
> pixman/pixman-pict.c | 12 +++
> pixman/pixman-utils.c | 1 +
> 4 files changed, 279 insertions(+), 1 deletions(-)
>
> diff --git a/pixman/pixman-arm-neon.c b/pixman/pixman-arm-neon.c
> index 51f7d55..3517d2d 100644
> --- a/pixman/pixman-arm-neon.c
> +++ b/pixman/pixman-arm-neon.c
> @@ -1,5 +1,5 @@
> /*
> - * Copyright © 2009 ARM Ltd
> + * Copyright © 2009 ARM Ltd, Movial Creative Technologies Oy
> *
> * Permission to use, copy, modify, distribute, and sell this software and its
> * documentation for any purpose is hereby granted without fee, provided that
> @@ -21,6 +21,7 @@
> * SOFTWARE.
> *
> * Author: Ian Rickards (ian.rickards at arm.com)
> + * Author: Jonathan Morton (jonathan.morton at movial.com)
> *
> */
>
> @@ -31,6 +32,9 @@
> #include "pixman-arm-neon.h"
>
> #include <arm_neon.h>
> +#include <string.h>
> +#include <stdio.h>
> +#include <assert.h>
I don't think these headers are needed.
>
> static force_inline uint8x8x4_t unpack0565(uint16x8_t rgb)
> @@ -1376,3 +1380,234 @@ fbCompositeSrcAdd_8888x8x8neon (pixman_op_t op,
> }
> }
>
> +#ifdef USE_GCC_INLINE_ASM
> +
> +void
> +fbCompositeSrc_16x16neon (
> + pixman_op_t op,
> + pixman_image_t * pSrc,
> + pixman_image_t * pMask,
> + pixman_image_t * pDst,
> + int16_t xSrc,
> + int16_t ySrc,
> + int16_t xMask,
> + int16_t yMask,
> + int16_t xDst,
> + int16_t yDst,
> + uint16_t width,
> + uint16_t height)
> +{
> + uint16_t *dstLine, *srcLine;
> + uint32_t dstStride, srcStride;
> +
> + if(!height || !width)
> + return;
> +
> + /* We simply copy 16-bit-aligned pixels from one place to another. */
> + fbComposeGetStart (pSrc, xSrc, ySrc, uint16_t, srcStride, srcLine, 1);
> + fbComposeGetStart (pDst, xDst, yDst, uint16_t, dstStride, dstLine, 1);
> +
> + /* Preload the first input scanline */
> + {
> + uint16_t *srcPtr = srcLine;
> + uint32_t count = width;
> +
> + asm volatile (
> + "0: @ loop \n"
> + " SUBS %[count], %[count], #32 \n"
> + " PLD [%[src]] \n"
> + " ADD %[src], %[src], #64 \n"
> + " BGT 0b \n"
> +
I think it would be better if you used lowercase assembler mnemonics, to be
consistent with the rest of the file.
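
For illustration, a lowercase version of the preload loop might look roughly
like the sketch below. The helper name preload_scanline_16, the operand
constraints, the returned line count, and the non-ARM fallback using
__builtin_prefetch are assumptions made for this example; they are not taken
from the patch, whose asm block is cut off in the quote above.

```c
#include <stdint.h>

/* Hypothetical helper (not in the patch): preload one scanline of 16-bit
 * pixels, one 64-byte step (32 pixels) per iteration.  Returns the number
 * of steps, which is only there so the sketch can be checked.
 * Assumes width > 0, as the quoted function returns early otherwise. */
static uint32_t
preload_scanline_16 (const uint16_t *src, uint32_t width)
{
    uint32_t lines = (width + 31) / 32;

#if defined(__GNUC__) && defined(__arm__)
    uint32_t count = width;

    /* Same instructions as the quoted loop, in lowercase.
     * The constraints below are an assumption. */
    asm volatile (
        "0: @ loop                         \n"
        "   subs  %[count], %[count], #32  \n"
        "   pld   [%[src]]                 \n"
        "   add   %[src], %[src], #64      \n"
        "   bgt   0b                       \n"
        : [src] "+r" (src), [count] "+r" (count)
        :
        : "cc"
    );
#else
    /* Portable stand-in for non-ARM builds (GCC/Clang builtin). */
    for (uint32_t i = 0; i < lines; i++)
        __builtin_prefetch (src + 32 * i);
#endif

    return lines;
}
```

Only the mnemonic case differs from the quoted loop; the behaviour is meant
to be the same.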
-Jeff