[xorg] X3100 OpenGL incredibly slow and buggy on 2.2.0
Roland Scheidegger
sroland at tungstengraphics.com
Wed Jan 2 07:27:45 PST 2008
Richard Goedeken wrote:
>> Please read this :
>> http://wiki.cchtml.com/index.php/Glxgears_is_not_a_Benchmark
>>
>> Stephane
>
> You're right, glxgears is not a comprehensive 3D acceleration benchmark, but it
> does measure some narrow window of system hardware performance. Let's look at
> my 2 main systems for comparison:
>
> Desktop PC:
> - 64-bit Gentoo, older stable drivers
> - Gigabyte GA-K8N Ultra-9 socket939 nForce4 Ultra ATX
> - Athlon64 3800 x2 - 2.0GHz 1MB total cache
> - Asus EN6600/TD/256M Silencer
> - GeForce 6600, 256M ram
> - 128 bit DDR2 ram 500MHz, 8.0 GB/s
> - 300MHz core, 8 pixel shaders, 3 vertex shaders
> - PCI Express x16
> - 2GB PC3200 DDR 400MHz SDRAM (6.4 GB/s)
>
> Set-top PC:
> - 64-bit Fedora 8, git drivers for mesa/drm/intel
> - AOpen MiniPC MP965-DR, Intel 965GM chipset
> - Intel Core 2 Duo T7500 2.2GHz Socket P 4MB total cache
> - X3100 graphics
> - shared system memory, total bandwidth 10.7 GB/s
> - 500MHz core, 8 unified shaders
> - 2*1GB dual channel PC5300 DDR2 667MHZ SDRAM
>
> By all measures it looks like the settop box would meet or beat the desktop one.
> But the difference in 3D performance in favor of the desktop is large, not
> small. Even if you write off a factor of 6 (!) times as many fps in glxgears,
> it remains that the N64 emulator I'm working on runs flawlessly with low CPU
> usage on the desktop box but bogs down on the MiniPC. glxinfo says direct
> rendering is on. If someone could suggest a good 3D benchmark for Linux I would
> be willing to run it for an experiment. Something is wrong here. Either the
> shared memory architecture really just clobbers real-world performance or else
> something is far from optimal in the software stack.

Maybe the emulator hits a software fallback somewhere. If it uses a lot
of cpu, you could also try oprofile to see where the time goes.

But your suggestion that the i965 should beat a 6600 is far, far off. The
i965 has 8 unified shaders; iirc this is a single simd-8 array of scalar
units. The 6600 units are not scalar: the 3 vertex shaders are
organized as 1 vec4 + 1 scalar each, and the pixel shaders are vec4 (they
can also act as vec2+vec2 or vec3+vec1). Right there that's the equivalent
of 47 scalar units... Not to mention that there are actually 16 vec4
pixel shader units, not 8, since each pixel pipeline has 2 shader alus
(but they are not identical and can't do the same things: unit 1 can do
texture lookups OR a mul, while unit 2 is alu only and can do mad).
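
For anyone who wants to check the arithmetic, here is a rough sketch of the unit count in Python. It uses only the figures given above; the constant and function names are made up for illustration. It also ignores clock speed (300MHz vs 500MHz) and the fact that the two pixel alus are not identical, so treat the second number as an upper bound rather than a real throughput figure.

```python
# Scalar-equivalent ALU lanes, counted from the figures in the paragraph above.
# All names here are illustrative; this is a back-of-the-envelope count only.

I965_SCALAR_UNITS = 8              # 8 unified shaders, one simd-8 array of scalars

VERTEX_SHADERS = 3                 # GeForce 6600
LANES_PER_VERTEX_SHADER = 4 + 1    # 1 vec4 + 1 scalar each

PIXEL_PIPELINES = 8                # GeForce 6600
LANES_PER_PIXEL_ALU = 4            # each pixel shader alu is vec4


def scalar_lanes_6600(alus_per_pixel_pipeline):
    """Count scalar-equivalent lanes for a given number of alus per pixel pipeline."""
    vertex_lanes = VERTEX_SHADERS * LANES_PER_VERTEX_SHADER
    pixel_lanes = PIXEL_PIPELINES * alus_per_pixel_pipeline * LANES_PER_PIXEL_ALU
    return vertex_lanes + pixel_lanes


one_alu = scalar_lanes_6600(1)     # 3*5 + 8*4  = 47, the figure quoted above
two_alus = scalar_lanes_6600(2)    # 3*5 + 16*4 = 79, counting both alus per pipeline

print(f"i965: {I965_SCALAR_UNITS} scalar units")
print(f"6600: {one_alu} scalar-equivalent lanes (one alu per pixel pipeline)")
print(f"6600: {two_alus} scalar-equivalent lanes (both alus, upper bound)")
```

Even with the conservative count, that is 47 lanes against 8 before clock speed is taken into account.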

Also, the 6600 may not have a lot of memory bandwidth, but its memory
latency is certainly much lower, plus it has far more advanced methods
for saving memory bandwidth (and pixel processing too), such as early-z
and z buffer compression.
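
The peak bandwidth numbers from the quoted spec lists can be reproduced with a short calculation. This is only a sketch and rests on two assumptions: that the quoted MHz figures are effective transfer rates, and that "dual channel" means two 64-bit channels. The function name is made up. Peak bandwidth says nothing about latency, or about the IGP having to share that memory with the CPU.

```python
# Peak theoretical memory bandwidth = bus width in bytes * transfers per second.
# Assumptions (not stated in the thread): the MHz values are effective transfer
# rates, and dual channel means 2 x 64-bit channels.

def peak_bandwidth_gb_s(bus_width_bits, transfers_per_second):
    """Return peak bandwidth in GB/s (1 GB = 1e9 bytes)."""
    return bus_width_bits / 8 * transfers_per_second / 1e9


geforce_6600 = peak_bandwidth_gb_s(128, 500e6)      # 128-bit bus at 500MHz
x3100_shared = peak_bandwidth_gb_s(2 * 64, 667e6)   # dual channel DDR2 667

print(f"GeForce 6600 local memory: {geforce_6600:.1f} GB/s")
print(f"X3100 shared system memory: {x3100_shared:.1f} GB/s")
```

These match the 8.0 GB/s and 10.7 GB/s quoted above, which is the point: the raw number favors the IGP, and it still loses.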

Today's IGPs (be it from intel, amd or nvidia) can at best rival cards
of the nvidia 6200tc or ati x300 hm class, and even that might be a
stretch...

Roland