<br><br><div class="gmail_quote">2010/3/24 Michel Dänzer <span dir="ltr"><<a href="mailto:michel@daenzer.net">michel@daenzer.net</a>></span><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<div><div></div><div class="h5"><br>
><br>
> so i am assuming that most of the work is done by some X server<br>
> "painting code" to the shadowfb in system ram, which is then DMA'ed to<br>
> the scan out buffer in the GPU VRAM.<br>
<br>
</div></div>It's only DMA'ed if you implemented that. ;)<br></blockquote><div><br>so it is PIO not DMA? just curious, where is the code that is doing that transfer?<br><br></div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<div class="im"><br>
> since i see a big difference between solaris/sparc and linux/x86-64,<br>
> one idea that comes to mind is that there could be some overhead due to<br>
> the difference in endianness between the solaris and the linux box.<br>
<br>
</div>I don't think that matters here; we can use GPU facilities to handle<br>
endianness for direct CPU access to the framebuffer.<br>
<br></blockquote><div><br>sorry i don't understand what you mean. are you saying i should no longer use shadowfb?<br> <br></div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
Have you tried profiling the X server in both setups to see where the<br>
cycles are being spent in each case? I'd guess that maybe the x86 CPU is<br>
faster and/or has higher write throughput to VRAM, e.g. due to write<br>
combining.<br>
<font color="#888888"><br></font></blockquote></div><br>i'll try that. thx.<br>-jfs<br><br>