<br><br><div class="gmail_quote">2010/3/24 Michel Dänzer <span dir="ltr">&lt;<a href="mailto:michel@daenzer.net">michel@daenzer.net</a>&gt;</span><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"> <div><div></div><div class="h5"><br> ><br> > so i am assuming that most of the work is done by some X server<br> > "painting code" to the shadowfb in system ram, which is then DMA'ed to<br> > the scan out buffer in the GPU VRAM.<br> <br> </div></div>It's only DMA'ed if you implemented that. ;)<br></blockquote><div><br>so it is PIO, not DMA? just curious, where is the code that is doing that transfer?<br><br></div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"> <div class="im"><br> > since i see a big difference between solaris/sparc and linux/x86-64,<br> > one idea that comes to mind is that there could be some overhead due to<br> > the difference in endianness between the solaris and the linux box.<br> <br> </div>I don't think that matters here, we can use GPU facilities to handle<br> endianness for direct CPU access to the framebuffer.<br> <br></blockquote><div><br>sorry, i don't understand what you mean. are you saying i should no longer use shadowfb?<br> <br></div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"> Have you tried profiling the X server in both setups to see where the<br> cycles are being spent in each case? I'd guess that maybe the x86 CPU is<br> faster and/or has higher write throughput to VRAM, e.g. due to write<br> combining.<br> <font color="#888888"><br></font></blockquote></div><br>i'll try that. thx.<br>-jfs<br><br>