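Hi!<br><br>The PXA270 has only 32 KB of data cache.<br>Each of your arrays holds about 300K elements (roughly 600 KB if WORD is 16 bits), and the inner loop reads src with a stride of DY elements, so almost every read is a cache miss.<br>You should optimize your algorithm so that it works on small blocks that fit in the cache.<br><br>Best regards!<br><br>-- Dima<br><br><div class="gmail_quote">On Thu, Feb 4, 2010 at 12:08 PM, Erazem Polutnik <span dir="ltr">&lt;<a href="mailto:erazem@lxnav.com">erazem@lxnav.com</a>&gt;</span> wrote:<br>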
<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">Hello,<br>
I created a small test program that is intended to rotate a memory map by 90<br>
degrees.<br>
<br>
void Test()<br>
{<br>
#define DX 480<br>
#define DY 640<br>
WORD *dst,*src;<br>
src = new WORD[DX*DY];<br>
dst = new WORD[DX*DY];<br>
<br>
int pxm=DX;<br>
int pym=DY;<br>
for(int i=0;i&lt;100;i++) {<br>
for(int py=0;py&lt;pym;py++) {<br>
WORD *p1=dst+(py*DX);<br>
WORD const *p2=src+(pym-py-1);<br>
for(int px=0; px&lt;pxm; px++) {<br>
if(px&lt;DY &amp;&amp; py&lt;DX) {<br>
*p1=*p2;<br>
}<br>
p1++;<br>
p2+=DY;<br>
}<br>
}<br>
}<br>
delete [] dst;<br>
delete [] src;<br>
}<br>
<br>
The problem is that this runs very slowly on a Toradex Colibri PXA270 under<br>
Linux 2.6.27.<br>
It takes 18 seconds to finish.<br>
If I just replace the line *p1=*p2 with *p1=0xff, it takes 490 ms.<br>
So my assumption is that reading "random" memory is very slow.<br>
Is there a way to speed up such "random" reads?<br>
<br>
Many thanks<br>
Erazem<br>
<br>
</blockquote></div><br>
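As a rough illustration of what "optimize your algorithm" could mean here, the rotation can be done tile by tile so that the strided reads stay inside a small working set.<br>
This is only a sketch under assumptions that are not stated in the thread: WORD is taken to be a 16-bit pixel, a cache line is taken to be 32 bytes (so a 16x16 tile touches about 1 KB), and the names rotate_naive and rotate_tiled are made up for this example. The index mapping is the same as in the quoted loop, without the if guard. The best tile size would have to be measured on the actual board.<br>

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>

typedef std::uint16_t WORD;  // assumption: WORD is a 16-bit pixel

// src is srcH rows of srcW pixels; dst is srcW rows of srcH pixels.
// Mapping: dst[r][c] = src[c][srcW - 1 - r], as in the quoted loop.
// Reference version: walks src with a stride of srcW elements.
void rotate_naive(const WORD* src, WORD* dst,
                  std::size_t srcW, std::size_t srcH)
{
    for (std::size_t r = 0; r < srcW; ++r)
        for (std::size_t c = 0; c < srcH; ++c)
            dst[r * srcH + c] = src[c * srcW + (srcW - 1 - r)];
}

// Tiled version: the same mapping, but processed in tile x tile blocks,
// so each block reads and writes only a handful of cache lines.
void rotate_tiled(const WORD* src, WORD* dst,
                  std::size_t srcW, std::size_t srcH,
                  std::size_t tile = 16)
{
    assert(tile > 0);
    for (std::size_t r0 = 0; r0 < srcW; r0 += tile) {
        const std::size_t r1 = std::min(r0 + tile, srcW);
        for (std::size_t c0 = 0; c0 < srcH; c0 += tile) {
            const std::size_t c1 = std::min(c0 + tile, srcH);
            for (std::size_t r = r0; r < r1; ++r)
                for (std::size_t c = c0; c < c1; ++c)
                    dst[r * srcH + c] = src[c * srcW + (srcW - 1 - r)];
        }
    }
}
```

For the buffers in the quoted program this would be called as rotate_tiled(src, dst, DY, DX), since src is read there with a row length of DY and dst is written with a row length of DX.<br>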