rabbit-semi | XML Parser | page 2

Reply by Kelly ● October 8, 2004

--- In rabbit-semi@rabb..., "Don Starr" <don@s...> wrote:
> The test below was run on a 29.4 MHz RCM3000. Code was compiled
> and run under DC 7.33TSE, "optimized" for speed.
<snip>
> Output:
> Pure assembly: 199 milliseconds
> memcpy(): 311 milliseconds
> pointers: 3977 milliseconds
> array index: 3580 milliseconds

Pure assembly: 4020 KB/sec
memcpy(): 2572.3 KB/sec
pointers: 201.16 KB/sec
indices: 223.46 KB/sec

I would never have expected a 20x speedup for assembly over pointer-based C code. Truly amazing.

> I'm not sure why one would use the 'pointers' or 'array indices'
> versions, unless one was moving data from a circular queue to
> another buffer (and even then, it could be handled with, at most,
> two memcpy() or ASM LDIR operations).

It might happen if you were trying to use a standard C library. Imagine how an off-the-shelf XML library might perform.

Kelly
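The circular-queue case Don mentions can be sketched roughly as follows. This is only an illustration, not code from the thread: the function name ring_read and its parameters are hypothetical, and it assumes head < ring_size and len <= ring_size.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the thread): copy `len` bytes out of a
 * circular buffer `ring` of `ring_size` bytes, starting at index `head`,
 * into the linear buffer `dst`.
 * Assumes head < ring_size and len <= ring_size. */
void ring_read(unsigned char *dst, const unsigned char *ring,
               size_t ring_size, size_t head, size_t len)
{
    /* Number of bytes that can be read before hitting the end of the ring. */
    size_t first = ring_size - head;
    if (first > len)
        first = len;

    /* First block: from `head` up to the end of the ring (or `len` bytes). */
    memcpy(dst, ring + head, first);

    /* Second block: whatever is left, starting again at index 0.
     * When nothing wrapped, this copies zero bytes. */
    memcpy(dst + first, ring, len - first);
}
```

However the data wraps, the copy never needs more than these two block moves, so the byte-at-a-time loops are not required for this case.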

Reply by Scott Henion ● October 8, 2004

At 02:53 PM 10/8/2004, you wrote:
>Output:
> Pure assembly: 199 milliseconds
> memcpy(): 311 milliseconds
> pointers: 3977 milliseconds
> array index: 3580 milliseconds

Softools Output:
Pure assembly: 196 milliseconds
memcpy(): 197 milliseconds
pointers: 3675 milliseconds
array index: 3030 milliseconds

Using far pointers (xmem arrays)
fmemcpy(): 197 milliseconds
far pointers: 3676 milliseconds

;)
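The test source was snipped from this part of the thread, so the following is only a guess at what the "memcpy()", "pointers" and "array index" labels refer to. The function names are hypothetical, and the actual benchmark may have differed (element type, loop form, buffer sizes).

```c
#include <stddef.h>
#include <string.h>

/* Assumed shape of the "memcpy()" variant: one library call. */
void copy_memcpy(unsigned char *dst, const unsigned char *src, size_t n)
{
    memcpy(dst, src, n);
}

/* Assumed shape of the "pointers" variant: walk both pointers. */
void copy_pointers(unsigned char *dst, const unsigned char *src, size_t n)
{
    while (n--)
        *dst++ = *src++;
}

/* Assumed shape of the "array index" variant: index both buffers. */
void copy_index(unsigned char *dst, const unsigned char *src, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        dst[i] = src[i];
}
```

All three produce the same result; the timings above differ only because of the code each compiler and library generates for them.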

Reply by Don Starr ● October 8, 2004

> I'm pretty amazed that pointers were not the fastest... :(
> I believe there is a "pointer check" compile option. I'm not at
> my rabbit bench right now. Does turning that off make much of
> a timing difference?

Both pointer and array index checking were "on" for the previous test. Here are the results when they're "off":

Pure assembly: 199 milliseconds
memcpy(): 310 milliseconds
pointers: 3978 milliseconds
array index: 3580 milliseconds

Here are the previous results:

> > Output:
> > Pure assembly: 199 milliseconds
> > memcpy(): 311 milliseconds
> > pointers: 3977 milliseconds
> > array index: 3580 milliseconds

Doesn't look like much of a difference. I haven't looked at the generated ASM code to see what's going on.

> I write pointer incrs when I want speed ... obviously I need to
> rethink that assumption.

Probably depends on the compiler.

memcpy() will always be the fastest "straight C" implementation, as long as the library function is written in assembly (using something like LDIR).

DC's memcpy() implementation is a bit odd - they check for overlapping memory regions and then use LDIR or LDDR, as appropriate. That's supposed to be a memmove() feature (though the fact that memcpy()'s behavior in such a circumstance is "undefined" makes DC's implementation perfectly legal).

Pointers vs. arrays will depend on the compiler. A compiler could optimize pointers such that they were faster than using arrays, or could do the converse.

-Don
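The overlap check described above can be sketched in portable C. This is not DC's library source, only an illustration of the memmove()-style idea; the function name copy_overlap_safe is hypothetical. The forward loop plays the role of LDIR and the backward loop the role of LDDR.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical illustration (not DC's library code): copy n bytes from
 * src to dst, choosing the copy direction so that overlapping regions
 * are handled correctly, as memmove() is required to do. */
void *copy_overlap_safe(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Addresses are compared as integers to avoid relational comparison
     * of pointers into unrelated objects. */
    if ((uintptr_t)d <= (uintptr_t)s || (uintptr_t)d >= (uintptr_t)s + n) {
        /* dst does not start inside the source region:
         * copy forward (the LDIR-like direction). */
        while (n--)
            *d++ = *s++;
    } else {
        /* dst starts inside the source region: copy backward from the
         * end (the LDDR-like direction) so no byte is overwritten
         * before it has been read. */
        d += n;
        s += n;
        while (n--)
            *--d = *--s;
    }
    return dst;
}
```

A plain memcpy() is allowed to skip this check entirely, which is why an implementation that performs it is unusual but still conforming.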

Reply by Scott Henion ● October 8, 2004

At 05:17 PM 10/8/2004, you wrote:
>At 02:53 PM 10/8/2004, you wrote:
> >Output:
> > Pure assembly: 199 milliseconds
> > memcpy(): 311 milliseconds
> > pointers: 3977 milliseconds
> > array index: 3580 milliseconds
>
>Softools Output:
>Pure assembly: 196 milliseconds
>memcpy(): 197 milliseconds
>pointers: 3675 milliseconds
>array index: 3030 milliseconds
>
>Using far pointers (xmem arrays)
>fmemcpy(): 197 milliseconds
>far pointers: 3676 milliseconds
>
>;)

Revised results (far results were using a near pointer.)

RCM3010 Softools Output:
Pure assembly: 196 milliseconds
memcpy(): 197 milliseconds
pointers: 3676 milliseconds
array index: 3030 milliseconds
fmemcpy(): 1012 milliseconds
far pointers: 7967 milliseconds
far array index: 8499 milliseconds

So it looks like using far pointers in ST C gets a 2.5x speed penalty. Not bad considering you need to do 64k wrap checks on each pointer and calc seg+offset.

But it sure beats using xmem2root() and manually doing buffer swaps.

<Scott

Reply by Kelly ● October 9, 2004

--- In rabbit-semi@rabb..., Scott Henion <shenion@s...> wrote:
> So it looks like using far pointers in ST C gets a 2.5x
> speed penalty. Not bad considering you need to do 64k wrap
> checks on each pointer and calc seg+offset. But it sure
> beats using xmem2root() and manually doing buffer
> swaps.

Impressive.

Dare I suggest someone give Duff's Device a try?

http://www.lysator.liu.se/c/duffs-device.html

In K&R C, targeting a Vax in 1983:

    send(to, from, count)
    register short *to, *from;
    register count;
    {
        register n=(count+7)/8;
        switch(count%8){
        case 0: do{ *to = *from++;
        case 7:     *to = *from++;
        case 6:     *to = *from++;
        case 5:     *to = *from++;
        case 4:     *to = *from++;
        case 3:     *to = *from++;
        case 2:     *to = *from++;
        case 1:     *to = *from++;
                }while(--n>0);
        }
    }

Kelly
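Note that the code quoted above never increments `to`, because the original wrote to a memory-mapped output register. For anyone wanting to try it against the buffer-to-buffer test in this thread, a memory-to-memory variant might look like the sketch below. The name duff_copy and the byte element type are assumptions, not anything from the thread, and no timings are claimed for it.

```c
#include <stddef.h>

/* Hypothetical memory-to-memory adaptation of Duff's Device:
 * copies `count` bytes from `from` to `to`, eight per loop pass.
 * Unlike the quoted original, both pointers are incremented. */
void duff_copy(unsigned char *to, const unsigned char *from, size_t count)
{
    size_t n;

    /* The unrolled loop assumes count > 0; with count == 0 it would
     * otherwise run a full pass of eight copies. */
    if (count == 0)
        return;

    n = (count + 7) / 8;          /* number of passes through the loop */
    switch (count % 8) {          /* jump into the middle for the remainder */
    case 0: do { *to++ = *from++;
    case 7:      *to++ = *from++;
    case 6:      *to++ = *from++;
    case 5:      *to++ = *from++;
    case 4:      *to++ = *from++;
    case 3:      *to++ = *from++;
    case 2:      *to++ = *from++;
    case 1:      *to++ = *from++;
            } while (--n > 0);
    }
}
```

The unrolling cuts the loop-test overhead to one decrement and branch per eight bytes, which is the part of the "pointers" loop that a compiler without unrolling would pay on every byte.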
