
XML Parser

Started by jacob_sullivan October 7, 2004

Does anyone have a recommendation for an XML parser that will work
with Dynamic C? I am looking for an XML library that doesn't need
to be heavily modified in order to compile onto the BL2600.

If someone has experience with XML on the rabbit platform, I'm
interested in performance. I need the board to parse a minimum of
approx 80 to 100 XML documents per second. Each document will be
about 2-3k and will consist of 10-20 elements no more than 3 deep.

If anyone has sample code, that would be much appreciated.

Jake


At 04:17 PM 10/7/2004, you wrote:
>If someone has experience with XML on the rabbit platform, I'm
>interested in performance. I need the board to parse a minimum of
>approx 80 to 100 XML documents per second. Each document will be
>about 2-3k and will consist of 10-20 elements no more than 3 deep.

100 docs/sec at 2k each is 200kB/sec. The Rabbit can barely handle just
sending/receiving the data at that rate, much less parse it.

I would use something that would not need parsing.
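
For reference, the arithmetic behind that estimate can be sketched in plain C (not Dynamic C). The function names are made up for illustration, "k" is taken as 1000 bytes, and the figures cover payload only, with no Ethernet, IP or TCP overhead:

```c
/* Payload rate in bytes/second for a stream of fixed-size documents. */
unsigned long payload_bytes_per_sec(unsigned long docs_per_sec,
                                    unsigned long bytes_per_doc)
{
    return docs_per_sec * bytes_per_doc;
}

/* Same rate in bits/second (payload only, no protocol overhead). */
unsigned long payload_bits_per_sec(unsigned long docs_per_sec,
                                   unsigned long bytes_per_doc)
{
    return payload_bytes_per_sec(docs_per_sec, bytes_per_doc) * 8UL;
}
```

With the numbers from the original question, 80 documents of 2,000 bytes is 160,000 bytes per second, and 100 documents of 3,000 bytes is 300,000 bytes per second (2.4 Mbit/s of payload).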



--- In rabbit-semi@rabb..., Scott Henion <shenion@s...>
wrote:
> At 04:17 PM 10/7/2004, you wrote:
> >If someone has experience with XML on the rabbit platform, I'm
> >interested in performance. I need the board to parse a minimum of
> >approx 80 to 100 XML documents per second. Each document will be
> >about 2-3k and will consist of 10-20 elements no more than 3 deep.
>
> 100 docs/sec at 2k each is 200kB/sec. The Rabbit can barely handle just
> sending/receiving the data at that rate, much less parse it.
>
> I would use something that would not need parsing.

I was planning on using 100M Ethernet. Are you saying that although
the Ethernet interface can handle 100M, the Rabbit can't deal with
that much data (i.e. it would be drinking from a firehose)?

I would prefer to not use XML for this system, but due to the
success of the folks marketing XML, I don't really have a choice.



On Thu, 07 Oct 2004 21:09:06 -0000, jacob_sullivan wrote:
>I was planning on using 100M ethernet. Are you saying that although
>the ethernet interface can handle 100M, the rabbit can't deal with
>that much data? (e.g. drinking from a firehose).

That's exactly what he's saying. In fact, if you have a Rabbit module
handy, just write a quick test program that reads the system timer and
moves a block of data (say 8K or 16K) from one data buffer to another.
Read the timer again when done and compute the data transfer rate.
You'll find you can't go much faster than Scott said -- and that's just
moving data from one buffer to another! In fact, the Rabbit can't even
saturate a 10Mbps ethernet connection.

The Rabbit platform is OK for basic (read: slow and simple) Ethernet
operations, but you wouldn't really want to use it for any serious
network traffic, especially if you have to do any data crunching or
real time control at the same time.

If you really have to do XML parsing on the fly, I'd consider something
more like the Netburner products. They have the CPU horsepower (50-60
MHz Coldfire CPU) and bus bandwidth to get the job done.

Matt Pobursky
Maximum Performance Systems


I'm not the original replier, but I agree with Scott:
there is not enough horsepower in a Rabbit, even at
44 MHz. You may play some games with dual processors,
but you need a processor with much more spunk. It may
not be what you want, but go for more execution power,
because usually you're going to be asked to stuff more
application processing into the system than was
originally defined. If you have used up all the
processor capability up front, you are ... not in a
position to add more. I would look for another
processor.

Just me, HTH.

LK




--- In rabbit-semi@rabb..., "jacob_sullivan" <mail@j...>
wrote:
>
> Does anyone have a recommendation for an XML parser that will work
> with Dynamic C?

Yep Jacob, I agree with everyone else. I wrote an XML prototype on an
admittedly slower system than the Rabbit, but decided ultimately that
XML parsing was inappropriate for embedded systems. The overhead of
including data tags, and parsing, bumped my streams up by a factor of
about 30 under test.

Now I simply stream raw data to Internet servers, and let them do all
the parsing. Obviously there are issues with converting native data
to the PC world, but that's what it's all about, and that should be done on
machines that have the computing power to do it properly.
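
To give a feel for the tag overhead, here is a minimal sketch in plain C that encodes one reading both ways. The record layout (a 16-bit channel and a 16-bit value), the element names and the function names are all hypothetical, chosen only for illustration; they are not taken from the prototype described above:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define RAW_RECORD_SIZE 4  /* two 16-bit fields, hypothetical layout */

/* Pack one reading as 4 bytes, big-endian: channel first, then value. */
size_t pack_raw(uint8_t out[RAW_RECORD_SIZE], uint16_t channel, uint16_t value)
{
    out[0] = (uint8_t)(channel >> 8);
    out[1] = (uint8_t)(channel & 0xFF);
    out[2] = (uint8_t)(value >> 8);
    out[3] = (uint8_t)(value & 0xFF);
    return RAW_RECORD_SIZE;
}

/* Encode the same reading as a small XML fragment.
 * Returns the number of characters the full fragment needs
 * (excluding the terminating NUL), even if `out` was too small. */
size_t pack_xml(char *out, size_t out_size, uint16_t channel, uint16_t value)
{
    int n = snprintf(out, out_size,
                     "<reading><channel>%u</channel><value>%u</value></reading>",
                     (unsigned)channel, (unsigned)value);
    return n < 0 ? 0 : (size_t)n;
}
```

For channel 7 and value 1234, the XML fragment is 58 characters against 4 raw bytes, so about 14 times larger before any parsing cost is counted. Real documents with attributes, whitespace and longer tag names would differ.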

Just my opinion of course, and I'm always interested in alternatives.

Actually we decided later on in the project that XML was not
appropriate, full stop, and abandoned it in favour of standard database
engines, due to the sheer volume of data we are dealing with.

Perhaps you could give us some idea of what you have in mind?
We, the guys on the list, might be able to make some recommendations for you.

Best regards Jimbo CTT systems Irish Republic



>> They have the CPU horsepower (50-60MHz Coldfire CPU) and bus bandwidth to get the job done. <<

That's what I am using to do an XML server, a 66 MHz 5282 Coldfire.

Ron





--- In rabbit-semi@rabb..., Matt Pobursky <rabbituser@m...>
wrote:
> On Thu, 07 Oct 2004 21:09:06 -0000, jacob_sullivan wrote:
> >I was planning on using 100M ethernet. Are you saying that
> > although the ethernet interface can handle 100M, the rabbit
> > can't deal with that much data? (e.g. drinking from a
> > firehose).
>
> That's exactly what he's saying. In fact, if you have a Rabbit
> module handy just write a quick test program that reads the
> system timer, moves a block of data (say 8K or 16K) from one data
> buffer to another. Read the timer again when done and compute the
> data transfer rate. You'll find you can't go much faster than
> Scott said -- and that's just moving data from one buffer to
> another!

While the Ethernet throughput is certainly limited, just moving data
between buffers isn't a problem.

I performed the test above on an RCM3000 running at 29.4 MHz:

unsigned char SrcBuffer[8192], DstBuffer[8192];

int main( void )
{
    auto unsigned long startTime, endTime;

    startTime = MS_TIMER;

    /* Copy the 8192-byte buffer 100 times with the block-copy instruction. */
#asm
    ld   a, 100
loop:
    ld   de, DstBuffer
    ld   hl, SrcBuffer
    ld   bc, 8192
    db   0xED, 0xB0     ; LDIR, to avoid DC's 'ldir_func'
    dec  a
    jr   nz, loop
#endasm

    endTime = MS_TIMER;

    printf( "%ld milliseconds\n", endTime - startTime );

    return 0;
}

Output:
199 milliseconds

Copying an 8192-byte buffer 100 times takes 199 milliseconds. This is
what I expected, based on the timing of the LDIR instruction. On my
little 29.4 MHz machine, I could move 4 MB per second before I ran
out of CPU cycles. Unless wait states start coming into play, the 44
MHz BL2600 should do even better.
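
A rough sketch of that arithmetic, in plain C rather than Dynamic C. The function names are made up for illustration, and the projection assumes the clocks-per-byte cost stays the same at the higher clock (that is, no extra wait states):

```c
#include <assert.h>

/* Average CPU clocks spent per byte copied, from a timed run. */
double clocks_per_byte(double cpu_hz, double elapsed_ms, double bytes_copied)
{
    assert(bytes_copied > 0.0);
    return cpu_hz * (elapsed_ms / 1000.0) / bytes_copied;
}

/* Copy throughput in bytes/second at a given clock rate, assuming
 * the same clocks-per-byte cost as the measured run. */
double projected_bytes_per_sec(double cpu_hz, double clocks_per_byte_cost)
{
    assert(clocks_per_byte_cost > 0.0);
    return cpu_hz / clocks_per_byte_cost;
}
```

With the figures from this test (29.4 MHz, 199 ms, 819,200 bytes) this works out to roughly 7.1 clocks per byte, and about 6.2 million bytes per second if scaled to 44 MHz.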

-Don



--- In rabbit-semi@rabb..., "Don Starr" <don@s...> wrote:
> Copying an 8192-byte buffer 100 times takes 199 milliseconds.

With assembly. So I imagine that it would be almost
impossible to exceed that speed in practice.

What sort of speed do you get if you write it in C?
Either with a pointer or loop index?

I'd do this myself, but I'm at work and don't have my toys here ;)

Kelly



--- In rabbit-semi@rabb..., "Don Starr" <don@s...> wrote:
> > Copying an 8192-byte buffer 100 times takes 199 milliseconds.
>
> With assembly.

Well, nobody placed any limitation on programming language, or
even development tools ;)

> So I imagine that it would be almost impossible to exceed that speed
> in practice.

Without some DMA hardware, yes.

> What sort of speed do you get if you write it in C?
> Either with a pointer or loop index?

The test below was run on a 29.4 MHz RCM3000. Code was compiled
and run under DC 7.33TSE, "optimized" for speed. A better compiler
would likely yield better results for the pointer and array index
times - the hardware isn't the only limiting factor here (that's
why my first test was pure ASM - it tests only the hardware).

unsigned char SrcBuffer[8192], DstBuffer[8192];
unsigned short int i, j;
unsigned char *pSrc, *pDst;

/* Each function below copies the 8192-byte buffer 100 times. */

/* Block copy with the LDIR instruction: tests only the hardware. */
nodebug void withPureASM(void)
{
#asm
    ld   a, 100
loop:
    ld   de, DstBuffer
    ld   hl, SrcBuffer
    ld   bc, 8192
    db   0xED, 0xB0     ; LDIR
    dec  a
    jr   nz, loop
#endasm
}

/* Library memcpy(). */
nodebug void withMemcpy( void )
{
    for ( i=0; i<100; i++ )
    {
        memcpy( DstBuffer, SrcBuffer, sizeof(DstBuffer) );
    }
}

/* Byte-by-byte copy through a pair of pointers. */
nodebug void withPointers( void )
{
    for ( i=0; i<100; i++ )
    {
        pSrc = SrcBuffer;
        pDst = DstBuffer;
        for ( j=0; j<sizeof(DstBuffer); *pDst++ = *pSrc++, j++ );
    }
}

/* Byte-by-byte copy using array indexing. */
nodebug void withIndices( void )
{
    for ( i=0; i<100; i++ )
    {
        for ( j=0; j<sizeof(DstBuffer); DstBuffer[j]=SrcBuffer[j], j++ );
    }
}

int main( void )
{
    auto unsigned long startTime, endTime;

    startTime = MS_TIMER;
    withPureASM();
    endTime = MS_TIMER;
    printf( "Pure assembly: %ld milliseconds\n", endTime - startTime );

    startTime = MS_TIMER;
    withMemcpy();
    endTime = MS_TIMER;
    printf( "memcpy(): %ld milliseconds\n", endTime - startTime );

    startTime = MS_TIMER;
    withPointers();
    endTime = MS_TIMER;
    printf( "pointers: %ld milliseconds\n", endTime - startTime );

    startTime = MS_TIMER;
    withIndices();
    endTime = MS_TIMER;
    printf( "array index: %ld milliseconds\n", endTime - startTime );

    return 0;
}

Output:
Pure assembly: 199 milliseconds
memcpy(): 311 milliseconds
pointers: 3977 milliseconds
array index: 3580 milliseconds
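
Converted to throughput, a minimal sketch in plain C (the macro and function names are made up for illustration; each test moves 8192 bytes 100 times):

```c
#include <assert.h>
#include <stdint.h>

/* Total bytes moved by each test: an 8192-byte buffer copied 100 times. */
#define TEST_BYTES (8192UL * 100UL)

/* Throughput in bytes/second from a byte count and elapsed milliseconds.
 * The result is truncated toward zero. */
uint64_t bytes_per_sec(uint32_t total_bytes, uint32_t elapsed_ms)
{
    assert(elapsed_ms > 0);
    /* 64-bit arithmetic so total_bytes * 1000 cannot overflow. */
    return ((uint64_t)total_bytes * 1000u) / elapsed_ms;
}
```

Applied to the four timings above, this gives roughly 4.12 million bytes per second for the assembly loop, 2.63 million for memcpy(), 206 thousand for the pointer loop and 229 thousand for the array-index loop. The last two are close to the 200kB/sec figure discussed earlier in the thread, before any parsing is done.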

I'm not sure why one would use the 'pointers' or 'array indices'
versions, unless one was moving data from a circular queue to
another buffer (and even then, it could be handled with, at most,
two memcpy() or ASM LDIR operations).
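
The circular-queue case can be sketched as follows, in plain C rather than Dynamic C. The function name and parameter layout are assumptions made for illustration; the point is only that a read which wraps past the end of the storage splits into two block copies:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy `count` bytes out of a circular queue whose storage is `ring`
 * (`capacity` bytes), starting at index `tail`, into `dst`.
 * Uses at most two block copies. Returns the new tail index. */
size_t ring_copy_out(unsigned char *dst, const unsigned char *ring,
                     size_t capacity, size_t tail, size_t count)
{
    size_t first;

    assert(capacity > 0 && tail < capacity && count <= capacity);

    /* Bytes available before the end of the storage array. */
    first = capacity - tail;
    if (first > count)
        first = count;

    memcpy(dst, ring + tail, first);            /* up to the wrap point */
    memcpy(dst + first, ring, count - first);   /* rest, from the start;
                                                   zero bytes if no wrap */

    return (tail + count) % capacity;
}
```

For an 8-byte queue with the tail at index 6, reading 5 bytes copies indices 6 and 7 first, then indices 0 to 2, and leaves the tail at index 3.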

-Don