Discussion forum for the BasicX family of microcontroller chips.
|
-----Original Message-----
From: Mike Fellinger
Sent: Monday, December 13, 1999 9:34 AM
Subject: Questions for NetMedia

The following questions aren't related, so maybe I should have posted multiple emails.

1. If more serial EEPROM is added externally using the EEPROM chip select, can I access it using the built-in functions?
2. Could you publish a version of the serial port module that uses explicit queues so formatted output can be directed to any device? I'm never sure I catch all the changes when you publish a new version.
3. Would code execute from the internal parallel EEPROM faster than from the serial EEPROM? If so, is there a way to exploit it?

mwf
|
I'm thinking that one or two hot subroutines could be placed in the BX24's onboard 512 bytes to speed them up. Persistent variables can go in the serial EEPROM without a noticeable loss of performance, but I could use more speed in some bit-bashing I/O routines. It would be a way of explicitly caching code that needs to run faster.

mwf
|
-----Original Message-----
From: Mike Fellinger
Date: Monday, December 13, 1999 9:51 AM
Subject: [BasicX] Questions for NetMedia

>1. If more serial eeprom is added externally using the eeprom chip
>select, can I access it using the built-in functions?

The EEPROM chip select is only meant for a single EEPROM. We specifically fetch for a 25256. We have also tried 25032s, 25064s, and 25128s, if you need a cheaper option. If you need additional EEPROM, you will need to use your own chip select and use the SPI commands to access it. Or you could get an I2C EEPROM and access it yourself, though that would need 2 pins.

>2. Could you publish a version of the serial port module that uses
>explicit queues so formatted output can be directed to any device? I'm
>never sure I catch all the changes when you publish a new version.

We will look into it.

>3. Would code execute from the internal parallel EEPROM faster than
>from the serial EEPROM? If so is there a way to exploit it?

Yes. Our byte fetch occurs at roughly 200,000 bytes per second. A 3-byte instruction needs 3 byte fetches, so fetching alone allows about 66,666 instructions per second. Since we execute 65,000 instructions per second, you can see that most of the time is spent fetching the bytes.

You don't get something for nothing. Parallel EEPROM takes a ton of pins and is typically way more expensive.

Jack
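The fetch-bound argument in the reply above can be checked with a few lines of arithmetic. This is a minimal Python sketch that uses only the rough figures quoted in the reply (200,000 byte fetches per second, 3 bytes per instruction, 65,000 instructions per second); the function names are made up for illustration and nothing here is measured on hardware.

```python
# Rough figures quoted in the reply; assumptions, not measurements.
BYTE_FETCHES_PER_SEC = 200_000   # "roughly 200,000 per second"
BYTES_PER_INSTRUCTION = 3        # the 3-byte instruction in the example
EXECUTED_PER_SEC = 65_000        # quoted execution rate


def fetch_ceiling(byte_rate: float, bytes_per_instr: int) -> float:
    """Upper bound on instructions/s if fetching were the only cost."""
    return byte_rate / bytes_per_instr


def fetch_fraction(byte_rate: float, bytes_per_instr: int,
                   executed_per_sec: float) -> float:
    """Share of each instruction's time that goes to fetching its bytes."""
    fetch_time = bytes_per_instr / byte_rate   # seconds spent fetching
    total_time = 1.0 / executed_per_sec        # seconds per instruction
    return fetch_time / total_time


ceiling = fetch_ceiling(BYTE_FETCHES_PER_SEC, BYTES_PER_INSTRUCTION)
fraction = fetch_fraction(BYTE_FETCHES_PER_SEC, BYTES_PER_INSTRUCTION,
                          EXECUTED_PER_SEC)

print(f"fetch-bound ceiling: {ceiling:,.0f} instructions/s")
print(f"time spent fetching: {fraction:.1%}")
```

With these rough inputs the fetch accounts for nearly all of each instruction's time, which is the point the reply makes.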
|
Unfortunately the fetcher is not smart enough to do that.

Jack

-----Original Message-----
From: Mike Fellinger
Date: Monday, December 13, 1999 10:18 AM
Subject: RE: [BasicX] Questions for NetMedia

>I'm thinking that one or two hot subroutines could be placed in the BX24
>onboard 512 bytes to speed them up.
|
Well, think about it as a future enhancement... If you know where you want speed, this would be faster than RAM caching, since there is no question of thrashing.

mwf
|
Just making the fetcher decide whether to fetch from internal EE, external EE, or RAM, as in the BX1, slows it down. This is why the BX24 is slightly faster than the BX1: we only fetch from one place, with no decisions.

Jack
|
The interpreter should only need overhead on calls and returns, and even that should be small. I can go into detail if you want.

mwf
|
The overhead is not just on the calls and returns. The fetcher fetches from an address space. The simplest approach would be to have the fetcher treat addresses above, say, 32767 as internal EEPROM and those below as external EEPROM. Just the fact that the fetcher has to decide, from the program counter and on an instruction-by-instruction basis, whether to fetch from internal or external EEPROM would slow it down. Once it decides, it could then get the other 2 bytes quickly. We also have to check whether we are task switching, or whether any events have come in, like WaitForInterrupt or a timer tick, to see what to do.

The address space is broken up for RAM cache. The speed is greater - maybe 60% - but not that much. Don't forget we are already 16 times faster than a Stamp II and almost 7 times faster than a Stamp II/SX. This is with a processor that has only twice the MIPS of the PIC. We are pretty darn efficient in our fetching and execution engine.

Jack
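The address-split idea described above can be modeled in a few lines. This is a toy Python sketch, not the BasicX fetcher: the threshold comes from the "above say 32767" remark, while the function name, the byte strings standing in for the two EEPROMs, and the fixed 3-byte length are assumptions for illustration.

```python
# Assumption from the message: addresses above 32767 select internal EEPROM.
INTERNAL_BASE = 32768


def fetch_instruction(pc: int, internal: bytes, external: bytes,
                      length: int = 3) -> bytes:
    """Fetch one instruction, choosing the memory from the program counter.

    The branch below is the per-instruction decision the message describes.
    It is taken once, on the first byte; the remaining bytes then come from
    the same memory without another check.
    """
    if pc >= INTERNAL_BASE:
        memory, base = internal, INTERNAL_BASE   # internal EEPROM
    else:
        memory, base = external, 0               # external serial EEPROM
    start = pc - base
    return bytes(memory[start:start + length])


# Hypothetical memory contents, just to exercise both paths.
external = bytes(range(16))
internal = bytes([0xA0, 0xA1, 0xA2, 0xA3])

low = fetch_instruction(0, internal, external)
high = fetch_instruction(INTERNAL_BASE, internal, external)
print(low.hex(), high.hex())
```

In this model the cost under discussion is the single comparison per instruction; on the real part that comparison competes with the handful of cycles left over after the SPI fetch.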
|
Hi Mike,

I experimented with internal vs. external EEPROM on the BX01 a few months ago and found that it only made about 20% difference. It probably isn't worth the trouble.

Without detracting from the BX-24 itself, I have to say that Netmedia isn't being very honest about execution speed IMHO. Jack's example of fetching at 66,666 bytes/sec and executing at 65k instructions (lines/sec in the ads) is bogus as presented. That time difference is only .384 uS or 2.8 clocks at 7.3728 MHz - a CALL (ICALL or RCALL) and RET take at least 7 in the 8535 alone.

I brought this subject up with Frank on comp.robotics.misc at that time and didn't get much satisfaction. In any case, it doesn't affect the utility of the BX-24, just the reputation of Netmedia.
|
Rich:

The PIC processor divides the clock rate, so an instruction takes several clocks to execute. The AVR can execute one instruction per clock. A PIC executes an instruction in 200 ns at 20 MHz, which means it takes 4 clocks per instruction. In comparison, the AVR executes an instruction in 136 ns at our 7.3728 MHz clock. Almost twice as fast, with a crystal about a third as fast.

The AVR has a 16-bit instruction word where most PICs have 14 bits. This gives the AVR a much richer instruction set with more addressing modes. The PIC has only 35 instructions where the AVR has maybe 80-100. This means you might have to execute 2 or 3 PIC instructions for what can be done with a single AVR instruction.

You add this all up and the AVR runs circles around the PIC. Everything that I said can also be applied to the SX part. Running at 50 MHz only means that it might execute one instruction as fast as the AVR device.

Then when you compare the BasicX operating system to the Basic Stamp, there is a world of difference in the methodology and the processing power of our Bcode(tm) instruction set. It was built from the ground up to be a multitasking Basic-eating engine. It was also built to be ported to other processors. Imagine a Coldfire or other processor that can deliver hundreds of MIPS in performance. BasicX would probably execute 2 to 5 million instructions per second on that type of processor. Someday...

Jack

-----Original Message-----
From: Johnson, Richard A
To: 'BasicX List Server'
Date: Monday, December 13, 1999 4:32 PM
Subject: FW: [BasicX] Questions for NetMedia

>Jack:
>So what is the secret of the AVR's interpretive speed? The SX clock speed is
>greater than 6X the AVR micro used in the BasicX (i.e. SX=50MHz,
>BX=7.3728MHz). There must be a fundamental software/hardware architectural
>difference that is pivotal to the BasicX phenomenal performance. So what
>gives? Please oh BasicX guuuuuuuru.
>
>Cheers...
>Rich
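The per-instruction times quoted above follow directly from clock rate and clocks per instruction. A minimal Python sketch of that arithmetic, assuming the figures in the message (4 clocks per instruction at 20 MHz for the PIC, 1 clock at 7.3728 MHz for the AVR); the helper name is made up for this example, and real programs mix instructions with different cycle counts.

```python
def instruction_time_ns(clock_hz: float, clocks_per_instruction: int) -> float:
    """Time for one instruction, in nanoseconds."""
    return clocks_per_instruction / clock_hz * 1e9


# Figures taken from the message above.
pic_ns = instruction_time_ns(20e6, 4)        # PIC: 4 clocks per instruction
avr_ns = instruction_time_ns(7.3728e6, 1)    # AVR: 1 clock per instruction

speed_ratio = pic_ns / avr_ns                # AVR speed relative to the PIC
clock_ratio = 7.3728e6 / 20e6                # AVR crystal relative to the PIC's

print(f"PIC: {pic_ns:.0f} ns per instruction")
print(f"AVR: {avr_ns:.0f} ns per instruction")
print(f"AVR is {speed_ratio:.2f}x faster per instruction")
print(f"AVR crystal runs at {clock_ratio:.0%} of the PIC's")
```

The ratio printed here covers only single-cycle instruction time; it says nothing about how many instructions each architecture needs for the same job, which is the other half of the argument in the message.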
|
-----Original Message-----
From: rob o'roo
Date: Monday, December 13, 1999 4:40 PM
Subject: RE: [BasicX] Questions for NetMedia

>Without detracting from the BX-24 itself, I have to
>say that Netmedia isn't being very honest about
>execution speed IMHO. Jack's example of fetching at
>66,666 bytes/sec and executing at 65k instructions
>(lines/sec in the ads) is bogus as presented. That
>time difference is only .384 uS or 2.8 clocks at
>7.3728 MHz - a CALL (ICALL OR RCALL) and RET take at
>least 7 in the 8535 alone.

So let's be precise then, if you want to call it bogus. The execution test that we ran was for a 3-byte instruction, which is the basic line of code:

    I = I + 1

where I is a global integer. The exact numbers: 7,372,800 / 4 is the SPI clock rate, and / 8 for 8 bits makes 230,400 bytes per second. Dividing by 3 makes 76,800 instruction fetches per second (not 66,666), which is 13.021 microseconds per instruction. At 65,000 instructions per second, each 3-byte instruction takes 15.385 microseconds. That leaves 2.364 microseconds for checking events, fetching from the right location, and executing the code. That is roughly 18 AVR single-cycle instructions. And as you pointed out, 7 are used by a single call and return. So with a call and return, a few 2-cycle tests, and some one-cycle things, I have done only 10-12 instructions.

So your assignment is to make a Basic language interpreter engine handle multitasking and execute a line of code in 12 assembly language instructions or less. I did.

Execute the following lines and watch how fast they run on your scope:

    Dim I as integer
    ...
    Call putpin(25,1)
    I = I + 1
    I = I + 1
    I = I + 1
    I = I + 1
    I = I + 1
    I = I + 1
    I = I + 1
    I = I + 1
    I = I + 1
    I = I + 1
    Call putpin(25,0)
    ...

The optimizer needs to be turned on, so that it can see that I = I + 1 can be executed in a single instruction.

Subtract the putpin execution time and you will get around 65,000 lines of Basic executed per second. We don't cheat and use a bogus two-byte instruction that doesn't do much. This goes even faster, try it!

    Dim X as boolean
    ...
    Call putpin(25,1)
    X = true
    X = true
    X = true
    X = true
    X = true
    X = true
    X = true
    X = true
    X = true
    X = true
    Call putpin(25,0)
    ...

I didn't measure it yet, but I would guess it goes at 80,000 or more per second. We do not advertise this since it doesn't do much. I don't know what the instruction is that the Basic Stamp claims goes at 4000 for the BS2 or 10,000 for the BS2/SX.

>I brought this subject up with Frank on
>comp.robotics.misc at that time and didn't get much
>satisfaction. In any case, it doesn't affect the
>utility of the BX-24, just the reputation of Netmedia.

So how's the reputation?

Jack
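The cycle budget worked out in the reply above can be reproduced step by step. This is a small Python sketch using only the numbers given in the message (7.3728 MHz clock, SPI clock at one quarter of that, 8 bits per byte, 3 bytes per instruction, 65,000 instructions per second); the variable names are chosen for this example.

```python
CLOCK_HZ = 7_372_800            # AVR clock from the message
INSTRUCTIONS_PER_SEC = 65_000   # advertised execution rate

spi_clock_hz = CLOCK_HZ / 4                 # SPI clock rate
bytes_per_sec = spi_clock_hz / 8            # 8 bits per byte
fetches_per_sec = bytes_per_sec / 3         # 3-byte instructions

fetch_us = 1e6 / fetches_per_sec            # time to fetch one instruction
total_us = 1e6 / INSTRUCTIONS_PER_SEC       # time per executed instruction
overhead_us = total_us - fetch_us           # left for everything else
overhead_cycles = overhead_us * CLOCK_HZ / 1e6

print(f"bytes fetched per second:   {bytes_per_sec:,.0f}")
print(f"3-byte fetches per second:  {fetches_per_sec:,.0f}")
print(f"fetch time per instruction: {fetch_us:.3f} us")
print(f"total time per instruction: {total_us:.3f} us")
print(f"remaining budget:           {overhead_us:.3f} us")
print(f"remaining AVR clock cycles: {overhead_cycles:.1f}")
```

The remaining budget comes out between 17 and 18 clock cycles, in line with the "roughly 18" figure in the message.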
|
Netmedia's reputation is safe with me. I like my BX-01 and BX-24 and I'm looking forward to the BX-35! These chips allow me to think about robot complexity that would have cost a small fortune using Stamps or most other 'hobbyist' controllers. I am working on a design that requires distributed computing.
|
I'm not trying to start a war over this. I like the BX24 very much. It's just that more speed, more memory, more... is always more interesting.

One concept that has been used successfully is to have two or more fetchers, each optimized for one memory space. You then explicitly switch fetchers using a "jump to space x, address y" code. Interrupts are constrained to begin execution in one space for simplicity. This trades a lot of checking for a little code space and carries the address space implicitly in the hardware instruction pointer.

Another concept that can speed interpretation is to move task switching into a few of the byte codes and not handle it in the fetcher at all. In the original Basic, which included "let", every statement began with a keyword. Only these keywords need to check for task-switching events. This delays events slightly but significantly increases the overall execution rate. The execution speed increase is generally enough to finish handling the event earlier even if it started later. BasicX can't loop within a statement, so you could apply this concept to the language.

mwf
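The second idea in the message above, polling for task-switch events only at opcodes that begin a statement, can be illustrated with a toy model. This Python sketch only counts how often events would be polled under each policy; the opcode names and the four-opcode statement shape are invented for the example and are not BasicX byte codes.

```python
# Hypothetical opcodes, invented for this sketch; not BasicX byte codes.
STMT, PUSH, ADD, STORE = "STMT", "PUSH", "ADD", "STORE"


def count_event_checks(program: list, check_on: str) -> int:
    """Walk through `program`, counting how often pending events are polled.

    check_on="fetch"     polls before every opcode (checking in the fetcher)
    check_on="statement" polls only at opcodes that begin a statement
    """
    if check_on not in ("fetch", "statement"):
        raise ValueError("check_on must be 'fetch' or 'statement'")
    checks = 0
    for opcode in program:
        if check_on == "fetch" or opcode == STMT:
            checks += 1   # stand-in for testing task-switch or timer flags
        # Executing the opcode itself is omitted in this model.
    return checks


# Ten statements, each assumed to compile to four opcodes.
program = [STMT, PUSH, ADD, STORE] * 10

per_fetch = count_event_checks(program, "fetch")
per_statement = count_event_checks(program, "statement")
print(per_fetch, per_statement)
```

The saving in this model scales with the number of opcodes per statement. When the optimizer reduces a statement to a single instruction, as in the I = I + 1 benchmark discussed elsewhere in the thread, the two policies poll equally often.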
|
Jack,

I'm not knocking the BX-24, but that's the lamest excuse for a benchmark I've ever seen. No wonder Frank wouldn't post it on the newsgroups.

optimizer on, using global int variables:

    increment or decrement an int                15 uS  <- OK you're right, 65,000+ ops/sec!!
    int + or - an 8 bit literal (I = I + 0)      47 uS
    int + or - a 16 bit literal (I = I + 256)    52 uS
    int + int (I = I + J)                        53 uS
    int assignment (I = J or I = I)              30 uS

(all of the above were actually dead code and could have been removed completely, and in your example the ten INCs could have been folded into one ADD LIT8 for 3x the speed if not pruned as dead)

    Putpin (25,0) or (25,1)                      44 uS
    Unconditional branch (DO..LOOP)              40 uS
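To compare the timings listed above with the "instructions per second" figures used elsewhere in the thread, each time can be inverted into a rate. A minimal Python sketch; the microsecond values are copied from the message, and the dictionary keys are shortened labels chosen for this example.

```python
# Per-operation times in microseconds, as reported in the message above.
timings_us = {
    "increment or decrement an int": 15,
    "int +/- 8-bit literal": 47,
    "int +/- 16-bit literal": 52,
    "int + int": 53,
    "int assignment": 30,
    "PutPin": 44,
    "unconditional branch (DO..LOOP)": 40,
}

# Operations per second is the reciprocal of the time per operation.
rates = {name: 1_000_000 / us for name, us in timings_us.items()}

for name, rate in rates.items():
    print(f"{name:34s} {timings_us[name]:3d} us  {rate:8,.0f} ops/s")
```

Only the increment case lands near the 65,000 per second figure; the other operations listed work out to roughly 19,000 to 33,000 per second.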
|
> From: rob o'roo
> [...]
> I'm not knocking the BX-24, but that's the lamest
> excuse for a benchmark I've ever seen. No wonder
> Frank wouldn't post it on the newsgroups.
>
> optimizer on, using global int variables:
> increment or decrement an int, 15 uS <- OK you're
> right 65,000+ ops/sec!!
>
> [optimization examples deleted]
>
> (all of the above were actually dead code and could
> have been removed completely, [...]

Which is precisely why I said that expressing speed in terms of "lines of code per second" is not particularly useful. The presence of an optimizer makes those numbers all but meaningless.

-- Frank Manning
-- NetMedia, Inc
|
Our benchmark was written to show the speed difference between the BasicX and the Stamp. This example program was used because it was the only compatible code with which we could get the Stamp2 to run at its claimed speed of 4000 instructions per second. It's not like we can compare the Stamp's 32-bit math or floating-point speed with ours. As Jack mentioned, we could have used Boolean statements and claimed a much higher figure of 80,000+, but we didn't.

Chris |
|
-----Original Message-----
From: rob o'roo <>
To: <>
Date: Tuesday, December 14, 1999 12:40 PM
Subject: Re: [BasicX] Questions for NetMedia

>From: rob o'roo <>
>
>Jack,
>
>I'm not knocking the BX-24, but that's the lamest
>excuse for a benchmark I've ever seen. No wonder
>Frank wouldn't post it on the newsgroups.
>
>optimizer on, using global int variables:
>increment or decrement an int, 15 uS <- OK you're
>right 65,000+ ops/sec!!
>int + or - an 8 bit literal (I = I + 0), 47 uS

Four instructions are generated for this action: a push of the literal, a push of I, the add, and the pop into I.

>int + or - a 16 bit literal (I = I + 256), 52 uS

The same four instructions; they are slightly longer because the literal is 16 bits, not 8.

>int + int (I = I + J), 53 uS

Ditto.

>int assignment (I = J or I = I), 30 uS

Two instructions: a push and a pop.

>(all of the above were actually dead code and could
>have been removed completely, and in your example the
>ten INCs could have been folded into one ADD LIT8 for
>3x the speed if not pruned as dead)

The optimizer does not look for dead code. If you would like to write one, please do so; you can take in Basic and emit Basic. You can sell it to VB people too!

>Putpin (25,0) or (25,1), 44 uS

Pushes the arguments and executes an internal system call.

>Unconditional branch (DO..LOOP), 40 uS

A long-jump bcode instruction.

The benchmark shows how many instructions per second we can do. It just so happens that a line of Basic code can be compiled into a single instruction, just like the one used to rate the Stamp. Even so, we blow them away.

I = I + 1 is one of the most useful instructions there is!!! It is the heart of every for loop. Or when people count things. Or when walking down an array, etc, etc, etc... We wanted that as fast as possible. Or would you rather it be 47 uS?

Way more than enough said on this subject.

Jack |
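The instruction counts in the reply above can be illustrated with a toy stack machine in Python. This is a sketch under assumptions: the opcode names (`PUSH_LIT`, `PUSH_VAR`, `ADD`, `POP_VAR`, `INC_VAR`) are invented for illustration and are not the actual bcode, and 16-bit overflow is ignored.

```python
def run(code, variables):
    """Execute hypothetical stack-machine code against a dict of
    variables and return how many instructions were executed."""
    stack = []
    for instr in code:
        op = instr[0]
        if op == "PUSH_LIT":
            stack.append(instr[1])
        elif op == "PUSH_VAR":
            stack.append(variables[instr[1]])
        elif op == "ADD":
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
        elif op == "POP_VAR":
            variables[instr[1]] = stack.pop()
        elif op == "INC_VAR":
            variables[instr[1]] += 1
        else:
            raise ValueError(f"unknown opcode: {op}")
    return len(code)


# I = I + 256: push the literal, push I, add, pop into I (4 instructions).
ADD_LITERAL = [("PUSH_LIT", 256), ("PUSH_VAR", "I"), ("ADD",), ("POP_VAR", "I")]
# I = J: a push and a pop (2 instructions).
ASSIGN = [("PUSH_VAR", "J"), ("POP_VAR", "I")]
# I = I + 1 with the optimizer on: a single increment instruction.
INCREMENT = [("INC_VAR", "I")]

if __name__ == "__main__":
    variables = {"I": 1, "J": 7}
    for name, code in [("I = I + 256", ADD_LITERAL), ("I = J", ASSIGN), ("I = I + 1", INCREMENT)]:
        print(name, "->", run(code, variables), "instructions, I =", variables["I"])
```

The counts (4, 2, and 1) track the ordering of the measured times (52, 30, and 15 uS), though the timings also depend on instruction length and fetch cost, which this sketch does not model.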