How to choose a firmware partner

Started by robi...@tesco.net May 26, 2004
"Anthony Fremont" <spam@anywhere.com> wrote in news:f0utc.14985$lY2.14045
@fe1.texas.rr.com:

> "Alan Balmer" <albalmer@att.net> wrote
>
>> Anthony gave a good description. Keep in mind that many of us old
>> farts have a tendency to call even semiconductor memory "core." In
>> fact, crash dumps are still called core dumps even by people who never
>> saw any core <g>.
>
> You should see the looks I get when talking about "control cards" when
> referring to JCL. Every new programmer should have to punch up a deck
> and learn why sequence numbers are worth the effort. ;-)))

/hug card sorter

--
Richard
Al Borowski <aj.borowski@erasethis.student.qut.edu.au> wrote in message news:<40b690dd$0$8115$5a62ac22@freenews.iinet.net.au>...
> > 3) Evidently the MCU hardware itself cannot become inherently "stuck"
> > because that would make the clearWDT instructions inaccessible and a
> > mockery of the whole watchdog scheme.
>
> Of course it can become stuck. If the "clearWDT instruction becomes
> inaccessible" then something is very wrong. The watchdog will time out
> and reset the CPU. That's the whole point!

Yes, that was dumb of me to "derive". But it suggested this idea:

You fill your MCU with NOPs plus one (RAM less) IO toggle and run it under EM and ES duress. If it latches permanently, you know the core is inherently vulnerable and you don't use that device.

> If your 12c508 code is expected to be very reliable, then I'd include a
> watchdog.
>
> Al

We have to enable WDT to conform. I hate it because the only post production failure we had was WDT induced:

At cold temperatures the (CR) period of the (independent) WDT reduces by 20%. I did not realise this. I set all WDT reset-times at the most infrequent possible (at room temperature).

At low temperatures, our system became locked in a constant re-boot cycle.

Cheers
Robin
Paul Keinanen <keinanen@sci.fi> says...

>Are you sure about the 64 bit memory word length, since the 1108 was a
>36 bit machine, so a 64 bit memory word width does not make sense even
>with parity or ECC bits.

_A history of Univac computers and Operating Systems_
[ http://www.cs.und.edu/~rmarsh/CLASS/CS451/HANDOUTS/os-unisys.pdf ]
says "Just as the first UNIVAC 1108s were being delivered,
Sperry Rand announced the 1108 II ... The memory units were
for program storage and data storage, each holding up to 262,000
64-bit words."

--
Guy Macon, Electronics Engineer & Project Manager for hire.
Remember Doc Brown from the _Back to the Future_ movies? Do you
have an "impossible" engineering project that only someone like
Doc Brown can solve? My resume is at http://www.guymacon.com/
<robin.pain@tesco.net> wrote in message
news:bd24a397.0405280740.146e7b1d@posting.google.com...
> Al Borowski <aj.borowski@erasethis.student.qut.edu.au> wrote in message
news:<40b690dd$0$8115$5a62ac22@freenews.iinet.net.au>...
> You fill your MCU with NOPs plus one (RAM less) IO toggle and run it
> under EM and ES duress. If it latches permanently, you know the core
> is inherently vulnerable and you don't use that device.
"Latching permanently" - do you mean a) CMOS latchup (which would be down to poor integration); b) other excluded logic state (fixed via a reset); or c) loss of software control (also fixed by a reset)? I don't understand how any of these affect the choice of CPU.
> We have to enable WDT to conform. I hate it because the only post
> production failure we had was WDT induced:
>
> At cold temperatures the (CR) period of the (independent) WDT reduces
> by 20%. I did not realise this. I set all WDT reset-times at the most
> infrequent possible (at room temperature).
>
> At low temperatures, our system became locked in a constant re-boot
> cycle.

Ah, I see where you're coming from. The lesson here is not to avoid watchdogs; it's to make sure you read the datasheets. I'm afraid this was a hardware/software integration error. The watchdog was doing its best to save your ass, but you misprogrammed it.

I'm a bit puzzled by this statement:

>> I set all WDT reset-times at the most infrequent possible (at room temperature). <<

How exactly? Surely you didn't use a timer; that would be pretty silly.

There are various approaches to watchdog-kicking, but assuming it's an external hardware device kicked via an I/O pin, then a well-proven scheme follows:

- Take the I/O pin high in the heartbeat interrupt (e.g. every 5ms or whatever - regular signal)
- Take the I/O pin low as the lowest priority task in your round-robin (or scheduler - i.e. the highest-level non-interrupt driven task).

This scheme has the added advantage of giving you a crude but effective CPU utilisation monitor on the I/O pin. Watching this pin with a scope gives you a good view into how busy your CPU is.

Steve
http://www.sfdesign.co.uk
http://www.fivetrees.com
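The two-point kicking scheme described above can be sketched in C roughly as follows. This is only an illustration under stated assumptions: `wdt_pin_write`, `heartbeat_isr`, `idle_task` and `wdt_edge_count` are hypothetical names, and the pin is simulated with a variable and an edge counter so the logic can be exercised on a host. On a real target the write would go to a port register and the external watchdog would count the edges.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the port pin wired to the external watchdog.
   On real hardware this would be a bit in a volatile port register. */
static volatile bool wdt_pin = false;

/* Edges seen on the pin; simulation only, so the scheme can be checked. */
static uint32_t wdt_edges = 0;

static void wdt_pin_write(bool level)
{
    if (level != wdt_pin) {
        wdt_edges++;            /* the watchdog is retriggered by edges */
    }
    wdt_pin = level;
}

/* Call from the periodic heartbeat interrupt (e.g. every 5 ms). */
void heartbeat_isr(void)
{
    wdt_pin_write(true);
}

/* Call as the lowest-priority task of the round-robin loop. */
void idle_task(void)
{
    wdt_pin_write(false);
}

/* Simulation helper: number of edges produced so far. */
uint32_t wdt_edge_count(void)
{
    return wdt_edges;
}
```

The point of splitting the kick is that the pin only keeps toggling while both the interrupt and the background loop are alive: if the loop hangs the pin sticks high, and if interrupts stop it sticks low. Either way the edges stop and the watchdog times out. The time the pin spends high in each heartbeat period is roughly the time the CPU was busy before reaching the idle task.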
On Fri, 28 May 2004 09:38:51 -0700, Guy Macon
<http://www.guymacon.com> wrote:

>Paul Keinanen <keinanen@sci.fi> says...
>
>>Are you sure about the 64 bit memory word length, since the 1108 was a
>>36 bit machine, so a 64 bit memory word width does not make sense even
>>with parity or ECC bits.
>
>_A history of Univac computers and Operating Systems_
>[ http://www.cs.und.edu/~rmarsh/CLASS/CS451/HANDOUTS/os-unisys.pdf ]
>says "Just as the first UNIVAC 1108s were being delivered,
>Sperry Rand announced the 1108 II ... The memory units were
>for program storage and data storage, each holding up to 262,000
>64-bit words."

I think that you have misinterpreted that chapter. In my opinion, those "262,000 64-bit words" refer to the Nike-X computer, _not_ the 1108 II.

My first encounter with a Univac computer was with an 1108 II (installed in 1970/71) in a two day seminar on how to use the Univac and the RJE (Remote Job Entry) system. I still remember the first programs I tried on it, some simple factorial and prime programs. It was quite a different thing to run those programs on a 36 bit machine, when I had previously worked only with 16 bit DDP 316/516 machines.

If that Univac had been 64 bit, I surely would have remembered that, when even the 36 bit integer range looked like a huge step from 16 bit :-).

Paul
<robin.pain@tesco.net> wrote

> We have to enable WDT to conform. I hate it because the only post
> production failure we had was WDT induced:
>
> At cold temperatures the (CR) period of the (independent) WDT reduces
> by 20%. I did not realise this. I set all WDT reset-times at the most
> infrequent possible (at room temperature).

It's worse than that, read on.

> At low temperatures, our system became locked in a constant re-boot
> cycle.

So what you seem to be saying is, "I don't read datasheets and when things go wrong for me it's the manufacturer's fault". If you think it can only vary by 20% then you will be rudely educated again. For example, on a 16F628 the acceptable limits (according to the datasheet) are 7 - 33 ms, with 18 ms being "typical" (no prescaler assigned). I'm really not trying to be an ass, but you absolutely have to RTFM when working with these things.
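To make the arithmetic concrete, here is a small C sketch that checks a watchdog-clearing interval against the shortest timeout the part is allowed to have, rather than the typical one. The 7/18/33 ms figures are the ones quoted in the post; the type and function names, the `prescale` multiplier, and the percentage margin are assumptions made for illustration, not anything taken from a datasheet or vendor library.

```c
#include <stdbool.h>
#include <stdint.h>

/* Watchdog period limits in microseconds, as read from a datasheet. */
typedef struct {
    uint32_t min_us;
    uint32_t typ_us;
    uint32_t max_us;
} wdt_limits_t;

/* Figures quoted in the post for a 16F628 with no prescaler assigned. */
static const wdt_limits_t WDT_16F628 = { 7000u, 18000u, 33000u };

/* Shortest possible timeout once a prescaler ratio (1 = none) is applied. */
uint32_t wdt_min_timeout_us(const wdt_limits_t *lim, uint32_t prescale)
{
    return lim->min_us * prescale;
}

/* True if the worst-case interval between watchdog clears still fits
   inside the shortest timeout, less a safety margin in percent
   (the margin is a hypothetical design choice, not a datasheet value). */
bool wdt_kick_interval_ok(const wdt_limits_t *lim, uint32_t prescale,
                          uint32_t worst_kick_interval_us,
                          uint32_t margin_pct)
{
    uint32_t budget = wdt_min_timeout_us(lim, prescale);
    budget -= (budget / 100u) * margin_pct;
    return worst_kick_interval_us < budget;
}
```

With these numbers, a design that clears the watchdog every 15 ms looks comfortable against the 18 ms typical figure but fails the check against the 7 ms minimum, which is the kind of constant-reboot failure described in the thread. Clearing every 5 ms passes with a 20% margin.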
"Alan Balmer" <albalmer@att.net> wrote in message
news:h71cb0dd4d4b61a9e1mlfrde8o0v61q842@4ax.com...
> On Thu, 27 May 2004 11:08:28 GMT, bastian42@yahoo.com (42Bastian
> Schick) wrote:
>
> There are interesting pictures at
> http://www.pdp8.net/pdp8em/pdp8em.shtml
>
> --
> Al Balmer
> Balmer Consulting
> removebalmerconsultingthis@att.net

I actually have a core memory board that looks very similar to the ones in that link. I can't remember how I acquired it, as I certainly have never seen the machine that it came from. This particular one is made by Litton Memory Products, and is a G645E 8K x 19 bit 3W-3D 18mil Planar Memory (so says the board).

I find it interesting to hear from some of the people that worked with these older machines.

Thanks for the trip down memory lane (though, not my memories),

Mike Anton
Guy Macon wrote:
> CBFalconer <cbfalconer@yahoo.com> says...
> >
> >Alan Balmer wrote:
> >>
> >> We sold battery backed up memory because of this. Still, it was
> >> somewhat unreliable, so all our memory was ECC, self-correcting
> >> for 1-bit errors, and detecting >1-bit errors.
> >
> >Still should be. Grumph. Idiotic penny pinchers.
>
> I am typing this on a Compaq ProLiant 5500R server that I got on eBay.
> Corrects all 1-bit errors, detects all 2-bit errors. Has 4 200MHz
> Pentium Pro uPs with 1MB of cache on each one (I will upgrade to
> 500MHz Pentium III Xeons when the price drops) 3GB of ECC RAM, and a
> twelve disk SCSI raid array with hotswap drives. All for under $500.

I am curious, how many 1 bit or 2 bit errors has the system reported?

BTW, I have watched the pricing on older Intel chips and I don't see the high end parts drop much in price until they are incredibly obsolete. Even then they can start to go back up as they become very scarce. The more mainstream chips seem to keep dropping in price. I guess nobody will pay a lot for an older Celeron or Pentium, but if you have an old server and need to replace a bad Xeon CPU chip, then it is a lot cheaper to do that than to replace the whole unit even at high CPU prices.

--
Rick "rickman" Collins
rick.collins@XYarius.com
Ignore the reply address. To email me use the above address with the XY removed.

Arius - A Signal Processing Solutions Company
Specializing in DSP and FPGA design
URL http://www.arius.com
4 King Ave 301-682-7772 Voice
Frederick, MD 21701-3110 301-682-7666 FAX
rickman <spamgoeshere4@yahoo.com> says...
>Guy Macon <http://www.guymacon.com> wrote:
>>
>> I am typing this on a Compaq ProLiant 5500R server that I got on eBay.
>> Corrects all 1-bit errors, detects all 2-bit errors. Has 4 200MHz
>> Pentium Pro uPs with 1MB of cache on each one (I will upgrade to
>> 500MHz Pentium III Xeons when the price drops) 3GB of ECC RAM, and a
>> twelve disk SCSI raid array with hotswap drives. All for under $500.
>
>I am curious, how many 1 bit or 2 bit errors has the system reported?

None, in two years of 24/7 use and two full weeks of running memory diagnostics. With older Compaq servers, if you can measure the error rate it is much too high.

--
Guy Macon, Electronics Engineer & Project Manager for hire.
Remember Doc Brown from the _Back to the Future_ movies? Do you
have an "impossible" engineering project that only someone like
Doc Brown can solve? My resume is at http://www.guymacon.com/
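For readers wondering what "corrects all 1-bit errors, detects all 2-bit errors" means mechanically, below is a toy single-error-correcting, double-error-detecting (SECDED) code in C. It is an extended Hamming code protecting 4 data bits in an 8-bit codeword. This is purely illustrative: server memory does the equivalent in hardware over much wider words, and nothing here is claimed to match the scheme in the machine discussed above. All names are hypothetical.

```c
#include <stdint.h>

typedef enum { SECDED_OK, SECDED_CORRECTED, SECDED_UNCORRECTABLE } secded_status_t;

typedef struct {
    secded_status_t status;
    uint8_t data;               /* not trustworthy if UNCORRECTABLE */
} secded_result_t;

/* Parity (XOR of all bits) of an 8-bit value. */
static uint8_t parity8(uint8_t v)
{
    v ^= (uint8_t)(v >> 4);
    v ^= (uint8_t)(v >> 2);
    v ^= (uint8_t)(v >> 1);
    return v & 1u;
}

/* Codeword layout by bit position: 0 = overall parity, 1,2,4 = Hamming
   parity bits, 3,5,6,7 = data bits d1..d4. */
uint8_t secded_encode(uint8_t nibble)
{
    uint8_t d1 = nibble & 1u, d2 = (nibble >> 1) & 1u;
    uint8_t d3 = (nibble >> 2) & 1u, d4 = (nibble >> 3) & 1u;
    uint8_t p1 = d1 ^ d2 ^ d4;  /* covers positions 1,3,5,7 */
    uint8_t p2 = d1 ^ d3 ^ d4;  /* covers positions 2,3,6,7 */
    uint8_t p4 = d2 ^ d3 ^ d4;  /* covers positions 4,5,6,7 */
    uint8_t cw = (uint8_t)((p1 << 1) | (p2 << 2) | (d1 << 3) | (p4 << 4) |
                           (d2 << 5) | (d3 << 6) | (d4 << 7));
    return (uint8_t)(cw | parity8(cw));  /* make total parity even */
}

secded_result_t secded_decode(uint8_t cw)
{
    secded_result_t r = { SECDED_OK, 0 };
    uint8_t syndrome = 0;

    /* XOR of the positions of all set bits; zero for a valid codeword. */
    for (uint8_t pos = 1; pos < 8; pos++) {
        if ((cw >> pos) & 1u) {
            syndrome ^= pos;
        }
    }

    if (parity8(cw)) {
        /* Odd overall parity: assume one flipped bit. The syndrome names
           it; syndrome 0 means the overall parity bit itself. */
        cw ^= (uint8_t)(1u << syndrome);
        r.status = SECDED_CORRECTED;
    } else if (syndrome != 0) {
        /* Even parity but non-zero syndrome: two bits flipped. */
        r.status = SECDED_UNCORRECTABLE;
    }

    r.data = (uint8_t)(((cw >> 3) & 1u) | (((cw >> 5) & 1u) << 1) |
                       (((cw >> 6) & 1u) << 2) | (((cw >> 7) & 1u) << 3));
    return r;
}
```

A memory controller that logs each `SECDED_CORRECTED` event is what makes a question like "how many 1 bit errors has the system reported?" answerable at all.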
Guy Macon wrote:
> rickman <spamgoeshere4@yahoo.com> says...
>> Guy Macon <http://www.guymacon.com> wrote:
>>>
>>> I am typing this on a Compaq ProLiant 5500R server that I got
>>> on eBay. Corrects all 1-bit errors, detects all 2-bit errors.
>>> Has 4 200MHz Pentium Pro uPs with 1MB of cache on each one (I
>>> will upgrade to 500MHz Pentium III Xeons when the price drops)
>>> 3GB of ECC RAM, and a twelve disk SCSI raid array with hotswap
>>> drives. All for under $500.
>>
>> I am curious, how many 1 bit or 2 bit errors has the system
>> reported?
>
> None, in two years of 24/7 use and two full weeks of running
> memory diagnostics. With older Compaq servers, if you can
> measure the error rate it is much too high.

I assume that is running under Linux or your own software. AIUI there is no provision in Windoze for recording memory failures and/or corrections.

--
fix (vb.): 1. to paper over, obscure, hide from public view; 2.
to work around, in a way that produces unintended consequences
that are worse than the original problem. Usage: "Windows ME
fixes many of the shortcomings of Windows 98 SE". - Hutchison