Saving data in CPU on-chip EEPROM

Reply by Grant Edwards ● June 15, 2005

On 2005-06-15, Tauno Voipio <tauno.voipio@iki.fi.NOSPAM.invalid> wrote:

>> [...]
>> 
>>>Please note that journaling in this form is ill suited for
>>>Flash memories, as the status byte and journal must be
>>>repetitively overwritten.
>> 
>> Doesn't the same note apply to EEPROM?
>
> Not so far - a Flash must be block erased, which complicates
> the thing quite remarkably.

Right, but you still have to worry about repeatedly erasing a
specific location and exceeding the "number of writes" spec.

-- 
Grant Edwards                   grante             Yow!  Wow! Look!! A stray
                                  at               meatball!! Let's interview
                               visi.com            it!

Reply by Tauno Voipio ● June 15, 2005

Grant Edwards wrote:
> On 2005-06-15, Tauno Voipio <tauno.voipio@iki.fi.NOSPAM.invalid> wrote:
> 
> 
>>>[...]
>>>
>>>
>>>>Please note that journaling in this form is ill suited for
>>>>Flash memories, as the status byte and journal must be
>>>>repetitively overwritten.
>>>
>>>Doesn't the same note apply to EEPROM?
>>
>>Not so far - a Flash must be block erased, which complicates
>>the thing quite remarkably.
> 
> 
> Right, but you still have to worry about repeatedly erasing a
> specific location and exceeding the "number of writes" spec.
> 

Right.

That's why I'm using a FRAM for this kind of use.

-- 

Tauno Voipio
tauno voipio (at) iki fi

Reply by Noone ● June 15, 2005

yossi_sr wrote:

> I think this approach will sufficiently protect against data corruption
> in EEPROM when the power is switched off during a write cycle.
> I am not concerned about any other source of disturbance. I would
> prefer to improve the hardware rather than invest in complicated
> software algorithms.
> And this approach is probably reliable enough in this case because
> the data stored is simple and sequential.
>
> What do you think? Will the approach described above be sufficient?
> Are there any potential errors in the logic that I don't see?
> Please let me know before I start to write this software.
> Thanks!

One caution to add.  We used one 8051 microcontroller in the past that
exhibited "random program counter jumps" when powering down.  In the
absence of a voltage supervisor, the processor would randomly execute
program threads.  Since EEPROM erase and program routines were in firmware,
every once in a while the EEPROM would get clobbered.   One way to add
protection against this was to not use immediate values to enable
erase/write access.  Use a RAM variable that is defined only for specific
windows.  XOR a register value into the key.  Incrementally build up a
valid access key over a long sequence of code.  A stray bulk erase could
really bite you.

A good way to test for this is to create a dirty power connection to the
processor.  We used a rotating metal can on a rotisserie motor with a
stranded wire rubbing against the body.  Pieces of tape were used to create
islands of insulation.  We fortified the code until we minimized EEPROM
random hits and then added an external voltage supervisor for good measure.

Newer processors usually have some form of internal brownout/Vdd dropout
detection.  The point of all this is simply:  Test and verify that your
system works the way it is expected.  It is the unexpected events that
really hurt you.
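
The key build-up idea in the post above might look something like the following. This is only a minimal sketch: every name and constant in it (ee_key, EE_KEY_VALID, the three step functions, ee_write_byte) is hypothetical, and the plain store to *cell stands in for whatever unlock and program sequence the actual part requires. The point is that a stray jump into the write routine finds a zero key and does nothing.

```c
#include <assert.h>   /* only needed for host-side checks */
#include <stdbool.h>
#include <stdint.h>

/* All names and constants here are hypothetical, for illustration only. */
#define EE_KEY_VALID 0xA5u

/* RAM key: zero except inside a deliberate write window. */
static volatile uint8_t ee_key;

/* Call these from three separate places on the legitimate code path.
 * No single step holds the complete key. */
void ee_key_step1(void) { ee_key = 0x0Fu; }
void ee_key_step2(void) { ee_key ^= 0x3Cu; }  /* 0x0F ^ 0x3C = 0x33 */
void ee_key_step3(void) { ee_key ^= 0x96u; }  /* 0x33 ^ 0x96 = 0xA5 */

/* Writes only if the full key was built; always closes the window. */
bool ee_write_byte(uint8_t *cell, uint8_t value)
{
    bool ok = (ee_key == EE_KEY_VALID);

    ee_key = 0;          /* one key allows at most one write */
    if (ok) {
        *cell = value;   /* placeholder for the device's program sequence */
    }
    return ok;
}
```

A legitimate caller runs the three steps in order and then calls ee_write_byte(); a call made without them returns false and leaves the cell untouched. On real hardware the key would more likely be fed into the EEPROM control register than compared against a constant.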

Reply by David Brown ● June 16, 2005

Noone wrote:

> 
> One caution to add.  We used one 8051 microcontroller in the past that
> exhibited "random program counter jumps" when powering down.  In the
> absence of a voltage supervisor, the processor would randomly execute
> program threads.  Since EEPROM erase and program routines were in firmware,
> every once in a while the EEPROM would get clobbered.   One way to add
> protection against this was to not use immediate values to enable
> erase/write access.  Use a RAM variable that is defined only for specific
> windows.  XOR a register value into the key.  Incrementally build up a
> valid access key over a long sequence of code.  A stray bulk erase could
> really bite you.
> 
> A good way to test for this is to create a dirty power connection to the
> processor.  We used a rotating metal can on a rotisserie motor with a
> stranded wire rubbing against the body.  Pieces of tape were used to create
> islands of insulation.  We fortified the code until we minimized EEPROM
> random hits and then added an external voltage supervisor for good measure.
> 
> Newer processors usually have some form of internal brownout/Vdd dropout
> detection.  The point of all this is simply:  Test and verify that your
> system works the way it is expected.  It is the unexpected events that
> really hurt you.
> 

I would think that a microcontroller that jumps around randomly on power 
down would cause a lot more worries than just overwriting EEPROM!  In 
particular, brown-outs could cause disaster (and even with the best 
power supply, brown-outs can occur - think of users giving the system a 
"quick reset, just to be sure everything is working").  You'd definitely 
want to add an external reset device to such a micro.

Reply by Tauno Voipio ● June 16, 2005

David Brown wrote:
> Noone wrote:
> 
>>
>> One caution to add.  We used one 8051 microcontroller in the past that
>> exhibited "random program counter jumps" when powering down.  In the
>> absence of a voltage supervisor, the processor would randomly execute
>> program threads.  Since EEPROM erase and program routines were in 
>> firmware,
>> every once in a while the EEPROM would get clobbered.   One way to add
>> protection against this was to not use immediate values to enable
>> erase/write access.  Use a RAM variable that is defined only for specific
>> windows.  XOR a register value into the key.  Incrementally build up a
>> valid access key over a long sequence of code.  A stray bulk erase could
>> really bite you.
>>
>> A good way to test for this is to create a dirty power connection to the
>> processor.  We used a rotating metal can on a rotisserie motor with a
>> stranded wire rubbing against the body.  Pieces of tape were used to create
>> islands of insulation.  We fortified the code until we minimized EEPROM
>> random hits and then added an external voltage supervisor for good 
>> measure.
>>
>> Newer processors usually have some form of internal brownout/Vdd dropout
>> detection.  The point of all this is simply:  Test and verify that your
>> system works the way it is expected.  It is the unexpected events that
>> really hurt you.
>>
> 
> I would think that a microcontroller that jumps around randomly on power 
> down would cause a lot more worries than just overwriting EEPROM!  In 
> particular, brown-outs could cause disaster (and even with the best 
> power supply, brown-outs can occur - think of users giving the system a 
> "quick reset, just to be sure everything is working").  You'd definitely 
> want to add an external reset device to such a micro.

My vote goes to the voltage monitor, too.

Years ago, I had problems with EEPROM being clobbered despite
triple redundancy and voting read. The culprit was the 8051
processor turning mad just before passing out when power was
going down.

The problem was completely cured with a voltage monitor / reset
chip.

-- 

Tauno Voipio
tauno voipio (at) iki fi
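
For readers unfamiliar with the "triple redundancy and voting read" mentioned above, here is one common form of it: a bitwise 2-of-3 majority over three stored copies. This is a sketch only; the function name vote3 is made up for illustration and the post does not say how the voting was actually done. As the post notes, voting cannot help when a runaway processor clobbers more than one copy.

```c
#include <assert.h>   /* only needed for host-side checks */
#include <stdint.h>

/* Bitwise 2-of-3 majority: each result bit takes the value held by at
 * least two of the three copies. Hypothetical helper, not from the thread. */
uint8_t vote3(uint8_t a, uint8_t b, uint8_t c)
{
    return (uint8_t)((a & b) | (a & c) | (b & c));
}
```

A single bad copy, for example one caught mid-erase and reading 0xFF, is outvoted by the other two.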

Reply by Meindert Sprang ● June 16, 2005

"Tauno Voipio" <tauno.voipio@iki.fi.NOSPAM.invalid> wrote in message
news:YU9se.48$Jg2.25@read3.inet.fi...
> My vote goes to the voltage monitor, too.
>
> Years ago, I had problems with EEPROM being clobbered despite
> triple redundancy and voting read. The culprit was the 8051
> processor turning mad just before passing out when power was
> going down.
>
> The problem was completely cured with a voltage monitor / reset
> chip.

I have seen the same happen with the first 8515 AVRs (the AT90S8515), which
lost or corrupted their internal EEPROM at power down. The ATmega8515 solved
it with brown-out detection.

Meindert

Reply by Anton Erasmus ● June 16, 2005

On 14 Jun 2005 13:33:20 -0700, "yossi_sr" <YSrebrnik@kinetics.co.il>
wrote:

>Hi all,
>What happens if a power-down occurs while a parameter is being saved to a
>byte of EEPROM (the write takes about 5-10 ms)? What will the programmed
>byte contain: the old value, the new value, or something undetermined?
>I developed a diesel controller card that uses the AT89C51ED2 processor
>(2K bytes of on-chip EEPROM). The system manages a failure table for various
>parameters, and in case of a failure the appropriate
>byte in EEPROM is incremented (counting failures). Each failure counter
>consists of 2 bytes (counting up to FFFF). The system may be switched
>off (power-down) occasionally during programming, and this may lead to
>errors such as:
>1) when writing (incrementing) the low-order byte (its value is less
>   than FF) and a power-down occurs, I am not sure what this byte
>   will contain after the next power-on.
>2) when the value in the counter is 0x00FF, incrementing means writing
>   0x00 to the low-order byte and then writing 0x01 to the high-order
>   byte. If a power-down happens after writing the low-order
>   byte, the system will show the erroneous value 0x0000 after the next
>   power-on.
>
>One solution may be to use a checksum over the whole block of data and
>always recalculate the checksum after each parameter change. If
>the checksum is not OK, I can clear all parameters. I don't like this
>solution because a single error would force me to clear the whole
>list of failures.
>Is there any other solution or suggestion for how to solve this problem?

Use Gray Code for the counters. Only one bit changes between successive
counts, which of course means only one byte changes. Make sure that if
there is a power failure, you have enough power to at least complete
any write you are busy with. That way if power fails, you either have
incremented the counter, or you have missed only one count.
I have used this method on an AVR that counts operation cycles on a
unit. Of over 300 units that have been operating over the last 4 years,
I have had no corrupt counters.

Regards
  Anton Erasmus
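
To illustrate the single-bit property claimed above, here is a small sketch using the standard binary-to-Gray mapping g = b ^ (b >> 1). The function names are illustrative and not taken from the post, and nothing here says how the AVR code mentioned above was actually written. The second function reports which byte of a 16-bit counter a given increment would touch.

```c
#include <assert.h>   /* only needed for host-side checks */
#include <stdint.h>

/* Standard reflected binary Gray code of a 16-bit count. */
uint16_t binary_to_gray16(uint16_t b)
{
    return (uint16_t)(b ^ (b >> 1));
}

/* Which stored byte changes when `count` is incremented?
 * Returns 0 for the low byte, 1 for the high byte. */
int gray_changed_byte(uint16_t count)
{
    uint16_t before = binary_to_gray16(count);
    uint16_t after  = binary_to_gray16((uint16_t)(count + 1u));
    uint16_t diff   = (uint16_t)(before ^ after);  /* exactly one bit set */

    return (diff & 0xFF00u) ? 1 : 0;
}
```

For the 0x00FF case from the original question, the stored Gray value goes from 0x0080 to 0x0180, so only the high byte is rewritten and there is never a moment when the counter reads 0x0000.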

Reply by Jonathan Kirwan ● June 16, 2005

On Thu, 16 Jun 2005 10:13:03 +0200, "Meindert Sprang"
<mhsprang@NOcustomSPAMware.nl> wrote:

><snip>
>I have seen the same happen with the first 8515 AVRs (the AT90S8515), which
>lost or corrupted their internal EEPROM at power down.
><snip>

I've been there, too, with the Atmel AT90S2313 AVRs.

Jon

Reply by Jim Granville ● June 16, 2005

Anton Erasmus wrote:
>>One solution may be to use a checksum over the whole block of data and
>>always recalculate the checksum after each parameter change. If
>>the checksum is not OK, I can clear all parameters. I don't like this
>>solution because a single error would force me to clear the whole
>>list of failures.
>>Is there any other solution or suggestion for how to solve this problem?
> 
> 
> Use Gray Code for the counters. Only one bit changes between successive
> counts, which of course means only one byte changes. Make sure that if
> there is a power failure, you have enough power to at least complete
> any write you are busy with. That way if power fails, you either have
> incremented the counter, or you have missed only one count.

  Sounds fine, but what actually happens in the EEPROM update process
is that the buried EE state engine first erases the byte (or page) and then
replaces it with the new value(s).
  Some have Page schemes, but allow single byte replace - that just
means they read the page, XOR it with the new info, and write the whole
page back.

  Thus, even with Gray code, there are finite times where you have
[OldValue][Erasing to 0FFH][0FFH][Writing NewValueZeroes][NewValue]

  The issue with wayward PgmCtrs (mainly on ramp down, but also EMC)
is one reason there is demand for OTP Flash schemes -> devices that 
_cannot_ update their own code in the field.

-jg

Reply by Jonathan Kirwan ● June 16, 2005

On Thu, 16 Jun 2005 12:30:37 +0200, Anton Erasmus
<nobody@spam.prevent.net> wrote:

><snip>
>Use Gray Code for the counters. Only one bit changes between successive
>counts, which of course means only one byte changes. Make sure that if
>there is a power failure, you have enough power to at least complete
>any write you are busy with. That way if power fails, you either have
>incremented the counter, or you have missed only one count.
>I have used this method on an AVR that counts operation cycles on a
>unit. Of over 300 units that have been operating over the last 4 years,
>I have had no corrupt counters.

I'm interested in the exact details of this.  It is easy for me to
imagine how this might be done, using a routine to read a gray code
from non-volatile memory and convert it into multi-byte binary form in
RAM, where it is incremented, and then convert this result back to
gray code before writing back to non-volatile memory.

But I don't know of a direct method to simply read out the current
gray code value and more directly figure out which byte may have
changed.  It seems to me that the conversion to binary, with an
increment taking place in that domain, is necessary.

The conversion back and forth is relatively easy, but one of the
conversions involves a loop and I'm wondering if there is a method
that does not involve a loop and could be used to operate in an
expression form.

Jon
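
On the loop question above: for a fixed word width, the Gray-to-binary direction can be unrolled into a fixed number of shift-and-XOR steps, since each binary bit is the XOR of all Gray bits at or above it. The sketch below assumes a 16-bit counter and uses made-up function names. It still goes through binary to increment, as the post suggests is necessary, but without a loop.

```c
#include <assert.h>   /* only needed for host-side checks */
#include <stdint.h>

/* Binary to Gray: a single expression. */
uint16_t binary_to_gray16(uint16_t b)
{
    return (uint16_t)(b ^ (b >> 1));
}

/* Gray to binary without a loop: a prefix XOR done by halving the shift. */
uint16_t gray_to_binary16(uint16_t g)
{
    g ^= (uint16_t)(g >> 8);
    g ^= (uint16_t)(g >> 4);
    g ^= (uint16_t)(g >> 2);
    g ^= (uint16_t)(g >> 1);
    return g;
}

/* Increment a stored Gray value: decode, add one, re-encode.
 * Wraps from the Gray code of 0xFFFF (0x8000) back to 0x0000. */
uint16_t gray_increment16(uint16_t g)
{
    return binary_to_gray16((uint16_t)(gray_to_binary16(g) + 1u));
}
```

The result of gray_increment16() differs from its input in exactly one bit, so XORing old and new values shows which byte needs to be written back to non-volatile memory.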