EEPROM guarantees after power loss during a write

Reply by John Devereux ● February 5, 2008

Jim Granville <no.spam@designtools.maps.co.nz> writes:

> John Devereux wrote:
>> larwe <zwsdotcom@gmail.com> writes:
>>
>>
>>>>Say power is lost during a write to a single byte in a page. What can
>>>>I assume? Is just that byte suspect, or the whole page (or the whole
>>>>device)?
>>>
>>>The answer to this question depends rather much on whether your
>>>external brownout protection also asserts the write protect pin...
>>
>>
>> I would like to know the situation where this does not happen (i.e. no
>> external brownout detection).
>>
>> Actually in the case of the AT24C1024, it looks pretty useless
>> anyway. It is active high, which still leaves the question of brownout
>> behaviour open. And the datasheet implies it only provides write
>> protection if asserted *before* the write.
>
> If this is important, it sounds like the sort of thing you
> should run some aggressive tests on.
> Make the power fail during a write, and see what happens ?
> All writes have to have a 'hidden erase', so check you can see
> that 'on demand' and then look around for collateral damage....

That could work - at least to answer the question of whether an entire
page is erased as part of a single-byte write. Perhaps a timer with an
output pin hooked up to disconnect power, then vary the delay until I
see something interesting. I don't think manually unplugging the
supply would work; the page write time is 10ms and for all I know the
vulnerable period could be a lot less than this.

> There are FRAMs, and I saw someone just released a 32KB SPI SRAM
> too.
>
> -jg
>

-- 

John Devereux

Reply by Grant Edwards ● February 5, 2008

On 2008-02-05, Grant Edwards <grante@visi.com> wrote:
> On 2008-02-05, John Devereux <jdREMOVE@THISdevereux.me.uk> wrote:
>> Hi,
>>
>> I am wondering what guarantees are there for existing EEPROM data,
>> after power is lost during a write operation?
>>
>> I am writing a datalogging routine that writes records to an
>> EEPROM. It's an Atmel 24C1024, although the question is probably
>> applicable to other devices too. This uses "page mode" for writes -
>> the device seems to be organised as 256 byte pages.
>>
>> Say power is lost during a write to a single byte in a page. What can
>> I assume? Is just that byte suspect, or the whole page (or the whole
>> device)? 
>>
>> I can't find any information on this stuff.
>
> Atmel was very up front with me when I e-mailed their support
> address with that exact question.  They said that the byte
> being written to when the power failed will be undefined, but
> everything else will be OK.

I can't find that e-mail, and it could have been a different
vendor (I've used EEPROMs from several different ones).  You
probably should press Atmel for an answer.

-- 
Grant Edwards                   grante             Yow! PEGGY FLEMMING is
                                  at               stealing BASKET BALLS to
                               visi.com            feed the babies in VERMONT.

Reply by Robert Adsett ● February 5, 2008

In article <87ve53i7fm.fsf@cordelia.devereux.me.uk>, John Devereux 
says...
> Hi,
> 
> I am wondering what guarantees are there for existing EEPROM data,
> after power is lost during a write operation?
> 
> I am writing a datalogging routine that writes records to an
> EEPROM. It's an Atmel 24C1024, although the question is probably
> applicable to other devices too. This uses "page mode" for writes -
> the device seems to be organised as 256 byte pages.
> 
> Say power is lost during a write to a single byte in a page. What can
> I assume? Is just that byte suspect, or the whole page (or the whole
> device)? 
> 
> The microcontroller has brownout protection, so isn't going to run
> wild - but what about the EEPROM internal state machine? Are they
> generally protected against brownout?

My experience would suggest brownout protection on the devices 
themselves may be minimal.  Brownout protection on the micro may 
actually make the problem worse.  Do you know (is it documented) what 
the state of the micro's pins is during reset as opposed to coming out 
of reset?

> If I write a single byte, does this in fact involve a hidden
> erase/write of the whole page?

Not usually for conventional EE.  If it's flash masquerading as EE.....

> 
> I can't find any information on this stuff.

There does seem to be a fair amount of resistance to providing full 
details.

Let me share a previous experience with EE

   - Environment, bit banged Microwire/SPI, electrically a bit noisy 
(100's of Amps switching near by).  Hold up cap to maintain power when 
it is detected that power is removed.  Power off detect comparator used 
to let micro know when power was removed.
   - EEProm used to store operating parameters, operating clock and 
faults.  Clocks written every 3 to 6 min to reduce wear on EE to 
tolerable level.  Clock data protected with an ECC code.  Fault flags 
unprotected.  Parameters stored in two blocks each protected by a 
fletcher checksum.  Both banks would be read on startup and if one block 
was bad it would be restored from the other.
   - Writes to EE would check that power was valid before starting.

   - In operation, occasional field returns due to parameter corruption. 

   Results of improvement attempts.  Each one of these resulted in an 
improvement.
   - Hold up cap size increased
   - Write sequence changed so one parameter block completely updated 
with checksum before next written. It should have been written that way 
to begin with of course.
   - Extra decoupling
   - Redundant pull-up (or was it pull down?) on some of the lines.  It 
'shouldn't' have been necessary as I recall.

Although all of these helped, none eliminated the problem.  
Unfortunately it happened rarely enough that we didn't find a way to 
duplicate it in the lab.  As a test I recommended a switch to FRAM to 
reduce the window of vulnerability but that hadn't happened by the time 
I left so I don't know if it would have helped.

Some of the reading I did at the time suggested that if the EE state 
machine were interrupted things could go very wrong.

Try a search for something like reliable EE.  I did find something moons 
ago but as I recall it was from a vendor so judge that as you will.
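The two-bank parameter scheme described above (each block protected by a Fletcher checksum, a bad block restored from the good one at startup) might look roughly like the sketch below. The 16-bit Fletcher variant, the 16-byte payload size and all the names here are assumptions for illustration, not the original firmware.

```c
#include <stddef.h>
#include <stdint.h>

#define PARAM_BYTES 16u  /* hypothetical payload size */

typedef struct {
    uint8_t  payload[PARAM_BYTES];
    uint16_t checksum;   /* Fletcher-16 of payload, written last */
} param_block_t;

/* Fletcher-16 over a byte buffer (both running sums taken modulo 255). */
uint16_t fletcher16(const uint8_t *data, size_t len)
{
    uint16_t sum1 = 0, sum2 = 0;
    for (size_t i = 0; i < len; i++) {
        sum1 = (uint16_t)((sum1 + data[i]) % 255u);
        sum2 = (uint16_t)((sum2 + sum1) % 255u);
    }
    return (uint16_t)((sum2 << 8) | sum1);
}

/* Compute and store the checksum once the payload is complete. */
void block_seal(param_block_t *b)
{
    b->checksum = fletcher16(b->payload, PARAM_BYTES);
}

/* Non-zero when the stored checksum matches the payload. */
int block_ok(const param_block_t *b)
{
    return b->checksum == fletcher16(b->payload, PARAM_BYTES);
}

/*
 * Startup repair of two RAM images read from the two banks.
 * Returns 0 if both were good, 1 if bank A was restored from B,
 * 2 if bank B was restored from A, -1 if neither could be trusted.
 * Writing the repaired image back to the EEPROM is left to the caller.
 */
int repair_banks(param_block_t *a, param_block_t *b)
{
    int a_ok = block_ok(a);
    int b_ok = block_ok(b);
    if (a_ok && b_ok) return 0;
    if (a_ok) { *b = *a; return 2; }
    if (b_ok) { *a = *b; return 1; }
    return -1;
}
```

One thing to watch: a plain Fletcher sum accepts an all-zero block as valid (payload zero, checksum zero), so if the part can fail to all zeros it may be worth seeding the sums or adding a non-zero tag byte.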

Robert

-- 
Posted via a free Usenet account from http://www.teranews.com

Reply by Robert Adsett ● February 5, 2008

In article <KV2qj.4178$0w.1057@newssvr27.news.prodigy.net>, Vladimir 
Vassilevsky says...
> 
> 
> John Devereux wrote:
> 
> > I am wondering what guarantees are there for existing EEPROM data,
> > after power is lost during a write operation?
> > 
> > I am writing a datalogging routine that writes records to an
> > EEPROM. It's an Atmel 24C1024, although the question is probably
> > applicable to other devices too. This uses "page mode" for writes -
> > the device seems to be organised as 256 byte pages.
> > 
> > Say power is lost during a write to a single byte in a page. What can
> > I assume? Is just that byte suspect, or the whole page (or the whole
> > device)? 
> 
> I don't think anybody can tell for sure what can happen to the flash 
> write state machine when the power goes down suddenly. Hopefully it 
> will not have enough time to destroy the whole device, so something like 
> the journaling file system could help.
> 
> You can also consider the autostore NVRAMs from Simtek:
> 
> http://www.simtek.com/simtekSite.php
> 
> Those parts are designed for the random power outages.
> Works very well indeed.

Except, of course, when they don't.  I had to modify a test bench at one 
point to add a test to write to such devices, power off, power on and 
read the device to see if the values were actually stored.

Apparently there was a bad batch and the only way to check was to power 
cycle them (with them being off for a significant time before repowering 
them).

You may want to add a check like that if you use them.

Robert

-- 
Posted via a free Usenet account from http://www.teranews.com

Reply by John Devereux ● February 6, 2008

Robert Adsett <sub2@aeolusdevelopment.com> writes:

> In article <87ve53i7fm.fsf@cordelia.devereux.me.uk>, John Devereux 
> says...
>> Hi,
>> 
>> I am wondering what guarantees are there for existing EEPROM data,
>> after power is lost during a write operation?
>> 
>> I am writing a datalogging routine that writes records to an
>> EEPROM. It's an Atmel 24C1024, although the question is probably
>> applicable to other devices too. This uses "page mode" for writes -
>> the device seems to be organised as 256 byte pages.
>> 
>> Say power is lost during a write to a single byte in a page. What can
>> I assume? Is just that byte suspect, or the whole page (or the whole
>> device)? 
>> 
>> The microcontroller has brownout protection, so isn't going to run
>> wild - but what about the EEPROM internal state machine? Are they
>> generally protected against brownout?
>
> My experience would suggest brownout protection on the devices 
> themselves may be minimal.  Brownout protection on the micro may 
> actually make the problem worse.  Do you know (is it documented) what 
> the state of the micro's pins is during reset as opposed to coming out 
> of reset?

I naively assumed that "brownout protection" would prevent the micro
from sending arbitrary data over the I/O pins. It's an ATMega128. The
EEPROM is an I2C device (with 10k pullups on the 2 wires). The
datasheet does say that the microcontroller I/O pins go to their
"initial state" during a reset, i.e. high impedance inputs. So the I2C
lines should get pulled high. Briefly...

>> If I write a single byte, does this in fact involve a hidden
>> erase/write of the whole page?
>
> Not usually for conventional EE.  If it's flash masquerading as
> EE.....

I don't think so - but it's possible I suppose!

>> 
>> I can't find any information on this stuff.
>
> There does seem to be a fair amount of resistance to providing full 
> details.
>
> Let me share a previous experience with EE
>
>    - Environment, bit banged Microwire/SPI, electrically a bit noisy 
> (100's of Amps switching near by).  Hold up cap to maintain power when 
> it is detected that power is removed.  Power off detect comparator used 
> to let micro know when power was removed.
>    - EEProm used to store operating parameters, operating clock and 
> faults.  Clocks written every 3 to 6 min to reduce wear on EE to 
> tolerable level.  Clock data protected with an ECC code.  Fault flags 
> unprotected.  Parameters stored in two blocks each protected by a 
> fletcher checksum.  Both banks would be read on startup and if one block 
> was bad it would be restored from the other.
>    - Writes to EE would check that power was valid before starting.
>
>    - In operation, occasional field returns due to parameter corruption. 
>
>    Results of improvement attempts.  Each one of these resulted in an 
> improvement.
>    - Hold up cap size increased
>    - Write sequence changed so one parameter block completely updated 
> with checksum before next written. It should have been written that way 
> to begin with of course.

This is basically what I will be doing (just the software part of the
above).

>    - Extra decoupling
>    - Redundant pull-up (or was it pull down?) on some of the lines.  It 
> 'shouldn't' have been necessary as I recall.
>
> Although all of these helped, none eliminated the problem.  
> Unfortunately it happened rarely enough that we didn't find a way to 
> duplicate it in the lab.  As a test I recommended a switch to FRAM to 
> reduce the window of vulnerability but that hadn't happened by the time 
> I left so I don't know if it would have helped.
>
> Some of the reading I did at the time suggested that if the EE state 
> machine were interrupted things could go very wrong.
>
> Try a search for something like reliable EE.  I did find something moons 
> ago but as I recall it was from a vendor so judge that as you will.

Interesting, thanks for sharing that. 

In my application there are a few hundred units in the field that have
no protection at all. I.e. the software is written ignoring power
failure. And we are not getting problems. But it is obviously a
possibility, so I am attempting to address it. Of course this will add
complexity and be quite awkward to test. If I am not careful I could
introduce a bug that would make things *worse*. So I want to have some
clue that it is worth doing.

-- 

John Devereux

Reply by ssubbarayan ● February 6, 2008

On Feb 5, 11:28 pm, John Devereux <jdREM...@THISdevereux.me.uk> wrote:
> Hi,
>
> I am wondering what guarantees are there for existing EEPROM data,
> after power is lost during a write operation?
>
> I am writing a datalogging routine that writes records to an
> EEPROM. It's an Atmel 24C1024, although the question is probably
> applicable to other devices too. This uses "page mode" for writes -
> the device seems to be organised as 256 byte pages.
>
> Say power is lost during a write to a single byte in a page. What can
> I assume? Is just that byte suspect, or the whole page (or the whole
> device)?
>
> The microcontroller has brownout protection, so isn't going to run
> wild - but what about the EEPROM internal state machine? Are they
> generally protected against brownout?
>
> If I write a single byte, does this in fact involve a hidden
> erase/write of the whole page?
>
> I can't find any information on this stuff.
>
> --
>
> John Devereux

John,
We encountered the same problem with our product (and still
encounter it!). Although we did not find a proper fix, this is the
workaround we used:
We implemented a checksum in our software to detect data corruption in
the EEPROM, and in case we find corruption we keep a known good copy of
the EEPROM data as a backup in ROM (external flash). This data is
copied back to the EEPROM during bootup, so the customer has good data
when he boots up.
When wrong data is written due to a brownout, the checksum will
generally no longer match.
We back up the data only when we can conclude that at least one known
good set of data exists. (This can be ascertained by comparing against
the known checksum.)

We have used this workaround and, after it was implemented, we never
faced any problems with the content of the EEPROM. Even though brownout
situations still happen, the impact was greatly reduced.

As for the brownout: like your situation, we also had neither an
external capacitor nor a brownout protection pin on our board. We
use ST's EEPROM. I raised a similar query a couple of months back.
The links are given below:
1) http://groups.google.co.in/group/comp.arch.embedded/browse_thread/thread/f24017eb1e913ac6/f51e6152809d6293?hl=en&lnk=gst&q=subbarayan#f51e6152809d6293
2) Regarding checksum: http://groups.google.co.in/group/comp.arch.embedded/browse_thread/thread/7bb610e206733fdf/70757e6c50a8dfb6?hl=en&lnk=gst&q=subbarayan#70757e6c50a8dfb6

P.S.: ours is a consumer electronics product. Processor: ST; EEPROM: ST's
M24128BW.

This solution may or may not be suitable to you depending on your
product.
Hope this helps,
Regards,
s.subbarayan

Reply by John Devereux ● February 6, 2008

ssubbarayan <ssubba@gmail.com> writes:

> On Feb 5, 11:28 pm, John Devereux <jdREM...@THISdevereux.me.uk> wrote:
>> Hi,
>>
>> I am wondering what guarantees are there for existing EEPROM data,
>> after power is lost during a write operation?
>>
>> I am writing a datalogging routine that writes records to an
>> EEPROM. It's an Atmel 24C1024, although the question is probably
>> applicable to other devices too. This uses "page mode" for writes -
>> the device seems to be organised as 256 byte pages.
>>
>> Say power is lost during a write to a single byte in a page. What can
>> I assume? Is just that byte suspect, or the whole page (or the whole
>> device)?
>>
>> The microcontroller has brownout protection, so isn't going to run
>> wild - but what about the EEPROM internal state machine? Are they
>> generally protected against brownout?
>>
>> If I write a single byte, does this in fact involve a hidden
>> erase/write of the whole page?
>>
>> I can't find any information on this stuff.
>>
>> --
>>
>> John Devereux
>
> John,
> We encountered the same problem with our product (and still
> encounter it!). Although we did not find a proper fix, this is the
> workaround we used:
> We implemented a checksum in our software to detect data corruption in
> the EEPROM, and in case we find corruption we keep a known good copy of
> the EEPROM data as a backup in ROM (external flash). This data is
> copied back to the EEPROM during bootup, so the customer has good data
> when he boots up.
> When wrong data is written due to a brownout, the checksum will
> generally no longer match.
> We back up the data only when we can conclude that at least one known
> good set of data exists. (This can be ascertained by comparing against
> the known checksum.)

This is equivalent to what I was planning. Although I don't think I
need a checksum. I was going to have "valid" markers, separate from
the data blocks. So it would go

  mark copy 1 invalid
  write new copy 1
  mark copy 1 valid
  mark copy 2 invalid
  write new copy 2
  mark copy 2 valid

On power up both copy valid flags would be checked, and any "invalid"
copy overwritten with the valid one. The "copy valid" markers would be
stored on separate pages from the data (and each other), so hopefully
will not get corrupted at the same time as the data they refer to.

Only problem with this is it requires 4 pages to be written instead of
one. Using a checksum to replace the separate flags could mean just
two pages - perhaps that is better after all.
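As a rough illustration of the marker sequence above, here is a small RAM-only model of the six steps plus the power-up check. Everything in it is an assumption made for the sketch: the names, the 8-byte record, the 0xA5 marker value, and the rule that copy 1 wins when both copies are marked valid but differ (the case where power is lost after copy 1 is finished and before copy 2 is touched). It only models power loss between steps, not a write interrupted half way through a byte.

```c
#include <stdint.h>
#include <string.h>

#define DATA_BYTES   8u
#define MARK_VALID   0xA5u   /* multi-bit pattern, less likely to appear by accident */
#define MARK_INVALID 0x00u

typedef struct {
    uint8_t data[2][DATA_BYTES];  /* copy 1 and copy 2, each on its own page */
    uint8_t valid[2];             /* markers, on separate pages in a real layout */
} store_t;

/*
 * Run the six-step update, stopping once 'steps_before_fail' steps have
 * completed to mimic power being lost at that point (6 = no failure).
 */
void store_update(store_t *s, const uint8_t *new_data, int steps_before_fail)
{
    int step = 0;
    for (int c = 0; c < 2; c++) {
        if (step++ >= steps_before_fail) return;
        s->valid[c] = MARK_INVALID;                 /* mark copy invalid */
        if (step++ >= steps_before_fail) return;
        memcpy(s->data[c], new_data, DATA_BYTES);   /* write new copy */
        if (step++ >= steps_before_fail) return;
        s->valid[c] = MARK_VALID;                   /* mark copy valid */
    }
}

/*
 * Power-up check. Returns 0 when a usable record exists afterwards,
 * -1 when neither copy is marked valid.
 * Assumption: copy 1 is always written first, so if both are valid but
 * differ, copy 1 holds the newer record and overwrites copy 2.
 */
int store_recover(store_t *s)
{
    int v0 = (s->valid[0] == MARK_VALID);
    int v1 = (s->valid[1] == MARK_VALID);
    if (!v0 && !v1) return -1;
    if (v0 && (!v1 || memcmp(s->data[0], s->data[1], DATA_BYTES) != 0)) {
        memcpy(s->data[1], s->data[0], DATA_BYTES);
        s->valid[1] = MARK_VALID;
    } else if (!v0) {
        memcpy(s->data[0], s->data[1], DATA_BYTES);
        s->valid[0] = MARK_VALID;
    }
    return 0;
}
```

In this model, a failure in the first three steps leaves the old record in both copies after recovery, and a failure anywhere later leaves the new one. On real hardware the repair writes in store_recover would themselves want the same invalid / write / valid ordering.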

> We have used this workaround and, after it was implemented, we never
> faced any problems with the content of the EEPROM. Even though brownout
> situations still happen, the impact was greatly reduced.
> As for the brownout: like your situation, we also had neither an
> external capacitor nor a brownout protection pin on our board. We
> use ST's EEPROM. I raised a similar query a couple of months back.
> The links are given below:
> 1)http://groups.google.co.in/group/comp.arch.embedded/browse_thread/thread/f24017eb1e913ac6/f51e6152809d6293?hl=en&lnk=gst&q=subbarayan#f51e6152809d6293

I will look at these.

By the way, long links often get scrambled up on usenet. You can make
it easier for some people if you enclose in angle brackets

<http://groups.google.co.in/group/comp.arch.embedded/browse_thread/thread/f24017eb1e913ac6/f51e6152809d6293?hl=en&lnk=gst&q=subbarayan#f51e6152809d6293>

This seems to stop them getting split up by news readers.

> 2) Regarding checksum: http://groups.google.co.in/group/comp.arch.embedded/browse_thread/thread/7bb610e206733fdf/70757e6c50a8dfb6?hl=en&lnk=gst&q=subbarayan#70757e6c50a8dfb6
>
> P.S.: ours is a consumer electronics product. Processor: ST; EEPROM: ST's
> M24128BW.
>
> This solution may or may not be suitable to you depending on your
> product.

Thank you.

> Hope this helps,
> Regards,
> s.subbarayan
>

-- 

John Devereux

Reply by Marra ● February 6, 2008

A good way around this problem is to have a power monitor function on
the micro.

If this shows the power is going then you shouldn't write to the
EEPROM.
Depending on the power supply it might give you time to write one or
more pages of data to the EEPROM.

I used to do work with dataloggers and if the power supply went we had
enough time to write all the data to the EEPROM before the power
supply died.
But we did have a pin on the micro that showed power was dying.
You might even need to beef up the power supply caps to give you a bit
longer.
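To get a feel for how much time a given supply capacitor buys, a back-of-envelope estimate is t = C * dV / I, treating the load as a constant current. The sketch below does that arithmetic in integer units. The function names and the example figures (5.0 V rail, 4.5 V minimum, 20 mA load) are assumptions for illustration; only the 10 ms page write time comes from earlier in the thread. A real supply with a regulator and a non-constant load will behave differently, so treat the result as a starting point for measurement.

```c
#include <stdint.h>

/*
 * Hold-up time in microseconds for a reservoir capacitor feeding a
 * constant-current load: t = C * dV / I.
 * Units work out as uF * mV / mA = us.
 * Returns 0 for inputs the model cannot handle (zero load, no headroom).
 */
uint32_t holdup_us(uint32_t cap_uF, uint32_t v_start_mV,
                   uint32_t v_min_mV, uint32_t load_mA)
{
    if (load_mA == 0 || v_start_mV <= v_min_mV) return 0;
    uint64_t t = (uint64_t)cap_uF * (v_start_mV - v_min_mV) / load_mA;
    return (t > UINT32_MAX) ? UINT32_MAX : (uint32_t)t;
}

/*
 * Whole pages that fit in the hold-up time, given the page write time
 * and any per-page overhead (bus transfer, polling) in microseconds.
 */
uint32_t pages_in_holdup(uint32_t hold_us, uint32_t page_write_us,
                         uint32_t overhead_us)
{
    uint32_t per_page = page_write_us + overhead_us;
    return per_page ? hold_us / per_page : 0;
}
```

With these example numbers, 470 uF dropping from 5000 mV to 4500 mV at 20 mA gives 11750 us, enough for one 10 ms page write; 2200 uF gives 55000 us, or five pages if 1 ms of overhead per page is allowed for.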

Reply by Robert Adsett ● February 6, 2008

In article <87ve52787s.fsf@cordelia.devereux.me.uk>, John Devereux 
says...
> Robert Adsett <sub2@aeolusdevelopment.com> writes:
> 
> > In article <87ve53i7fm.fsf@cordelia.devereux.me.uk>, John Devereux 
> > says...
> >> Hi,
> >> 
> >> I am wondering what guarantees are there for existing EEPROM data,
> >> after power is lost during a write operation?
> >> 
> >> I am writing a datalogging routine that writes records to an
> >> EEPROM. It's an Atmel 24C1024, although the question is probably
> >> applicable to other devices too. This uses "page mode" for writes -
> >> the device seems to be organised as 256 byte pages.
> >> 
> >> Say power is lost during a write to a single byte in a page. What can
> >> I assume? Is just that byte suspect, or the whole page (or the whole
> >> device)? 
> >> 
> >> The microcontroller has brownout protection, so isn't going to run
> >> wild - but what about the EEPROM internal state machine? Are they
> >> generally protected against brownout?
> >
> > My experience would suggest brownout protection on the devices 
> > themselves may be minimal.  Brownout protection on the micro may 
> > actually make the problem worse.  Do you know (is it documented) what 
> > the state of the micro's pins is during reset as opposed to coming out 
> > of reset?
> 
> I naively assumed that "brownout protection" would prevent the micro
> from sending arbitrary data over the I/O pins. It's an ATMega128. The
> EEPROM is an I2C device (with 10k pullups on the 2 wires). The
> datasheet does say that the microcontroller I/O pins go to their
> "initial state" during a reset, i.e. high impedance inputs. So the I2C
> lines should get pulled high. Briefly...

There's another question I've remembered when dealing with brownout.  
Not only the question of whether I/O is the same in reset as on its 
rising edge but also over what range reset is asserted and will hold 
those values.  

The problem can occur (or so I've heard) if the voltage drops to a value 
that the brownout circuit can no longer hold the micro in reset but the 
voltage is still high enough for the EE to be operating.  Not normally 
an issue since most I/O fails when the voltage drops that far anyway but 
apparently it can be an issue with some EEs.  And when you have a hold up 
cap any transition through such a zone will be slow.

> > Try a search for something like reliable EE.  I did find something moons 
> > ago but as I recall it was from a vendor so judge that as you will.
> 
> Interesting, thanks for sharing that. 
> 
> In my application there are a few hundred units in the field that have
> no protection at all. I.e. the software is written ignoring power
> failure. And we are not getting problems. But it is obviously a
> possibility, so I am attempting to address it. Of course this will add
> complexity and be quite awkward to test. If I am not careful I could
> introduce a bug that would make things *worse*. So I want to have some
> clue that it is worth doing.

It will certainly help to have a checksum of some sort on the data if 
you can.  At least then you know something went wrong.  Otherwise if a 
random byte changed would you be able to tell?

If you are not getting problems I'd be tempted to make my first step 
just making sure that problems will be detected if they occur.

Robert

-- 
Posted via a free Usenet account from http://www.teranews.com

Reply by John Devereux ● February 6, 2008

Robert Adsett <sub2@aeolusdevelopment.com> writes:

> In article <87ve52787s.fsf@cordelia.devereux.me.uk>, John Devereux 
> says...
>> Robert Adsett <sub2@aeolusdevelopment.com> writes:
>> 
>> > In article <87ve53i7fm.fsf@cordelia.devereux.me.uk>, John Devereux 
>> > says...
>> >> Hi,
>> >> 
>> >> I am wondering what guarantees are there for existing EEPROM data,
>> >> after power is lost during a write operation?
>> >> 
>> >> I am writing a datalogging routine that writes records to an
>> >> EEPROM. It's an Atmel 24C1024, although the question is probably
>> >> applicable to other devices too. This uses "page mode" for writes -
>> >> the device seems to be organised as 256 byte pages.
>> >> 
>> >> Say power is lost during a write to a single byte in a page. What can
>> >> I assume? Is just that byte suspect, or the whole page (or the whole
>> >> device)? 
>> >> 
>> >> The microcontroller has brownout protection, so isn't going to run
>> >> wild - but what about the EEPROM internal state machine? Are they
>> >> generally protected against brownout?
>> >
>> > My experience would suggest brownout protection on the devices 
>> > themselves may be minimal.  Brownout protection on the micro may 
>> > actually make the problem worse.  Do you know (is it documented) what 
>> > the state of the micro's pins is during reset as opposed to coming out 
>> > of reset?
>> 
>> I naively assumed that "brownout protection" would prevent the micro
>> from sending arbitrary data over the I/O pins. It's an ATMega128. The
>> EEPROM is an I2C device (with 10k pullups on the 2 wires). The
>> datasheet does say that the microcontroller I/O pins go to their
>> "initial state" during a reset, i.e. high impedance inputs. So the I2C
>> lines should get pulled high. Briefly...
>
> There's another question I've remembered when dealing with brownout.  
> Not only the question of whether I/O is the same in reset as on its 
> rising edge but also over what range reset is asserted and will hold 
> those values.  
>
> The problem can occur (or so I've heard) if the voltage drops to a value 
> that the brownout circuit can no longer hold the micro in reset but the 
> voltage is still high enough for the EE to be operating.

The problem is that this information does not seem to be available.

>  Not normally an issue since most I/O fails when the voltage drops
> that far anyway but apparently it can be an issue with some EEs.  And
> when you have a hold up cap any transition through such a zone will
> be slow.

I was just thinking that a "hold up" cap could be a bad idea in this
respect. Might be best just to get rid of the supply ASAP - the
opposite of a hold up cap, get it through the "dangerous" region
quickly.

>> > Try a search for something like reliable EE.  I did find something moons 
>> > ago but as I recall it was from a vendor so judge that as you will.
>> 
>> Interesting, thanks for sharing that. 
>> 
>> In my application there are a few hundred units in the field that have
>> no protection at all. I.e. the software is written ignoring power
>> failure. And we are not getting problems. But it is obviously a
>> possibility, so I am attempting to address it. Of course this will add
>> complexity and be quite awkward to test. If I am not careful I could
>> introduce a bug that would make things *worse*. So I want to have some
>> clue that it is worth doing.
>
> It will certainly help to have a checksum of some sort on the data if 
> you can.  At least then you know something went wrong.  Otherwise if a 
> random byte changed would you be able to tell?
>
> If you are not getting problems I'd be tempted to make my first step 
> just making sure that problems will be detected if they occur.


-- 

John Devereux
