EmbeddedRelated.com

Flash erase time

Started by Dimiter_Popoff August 14, 2014
I had to replace a flash chip yesterday because it would
not read OK at times - and had a byte fail after programming
about all the 2M of it.
Since I don't do large volumes this is my first ever experience
of the kind and I want to ask the group for some insight.

The part was an Atmel AT49BV163D; it was on a netMCA board
which came in for repair (torn power cable...) after a few years
at the customer.
After replacing it (with an equivalent) things went back to normal.

The device had been opened and tampered with. I think the flash
may even have been unsoldered and resoldered, but it was done
reasonably cleanly, so I can't say that for sure. [Reverse engineering
the netMCA would be considered only by someone incapable of
understanding how hopeless the task is, so unsoldering the flash
might have been chosen over reading it via JTAG through the CPU,
as we do here and which we do nothing to obstruct.]

So my question is: how likely is it for a flash chip like that
to wear out after 2-3 initial programming cycles? Yesterday I
rewrote it maybe 10 times, but it showed the symptoms from
the very start.

My curiosity is directed towards the flash durability and perhaps
towards what has been done to the device.

Dimiter

------------------------------------------------------
Dimiter Popoff, TGI             http://www.tgi-sci.com
------------------------------------------------------
http://www.flickr.com/photos/didi_tgi/sets/72157600228621276/
On Thu, 14 Aug 2014 12:58:51 +0300, Dimiter_Popoff wrote:

> [...]
Hey Dimiter:

I'm more inclined to suspect that whatever took out the flash was a consequence of whatever bumbling they did inside the instrument case, or perhaps the chip was taken out by some problem with the board that they managed to fix.

One thing I have learned is that sometimes when a chip dies it will take other chips with it, and not always ones that are either physically or electrically close.

--
Tim Wescott
Wescott Design Services
http://www.wescottdesign.com
On 14.8.2014 г. 20:17, Tim Wescott wrote:
> [...]
Hey Tim,

the chip was not completely dead; it always programmed its first 64k or so correctly and booted. The rest is the "ROM disk", which is used to boot from when the HDD is messed up or new and needs a reinstall. Now that "ROM disk" part sometimes failed to program (usually the first byte, not always at the same address). That byte did program on a second attempt.

But when the "disk" was read, certain areas of it would fail if the transfer was longer; one particular directory (only) would read correctly when read just for a "dir" or "repair", but copying it elsewhere would fail after the first 3-4 files, no matter which ones: the file starting sector would be corrupted and the driver would try to read some bad address (and get trapped, which is how I knew). It looked as if some sustained transfer would make a part of the flash unstable... but just a part, about 10% of the 2M; the rest worked OK.

After replacing the flash things are 100% normal with the board.

Can I interpret this as a sign of flash wear-out? The datasheet says 100k programming cycles are OK.... It has been written to here, including yesterday's tests (mostly then, really), a few tens of times, max 20 times I guess.

I erase the whole flash by issuing the "erase" command and waiting 25 seconds (the datasheet says this is the max. erase time), then I program it at 120 µs per byte (the datasheet says this is the max, or sort of). I tried doubling that time - no change. Maybe I tried tripling it as well.

Dimiter
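One way to pin down the "unstable under sustained transfer" symptom described above is to read each block several times back to back and flag any block whose contents change between reads. The following is only a host-side sketch in Python: `read_block` is a hypothetical callback standing in for whatever access path is available (JTAG, the target CPU, a programmer), and `FlakyFlash` is a toy model invented for illustration, not a description of how the failed part behaved.

```python
import zlib


def find_unstable_blocks(read_block, n_blocks, passes=5):
    """Return indices of blocks whose contents differ between repeated reads.

    read_block(i) -> bytes is a hypothetical callback that reads block i
    from the device under test.
    """
    unstable = []
    for i in range(n_blocks):
        reference = zlib.crc32(read_block(i))
        # Re-read the same block back to back; a healthy flash returns
        # identical data on every read.
        if any(zlib.crc32(read_block(i)) != reference
               for _ in range(passes - 1)):
            unstable.append(i)
    return unstable


class FlakyFlash:
    """Toy model: block 3 returns one flipped bit on every 4th read of it."""

    def __init__(self, n_blocks=8, block_size=16):
        self.data = [bytes([i]) * block_size for i in range(n_blocks)]
        self.reads_of_bad = 0

    def read_block(self, i):
        block = self.data[i]
        if i == 3:
            self.reads_of_bad += 1
            if self.reads_of_bad % 4 == 0:
                return bytes([block[0] ^ 0x01]) + block[1:]
        return block


flash = FlakyFlash()
print(find_unstable_blocks(flash.read_block, n_blocks=8))  # prints [3]
```

A map of which blocks are flagged, and whether they cluster in one sector, would help separate a damaged region of the array from a bus or timing problem that affects all addresses.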
On Thu, 14 Aug 2014 12:58:51 +0300, Dimiter_Popoff wrote:

> [...]
Is anything else hanging on the same bus? Maybe that device is driving the bus when it should not.

--
Chisolm
Republic of Texas
On 14/08/14 19:17, Tim Wescott wrote:
> [...]
A common cause of partly damaged chips is static electricity - if an amateur has changed the chip, then it is not unreasonable to suppose that they were not careful about static.

If you (Dimiter) have access to an X-ray machine, you may be able to examine the faulty chip and see if it has visible internal damage.
On 14.8.2014 г. 22:29, David Brown wrote:
> [...]
Hi David,

I don't have access to such an X-ray (never asked friends though). I am just curious and trying to learn. Basically I have never done hundreds (let alone thousands) of erase/program cycles to a flash like that and I wonder what other people's experience is.

ESD damage might be the cause, but nothing made me look at the signal levels at each pin, i.e. I don't think there has been any zapped input. Or maybe there was; facing that investigation I just soldered a new flash and moved on :-). Then ESD damage can probably be more subtle than a zapped gate, I suppose. Asking the group for experience is so much easier than a full-blown investigation :D .

Dimiter
On Thu, 14 Aug 2014 21:10:57 +0300, Dimiter_Popoff wrote:

> [...]
Hey Dimiter:

I really don't think that you're wearing out the flash in ten repetitions unless you're managing to do something severely wrong. AFAIK most modern flash chips manage their own write and erase algorithms, so unless it's oddball (or oddball-old) then you should be OK.

Doesn't the chip have a way of telling you when it's busy erasing or writing? I'd use that rather than just holding off for a fixed interval.

--
www.wescottdesign.com
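The suggestion to poll the chip's busy indication instead of waiting a fixed interval could look roughly like the sketch below. It is a Python model of the logic, not driver code for any particular part: many parallel NOR flashes expose a status bit that toggles on successive reads while an erase or program is in progress, but the bit position (`TOGGLE_BIT`), the `read_status` callback and the `SimulatedFlash` class are all assumptions made for illustration, and the real mechanism should be taken from the datasheet. The 25 s timeout is the maximum erase time quoted earlier in the thread.

```python
import time

TOGGLE_BIT = 0x40  # assumption: this bit toggles while busy; check the datasheet


def wait_until_ready(read_status, timeout_s, clock=time.monotonic):
    """Poll until two consecutive reads agree on the toggle bit.

    Returns the elapsed time in seconds, or raises TimeoutError.
    read_status() -> int is a hypothetical callback that reads one status
    byte from the device while an operation is in progress.
    """
    start = clock()
    previous = read_status()
    while True:
        current = read_status()
        if ((previous ^ current) & TOGGLE_BIT) == 0:
            return clock() - start  # bit stopped toggling: operation done
        if clock() - start > timeout_s:
            raise TimeoutError("flash still busy after %.1f s" % timeout_s)
        previous = current


class SimulatedFlash:
    """Toy model: the toggle bit flips on each of the first `busy_reads` reads."""

    def __init__(self, busy_reads):
        self.busy_reads = busy_reads
        self.status = 0x00

    def read_status(self):
        if self.busy_reads > 0:
            self.busy_reads -= 1
            self.status ^= TOGGLE_BIT
        return self.status


flash = SimulatedFlash(busy_reads=5)
elapsed = wait_until_ready(flash.read_status, timeout_s=25.0)
print("ready after %.6f s" % elapsed)
```

Besides being faster, polling with a timeout turns "the operation never finished" into an explicit error, which a fixed delay followed by a blind read cannot report.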
On 14.8.2014 г. 23:32, tim wrote:
> [...]
Hey Tim,

I think it does have some indication, but as far as I remember the point of using it is only to speed things up. Then, although my volumes are small, I have had a good number of flash chips in the same place and never had a problem. This unit had no problem either when shipped 2 years or so ago.

Dimiter
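For a sense of how much time the fixed delays cost, the figures given in the thread can be plugged into a short calculation. This is a sketch in Python; it assumes "2M" means 2 MiB of byte-wide data and that every byte is programmed with the full 120 µs wait, which is how the procedure was described.

```python
FLASH_BYTES = 2 * 1024 * 1024  # "2M" from the thread, taken as 2 MiB (assumption)
ERASE_WAIT_S = 25.0            # fixed wait after the chip-erase command
PROGRAM_WAIT_S = 120e-6        # fixed wait per programmed byte


def fixed_delay_total_s(n_bytes, erase_wait_s, program_wait_s):
    """Total time spent waiting when every operation uses its maximum delay."""
    return erase_wait_s + n_bytes * program_wait_s


total = fixed_delay_total_s(FLASH_BYTES, ERASE_WAIT_S, PROGRAM_WAIT_S)
print(f"{total:.1f} s")  # prints 276.7 s
```

So a full erase-and-program pass spends a bit over four and a half minutes in fixed waits under these assumptions; how much of that status polling would recover depends on the part's typical (rather than maximum) times, which the thread does not give.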
On 14/08/14 21:58, Dimiter_Popoff wrote:
> [...]
The trouble is, an event like this is probably a one-off. Without having at least a few cases, there is no way to get a pattern - any theories will therefore be pure guesswork.

For my own experience, I have had flashes fail on occasion, but never in a way that suggests some parts fail after only a few cycles. Although flash cycle lifetimes are only guaranteed statistically (i.e., the manufacturer says that on average only 1 in x thousand parts will fail after y thousand cycles), you are not going to get a failure after a few cycles without some serious problem or damage to the part.
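To put numbers on "a few cycles", the two figures mentioned earlier in the thread (a 100k-cycle datasheet endurance and at most about 20 erase/program cycles on this part) can be compared directly. A tiny Python sketch, using only those quoted figures:

```python
RATED_CYCLES = 100_000  # endurance figure quoted from the datasheet in the thread
CYCLES_USED = 20        # upper estimate of erase/program cycles given in the thread


def endurance_fraction(cycles_used, rated_cycles):
    """Fraction of the rated erase/program endurance already consumed."""
    return cycles_used / rated_cycles


fraction_used = endurance_fraction(CYCLES_USED, RATED_CYCLES)
print(f"{fraction_used:.4%} of rated endurance consumed")  # prints 0.0200% ...
```

At two hundredths of a percent of the rated figure, ordinary wear is a very unlikely explanation, which is consistent with the view above that damage to the part is the more plausible cause.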
Hi Dimiter,

On 8/14/2014 2:58 AM, Dimiter_Popoff wrote:
> I had to replace a flash chip yesterday because it would
> not read OK at times - and had a byte fail after programming
> about all the 2M of it.
> Since I don't do large volumes this is my first ever experience
> of the kind and I want to ask the group for some insight.
>
> The part was an Atmel at49bv163d ; it was on a netmca board
> which came for repair (torn power cable...) after a few years
> at the customer.
> After replacing it (with an equivalent) things went back to normal.

Presumably NOR flash for program storage (i.e., XIP)?

> The device had been opened and tampered with. I think even the flash
> may have been unsoldered and resoldered but it has been done
> reasonably clean so I can't say that for sure. [Reverse engineering
> the netMCA could be considered only by someone uncapable of
> understanding how hopeless the task is, so needing to unsolder
> the flash rather than JTAG reading it via the CPU as we do it
> here and pose no obstacles to might have been opted for....].

Given our past discussions in this regard, I'd be willing to bet this is *exactly* what happened! And, that the adversary didn't really care to "understand" the design; rather, to be able to *duplicate* it (counterfeit). This is actually pretty common :< Sure, the thief can't "support" the product to the extent that they could make "upgrades". But, for an "as is" product, their development costs are *zip*!

> So my question is how likely is it for a flash chip like that
> to wear off after 2-3 times initial programming? Yesterday I
> did rewrite it may be 10 times but it showed the symptoms from
> the very start.
>
> My curiosity is directed towards the flash durability and perhaps
> towards what has been done to the device.

I'd guess they tried to access it in some more "conventional" reader/programmer (e.g., as a "ROM") and screwed the pooch in the process.
