Atmel SAM9 "boot from NAND" is a myth?

Started by Grant Edwards September 15, 2010
On 2010-09-16, Anders.Montonen@kapsi.spam.stop.fi.invalid <Anders.Montonen@kapsi.spam.stop.fi.invalid> wrote:
> Grant Edwards <invalid@invalid.invalid> wrote:
>> On 2010-09-16, Anders.Montonen@kapsi.spam.stop.fi.invalid <Anders.Montonen@kapsi.spam.stop.fi.invalid> wrote:
>>> Coincidentally Samsung's own K9F1208 is a 3.3V part and guarantees an
>>> error-free block zero.
>> Ah. That appears to be a lower-density small-block part (512Mb). I'm
>> not sure what size we told the rep, but it looks like it was 1Gb or
>> based on the datasheets I've seen so far. We're also only interested
>> in parts we can actually buy, so it's also possible the K9F1208 isn't
>> available through distribution.
>
> They have other parts, like the K9F8G08 (1GB, 4K pages). Dunno about
> availability.

Yup, I found those, but was unable to figure out how to download a datasheet.

So far, it appears to me that older, lower-density parts are much more likely to guarantee block 0 without requiring ECC than the newer, higher-density parts.

--
Grant Edwards, grant.b.edwards at gmail.com
Yow! Hello. Just walk along and try NOT to think about your INTESTINES being almost FORTY YARDS LONG!!
On 16-9-2010 20:46, Grant Edwards wrote:
> On 2010-09-16, Wil Taphoorn <wil@nogo.wtms.nl> wrote:
>> One of the specs is that NAND's *must* have at least one guaranteed
>> valid block starting at address 0.
>
> Can you please point to that in the spec?
>
> All I can find is the definition for one of the fields in the
> parameter block that tells how many valid blocks there are at the
> beginning of the device:
>
> Nowhere does it say this number can't be 0
> [..] The minimum value for this field is 1h.
Since that field is mandatory this line reads "at least one" to me.
>   The blocks are guaranteed to be valid for the endurance
>   specified for this area (see section 5.6.1.23) when the host
>   follows the specified number of bits to correct.
>
> .. and it explicitly says they can require that the host do ECC
> for those "guaranteed valid" blocks.

Doesn't that mean that the programming device that is writing the boot sector has to verify for errors and, if it finds any, reject the device?

-- Wil
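[Editor's sketch, not part of the original thread.] The parameter-page fields being argued about can be decoded with a few lines of Python. This is a minimal sketch under stated assumptions: the byte offsets (107, 108-109, 112) and the value/multiplier encoding of the endurance field are my reading of the ONFI memory-organization block and should be checked against the spec revision you are using; `boot_block_guarantee` is a hypothetical helper name; and the 256-byte page is assumed to have been read from the device already. The page built at the end is synthetic, not data from a real part.

```python
# Offsets into the 256-byte ONFI parameter page. ASSUMPTION: these come from
# the ONFI memory-organization block as I understand it; verify in the spec.
OFF_GUARANTEED_VALID_BLOCKS = 107  # 1 byte; the spec text says the minimum is 1h
OFF_GUARANTEED_ENDURANCE = 108     # 2 bytes: value, then power-of-ten multiplier
OFF_ECC_BITS = 112                 # 1 byte: number of bits the host must correct


def boot_block_guarantee(param_page: bytes) -> dict:
    """Summarise what the parameter page promises about the first block(s)."""
    if len(param_page) < 256 or param_page[:4] != b"ONFI":
        raise ValueError("not an ONFI parameter page")
    value = param_page[OFF_GUARANTEED_ENDURANCE]
    exponent = param_page[OFF_GUARANTEED_ENDURANCE + 1]
    ecc_bits = param_page[OFF_ECC_BITS]
    return {
        "guaranteed_valid_blocks": param_page[OFF_GUARANTEED_VALID_BLOCKS],
        "guaranteed_endurance_cycles": value * 10 ** exponent,
        "ecc_bits_required": ecc_bits,
        # The guarantee holds only "when the host follows the specified
        # number of bits to correct", so a nonzero value means ECC is assumed.
        "guarantee_assumes_ecc": ecc_bits > 0,
    }


# Synthetic page for illustration only.
page = bytearray(256)
page[0:4] = b"ONFI"
page[107] = 1                    # one guaranteed valid block
page[108:110] = bytes([1, 3])    # 1 x 10^3 cycles
page[112] = 1                    # 1-bit correction expected
info = boot_block_guarantee(bytes(page))
```

With a page like this, the "guaranteed valid" block count is 1, yet `guarantee_assumes_ecc` is true, which is exactly the combination Grant is pointing at.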
On 2010-09-16, Wil Taphoorn <wil@nogo.wtms.nl> wrote:
> On 16-9-2010 20:46, Grant Edwards wrote:
>> On 2010-09-16, Wil Taphoorn <wil@nogo.wtms.nl> wrote:
>>> One of the specs is that NAND's *must* have at least one guaranteed
>>> valid block starting at address 0.
>>
>> Can you please point to that in the spec?
>>
>> All I can find is the definition for one of the fields in the
>> parameter block that tells how many valid blocks there are at the
>> beginning of the device:
>>
>>   The blocks are guaranteed to be valid for the endurance specified
>>   for this area (see section 5.6.1.23) when the host follows the
>>   specified number of bits to correct.
>>
>> .. and it explicitly says they can require that the host do ECC
>> for those "guaranteed valid" blocks.
>
> Doesn't that mean that the programming device that is writing the
> boot sector has to verify for errors and, if so, reject the device?

You mean that block 0 is guaranteed good _if_ the customer throws out any devices they find with a bad block 0?

Or, to phrase it differently: "Block 0 is guaranteed to be valid in all devices that have a valid block 0."

That's a statement so meaningless that even George Bush would be proud of it. ;)

--
Grant Edwards, grant.b.edwards at gmail.com
Yow! I smell like a wet reducing clinic on Columbus Day!
On 16-9-2010 22:21, Grant Edwards wrote:
> On 2010-09-16, Wil Taphoorn <wil@nogo.wtms.nl> wrote:
>> Doesn't that mean that the programming device that is writing the
>> boot sector has to verify for errors and, if so, reject the device?
>
> You mean that block 0 is guaranteed good _if_ the customer throws out
> any devices they find with a bad block 0?

No, I expect that this block can -at least once- be written without any bit errors (i.e. able to boot without ECC considerations).

What I meant is that it is up to the design to take risks of reprogramming this boot sector.

-- Wil
On 2010-09-16, Wil Taphoorn <wil@nogo.wtms.nl> wrote:
> On 16-9-2010 18:40, Grant Edwards wrote:
>> On 2010-09-16, tim.... <tims_new_home@yahoo.co.uk> wrote:
>>>
>>> But don't all manufacturers guarantee this?
>>
>> Nope.
>
> See what they (Hynix, Intel, Micron, Phison, SanDisk, Sony, Spansion)
> have spec'ed about NAND at http://www.onfi.org
> One of the specs is that NAND's *must* have at least one guaranteed
> valid block starting at address 0.
>
>> For example, from Micron's datasheet for the MT29F1G08/MT29F1G16:
>>
>>  * Blocks 0–7 (block address 00h-07h) guaranteed to be valid
>>    with ECC when shipped from factory (3.3V only); see Error
>>    Management (page 83).
>
> From the Hynix HY27USXX561A data sheet:
> - 3.3V device: VCC = 2.7 to 3.6V :
> - The 1st block is guaranteed to be a valid block up to 1K cycles
>   without ECC

OK, I've gotten more clarification from the hardware guys. One of the requirements is availability in a BGA package. The above Hynix part appears to be rather old and only available in TSOP. If you look at the newer large-block Hynix parts (which are available in BGA) such as the HY27UF081G2A, the datasheet says:

  The 1st block is guaranteed to be a valid block up to 1K cycles with ECC. (1bit/528bytes)

--
Grant Edwards, grant.b.edwards at gmail.com
Yow! Psychoanalysis?? I thought this was a nude rap session!!!
On 2010-09-16, Wil Taphoorn <wil@nogo.wtms.nl> wrote:
> On 16-9-2010 22:21, Grant Edwards wrote:
>> On 2010-09-16, Wil Taphoorn <wil@nogo.wtms.nl> wrote:
>>> Doesn't that mean that the programming device that is writing the
>>> boot sector has to verify for errors and, if so, reject the device?
>>
>> You mean that block 0 is guaranteed good _if_ the customer throws out
>> any devices they find with a bad block 0?
>
> No, I expect that this block can -at least once- be written without any
> bit errors (i.e. able to boot without ECC considerations).

It doesn't say that anywhere in the spec. What it says is this:

  The blocks are guaranteed to be valid for the endurance specified for this area (see section 5.6.1.23) when the host follows the specified number of bits to correct.

Note the last phrase:

  "when the host follows the specified number of bits to correct"

The blocks are only guaranteed valid _if_ you do ECC to correct the specified number of bit-errors.
> What I meant is that it is up to the design to take risks of
> reprogramming this boot sector.

OK, I understand what you mean. But that's not what the ONFI spec says, and the datasheets for many vendors' parts specifically state that you must do ECC if you expect block 0 to be valid.

--
Grant Edwards, grant.b.edwards at gmail.com
Yow! Kids, don't gross me off ... "Adventures with MENTAL HYGIENE" can be carried too FAR!
On 16-9-2010 23:44, Grant Edwards wrote:
> On 2010-09-16, Wil Taphoorn <wil@nogo.wtms.nl> wrote:
>>
>> No, I expect that this block can -at least once- be written without any
>> bit errors (i.e. able to boot without ECC considerations).
>
> It doesn't say that anywhere in the spec.
>
>   The blocks are guaranteed to be valid for the endurance
>   specified for this area (see section 5.6.1.23) when the host
>   follows the specified number of bits to correct.
>
> Note the last phrase:
>
>   "when the host follows the specified number of bits to correct"
>
> The blocks are only guaranteed valid _if_ you do ECC to correct the
> specified number of bit-errors.

True, "for the endurance specified", AKA "a number of times programmed".

But that doesn't mean you can't program it the first time.

That is what I meant by "expect", I would not accept a device that flips a bit on the very first time it was programmed.

-- Wil
On 2010-09-16, Wil Taphoorn <wil@nogo.wtms.nl> wrote:
> On 16-9-2010 23:44, Grant Edwards wrote:
>> On 2010-09-16, Wil Taphoorn <wil@nogo.wtms.nl> wrote:
>>>
>>> No, I expect that this block can -at least once- be written without any
>>> bit errors (i.e. able to boot without ECC considerations).
>>
>> It doesn't say that anywhere in the spec.
>>
>>   The blocks are guaranteed to be valid for the endurance
>>   specified for this area (see section 5.6.1.23) when the host
>>   follows the specified number of bits to correct.
>>
>> Note the last phrase:
>>
>>   "when the host follows the specified number of bits to correct"
>>
>> The blocks are only guaranteed valid _if_ you do ECC to correct the
>> specified number of bit-errors.
>
> True, "for the endurance specified", AKA "a number of times
> programmed".
>
> But that doesn't mean you can't program it the first time.
That's immaterial. What's important is that it doesn't mean that you _can_ program it the first time. (without ECC)
> That is what I meant by "expect", I would not accept a device that
> flips a bit on the very first time it was programmed.

I know what you meant by "expect", but I doubt that what you expect determines what a fab ships. A bit can fail the first time you program block 0, and it will still meet the spec. That's what matters.

You can expect all sorts of things, but if a feature isn't in the part's specification, then it's foolish to design a product that depends on that feature.

The last batch of NAND chips I played with had 0 bad blocks. I can "expect" 0 bad blocks all I want, but that's not going to stop the vendor from shipping parts with up to 20 bad blocks out of 1024 next week. A design that relies on NAND parts having 0 bad blocks is a bad idea no matter how hard I expect 100% good blocks.

-- Grant
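[Editor's sketch, not part of the original thread.] One common way to avoid depending on a part having zero bad blocks is to skip bad blocks when laying out data. The following is a minimal sketch of that idea, assuming a hypothetical `is_bad` predicate (how a bad block is detected is device-specific and not shown) and a hypothetical helper name `build_block_map`. It is an illustration of the design principle, not anyone's actual flash translation layer.

```python
from typing import Callable, List


def build_block_map(total_blocks: int,
                    is_bad: Callable[[int], bool],
                    needed: int) -> List[int]:
    """Map logical blocks 0..needed-1 onto the first `needed` good physical blocks."""
    mapping: List[int] = []
    for phys in range(total_blocks):
        if len(mapping) == needed:
            break
        if not is_bad(phys):
            mapping.append(phys)  # logical index is the position in the list
    if len(mapping) < needed:
        raise RuntimeError("not enough good blocks for this layout")
    return mapping


# Illustrative numbers from the post: up to 20 bad blocks out of 1024.
# Which blocks are bad here is made up for the example.
bad = set(range(100, 120))
layout = build_block_map(1024, lambda b: b in bad, 1024 - 20)
```

Budgeting only 1004 logical blocks means the layout still fits when the vendor ships the worst case the datasheet allows, rather than the best case seen in the last batch.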
> All "hardware ECC support" I have seen so far is useless for anything
> but older, smaller SLC parts. "Hardware ECC support" is doing a Hamming
> code in hardware, which can correct a single bit error. Current large
> SLC parts, and MLC parts, need a 4- or even 8-bit-correcting code.

IMHO it is actually worse. The way many NAND datasheets are written, they allow for more than just 1 or 4 or 8 bad bits in a block. A certain number of blocks could go away COMPLETELY, and the part would still be in-spec.

People commonly expect bad blocks to have more bit errors than their ECC copes with. However, nothing in the datasheets guarantees this. For all I know, blocks could just as well become all 1. Or all 0. Or return read timeout. Or worse, they could become "unerasable" - stuck at your own previous content (with your headers, and valid ECC!). Now I want to see how your FTL copes with that!
Wil Taphoorn wrote:
> On 16-9-2010 20:46, Grant Edwards wrote:
>>   The blocks are guaranteed to be valid for the endurance
>>   specified for this area (see section 5.6.1.23) when the host
>>   follows the specified number of bits to correct.
>>
>> .. and it explicitly says they can require that the host do ECC
>> for those "guaranteed valid" blocks.
>
> Doesn't that mean that the programming device that is writing the
> boot sector has to verify for errors and, if so, reject the device?

I interpret that to mean that the boot sector can consist of X perfectly reliable bits and Y unreliable bits (e.g. permanently zero). The boot loader would then have to ECC-correct the unreliable bits each time it loads, and the manufacturer guarantees that Y doesn't grow above the ECC requirements.

Stefan
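[Editor's sketch, not part of the original thread.] Stefan's reading can be illustrated with a toy single-error-correcting code. The sketch below uses a textbook Hamming(7,4) code, which is an assumption made for brevity: real NAND controllers use Hamming codes over much larger units (for example the 1 bit per 528 bytes quoted earlier) or stronger codes. The function names are hypothetical. A "stuck at zero" cell is modelled by clearing one bit of the stored codeword.

```python
from typing import Tuple


def hamming74_encode(nibble: int) -> int:
    """Encode 4 data bits into a 7-bit codeword; bit i holds position i+1."""
    d = [(nibble >> i) & 1 for i in range(4)]
    p1 = d[0] ^ d[1] ^ d[3]  # covers positions 1, 3, 5, 7
    p2 = d[0] ^ d[2] ^ d[3]  # covers positions 2, 3, 6, 7
    p4 = d[1] ^ d[2] ^ d[3]  # covers positions 4, 5, 6, 7
    bits = [p1, p2, d[0], p4, d[1], d[2], d[3]]
    return sum(b << i for i, b in enumerate(bits))


def hamming74_decode(code: int) -> Tuple[int, int]:
    """Return (nibble, syndrome); a nonzero syndrome is the position flipped back."""
    bits = [(code >> i) & 1 for i in range(7)]
    syndrome = 0
    for pos in range(1, 8):
        if bits[pos - 1]:
            syndrome ^= pos  # XOR of set positions is 0 for a valid codeword
    if syndrome:
        bits[syndrome - 1] ^= 1  # correct the single suspected bit
    nibble = bits[2] | (bits[4] << 1) | (bits[5] << 2) | (bits[6] << 3)
    return nibble, syndrome


stored = hamming74_encode(0b1111)
one_stuck = stored & ~(1 << 4)   # one cell permanently reads 0
two_stuck = stored & ~0b11       # two cells permanently read 0
recovered, _ = hamming74_decode(one_stuck)   # still the original nibble
corrupted, _ = hamming74_decode(two_stuck)   # silently wrong
```

With one stuck cell (Y within the code's capability) every load recovers the data; with two, the decoder "corrects" the wrong bit and returns bad data without complaint, which is why the guarantee only makes sense together with the stated number of bits to correct.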