
Crc16 on power failure

Started by maxt...@libero.it January 18, 2007
David R Brooks wrote:
> CBFalconer wrote:
>> Francois Grieu wrote:
>>> CBFalconer <cbfalconer@yahoo.com> wrote:
>>>
>>>> If you can find my old release of CCITCRC
>>>> for CP/M the code will be in there. It was also included in some
>>>> Turbo Pascal units I released way back when. If you do find any of
>>>> these, please let me know.
>>> The best I can find is what appears to be your object code
>>> http://www.e-tech.net/~pbetti/mirrors/www.retroarchive.org/cpm/cdrom/LAMBDA/SOUNDPOT/A/CCITCRC.LBR
>>>
>> That appears to be the object code and documentation in crunched
>> and squeezed format. With some effort I can extract and
>> disassemble parts of it, but not now. Why did people strip the
>> source from the libraries!
>>
> I've unpacked it to CCITCRC.CZM and CCITCRC.DZC (thanks, Z80EMU :)
> If anyone can tell me what the correct unpacker is for ?Z? files, I'll
> take it the rest of the way, & post the results. (I'm sure the relevant
> decompressor is on my Walnut Creek CD: I just forget which it is).
>
> I may even try running a Z80 disassembler on the .COM
OK, I found the decompressor (uncr233.exe). I've put the documentation and
the disassembled source on my website:
www.iinet.net.au/~daveb/buffer/ccitcrc.zip
I'll leave it to others to tease out the interesting bit :)
CBFalconer wrote:
> David R Brooks wrote:
>> CBFalconer wrote:
>>
> ... snip ...
>>> That appears to be the object code and documentation in crunched
>>> and squeezed format. With some effort I can extract and
>>> disassemble parts of it, but not now. Why did people strip the
>>> source from the libraries!
>> I've unpacked it to CCITCRC.CZM and CCITCRC.DZC (thanks, Z80EMU :)
>> If anyone can tell me what the correct unpacker is for ?Z? files, I'll
>> take it the rest of the way, & post the results. (I'm sure the relevant
>> decompressor is on my Walnut Creek CD: I just forget which it is).
>>
>> I may even try running a Z80 disassembler on the .COM
>
> If you are running a CPeMulator, you can get LT31 from my page. It
> will unpack them all.
>
> <http://cbfalconer.home.att.net/download/cpm/>
>
> If you disassemble just search the results for the DAA
> instruction. I think it occurs twice in the subroutine, and
> nowhere else.
Hmm, having disassembled that code, I can't find a DAA instruction in it? Maybe the disassembler is acting up, but it doesn't seem so. As per my other post, that code is at http://members.iinet.com.au/~daveb/buffer/ccitcrc.zip
David R Brooks wrote:
> CBFalconer wrote:
>> David R Brooks wrote:
>>> CBFalconer wrote:
>>>
>> ... snip ...
>>>> That appears to be the object code and documentation in crunched
>>>> and squeezed format. With some effort I can extract and
>>>> disassemble parts of it, but not now. Why did people strip the
>>>> source from the libraries!
>>>
>>> I've unpacked it to CCITCRC.CZM and CCITCRC.DZC (thanks, Z80EMU :)
>>> If anyone can tell me what the correct unpacker is for ?Z? files, I'll
>>> take it the rest of the way, & post the results. (I'm sure the relevant
>>> decompressor is on my Walnut Creek CD: I just forget which it is).
>>>
>>> I may even try running a Z80 disassembler on the .COM
>>
>> If you are running a CPeMulator, you can get LT31 from my page. It
>> will unpack them all.
>>
>> <http://cbfalconer.home.att.net/download/cpm/>
>>
>> If you disassemble just search the results for the DAA
>> instruction. I think it occurs twice in the subroutine, and
>> nowhere else.
>>
> Hmm, having disassembled that code, I can't find a DAA instruction
> in it? Maybe the disassembler is acting up, but it doesn't seem so.
> As per my other post, that code is at
> http://members.iinet.com.au/~daveb/buffer/ccitcrc.zip
I can't either. It may be a very early version, before I found the
high-speed code. Too bad the library had been repacked, as I normally put
date stamps in my LBRs.

I may pass the source through id2id to make things more readable. Your
disassembler did a nice job; whose is it? I can spot my techniques in its
output quite nicely. You might want to try id2id-20 also, available on my
download page.

--
<http://www.cs.auckland.ac.nz/~pgut001/pubs/vista_cost.txt>
"A man who is right every time is not likely to do very much."
   -- Francis Crick, co-discoverer of DNA
"There is nothing more amazing than stupidity in action."
   -- Thomas Matthews
CBFalconer wrote:
> David R Brooks wrote:
>> CBFalconer wrote:
>>> David R Brooks wrote:
>>>> CBFalconer wrote:
>>>>
>>> ... snip ...
>>>>> That appears to be the object code and documentation in crunched
>>>>> and squeezed format. With some effort I can extract and
>>>>> disassemble parts of it, but not now. Why did people strip the
>>>>> source from the libraries!
>>>> I've unpacked it to CCITCRC.CZM and CCITCRC.DZC (thanks, Z80EMU :)
>>>> If anyone can tell me what the correct unpacker is for ?Z? files, I'll
>>>> take it the rest of the way, & post the results. (I'm sure the relevant
>>>> decompressor is on my Walnut Creek CD: I just forget which it is).
>>>>
>>>> I may even try running a Z80 disassembler on the .COM
>>> If you are running a CPeMulator, you can get LT31 from my page. It
>>> will unpack them all.
>>>
>>> <http://cbfalconer.home.att.net/download/cpm/>
>>>
>>> If you disassemble just search the results for the DAA
>>> instruction. I think it occurs twice in the subroutine, and
>>> nowhere else.
>>>
>> Hmm, having disassembled that code, I can't find a DAA instruction
>> in it? Maybe the disassembler is acting up, but it doesn't seem so.
>> As per my other post, that code is at
>> http://members.iinet.com.au/~daveb/buffer/ccitcrc.zip
>
> I can't either. It may be a very early version, before I found the
> high speed code. Too bad the library had been repacked, as I
> normally put date stamps in my LBRs.
>
> I may pass the source through id2id to make things more readable.
> Your disassembler did a nice job, whose is it?. I can spot my
> techniques in its output quite nicely. You might want to try
> id2id-20 also. Available on my download page.
That disassembler is one I wrote myself, many years ago. I could post it, if anyone's interested.
In article <1169136383.738070.263090@11g2000cwr.googlegroups.com>, "maxthebaz@libero.it" <maxthebaz@libero.it> wrote:
>Our machines have this requirement: if power failure occurs, many
>important variables are to be resumed from where they were interrupted
>after the machine is restarted (power on in this case). In other words,
>the basic idea is to keep a snapshot of the state machine before it is
>interrupted.
>The board is provided with:
>- a 32-bit H8S/2633 Hitachi microprocessor;
>- a battery-backed memory (BBM), where these variables are stored; BBM
>area involved is about 16 Kbytes (the whole BBM has a 128KB capability)
>- 2 big capacitors; if a blackout occurs, they guarantee a 400 msec
>(Tsave) extra power supply time.
>When power supply is going to fall down, a function is invoked by power
>failure NMI. This function, within Tsave time, has to perform the
>following main operations:
>- it calculates CRC16 checksum for the BBM variable area (for our 16KB,
>this requires a long time: 90 msec!).
>- it saves the CRC16 checksum in BBM (of course, in a different BBM
>address from the previous variable area).
>Then, when machine is re-started, a new checksum of the interested BBM
>area is performed: the result is compared with the previous stored one.
>If they differ, a BBM corruption is assumed (error detection).
>
>Now I am seeking a better solution: the target is to reduce the 2 big
>capacitors, i.e. to reduce Tsave time. The reason is to save space (and
>money) by reducing them. I'm looking for a way to anticipate CRC16
>calculation in a safe and fast way, before power failure.
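The 90 msec figure quoted above is plausible for a straight bit-at-a-time CRC16 over 16 KB (16384 bytes times 8 shift/XOR steps each) on a small micro. As a minimal sketch of that kind of routine; the CCITT polynomial 0x1021 and 0xFFFF preset are assumptions here, since the post does not say which CRC-16 variant the firmware actually uses:

```c
#include <stdint.h>
#include <stddef.h>

/* Bit-at-a-time CRC-16 over a buffer (MSB-first, polynomial 0x1021,
 * preset 0xFFFF -- the CRC-CCITT variant; an assumption, not
 * necessarily the OP's exact polynomial). */
uint16_t crc16_ccitt(const uint8_t *data, size_t len)
{
    uint16_t crc = 0xFFFF;
    while (len--) {
        crc ^= (uint16_t)*data++ << 8;
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 0x8000) ? (uint16_t)(crc << 1) ^ 0x1021
                                 : (uint16_t)(crc << 1);
    }
    return crc;
}
```

On restart the same function would be run over the 16 KB area and the result compared with the stored value, as the post describes.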
One technique that I have used in the past is to connect the backup battery to the CPU through a pair of diodes and some kind of switch (either a MOSFET or a relay). When the main power goes out, the CPU can save its state to BBM and then turn off the switch. This way it doesn't matter if you need 90 ms or 400 ms. Backup capacitors are not required at all and the net effect on battery life is minimal. This of course assumes that your battery has a high enough voltage and low enough ESR to run your CPU. --Tom.
>How does even the crudest of checksum schemes fail to detect a single bit
>error? 1 bit of parity will detect 100% of single bit errors.
So, if you can limit your errors to only one bit, you can use whatever is the fastest. That's what we need, an error bit reducer! :-)
>In the OP's application checking the integrity of 16kB of battery backed
>RAM why would one particularly expect single bit errors or bursts of
>errors?
Because they might happen. How can you accurately predict what will happen to the contents of RAM if a battery is getting low, or is intermittent?
> What is the concept of a burst when you can arbitrarily choose how
>to arrange this large array of bits into a stream?
I am pretty sure Crenshaw's article was written with serial communications in mind. But, memories can also fail such that as bytes are processed, there are groups or "bursts" of errors. Yes, you can arrange the array any way you want, but how do you know how to arrange it for better error detection?
>What I would particularly expect is complete corruption because the battery
>failed or was disconnected or the system crashed and never generated a
>checksum. I would consider the possibility of partial corruption perhaps
>from a failing battery to be pretty slim.
Then you have to ask yourself if you are willing to take the chance at having an error go undetected. You might be right about the corruption patterns in memory (although I have seen fairly random looking ones too), but what happens if it does not fail that way?
>IMO the OP would be better off using a larger simpler checksum, he should
>also guard against possible complete corruption states (all 0's, all 1's,
>alternate 0/0xFF's ?) giving a valid checksum. He should probably include
>and check a 'magic' number in the data area and perhaps link this to
>firmware version as a firmware upgrade may render the RAM image corrupt
>despite a valid checksum. He should probably invalidate the checksum after
>use to reduce the possibility that partial non-random data changes followed
>by a crash leave the old checksum valid.
There are several "tricks" you can play in an attempt to reduce the
computing cost of a high-reliability detection scheme. I know, I have used
many of them. But all of them are compromises.

The question that needs to be asked before any compromise is made is,
"What error rate can I afford to tolerate?" IMO, the OP should start with
a CRC calculation and try every possible way to make it work. Then, if it
is not possible to compute the CRC, he should look at the next best
compromise to see what it delivers in terms of errors detected and at what
computing cost.
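One standard way to cut the computing cost without compromising detection strength is a table-driven CRC-16: a 256-entry table (512 bytes of ROM or RAM) replaces the eight per-byte shift/XOR steps with a single lookup. Passing the running CRC in and out also lets the checksum be built up incrementally during normal operation, which is one way to "anticipate" the work before the power-fail NMI fires. This is a sketch, not the OP's code, and the 0x1021 polynomial is again an assumption:

```c
#include <stdint.h>
#include <stddef.h>

static uint16_t crc_table[256];

/* Build the lookup table once at startup (polynomial 0x1021 assumed). */
void crc16_init_table(void)
{
    for (int i = 0; i < 256; i++) {
        uint16_t crc = (uint16_t)(i << 8);
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 0x8000) ? (uint16_t)(crc << 1) ^ 0x1021
                                 : (uint16_t)(crc << 1);
        crc_table[i] = crc;
    }
}

/* Table-driven update: one lookup and one XOR per byte. Taking the
 * running CRC as a parameter allows the 16 KB area to be checksummed
 * incrementally, a few bytes per pass of the main loop. */
uint16_t crc16_table(const uint8_t *data, size_t len, uint16_t crc)
{
    while (len--)
        crc = (uint16_t)(crc << 8) ^ crc_table[(crc >> 8) ^ *data++];
    return crc;
}
```

The table-driven form produces exactly the same result as the bit-at-a-time loop, so the two can be mixed (e.g. table in the NMI handler, bitwise for a self-test).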
Mr. C wrote:

> I am pretty sure Crenshaw's article was written with serial
> communications in mind. But, memories can also fail such that as
> bytes are processed, there are groups or "bursts" of errors. Yes, you
> can arrange the array any way you want, but how do you know how to
> arrange it for better error detection?
The definition of 'burst', as it is used in articles about CRC checksums,
is a series of consecutive bits, that are *all* in error. So, a burst
error of 8 bits may turn 0x00 into 0xff, or 0xaa into 0x55.

Now, errors often tend to occur in bursts, but that doesn't mean each
individual bit is neatly inverted. A burst error that happens in typical
communication links may have a high probability of toggling a bit, but
usually that probability is less than 50%. For example, you may have a
case where a burst of bits is all cleared, turning 0xaa into 0x00... This
is however not a 'burst error' that CRC codes are supposed to protect
against.

In the case of memories, I'd expect burst errors that invert groups of
bits to be rare. Instead, I would expect blocks of memory to be zeroed,
overwritten with the wrong data/garbage, or randomized by power loss. CRC
codes aren't especially suited for any of these cases.
Arlet <usenet+5@c-scape.nl> wrote:

> Mr. C wrote:
>
> > I am pretty sure Crenshaw's article was written with serial
> > communications in mind. But, memories can also fail such that as
> > bytes are processed, there are groups or "bursts" of errors. Yes, you
> > can arrange the array any way you want, but how do you know how to
> > arrange it for better error detection?
>
> The definition of 'burst', as it is used in articles about CRC
> checksums, is a series of consecutive bits, that are *all* in error.
No it isn't. A "burst error" is one in which the first and last bits are
flipped, and the ones in the middle are randomly correct or wrong. A burst
error of length N may contain any number of single bit errors from 2 to N.

(Reference: Tanenbaum's "Computer Networks", all editions so far. It is on
page 196 of my 4th edition, in section 3.2.2 "Error-Detecting Codes". My
first edition has it on page 129, section 3.5.3.)
> Now, errors often tend to occur in bursts, but that doesn't mean each
> individual bit is neatly inverted. A burst error that happens in
> typical communication links may have a high probability of toggling a
> bit, but usually that probability is less than 50%. For example, you
> may have a case where a burst of bits is all cleared, turning 0xaa into
> 0x00... This is however not a 'burst error' that CRC codes are supposed
> to protect against.
Yes it is. A proper 16-bit CRC will detect errors in a transmission as
follows (from earlier editions of Tanenbaum):

- All single bit errors.
- All burst errors of length 16 or less.
- 99.997% of 17-bit burst errors.
- 99.998% of 18-bit or longer burst errors.

This is based on the definition of a burst error being as I described: the
first and last bit are flipped, all others in between are random.

There is a critical point, however: the CRC can only meet these standards
for a single error within a transmission (either a single bit error or a
single burst error). For example, if there were two single bit errors,
with 1000 valid bits in between, that is a 1002-bit burst error, not two
single bit errors, so the CRC has a 99.998% chance of catching it.
> In the case of memories, I'd expect burst errors that invert groups of
> bits to be rare. Instead, I would expect blocks of memory to be zeroed,
> overwritten with the wrong data/garbage, or randomized by power loss.
> CRC codes aren't especially suited for any of these cases.
Agreed, for the most part (ignoring the "inversion" aspect). If you are
using a CRC to detect memory errors, you have to consider the memory as a
bit array, with bit order within bytes determined by the order in which
bytes are shifted into the CRC calculation (low order bit first is more
common). If there are any errors in the memory, then the distance between
the first and last error bits determines the size of the "burst error",
and a 16-bit CRC is only guaranteed to detect an error where a single bit
is flipped, or where all errors occur within 16 bits of each other (either
two adjacent bytes, or straddling two byte boundaries). For anything
further apart it only has a high probability of detecting the error.

There are some tricks you can pull. For example, all memory being zeroed
is a common error situation. If you preset your CRC to zero, append the
CRC to the data and expect a zero remainder, then a zeroed memory block
will pass a CRC check. This can be avoided by storing the result of the
CRC calculation separately and comparing it with the result of the
previous calculation.

Note that this trick doesn't work for variable length data: a CRC preset
to zero with any number of zero bytes will result in a zero CRC remainder.
In this case, the best technique (e.g. used by HDLC in data transmission)
is to preset the CRC to 0xFFFF, append the complemented CRC to the data,
and include it in the calculation at the receiving end, where a constant
value is expected (0xF0B8 for HDLC, which uses the CRC-CCITT 16-bit
polynomial).

--
David Empson
dempson@actrix.gen.nz
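The HDLC convention described above can be sketched as follows. This uses the reflected (LSB-first) form of the CCITT polynomial (0x8408), as HDLC does on the wire: preset the register to 0xFFFF, append the one's complement of the register low byte first, and check for the constant remainder 0xF0B8 on the receiving side. A sketch for illustration, not a full HDLC implementation:

```c
#include <stdint.h>
#include <stddef.h>

/* CRC-CCITT in reflected (LSB-first) form, as used by HDLC/X.25:
 * polynomial 0x8408 (0x1021 bit-reversed), preset 0xFFFF.
 * Returns the raw register; the transmitted FCS is its complement. */
uint16_t crc16_x25(const uint8_t *data, size_t len)
{
    uint16_t crc = 0xFFFF;
    while (len--) {
        crc ^= *data++;
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 1) ? (uint16_t)(crc >> 1) ^ 0x8408
                            : (uint16_t)(crc >> 1);
    }
    return crc;
}

/* Append the complemented CRC to the message, low byte first, the
 * way HDLC transmits the FCS. buf must have room for len + 2. */
void fcs_append(uint8_t *buf, size_t len)
{
    uint16_t fcs = (uint16_t)~crc16_x25(buf, len);
    buf[len]     = (uint8_t)(fcs & 0xFF);
    buf[len + 1] = (uint8_t)(fcs >> 8);
}

/* Receiving-end check: running the CRC over data plus appended FCS
 * must leave the constant "good FCS" remainder 0xF0B8. */
int fcs_check(const uint8_t *buf, size_t len_with_fcs)
{
    return crc16_x25(buf, len_with_fcs) == 0xF0B8;
}
```

The point of the constant-remainder check is exactly what the post says: an all-zero block (data plus stored CRC) will not pass, because the 0xFFFF preset keeps it away from the zero fixed point.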
dempson@actrix.gen.nz (David Empson) wrote:

>No it isn't. A "burst error" is one in which the first and last bits are
>flipped, and the ones in the middle are randomly correct or wrong. A
>burst error of length N may contain any number of single bit errors from
>2 to N.
>Yes it is. A proper 16-bit CRC will detect errors in a transmission as
>follows (from earlier editions of Tanenbaum):
>
>- All single bit errors.
>- All burst errors of length 16 or less.
>- 99.997% of 17-bit burst errors.
>- 99.998% of 18-bit or longer burst errors.
So unless the errors in the OP's 16kB memory array are confined to a span
of 16 consecutive bits (in the order you choose to arrange them), a 16-bit
CRC has roughly a 1 in 2^16 (0.002%) chance of not detecting them, just
the same as a 16-bit checksum generated in a variety of simpler ways. This
is why I suggested a simpler 32-bit checksum, which would be more
effective and faster to calculate. The extra 16 bits is a trivial overhead
on 16kB of data.

--
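The poster doesn't name a specific "simpler 32-bit checksum", so as one plausible stand-in here is Fletcher-32: it needs only additions per 16-bit word (no per-bit shifting), so it is much cheaper per byte than a bitwise CRC, while still producing a 32-bit result. A sketch, assuming an even-length buffer read as little-endian 16-bit words:

```c
#include <stdint.h>
#include <stddef.h>

/* Fletcher-32 over an even-length byte buffer, read as 16-bit
 * little-endian words. One add per word versus eight shift/XOR
 * steps per byte for a bit-at-a-time CRC. */
uint32_t fletcher32(const uint8_t *data, size_t len)
{
    uint32_t sum1 = 0, sum2 = 0;
    size_t words = len / 2;
    while (words) {
        /* Reduce modulo 65535 every 359 words to avoid 32-bit
         * overflow of the running sums. */
        size_t block = words > 359 ? 359 : words;
        words -= block;
        while (block--) {
            sum1 += (uint16_t)(data[0] | (data[1] << 8));
            sum2 += sum1;
            data += 2;
        }
        sum1 = (sum1 & 0xFFFF) + (sum1 >> 16);
        sum2 = (sum2 & 0xFFFF) + (sum2 >> 16);
    }
    sum1 = (sum1 & 0xFFFF) + (sum1 >> 16);
    sum2 = (sum2 & 0xFFFF) + (sum2 >> 16);
    return (sum2 << 16) | sum1;
}
```

Note that a zeroed buffer yields a checksum of 0, so the complete-corruption guards discussed earlier in the thread (magic number, invalidating the stored checksum) still apply.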
>In the case of memories, I'd expect burst errors that invert groups of
>bits to be rare. Instead, I would expect blocks of memory to be zeroed,
>overwritten with the wrong data/garbage, or randomized by power loss.
>CRC codes aren't especially suited for any of these cases.
Then what would you recommend? As David Empson mentions in this thread,
the CRC-16 will catch 99.998% of situations where there are scattered
errors (i.e. long bursts), worst case. I would consider that to be "good
enough" for me.

Consider duplicate storage of the data, say in 2 places. Upon power-up the
two areas could be compared for equality. If they are not exactly the
same, there is an error somewhere. So the only way errors could go
undetected is if they appeared identically in BOTH memory areas. I wonder
what the probability of that would be? Any thoughts on that?
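The duplicate-storage idea above can be sketched as follows; the layout and names are hypothetical, assuming two independent 16 KB copies of the state in battery-backed memory:

```c
#include <stdint.h>
#include <string.h>

#define BBM_AREA_SIZE (16u * 1024u)  /* size of one copy; assumed */

/* Power-up integrity check by comparison: a byte-for-byte compare of
 * the two copies stands in for a checksum. Corruption goes undetected
 * only if both copies were damaged in exactly the same way. */
int bbm_copies_match(const uint8_t *copy_a, const uint8_t *copy_b)
{
    return memcmp(copy_a, copy_b, BBM_AREA_SIZE) == 0;
}
```

The trade-off versus a CRC is memory rather than time: the power-fail handler only has to finish writing the second copy (or mark which copy is current), and no checksum needs to be computed inside Tsave at all.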