Reply by Dimiter_Popoff February 26, 2021
On 2/26/2021 15:03, Richard Damon wrote:
> .....
> ... Yes, the hardware
> might do some tests and just fail a read that doesn't have good signal
> levels, but that is only done in expensive drives, normal drives just
> assume that if it gets a bit (or bits) wrong, the sector will fail the
> CRC check, so no need for the fancy (and expensive) analog hardware.
>
I remember perhaps the fanciest hardware for reading data from disk. It was one of my fingers, pushing the R/W head of my 8" floppy drive until it got the problematic sector right.... :).

Back then (mid-80s) all I had were two borrowed 8" drives (Bulgarian clones of sorts of some Shugart drive, or whatever it may have been), and on the first machine I designed, 6809 based, I managed a whole 512k using double density (via a uPD765), for which the drives were not meant. Nor was MDOS, which thought of 128-byte sectors only, so my ROM did 256-byte RMW; the 765 cannot do 128 bytes in DD.

And well, these drives taught me not to rely on a single backup drive.... More often than not things got really bad, never mind how many copies I did - or didn't - have, so it got down to rescuing sectors. That took careful listening to what the drive was doing, having its case open so its R/W head was at hand, and using the above said fancy hardware to eventually read the precious sector....

Dimiter

------------------------------------------------------
Dimiter Popoff, TGI http://www.tgi-sci.com
------------------------------------------------------
http://www.flickr.com/photos/didi_tgi/
Reply by David Brown February 26, 2021
On 26/02/2021 14:03, Richard Damon wrote:
> On 2/26/21 5:38 AM, David Brown wrote:
>> On 25/02/2021 15:44, mac wrote:
>>>
>>>> So, you've "spent" 4 bits and have only the ability to
>>>> detect *some* types of errors. (see below) The same applies
>>>> to transmitting two copies (if the two copies differ,
>>>> all you know is that AT LEAST one copy is incorrect;
>>>> so, should you then transmit a THIRD copy???)
>>>
>>> DECtape 1960s used three duplicated tracks for 3-bit data. With tape
>>> there’s the option of backing up and rereading
>>>
>>
>> With a digital tape recorder, you write two states - 0 or 1. But when
>> reading, you get an analogue value back. Depending on how you do this,
>> you can get a lot more information. For many kinds of transmission or
>> storage system, it's rare that there is a complete reversal of the bit -
>> more often, you have a low signal, an intermediary value, a missing
>> carrier frequency, etc.
>>
>> That's how RAID on disk systems work. If you have a RAID5 array with 10
>> disks, you have 9 disks worth of data and only one disk worth of simple
>> parity check, but you can correct an error - because your read errors
>> are marked as failures rather than returning incorrect data.
>>
>> If you have two copies of the information and you know one is suspect or
>> missing, then you use the other copy. If you have three copies, that
>> improves your reliability against errors, increases your confidence if
>> there is a single problem, and gives you a chance to do "majority
>> voting" if you can't figure out which bits are most reliable.
>>
>
> Actually, the RAID array knows which disk had the error because each
> sector also has a CRC checksum on it, so you know which of the sectors
> failed, so you know which one you need to recreate.
(Disks haven't used checks as simple as CRC for quite some time - they use multi-level schemes with Reed-Solomon or low-density parity-check codes to provide correction of a range of typical error patterns, as well as identifying many more possible errors.)

But however it is done, the disk returns either the correct data or an error code - it is extremely unlikely to return incorrect data that it thinks is correct, barring bugs in its software.

Thus - as I said in my post - at the RAID level, the software knows which blocks of data were read correctly. This lets it fill in the missing data in a much more efficient fashion than if it did not know which data is correct and which might be wrong. (At this point you are using erasure codes, rather than Hamming codes.)
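To make the erasure idea concrete, here is a minimal Python sketch of single-parity reconstruction. It is only an illustration under simplifying assumptions: the names `xor_blocks`, `make_stripe` and `rebuild_stripe` are made up for this example, blocks are tiny byte strings, and a read that the disk reported as failed is modelled as `None`. Real RAID implementations deal with striping, parity rotation and much more.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_stripe(data_blocks):
    """Return the data blocks followed by one XOR parity block."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def rebuild_stripe(stripe_as_read):
    """Fill in at most one block that the disk flagged as unreadable (None).

    Because the failed position is known (an erasure), a single parity
    block is enough: the missing block is the XOR of all the others.
    """
    missing = [i for i, block in enumerate(stripe_as_read) if block is None]
    if not missing:
        return list(stripe_as_read)
    if len(missing) > 1:
        raise ValueError("single parity cannot rebuild more than one block")
    survivors = [block for block in stripe_as_read if block is not None]
    repaired = list(stripe_as_read)
    repaired[missing[0]] = xor_blocks(survivors)
    return repaired


stripe = make_stripe([b"\x01\x02", b"\x10\x20", b"\xaa\x55"])
damaged = list(stripe)
damaged[1] = None                      # disk 1 reported a read error
print(rebuild_stripe(damaged) == stripe)
```

Note what the sketch cannot do: if one block came back silently wrong instead of as `None`, the parity would show that something is inconsistent but not which block to rebuild.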
> Yes, the hardware
> might do some tests and just fail a read that doesn't have good signal
> levels, but that is only done in expensive drives, normal drives just
> assume that if it gets a bit (or bits) wrong, the sector will fail the
> CRC check, so no need for the fancy (and expensive) analog hardware.
>
I think you are mixing things up a bit here - but it is not clear what levels you are discussing. There are /many/ steps between reading a single magnetic domain on the platter and giving a result back up the SATA (or whatever) line to the computer.

At the bottom end - the read head - you have analogue hardware. How much detail this reads and passes on to the next layer will depend on the technology used - some will just pass a best-guess 0 or 1, others will have more nuanced data.

Above that, each error checking and correcting layer will include error tracking information as well as data that is corrected (if possible), so that the higher layers can do a better job. (You have multiple layers so that you can handle problems with individual grains, and also physically larger problems that cover bigger blocks of bits.)
Reply by Richard Damon February 26, 2021
On 2/26/21 5:38 AM, David Brown wrote:
> On 25/02/2021 15:44, mac wrote:
>>
>>> So, you've "spent" 4 bits and have only the ability to
>>> detect *some* types of errors. (see below) The same applies
>>> to transmitting two copies (if the two copies differ,
>>> all you know is that AT LEAST one copy is incorrect;
>>> so, should you then transmit a THIRD copy???)
>>
>> DECtape 1960s used three duplicated tracks for 3-bit data. With tape
>> there’s the option of backing up and rereading
>>
>
> With a digital tape recorder, you write two states - 0 or 1. But when
> reading, you get an analogue value back. Depending on how you do this,
> you can get a lot more information. For many kinds of transmission or
> storage system, it's rare that there is a complete reversal of the bit -
> more often, you have a low signal, an intermediary value, a missing
> carrier frequency, etc.
>
> That's how RAID on disk systems work. If you have a RAID5 array with 10
> disks, you have 9 disks worth of data and only one disk worth of simple
> parity check, but you can correct an error - because your read errors
> are marked as failures rather than returning incorrect data.
>
> If you have two copies of the information and you know one is suspect or
> missing, then you use the other copy. If you have three copies, that
> improves your reliability against errors, increases your confidence if
> there is a single problem, and gives you a chance to do "majority
> voting" if you can't figure out which bits are most reliable.
>
Actually, the RAID array knows which disk had the error because each sector also has a CRC checksum on it, so you know which of the sectors failed, and therefore which one you need to recreate.

Yes, the hardware might do some tests and just fail a read that doesn't have good signal levels, but that is only done in expensive drives. Normal drives just assume that if they get a bit (or bits) wrong, the sector will fail the CRC check, so there is no need for the fancy (and expensive) analog hardware.
Reply by David Brown February 26, 2021
On 25/02/2021 15:44, mac wrote:
>
>> So, you've "spent" 4 bits and have only the ability to
>> detect *some* types of errors. (see below) The same applies
>> to transmitting two copies (if the two copies differ,
>> all you know is that AT LEAST one copy is incorrect;
>> so, should you then transmit a THIRD copy???)
>
> DECtape 1960s used three duplicated tracks for 3-bit data. With tape
> there’s the option of backing up and rereading
>
With a digital tape recorder, you write two states - 0 or 1. But when reading, you get an analogue value back. Depending on how you do this, you can get a lot more information. For many kinds of transmission or storage system, it's rare that there is a complete reversal of the bit - more often, you have a low signal, an intermediary value, a missing carrier frequency, etc.

That's how RAID on disk systems works. If you have a RAID5 array with 10 disks, you have 9 disks' worth of data and only one disk's worth of simple parity check, but you can correct an error - because your read errors are marked as failures rather than returning incorrect data.

If you have two copies of the information and you know one is suspect or missing, then you use the other copy. If you have three copies, that improves your reliability against errors, increases your confidence if there is a single problem, and gives you a chance to do "majority voting" if you can't figure out which bits are most reliable.
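For the three-copy case, bit-wise majority voting fits in one expression. The Python sketch below is only an illustration; the function name `majority_vote` and the 4-bit example values are assumptions for this example, and it treats all three copies as equally trustworthy, ignoring any signal-quality information the reader might have.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bit-wise majority of three copies: each output bit is the value
    that at least two of the three inputs agree on."""
    return (a & b) | (a & c) | (b & c)


sent = 0b1010
copy_a = sent ^ 0b0001   # bit 0 corrupted in the first copy
copy_b = sent ^ 0b0100   # bit 2 corrupted in the second copy
copy_c = sent            # third copy intact
print(format(majority_vote(copy_a, copy_b, copy_c), "04b"))
```

The vote recovers the data as long as no bit position is wrong in two of the three copies; if the same bit is corrupted twice, the wrong value wins silently.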
Reply by Don Y February 25, 2021
On 2/25/2021 7:44 AM, mac wrote:
>
>> So, you've "spent" 4 bits and have only the ability to
>> detect *some* types of errors. (see below) The same applies
>> to transmitting two copies (if the two copies differ,
>> all you know is that AT LEAST one copy is incorrect;
>> so, should you then transmit a THIRD copy???)
>
> DECtape 1960s used three duplicated tracks for 3-bit data. With tape
> there’s the option of backing up and rereading
With some technologies, you can even read-after-write to ensure the data *did* get encoded onto the medium.

With a comm-link, you can request retransmission. (And, in some technologies, you can eavesdrop on the transmission to verify that it is intact AT YOUR CONNECTION -- which may not be the case at the intended recipient's end of the wire!)

But, it's an open-ended process -- you never know when (or if) you will ever be able to convince yourself that you have "valid" data. So, your recovery option is to just give up after some number of retries. And, have some "fall back" state to which you can bring your device/process to reflect that uncertainty.

Comms *tend* to expect more immediacy in their effects. Tape *tends* to be a semi-offline/batch operation.
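A bounded retry loop with an explicit fall-back state might look like the Python sketch below. Everything in it is hypothetical: `receive_with_retries`, `LinkState`, and the `read_once` / `is_valid` callables stand in for whatever the real link or tape driver provides, and the retry limit of 3 is an arbitrary choice for the example.

```python
from enum import Enum


class LinkState(Enum):
    OK = "ok"
    FALLBACK = "fallback"   # gave up; device should enter its safe state


def receive_with_retries(read_once, is_valid, max_retries=3):
    """Try one read plus up to max_retries re-reads.

    Returns (frame, state, retries_used). When every attempt fails the
    frame is None and the state is FALLBACK, so the caller can bring the
    device to a known condition instead of looping forever.
    """
    for attempt in range(1 + max_retries):
        frame = read_once()
        if is_valid(frame):
            return frame, LinkState.OK, attempt
    return None, LinkState.FALLBACK, max_retries


# Simulated link: two corrupted frames, then a good one.
frames = iter([b"bad", b"bad", b"good"])
print(receive_with_retries(lambda: next(frames), lambda f: f == b"good"))
```

The point of returning a state rather than raising or blocking is that "no valid data" becomes an ordinary outcome the rest of the system has to handle.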
Reply by mac February 25, 2021
> So, you've "spent" 4 bits and have only the ability to
> detect *some* types of errors. (see below) The same applies
> to transmitting two copies (if the two copies differ,
> all you know is that AT LEAST one copy is incorrect;
> so, should you then transmit a THIRD copy???)
DECtape (1960s) used three duplicated tracks for 3-bit data. With tape there’s the option of backing up and rereading.
Reply by David Brown February 25, 2021
On 25/02/2021 07:47, upsidedown@downunder.com wrote:
> On Wed, 24 Feb 2021 23:32:34 +0000 (UTC), antispam@math.uni.wroc.pl
> wrote:
>
>> Don Y <blockedofcourse@foo.invalid> wrote:
>>> On 2/23/2021 3:53 PM, antispam@math.uni.wroc.pl wrote:
>>>> boB <boB@k7iq.com> wrote:
>>>>>
>>>>>
>>>>>
>>>>> If I create a 1 byte command that only uses 4 bits for 16 different
>>>>> functions, and there were NO following bytes (there might be following
>>>>> bytes for some CMDs), I was thinking that the lower 4 bits could be
>>>>> used for error detection.
>>>>>
>>>>> Could of course just truncate an 8 bit CRC I suppose and place it
>>>>> there.
>>>>>
>>>>> Just thought I would ask. Never thought of only 4 bits before but
>>>>> sometimes a single byte could be efficient, bandwidth wise.
>>>>
>>>> CRC is built from a polynomial with specific properties. Such
>>>> polynomials exist for any number of bits. With low number of
>>>> bit CRC have limited detection capability: 1-bit CRC is essentially
>>>> the same as parity, 4-bit CRC is not very good. OTOH with
>>>> only 4 data bits 4 (or 8) bit CRC is trivial: essentially
>>>> you just get back your bits. So it is the same as transmitting
>>>> two copies of your command: if they match then command is
>>>> hopefully OK, if not you have error. That will catch
>>>> single bit errors and most two bit errors, but will miss
>>>> case when both command bit and corresponding bit in the
>>>> copy is toggled. As other pointed out Hamming code can
>>>> do better, that is discover all 2 bit errors and correct
>>>> one bit errors. Or alternatively, detect up to 3 bit
>>>> errors.
>>>
>>> No. Two copies will only ALERT you to a bit error;
>>> it will not allow you to recover (correct) that error.
>
> If duplicating data, at least complement / negate the copy, so it will
> help in detecting some types of errors.
>
>>>
>>> 1010 1011 leaves you unsure as to whether the intended
>>> command was 1010 *or* 1011 (along with any number of less
>>> likely corruptions).
>>
>> You are reading something that I did not write: where do you find
>> a claim that CRC can correct errors?
>
> With short data+CRC after a CRC error is detected, you could flip
> individual bits at a time and recalculate CRC. If CRC matches, there
> is a candidate for correct message. If no other matches occur, we most
> likely have the corrected message. With longer messages, the
> computational load will be too high.
>
For some values of "likely", that might be acceptable. The trouble you face is that a CRC does not guarantee that it will let you do that in practice - you might find that there is more than one candidate bit that could be flipped in order to get a matching CRC. (Remember also that the incorrect bit might be in the CRC itself.)
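The ambiguity is easy to reproduce with a deliberately weak check. The Python sketch below uses a 4-bit CRC with the generator x^4 + x + 1 over 16 data bits; the polynomial, the 20-bit codeword length and all function names are assumptions chosen for the illustration, not anything taken from the original question. Because this generator repeats with period 15, flipping bit i and flipping bit i+15 produce the same syndrome once the codeword is longer than 15 bits.

```python
POLY = 0b10011  # x^4 + x + 1, an assumed generator for this example


def crc4_remainder(value: int, nbits: int) -> int:
    """Remainder of the nbits-wide bit string `value` divided by POLY (GF(2))."""
    for i in range(nbits - 1, 3, -1):
        if value & (1 << i):
            value ^= POLY << (i - 4)
    return value & 0xF


def make_codeword(data: int, data_bits: int) -> int:
    """Append the 4-bit CRC so the whole codeword divides evenly by POLY."""
    shifted = data << 4
    return shifted | crc4_remainder(shifted, data_bits + 4)


def single_flip_candidates(received: int, nbits: int) -> list:
    """Bit positions whose flip would make the received word pass the CRC."""
    return [i for i in range(nbits)
            if crc4_remainder(received ^ (1 << i), nbits) == 0]


word = make_codeword(0xBEEF, 16)        # 16 data bits + 4 CRC bits
received = word ^ 1                     # the damaged bit is in the CRC field
print(single_flip_candidates(received, 20))   # more than one position matches
```

Here the real error is in the CRC itself (bit 0), yet flipping data bit 15 also yields a word that passes, so the receiver cannot tell which of the two messages was sent.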
>>> So, simple duplication wastes bits. You can get better
>>> coverage with other encodings.
>>
>> As I wrote above (with better = Hamming).
>
> Indeed.
>
Reply by February 25, 2021
On Wed, 24 Feb 2021 23:32:34 +0000 (UTC), antispam@math.uni.wroc.pl
wrote:

>Don Y <blockedofcourse@foo.invalid> wrote:
>> On 2/23/2021 3:53 PM, antispam@math.uni.wroc.pl wrote:
>> > boB <boB@k7iq.com> wrote:
>> >>
>> >>
>> >>
>> >> If I create a 1 byte command that only uses 4 bits for 16 different
>> >> functions, and there were NO following bytes (there might be following
>> >> bytes for some CMDs), I was thinking that the lower 4 bits could be
>> >> used for error detection.
>> >>
>> >> Could of course just truncate an 8 bit CRC I suppose and place it
>> >> there.
>> >>
>> >> Just thought I would ask. Never thought of only 4 bits before but
>> >> sometimes a single byte could be efficient, bandwidth wise.
>> >
>> > CRC is built from a polynomial with specific properties. Such
>> > polynomials exist for any number of bits. With low number of
>> > bit CRC have limited detection capability: 1-bit CRC is essentially
>> > the same as parity, 4-bit CRC is not very good. OTOH with
>> > only 4 data bits 4 (or 8) bit CRC is trivial: essentially
>> > you just get back your bits. So it is the same as transmitting
>> > two copies of your command: if they match then command is
>> > hopefully OK, if not you have error. That will catch
>> > single bit errors and most two bit errors, but will miss
>> > case when both command bit and corresponding bit in the
>> > copy is toggled. As other pointed out Hamming code can
>> > do better, that is discover all 2 bit errors and correct
>> > one bit errors. Or alternatively, detect up to 3 bit
>> > errors.
>>
>> No. Two copies will only ALERT you to a bit error;
>> it will not allow you to recover (correct) that error.
If duplicating data, at least complement / negate the copy, so it will help in detecting some types of errors.
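As a rough Python sketch of the complemented copy: the command goes in the high nibble and its bit-wise inverse in the low nibble. The layout and the names `encode_comp` / `check_comp` are assumptions for this example, and the stuck-line case shown is just one kind of error this helps with, picked for illustration.

```python
def encode_comp(cmd: int) -> int:
    """Pack a 4-bit command with its complement: high nibble = cmd,
    low nibble = ~cmd (4 bits)."""
    cmd &= 0xF
    return (cmd << 4) | (~cmd & 0xF)


def check_comp(byte: int):
    """Return the command if the two nibbles are exact complements, else None."""
    hi, lo = (byte >> 4) & 0xF, byte & 0xF
    return hi if (hi ^ lo) == 0xF else None


print(hex(encode_comp(0xA)))   # 0xa5
print(check_comp(0x00))        # line stuck low: rejected
print(check_comp(0xFF))        # line stuck high: rejected
```

With a plain (non-inverted) copy, 0x00 and 0xFF would both look like perfectly valid commands 0 and 15. The complement does not help when the same bit position is flipped in both halves: `check_comp(0xA5 ^ 0x11)` still returns a (wrong) command.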
>>
>> 1010 1011 leaves you unsure as to whether the intended
>> command was 1010 *or* 1011 (along with any number of less
>> likely corruptions).
>
>You are reading something that I did not write: where do you find
>a claim that CRC can correct errors?
With short data+CRC, after a CRC error is detected you could flip individual bits one at a time and recalculate the CRC. If the CRC matches, there is a candidate for the correct message. If no other matches occur, we most likely have the corrected message. With longer messages, the computational load will be too high.
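The trial-and-error procedure can be sketched in Python as follows. The CRC-8 polynomial 0x07 with a zero initial value, the frame layout (data followed by one CRC byte) and all function names are assumptions for the example; it also only considers single-bit errors, which is what makes the result "most likely" rather than certain.

```python
def crc8(data: bytes, poly: int = 0x07) -> int:
    """Plain bit-by-bit CRC-8 (MSB first, initial value 0, no final XOR)."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = ((crc << 1) ^ poly) & 0xFF if crc & 0x80 else (crc << 1) & 0xFF
    return crc


def make_frame(data: bytes) -> bytes:
    return data + bytes([crc8(data)])


def frame_ok(frame: bytes) -> bool:
    return crc8(frame[:-1]) == frame[-1]


def flip_bit(frame: bytes, i: int) -> bytes:
    out = bytearray(frame)
    out[i // 8] ^= 1 << (i % 8)
    return bytes(out)


def single_flip_candidates(frame: bytes) -> list:
    """Every frame that differs from `frame` in exactly one bit and passes
    the CRC. One entry: probable correction. Several: ambiguous."""
    candidates = []
    for i in range(len(frame) * 8):
        trial = flip_bit(frame, i)
        if frame_ok(trial):
            candidates.append(trial)
    return candidates


good = make_frame(b"\x5a\x01")
bad = flip_bit(good, 11)                    # one bit damaged in transit
print(single_flip_candidates(bad) == [good])
```

The loop costs one CRC computation per bit of the frame, which is why it is only practical for short messages; trying two-bit corrections grows quadratically.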
>> So, simple duplication wastes bits. You can get better
>> coverage with other encodings.
>
>As I wrote above (with better = Hamming).
Indeed.
Reply by Don Y February 24, 2021
On 2/24/2021 4:32 PM, antispam@math.uni.wroc.pl wrote:
> Don Y <blockedofcourse@foo.invalid> wrote:
>> On 2/23/2021 3:53 PM, antispam@math.uni.wroc.pl wrote:
>>> boB <boB@k7iq.com> wrote:
>>>>
>>>>
>>>>
>>>> If I create a 1 byte command that only uses 4 bits for 16 different
>>>> functions, and there were NO following bytes (there might be following
>>>> bytes for some CMDs), I was thinking that the lower 4 bits could be
>>>> used for error detection.
>>>>
>>>> Could of course just truncate an 8 bit CRC I suppose and place it
>>>> there.
>>>>
>>>> Just thought I would ask. Never thought of only 4 bits before but
>>>> sometimes a single byte could be efficient, bandwidth wise.
>>>
>>> CRC is built from a polynomial with specific properties. Such
>>> polynomials exist for any number of bits. With low number of
>>> bit CRC have limited detection capability: 1-bit CRC is essentially
>>> the same as parity, 4-bit CRC is not very good. OTOH with
>>> only 4 data bits 4 (or 8) bit CRC is trivial: essentially
>>> you just get back your bits. So it is the same as transmitting
>>> two copies of your command: if they match then command is
>>> hopefully OK, if not you have error. That will catch
>>> single bit errors and most two bit errors, but will miss
>>> case when both command bit and corresponding bit in the
>>> copy is toggled. As other pointed out Hamming code can
>>> do better, that is discover all 2 bit errors and correct
>>> one bit errors. Or alternatively, detect up to 3 bit
>>> errors.
>>
>> No. Two copies will only ALERT you to a bit error;
>> it will not allow you to recover (correct) that error.
>>
>> 1010 1011 leaves you unsure as to whether the intended
>> command was 1010 *or* 1011 (along with any number of less
>> likely corruptions).
>
> You are reading something that I did not write: where do you find
> a claim that CRC can correct errors?
I didn't say that YOU had said it could correct. I stated that duplicating the data only lets you DETECT errors. So, you've "spent" 4 bits and have only the ability to detect *some* types of errors. (see below) The same applies to transmitting two copies (if the two copies differ, all you know is that AT LEAST one copy is incorrect; so, should you then transmit a THIRD copy???)
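The 1010 1011 case can be checked mechanically. In the Python sketch below the byte layout (command in both nibbles) and the names `encode_dup`, `check_dup` and `single_flip_neighbours` are assumptions for the illustration.

```python
def encode_dup(cmd: int) -> int:
    """Send the 4-bit command twice in one byte."""
    cmd &= 0xF
    return (cmd << 4) | cmd


def check_dup(byte: int):
    """Return the command if both nibbles agree, else None (error detected)."""
    hi, lo = (byte >> 4) & 0xF, byte & 0xF
    return hi if hi == lo else None


def single_flip_neighbours(byte: int) -> list:
    """Commands whose encoding is exactly one bit flip away from `byte`."""
    return sorted({check_dup(byte ^ (1 << i)) for i in range(8)} - {None})


received = 0b1010_1011
print(check_dup(received))                 # None: an error is detected...
print(single_flip_neighbours(received))    # ...but two commands are equally close
print(check_dup(encode_dup(0b1010) ^ 0b0001_0001))   # same bit hit twice: accepted
```

So the four extra bits buy detection of any single-bit error, no correction, and no protection when the same bit position is corrupted in both copies.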
>> So, simple duplication wastes bits. You can get better
>> coverage with other encodings.
>
> As I wrote above (with better = Hamming).
>
Reply by February 24, 2021
Don Y <blockedofcourse@foo.invalid> wrote:
> On 2/23/2021 3:53 PM, antispam@math.uni.wroc.pl wrote:
> > boB <boB@k7iq.com> wrote:
> >>
> >>
> >>
> >> If I create a 1 byte command that only uses 4 bits for 16 different
> >> functions, and there were NO following bytes (there might be following
> >> bytes for some CMDs), I was thinking that the lower 4 bits could be
> >> used for error detection.
> >>
> >> Could of course just truncate an 8 bit CRC I suppose and place it
> >> there.
> >>
> >> Just thought I would ask. Never thought of only 4 bits before but
> >> sometimes a single byte could be efficient, bandwidth wise.
> >
> > CRC is built from a polynomial with specific properties. Such
> > polynomials exist for any number of bits. With low number of
> > bit CRC have limited detection capability: 1-bit CRC is essentially
> > the same as parity, 4-bit CRC is not very good. OTOH with
> > only 4 data bits 4 (or 8) bit CRC is trivial: essentially
> > you just get back your bits. So it is the same as transmitting
> > two copies of your command: if they match then command is
> > hopefully OK, if not you have error. That will catch
> > single bit errors and most two bit errors, but will miss
> > case when both command bit and corresponding bit in the
> > copy is toggled. As other pointed out Hamming code can
> > do better, that is discover all 2 bit errors and correct
> > one bit errors. Or alternatively, detect up to 3 bit
> > errors.
>
> No. Two copies will only ALERT you to a bit error;
> it will not allow you to recover (correct) that error.
>
> 1010 1011 leaves you unsure as to whether the intended
> command was 1010 *or* 1011 (along with any number of less
> likely corruptions).
You are reading something that I did not write: where do you find a claim that CRC can correct errors?
> So, simple duplication wastes bits. You can get better
> coverage with other encodings.
As I wrote above (with better = Hamming).

--
Waldek Hebisch
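For reference, one way to spend the 4 spare bits on a Hamming code is the extended (8,4) code: three Hamming parity bits plus one overall parity bit. The Python sketch below is a minimal illustration, not a recommendation for the original poster's protocol; the bit layout (overall parity in bit 0, Hamming positions 1..7 in bits 1..7) and the function names are assumptions made for this example.

```python
def hamming84_encode(nibble: int) -> int:
    """Encode 4 data bits as an 8-bit extended Hamming codeword.

    Bit 0 holds overall parity; bits 1..7 are Hamming positions 1..7
    (p1 p2 d0 p4 d1 d2 d3).
    """
    d = [(nibble >> i) & 1 for i in range(4)]
    p1 = d[0] ^ d[1] ^ d[3]        # checks positions 1, 3, 5, 7
    p2 = d[0] ^ d[2] ^ d[3]        # checks positions 2, 3, 6, 7
    p4 = d[1] ^ d[2] ^ d[3]        # checks positions 4, 5, 6, 7
    bits = [p1, p2, d[0], p4, d[1], d[2], d[3]]
    word, overall = 0, 0
    for pos, bit in enumerate(bits, start=1):
        word |= bit << pos
        overall ^= bit
    return word | overall


def hamming84_decode(word: int):
    """Return (nibble, status); nibble is None for a detected double error."""
    syndrome = 0
    for pos in range(1, 8):
        if (word >> pos) & 1:
            syndrome ^= pos            # XOR of the positions of all set bits
    odd_parity = bin(word & 0xFF).count("1") & 1
    if syndrome == 0 and not odd_parity:
        status = "ok"
    elif odd_parity:
        word ^= 1 << syndrome          # syndrome 0 means the parity bit itself
        status = "corrected"
    else:
        return None, "double error"
    nibble = ((word >> 3) & 1) | ((word >> 5) & 1) << 1 \
        | ((word >> 6) & 1) << 2 | ((word >> 7) & 1) << 3
    return nibble, status


codeword = hamming84_encode(0b1010)
print(hamming84_decode(codeword ^ 0b0010_0000))   # one flipped bit is repaired
```

Used this way the code corrects any single-bit error and flags any double-bit error; three or more flipped bits can be miscorrected, so it is a trade between correction and the "detect up to 3" mode mentioned in the thread, which would need a decoder that only checks and never repairs.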