Reply by D Yuniskis, April 12, 2011
Hi Vladimir,

On 4/12/2011 12:16 PM, Vladimir Vassilevsky wrote:
> Shane williams wrote:
>
>> Out of curiosity, I've lashed up a program (see below) to try and test
>> the detection failure rate for the ccitt 16 bit crc.  It repeatedly
>> generates a buffer of 32 pseudo-random bytes, calculates the crc, then
>> complements 8 pseudo-random bits and re-calculates the crc.
>
> There is a significant difference in the error detection performance of
> the different polynomials depending on the particular block size, expected
> error rate and the distribution of errors.
E.g., the way errors manifest on a CD/DVD is very different from the way they manifest in a serial comm link. And, of course, the actual environment can have a pronounced effect on communications reliability.
> The "standard" polynomials > are not necessarily the best in this regard. A polynomial which performs > extremely well in one situation may not perform in the other situation. > Thus, is good to tailor the polynomial for the needs of the particular > application. The practical way to do that is simulation.
I think the OP needs to step back and look at the entire comm process/mechanism. As I have stated before, CRCs only work on "data". The comm system doesn't always deliver "real data" to the CRC -- e.g., if it fails to synchronize properly to the *actual* data stream (i.e., not "noise"!), then it can *drop* data and/or *fabricate* data that isn't really data at all! Unless you can understand (model, simulate) the nature of the entire process, you can't come up with a realistic error detection or recovery scheme.
Reply by Vladimir Vassilevsky, April 12, 2011

Shane williams wrote:

> Out of curiosity, I've lashed up a program (see below) to try and test
> the detection failure rate for the ccitt 16 bit crc.  It repeatedly
> generates a buffer of 32 pseudo-random bytes, calculates the crc, then
> complements 8 pseudo-random bits and re-calculates the crc.
There is a significant difference in the error detection performance of the different polynomials depending on the particular block size, expected error rate and the distribution of errors. The "standard" polynomials are not necessarily the best in this regard. A polynomial which performs extremely well in one situation may not perform well in another situation. Thus, it is good to tailor the polynomial to the needs of the particular application. The practical way to do that is simulation.

Vladimir Vassilevsky
DSP and Mixed Signal Design Consultant
http://www.abvolt.com
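For concreteness, here is a minimal sketch of that kind of simulation (illustrative only, not Vladimir's or the OP's code; the two polynomials compared, the 32-byte block size and the burst-style error model are arbitrary assumptions):

    import random

    def crc16(data, poly, crc=0xFFFF):
        """Plain bit-at-a-time 16-bit CRC, MSB first; poly is written without the x^16 term."""
        for byte in data:
            crc ^= byte << 8
            for _ in range(8):
                crc = ((crc << 1) ^ poly) & 0xFFFF if crc & 0x8000 else (crc << 1) & 0xFFFF
        return crc

    def burst_corrupt(msg, max_span=48):
        """Flip bits with probability 1/2 inside one randomly placed burst window."""
        nbits = len(msg) * 8
        span = random.randint(2, max_span)
        start = random.randrange(nbits - span + 1)
        bad = bytearray(msg)
        for i in range(start, start + span):
            if random.getrandbits(1):
                bad[i // 8] ^= 0x80 >> (i % 8)
        return bytes(bad)

    def undetected_rate(poly, trials, msg_len=32):
        misses = 0
        for _ in range(trials):
            msg = bytes(random.getrandbits(8) for _ in range(msg_len))
            bad = burst_corrupt(msg)
            if bad != msg and crc16(bad, poly) == crc16(msg, poly):
                misses += 1
        return misses / trials

    # Undetected-error rates are tiny, so meaningful comparisons need very large trial counts.
    for poly in (0x1021, 0x8005):          # CCITT and CRC-16/IBM, just as examples
        print(hex(poly), undetected_rate(poly, trials=200_000))

As noted further down the thread, the undetected-error rates are so small that distinguishing polynomials this way takes enormous trial counts; analytic methods are usually more practical.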
Reply, April 12, 2011
On 12.04.2011 14:59, David Brown wrote:

> I certainly can't think of any type of noise or other corruption that
> would swap the polarity of exactly eight random bits in a message.
Easy. Take an NRZ-encoded bit stream using bit stuffing for clock recovery (yes, CAN, I'm looking at you). Two bit flips in just the right places, creating or disrupting stuff sequences, can shift around an almost arbitrary number of bits without affecting the data length.

Basically the same thing can happen on any asynchronous serial medium if the sender's or receiver's clock is unstable enough to drift by about one bit time and back within the space of one frame. You could end up receiving one bit twice and another one not at all. Both scenarios move an entire string of bits by one bit time. Do that to a message of alternating ones and zeroes, and you invert all of them.
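A throwaway Python sketch of that last scenario (the alternating pattern and the slip position are made up, purely to show the effect): duplicating one sampled bit and losing the last one shifts the rest of the frame by one bit time, which on alternating data looks like almost every bit was inverted.

    # An alternating bit pattern, as you would see with 0x55/0xAA data on the wire.
    sent = [i % 2 for i in range(32)]           # 0 1 0 1 ...

    # Clock drift: the receiver samples the first bit twice and never sees the last one,
    # so everything after the duplicated bit arrives shifted by one bit time.
    received = [sent[0]] + sent[:-1]

    flipped = [s ^ r for s, r in zip(sent, received)]
    print(sum(flipped), "of", len(sent), "bits differ")   # 31 of 32 - nearly the whole frame inverted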
Reply by David Brown, April 12, 2011
On 12/04/2011 15:15, Shane williams wrote:
> On Apr 13, 12:59 am, David Brown<da...@westcontrol.removethisbit.com>
> wrote:
>> On 12/04/2011 14:20, Shane williams wrote:
>>
>>> I'm just trying to see for myself that the rate of undetected errors
>>> for an even number of bit flips with the ccitt 16 bit crc is on
>>> average, at least 1 in 65536 when the number of bits flipped is more
>>> than 3.
>>
>> Why?
>>
>
> As I wrote at the start of this thread - curiosity.
>
Fair enough, I suppose - curiosity is as good a reason as any. I did a similar thing myself a number of years ago, for Hamming codes for forward error correction. I just thought you should be aware that you are spending time and effort on something of little realistic value - if you want to experiment, why not test out other forms of corruption that you might see in use?
>> Do you expect to use the crc in a system that flips a certain even
>> number of bits every now and again?
>>
>> I certainly can't think of any type of noise or other corruption that
>> would swap the polarity of exactly eight random bits in a message.
>
> um, I never said there was.
>
>> If you want to figure out what your detection rates are for errors that
>> can occur in a system, you must first characterise those errors.
>> Typical real-world errors are a sequence of zeros or a sequence of ones,
>> a short burst of ones followed closely by a short burst of zeros (like a
>> spike on the line), failure to change from zero to one after a long
>> burst of zeros, etc.
>>
>> One of the reasons crcs are popular for checking transmissions is that
>> the type of errors they detect are /not/ randomly and evenly distributed
>> (unlike simple additive checksums).  They will do better at detecting
>> runs of errors, for example, than a simple "1 in 65536" would suggest.
>
> Sure, I'm aware of that.  Any run of 16 bits or less gets detected
> with the crc I'm using.
>
Reply by Shane williams, April 12, 2011
On Apr 13, 12:59 am, David Brown <da...@westcontrol.removethisbit.com>
wrote:
> On 12/04/2011 14:20, Shane williams wrote:
>
>> I'm just trying to see for myself that the rate of undetected errors
>> for an even number of bit flips with the ccitt 16 bit crc is on
>> average, at least 1 in 65536 when the number of bits flipped is more
>> than 3.
>
> Why?
>
As I wrote at the start of this thread - curiosity.
> Do you expect to use the crc in a system that flips a certain even
> number of bits every now and again?
>
> I certainly can't think of any type of noise or other corruption that
> would swap the polarity of exactly eight random bits in a message.
um, I never said there was.
> If you want to figure out what your detection rates are for errors that
> can occur in a system, you must first characterise those errors.
> Typical real-world errors are a sequence of zeros or a sequence of ones,
> a short burst of ones followed closely by a short burst of zeros (like a
> spike on the line), failure to change from zero to one after a long
> burst of zeros, etc.
>
> One of the reasons crcs are popular for checking transmissions is that
> the type of errors they detect are /not/ randomly and evenly distributed
> (unlike simple additive checksums).  They will do better at detecting
> runs of errors, for example, than a simple "1 in 65536" would suggest.
Sure, I'm aware of that. Any run of 16 bits or less gets detected with the crc I'm using.
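That burst property is easy to spot-check; below is a rough sketch using Python's binascii.crc_hqx (which implements the CCITT polynomial 0x1021). The 32-byte message size and the trial count are arbitrary; any error confined to a window of 16 bits or fewer should always change the CRC.

    import random, binascii

    def crc(msg):
        return binascii.crc_hqx(msg, 0xFFFF)        # CRC-16/CCITT, polynomial 0x1021

    missed = 0
    for _ in range(100_000):
        msg = bytes(random.getrandbits(8) for _ in range(32))
        span = random.randint(1, 16)                # burst no wider than the CRC
        start = random.randrange(len(msg) * 8 - span + 1)
        bad = bytearray(msg)
        for i in range(start, start + span):
            # flip the end bits so the burst really spans `span` bits, the rest at random
            if i in (start, start + span - 1) or random.getrandbits(1):
                bad[i // 8] ^= 0x80 >> (i % 8)
        if crc(bytes(bad)) == crc(msg):
            missed += 1
    print("undetected bursts of <= 16 bits:", missed)   # expected: 0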
Reply by David Brown, April 12, 2011
On 12/04/2011 14:20, Shane williams wrote:
> On Apr 12, 5:31 am, D Yuniskis<not.going.to...@seen.com> wrote:
>> Hi Shane,
>>
>> On 4/10/2011 6:04 AM, Shane williams wrote:
>>
>>> Out of curiosity, I've lashed up a program (see below) to try and test
>>> the detection failure rate for the ccitt 16 bit crc.  It repeatedly
>>
>> To clarify, you are trying to determine how many *undetectable*
>> errors there are?  What does the polynomial *claim* to detect?
>> (i.e., it might detect *all* N bit errors, *some* M bit errors
>> and, possibly, *no* P bit errors)
>
> I'm just trying to see for myself that the rate of undetected errors
> for an even number of bit flips with the ccitt 16 bit crc is on
> average, at least 1 in 65536 when the number of bits flipped is more
> than 3.
>
Why?

Do you expect to use the crc in a system that flips a certain even number of bits every now and again?

I certainly can't think of any type of noise or other corruption that would swap the polarity of exactly eight random bits in a message.

If you want to figure out what your detection rates are for errors that can occur in a system, you must first characterise those errors. Typical real-world errors are a sequence of zeros or a sequence of ones, a short burst of ones followed closely by a short burst of zeros (like a spike on the line), failure to change from zero to one after a long burst of zeros, etc.

One of the reasons crcs are popular for checking transmissions is that the type of errors they detect are /not/ randomly and evenly distributed (unlike simple additive checksums). They will do better at detecting runs of errors, for example, than a simple "1 in 65536" would suggest.
Reply by Shane williams, April 12, 2011
On Apr 11, 4:05 am, Mel <mwil...@the-wire.com> wrote:
> The Python version below gained a bit of clarity, at the expense of runtime.
> It took a few minutes to find no false positives (corrupt messages thought
> to be good) in a million trials.
>
I think it should be getting about 30 false positives in a million trials if you're flipping an even number of bits and the data is truly random.
Reply by Shane williams, April 12, 2011
On Apr 12, 5:31 am, D Yuniskis <not.going.to...@seen.com> wrote:
> Hi Shane,
>
> On 4/10/2011 6:04 AM, Shane williams wrote:
>
>> Out of curiosity, I've lashed up a program (see below) to try and test
>> the detection failure rate for the ccitt 16 bit crc.  It repeatedly
>
> To clarify, you are trying to determine how many *undetectable*
> errors there are?  What does the polynomial *claim* to detect?
> (i.e., it might detect *all* N bit errors, *some* M bit errors
> and, possibly, *no* P bit errors)
I'm just trying to see for myself that the rate of undetected errors for an even number of bit flips with the ccitt 16 bit crc is on average, at least 1 in 65536 when the number of bits flipped is more than 3.
>> generates a buffer of 32 pseudo-random bytes, calculates the crc, then
>> complements 8 pseudo-random bits and re-calculates the crc.
>
> Why always 8?
I lashed up the program late at night in a hurry, and the way I did the bit flipping was "manual" rather than loop driven, so I needed a small even number greater than 3. I've improved my code now so I can do any number.
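For what it's worth, a loop-driven version of the flipping only takes a few lines in Python (a sketch; the helper name is made up, and random.sample keeps the chosen bit positions distinct so flips cannot cancel each other out):

    import random

    def flip_bits(msg, nflips):
        """Return a copy of msg with nflips distinct, randomly chosen bits complemented."""
        out = bytearray(msg)
        for pos in random.sample(range(len(msg) * 8), nflips):
            out[pos // 8] ^= 0x80 >> (pos % 8)
        return bytes(out)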
>> How many iterations should I have to run it for to get the same crc on
>> the "corrupted" data as the non-corrupted data?  The detection failure
>> rate doesn't appear to be one in 65536 or anything like it.
>>
>> If I corrupt more bits would I get a higher detection failure rate?
>
> This depends on the polynomial.  E.g., simple "parity" will
> detect *all* one bit errors -- but *no* 2 bit errors.  And,
> yet again, all *3* bit errors, etc.
>
>> If I generate two pseudo-random 32 byte "messages" and compare their
>> CRCs, how long would I have to iterate for to get a match on the CRCs?
>
> How "random" is your PRNG?  Chances are, you are generating *it*
> with a polynomial, as well...
It's not random enough. I'm mainly relying on get_millisecs from my operating system to return sufficiently varying values.
>> Any thoughts?
>
> Theory will help you more than practice, here.  You need to run a
> really *long* simulation to get meaningful numbers.  And, have to be
> sure your PRNG really *is* random and doesn't just loop over a small
> subregion of the space.  E.g., if your PRNG has too short of a cycle
> length, there are values (combinations of bits) that you will never
> exercise.
My PRNG wasn't working to start with and I got zero detection failures in 300 million iterations. I suspect the Python version posted by Mel has this problem too because he got zero "undetected errors" for a million iterations when it should have been approx 1000000/32768 = 30.
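A sketch of the whole experiment on top of Python's random module (a Mersenne Twister, so cycle length is not the limiting factor) and binascii.crc_hqx for the CCITT CRC, reusing the hypothetical flip_bits helper from the sketch above; with an even flip count the result should land near that 1-in-32768 figure:

    import random, binascii

    def crc(msg):
        return binascii.crc_hqx(msg, 0xFFFF)            # CRC-16/CCITT, polynomial 0x1021

    TRIALS, MSG_LEN, NFLIPS = 1_000_000, 32, 8          # NFLIPS kept even, as in the original test
    undetected = 0
    for _ in range(TRIALS):
        msg = bytes(random.getrandbits(8) for _ in range(MSG_LEN))
        if crc(flip_bits(msg, NFLIPS)) == crc(msg):     # flip_bits as defined in the sketch above
            undetected += 1
    print(undetected, "undetected in", TRIALS, "trials")  # expect roughly 1000000/32768, about 30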
Reply by Shane williams, April 12, 2011
On Apr 12, 3:32 am, Philip Koopman <koop...@cmu.edu> wrote:
> Shane williams <shane.2471...@gmail.com> wrote:
>
>> Out of curiosity, I've lashed up a program (see below) to try and test
>> the detection failure rate for the ccitt 16 bit crc.  It repeatedly
>> generates a buffer of 32 pseudo-random bytes, calculates the crc, then
>> complements 8 pseudo-random bits and re-calculates the crc.
>>
>> How many iterations should I have to run it for to get the same crc on
>> the "corrupted" data as the non-corrupted data?  The detection failure
>> rate doesn't appear to be one in 65536 or anything like it.
>>
>> If I corrupt more bits would I get a higher detection failure rate?
>
> For a 16-bit CRC it should be about one in 65536 on average for large
> numbers of bit errors. BUT, if you use a polynomial divisible by (x+1)
> what you get is 100% of odd numbers of bit flips detected and only one
> out of 32768 even numbers.  In other words, the whole undetected error
> budget gets pushed into the even number of bit errors case.  I'll bet
> you are seeing 1/32768 because you are flipping an even number of bits.
>
I was actually getting zero errors because there was something wrong with my random number generator. Now that I've improved the PRNG, I'm getting numbers close to 1/32768 regardless of whether I flip 8 bits or 150 bits. I deliberately flipped an even number of bits because I knew the CRC detects all odd numbers of bit flips.
> For small numbers of bit errors it all depends on the polynomial.  So
> this is generic asymptotic error detection performance as number of bit
> errors increases.
>
>> If I generate two pseudo-random 32 byte "messages" and compare their
>> CRCs, how long would I have to iterate for to get a match on the CRCs?
>
> It turns out the message content doesn't matter, only the error
> patterns.  But for that it is pseudo-random; asymptotically zero OR one
> out of every 64K OR one out of every 32K for large numbers of bit errors
> depending on even/odd number of errors and the polynomial.
ok, I'm getting this.
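The "only the error pattern matters" point follows from the linearity of the CRC, and is easy to see numerically (a rough sketch; binascii.crc_hqx is Python's CCITT CRC, and the sizes mirror the 32-byte test above): for one fixed error pattern, the corrupted CRC either matches the original for every message or for none of them.

    import random, binascii

    def crc(msg):
        return binascii.crc_hqx(msg, 0xFFFF)     # CRC-16/CCITT

    # One fixed error pattern, XORed into many different random messages.
    error = bytes(random.getrandbits(8) for _ in range(32))
    outcomes = set()
    for _ in range(10_000):
        msg = bytes(random.getrandbits(8) for _ in range(32))
        bad = bytes(m ^ e for m, e in zip(msg, error))
        outcomes.add(crc(bad) == crc(msg))
    print(outcomes)   # always a single outcome: {False} for almost every pattern,
                      # {True} only for the rare undetectable ones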
>> Any thoughts?
>
> Simulations take a really long time to evaluate polynomials.  Also, you
> have to use a good random number generator or your results won't be
> accurate.  I tried that for a while but it just takes forever to get
> answers. There are some analytic answers available that are pretty hairy
> to do on your own, but are exact answers (not stochastic simulations).
ok. In another thread here I learnt/realized that in a byte protocol with start and stop bits, corruption of a start bit can cause the data to "morph" into something else if it occurs before the length byte of a message, defeating the ability of the crc to detect single bit errors. It could also be a problem if it occurs after the length byte, depending on how gaps in the data are handled, etc. Thanks.
Reply by D Yuniskis, April 11, 2011
Hi Shane,

On 4/10/2011 6:04 AM, Shane williams wrote:
> Out of curiosity, I've lashed up a program (see below) to try and test
> the detection failure rate for the ccitt 16 bit crc.  It repeatedly
To clarify, you are trying to determine how many *undetectable* errors there are? What does the polynomial *claim* to detect? (i.e., it might detect *all* N bit errors, *some* M bit errors and, possibly, *no* P bit errors)
> generates a buffer of 32 pseudo-random bytes, calculates the crc, then
> complements 8 pseudo-random bits and re-calculates the crc.
Why always 8?
> How many iterations should I have to run it for to get the same crc on
> the "corrupted" data as the non-corrupted data?  The detection failure
> rate doesn't appear to be one in 65536 or anything like it.
>
> If I corrupt more bits would I get a higher detection failure rate?
This depends on the polynomial. E.g., simple "parity" will detect *all* one bit errors -- but *no* 2 bit errors. And, yet again, all *3* bit errors, etc.
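A tiny illustration of that parity behaviour (a throwaway sketch with an arbitrary data byte and arbitrary flipped positions):

    word = 0b10110010
    parity = bin(word).count("1") & 1            # parity of the data bits

    one_flip  = word ^ 0b00000100                # any single bit flip changes the parity
    two_flips = word ^ 0b00100100                # any pair of flips leaves it unchanged
    print(parity, bin(one_flip).count("1") & 1, bin(two_flips).count("1") & 1)
    # -> 0 1 0: the one-bit error is caught, the two-bit error slips through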
> If I generate two pseudo-random 32 byte "messages" and compare their
> CRCs, how long would I have to iterate for to get a match on the CRCs?
How "random" is your PRNG? Chances are, you are generating *it* with a polynomial, as well...
> Any thoughts?
Theory will help you more than practice, here. You need to run a really *long* simulation to get meaningful numbers. And, have to be sure your PRNG really *is* random and doesn't just loop over a small subregion of the space. E.g., if your PRNG has too short of a cycle length, there are values (combinations of bits) that you will never exercise.