Reply by D Yuniskis, April 12, 2011
Hi Vladimir,

On 4/12/2011 12:16 PM, Vladimir Vassilevsky wrote:
> Shane williams wrote:
>
>> Out of curiosity, I've lashed up a program (see below) to try and test
>> the detection failure rate for the ccitt 16 bit crc.  It repeatedly
>> generates a buffer of 32 pseudo-random bytes, calculates the crc, then
>> complements 8 pseudo-random bits and re-calculates the crc.
>
> There is a significant difference in the error detection performance of
> the different polynomials depending on the particular block size, expected
> error rate and the distribution of errors.
E.g., the way errors manifest on a CD/DVD is very different from the way they manifest in a serial comm link. And, of course, the actual environment can have a pronounced effect on communications reliability.
> The "standard" polynomials > are not necessarily the best in this regard. A polynomial which performs > extremely well in one situation may not perform in the other situation. > Thus, is good to tailor the polynomial for the needs of the particular > application. The practical way to do that is simulation.
I think the OP needs to step back and look at the entire comm process/mechanism. As I have stated before, CRCs only work on "data". The comm system doesn't always deliver "real data" to the CRC -- e.g., if it fails to synchronize properly to the *actual* data stream (i.e., not "noise"!), then it can *drop* data and/or *fabricate* data that isn't really data at all! Unless you can understand (model, simulate) the nature of the entire process, you can't come up with a realistic error detection or recovery scheme.
Reply by Vladimir Vassilevsky, April 12, 2011

Shane williams wrote:

> Out of curiosity, I've lashed up a program (see below) to try and test
> the detection failure rate for the ccitt 16 bit crc.  It repeatedly
> generates a buffer of 32 pseudo-random bytes, calculates the crc, then
> complements 8 pseudo-random bits and re-calculates the crc.
There is a significant difference in the error detection performance of the different polynomials depending on the particular block size, expected error rate and the distribution of errors. The "standard" polynomials are not necessarily the best in this regard. A polynomial which performs extremely well in one situation may not perform well in another situation. Thus, it is good to tailor the polynomial to the needs of the particular application. The practical way to do that is simulation.

Vladimir Vassilevsky
DSP and Mixed Signal Design Consultant
http://www.abvolt.com
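For concreteness, here is a minimal sketch of that kind of simulation (illustrative only, not Vladimir's or the OP's code; the two polynomials compared, the 32-byte block size and the burst-style error model are arbitrary assumptions):

    import random

    def crc16(data, poly, crc=0xFFFF):
        """Plain bit-at-a-time 16-bit CRC, MSB first; poly is written without the x^16 term."""
        for byte in data:
            crc ^= byte << 8
            for _ in range(8):
                crc = ((crc << 1) ^ poly) & 0xFFFF if crc & 0x8000 else (crc << 1) & 0xFFFF
        return crc

    def burst_corrupt(msg, max_span=48):
        """Flip bits with probability 1/2 inside one randomly placed burst window."""
        nbits = len(msg) * 8
        span = random.randint(2, max_span)
        start = random.randrange(nbits - span + 1)
        bad = bytearray(msg)
        for i in range(start, start + span):
            if random.getrandbits(1):
                bad[i // 8] ^= 0x80 >> (i % 8)
        return bytes(bad)

    def undetected_rate(poly, trials, msg_len=32):
        misses = 0
        for _ in range(trials):
            msg = bytes(random.getrandbits(8) for _ in range(msg_len))
            bad = burst_corrupt(msg)
            if bad != msg and crc16(bad, poly) == crc16(msg, poly):
                misses += 1
        return misses / trials

    # Undetected-error rates are tiny, so meaningful comparisons need very large trial counts.
    for poly in (0x1021, 0x8005):          # CCITT and CRC-16/IBM, just as examples
        print(hex(poly), undetected_rate(poly, trials=200_000))

As noted further down the thread, the undetected-error rates are so small that distinguishing polynomials this way takes enormous trial counts; analytic methods are usually more practical.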
Reply, April 12, 2011
On 12.04.2011 14:59, David Brown wrote:

> I certainly can't think of any type of noise or other corruption that
> would swap the polarity of exactly eight random bits in a message.
Easy. Take an NRZ-encoded bit stream using bit stuffing for clock recovery (yes, CAN, I'm looking at you). Two bit flips in just the right places, creating or disrupting stuff sequences, can shift around an almost arbitrary number of bits without affecting the data length.

Basically the same thing can happen on any asynchronous serial medium if the sender's or receiver's clock is unstable enough to drift by about one bit time and back within the space of one frame. You could end up receiving one bit twice and another one not at all. Both scenarios move an entire string of bits by one bit time. Do that to a message of alternating ones and zeroes, and you invert all of them.
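A throwaway Python sketch of that last scenario (the alternating pattern and the slip position are made up, purely to show the effect): duplicating one sampled bit and losing the last one shifts the rest of the frame by one bit time, which on alternating data looks like almost every bit was inverted.

    # An alternating bit pattern, as you would see with 0x55/0xAA data on the wire.
    sent = [i % 2 for i in range(32)]           # 0 1 0 1 ...

    # Clock drift: the receiver samples the first bit twice and never sees the last one,
    # so everything after the duplicated bit arrives shifted by one bit time.
    received = [sent[0]] + sent[:-1]

    flipped = [s ^ r for s, r in zip(sent, received)]
    print(sum(flipped), "of", len(sent), "bits differ")   # 31 of 32 - nearly the whole frame inverted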
Reply by David Brown, April 12, 2011
On 12/04/2011 15:15, Shane williams wrote:
> On Apr 13, 12:59 am, David Brown<da...@westcontrol.removethisbit.com>
> wrote:
>> On 12/04/2011 14:20, Shane williams wrote:
>>
>>> I'm just trying to see for myself that the rate of undetected errors
>>> for an even number of bit flips with the ccitt 16 bit crc is on
>>> average, at least 1 in 65536 when the number of bits flipped is more
>>> than 3.
>>
>> Why?
>>
>
> As I wrote at the start of this thread - curiosity.
>
Fair enough, I suppose - curiosity is as good a reason as any. I did a similar thing myself a number of years ago, for Hamming codes for forward error correction. I just thought you should be aware that you are spending time and effort on something of little realistic value - if you want to experiment, why not test out other forms of corruption that you might see in use?
>> Do you expect to use the crc in a system that flips a certain even
>> number of bits every now and again?
>>
>> I certainly can't think of any type of noise or other corruption that
>> would swap the polarity of exactly eight random bits in a message.
>
> um, I never said there was.
>
>> If you want to figure out what your detection rates are for errors that
>> can occur in a system, you must first characterise those errors.
>> Typical real-world errors are a sequence of zeros or a sequence of ones,
>> a short burst of ones followed closely by a short burst of zeros (like a
>> spike on the line), failure to change from zero to one after a long
>> burst of zeros, etc.
>>
>> One of the reasons crcs are popular for checking transmissions is that
>> the type of errors they detect are /not/ randomly and evenly distributed
>> (unlike simple additive checksums).  They will do better at detecting
>> runs of errors, for example, than a simple "1 in 65536" would suggest.
>
> Sure, I'm aware of that.  Any run of 16 bits or less gets detected
> with the crc I'm using.
>
Reply by Shane williams, April 12, 2011
On Apr 13, 12:59 am, David Brown <da...@westcontrol.removethisbit.com>
wrote:
> On 12/04/2011 14:20, Shane williams wrote:
>
>> I'm just trying to see for myself that the rate of undetected errors
>> for an even number of bit flips with the ccitt 16 bit crc is on
>> average, at least 1 in 65536 when the number of bits flipped is more
>> than 3.
>
> Why?
>
As I wrote at the start of this thread - curiosity.
> Do you expect to use the crc in a system that flips a certain even
> number of bits every now and again?
>
> I certainly can't think of any type of noise or other corruption that
> would swap the polarity of exactly eight random bits in a message.
um, I never said there was.
> If you want to figure out what your detection rates are for errors that
> can occur in a system, you must first characterise those errors.
> Typical real-world errors are a sequence of zeros or a sequence of ones,
> a short burst of ones followed closely by a short burst of zeros (like a
> spike on the line), failure to change from zero to one after a long
> burst of zeros, etc.
>
> One of the reasons crcs are popular for checking transmissions is that
> the type of errors they detect are /not/ randomly and evenly distributed
> (unlike simple additive checksums).  They will do better at detecting
> runs of errors, for example, than a simple "1 in 65536" would suggest.
Sure, I'm aware of that. Any run of 16 bits or less gets detected with the crc I'm using.
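That burst property is easy to spot-check; below is a rough sketch using Python's binascii.crc_hqx (which implements the CCITT polynomial 0x1021). The 32-byte message size and the trial count are arbitrary; any error confined to a window of 16 bits or fewer should always change the CRC.

    import random, binascii

    def crc(msg):
        return binascii.crc_hqx(msg, 0xFFFF)        # CRC-16/CCITT, polynomial 0x1021

    missed = 0
    for _ in range(100_000):
        msg = bytes(random.getrandbits(8) for _ in range(32))
        span = random.randint(1, 16)                # burst no wider than the CRC
        start = random.randrange(len(msg) * 8 - span + 1)
        bad = bytearray(msg)
        for i in range(start, start + span):
            # flip the end bits so the burst really spans `span` bits, the rest at random
            if i in (start, start + span - 1) or random.getrandbits(1):
                bad[i // 8] ^= 0x80 >> (i % 8)
        if crc(bytes(bad)) == crc(msg):
            missed += 1
    print("undetected bursts of <= 16 bits:", missed)   # expected: 0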
Reply by David Brown, April 12, 2011
On 12/04/2011 14:20, Shane williams wrote:
> On Apr 12, 5:31 am, D Yuniskis<not.going.to...@seen.com> wrote:
>> Hi Shane,
>>
>> On 4/10/2011 6:04 AM, Shane williams wrote:
>>
>>> Out of curiosity, I've lashed up a program (see below) to try and test
>>> the detection failure rate for the ccitt 16 bit crc.  It repeatedly
>>
>> To clarify, you are trying to determine how many *undetectable*
>> errors there are?  What does the polynomial *claim* to detect?
>> (i.e., it might detect *all* N bit errors, *some* M bit errors
>> and, possibly, *no* P bit errors)
>
> I'm just trying to see for myself that the rate of undetected errors
> for an even number of bit flips with the ccitt 16 bit crc is on
> average, at least 1 in 65536 when the number of bits flipped is more
> than 3.
>
Why?

Do you expect to use the crc in a system that flips a certain even number of bits every now and again?

I certainly can't think of any type of noise or other corruption that would swap the polarity of exactly eight random bits in a message.

If you want to figure out what your detection rates are for errors that can occur in a system, you must first characterise those errors. Typical real-world errors are a sequence of zeros or a sequence of ones, a short burst of ones followed closely by a short burst of zeros (like a spike on the line), failure to change from zero to one after a long burst of zeros, etc.

One of the reasons crcs are popular for checking transmissions is that the type of errors they detect are /not/ randomly and evenly distributed (unlike simple additive checksums). They will do better at detecting runs of errors, for example, than a simple "1 in 65536" would suggest.
Reply by Shane williams, April 12, 2011
On Apr 11, 4:05 am, Mel <mwil...@the-wire.com> wrote:
> The Python version below gained a bit of clarity, at the expense of runtime.
> It took a few minutes to find no false positives (corrupt messages thought
> to be good) in a million trials.
>
I think it should be getting about 30 false positives in a million trials if you're flipping an even number of bits and the data is truly random.
Reply by Shane williams, April 12, 2011
On Apr 12, 5:31 am, D Yuniskis <not.going.to...@seen.com> wrote:
> Hi Shane,
>
> On 4/10/2011 6:04 AM, Shane williams wrote:
>
>> Out of curiosity, I've lashed up a program (see below) to try and test
>> the detection failure rate for the ccitt 16 bit crc.  It repeatedly
>
> To clarify, you are trying to determine how many *undetectable*
> errors there are?  What does the polynomial *claim* to detect?
> (i.e., it might detect *all* N bit errors, *some* M bit errors
> and, possibly, *no* P bit errors)
I'm just trying to see for myself that the rate of undetected errors for an even number of bit flips with the ccitt 16 bit crc is on average, at least 1 in 65536 when the number of bits flipped is more than 3.
>> generates a buffer of 32 pseudo-random bytes, calculates the crc, then
>> complements 8 pseudo-random bits and re-calculates the crc.
>
> Why always 8?
I lashed up the program late at night in a hurry, and the way I did the bit flipping was "manual" rather than loop driven, so I needed a small even number greater than 3. I've improved my code now so I can do any number.
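For what it's worth, a loop-driven version of the flipping only takes a few lines in Python (a sketch; the helper name is made up, and random.sample keeps the chosen bit positions distinct so flips cannot cancel each other out):

    import random

    def flip_bits(msg, nflips):
        """Return a copy of msg with nflips distinct, randomly chosen bits complemented."""
        out = bytearray(msg)
        for pos in random.sample(range(len(msg) * 8), nflips):
            out[pos // 8] ^= 0x80 >> (pos % 8)
        return bytes(out)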
>> How many iterations should I have to run it for to get the same crc on
>> the "corrupted" data as the non-corrupted data?  The detection failure
>> rate doesn't appear to be one in 65536 or anything like it.
>>
>> If I corrupt more bits would I get a higher detection failure rate?
>
> This depends on the polynomial.  E.g., simple "parity" will
> detect *all* one bit errors -- but *no* 2 bit errors.  And,
> yet again, all *3* bit errors, etc.
>
>> If I generate two pseudo-random 32 byte "messages" and compare their
>> CRCs, how long would I have to iterate for to get a match on the CRCs?
>
> How "random" is your PRNG?  Chances are, you are generating *it*
> with a polynomial, as well...
It's not random enough. I'm mainly relying on get_millisecs from my operating system to return sufficiently varying values.
>> Any thoughts?
>
> Theory will help you more than practice, here.  You need to run a
> really *long* simulation to get meaningful numbers.  And, have to be
> sure your PRNG really *is* random and doesn't just loop over a small
> subregion of the space.  E.g., if your PRNG has too short of a cycle
> length, there are values (combinations of bits) that you will never
> exercise.
My PRNG wasn't working to start with and I got zero detection failures in 300 million iterations. I suspect the Python version posted by Mel has this problem too because he got zero "undetected errors" for a million iterations when it should have been approx 1000000/32768 = 30.
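A sketch of the whole experiment on top of Python's random module (a Mersenne Twister, so cycle length is not the limiting factor) and binascii.crc_hqx for the CCITT CRC, reusing the hypothetical flip_bits helper from the sketch above; with an even flip count the result should land near that 1-in-32768 figure:

    import random, binascii

    def crc(msg):
        return binascii.crc_hqx(msg, 0xFFFF)            # CRC-16/CCITT, polynomial 0x1021

    TRIALS, MSG_LEN, NFLIPS = 1_000_000, 32, 8          # NFLIPS kept even, as in the original test
    undetected = 0
    for _ in range(TRIALS):
        msg = bytes(random.getrandbits(8) for _ in range(MSG_LEN))
        if crc(flip_bits(msg, NFLIPS)) == crc(msg):     # flip_bits as defined in the sketch above
            undetected += 1
    print(undetected, "undetected in", TRIALS, "trials")  # expect roughly 1000000/32768, about 30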
Reply by Shane williams, April 12, 2011
On Apr 12, 3:32 am, Philip Koopman <koop...@cmu.edu> wrote:
> Shane williams <shane.2471...@gmail.com> wrote:
>
>> Out of curiosity, I've lashed up a program (see below) to try and test
>> the detection failure rate for the ccitt 16 bit crc.  It repeatedly
>> generates a buffer of 32 pseudo-random bytes, calculates the crc, then
>> complements 8 pseudo-random bits and re-calculates the crc.
>>
>> How many iterations should I have to run it for to get the same crc on
>> the "corrupted" data as the non-corrupted data?  The detection failure
>> rate doesn't appear to be one in 65536 or anything like it.
>>
>> If I corrupt more bits would I get a higher detection failure rate?
>
> For a 16-bit CRC it should be about one in 65536 on average for large
> numbers of bit errors. BUT, if you use a polynomial divisible by (x+1)
> what you get is 100% of odd numbers of bit flips detected and only one
> out of 32768 even numbers.  In other words, the whole undetected error
> budget gets pushed into the even number of bit errors case.  I'll bet
> you are seeing 1/32768 because you are flipping an even number of bits.
>
I was actually getting zero errors because there was something wrong with my random number generator. Now that I've improved the PRNG, I'm getting numbers close to 1/32768 regardless of whether I flip 8 bits or 150 bits. I deliberately flipped an even number of bits because I knew the CRC detects all odd numbers of bit flips.
> For small numbers of bit errors it all depends on the polynomial.  So
> this is generic asymptotic error detection performance as number of bit
> errors increases.
>
>> If I generate two pseudo-random 32 byte "messages" and compare their
>> CRCs, how long would I have to iterate for to get a match on the CRCs?
>
> It turns out the message content doesn't matter, only the error
> patterns.  But for that it is pseudo-random; asymptotically zero OR one
> out of every 64K OR one out of every 32K for large numbers of bit errors
> depending on even/odd number of errors and the polynomial.
ok, I'm getting this.
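The "only the error pattern matters" point follows from the linearity of the CRC, and is easy to see numerically (a rough sketch; binascii.crc_hqx is Python's CCITT CRC, and the sizes mirror the 32-byte test above): for one fixed error pattern, the corrupted CRC either matches the original for every message or for none of them.

    import random, binascii

    def crc(msg):
        return binascii.crc_hqx(msg, 0xFFFF)     # CRC-16/CCITT

    # One fixed error pattern, XORed into many different random messages.
    error = bytes(random.getrandbits(8) for _ in range(32))
    outcomes = set()
    for _ in range(10_000):
        msg = bytes(random.getrandbits(8) for _ in range(32))
        bad = bytes(m ^ e for m, e in zip(msg, error))
        outcomes.add(crc(bad) == crc(msg))
    print(outcomes)   # always a single outcome: {False} for almost every pattern,
                      # {True} only for the rare undetectable ones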
>> Any thoughts?
>
> Simulations take a really long time to evaluate polynomials.  Also, you
> have to use a good random number generator or your results won't be
> accurate.  I tried that for a while but it just takes forever to get
> answers. There are some analytic answers available that are pretty hairy
> to do on your own, but are exact answers (not stochastic simulations).
ok. In another thread here I learnt/realized that in a byte protocol with start and stop bits, corruption of a start bit can cause the data to "morph" into something else if it occurs before the length byte of a message, defeating the ability of the crc to detect single bit errors. It could also be a problem if it occurs after the length byte, depending on how gaps in the data are handled, etc. Thanks.
Reply by D Yuniskis, April 11, 2011
Hi Shane,

On 4/10/2011 6:04 AM, Shane williams wrote:
> Out of curiosity, I've lashed up a program (see below) to try and test
> the detection failure rate for the ccitt 16 bit crc.  It repeatedly
To clarify, you are trying to determine how many *undetectable* errors there are? What does the polynomial *claim* to detect? (i.e., it might detect *all* N bit errors, *some* M bit errors and, possibly, *no* P bit errors)
> generates a buffer of 32 pseudo-random bytes, calculates the crc, then
> complements 8 pseudo-random bits and re-calculates the crc.
Why always 8?
> How many iterations should I have to run it for to get the same crc on
> the "corrupted" data as the non-corrupted data?  The detection failure
> rate doesn't appear to be one in 65536 or anything like it.
>
> If I corrupt more bits would I get a higher detection failure rate?
This depends on the polynomial. E.g., simple "parity" will detect *all* one bit errors -- but *no* 2 bit errors. And, yet again, all *3* bit errors, etc.
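A tiny illustration of that parity behaviour (a throwaway sketch with an arbitrary data byte and arbitrary flipped positions):

    word = 0b10110010
    parity = bin(word).count("1") & 1            # parity of the data bits

    one_flip  = word ^ 0b00000100                # any single bit flip changes the parity
    two_flips = word ^ 0b00100100                # any pair of flips leaves it unchanged
    print(parity, bin(one_flip).count("1") & 1, bin(two_flips).count("1") & 1)
    # -> 0 1 0: the one-bit error is caught, the two-bit error slips through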
> If I generate two pseudo-random 32 byte "messages" and compare their
> CRCs, how long would I have to iterate for to get a match on the CRCs?
How "random" is your PRNG? Chances are, you are generating *it* with a polynomial, as well...
> Any thoughts?
Theory will help you more than practice, here. You need to run a really *long* simulation to get meaningful numbers. And, have to be sure your PRNG really *is* random and doesn't just loop over a small subregion of the space. E.g., if your PRNG has too short of a cycle length, there are values (combinations of bits) that you will never exercise.