crc for versioning


Suppose you have some software that is 16K bytes.
Is CRC16 - 16 bit crc, a reasonable way of producing a number that
uniquely identifies a particular revision of this software?  What are
the odds of two different revisions having the same CRC?  Are they any
better than using a 16 bit arithmetic checksum?


TIA

Reply by ●October 6, 20052005-10-06

darkknight wrote:
> Suppose you have some software that is 16K bytes.
> Is CRC16 - 16 bit crc, a reasonable way of producing a number that
> uniquely identifies a particular revision of this software?  What are
> the odds of two different revisions having the same CRC?  Are they any
> better than using a 16 bit arithmetic checksum?
>
>
> TIA

While the chances of two images having the same CRC is 1:65536, after
you have built a few versions the probability that you will get two the
same rapidly increases.

This is similar to the classic example of shared birthdays in a school
class.  The probability that two children have the same birthday is
1:365 but the probability that two children out of a class of 20 share
a birthday is well over 50%.

A better way is to have a const initialised data element and write a
utility to autoincrement the value it stores.  Add this to the build
and ensure that the file is re-compiled on every build.

Ian

Reply by Noway2 ●October 6, 20052005-10-06

Based on experience, I would say that placing a check value in the
software is a terrific idea, especially if you are able to calculate it
and embedded it in the output file. This way, the system can verify
the integrity of the rom by running a check on the memory.

Below, I have included a short assembly code snipit that I used in a
project. The compiler has a macro called DATE that inserts the PCs
compile time and date. With this, you compute the low and high bytes
(assuming byte wide memory) and stick them in the software. Next, you
calculate a negative number that subtracts out the values added by the
date and time.

This idea won't work with a CRC wich is order dependant, but will work
with an addition checksum, which is far easier and faster to compute
during startup. While it isn't as secure as a CRC algorithm you get
the benefit of having the compile time and date embedded in the rom and
it is run time accessible so it can be displayed, etc. A combination
of a check value like a CRC along with a specific compile date and time
can give you a pretty good indication that the image is the correct
one.

/******* snip **********/
MONTH_MSB EQU ( (DATE 5) / 10)+0x30 ; Month
MONTH_LSB EQU ( (DATE 5) MOD 10) +0x30

FIRMVER dc.b 'Software Version X.X''
dc.b 'Created on '
dc.b MONTH_MSB,MONTH_LSB
dc.b '/'

CRCFIX dc.b 0-(low
(MONTH_MSB+MONTH_LSB+DAY_MSB+DAY_LSB+YEAR_MSB+YEAR_LSB+HOUR_MSB+HOUR_LSB+MINUTE_MSB+MINUTE_LSB+SECOND_MSB+SECOND_LSB))
;The CRCFIX subtracts out the month,day,year, etc for the purposes of
compling a checksum
/********* end snip ********/

Reply by Rene Tschaggelar ●October 6, 20052005-10-06

darkknight wrote:

> 
> Suppose you have some software that is 16K bytes.
> Is CRC16 - 16 bit crc, a reasonable way of producing a number that
> uniquely identifies a particular revision of this software?  What are
> the odds of two different revisions having the same CRC?  Are they any
> better than using a 16 bit arithmetic checksum?

Forget a checksum, they are too easily fooled.

A 16bit CRC protects 2^16 bits = 8kbytes against
single bit errors. So a 32bit CRC is possible a
better fit.

To check uniqueness of some date, the hash value
was created. This is polynominal with some more
digits than just a CRC.

Rene
-- 
Ing.Buero R.Tschaggelar - http://www.ibrtses.com
& commercial newsgroups - http://www.talkto.net

Reply by Grant Edwards ●October 6, 20052005-10-06

On 2005-10-06, darkknight <darkknight.21@gmail.com> wrote:

> Suppose you have some software that is 16K bytes.

> Is CRC16 - 16 bit crc, a reasonable way of producing a number that
> uniquely identifies a particular revision of this software?

That depends on the consequences of a collision.  If it kills
people, then no.  If it wastes 10 minutes of somebody's time,
probably yes.

> What are the odds of two different revisions having the same
> CRC?

Uh, 1 out of 65536.

> Are they any better than using a 16 bit arithmetic checksum?

No, they're exactly the same.

-- 
Grant Edwards                   grante             Yow!  Here I am in 53
                                  at               B.C. and all I want is a
                               visi.com            dill pickle!!

Reply by Grant Edwards ●October 6, 20052005-10-06

On 2005-10-06, Rene Tschaggelar <none@none.net> wrote:
> darkknight wrote:
>
>> 
>> Suppose you have some software that is 16K bytes.
>> Is CRC16 - 16 bit crc, a reasonable way of producing a number that
>> uniquely identifies a particular revision of this software?  What are
>> the odds of two different revisions having the same CRC?  Are they any
>> better than using a 16 bit arithmetic checksum?
>
> Forget a checksum, they are too easily fooled.

That depends on the data block's failure modes.  CRC's are
particularly good for serial data transmission media
susceptible to burst-mode errors.  If serial transmission with
burst-mode errors isn't what you're trying to detect, then a
CRC might not be "less easily fooled" than another hash.

The odds of a collision between two random blocks of data are
exactly the same for an N-bit CRC and for an N-bit "checksum".

> A 16bit CRC protects 2^16 bits = 8kbytes against single bit
> errors. So a 32bit CRC is possible a better fit.

That depends on the failure mode, the cost of the calculation,
and the cost of a failure.

> To check uniqueness of some date, the hash value was created.

A CRC is a hash value.  So is a checksum.

> This is polynominal with some more digits than just a CRC.

I presume you're talking about "one-way" hash functions like
MD5 or SHA?  

The "one-way" property prevents somebody from forging a message
with something other thana brute-force approach.  The length of
the hash is to make a brute-force attack more expensive.

If you don't care about forgeries, there are probably cheaper
hash functions.

-- 
Grant Edwards                   grante             Yow!  Finally, Zippy
                                  at               drives his 1958 RAMBLER
                               visi.com            METROPOLITAN into the
                                                   faculty dining room.

Reply by Jack Klein ●October 7, 20052005-10-07

On Thu, 06 Oct 2005 20:32:40 +1300, darkknight
<darkknight.21@gmail.com> wrote in comp.arch.embedded:

> 
> 
> Suppose you have some software that is 16K bytes.
> Is CRC16 - 16 bit crc, a reasonable way of producing a number that
> uniquely identifies a particular revision of this software?  What are
> the odds of two different revisions having the same CRC?  Are they any
> better than using a 16 bit arithmetic checksum?
> 
> 
> TIA

I used to use CRC for validating images, but looking for programming
errors in a flash/EPROM/ROM image has a considerably different failure
mode and error pattern than serial communications, which is what CRCs
are most useful for.

These days I use a Fletcher checksum, which is faster than a table
loop up CRC and doesn't need space for the table, and much faster than
bit banging a CRC in code.  The Fletcher checksum is sensitive to the
order of the data values as well as the values themselves.  Swap any
two (bytes, words) in the image and an ordinary checksum comes out the
same, but a Fletcher checksum, like a CRC, will spot the difference.

For small blocks less than 256 bytes, like EEPROM values, we use an
8-bit Fletcher sum, for sizes up to 64K a 16-bit sum, and for larger
values a 32-bit sum.  The total number of bits in the sum is twice
that of algorithm, so doing a 16-bit Fletcher sum on your 16K data
would give you a 32-bit result, which is sensitive to the ordering as
well as to the values.

Here's basic C code for a 16-bit Fletcher checksum:

struct fletch_16
{
    uint16_t sum;
    uint16_t total;
};

fletch_16 calc_fletch_16(const void *source, size_t size)
{
   const uint16_t *src = source;
   fletch_16 result = { 0 };

   while (size--)
   {
      result.sum += *src++;
      result.total += result.sum;
   }

   return result;
}

-- 
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++
http://www.contrib.andrew.cmu.edu/~ajo/docs/FAQ-acllc.html

Reply by ●October 7, 20052005-10-07

Jack Klein <jackklein@spamcop.net> writes:
> 
> Here's basic C code for a 16-bit Fletcher checksum:
> 
> struct fletch_16
> {
>     uint16_t sum;
>     uint16_t total;
> };
> 
> fletch_16 calc_fletch_16(const void *source, size_t size)
> {
>    const uint16_t *src = source;
>    fletch_16 result = { 0 };
> 
>    while (size--)
>    {
>       result.sum += *src++;
>       result.total += result.sum;
>    }
> 
>    return result;
> }

I think you need to change "fletch_16" to "struct fletch_16" in the
function.

(This only cought my eye because of our discussion regarding uint16_t
etc in clc!)

-- 

John Devereux

Reply by Robert Scott ●October 7, 20052005-10-07

On Thu, 06 Oct 2005 22:45:56 -0500, Jack Klein <jackklein@spamcop.net>
wrote:

>...The Fletcher checksum is sensitive to the
>order of the data values as well as the values themselves.  Swap any
>two (bytes, words) in the image and an ordinary checksum comes out the
>same, but a Fletcher checksum, like a CRC, will spot the difference...

Not always true.  For example, consider these bytes (in hex):

 80  00  00    and
 00  00  80

The first and third bytes were swapped.  But the Fletcher checksum in
both cases is {0x80, 0x80}.

However I will grant you that in most random case, the Fletcher
checksum will detect a byte swap.


>
>Here's basic C code for a 16-bit Fletcher checksum:
>
>struct fletch_16
>{
>    uint16_t sum;
>    uint16_t total;
>};
>
>fletch_16 calc_fletch_16(const void *source, size_t size)
>{
>   const uint16_t *src = source;
>   fletch_16 result = { 0 };
>
>   while (size--)
>   {
>      result.sum += *src++;
>      result.total += result.sum;
>   }
>
>   return result;
>}

-Robert Scott
 Ypsilanti, Michigan

Reply by Not Really Me ●October 10, 20052005-10-10

"darkknight" <darkknight.21@gmail.com> wrote in message 
news:dek9k19183g8k8ghbiqsja4hppkctmrv7i@4ax.com...
>
>
> Suppose you have some software that is 16K bytes.
> Is CRC16 - 16 bit crc, a reasonable way of producing a number that
> uniquely identifies a particular revision of this software?  What are
> the odds of two different revisions having the same CRC?  Are they any
> better than using a 16 bit arithmetic checksum?
>
>
> TIA

I'll take a run in a different direction.   Rather than a single value that 
may repeat, consider other options such as time and date stamping, actual 
version numbers, etc., embedded in the code.

You didn't mention what compiler you are using, but many have features for 
putting date/time stamps in code.  If you are using make files it is not too 
difficult to have each run of make spawn a simple batch file to either 
collect the system date/time or to even use a script to increment a 
version/build number.  Anything generated by a script/executable could put 
the data in a file that you can include in the compile.

I think you will find this much more applicable to your stated goal than 
using the CRC.  At the same time, if you need to check the code validity at 
run time, a CRC embedded in the file is a good choice.
-- 
Scott
Validated Software Corp.

crc for versioning

Sign in

You might also like...

Search forums

Free PDF Downloads

Blogs - Hall of Fame

Discussion Groups

Quick Links

About EmbeddedRelated.com

Social Networks

The Related Media Group