Reply by Don Y August 5, 2023
On 5/10/2023 3:03 AM, Ulf Samuelsson wrote:
> Programming a flash memory can flip bits in parts of the flash memory which is not programmed.
> Bit errors can also be introduced by radiation.
Executing code can also introduce write *and* read disturb events. Ask yourself how to protect a design that allows arbitrary code to be executed (even if in a sandbox) in the presence of potential side-channel exploits. Or, as a "simpler" problem: how to detect if such an exploit has been invoked (even possibly unintentionally)! [Imagine devices that "run forever"...]
> Some applications require better security than others.
Exactly.
> Functional Safety may require CRC size based on code size.
Reply by Don Y August 5, 2023
On 5/3/2023 5:48 AM, David Brown wrote:

>>>> Give me the sources for Windows (Linux, *BSD, etc.) and I can subvert all the state-of-the-art digital signing used to ensure binaries aren't altered. Nothing *outside* the box is involved so, by definition, everything I need has to reside *in* the box.
>>>
>>> No, you can't. The sources for Linux and *BSD /are/ all freely available. The private signing keys used by, for example, Red Hat or Debian, are /not/ freely available. You cannot make changes to a Red Hat or Debian package that will pass the security checks - you are unable to sign the packages.
>>
>> Sure I can! If you are just signing a package to verify that it hasn't been tampered with BUT THE CONTENTS ARE NOT ENCRYPTED, then all you have to do is remove the signature check -- leaving the signature in the (unchecked) executable.
>
> Woah, you /really/ don't understand this stuff, do you? Here's a clue - ask yourself what is being signed, and what is doing the checking.
Exactly. You don't attack the signature or the keys. You BUILD A NEW KERNEL THAT DOESN'T CHECK THE SIGNATURE. You attack (replace, if you have access to the sources -- as I stipulated above) the "what is doing the checking". This is c.a.e; you likely have physical access to and control of the device (unlike trying to attack a remote system).

The binary is exposed UNENCRYPTED in the signed executable (please note my stipulation to that, too, above). The only thing preventing its execution (if tampered -- or unlicensed!) is the signature check. Bypass that in any way and the code executes AS IF signed.

I design "devices". Don't you think that if there were a foolproof way (by resorting to "school boy techniques") to protect them from counterfeiting and tampering, I would have already embraced it? That EVERY computer-based product would be INHERENTLY SECURED?

[There are ways that are far from theoretical yet considerably more effective. A signature check is easy to detect in an executing device and, thus, elided. Lather, rinse, repeat for each precursor level of such protection.]

Please read what I've written more carefully, lest you look foolish. Or, spend a few years playing red-blue games and actually trying to subvert hardware and software protection mechanisms in REAL products. (Hint: you will need to think BELOW the hardware level to do so successfully, so you can bypass the hardware mechanisms that folks keep trying to embed in their products.)
> Perhaps also ask yourself if /all/ the people involved in security for Linux or BSD - all the companies such as Red Hat, IBM, Intel, etc. - ask if /all/ of them have got it wrong, and only /you/ realise that digital signatures on open source software are useless?
The signature is only of use if the mechanism verifying it is tamperproof. That's not possible on most (all?) devices sold. SOMEONE has physical access to the device, so all of the mechanisms you put in place can be subverted.

*Ask* the Linux and BSD crowds if they can GUARANTEE that ALTERED signed code can't be executed on a system where the adversary can build and install their own kernel. Or, probe the inner workings of such a device AT THEIR LEISURE.
> /Very/ occasionally, there is a lone genius that understands something while all the other experts are wrong - but in most cases, the loner is the one that is wrong.
In this case, you have clearly failed to understand what was being said. So, don't count yourself in with the "experts". If the kernel loading the executable doesn't contain code to validate the signature (and, if I have the sources for said kernel/OS then I can easily *make* such a kernel) then the signature is just another unused "section" in the BLOB. Just like debug symbols or copyright information.
Reply by Ulf Samuelsson May 10, 2023
On 2023-05-10 10:06, David Brown wrote:
> On 09/05/2023 20:42, Ulf Samuelsson wrote:
>> On 2023-05-03 14:48, David Brown wrote:
>>> It makes sense to use an 8-bit CRC on small telegrams, 16-bit CRC on bigger things, 32-bit CRC on flash images, and 64-bit CRC when you want to use the CRC as an identifying hash (and malicious tampering is non-existent). There can also be benefits of particular choices of CRC for particular use-cases, in terms of detection of certain error patterns for certain lengths of data.
>>
>> Flash images larger than X kB may need a 64-bit CRC.
>> I don't remember exactly when to start considering it, but something between 64 kB and 256 kB is probably correct.
>>
>> It is all to do with Hamming distance, and this is also affected by the polynomial.
>> /Ulf
>
> "Need" is too strong a word here. A CRC will guarantee detection of certain kinds of error (such as a single bit error), regardless of the length of the data. Some kinds of error are limited by length. If you plot a graph with guaranteed Hamming distance on the vertical scale and length of data on the horizontal scale, each CRC will drop off in steps. For the same CRC size, some will hold a high Hamming distance for longer and then drop off sharply; others will hold a lower Hamming distance for very large data. And in general, a bigger CRC will be better here.
>
> But Hamming distance is not everything. It is important in situations where there is an approximately independent risk of corruption for each bit individually - such as during radio transmission. Programming images into flash has a completely different error risk pattern. A little Hamming is nice to guarantee that any single cell failure in the flash will be found, but the more realistic flash problems involve large-scale effects - failure to erase a block fully, or software flaws. For this kind of thing, pretty much any valid CRC polynomial works the same - a 32-bit polynomial gives you a 1 in 2^32 chance of the error going undetected. Yes, a 1 in 2^64 chance is better, but it's rarely something to get excited about.
Programming a flash memory can flip bits in parts of the flash memory which are not programmed. Bit errors can also be introduced by radiation. Some applications require better security than others. Functional Safety may require the CRC size to be based on the code size.

/Ulf
> Note that if you are sending the image to a board via a potentially flawed mechanism, you'll want appropriate checks during the transfers. Ethernet, Wifi, Bluetooth, USB - they will all have suitable checksums for each packet. And for some of those, Hamming distance and particular choice of polynomial /is/ an important consideration.
Reply by David Brown May 10, 2023
On 09/05/2023 20:34, Ulf Samuelsson wrote:
> On 2023-04-30 16:19, David Brown wrote:
>> On 29/04/2023 23:03, Ulf Samuelsson wrote:
>>> On 2023-04-28 15:04, David Brown wrote:
>>>> On 28/04/2023 10:50, Ulf Samuelsson wrote:
>>>>> On 2023-04-28 09:38, David Brown wrote:
>>>>>> Or for my preferences, the CRC "DIGEST" would be put at the end of the image, rather than near the start. Then the "from, to" range would cover the entire image except for the final CRC. But I'd have a similar directive for the length of the image at a specific area near the start.
>>>>>
>>>>> I really do not see a benefit of splitting the meta information about the image to two separate locations.
>>>>>
>>>>> The bootloader uses the struct for all checks. It is a much simpler implementation once the tools support it.
>>>>>
>>>>> You might find it easier to write a tool which adds the CRC at the end, but that is a different issue.
>>>>>
>>>>> Occam's Razor!
>>>>
>>>> There are different needs for different projects - and more than one way to handle them. I find adding a CRC at the end of the image works best for me, but I have no problem appreciating that other people have different solutions.
>>>
>>> I'd be curious to know WHY it works best for you.
>>> /Ulf
>>
>> I regularly do not have a bootloader - I am not free to put a CRC at the start of the image. And if the bootloader itself needs to be updatable, it is again impossible to have the CRC (or any other metadata) at the start of the image. I want most of the metadata to be at a fixed location as close to the start as reasonably practical (such as after the vector table, or other microcontroller-specific information that might be used for flash security, early chip setup, etc.). If I am to have one single checksum for the image, which is what I prefer, then it has to be at the end of the image. For example, there might be:
>>
>> 0x00000000 : vectors
>> 0x00000400 : external flash configuration block
>> 0x00000600 : program info metadata
>> 0x00001000 : main program
>>            : CRC
>>
>> There is no way to have the metadata or CRC at the start of the image, so the CRC goes at the end.
>
> For the Bootloader, I keep the CRC right after the vectors.
> I keep a copy of the vectors right after the CRC, and compare the two vector tables.
> This is to always know the location of the CRC.
Fair enough - that is an entirely reasonable alternative. I have a knee-jerk reaction against duplication as a check, having cut my teeth on microcontrollers where 16 KB devices were "big", but of course a duplication of the vector table is not going to be a noticeable waste on a more modern device. It does, however, mean extra steps in checking, compared to a simpler CRC run over the entire image.
>> It would be possible to have two CRCs - one that covers the vectors, configuration information, and metadata and is placed second last in the metadata block. A second CRC placed last in the metadata block would cover the main program - everything after the CRCs. That would let me have a single metadata block and no CRC at the end of the image. However, it would mean splitting the check in two, rather than one check for the whole image. I don't see that as a benefit.
>>
>> When making images that are started from a bootloader, I certainly /could/ put the CRC at the start. But I see no particular reason to do so - it makes a lot more sense to keep a similar format.
>
> You want more metadata like entry point and length, as well as text information about the image. Putting things in a header means that location is fixed.
> There are a number of checks in my bootloader to ensure that the information in the header makes sense.
I do have all that kind of thing too. It's only the CRC itself that is put at the end, and it is easily found since the length of the image is in the metadata. (We are talking about one pointer access more than having it at a fixed address - it's not hard to find it!).
>> (Bootloaders don't often have to check their own CRC - after all, even if the CRC fails there is usually little you can do about it, except charge on and hope for the best. But if the bootloader is updatable in system, then you want a CRC during the download procedure to check that you have got a good download copy before updating the flash.)
>
> In functional safety applications you regularly check the flash contents and refuse to boot if there is a mismatch.
Yes, that is a possibility. I've worked on safety-certified systems which required things like regular checks of flash while running (not just at bootup). A lot of the so-called "safety requirements" were directly detrimental. I believe many of these kinds of requirements were made by people who understood the "Swiss cheese" model of risks and safety, but not the more realistic "Hot cheese" model. And they seem more concerned about box-ticking and legal arse-covering than actual risk reduction.
Reply by David Brown May 10, 2023
On 09/05/2023 20:42, Ulf Samuelsson wrote:
> On 2023-05-03 14:48, David Brown wrote:
>> It makes sense to use an 8-bit CRC on small telegrams, 16-bit CRC on bigger things, 32-bit CRC on flash images, and 64-bit CRC when you want to use the CRC as an identifying hash (and malicious tampering is non-existent). There can also be benefits of particular choices of CRC for particular use-cases, in terms of detection of certain error patterns for certain lengths of data.
>
> Flash images larger than X kB may need a 64-bit CRC.
> I don't remember exactly when to start considering it, but something between 64 kB and 256 kB is probably correct.
>
> It is all to do with Hamming distance, and this is also affected by the polynomial.
> /Ulf
"Need" is too strong a word here. A CRC will guarantee detection of certain kinds of error (such as a single bit error), regardless of the length of the data. Some kinds of error are limited by length. If you plot a graph with guaranteed Hamming distance on the vertical scale and length of data on the horizontal scale, each CRC will drop off in steps. For the same CRC size, some will hold a high Hamming distance for longer and then drop off sharply, others will hold a lower Hamming distance for very large data. And in general, a bigger CRC will be better here. But Hamming distance is not everything. It is important in situations where there is an approximately independent risk of corruption for each bit individually - such as during radio transmission. Programming images into flash has a completely different error risk pattern. A little Hamming is nice to guarantee that any single cell failure in the flash will be be found, but the more realistic flash problems involve large scale effects - failure to erase a block fully, or software flaws. For this kind of thing, pretty much any valid CRC polynomial works the same - a 32-bit polynomial gives you a 1 in 2 ^ 32 chance of the error going undetected. Yes, a 1 in 2 ^ 64 chance is better, but it's rarely something to get excited about. Note that if you are sending the image to a board via a potentially flawed mechanism, you'll want appropriate checks during the transfers. Ethernet, Wifi, Bluetooth, USB - they will all have suitable checksums for each packet. And for some of those, Hamming distance and particular choice of polynomial /is/ an important consideration.
Reply by Ulf Samuelsson May 9, 2023
On 2023-05-03 14:48, David Brown wrote:
> On 03/05/2023 09:15, Don Y wrote:
>> On 4/24/2023 7:37 AM, David Brown wrote:
>>> On 24/04/2023 09:32, Don Y wrote:
>>>> On 4/22/2023 7:57 AM, David Brown wrote:
>>>>>>> However, in almost every case where CRC's might be useful, you have additional checks of the sanity of the data, and an all-zero or all-one data block would be rejected. For example, Ethernet packets use CRC for integrity checking, but an attempt to send a packet type 0 from MAC address 00:00:00:00:00:00 to address 00:00:00:00:00:00, of length 0, would be rejected anyway.
>>>>>>
>>>>>> Why look at "data" -- which may be suspect -- and *then* check its CRC? Run the CRC first. If it fails, decide how you are going to proceed or recover.
>>>>>
>>>>> That is usually the order, yes. Sometimes you want "fail fast", such as dropping a packet that was not addressed to you (it doesn't matter if it was received correctly but for someone else, or it was addressed to you but the receiver address was corrupted - you are dropping the packet either way). But usually you will run the CRC then look at the data.
>>>>>
>>>>> But the order doesn't matter - either way, you are still checking for valid data, and if the data is invalid, it does not matter if the CRC only passed by luck or by all zeros.
>>>>
>>>> You're assuming the CRC is supposed to *vouch* for the data. The CRC can be there simply to vouch for the *transport* of a datagram.
>>>
>>> I am assuming that the CRC is there to determine the integrity of the data in the face of possible unintentional errors. That's what CRC checks are for. They have nothing to do with the content of the data, or the type of the data package or image.
>>
>> Exactly. And, a CRC on *a* protocol can use ANY ALGORITHM that the protocol defines. Not some "canned one-size fits all" approach.
>
> It makes sense to use an 8-bit CRC on small telegrams, 16-bit CRC on bigger things, 32-bit CRC on flash images, and 64-bit CRC when you want to use the CRC as an identifying hash (and malicious tampering is non-existent). There can also be benefits of particular choices of CRC for particular use-cases, in terms of detection of certain error patterns for certain lengths of data.
Flash images larger than X kB may need a 64-bit CRC. I don't remember exactly when to start considering it, but something between 64 kB and 256 kB is probably correct.

It is all to do with Hamming distance, and this is also affected by the polynomial.

/Ulf
> What I don't see any point in is using variations, such as different initial values. I've already said why I think pathological cases such as all zero data are normally irrelevant - but I can accept that there may be occasions when they could happen, and thus a /single/ non-zero initial value would be useful.
>
>>> As an example of the use of CRC's in messaging, look at Ethernet frames:
>>>
>>> <https://en.wikipedia.org/wiki/Ethernet_frame>
>>>
>>> The CRC does not care about the content of the data it protects.
>>
>> AND, if the packet yielded an incorrect CRC, you can assume the data was corrupt... OR, you are looking at a different protocol and MISTAKING it for something that you *think* it might be.
>
> If the CRC does not match, you reject the packet or data. End of story. You don't know or care /why/ - because you cannot be sure of any reason.
>
>> If I produce a stream of data, can you tell me what the checksum for THAT stream *should* be? You have to either be told what it is (and have a way of knowing what the checksum SHOULD be) *or* have to make some assumptions about it.
>
> If you are transmitting some data then both sides need to agree on the CRC algorithm (size, polynomial, initial value, etc.), and on whether a check is "CRC of everything gives 0" or "CRC of everything except the pre-calculated CRC equals the transmitted pre-calculated CRC".
>
>> If you have assumed wrong *or* if the data has been corrupt, then the CRC should fail. You don't care why it failed -- because you can't do anything about it. You just know that you can't use the data in the way you THOUGHT it could be used.
>
> Well, yes. Obviously.
>
> If you are making incorrect assumptions here, someone is doing a pretty poor job at designing, describing or implementing the communications system. It is just like getting the baud rate wrong on a UART link.
>
>>>> So, use a version-specific CRC on the packet. If it fails, then either the data in the packet has been corrupted (which could just as easily have involved an embedded "interface version" parameter); or the packet was formed with the wrong CRC.
>>>>
>>>> If the CRC is correct FOR THAT VERSION OF THE PROTOCOL, then why bother looking at a "protocol version" parameter? Would you ALSO want to verify all the rest of the parameters?
>>>
>>> I'm sorry, I simply cannot see your point. Identifying the version of a protocol, or other protocol type information, is a totally orthogonal task to ensuring the integrity of the data. The concepts should be handled separately.
>>
>> It is. A packet using protocol XYZ is delivered to port ABC. Port ABC *only* handles protocol XYZ. Anything else arriving there, with a potentially different checksum, is invalid. Even if, for example, byte number 27 happens to have the correct "magic number" for that protocol.
>>
>> Because the message doesn't obey the rules defined by the protocol FOR THAT PORT. What do I gain by insisting that byte number 27 must be 0x5A that the CRC doesn't already tell me?
>
> A CRC failure doesn't tell you that the telegram type is wrong. It tells you that the data is corrupted.
> If there can be different protocols, or telegram types, or whatever, then identify them. Stop playing silly buggers with abuse of different concepts that have different roles in the communication system.
>
>>>> Salt just ensures that you can differentiate between functionally identical values. I.e., in a CRC, it differentiates between the "0x0000" that CRC-1 generates from the "0x0000" that CRC-2 generates.
>>>
>>> Can we agree that this is called an "initial value", not "salt"?
>>
>> It depends on how you implement it. The point is to produce different results for the same polynomial.
>
> It is called an "initial value" - it is not "salt". It doesn't matter if you want to pick different initial values for your CRC, or why you want to do that. You are still not talking about salt.
>
> If you insist on using your own terminology, you will be left talking to yourself.
>
>>>> You don't see the parallel to ensuring that *my* use of "Passw0rd" is encoded in a different manner than *your* use of "Passw0rd"?
>>>
>>> No. They are different things.
>>>
>>> An important difference is that adding "salt" to a password hash is an important security feature. Picking a different initial value for a CRC instead of having appropriate protocol versioning in the data (or a surrounding envelope) is a misfeature.
>>
>> And you don't see that verifying that a packet of data received at port ABC that should only see the checksum associated with protocol XYZ as being similarly related?
>
> No. They are different things.
>
> Look, I /do/ understand what you are doing, and I appreciate that you think it is a good idea. To me, it is an unpleasant mix of orthogonal concepts that needlessly complicates things. Just because something is /possible/, does not mean it is a good idea.
>
>>>>> See the RMI description.
>>>>
>>>> I'm sorry, I have no idea what "RMI" is or where it is described. You've mentioned that abbreviation twice, but I can't figure it out.
>>>
>>> <https://en.wikipedia.org/wiki/RMI>
>>> <https://en.wikipedia.org/wiki/OCL>
>>>
>>> Nothing magical with either term.
>>
>> I looked up RMI on Wikipedia before asking, and saw nothing of relevance to CRC's or checksums.
>
> I've snipped the ramblings that have nothing to do with the question I asked. I assume you don't want to answer me.
>
>>> I noticed no mention of "OCL" in your posts, and looking
>>
>> You need to read more carefully.
>
> I've looked. You did not mention "OCL" anywhere before giving the URL to the wikipedia page. You only mentioned it /afterwards/ - without any context that suggests what you meant. (Here's a hint for you - if you want to refer to a wikipedia page, put a link to the /relevant/ page.)
>
> Presumably "RMI" and "OCL" have particular meanings that are relevant for projects you work on, and are so familiar to you that they are part of your language. No one else knows or cares what they are, and they are irrelevant in this thread. So let's leave them there.
>
>>>> Give me the sources for Windows (Linux, *BSD, etc.) and I can subvert all the state-of-the-art digital signing used to ensure binaries aren't altered. Nothing *outside* the box is involved so, by definition, everything I need has to reside *in* the box.
>>> No, you can't. The sources for Linux and *BSD /are/ all freely available. The private signing keys used by, for example, Red Hat or Debian, are /not/ freely available. You cannot make changes to a Red Hat or Debian package that will pass the security checks - you are unable to sign the packages.
>>
>> Sure I can! If you are just signing a package to verify that it hasn't been tampered with BUT THE CONTENTS ARE NOT ENCRYPTED, then all you have to do is remove the signature check -- leaving the signature in the (unchecked) executable.
>
> Woah, you /really/ don't understand this stuff, do you? Here's a clue - ask yourself what is being signed, and what is doing the checking.
>
> Perhaps also ask yourself if /all/ the people involved in security for Linux or BSD - all the companies such as Red Hat, IBM, Intel, etc. - ask if /all/ of them have got it wrong, and only /you/ realise that digital signatures on open source software are useless? /Very/ occasionally, there is a lone genius that understands something while all the other experts are wrong - but in most cases, the loner is the one that is wrong.
Reply by Ulf Samuelsson May 9, 2023
On 2023-04-30 16:19, David Brown wrote:
> On 29/04/2023 23:03, Ulf Samuelsson wrote:
>> On 2023-04-28 15:04, David Brown wrote:
>>> On 28/04/2023 10:50, Ulf Samuelsson wrote:
>>>> On 2023-04-28 09:38, David Brown wrote:
>>>>> Or for my preferences, the CRC "DIGEST" would be put at the end of the image, rather than near the start. Then the "from, to" range would cover the entire image except for the final CRC. But I'd have a similar directive for the length of the image at a specific area near the start.
>>>>
>>>> I really do not see a benefit of splitting the meta information about the image to two separate locations.
>>>>
>>>> The bootloader uses the struct for all checks. It is a much simpler implementation once the tools support it.
>>>>
>>>> You might find it easier to write a tool which adds the CRC at the end, but that is a different issue.
>>>>
>>>> Occam's Razor!
>>>
>>> There are different needs for different projects - and more than one way to handle them. I find adding a CRC at the end of the image works best for me, but I have no problem appreciating that other people have different solutions.
>>
>> I'd be curious to know WHY it works best for you.
>> /Ulf
>
> I regularly do not have a bootloader - I am not free to put a CRC at the start of the image. And if the bootloader itself needs to be updatable, it is again impossible to have the CRC (or any other metadata) at the start of the image. I want most of the metadata to be at a fixed location as close to the start as reasonably practical (such as after the vector table, or other microcontroller-specific information that might be used for flash security, early chip setup, etc.). If I am to have one single checksum for the image, which is what I prefer, then it has to be at the end of the image. For example, there might be:
>
> 0x00000000 : vectors
> 0x00000400 : external flash configuration block
> 0x00000600 : program info metadata
> 0x00001000 : main program
>            : CRC
>
> There is no way to have the metadata or CRC at the start of the image, so the CRC goes at the end.
For the Bootloader, I keep the CRC right after the vectors. I keep a copy of the vectors right after the CRC, and compare the two vector tables. This is to always know the location of the CRC.
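[A rough sketch of the layout Ulf describes, for illustration only - the struct name, vector count, and helper are invented, not taken from his bootloader:]

#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define NUM_VECTORS 64u   /* invented; depends on the actual MCU */

struct boot_header {
    uint32_t vectors[NUM_VECTORS];      /* real vector table, at offset 0 */
    uint32_t crc;                       /* CRC at a known, fixed offset */
    uint32_t vectors_copy[NUM_VECTORS]; /* duplicate table, compared at boot */
};

/* Comparing the two tables confirms we are looking at a well-formed
   header - and thus at the right CRC location - before trusting it. */
static bool header_plausible(const struct boot_header *h)
{
    return memcmp(h->vectors, h->vectors_copy, sizeof h->vectors) == 0;
}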
> It would be possible to have two CRCs - one that covers the vectors, configuration information, and metadata and is placed second last in the metadata block. A second CRC placed last in the metadata block would cover the main program - everything after the CRCs. That would let me have a single metadata block and no CRC at the end of the image. However, it would mean splitting the check in two, rather than one check for the whole image. I don't see that as a benefit.
>
> When making images that are started from a bootloader, I certainly /could/ put the CRC at the start. But I see no particular reason to do so - it makes a lot more sense to keep a similar format.
You want more metadata like entry point and length, as well as text information about the image. Putting things in a header means that the location is fixed. There are a number of checks in my bootloader to ensure that the information in the header makes sense.
> (Bootloaders don't often have to check their own CRC - after all, even if the CRC fails there is usually little you can do about it, except charge on and hope for the best. But if the bootloader is updatable in system, then you want a CRC during the download procedure to check that you have got a good download copy before updating the flash.)
In functional safety applications you regularly check the flash contents and refuse to boot if there is a mismatch.

/Ulf
Reply by David Brown May 3, 2023
On 03/05/2023 09:15, Don Y wrote:
> On 4/24/2023 7:37 AM, David Brown wrote:
>> On 24/04/2023 09:32, Don Y wrote:
>>> On 4/22/2023 7:57 AM, David Brown wrote:
>>>>>> However, in almost every case where CRC's might be useful, you have additional checks of the sanity of the data, and an all-zero or all-one data block would be rejected. For example, Ethernet packets use CRC for integrity checking, but an attempt to send a packet type 0 from MAC address 00:00:00:00:00:00 to address 00:00:00:00:00:00, of length 0, would be rejected anyway.
>>>>>
>>>>> Why look at "data" -- which may be suspect -- and *then* check its CRC? Run the CRC first. If it fails, decide how you are going to proceed or recover.
>>>>
>>>> That is usually the order, yes. Sometimes you want "fail fast", such as dropping a packet that was not addressed to you (it doesn't matter if it was received correctly but for someone else, or it was addressed to you but the receiver address was corrupted - you are dropping the packet either way). But usually you will run the CRC then look at the data.
>>>>
>>>> But the order doesn't matter - either way, you are still checking for valid data, and if the data is invalid, it does not matter if the CRC only passed by luck or by all zeros.
>>>
>>> You're assuming the CRC is supposed to *vouch* for the data. The CRC can be there simply to vouch for the *transport* of a datagram.
>>
>> I am assuming that the CRC is there to determine the integrity of the data in the face of possible unintentional errors. That's what CRC checks are for. They have nothing to do with the content of the data, or the type of the data package or image.
>
> Exactly. And, a CRC on *a* protocol can use ANY ALGORITHM that the protocol defines. Not some "canned one-size fits all" approach.
It makes sense to use an 8-bit CRC on small telegrams, 16-bit CRC on bigger things, 32-bit CRC on flash images, and 64-bit CRC when you want to use the CRC as an identifying hash (and malicious tampering is non-existent). There can also be benefits of particular choices of CRC for particular use-cases, in terms of detection of certain error patterns for certain lengths of data.

What I don't see any point in is using variations, such as different initial values. I've already said why I think pathological cases such as all zero data are normally irrelevant - but I can accept that there may be occasions when they could happen, and thus a /single/ non-zero initial value would be useful.
>> As an example of the use of CRC's in messaging, look at Ethernet frames:
>>
>> <https://en.wikipedia.org/wiki/Ethernet_frame>
>>
>> The CRC does not care about the content of the data it protects.
>
> AND, if the packet yielded an incorrect CRC, you can assume the data was corrupt... OR, you are looking at a different protocol and MISTAKING it for something that you *think* it might be.
If the CRC does not match, you reject the packet or data. End of story. You don't know or care /why/ - because you cannot be sure of any reason.
> If I produce a stream of data, can you tell me what the checksum for THAT stream *should* be? You have to either be told what it is (and have a way of knowing what the checksum SHOULD be) *or* have to make some assumptions about it.
If you are transmitting some data then both sides need to agree on the CRC algorithm (size, polynomial, initial value, etc.), and on whether a check is "CRC of everything gives 0" or "CRC of everything except the pre-calculated CRC equals the transmitted pre-calculated CRC".
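[A minimal sketch of the two check conventions just described, assuming the illustrative crc32() shown earlier and a CRC appended little-endian; the function names are invented:]

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

uint32_t crc32(const uint8_t *buf, size_t len); /* the sketch above */

/* Style 1: recompute the CRC over the payload only and compare it with
   the value transmitted in the last four bytes. */
static bool check_by_compare(const uint8_t *frame, size_t len)
{
    if (len < 4)
        return false;
    uint32_t sent = (uint32_t)frame[len - 4]
                  | (uint32_t)frame[len - 3] << 8
                  | (uint32_t)frame[len - 2] << 16
                  | (uint32_t)frame[len - 1] << 24;
    return crc32(frame, len - 4) == sent;
}

/* Style 2: run the CRC over payload *and* appended CRC and compare
   against a fixed residue. For the CRC-32 variant above the expected
   residue is 0x2144DF1C; other variants have other fixed residues
   (a "pure" CRC with no initial value or final XOR gives 0). */
static bool check_by_residue(const uint8_t *frame, size_t len)
{
    return len >= 4 && crc32(frame, len) == 0x2144DF1Cu;
}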
> If you have assumed wrong *or* if the data has been corrupt, then the CRC should fail. You don't care why it failed -- because you can't do anything about it. You just know that you can't use the data in the way you THOUGHT it could be used.
Well, yes. Obviously. If you are making incorrect assumptions here, someone is doing a pretty poor job at designing, describing or implementing the communications system. It is just like getting the baud rate wrong on a UART link.
>>> So, use a version-specific CRC on the packet. If it fails, then either the data in the packet has been corrupted (which could just as easily have involved an embedded "interface version" parameter); or the packet was formed with the wrong CRC.
>>>
>>> If the CRC is correct FOR THAT VERSION OF THE PROTOCOL, then why bother looking at a "protocol version" parameter? Would you ALSO want to verify all the rest of the parameters?
>>
>> I'm sorry, I simply cannot see your point. Identifying the version of a protocol, or other protocol type information, is a totally orthogonal task to ensuring the integrity of the data. The concepts should be handled separately.
>
> It is. A packet using protocol XYZ is delivered to port ABC. Port ABC *only* handles protocol XYZ. Anything else arriving there, with a potentially different checksum, is invalid. Even if, for example, byte number 27 happens to have the correct "magic number" for that protocol.
>
> Because the message doesn't obey the rules defined by the protocol FOR THAT PORT. What do I gain by insisting that byte number 27 must be 0x5A that the CRC doesn't already tell me?
A CRC failure doesn't tell you that the telegram type is wrong. It tells you that the data is corrupted. If there can be different protocols, or telegram types, or whatever, then identify them. Stop playing silly buggers with abuse of different concepts that have different roles in the communication system.
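[A sketch of the "identify them" approach: an explicit version/type field in the telegram, checked separately from the CRC. The layout is invented for illustration:]

#include <stdint.h>

struct telegram_header {
    uint8_t  version;  /* protocol version - checked explicitly ...        */
    uint8_t  type;     /* telegram type    - ... not smuggled into the CRC */
    uint16_t length;   /* payload length in bytes */
    /* payload follows; CRC over header + payload is appended at the end */
};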
>>> Salt just ensures that you can differentiate between functionally identical values. I.e., in a CRC, it differentiates between the "0x0000" that CRC-1 generates from the "0x0000" that CRC-2 generates.
>>
>> Can we agree that this is called an "initial value", not "salt"?
>
> It depends on how you implement it. The point is to produce different results for the same polynomial.
It is called an "initial value" - it is not "salt". It doesn't matter if you want to pick different initial values for your CRC, or why you want to do that. You are still not talking about salt. If you insist on using your own terminology, you will be left talking to yourself.
>>> You don't see the parallel to ensuring that *my* use of "Passw0rd" is encoded in a different manner than *your* use of "Passw0rd"?
>>
>> No. They are different things.
>>
>> An important difference is that adding "salt" to a password hash is an important security feature. Picking a different initial value for a CRC instead of having appropriate protocol versioning in the data (or a surrounding envelope) is a misfeature.
>
> And you don't see that verifying that a packet of data received at port ABC that should only see the checksum associated with protocol XYZ as being similarly related?
No. They are different things. Look, I /do/ understand what you are doing, and I appreciate that you think it is a good idea. To me, it is an unpleasant mix of orthogonal concepts that needlessly complicates things. Just because something is /possible/, does not mean it is a good idea.
>>>>> See the RMI description.
>>>>
>>>> I'm sorry, I have no idea what "RMI" is or where it is described. You've mentioned that abbreviation twice, but I can't figure it out.
>>>
>>> <https://en.wikipedia.org/wiki/RMI>
>>> <https://en.wikipedia.org/wiki/OCL>
>>>
>>> Nothing magical with either term.
>>
>> I looked up RMI on Wikipedia before asking, and saw nothing of relevance to CRC's or checksums.
I've snipped the ramblings that have nothing to do with the question I asked. I assume you don't want to answer me.
>> I noticed no mention of "OCL" in your posts, and looking
>
> You need to read more carefully.
I've looked. You did not mention "OCL" anywhere before giving the URL to the wikipedia page. You only mentioned it /afterwards/ - without any context that suggests what you meant. (Here's a hint for you - if you want to refer to a wikipedia page, put a link to the /relevant/ page.) Presumably "RMI" and "OCL" have particular meanings that are relevant for projects you work on, and are so familiar to you that they are part of your language. No one else knows or cares what they are, and they are irrelevant in this thread. So let's leave them there.
>>> Give me the sources for Windows (Linux, *BSD, etc.) and I can subvert all the state-of-the-art digital signing used to ensure binaries aren't altered. Nothing *outside* the box is involved so, by definition, everything I need has to reside *in* the box.
>>
>> No, you can't. The sources for Linux and *BSD /are/ all freely available. The private signing keys used by, for example, Red Hat or Debian, are /not/ freely available. You cannot make changes to a Red Hat or Debian package that will pass the security checks - you are unable to sign the packages.
>
> Sure I can! If you are just signing a package to verify that it hasn't been tampered with BUT THE CONTENTS ARE NOT ENCRYPTED, then all you have to do is remove the signature check -- leaving the signature in the (unchecked) executable.
Woah, you /really/ don't understand this stuff, do you? Here's a clue - ask yourself what is being signed, and what is doing the checking.

Perhaps also ask yourself if /all/ the people involved in security for Linux or BSD - all the companies such as Red Hat, IBM, Intel, etc. - ask if /all/ of them have got it wrong, and only /you/ realise that digital signatures on open source software are useless? /Very/ occasionally, there is a lone genius that understands something while all the other experts are wrong - but in most cases, the loner is the one that is wrong.
Reply by Don Y May 3, 2023
On 4/24/2023 7:37 AM, David Brown wrote:
> On 24/04/2023 09:32, Don Y wrote:
>> On 4/22/2023 7:57 AM, David Brown wrote:
>>>>> However, in almost every case where CRC's might be useful, you have additional checks of the sanity of the data, and an all-zero or all-one data block would be rejected. For example, Ethernet packets use CRC for integrity checking, but an attempt to send a packet type 0 from MAC address 00:00:00:00:00:00 to address 00:00:00:00:00:00, of length 0, would be rejected anyway.
>>>>
>>>> Why look at "data" -- which may be suspect -- and *then* check its CRC? Run the CRC first. If it fails, decide how you are going to proceed or recover.
>>>
>>> That is usually the order, yes. Sometimes you want "fail fast", such as dropping a packet that was not addressed to you (it doesn't matter if it was received correctly but for someone else, or it was addressed to you but the receiver address was corrupted - you are dropping the packet either way). But usually you will run the CRC then look at the data.
>>>
>>> But the order doesn't matter - either way, you are still checking for valid data, and if the data is invalid, it does not matter if the CRC only passed by luck or by all zeros.
>>
>> You're assuming the CRC is supposed to *vouch* for the data. The CRC can be there simply to vouch for the *transport* of a datagram.
>
> I am assuming that the CRC is there to determine the integrity of the data in the face of possible unintentional errors. That's what CRC checks are for. They have nothing to do with the content of the data, or the type of the data package or image.
Exactly. And, a CRC on *a* protocol can use ANY ALGORITHM that the protocol defines. Not some "canned one-size fits all" approach.
> As an example of the use of CRC's in messaging, look at Ethernet frames:
>
> <https://en.wikipedia.org/wiki/Ethernet_frame>
>
> The CRC does not care about the content of the data it protects.
AND, if the packet yielded an incorrect CRC, you can assume the data was corrupt... OR, you are looking at a different protocol and MISTAKING it for something that you *think* it might be.

If I produce a stream of data, can you tell me what the checksum for THAT stream *should* be? You have to either be told what it is (and have a way of knowing what the checksum SHOULD be) *or* have to make some assumptions about it.

If you have assumed wrong *or* if the data has been corrupt, then the CRC should fail. You don't care why it failed -- because you can't do anything about it. You just know that you can't use the data in the way you THOUGHT it could be used.
>> So, use a version-specific CRC on the packet. If it fails, then either the data in the packet has been corrupted (which could just as easily have involved an embedded "interface version" parameter); or the packet was formed with the wrong CRC.
>>
>> If the CRC is correct FOR THAT VERSION OF THE PROTOCOL, then why bother looking at a "protocol version" parameter? Would you ALSO want to verify all the rest of the parameters?
>
> I'm sorry, I simply cannot see your point. Identifying the version of a protocol, or other protocol type information, is a totally orthogonal task to ensuring the integrity of the data. The concepts should be handled separately.
It is. A packet using protocol XYZ is delivered to port ABC. Port ABC *only* handles protocol XYZ. Anything else arriving there, with a potentially different checksum, is invalid. Even if, for example, byte number 27 happens to have the correct "magic number" for that protocol.

Because the message doesn't obey the rules defined by the protocol FOR THAT PORT. What do I gain by insisting that byte number 27 must be 0x5A that the CRC doesn't already tell me?

You are assuming the CRC has to identify the protocol. I didn't say that. All I said was the CRC has to be correct for THAT protocol. You likely don't use the same algorithm to compute the checksum of a boot image as you do to verify the integrity of an Ethernet datagram.

So, if you were presented with a stream of data, you wouldn't arbitrarily decide to try different CRCs to see which yielded correct results and, from that, *infer* the nature of the message. Why would you think I wouldn't expect *a* particular protocol to use a particular CRC?
>>>> What term would you have me use to indicate a "bias" applied to a CRC algorithm?
>>>
>>> Well, first I'd note that any kind of modification to the basic CRC algorithm is pointless from the viewpoint of its use as an integrity check. (There have been, mostly historically, some justifications in terms of implementation efficiency. For example, bit and byte re-ordering could be done to suit hardware bit-wise implementations.)
>>>
>>> Otherwise I'd say you are picking a specific initial value if that is what you are doing, or modifying the final value (inverting it or xor'ing it with a fixed value). There is, AFAIK, no specific term for these - and I don't see any benefit in having one. Misusing the term "salt" from cryptography is certainly not helpful.
>>
>> Salt just ensures that you can differentiate between functionally identical values. I.e., in a CRC, it differentiates between the "0x0000" that CRC-1 generates from the "0x0000" that CRC-2 generates.
>
> Can we agree that this is called an "initial value", not "salt"?
It depends on how you implement it. The point is to produce different results for the same polynomial.
>> You don't see the parallel to ensuring that *my* use of "Passw0rd" is encoded in a different manner than *your* use of "Passw0rd"?
>
> No. They are different things.
>
> An important difference is that adding "salt" to a password hash is an important security feature. Picking a different initial value for a CRC instead of having appropriate protocol versioning in the data (or a surrounding envelope) is a misfeature.
And you don't see that verifying that a packet of data received at port ABC that should only see the checksum associated with protocol XYZ as being similarly related?

Why not just assume the lower level protocols are sufficient to guarantee reliable delivery and, if something arrives at port ABC then, by definition, it must be intact (not corrupt) and, as nothing other than protocol XYZ *should* target that port, why even bother checking magic numbers in a protocol packet?

You build these *superfluous* tests into products to ensure their integrity -- by catching ANYTHING that "can't happen" (yet somehow does).
> The second difference is the purpose of the hashing. The CRC here is for data integrity - spotting mistakes in the data during transfer or storage. The hash in a password is for security, avoiding the password ever being transmitted or stored in plain text.
>
> Any coincidence in the way these might be implemented is just that - coincidence.
>
>>>> See the RMI description.
>>>
>>> I'm sorry, I have no idea what "RMI" is or where it is described. You've mentioned that abbreviation twice, but I can't figure it out.
>>
>> <https://en.wikipedia.org/wiki/RMI>
>> <https://en.wikipedia.org/wiki/OCL>
>>
>> Nothing magical with either term.
>
> I looked up RMI on Wikipedia before asking, and saw nothing of relevance to CRC's or checksums.
How do you think the marshalled arguments get from device A to (remote) device B? And, the result(s) from device B back to device A? Obviously *some* form of communication medium. So, some potential for data to be corrupted (or altered!) in transit. Along with other data streams to compete for those endpoints.

Imagine invoking a function and, between the actual construction of the stack frame and the first line of code in the targeted function, "something" can interfere with the data you're trying to pass (and results you're hoping to eventually receive) as well as the actual function being targeted!

You don't worry about this because the compiler handles all of the machinery AND it relies on the CPU being well-behaved; nothing can sneak in and disturb the address/data busses or alter register contents during this process.

If, OTOH, such a possibility existed (as is the case with RPC/RMI), then you would want the compiler to generate the machinery to ensure the arguments get to the correct function and for the function to be able to ensure that the arguments are actually intended for it.

If any of these things failed to happen, you'd panic() -- because there's nothing you can do, at that point. You certainly can't fix any corrupted values and can't deduce where they were intended to go (given that all of that information can be just as corrupt).

With RPC/RMI, you can at least *know* that the "function linkage" failed to operate as expected ON THIS INVOCATION. Because the RPC/RMI can return a result indicating whether the linkage was intact *and*, if so, the result of the actual function invocation.

If you deliver every packet to a single port, then the process listening to that port has to demultiplex incoming messages to determine the server-side stub to invoke for that message instance. You would likely use a standardized protocol because you don't know anything about the incoming message -- except that it is *supposed* to target a "remote procedure" (*local* to this node).

OTOH, if you target each particular remote function/procedure/method to a function/procedure/method-SPECIFIC port, then how you handle "messages" for one function need have no bearing on how you handle them for others. And, you can exploit this as an added test to ensure the message you are receiving at port JKL actually *appears* to be intended for port JKL and not an accidental misdirect of a message intended for some other port.
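[To make the "version-specific CRC" idea concrete - a sketch only. Don Y describes checking for version-specific *residues*; this models the same idea as version-specific initial values, which is equivalent in spirit. The constants and helper names are invented:]

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise reflected CRC-32, with the initial value as a parameter. */
static uint32_t crc32_with_init(uint32_t init, const uint8_t *p, size_t n)
{
    uint32_t crc = init;
    while (n--) {
        crc ^= *p++;
        for (int k = 0; k < 8; k++)
            crc = (crc & 1) ? 0xEDB88320u ^ (crc >> 1) : crc >> 1;
    }
    return ~crc;
}

#define IFACE_V1_INIT 0x0F0F0F0Fu   /* placeholder per-version constants */
#define IFACE_V2_INIT 0xF0F0F0F0u

/* Server-side stub for version 2 of some interface: a message built by a
   version-1 stub (using IFACE_V1_INIT) fails this check, so a version
   mismatch is caught by the same test that catches corruption. */
static bool accept_v2_message(const uint8_t *msg, size_t len)
{
    if (len < 4)
        return false;
    uint32_t sent = (uint32_t)msg[len - 4]
                  | (uint32_t)msg[len - 3] << 8
                  | (uint32_t)msg[len - 2] << 16
                  | (uint32_t)msg[len - 1] << 24;
    return crc32_with_init(IFACE_V2_INIT, msg, len - 4) == sent;
}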
> I noticed no mention of "OCL" in your posts, and looking
You need to read more carefully.

---8<---8<---
>>>> I can't think of any use-cases where you would be passing around a block of "pure" data that could reasonably take absolutely any value, without any type of "envelope" information, and where you would think a CRC check is appropriate.
>>>
>>> I append a *version specific* CRC to each packet of marshalled data in my RMIs. If the data is corrupted in transit *or* if the wrong version API ends up targeted, the operation will abend because we know the data "isn't right".
>>
>> Using a version-specific CRC sounds silly. Put the version information in the packet.
>
> The packet routed to a particular interface is *supposed* to conform to "version X" of an interface. There are different stubs generated for different versions of EACH interface. The OCL for the interface defines (and is used to check) the form of that interface to that service/mechanism.
>
> The parameters are checked on the client side -- why tie up the transport medium with data that is inappropriate (redundant) to THAT interface? Why tie up the server verifying that data? The stub generator can perform all of those checks automatically and CONSISTENTLY based on the OCL definition of that version of that interface (because developers make mistakes).
>
> So, at the instant you schedule the marshalled data for transmission, you *know* the parameters are "appropriate" and compliant with the constraints of THAT version of THAT interface.
>
> Now, you have to ensure the packet doesn't get corrupted (altered) in transmission. If it remains intact, then there is no need to check the parameters on the server side.
>
> NONE OF THE PARAMETERS... including the (implied) "interface version" field!
>
> Yet, folks make mistakes. So, you want some additional reassurance that this is at least intended for this version of the interface, ESPECIALLY IF THAT CAN BE MADE AVAILABLE FOR ZERO COST (i.e., check to see if the residual is 0xDEADBEEF instead of 0xB16B00B5).
>
> Why burden the packet with a "protocol version" parameter?
---8<---8<---
> it up on Wikipedia gives no clues.
As I said, above:

"If, OTOH, such a possibility existed (as is the case with RPC/RMI), then you would want the compiler to generate the machinery to ensure the arguments get to the correct function and for the function to be able to ensure that the arguments are actually intended for it."

You would want the IDL (Interface Definition Language) compiler to generate stubs (client- and server-side) that enforced the constraints specified in the IDL and OCL.

Again, in a perfect world, you'd not need any of these mechanisms. Data wouldn't be corrupted on the wire. Hostiles wouldn't try to subvert those messages. Developers would always ensure they adhered to the contracts laid out for each API. Etc.

"Yet, folks make mistakes."
> So for now, I'll assume you don't want anyone to know what you meant and I can safely ignore anything you write in connection with the terms.
Perhaps other folks were more careful in their reading (of the quoted passage, above).
>>>> OTOH, "salting" the calculation so that it is expected to yield a value of 0x13 means *those* situations will be flagged as errors (and a different set of situations will sneak by, undetected).
>>>
>>> And that gives you exactly /zero/ benefit.
>>
>> See above.
>
> I did. Zero benefit.
Perhaps your reading was as deficient there as you've admitted it to be elsewhere?
> Actually, it is worse than useless - it makes it harder to identify the protocol, and reduces the information content of the CRC check.
>
>>> You run your hash algorithm, and check for the single value that indicates
>>> no errors. It does not matter if that number is 0, 0x13, or - often more
>> -----------^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>
>> As you've admitted, it doesn't matter. So, why wouldn't I opt to have an algorithm for THIS interface give me a result that is EXPECTED for this protocol? What value picking "0"?
>
> A /single/ result does not matter (other than needlessly complicating things). Having multiple different valid results /does/ matter.
For any CRC calculation instance, you *know* what the result is expected to be.

How many different "check" algorithms do you think are operating in your PC as you type/read (i.e., all of the protocols between devices running in the box, all of the ROMs in those devices, the media accessed by them, etc.)? Has EVERY developer who needed a CRC settled on the "Holy Grail" of CRCs... because it's easiest? Or, have they each chosen schemes that they consider appropriate to their needs?

I compute hashes of individual memory pages during reschedule()s. And, verify that they are intact when next accessed (because they may have been corrupted by a side-channel attack while not actively being accessed -- by the owning task -- despite the protections afforded by the MMU). Should I use the same "check" algorithm that I do when sending a message to another node? Or, that I use on the wire? Should I use the same algorithm when checking 4K pages as I would when checking 16MB pages?

The goal isn't to *correct* errors, so I'd want one that detects the greatest number of errors LIKELY INDUCED BY SUCH AN ATTACK (which can differ from the types of *burst* errors that corrupt packets on the wire or lead to read/write disturb errors in FLASH...)

As I said, up-thread: "... you don't just use CRCs (secure hashes, etc.) on 'code images'"
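[A rough sketch of the page-hashing idea described above; FNV-1a is used purely as a stand-in - the thread does not say which hash is actually used, and the structures are invented:]

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096u

/* 64-bit FNV-1a over a buffer - a placeholder for whatever hash the
   scheduler actually uses. */
static uint64_t fnv1a(const uint8_t *p, size_t n)
{
    uint64_t h = 0xCBF29CE484222325u;        /* FNV offset basis */
    while (n--)
        h = (h ^ *p++) * 0x100000001B3u;     /* FNV prime */
    return h;
}

struct page_record {
    const uint8_t *base;   /* start of the page */
    uint64_t       hash;   /* hash captured at reschedule() */
};

/* Capture a page's hash when its owner is descheduled... */
static void page_seal(struct page_record *r)
{
    r->hash = fnv1a(r->base, PAGE_SIZE);
}

/* ...and confirm the page is unchanged when it is next accessed. */
static bool page_intact(const struct page_record *r)
{
    return fnv1a(r->base, PAGE_SIZE) == r->hash;
}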
>>>>> That is why you need to distinguish between the two possibilities. If you don't have to worry about malicious attacks, a 32-bit CRC takes a dozen lines of C code and a 1 KB table, all running extremely efficiently. If security is an issue, you need digital signatures - an RSA-based signature system is orders of magnitude more effort in both development time and in run time.
>>>>
>>>> It's considerably more expensive AND not fool-proof -- esp if the attacker knows you are signing binaries. "OK, now I need to find WHERE the signature is verified and just patch that "CALL" out of the code".
>>>
>>> I'm not sure if that is a straw-man argument, or just showing your ignorance of the topic. Do you really think security checks are done by the program you are trying to send securely? That would be like trying to have building security where people entering the building look at their own security cards.
>>
>> Do YOU really think we all design applications that run in PCs where some CLOSED OS performs these tests in a manner that can't be subverted?
>
> Do you bother to read my posts at all? Or do you prefer to make up things that you imagine I write, so that you can make nonsensical attacks on them? Certainly there is no sane reading of my posts (written and sent from an /open/ OS) where "do not rely on security by obscurity" could be taken to mean "rely on obscured and closed platforms".
"Do you really think security checks are done by the program you are trying to send securely? That would be like trying to have building security where people entering the building look at their own security cards." Who *else* is involved in the acceptance/verification of a code image in an embedded product? (Not all "run Linux")
>> *WE* (tend to) write ALL the code in the products developed, here. So, whether it's the POST WE wrote that is performing the test or the loader WE wrote, it's still *our* program.
>>
>> Yes, we ARE looking at our own security cards!
>>
>> Manufacturers *try* to hide ("obscurity") details of these mechanisms in an attempt to improve effective security. But, there's nothing that makes these guarantees.
>
> Why are you trying to "persuade" me that manufacturer obscurity is a bad thing? You have been promoting obscurity of algorithms as though it were helpful for security - I have made clear that it is not. Are you getting your own position mixed up with mine?
If the manufacturer saw no benefit to obscurity, then why embrace it?
>> Give me the sources for Windows (Linux, *BSD, etc.) and I can subvert all the state-of-the-art digital signing used to ensure binaries aren't altered. Nothing *outside* the box is involved so, by definition, everything I need has to reside *in* the box.
>
> No, you can't. The sources for Linux and *BSD /are/ all freely available. The private signing keys used by, for example, Red Hat or Debian, are /not/ freely available. You cannot make changes to a Red Hat or Debian package that will pass the security checks - you are unable to sign the packages.
Sure I can! If you are just signing a package to verify that it hasn't been tampered with BUT THE CONTENTS ARE NOT ENCRYPTED, then all you have to do is remove the signature check -- leaving the signature in the (unchecked) executable.

This is different than *encrypting* the package (the OP said nothing about encrypting his executable).
> This is precisely because something /outside/ the box /is/ involved - the private half of the public/private key used for signing. The public half - and all the details of the algorithms - is easily available to let people verify the signature, but the private half is kept secret.
And, if I eliminate the check that verifies the signature, then what value signing? "Yes, I assume the risk of running an allegedly signed executable (THAT MAY HAVE BEEN TAMPERED WITH)."
> (Sorry, but I've skipped and snipped the rest. I simply don't have time to go through it in detail. If others find it useful or interesting, that's great, but there has to be limits somewhere.)
The limits seem to be in your imagination. You believe there's *a* way of doing things instead of a multitude of ways, each with different tradeoffs. And, think you'll always have <whatever> is needed (resources, time, staff, expertise, etc.) to get exactly those things.

The "box" surrounding you limits what you can see. Sad in an engineer. But, must be incredibly comforting!

Bye, David.
Reply by David Brown April 30, 2023
On 29/04/2023 23:03, Ulf Samuelsson wrote:
> On 2023-04-28 15:04, David Brown wrote:
>> On 28/04/2023 10:50, Ulf Samuelsson wrote:
>>> On 2023-04-28 09:38, David Brown wrote:
>>>> Or for my preferences, the CRC "DIGEST" would be put at the end of the image, rather than near the start. Then the "from, to" range would cover the entire image except for the final CRC. But I'd have a similar directive for the length of the image at a specific area near the start.
>>>
>>> I really do not see a benefit of splitting the meta information about the image to two separate locations.
>>>
>>> The bootloader uses the struct for all checks. It is a much simpler implementation once the tools support it.
>>>
>>> You might find it easier to write a tool which adds the CRC at the end, but that is a different issue.
>>>
>>> Occam's Razor!
>>
>> There are different needs for different projects - and more than one way to handle them. I find adding a CRC at the end of the image works best for me, but I have no problem appreciating that other people have different solutions.
>
> I'd be curious to know WHY it works best for you.
> /Ulf
I regularly do not have a bootloader - I am not free to put a CRC at the start of the image. And if the bootloader itself needs to be updatable, it is again impossible to have the CRC (or any other metadata) at the start of the image. I want most of the metadata to be at a fixed location as close to the start as reasonably practical (such as after the vector table, or other microcontroller-specific information that might be used for flash security, early chip setup, etc.). If I am to have one single checksum for the image, which is what I prefer, then it has to be at the end of the image. For example, there might be:

0x00000000 : vectors
0x00000400 : external flash configuration block
0x00000600 : program info metadata
0x00001000 : main program
           : CRC

There is no way to have the metadata or CRC at the start of the image, so the CRC goes at the end.

It would be possible to have two CRCs - one that covers the vectors, configuration information, and metadata and is placed second last in the metadata block. A second CRC placed last in the metadata block would cover the main program - everything after the CRCs. That would let me have a single metadata block and no CRC at the end of the image. However, it would mean splitting the check in two, rather than one check for the whole image. I don't see that as a benefit.

When making images that are started from a bootloader, I certainly /could/ put the CRC at the start. But I see no particular reason to do so - it makes a lot more sense to keep a similar format.

(Bootloaders don't often have to check their own CRC - after all, even if the CRC fails there is usually little you can do about it, except charge on and hope for the best. But if the bootloader is updatable in system, then you want a CRC during the download procedure to check that you have got a good download copy before updating the flash.)
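[A sketch of how an image laid out this way might be checked, with invented field names - the post specifies only the offsets and that the length lives in the metadata:]

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define METADATA_OFFSET 0x600u   /* "program info metadata" in the map above */

struct program_info {
    uint32_t image_length;   /* total image size, including trailing CRC */
    uint32_t version;
    /* ... other metadata ... */
};

uint32_t crc32(const uint8_t *buf, size_t len);   /* e.g. the sketch earlier */

static bool image_ok(const uint8_t *image)
{
    /* One extra pointer access to find the CRC: read the length from the
       fixed metadata location near the start (assumes suitable alignment). */
    const struct program_info *info =
        (const struct program_info *)(image + METADATA_OFFSET);
    uint32_t len = info->image_length;

    if (len <= METADATA_OFFSET + sizeof *info)
        return false;                  /* implausible header */

    /* The CRC covers everything except the trailing CRC word itself. */
    uint32_t sent = (uint32_t)image[len - 4]
                  | (uint32_t)image[len - 3] << 8
                  | (uint32_t)image[len - 2] << 16
                  | (uint32_t)image[len - 1] << 24;
    return crc32(image, len - 4) == sent;
}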