
Patch fixed strings in .hex file

Started by pozz January 16, 2024
On 16/01/2024 16:36, David Brown wrote:
> On 16/01/2024 15:42, pozz wrote:
>> On 16/01/2024 13:51, David Brown wrote:
>>> On 16/01/2024 13:19, pozz wrote:
>>>> In one project I have many quasi-fixed strings that I'd like to keep
>>>> in non volatile memory (Flash) to avoid losing precious RAM space.
>>>>
>>>> static const char s1[] = "/my/very/long/string/of/01020304";
>>>> static const char s2[] = "/another/string/01020304";
>>>> ...
>>>>
>>>> Substring "01020304" is a serial number that changes during
>>>> production with specific device. It has the same length in bytes
>>>> (it's a simple hex representation of a 32-bits integer).
>>>>
>>>> Of course it's too difficult and slow to rebuild the firmware during
>>>> production passing to the compiler the real serial number. I think a
>>>> better solution is to patch the .hex file generated by the compiler.
>>>>
>>>> I'm wondering how to detect the exact positions (addresses) of
>>>> serial numbers to fix.
>>>>
>>>> The build system is gcc, so I could search for s1 in the elf file.
>>>> Do you know of a tool that returns the address of a symbol in the
>>>> elf or map file?
>>>>
>>>> Could you suggest a better approach?
>>>>
>>>
>>> In the source code, put the serial number in as "PQRXYZ" or some
>>> other distinct string of characters.  Generate bin files, not hex (or
>>> convert with objcopy).  Then do a simple search for the special
>>> string to find its position and replace it with the serial number
>>> using a simple Python script or your other favourite tool (awk, sed,
>>> perl, whatever).
>>
>> I thought about this approach, but is it so difficult to have the same
>> exact sequence of bytes somewhere else in the output?
>
> Try it and see.
>
>>
>>> Oh, and in the source code, don't forget to make the string "volatile".
>>
>> Why?
>>
>
> If you have :
>
>     static const char s1[] = "PQRXYZ";
>
> and your code later does, say :
>
>     const int last_digit = s1[5] - '0';
>
> the compiler will optimise it to :
>
>     const int last_digit = '*';
>
> i.e., it will calculate 'Z' - '0' at compile time - and if I remember my
> ASCII codes correctly, that matches '*'.
>
> You will be messing with the string behind the compiler's back.  Make it
> volatile.  "volatile const" might be unusual, but it is useful in
> exactly this kind of circumstance.
Oh yes, I got the point now.
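
The placeholder search-and-replace suggested in the quoted post could look roughly like the sketch below. It assumes the image has already been converted to a raw .bin file (for example with objcopy). The 8-character placeholder `PQRSWXYZ` and the function name `stamp_serial` are made up for illustration and do not come from the thread. The function refuses to patch unless the placeholder occurs exactly the expected number of times, which guards against an accidental match elsewhere in the image.

```python
def stamp_serial(image: bytes, placeholder: bytes, serial: bytes,
                 expected_count: int) -> bytes:
    """Return a copy of image with every placeholder replaced by serial."""
    # The replacement must not shift any other data in the image.
    if len(placeholder) != len(serial):
        raise ValueError("serial must have the same length as the placeholder")
    # Count the (non-overlapping) occurrences before touching anything.
    found = image.count(placeholder)
    if found != expected_count:
        raise ValueError(
            f"expected {expected_count} placeholder(s), found {found}")
    return image.replace(placeholder, serial)


# Stand-in for the bytes read from a firmware .bin file (hypothetical data).
image = (b"\x00\x01/my/very/long/string/of/PQRSWXYZ\x00"
         b"/another/string/PQRSWXYZ\x00\xff")
patched = stamp_serial(image, b"PQRSWXYZ", b"01020304", expected_count=2)
print(patched)
```

With a real file, the bytes would come from `open("firmware.bin", "rb").read()` and the result would be written to a per-device copy; the file name here is only an example.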
On 16/01/2024 16:24, Grant Edwards wrote:
> On 2024-01-16, pozz <pozzugno@gmail.com> wrote:
>
>> I'm wondering how to detect the exact positions (addresses) of serial
>> numbers to fix.
>
> Assuming there's a symbol associated with the address, the link map
> will tell you what the address is.
The map file is easy for a human to read, but I think it's better to use a tool (readelf or objdump) that reads the ELF file directly, although so far I haven't managed to put together a command line for this task.
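
One possible way to get a symbol's address from the ELF file is to run binutils `nm` on it and pick the address out of its default output, where each defined symbol appears as "address, type letter, name". The sketch below only parses such text; the sample output, the addresses and the helper name `symbol_address` are invented for illustration. Note that the result is the link address, so for a .bin file the image base address would still have to be subtracted to get a file offset.

```python
def symbol_address(nm_output: str, symbol: str) -> int:
    """Return the address of symbol from default-format `nm` output."""
    for line in nm_output.splitlines():
        fields = line.split()
        # Defined symbols have three fields: hex address, type letter, name.
        # Undefined symbols have no address and are skipped.
        if len(fields) == 3 and fields[2] == symbol:
            return int(fields[0], 16)
    raise KeyError(f"symbol {symbol!r} not found")


# Hypothetical output of running nm on the firmware ELF file.
sample = """\
08000000 T _start
0800a1c0 r s1
0800a1e1 r s2
         U some_undefined_symbol
"""

print(hex(symbol_address(sample, "s1")))
print(hex(symbol_address(sample, "s2")))
```

In a build script the text could come from something like `subprocess.run(["nm", "firmware.elf"], capture_output=True, text=True).stdout`, with the cross-toolchain's prefixed `nm` substituted as needed.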
On 16.01.2024 at 13:19, pozz wrote:
> The build system is gcc, so I could search for s1 in the elf file. Do
> you know of a tool that returns the address of a symbol in the elf or
> map file?
Last time I needed that, I hacked it up myself; at least back in 32-bit times, ELF was not that hard (but I had to do that anyway to convert ELF into something the controller could boot).
> Could you suggest a better approach?
Define your memory allocations explicitly. Instead of building a binary and hacking the strings, place the strings at a fixed address and regenerate the ELF or .hex file containing them from scratch. Whether you then give the fixed addresses a name using linker magic, or just cast pointers, is a matter of taste.

Stefan
On 16/01/2024 16:47, dalai lamah wrote:
> One fine day pozz typed:
>
>>> In the source code, put the serial number in as "PQRXYZ" or some other
>>> distinct string of characters. Generate bin files, not hex (or convert
>>> with objcopy). Then do a simple search for the special string to find
>>> its position and replace it with the serial number using a simple Python
>>> script or your other favourite tool (awk, sed, perl, whatever).
>>
>> I thought about this approach, but is it so difficult to have the same
>> exact sequence of bytes somewhere else in the output?
>
> Extremely unlikely, especially since you use text strings and therefore you
> actually use 64 bits (eight ASCII characters) to represent a 32 bit number.
> Besides, you don't need to use an ASCII string as the placeholder, you can
> use any 64 bit number.
>
> If for example your binary file is 1 MB, there is one chance over 2.2
> trillion to have the same number duplicated somewhere else.
>
>>> Oh, and in the source code, don't forget to make the string "volatile".
>>
>> Why?
>
> To avoid that the compiler will optimize the code and "obfuscate" your
> string. I don't think it is very likely, but it is not impossible,
> especially if you use a very aggressive optimization level.
>
Actually, this sort of thing really does happen in practice. In one of my current projects, I have some data that is filled in by post-processing the binary file, and I had to use volatile accesses to read the data or the compiler would optimise based on its knowledge of the contents it saw at compile time. This is not just theoretical. (To be fair, it is a bit more likely if - like in my case - the source file uses null characters rather than a pseudo-random string of characters.)
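
As a rough check of the collision odds quoted above, the arithmetic can be sketched as follows. This assumes the rest of the image behaves like uniformly random data, which real firmware does not, so it is only an order-of-magnitude estimate. The function name `one_in` is made up for illustration. Counting only byte-aligned positions gives about 1 in 17.6 trillion for a 1 MB image; counting every bit offset gives about 1 in 2.2 trillion, which appears to be how the quoted figure was obtained.

```python
def one_in(image_bytes: int, placeholder_bits: int = 64,
           offsets_per_byte: int = 1) -> int:
    """Approximate N such that an accidental match has odds of 1 in N."""
    # Each candidate position matches a fixed pattern with probability
    # 2**-placeholder_bits, so the odds scale with the number of positions.
    positions = image_bytes * offsets_per_byte
    return 2 ** placeholder_bits // positions


ONE_MB = 1024 * 1024
byte_aligned = one_in(ONE_MB)                     # match at byte boundaries
bit_aligned = one_in(ONE_MB, offsets_per_byte=8)  # match at any bit offset

print(f"byte-aligned: 1 in {byte_aligned:.3e}")
print(f"bit-aligned:  1 in {bit_aligned:.3e}")
```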
On 2024-01-16, David Brown <david.brown@hesbynett.no> wrote:
> On 16/01/2024 16:24, Grant Edwards wrote:
>> On 2024-01-16, pozz <pozzugno@gmail.com> wrote:
>>
>>> I'm wondering how to detect the exact positions (addresses) of serial
>>> numbers to fix.
>>
>> Assuming there's a symbol associated with the address, the link map
>> will tell you what the address is.
>>
>
> Making the symbol extern linkage (remove the "static") would help with that!
IIRC, if you're using gcc/binutils, there are ways to get even static symbols to show up in the link map (e.g. -fdata-sections), but making the symbol global is simplest.

--
Grant
On 16.01.2024 at 13:19, pozz wrote:

> I'm wondering how to detect the exact positions (addresses) of serial
> numbers to fix.
You do not.

Instead, you set up linker scripts, linker options and/or add __attribute__(()) to the variables' definitions to _place_ them at a predetermined, fixed, known-useful location.

And do yourself one favour: have only _one_ instance of that number in your code. Use concatenation or similar to output it where needed.

Then you can use tools like srecord or GNU binutils to stamp your desired number into that fixed location in the hex file. Professional-grade chip flashing tools for production environments can usually do that by themselves, so you don't even have to edit your "official" files.

Details will obviously vary by tool chain.
On 2024-01-16, Stefan Reuther <stefan.news@arcor.de> wrote:
> On 16.01.2024 at 13:19, pozz wrote:
>> The build system is gcc, so I could search for s1 in the elf file. Do
>> you know of a tool that returns the address of a symbol in the elf or
>> map file?
>
> Last time I needed that, I hacked it up myself; at least back in 32-bit
> times, ELF was not that hard (but I had to do that anyway to convert ELF
> into something the controller could boot).
I think scanelf from pax-utils will do it. https://github.com/gentoo/pax-utils
On 2024-01-16, Stefan Reuther <stefan.news@arcor.de> wrote:
> On 16.01.2024 at 13:19, pozz wrote:
>> The build system is gcc, so I could search for s1 in the elf file. Do
>> you know of a tool that returns the address of a symbol in the elf or
>> map file?
>
> Last time I needed that, I hacked it up myself; at least back in 32-bit
> times, ELF was not that hard (but I had to do that anyway to convert ELF
> into something the controller could boot).
libelf should help.

The requirements sound similar to "we need to patch the checksum in the vector table so that an LPC MCU will boot":

https://github.com/imi415/lpchecksum

It should be easy to modify that to patch serial numbers.
> Define your memory allocations explicitly. Instead of building a binary
> and hacking the strings, place the strings at a fixed address and
> regenerate the ELF or .hex file containing them from scratch. Whether
> you then give the fixed addresses a name using linker magic, or just
> cast pointers, is a matter of taste.
Yes. Placing the string in a special section via the linker script will make it easier for the patch tool to locate the string.

cu
Michael
--
Some people have no respect of age unless it is bottled.
On 16/01/2024 19:35, Hans-Bernhard Bröker wrote:
> On 16.01.2024 at 13:19, pozz wrote:
>
>> I'm wondering how to detect the exact positions (addresses) of serial
>> numbers to fix.
>
> You do not.
>
> Instead, you set up linker scripts, linker options and/or add
> __attribute__(()) to the variables' definitions to _place_ them at a
> predetermined, fixed, known-useful location.
Do you mean choosing the exact address of *each* string yourself? And where would you put them: at the beginning, in the middle or at the end of the Flash? You need to calculate the address of the next string from the address *and length* of the previous string. It seems to me a tedious and error-prone job that could be done easily by the linker.
> And do yourself one favour: have only _one_ instance of that number in
> your code. Use concatenation or similar to output it where needed.
>
> Then you can use tools like srecord or GNU binutils to stamp your desired
> number into that fixed location in the hex file. Professional-grade
> chip flashing tools for production environments can usually do that by
> themselves, so you don't even have to edit your "official" files.
>
> Details will obviously vary by tool chain.
>
Patching the .hex or .bin file by replacing 8 bytes starting from a known address is simple. I would write a Python script or use one of the srecord[1] tools.

[1] https://srecord.sourceforge.net/
On 17/01/2024 08:45, pozz wrote:
> On 16/01/2024 19:35, Hans-Bernhard Bröker wrote:
>> On 16.01.2024 at 13:19, pozz wrote:
>>
>>> I'm wondering how to detect the exact positions (addresses) of serial
>>> numbers to fix.
>>
>> You do not.
>>
>> Instead, you set up linker scripts, linker options and/or add
>> __attribute__(()) to the variables' definitions to _place_ them at a
>> predetermined, fixed, known-useful location.
>
> Do you mean to choose by yourself the exact address of *each* string?
> And where would you put them, at the beginning, in the middle or at the
> end of the Flash? You need to calculate the address of the next string
> from the address *and length* of the previous string. It seems to me a
> tedious and error-prone job that could be done easily by the linker.
>
>
>> And do yourself one favour: have only _one_ instance of that number in
>> your code. Use concatenation or similar to output it where needed.
>>
>> Then you can use tools like srecord or GNU binutils to stamp your desired
>> number into that fixed location in the hex file. Professional-grade
>> chip flashing tools for production environments can usually do that by
>> themselves, so you don't even have to edit your "official" files.
>>
>> Details will obviously vary by tool chain.
>>
>
> Patching the .hex or .bin file replacing 8 bytes starting from a known
> address is simple. I would write a Python script or would use one of
> srecord[1] tools.
>
> [1] https://srecord.sourceforge.net/
>
The command to patch 8 bytes in the address range 0x800-0x808 with the string "01020304" would be:

    srec_cat original.hex -I -E 0x800 0x808 -GEN 0x0800 0x0808 -REP_S "01020304" -O patched.hex -I

-I selects the Intel hex format (for both input and output)
-E excludes the bytes to patch from the original hex
-GEN generates new bytes in a given range
-REP_S is the constant string to repeat in the range

In my case I don't really need to repeat the string in the range, because the length of the string is exactly the length of the address range.
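
For comparison, the Python-script alternative mentioned earlier in the thread might look like the sketch below, which patches bytes at a known address directly in Intel HEX text and recomputes the record checksums. It is a minimal sketch under stated assumptions: the function name `patch_ihex` and the sample records are invented, only data (00), end-of-file (01) and extended linear address (04) records are interpreted, and extended segment address (02) records are rejected rather than handled.

```python
def patch_ihex(lines, address, new_bytes):
    """Return Intel HEX lines with new_bytes written at absolute address."""
    base = 0  # upper 16 address bits, set by type-04 records
    pending = {address + i: b for i, b in enumerate(new_bytes)}
    out = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if not line.startswith(":"):
            raise ValueError(f"not an Intel HEX record: {line!r}")
        raw = bytearray.fromhex(line[1:])
        # A valid record sums to zero modulo 256, checksum included.
        if sum(raw) & 0xFF:
            raise ValueError(f"bad checksum in record: {line!r}")
        count, offset, rectype = raw[0], (raw[1] << 8) | raw[2], raw[3]
        if rectype == 0x02:
            raise ValueError("extended segment addresses are not supported")
        if rectype == 0x04:
            base = ((raw[4] << 8) | raw[5]) << 16
        elif rectype == 0x00:
            for i in range(count):
                addr = base + offset + i
                if addr in pending:
                    raw[4 + i] = pending.pop(addr)
            # Checksum is the two's complement of the sum of the other bytes.
            raw[-1] = (-sum(raw[:-1])) & 0xFF
        out.append(":" + raw.hex().upper())
    if pending:
        raise ValueError("some target addresses are missing from the file")
    return out


# Hypothetical two-record file: 8 placeholder bytes ("XXXXXXXX") at 0x0800.
original = [":08080000585858585858585830", ":00000001FF"]
for record in patch_ihex(original, 0x0800, b"01020304"):
    print(record)
```

Unlike the srec_cat command, this sketch only overwrites bytes that already exist in the file and raises an error if the target range is not fully present.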