Two code versions in Flash at the same time?

Started by Trevor Wigle September 18, 2003
Hello helpful people,

I have an interesting sort of goal/proposition:
I am working on a product where we would like to allow the customer to perform
firmware updates (i.e., re-Flash the micro). (See previous thread on Flash
programming via CAN.)

We would like to be able to program the new code revision to a separate
(unused) part of Flash from the old (currently executing) code, in order to
minimize the number of our units that fail and must get returned. Only when
the new code has been verified to be intact, would we then execute the new
code.

The main purpose of this is to avoid executing the new code revision until
it is guaranteed to be intact. And, if there is a failure and the customer
unplugs it and plugs it in again, it wouldn't be running the new code, but
at least he would be running *something* which is much preferable to running
nothing (which would require the unit be returned).

Does anyone have any experience with anything like this? Does this even
sound feasible?

Some background:
I am using a 912DG128A and the program code is 24k. (There are additionally
a few hundred bytes of EEPROM and 2k of constants.) I am using the Cosmic
compiler.

I realize that the DG128A has paged memory which could allow for the
switching of old and new code revisions into the paged window, but since the
code is > 16k, it seems that a more elaborate solution is needed. The
main stumbling block as I see it is that the code is compiled/linked/etc.
using absolute addressing, whereas it seems like relative addressing
would be needed. (I.e., the machine code produced to, say, jump to a
subroutine contains the absolute address of that subroutine.)

Thanks!
- Andrew



In order to accomplish this effectively, I have found that you need to
always leave a small startup kernel in the flash. The kernel is pointed
to by the reset and other exception vectors. It decides which set of
code to execute and where to forward the exception vectors, via tables
located in your two sets of code. On startup, the kernel looks at a
specific location in each set of code to see if it is valid and
executes the one which is valid. If both are valid, you can use a
version number in each set of code to determine which to execute.
(Usually the oldest; you will see why below.)

After you have programmed in a new set of code with its validation byte
set to true ($ff) and verified that it is correct, you can
invalidate the old code by overwriting the validation byte in the old set
of code with $00. If something happens while a new set of code is being
loaded, then either the new validation byte is $ff or it is not. If it is,
and the old validation byte is still $ff, then you revert to the old code,
since it is still valid. If the new one is not valid, then you
are still using the old code.
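
A minimal sketch of the selection step described above, in C. The header
layout, the names image_header_t and select_image, and the 16-bit version
field are assumptions made for illustration; only the $ff/$00 validation
byte and the "prefer the oldest when both are valid" rule come from the
description. Any value other than $ff is treated as invalid here, which is
also an assumption.

```c
#include <stdint.h>

#define IMAGE_VALID   0xFFu  /* validation byte meaning "valid" ($ff) */
#define IMAGE_INVALID 0x00u  /* value written to retire an image ($00) */

/* Hypothetical header stored at a fixed location in each set of code. */
typedef struct {
    uint8_t  valid;    /* IMAGE_VALID when the image may be executed */
    uint16_t version;  /* assumed: a larger number means a newer build */
} image_header_t;

/* Returns 0 or 1 for the image to run, or -1 if neither is valid. */
int select_image(const image_header_t *a, const image_header_t *b)
{
    int a_ok = (a->valid == IMAGE_VALID);
    int b_ok = (b->valid == IMAGE_VALID);

    if (a_ok && b_ok) {
        /* Both still marked valid: the update stopped before the old
         * image was invalidated, so fall back to the older one. */
        return (a->version <= b->version) ? 0 : 1;
    }
    if (a_ok) {
        return 0;
    }
    if (b_ok) {
        return 1;
    }
    return -1;  /* nothing runnable: stay in the kernel */
}
```

On the real part the kernel would then forward the exception vectors to the
table of the selected image; that step is hardware-specific and is left out
of this sketch.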

Regards
Dave Perreault



We load new firmware over an infrared link, so there is a high
probability of corruption. We do it by having control code in the high
(protected) page, then dividing the remaining memory in half. The top half
is active; the low half stores new firmware until we have done a
CRC check on it. If it's OK, the low half overwrites the top half.
This overwriting process takes about 2 seconds on the '256 at 16MHz
bus speed. The control code needs to execute from RAM and so has to be
relocatable - we modified Gordon's bootloader.
Works rather well.
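
A rough sketch of the "check the staged copy, then overwrite the active
copy" step in C. The thread does not say which CRC is used, so
CRC-16/CCITT-FALSE is assumed here, and install_if_valid is a hypothetical
name. Plain RAM buffers and memcpy stand in for the flash erase/program
routines, which on the real part would have to run from RAM as noted above.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* CRC-16/CCITT-FALSE: polynomial 0x1021, initial value 0xFFFF.
 * Assumed for illustration; the CRC actually used is not stated. */
uint16_t crc16_ccitt(const uint8_t *data, size_t len)
{
    uint16_t crc = 0xFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)((uint16_t)data[i] << 8);
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 0x8000u) {
                crc = (uint16_t)((crc << 1) ^ 0x1021u);
            } else {
                crc = (uint16_t)(crc << 1);
            }
        }
    }
    return crc;
}

/* Copies the staged image over the active one only if its CRC matches.
 * Returns 1 if the image was installed, 0 if it was rejected. */
int install_if_valid(uint8_t *active, const uint8_t *staging,
                     size_t len, uint16_t expected_crc)
{
    if (crc16_ccitt(staging, len) != expected_crc) {
        return 0;  /* corrupted transfer: active firmware is untouched */
    }
    memcpy(active, staging, len);  /* stand-in for erase + program */
    return 1;
}
```

The point of the ordering is that the active half is never modified until
the staged half has passed its check, so a corrupted transfer costs only a
retry.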
bruce.





At 06:00 PM 9/18/2003 -0700, you wrote:

>I am working on product where we would like to allow the customer to perform
>firmware updates (i.e., re-Flash the micro).
>
>The main purpose of this is to avoid executing the new code revision until
>it is guaranteed to be intact. And, if there is a failure and the customer
>unplugs it and plugs it in again, it wouldn't be running the new code,
>
>Does anyone have any experience with anything like this? Does this even
>sound feasible?

Sure... I've got the same thing in a product built around the D60A. Not
hard (except that the D60A has a few quirks that I wish I could get around,
plus the P&E BDM software goes bananas if you try to map some things like
you would like to).

We put the boot loader in protected memory. The startup vector always
points to the boot code. Normal start: The boot loader makes a series of
background checks for incoming "sync requests" coming over the serial or
USB port, while doing a CRC on the existing application code. If the
application code checks good AND sync hasn't been established with a
validated update program, then control is transferred to the application.

If sync HAS been established with a validated update program, then the boot
program sits there waiting for commands (like "replace the application
program").

If the application program fails its integrity check, then the boot
program sits there looking to establish sync with the update
program. [This is also how we get the application code in the first time.]
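
The three cases above can be summarized as a small decision function. This
is only a sketch in C: the names boot_action_t and boot_decide are made up
for illustration, and the two inputs stand for the results of the CRC pass
and the sync polling, which the real boot loader interleaves.

```c
#include <stdbool.h>

typedef enum {
    BOOT_RUN_APPLICATION,    /* transfer control to the application */
    BOOT_WAIT_FOR_COMMANDS,  /* synced with the update program */
    BOOT_WAIT_FOR_SYNC       /* application bad: keep looking for sync */
} boot_action_t;

/* app_crc_ok:       the application code passed its CRC check.
 * sync_established: a validated update program has synced over the
 *                   serial or USB port. */
boot_action_t boot_decide(bool app_crc_ok, bool sync_established)
{
    if (sync_established) {
        /* An update program is connected: wait for its commands,
         * whether or not the current application is intact. */
        return BOOT_WAIT_FOR_COMMANDS;
    }
    if (app_crc_ok) {
        return BOOT_RUN_APPLICATION;
    }
    return BOOT_WAIT_FOR_SYNC;
}
```

A blank part falls into the last case, which matches the remark that this
is also how the application gets loaded the first time.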

There are a lot more little "details" of course. Especially since our
equipment is battery powered and we have to ensure that things won't wait
forever. BTW, the update program (running on a PC) automatically updates
the appropriate CRC values and uploads them invisibly at the end of the
code/data upload.

One issue... in our case we erase the old application and reload the new on
top of it. A better scheme would be to load the new one into unused memory
and then transfer it - that way if the new one fails to load you can still
try to run the old one. We didn't have enough memory on the D60A.

We've been running this scheme for about six months in the field and have
yet to have a unit come back. Worst case (someone trips over the
cable, or the batteries die during the transfer, etc.) they just have to
start the upload over again.

jmk
-----------
James M. Knox
TriSoft ph 512-385-0316
1109-A Shady Lane fax 512-366-4331
Austin, Tx 78721
-----------