Flash sector erase on S12DP256B

Started by Adrian Vos October 25, 2006
Hi all,

I am using an S12DP256B in an automotive engine management product. I am
using it in small memory model mode, but have it configured so that the main
firmware is in 0x4000-0x7FFF and 0xC000-0xEFFF. I have a bootloader resident
in 0xF000-0xFFFF along with the vectors. The unit resets to the bootloader
which checks if main firmware is present and jumps to it if it is present.
If it is not present, it enters a serial protocol which can upload main
firmware into the unit. The unit will also stay in the bootloader if it ever
resets from the watchdog timer.

We have almost 1000 of these units in use on cars, and recently I have had 2
units back that entered boot mode (the software application that is used
with this product detects boot mode). Before loading firmware into the units
(which went without problem), I checked to see what had gone wrong by
reading the memory contents via the BDM. The first sector of the flash
(0x4000-0x41FF) was blank. The rest of the firmware was fine. I suspect that
the unit enters boot mode when it attempts to jump to a routine in this
area, and then gets lost, eventually ending in a watchdog timeout. Actually,
just looking at the memory map, I have some interrupt vector function tables
that are in the erased area, meaning that on boot up, some interrupt vector
will be set to jump to 0xFFFF for the ISR.

Anyway, I doubt that my code is erasing this sector. The cars ran fine
until they went to restart the engine at one point and it never started. It
seems that this sector was erased on either the powering down of the unit or
the powering up of the unit. This leads me to think it may have something to
do with the reset chip. I am using the LP3470M5-4.63V reset chip with
the reset time set to 10ms. I previously had a different reset chip with a
200ms reset time, but this caused me problems in some applications, as the
system works with other products, and the other products were getting errors
due to the slow bootup time I was running.

I am thinking I should increase the reset delay, but I do not want to
increase it so much that I get my other problems back.

Can anyone provide any advice on what may have caused my problem, and how I
might remedy it? Particularly if it may be the short reset time. Apparently
the reset should be held at any time below 4.63V input voltage, and for 10ms
after it gets above 4.63V. What is the shortest reset delay that should
still provide high reliability of the flash memory? I am not looking for a
number that is much longer than it needs to be to be safe. I am looking for
the shortest delay that people have found to provide reliable operation.

Thanks,

Adrian

Adrian Vos wrote:

> Hi all,
>
> I am using an S12DP256B in an automotive engine management product. I am
> using it in small memory model mode, but have it configured so that the
> main
> firmware is in 0x4000-0x7FFF and 0xC000-0xEFFF. I have a bootloader
> resident
> in 0xF000-0xFFFF along with the vectors. The unit resets to the bootloader
> which checks if main firmware is present and jumps to it if it is present.

OK

> If it is not present, it enters a serial protocol which can upload main

OK

> firmware into the unit. The unit will also stay in the bootloaded if it
> ever
> resets from the watchdog timer.

Hm, that is like telling the user that the software is broken (the watchdog
timer expired for some reason) and that the unit is being recalled. Good
policy for a safety-critical application. If it is not so safety critical,
it can be annoying; maybe give me a chance to drive to the closest workshop?
Anyway, how is it done? Does a common bootloader/application COP handler
erase the flash sector at $4000? The COP timer could expire, for example,
because of continuous noise on the SCI RX line: an SCI ISR that is called
far more often than normal could significantly slow down the main thread or
other tasks...
>
> We have almost 1000 of these units in use on cars, and recently I have had
> 2
> units back that entered boot mode (the software application that you used
> with this product detects boot mode). Before loading firmware into the
> units
> (which went without problem), I checked to see what had gone wrong by
> reading hte memory contents via the BDM. The first sector of the flash
> (0x4000-0x41FF) was blank. The rest of the firmware was fine. I suspect
> that
> the unit enters boot mode when it attempts to jump to a routine in this
> area, and then gets lost eventuating in a watchdog timeout. Actually just
> looking at the memory map, I have some interupt vector function tables
> that
> are in the erased area, meaning that on boot up, some interrupt vector
> will
> be set to jump to 0xFFFF for the ISR.
>
> Anyway, I doubt that my code is erasing this sector. The cars that ran
> fine

How is it arranged that a single COP reset event makes the bootloader
permanently active (even after a PON reset)? Are you using some EEPROM cell
for this, or maybe the bootloader erases the sector at $4000?
> until they went to restart the engine at one point and it never started.
> It
> seems that this sector was erased on either the powering down of the unit
> or
> the powering up of the unit. This leads me to think it may have something
> to
> do with the reset chip. I am using the the LP3470M5-4.63V reset chip with
> the reset time set to 10ms. I previously had a different reset chip with a
> 200ms reset time, but this caused me problems in some applications, as the
> system works with other products, and the other products were getting
> errors
> due to the slow bootup time I was running.

S12 flash has some protections against power up/down failures.

1) Neither the bootloader nor your code should initialise FCLKDIV unless it
is necessary (that is, unless you are about to write or erase flash). Flash
operations should fail if FCLKDIV has not been written.
2) FPROT register and FPROT byte. The FPROT byte at $FF0D is probably set to
unprotect all flash and protect just the bootloader area. The FPROT register
can only be written towards the protected state, so in the first lines of
your code you should "clear" FPROT to write-protect all flash.
3) If the flash status register indicates any errors, these errors must be
cleared before flash can be programmed or erased. Maybe deliberately provoke
a flash error and leave it uncleared in your code?
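
The three points above can be sketched on the host as follows. This is only
a rough model, not the actual register interface: `flash_regs_t` is a
hypothetical stand-in for the memory-mapped FCLKDIV, FPROT and FSTAT
registers, and the bit masks (including the assumption that clearing FPOPEN
protects the whole array) need to be checked against the flash documentation
for the exact part. On real hardware FPROT can only be moved towards the
protected state, which this model does not enforce.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the memory-mapped flash registers. */
typedef struct {
    volatile uint8_t fclkdiv;  /* flash clock divider */
    volatile uint8_t fprot;    /* flash protection */
    volatile uint8_t fstat;    /* flash status */
} flash_regs_t;

/* Assumed bit positions; verify against the device documentation. */
#define FDIVLD_MASK 0x80u  /* set once FCLKDIV has been written */
#define FPOPEN_MASK 0x80u  /* flash open for program/erase when set */
#define PVIOL_MASK  0x20u  /* protection violation flag */
#define ACCERR_MASK 0x10u  /* access error flag */

/* Point 2: called in the first lines of the application to close the array. */
static void flash_lock(flash_regs_t *regs)
{
    regs->fprot = (uint8_t)(regs->fprot & (uint8_t)~FPOPEN_MASK);
}

/* Points 1 to 3: an erase can only be accepted if the clock divider has
   been loaded, the array is open, and no error flag is pending. */
static bool flash_erase_possible(const flash_regs_t *regs)
{
    if ((regs->fclkdiv & FDIVLD_MASK) == 0u) {
        return false;  /* FCLKDIV never written since reset */
    }
    if ((regs->fprot & FPOPEN_MASK) == 0u) {
        return false;  /* array is protected */
    }
    if ((regs->fstat & (PVIOL_MASK | ACCERR_MASK)) != 0u) {
        return false;  /* uncleared error blocks new commands */
    }
    return true;
}
```

The idea is that application code leaves as many of these three conditions
false as it can, so that runaway code would have to undo all of them before
an erase command could succeed.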

>
> I am thinking I should increase the reset delay, but I do not want to
> increase it so much that I get my other problems back.
>
> Can anyone provide any advice on what may have caused my problem, and how
> I
> might remedy it. Particularly if it may be the short rest time. Apparently
> the reset should be held at any time below 4.63V input voltage, and for
> 10ms
> after it gets above 4.63V. What is the shortest reset delay that should
> still provide high reliability of the flash memory. I am not looking for a
> number that is much longer than it needs to be to be safe. I am looking
> for
> the shortest delay that people have found to provide reliable operation.

The length of the reset delay is less important for S12 than the need to
keep reset low while the voltage on Vdd is out of spec.
The older 912D60(A) and 912Dx128(A) could lose their flash contents due to
slow oscillator startup. Vdd could reach the specified level well before the
Colpitts oscillator became stable, so the CPU could start on an unstable
clock and run away. The S12 CPU and S12 flash have better protections
against this. Anyway, it's quite easy to verify whether the oscillator is
stable at the time the /reset line is pulled up. Of course, not too easy,
since for an automotive app you need to repeat your tests at extreme
temperatures.

I guess something is wrong with the bootloader. I'll take my words back if
your bootloader doesn't erase anything until it is told to do so.

Edward

Adrian,

Here are some suggestions about approaches to your problem, but no solutions.

Since the "12 V" voltage varies all over the place when cranking, it may be
that the problem can be fixed with simple power supply mods, like larger
filter capacitors or over-voltage protection on the "12 V" input, or both.

Because there are so many possibilities in a situation like this, and the
testing is so difficult, I always recommend initializing ALL the interrupt
vectors to go to a tight loop, preferably one per interrupt.

This makes development in the lab much easier if unexpected interrupts
occur. It also makes it much easier to rule out an unexpected interrupt as
a cause of field failures.

The COP watchdog can get you out of the loop with a reset. You might want
to do something that gives better information about the unexpected
interrupt in the field, but the program shouldn't do an RTI from an
unexpected interrupt, because the program can't remove the interrupt
source, since it didn't expect the interrupt.
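
A minimal sketch of the "one tight loop per interrupt" idea, assuming a
hypothetical `unexpected_irq_t` record and made-up vector numbers. On the
target the stubs would need the compiler's interrupt keyword or pragma and
would be placed in the vector table; the record would live in RAM that the
startup code does not clear, so the bootloader can report it after the COP
reset.

```c
#include <stdint.h>

/* Hypothetical record of the last unexpected interrupt. */
typedef struct {
    uint8_t vector;  /* which vector fired */
    uint8_t count;   /* how many have been recorded (saturates at 255) */
    uint8_t valid;   /* nonzero once something has been recorded */
} unexpected_irq_t;

static unexpected_irq_t g_unexpected;

/* Remember which vector fired so it can be examined after the reset. */
static void note_unexpected(uint8_t vector)
{
    g_unexpected.vector = vector;
    if (g_unexpected.count < 255u) {
        g_unexpected.count++;
    }
    g_unexpected.valid = 1u;
}

/* One stub per vector: record it, then spin without RTI until the COP
   watchdog resets the part. */
#define UNEXPECTED_ISR(name, vec) \
    void name(void) { note_unexpected(vec); for (;;) { } }

/* Vector numbers below are placeholders, not taken from the data sheet. */
UNEXPECTED_ISR(isr_unexpected_sci0, 20u)
UNEXPECTED_ISR(isr_unexpected_timer0, 8u)
```

Having a separate stub per vector means that a unit found stuck in the lab,
or a record read back in the field, identifies the offending source directly.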

You could also put in some defensive code just before the bootloader's
flash erasing that checks that the bootloader was properly entered and has
a good checksum. This should give you some assurance that the bootloader
was started properly and is not corrupted.
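
As a rough illustration of that defensive check, assuming a hypothetical
`ENTRY_TOKEN` that only the legitimate bootloader entry path writes, and a
simple additive checksum over the bootloader image (a CRC would be stronger):

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical value written only by the genuine bootloader entry path. */
#define ENTRY_TOKEN 0xB007u

/* 16-bit additive checksum over a block of bytes. */
static uint16_t sum16(const uint8_t *p, size_t n)
{
    uint16_t s = 0u;
    while (n-- > 0u) {
        s = (uint16_t)(s + *p++);
    }
    return s;
}

/* Gate to call immediately before any erase: refuse unless the bootloader
   was entered properly and its own image still matches its checksum. */
static bool erase_permitted(uint16_t entry_token,
                            const uint8_t *boot_image, size_t boot_len,
                            uint16_t expected_sum)
{
    if (entry_token != ENTRY_TOKEN) {
        return false;  /* arrived here by some other path, e.g. runaway code */
    }
    return sum16(boot_image, boot_len) == expected_sum;
}
```

Runaway code that jumps into the middle of the erase routine would then
still have to get past this check with the right token in place.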

Since you seem to have plenty of space, you could also use the "two copies"
bootloading method. You burn the new copy into an unused flash buffer,
leaving the current copy undisturbed in flash. Check the checksum of the
new copy. If it checks, the bootloader makes it the operational program,
if not, it uses the old version and forgets about the bad copy.

This way, you always have at least one undisturbed copy of the operational
code available no matter what accidents occur in downloading a new
copy. You still may have some exposure to disasters in the process of
making a new good copy the operational version, but it's much less, and
could possibly be eliminated completely.
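
The selection step of the "two copies" method might look like the sketch
below. The `image_t` layout, the additive checksum and the names are
assumptions for illustration only; how the chosen copy is then made
operational is left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one firmware copy held in flash. */
typedef struct {
    const uint8_t *data;
    size_t len;
    uint16_t stored_sum;  /* checksum stored alongside the image */
} image_t;

typedef enum { USE_NEW, USE_OLD, USE_NONE } image_choice_t;

/* 16-bit additive checksum over the image bytes. */
static uint16_t image_sum(const uint8_t *p, size_t n)
{
    uint16_t s = 0u;
    while (n-- > 0u) {
        s = (uint16_t)(s + *p++);
    }
    return s;
}

/* Nonzero if the image is non-empty and matches its stored checksum. */
static int image_ok(const image_t *img)
{
    return img->len > 0u && image_sum(img->data, img->len) == img->stored_sum;
}

/* Prefer the newly downloaded copy; fall back to the old one; if neither
   checks out, the caller should stay in the bootloader. */
static image_choice_t choose_image(const image_t *new_img, const image_t *old_img)
{
    if (image_ok(new_img)) {
        return USE_NEW;
    }
    if (image_ok(old_img)) {
        return USE_OLD;
    }
    return USE_NONE;
}
```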

Hope this helps.

Steve Russell
Nohau Emulators

Thank you Steve and Edward,

A bit more detail about how it works. There is indeed code in the unit that
can erase a flash sector. I leave the sectors unprotected in normal
operation (I recall having difficulties with this as once protected, it
cannot be unprotected without a reboot from memory as it is write once
within a certain time after boot). The sector erase code can only be invoked
by an RS232 command which is a lengthy data packet including a password and
a checksum (difficult for noise to reproduce this set of events... but it is
possible that the code to erase the sector could be accidentally called by
runaway code, I guess). The function that does the sector erase copies code
to ram, and then does the sector erase from ram and then returns to flash.
This function is only called once in my code in a function with a loop that
erases all non-bootcode flash sectors (in this case only the first sector
was erased). The function that does the full main firmware erase is only
ever called from an RS232 message, and this code intentionally invokes a COP
reset immediately after the full firmware is erased to cause the unit to
reboot into boot code, which can then reprogram the main firmware flash with
new firmware. This prevents the unit from returning from this function to
the now erased flash code that called the function. There was no RS232
connected at the time this problem took place (the product is normally tuned
with a laptop connected and then may never be connected again; normal
operation is without the laptop connected).

Now, I may have caused some confusion in the previous email. I have never had a unit
suffer a watchdog timeout in normal operation. If this did happen the unit
would enter bootmode (the engine would stop and require a power on reset to
enter normal mode again), but this condition would be reset on the next cold
bootup. In the COP reset vector I set a variable differently to power on
reset, and this causes the unit to stay in boot mode. A normal power on
reset will not set this variable and the result will be that the unit falls
out of bootmode as normal. The reason I did this was in case the firmware in
the unit was corrupted, the unit would stay in bootmode, and a PC can be
connected to the unit to replace the firmware, as the bootmode allows RS232
firmware update.... where corrupted firmware would not allow this. This
caters for cases where the unit loses power part way through a firmware
update, or uses corrupted firmware files, or if the firmware gets corrupted
in some other way.... which is what happened in this case. This unit was
actually exiting bootmode as it should on a power on reset, and entering the
normal firmware. Because the first sector had been erased, it was jumping to
code in this erased area or more likely some function pointers in this
sector that now pointed to 0xFFFF. This caused the code to run away, and a
COP timeout which causes a COP reset that was detected by the bootcode, and
it remained in boot mode. This occurred on every power on reset, rendering
the engine useless as the unit would not run it.
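
The boot behaviour described in this paragraph can be modelled roughly as
below. The type and function names are hypothetical; the point is only to
show why a unit with an erased first sector ends up in boot mode on every
ignition cycle even though a power on reset normally leaves it.

```c
#include <stdbool.h>

typedef enum { RESET_POWER_ON, RESET_COP } reset_cause_t;
typedef enum { RUN_APPLICATION, STAY_IN_BOOT } boot_action_t;

/* Bootloader decision: a COP reset or missing firmware keeps the unit in
   boot mode; a power on reset with firmware present starts the application. */
static boot_action_t boot_decide(reset_cause_t cause, bool firmware_present)
{
    if (cause == RESET_COP) {
        return STAY_IN_BOOT;
    }
    if (!firmware_present) {
        return STAY_IN_BOOT;
    }
    return RUN_APPLICATION;
}

/* Models one ignition cycle: power on, then a COP reset if the application
   was started and ran away (for example through a pointer read as 0xFFFF). */
static boot_action_t power_cycle(bool firmware_present, bool app_runs_away)
{
    boot_action_t action = boot_decide(RESET_POWER_ON, firmware_present);
    if (action == RUN_APPLICATION && app_runs_away) {
        action = boot_decide(RESET_COP, firmware_present);
    }
    return action;
}
```

With `firmware_present` true and `app_runs_away` true, every call to
`power_cycle` ends in `STAY_IN_BOOT`, which matches what was seen on the
returned units.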

I actually have no problem with the way I handle this condition..... the
firmware is corrupted and should not be running the engine. I put this code
in to handle firmware upgrade failures, and it does handle this.
Unfortunately, I never intended for the unit to be able to corrupt its own
firmware once in service. My aim is to prevent this from happening again. It
had not happened before, and has recently happened to three units in the
last batch of 100 (another unit today since yesterday; the firmware in the
units has not changed recently), which makes me wonder if it is hardware
related.

I do know that I have run this product without a reset chip before (when
trying to test whether the reset delay was the cause of my previous problem
interfacing with other devices, caused by slow bootup), and I did have one
instance of the entire flash being erased when doing this (never a single
sector). I have never had this problem in earlier hardware revisions when
running with a reset time of 200ms using the MAX809L reset chip (with fixed
reset delay of 200ms). Unfortunately this 200ms delay caused other problems,
so I replaced the reset chip with the LP3470M5-4.63V. The delay was set to
10ms. This should not cause a problem but who knows?

It is a huge coincidence to have only units from the most recent batch show
this problem when there are at least double the number of units of the most
recent batch in the field running the same firmware. I have to check for any
accidental changes to this batch. I should say that the reset chip change
has been in the last 3 batches also, and it would seem that the problem is
limited to the most recent batch. Is there anything hardware related other
than the incorrect operation of the reset line that could cause flash to be
erased?

Thanks heaps for your help. I am really clutching at straws to fix a problem
that I cannot reproduce. I have got a unit that has done this that I have
been able to see what happened (erased sector), but that is the only info I
have.

Cheers,

Adrian

Adrian,

the best advice I can give you is: don't disregard any scenario just because
it doesn't make sense. I believe your password and command matching sequence
is very robust, but you still don't know how to reproduce this failure. I
don't want to lead you in a wrong and time-consuming direction, but I also
don't believe in runaways and unreliable S12 flash. There must be some
explanation of how and why. If S12 flash can erase itself, then the flash
command logic must be unreliable too; maybe it does not work at all, or
doesn't reset to its default values at power up. Why would it not reset to
the power-on state? There would have to be something very odd on the /reset
pin; maybe the reset logic doesn't work at all either, or doesn't work at
some temperatures? OK, do all three defective units have the sector at $4000
erased, or do they have different sectors erased? If in all cases the erased
region starts at $4000, then maybe it's a sign that the firmware had started
erasing the flash from bottom to top?

See more below.

Adrian Vos wrote:

> Thankyou Steve and Edward,
>
> A bit more detail about how it works. There is indeed code in the unit
> that
> can erase a flash sector. I leave the sectors unprotected in normal
> operation (I recall having difficulties with this as once protected, it
> cannot be unprotected without a reboot from memory as it is write once
> within a certain time after boot). The sector erase code can only be
> invoked
> by an RS232 command which is a lengthy data packet including a password
> and
> a checksum (difficult for noise to reproduce this set of events.... but
> possible that the code to erase the vector could be accidentally called by
> runaway code I guess). The function that does the sector erase copies code

Suppose there really was noise on RS232 that somehow got past your password
sequence. Is there a delay between the acknowledgement of the command and
the start of the flash erase procedure? If this smart noise happened at
power down, then the /reset pin would have a chance to drop to its asserted
level before the start of the catastrophic erase()... Excessive, out-of-spec
noise shouldn't last very long, so a delay between command and erase should
prevent such a scenario. Maybe even better: command-delay-command-delay-erase.
Are you ignoring framing errors while receiving password sequences? You
could make the RS232 flash erase command restart the password matching
sequence if there was a framing error or an excessive delay between password
characters.
How are you recognising RS232 commands and passwords? Are you buffering
characters until some terminator and then comparing whole strings? If so,
the RAM buffer could still hold all or part of the valid password/command
chars received previously: just one or a couple of matching chars in the
noise and the flash is erased.
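
A minimal sketch of a password matcher along these lines, assuming a
hypothetical `pw_matcher_t` state, a made-up `MAX_GAP_TICKS` timeout and a
caller that passes in the framing error flag and a tick count for each
received byte. It keeps no character buffer, so nothing from an earlier
session can be left behind, and it restarts on a framing error or on too
long a gap between characters.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_GAP_TICKS 50u  /* hypothetical inter-character timeout, in ticks */

typedef struct {
    const uint8_t *password;
    size_t length;
    size_t matched;      /* characters matched so far */
    uint32_t last_tick;  /* time of the last byte that was examined */
} pw_matcher_t;

static void pw_init(pw_matcher_t *m, const uint8_t *password, size_t length)
{
    m->password = password;
    m->length = length;
    m->matched = 0u;
    m->last_tick = 0u;
}

/* Feed one received byte. Returns true only when the whole password has
   arrived with no framing error and no excessive gap between characters. */
static bool pw_feed(pw_matcher_t *m, uint8_t byte, bool framing_error, uint32_t now)
{
    if (m->length == 0u) {
        return false;
    }
    if (framing_error) {
        m->matched = 0u;  /* discard all progress and the damaged byte */
        return false;
    }
    if (m->matched > 0u && (uint32_t)(now - m->last_tick) > MAX_GAP_TICKS) {
        m->matched = 0u;  /* too slow: start over with this byte */
    }
    if (byte != m->password[m->matched]) {
        /* Mismatch: restart, letting this byte begin a new attempt. */
        m->matched = (byte == m->password[0]) ? 1u : 0u;
        m->last_tick = now;
        return false;
    }
    m->matched++;
    m->last_tick = now;
    if (m->matched == m->length) {
        m->matched = 0u;  /* require a full new sequence next time */
        return true;
    }
    return false;
}
```

A command-delay-command-delay-erase scheme could be layered on top by
requiring two successful matches separated by a minimum time before the
erase routine is allowed to run.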

> to ram, and then does the sector erase from ram and then returns to flash.
> This function is only called once in my code in a function with a loop
> that
> erases all not bootcode flash sectors (in this case only the first sector
> was erased). The function that does the full main firmware erase is only
> ever called from an RS232 message, and this code intentionally invokes a
> COP
> reset immediately after the full firmware is erased to cause the unit to
> reboot into boot code, which can then reprogram the main firmware flash
> with
> new firmware. This prevents the unit from returning from this function to
> the now erased flash code that called the function. There was no RS232
> connected at the time this problem took place (The product is normally
> tuned
> with laptop connected and them may never be connected again.. normal
> operation is without the laptop connected).

Sounds reasonable. But if the reliability of the flash and the reset pin
circuit is in question, then I think FPROT=0 could help you. You could keep
using a COP reset to enter bootmode, which also resets the FPROT register to
the value of the flash protection byte. The bootloader could then receive
the password, erase and program commands. This way you could limit the time
during which flash is unprotected. On a PON reset the bootloader could check
the checksums, set FPROT=0 and jump to the application code.

>
> Now I may have confused in the previous email. I have never had a unit
> suffer a watchdog timeout in normal operation. If this did happen the unit
> would enter bootmode (the engine would stop and require a power on reset
> to
> enter normal mode again), but this condition would be reset on the next
> cold
> bootup. In the COP reset vector I set a variable differently to power on
> reset, and this causes the unit to stay in boot mode. A normal power on
> reset will not set this variable and the result will be that the unti
> falls
> out of bootmode as normal. The reason I did this was in case the firmware
> in
> the unit was corrupted, the unit would stay in bootmode, and a PC can be
> connected to the unit to replace the firmware, as the bootmode allows
> RS232
> firmware update.... where corrupted firmware would not allow this. This
> caters for cases where the unit loses power part way through a firmware
> update, or uses corrupted firmware files, or if the firmware gets
> corrupted
> in some other way.... which is what happened in this case. This unit was
> actually exiting bootmode as it should on a power on reset, and entering
> the
> normal firmware. Because the first sector had been erased, it was jumping
> to
> code in this erased area or more likely some function pointers in this
> sector that now pointed to 0xFFFF. This caused the code to runaway, and a
> COP timeout which causes a COP reset that was detected by the bootcode,
> and
> it remained in boot mode. This occured every time on power on reset
> rendering the engine useless as the unit would not run it.
>

Looks like this narrows the times at which the failure could happen to
power up/down and the interval between power up and engine start? No more
ideas.
> I actually have no problem with the way I handle this condition..... the
> firmware is corrupted and should not be running the engine. I put this
> code
> in to handle firmware upgrade failures, and it does handle this.
> Unfortunately, I never intended for the unit to be able to corrupt its own
> firmware once in service. My aim is to prevent this from happening again.
> It
> has not happened before recently, and has happened to three (another unit
> today since yesterday..... the firmware in the units has not changed
> recently) units in the last batch of 100, which makes me wonder if it is
> hardware related.
>
> I do know that when I have run this product without a reset chip before
> (when trying to test if the reset delay was the cause of my previous
> problem
> interfacing with other devices cause by slow bootup), and I did have one
> instance of the entire flash being erased when doing this (never a single
> sector). I have never had this problem in earlier hardware revisions when
> running with a reset time of 200ms using the MAX809L reset chip (with
> fixed
> reset delay of 200ms). Unfortunately this 200ms delay caused other
> problems,
> so I replaced the reset chip with the L3470-4.63V. The delay was set to
> 10ms. This should not cause a problem but who knows?
>

What can you tell us about the oscillator? Is it crystal + Colpitts mode?
Colpitts does start slowly. S12 is a bit better protected against slow
oscillator startup than the D60A, but since the D60A I have been using only
the Pierce oscillator and have had no problems. BTW, for the D60A and its
Colpitts oscillator, a reset delay of 100-200ms was more than enough. Maybe
try to measure how fast the oscillator starts on a defective unit and
compare it to a good unit.

> It is a huge coincidence to have only units from the most recent batch
> have
> this problems when there is atleast double the number from the most recent
> batch in the field running the same firmware. I have to check for any
> accidental changes to this batch. I should say that the reset chip change
> has been in the last 3 batches also, and it would seem that the problem is
> limitted to the most recent batch. Is there anything hardware related
> other
> than the incorrect operation of the reset line that could cause flash to
> be
> erased?

Since COP only resets to the bootloader and doesn't erase any portion of
flash, I see only reset delay vs oscillator startup and possibly
insufficient protection against the RS232 erase command. Regarding reset and
its driver, I think that releasing the reset line before the oscillator has
stabilised or while Vdd is too low is not enough on its own to erase flash.
A mad CPU also has to write something to FCLKDIV, then write a word to a
word-aligned flash address, write a valid flash sector erase command, and
write to the flash command register. Too many coincidences would have to
take place to make it happen... BTW, do you have a large capacitor on the
reset line? I hope you don't, otherwise Vdd could drop below the valid level
while the reset line is kept high, sourced from the capacitor. Of course a
MAX809 or similar should discharge the /reset line capacitance quickly, but
it's still something to check.
>
> Thanks heaps for your help. I am really clutching at straws to fix a
> problem that I cannot reproduce. I have got a unit that has done this that
> I have been able to see what happened (erased sector), but that is the only
> info I have.
>

No more straws, sorry
> Cheers,
>
> Adrian
>

Regards,
Edward

> ----- Original Message -----
> From: "Steve Russell"
> To: <6...>
> Sent: Thursday, October 26, 2006 4:39 AM
> Subject: Re: [68HC12] Flash sector erase on S12DP256B
>> Adrian,
>>
>> Here are some suggestions about approaches to your problem, but no
>> solutions.
>>
>> Since the "12 V" voltage varies all over the place when cranking, it may
>> be that the problem can be fixed with simple power supply mods, like
>> larger filter capacitors or over-voltage protection on the "12 V" input,
>> or both.
>>
>> Because there are so many possibilities in a situation like this, and the
>> testing is so difficult, I always recommend initializing ALL the
>> interrupt vectors to go to a tight loop, preferably one per interrupt.
>>
>> This makes development in the lab much easier if unexpected interrupts
>> occur. It also makes it much easier to rule out an unexpected interrupt
>> as a cause of field failures.
>>
>> The COP watchdog can get you out of the loop with a reset. You might
>> want to do something that gives better information about the unexpected
>> interrupt in the field, but the program shouldn't do an RTI from an
>> unexpected interrupt, because the program can't remove the interrupt
>> source, since it didn't expect the interrupt.
>>
>> You could also put in some defensive code just before the bootloader's
>> flash erasing that checks that the bootloader was properly entered and
>> has a good checksum. This should give you some assurance that the
>> bootloader was started properly and is not corrupted.
>>
>> Since you seem to have plenty of space, you could also use the "two
>> copies" bootloading method. You burn the new copy into an unused flash
>> buffer, leaving the current copy undisturbed in flash. Check the checksum
>> of the new copy. If it checks, the bootloader makes it the operational
>> program; if not, it uses the old version and forgets about the bad copy.
>>
>> This way, you always have at least one undisturbed copy of the
>> operational code available no matter what accidents occur in downloading
>> a new copy. You still may have some exposure to disasters in the process
>> of making a new good copy the operational version, but it's much less,
>> and could possibly be eliminated completely.
>>
>> Hope this helps.
>>
>> Steve Russell
>> Nohau Emulators
>>
Adrian,

Some notes on further straws below.

My reading of all this is that the units did not fail in the middle of
operation, but failed only on power up.

Is this correct?

Steve Russell
Nohau Emulators

At 05:42 PM 10/25/2006, Adrian Vos wrote:
>Thank you Steve and Edward,
>
>A bit more detail about how it works. There is indeed code in the unit that
>can erase a flash sector. I leave the sectors unprotected in normal
>operation (I recall having difficulties with this as once protected, it
>cannot be unprotected without a reboot from memory as it is write once
>within a certain time after boot).

I think you are confusing the HC11 and the HC12 families.

The HC11 has the "within the first 64 cycles after reset" write restriction
on some registers.

The HC12 family has always been "write once" restriction only, sometimes
with the errata "write twice in special modes".

>The sector erase code can only be invoked
>by an RS232 command which is a lengthy data packet including a password and
>a checksum (difficult for noise to reproduce this set of events.... but
>possible that the code to erase the sector could be accidentally called by
>runaway code I guess). The function that does the sector erase copies code
>to RAM, and then does the sector erase from RAM and then returns to flash.
>This function is only called once in my code, in a function with a loop that
>erases all non-bootcode flash sectors (in this case only the first sector
>was erased). The function that does the full main firmware erase is only
>ever called from an RS232 message, and this code intentionally invokes a COP
>reset immediately after the full firmware is erased to cause the unit to
>reboot into boot code, which can then reprogram the main firmware flash with
>new firmware. This prevents the unit from returning from this function to
>the now erased flash code that called the function. There was no RS232
>connected at the time this problem took place (the product is normally tuned
>with a laptop connected and then may never be connected again.. normal
>operation is without the laptop connected).
>
>I may have caused confusion in my previous email. I have never had a unit
>suffer a watchdog timeout in normal operation. If this did happen the unit
>would enter bootmode (the engine would stop and require a power on reset to
>enter normal mode again), but this condition would be reset on the next cold
>bootup. In the COP reset vector I set a variable differently to power on
>reset, and this causes the unit to stay in boot mode. A normal power on
>reset will not set this variable and the result will be that the unit falls
>out of bootmode as normal. The reason I did this was in case the firmware in
>the unit was corrupted, the unit would stay in bootmode, and a PC can be
>connected to the unit to replace the firmware, as the bootmode allows RS232
>firmware update.... where corrupted firmware would not allow this. This
>caters for cases where the unit loses power part way through a firmware
>update, or uses corrupted firmware files, or if the firmware gets corrupted
>in some other way.... which is what happened in this case.

I think that this means that the erasing occurred somewhere between power
off and the de-assertion of reset on power on.
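
The boot decision described in the quoted text can be sketched as a small
pure function. This is only an illustration in C under stated assumptions:
the type and function names are hypothetical, the reset cause is assumed to
be recorded by the reset vectors as described, and the extra check that a
function-pointer table contains no erased (0xFFFF) entries is an addition of
this sketch, not something the original bootloader is known to do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ERASED_WORD 0xFFFFu /* an erased flash word reads as all ones */

typedef enum { RESET_POWER_ON, RESET_COP } reset_cause;
typedef enum { STAY_IN_BOOT, RUN_MAIN_FIRMWARE } boot_action;

/* True if no entry of a function-pointer table reads as erased flash. */
bool table_is_programmed(const uint16_t *table, size_t count)
{
    assert(table != NULL);
    for (size_t i = 0; i < count; i++) {
        if (table[i] == ERASED_WORD) {
            return false;
        }
    }
    return true;
}

/* Hypothetical boot decision: a COP reset always keeps the unit in boot
   mode; a power-on reset runs the main firmware only if the table in the
   first sector looks programmed. */
boot_action boot_decide(reset_cause cause, const uint16_t *table, size_t count)
{
    if (cause == RESET_COP) {
        return STAY_IN_BOOT;
    }
    return table_is_programmed(table, count) ? RUN_MAIN_FIRMWARE : STAY_IN_BOOT;
}
```

With a check like this, a unit with a blank first sector would stay in boot
mode directly instead of running into erased pointers and waiting for the
COP timeout. A full checksum over the main firmware would catch more than
this one table, at the cost of a longer startup.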

>This unit was
>actually exiting bootmode as it should on a power on reset, and entering the
>normal firmware. Because the first sector had been erased, it was jumping to
>code in this erased area or more likely some function pointers in this
>sector that now pointed to 0xFFFF. This caused the code to run away, and a
>COP timeout which causes a COP reset that was detected by the bootcode, and
>it remained in boot mode. This occurred every time on power on reset
>rendering the engine useless as the unit would not run it.
>
>I actually have no problem with the way I handle this condition..... the
>firmware is corrupted and should not be running the engine. I put this code
>in to handle firmware upgrade failures, and it does handle this.
>Unfortunately, I never intended for the unit to be able to corrupt its own
>firmware once in service. My aim is to prevent this from happening again. It
>had never happened until recently, and has now happened to three units in the
>last batch of 100 (another one turned up today; the firmware in the units has
>not changed recently), which makes me wonder if it is hardware related.

This suggests looking carefully at the differences between the failing unit
and a sample of the previous batch that worked. Did anything at all change?

Perhaps the PCB material is slightly different or the board is not as well
cleaned. Either might change the oscillator parameters enough to cause
trouble.

Changes in the power supply and input filtering components that cause less
protection than previously are also possible.

Perhaps the failing units are somehow installed differently from the first
2 changed batches? Installed in an electrically noisier environment than
previous units?

Could the failing units have been operated at more extreme temperatures
than previous units? Crystal oscillator startup problems are rumored to
show at temperature extremes.

>I have run this product without a reset chip before (when trying to test
>if the reset delay was the cause of my previous problem interfacing with
>other devices, caused by slow bootup), and I did have one instance of the
>entire flash being erased when doing this (never a single sector). I have
>never had this problem in earlier hardware revisions when running with a
>reset time of 200ms using the MAX809L reset chip (with fixed reset delay of
>200ms). Unfortunately this 200ms delay caused other problems, so I replaced
>the reset chip with the LP3470-4.63V. The delay was set to 10ms. This should
>not cause a problem but who knows?

I think that triggering a scope on the de-assertion of reset at the MCU and
looking at some suitable clock signal and the reset line while going
through bad input power scenarios would be instructive. If you can enable
ECLK and look at that it would be ideal.

If I did these tests, I would also try freeze spray and heat gun to get a
feel for extreme temperature operation. You have to be fairly cautious
about changing the temperature of the board too fast. Perhaps soaking in a
refrigerator and an oven is a better plan.

I guess that I would also try very carefully re-reading the data sheets and
relevant application notes for both reset chips, looking for subtle
differences and hints about gotchas.

>It is a huge coincidence to have only units from the most recent batch have
>this problem when there is at least double the number from the most recent
>batch in the field running the same firmware. I have to check for any
>accidental changes to this batch.

Very carefully! The circumstantial evidence for a change is pretty
convincing.

>I should say that the reset chip change
>has been in the last 3 batches also, and it would seem that the problem is
>limited to the most recent batch. Is there anything hardware related other
>than the incorrect operation of the reset line that could cause flash to be
>erased?

Noisy power or wildly varying clock period. Both have been seen to cause
wildly unpredictable operation of HCS-12 parts.
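
If runaway execution is a concern, one software-side mitigation is to guard
the erase routine with keys that are only set along the legitimate command
path, in the spirit of the defensive check suggested earlier in the thread.
A minimal sketch in C, assuming hypothetical names and key values throughout
(nothing here comes from the actual firmware):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define KEY_PACKET_OK   0x5Au /* hypothetical: set after password + checksum pass */
#define KEY_ERASE_ARMED 0xA5u /* hypothetical: set just before the erase loop */

/* Two independent keys; both must hold the right value for an erase. */
typedef struct {
    volatile uint8_t key_a;
    volatile uint8_t key_b;
} erase_guard;

void guard_clear(erase_guard *g)
{
    assert(g != NULL);
    g->key_a = 0;
    g->key_b = 0;
}

/* Call once the RS232 packet has passed its password and checksum tests. */
void guard_packet_verified(erase_guard *g)
{
    g->key_a = KEY_PACKET_OK;
}

/* Call immediately before entering the erase loop; has no effect unless
   the packet was verified first. */
void guard_arm(erase_guard *g)
{
    if (g->key_a == KEY_PACKET_OK) {
        g->key_b = KEY_ERASE_ARMED;
    }
}

/* Call at the top of the RAM-resident erase routine. The keys are consumed,
   so each approval covers a single call. */
bool guard_erase_allowed(erase_guard *g)
{
    bool ok = (g->key_a == KEY_PACKET_OK) && (g->key_b == KEY_ERASE_ARMED);
    guard_clear(g);
    return ok;
}
```

Code that jumps into the erase routine without having passed through the
packet check finds the keys cleared and the routine can return without
touching flash. This does nothing against a fault that occurs while reset is
misbehaving, so it complements the hardware checks rather than replacing
them.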

Are the XFC filter components unchanged and well soldered?
I've seen some really bad behavior when switching to the on-chip PLL with
nothing connected to XFC.

>Thanks heaps for your help. I am really clutching at straws to fix a problem
>that I cannot reproduce. I have got a unit that has done this that I have
>been able to see what happened (erased sector), but that is the only info I
>have.
>
>Cheers,
>
>Adrian
>
Thanks Edward and Steve,

Unfortunately I have only been able to test one of the failed units since
the others had their firmware replaced in the field to get the cars running
again. They will probably do the same thing again?!?!? :(

Anyway, further investigation down the hardware track led me to find that
the reset chip on this unit was faulty. The reset time was around 2ms to 4ms
and not consistently the same time, but changing in this range each time. I
then noticed it was poorly soldered on by hand. I resoldered it and it was
still providing the same results. I tried different values for the capacitor
that sets the reset time, and this made no difference to the reset time. I
replaced the chip and went back to the correct capacitor value, and the reset
time went back to the consistent 10ms it was designed for. I am not sure
whether a 2ms reset time is short enough for the oscillator not to have
started up properly, but my research shows it is much lower than the
recommended reset time, so it is definitely a possibility.

I have since found out that in the last batch of units, they ran out of
these reset chips, and returned half the batch without the reset chip
fitted. Someone at my company then hand soldered the rest of the reset chips
on when we received more stock. The soldering seems to have been pretty
ordinary, as not only is the part misaligned (even though the connections
were good), but the part has been damaged in some way (static, or
overheating??) that makes it ignore the external capacitor that sets the
reset time. One can only presume it had other undesired effects on the
reset output also (maybe at temperature extremes, the reset time may have
been further reduced, or even eliminated??)

At this stage, I am going to assume this is the cause of the problem in the
absence of further evidence to provide any other possibility, and the lack
of evidence to conclusively find the problem. I am happy because this should
mean the previous batch field units are fine (and I do not have to alter the
design... but have to alter quality control), but I now have to try and find
and fix any other units that may have the same problem. :(

Thanks heaps for your help. It really gave me direction in looking into
this.

Cheers,

Adrian

----- Original Message -----
From: "Steve Russell"
To: <6...>; <6...>
Sent: Friday, October 27, 2006 6:11 AM
Subject: Re: [68HC12] Flash sector erase on S12DP256B
> Adrian,
>
> Some notes on further straws below.
>
> My reading of all this is that the units did not fail in the middle of
> operation, but failed only on power up.
>
> Is this correct?
>
> Steve Russell
> Nohau Emulators
>
> At 05:42 PM 10/25/2006, Adrian Vos wrote:
>>Thankyou Steve and Edward,
>>
>>A bit more detail about how it works. There is indeed code in the unit
>>that
>>can erase a flash sector. I leave the sectors unprotected in normal
>>operation (I recall having difficulties with this as once protected, it
>>cannot be unprotected without a reboot from memory as it is write once
>>within a certain time after boot).
>
> I think you are confusing the HC11 and the HC12 families.
>
> The HC11 has the "within the first 64 cycles after reset" write
> restriction
> on some registers.
>
> The HC12 family has always been "write once" restriction only, sometimes
> with the errata "write twice in special modes".
>
>>The sector erase code can only be invoked
>>by an RS232 command which is a lengthy data packet including a password
>>and
>>a checksum (difficult for noise to reproduce this set of events.... but
>>possible that the code to erase the vector could be accidentally called by
>>runaway code I guess). The function that does the sector erase copies code
>>to ram, and then does the sector erase from ram and then returns to flash.
>>This function is only called once in my code in a function with a loop
>>that
>>erases all not bootcode flash sectors (in this case only the first sector
>>was erased). The function that does the full main firmware erase is only
>>ever called from an RS232 message, and this code intentionally invokes a
>>COP
>>reset immediately after the full firmware is erased to cause the unit to
>>reboot into boot code, which can then reprogram the main firmware flash
>>with
>>new firmware. This prevents the unit from returning from this function to
>>the now erased flash code that called the function. There was no RS232
>>connected at the time this problem took place (The product is normally
>>tuned
>>with laptop connected and them may never be connected again.. normal
>>operation is without the laptop connected).
>>
>>Now I may have confused in the previous email. I have never had a unit
>>suffer a watchdog timeout in normal operation. If this did happen the unit
>>would enter bootmode (the engine would stop and require a power on reset
>>to
>>enter normal mode again), but this condition would be reset on the next
>>cold
>>bootup. In the COP reset vector I set a variable differently to power on
>>reset, and this causes the unit to stay in boot mode. A normal power on
>>reset will not set this variable and the result will be that the unti
>>falls
>>out of bootmode as normal. The reason I did this was in case the firmware
>>in
>>the unit was corrupted, the unit would stay in bootmode, and a PC can be
>>connected to the unit to replace the firmware, as the bootmode allows
>>RS232
>>firmware update.... where corrupted firmware would not allow this. This
>>caters for cases where the unit loses power part way through a firmware
>>update, or uses corrupted firmware files, or if the firmware gets
>>corrupted
>>in some other way.... which is what happened in this case.
>
> I think that this means that the erasing occurred somewhere between power
> off and the de-assertion of reset on power on.
>
>>This unit was
>>actually exiting bootmode as it should on a power on reset, and entering
>>the
>>normal firmware. Because the first sector had been erased, it was jumping
>>to
>>code in this erased area or more likely some function pointers in this
>>sector that now pointed to 0xFFFF. This caused the code to runaway, and a
>>COP timeout which causes a COP reset that was detected by the bootcode,
>>and
>>it remained in boot mode. This occured every time on power on reset
>>rendering the engine useless as the unit would not run it.
>>
>>I actually have no problem with the way I handle this condition..... the
>>firmware is corrupted and should not be running the engine. I put this
>>code
>>in to handle firmware upgrade failures, and it does handle this.
>>Unfortunately, I never intended for the unit to be able to corrupt its own
>>firmware once in service. My aim is to prevent this from happening again.
>>It
>>has not happened before recently, and has happened to three (another unit
>>today since yesterday..... the firmware in the units has not changed
>>recently) units in the last batch of 100, which makes me wonder if it is
>>hardware related.
>
> This suggests looking carefully at the differences between the failing
> unit
> and a sample of the previous batch that worked. Did anything at all
> change?
>
> Perhaps the PCB material is slightly different or the board is not as well
> cleaned. Either might change the oscillator parameters enough to cause
> trouble.
>
> Changes in the power supply and input filtering components that cause less
> protection that previously are also possible
>
> Perhaps the failing units are somehow installed differently from the first
> 2 changed batches? Installed in an electrically noisier environment that
> previous units?
>
> Could the failing units have been operated at more extreme temperatures
> that previous units? Crystal oscillator startup problems are rumored to
> show at temperature extremes.
>
>>I do know that when I have run this product without a reset chip before
>>(when trying to test if the reset delay was the cause of my previous
>>problem
>>interfacing with other devices cause by slow bootup), and I did have one
>>instance of the entire flash being erased when doing this (never a single
>>sector). I have never had this problem in earlier hardware revisions when
>>running with a reset time of 200ms using the MAX809L reset chip (with
>>fixed
>>reset delay of 200ms). Unfortunately this 200ms delay caused other
>>problems,
>>so I replaced the reset chip with the L3470-4.63V. The delay was set to
>>10ms. This should not cause a problem but who knows?
>
> I think that triggering a scope on the de-assertion of reset at the MCU
> and
> looking at some suitable clock signal and the reset line while going
> through bad input power scenarios would be instructive. If you can enable
> ECLK and look at that it would be ideal.
>
> If I did these tests, I would also try freeze spray and heat gun to get a
> feel for extreme temperature operation. You have to be fairly cautions
> about changing the temperature of the board too fast. Perhaps soaking in
> a
> refrigerator and an oven is a better plan.
>
> I guess that I would also try very carefully re-reading the data sheets
> and
> relevant application notes for both reset chips, looking for subtle
> differences and hints about gotchas.
>
>>It is a huge coincidence to have only units from the most recent batch
>>have
>>this problems when there is atleast double the number from the most recent
>>batch in the field running the same firmware. I have to check for any
>>accidental changes to this batch.
>
> Very carefully! The circumstantial evidence for a change is pretty
> convincing.
>
>>I should say that the reset chip change
>>has been in the last 3 batches also, and it would seem that the problem is
>>limitted to the most recent batch. Is there anything hardware related
>>other
>>than the incorrect operation of the reset line that could cause flash to
>>be
>>erased?
>
> Noisy power or wildly varying clock period. Both have been seen to cause
> wildly unpredictable operation of HCS-12 parts.
>
> Are the XFC filter components unchanged and well soldered?
> I've seen some really bad behavior when switching to the on-chip PLL with
> nothing connected to XFC.
>
>>Thanks heaps for your help. I am really clutching at straws to fix a
>>problem
>>that I cannot reproduce. I have got a unit that has done this that I have
>>been able to see what happened (erased sector), but that is the only info
>>I
>>have.
>>
>>Cheers,
>>
>>Adrian
>>
>>----- Original Message -----
>>From: "Steve Russell"
>>To: <6...>
>>Sent: Thursday, October 26, 2006 4:39 AM
>>Subject: Re: [68HC12] Flash sector erase on S12DP256B
>> > Adrian,
>> >
>> > Here are some suggestions about approaches to your problem, but no
>> > solutions.
>> >
>> > Since the "12 V" voltage varies all over the place when cranking, it
>> > may
>> > be
>> > that the problem can be fixed with simple power supply mods, like
>> > larger
>> > filter capacitors or over-voltage protection on the "12 V" input, or
>> > both.
>> >
>> > Because there are so many possibilities in a situation like this, and
>> > the
>> > testing is so difficult, I always recommend initializing ALL the
>> > interrupt
>> > vectors to go to a tight loop, preferably one per interrupt.
>> >
>> > This makes development in the lab much easier if unexpected interrupts
>> > occur. It also makes it much easier to rule out an unexpected
>> > interrupt
>> > as
>> > a cause of field failures.
>> >
>> > The COP watchdog can get you out of the loop with a reset. You might
>> > want
>> > to do something that gives better information about the unexpected
>> > interrupt in the field, but the program shouldn't do an RTI from an
>> > unexpected interrupt, because the program can't remove the interrupt
>> > source, since the it didn't expect the interrupt.
>> >
>> > You could also put in some defensive code just before the bootloader's
>> > flash erasing that checks that the bootloader was properly entered and
>> > has
>> > a good checksum. This should give you some assurance that the
>> > bootloader
>> > was started properly and is not corrupted.
>> >
>> > Since you seem to have plenty of space, you could also use the "two
>> > copies"
>> > bootloading method. You burn the new copy into an unused flash buffer,
>> > leaving the current copy undisturbed in flash. Check the checksum of
>> > the
>> > new copy. If it checks, the bootloader makes it the operational
>> > program,
>> > if not, it uses the old version and forgets about the bad copy.
>> >
>> > This way, you always have at least one undisturbed copy of the
>> > operational
>> > code available no matter what accidents occur in downloading a new
>> > copy. You still may have some exposure to disasters in the process of
>> > making a new good copy the operational version, but its much less, and
>> > could possibly be eliminated completely.
>> >
>> > Hope this helps.
>> >
>> > Steve Russell
>> > Nohau Emulators
>> >
Adrian,

Thanks very much for the update!

I always like to hear how mysterious problems get solved.

I believe that there is no procedure so simple or so foolproof that it
can't be screwed up by someone not paying attention or someone trying to
save a penny.

Unfortunately, I occasionally demonstrate the "not paying attention" part
of this.

Steve Russell
Nohau Emulators

At 05:40 PM 10/26/2006, Adrian Vos wrote:
>Thanks Edward and Steve,
>
>Unfortunately I have only been able to test one of the failed units since
>the others had their firmware replaced in the field to get the cars running
>again. It is quite likely they will do the same thing again?!?!? :(
>
>Anyway, further investigation down the hardware track led me to find that
>the reset chip on this unit was faulty. The reset time was around 2ms to 4ms
>and not consistent, changing within this range each time. I then noticed
>it was poorly soldered on by hand. I resoldered it and it was still
>giving the same results. I tried different values for the capacitor that
>sets the reset time, and this made no difference to the reset time. I
>replaced the chip and went back to the correct capacitor value, and the
>reset time went back to the consistent 10ms it was designed for. I am not
>sure if a 2ms reset time would be short enough for the oscillator to not
>have started up properly, but my research shows it to be much shorter than
>the recommended reset time, so definitely a possibility.
>
>I have since found out that in the last batch of units, they ran out of
>these reset chips, and returned half the batch without the reset chip
>fitted. Someone at my company then hand soldered the rest of the reset chips
>on when we received more stock. The soldering seems to have been pretty
>ordinary: not only is the part misaligned (even though the connections
>were good), but the part has been damaged in some way (static, or
>overheating??) so that it ignores the external capacitor that sets the
>reset time. One can only presume it had other undesired effects on the
>reset output also (maybe at temperature extremes, the reset time may have
>been further reduced, or even eliminated??)
>
>At this stage, I am going to assume this is the cause of the problem in the
>absence of further evidence to provide any other possibility, and the lack
>of evidence to conclusively find the problem. I am happy because this should
>mean the previous batch field units are fine (and I do not have to alter the
>design... but have to alter quality control), but I now have to try and find
>and fix any other units that may have the same problem. :(
>
>Thanks heaps for your help. It really gave me direction in looking into
>this.
>
>Cheers,
>
>Adrian
>----- Original Message -----
>From: "Steve Russell"
>To: <6...>; <6...>
>Sent: Friday, October 27, 2006 6:11 AM
>Subject: Re: [68HC12] Flash sector erase on S12DP256B
> > Adrian,
> >
> > Some notes on further straws below.
> >
> > My reading of all this is that the units did not fail in the middle of
> > operation, but failed only on power up.
> >
> > Is this correct?
> >
> > Steve Russell
> > Nohau Emulators
> >
> > At 05:42 PM 10/25/2006, Adrian Vos wrote:
> >>Thank you Steve and Edward,
> >>
> >>A bit more detail about how it works. There is indeed code in the unit
> >>that can erase a flash sector. I leave the sectors unprotected in normal
> >>operation (I recall having difficulties with this as once protected, it
> >>cannot be unprotected without a reboot from memory as it is write once
> >>within a certain time after boot).
> >
> > I think you are confusing the HC11 and the HC12 families.
> >
> > The HC11 has the "within the first 64 cycles after reset" write
> > restriction on some registers.
> >
> > The HC12 family has always had only a "write once" restriction, sometimes
> > with the errata "write twice in special modes".
> >
> >>The sector erase code can only be invoked
> >>by an RS232 command which is a lengthy data packet including a password
> >>and a checksum (difficult for noise to reproduce this set of events....
> >>but possible that the code to erase the sector could be accidentally
> >>called by runaway code I guess). The function that does the sector erase
> >>copies code to RAM, does the sector erase from RAM, and then returns
> >>to flash. This function is only called once in my code, in a function
> >>with a loop that erases all non-bootcode flash sectors (in this case only
> >>the first sector was erased). The function that does the full main
> >>firmware erase is only ever called from an RS232 message, and this code
> >>intentionally invokes a COP reset immediately after the full firmware is
> >>erased to cause the unit to reboot into boot code, which can then
> >>reprogram the main firmware flash with new firmware. This prevents the
> >>unit from returning from this function to the now erased flash code that
> >>called the function. There was no RS232 connected at the time this
> >>problem took place (the product is normally tuned with a laptop
> >>connected and then may never be connected again; normal operation is
> >>without the laptop connected).
> >>
> >>Now I may have confused things in the previous email. I have never had a
> >>unit suffer a watchdog timeout in normal operation. If this did happen
> >>the unit would enter bootmode (the engine would stop and require a power
> >>on reset to enter normal mode again), but this condition would be reset
> >>on the next cold bootup. In the COP reset vector I set a variable
> >>differently to power on reset, and this causes the unit to stay in boot
> >>mode. A normal power on reset will not set this variable and the result
> >>will be that the unit falls out of bootmode as normal. The reason I did
> >>this was in case the firmware in the unit was corrupted, the unit would
> >>stay in bootmode, and a PC can be connected to the unit to replace the
> >>firmware, as the bootmode allows RS232 firmware update.... where
> >>corrupted firmware would not allow this. This caters for cases where the
> >>unit loses power part way through a firmware update, or uses corrupted
> >>firmware files, or if the firmware gets corrupted in some other way....
> >>which is what happened in this case.
> >
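The boot decision described here can be summarised in a few lines of C. This is a hedged sketch of the logic as described in the message, not the actual bootloader: the enum and function names are invented for the example, and how "firmware present" is really determined is not stated in the thread.

```c
#include <stdbool.h>

/* Hypothetical names for the two reset paths mentioned in the message. */
typedef enum { RESET_POWER_ON, RESET_COP } reset_cause;
typedef enum { RUN_BOOTLOADER, RUN_MAIN_FIRMWARE } boot_target;

/* A COP (watchdog) reset always keeps the unit in boot mode, so a PC can
   reload firmware. A power-on reset runs the main firmware if the
   bootloader's presence check passes, and stays in boot mode otherwise. */
boot_target choose_boot_target(reset_cause cause, bool firmware_present)
{
    if (cause == RESET_COP) {
        return RUN_BOOTLOADER;
    }
    return firmware_present ? RUN_MAIN_FIRMWARE : RUN_BOOTLOADER;
}
```

With the first sector erased but the presence check still passing, a power-on reset gives `RUN_MAIN_FIRMWARE`, the firmware runs away, the COP fires, and the next pass through this function gives `RUN_BOOTLOADER`, which matches the behaviour seen on the returned units.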
> > I think that this means that the erasing occurred somewhere between power
> > off and the de-assertion of reset on power on.
> >
> >>This unit was
> >>actually exiting bootmode as it should on a power on reset, and entering
> >>the normal firmware. Because the first sector had been erased, it was
> >>jumping to code in this erased area or more likely some function pointers
> >>in this sector that now pointed to 0xFFFF. This caused the code to run
> >>away, and a COP timeout which causes a COP reset that was detected by the
> >>bootcode, and it remained in boot mode. This occurred every time on power
> >>on reset, rendering the engine useless as the unit would not run it.
> >>
> >>I actually have no problem with the way I handle this condition..... the
> >>firmware is corrupted and should not be running the engine. I put this
> >>code in to handle firmware upgrade failures, and it does handle this.
> >>Unfortunately, I never intended for the unit to be able to corrupt its own
> >>firmware once in service. My aim is to prevent this from happening again.
> >>It had not happened until recently, and has now happened to three units
> >>in the last batch of 100 (another unit today since yesterday..... the
> >>firmware in the units has not changed recently), which makes me wonder if
> >>it is hardware related.
> >
> > This suggests looking carefully at the differences between the failing
> > unit and a sample of the previous batch that worked. Did anything at all
> > change?
> >
> > Perhaps the PCB material is slightly different or the board is not as well
> > cleaned. Either might change the oscillator parameters enough to cause
> > trouble.
> >
> > Changes in the power supply and input filtering components that cause less
> > protection than previously are also possible.
> >
> > Perhaps the failing units are somehow installed differently from the first
> > 2 changed batches? Installed in an electrically noisier environment than
> > previous units?
> >
> > Could the failing units have been operated at more extreme temperatures
> > than previous units? Crystal oscillator startup problems are rumored to
> > show at temperature extremes.
> >
> >>I do know that I have run this product without a reset chip before
> >>(when trying to test if the reset delay was the cause of my previous
> >>problem interfacing with other devices caused by slow bootup), and I did
> >>have one instance of the entire flash being erased when doing this (never
> >>a single sector). I have never had this problem in earlier hardware
> >>revisions when running with a reset time of 200ms using the MAX809L reset
> >>chip (with fixed reset delay of 200ms). Unfortunately this 200ms delay
> >>caused other problems, so I replaced the reset chip with the
> >>LP3470-4.63V. The delay was set to 10ms. This should not cause a problem
> >>but who knows?
> >
> > I think that triggering a scope on the de-assertion of reset at the MCU
> > and looking at some suitable clock signal and the reset line while going
> > through bad input power scenarios would be instructive. If you can enable
> > ECLK and look at that it would be ideal.
> >
> > If I did these tests, I would also try freeze spray and heat gun to get a
> > feel for extreme temperature operation. You have to be fairly cautious
> > about changing the temperature of the board too fast. Perhaps soaking in
> > a refrigerator and an oven is a better plan.
> >
> > I guess that I would also try very carefully re-reading the data sheets
> > and relevant application notes for both reset chips, looking for subtle
> > differences and hints about gotchas.
> >
> >>It is a huge coincidence to have only units from the most recent batch
> >>have this problem when there is at least double the number from the most
> >>recent batch in the field running the same firmware. I have to check for
> >>any accidental changes to this batch.
> >
> > Very carefully! The circumstantial evidence for a change is pretty
> > convincing.
> >
> >>I should say that the reset chip change
> >>has been in the last 3 batches also, and it would seem that the problem is
> >>limited to the most recent batch. Is there anything hardware related
> >>other than the incorrect operation of the reset line that could cause
> >>flash to be erased?
> >
> > Noisy power or wildly varying clock period. Both have been seen to cause
> > wildly unpredictable operation of HCS-12 parts.
> >
> > Are the XFC filter components unchanged and well soldered?
> > I've seen some really bad behavior when switching to the on-chip PLL with
> > nothing connected to XFC.
> >
> >>Thanks heaps for your help. I am really clutching at straws to fix a
> >>problem that I cannot reproduce. I have got a unit that has done this
> >>that I have been able to see what happened (erased sector), but that is
> >>the only info I have.
> >>
> >>Cheers,
> >>
> >>Adrian
> >>
--- In 6..., "Adrian Vos" wrote:

All's well, etc.

As an afterthought, why have flash erasure code embedded?
Just make it part of the download.

Cheers,

Theo

> Unfortunately I have only been able to test one of the failed units since
> the others had their firmware replaced in the field to get the cars running
> again. It is probably likely they will do the same thing again?!?!? :(
>
> Anyway, further investigation down the hardware track lead me to find that
> the reset chip on this unit was faulty. The reset time was around 2ms to 4ms
> and not consistently the same time, but changing in this range each time. I
> then noticed it was poorly soldered on by hand. I resoldered it and it was
> still providing the same results. I tried different capacitor values which
> set the reset time, and this made no difference to the reset time. I
> replaced the chip and went to correct capacitor value, and the reset time
> went back to the consistent 10ms it was designed for. I am not sure if a 2ms
> reset time would be short enough for the oscillator to not have started up
> properly, but my research reveals it to be much lower than recommended reset
> time, so definately a possibility.
>
> I have since found out that in the last batch of units, they ran out of
> these reset chips, and returned half the batch without the reset chip
> fitted. Someone at my company then hand soldered the rest of the reset chips
> on when we received more stock. The soldering method seems to have been
> pretty ordinary, as not only is the part misaligned (even though the
> connections were good), but the part has been damaged in some way (static,
> or overheat??) to make it ignore the external capacitor that set the reset
> time. One can only presume it to have had other undesired effects on the
> reset output also (maybe at temperature extremes, the reset time may have
> been further reduced, or even eliminated??)
>
> At this stage, I am going to assume this is the cause of the problem in the
> absence of further evidence to provide any other possibility, and the lack
> of evidence to conclusively find the problem. I am happy because this should
> mean the previous batch field units are fine (and I do not have to alter the
> design... but have to alter quality control), but I now have to try and find
> and fix any other units that may have the same problem. :(
>
> Thanks heaps for your help. It really gave me direction in looking into
> this.
>
> Cheers,
>
> Adrian
> ----- Original Message -----
> From: "Steve Russell"
> To: <6...>; <6...>
> Sent: Friday, October 27, 2006 6:11 AM
> Subject: Re: [68HC12] Flash sector erase on S12DP256B
> > Adrian,
> >
> > Some notes on further straws below.
> >
> > My reading of all this is that the units did not fail in the middle of
> > operation, but failed only on power up.
> >
> > Is this correct?
> >
> > Steve Russell
> > Nohau Emulators
> >
> > At 05:42 PM 10/25/2006, Adrian Vos wrote:
> >>Thankyou Steve and Edward,
> >>
> >>A bit more detail about how it works. There is indeed code in the unit
> >>that
> >>can erase a flash sector. I leave the sectors unprotected in normal
> >>operation (I recall having difficulties with this as once protected, it
> >>cannot be unprotected without a reboot from memory as it is write once
> >>within a certain time after boot).
> >
> > I think you are confusing the HC11 and the HC12 families.
> >
> > The HC11 has the "within the first 64 cycles after reset" write
> > restriction
> > on some registers.
> >
> > The HC12 family has always been "write once" restriction only, sometimes
> > with the errata "write twice in special modes".
> >
> >>The sector erase code can only be invoked
> >>by an RS232 command which is a lengthy data packet including a password
> >>and
> >>a checksum (difficult for noise to reproduce this set of events.... but
> >>possible that the code to erase the vector could be accidentally called by
> >>runaway code I guess). The function that does the sector erase copies code
> >>to ram, and then does the sector erase from ram and then returns to flash.
> >>This function is only called once in my code in a function with a loop
> >>that
> >>erases all not bootcode flash sectors (in this case only the first sector
> >>was erased). The function that does the full main firmware erase is only
> >>ever called from an RS232 message, and this code intentionally invokes a
> >>COP
> >>reset immediately after the full firmware is erased to cause the unit to
> >>reboot into boot code, which can then reprogram the main firmware flash
> >>with
> >>new firmware. This prevents the unit from returning from this function to
> >>the now erased flash code that called the function. There was no RS232
> >>connected at the time this problem took place (The product is normally
> >>tuned
> >>with laptop connected and them may never be connected again.. normal
> >>operation is without the laptop connected).
> >>
> >>Now I may have confused in the previous email. I have never had a unit
> >>suffer a watchdog timeout in normal operation. If this did happen the unit
> >>would enter bootmode (the engine would stop and require a power on reset
> >>to
> >>enter normal mode again), but this condition would be reset on the next
> >>cold
> >>bootup. In the COP reset vector I set a variable differently to power on
> >>reset, and this causes the unit to stay in boot mode. A normal power on
> >>reset will not set this variable and the result will be that the unti
> >>falls
> >>out of bootmode as normal. The reason I did this was in case the firmware
> >>in
> >>the unit was corrupted, the unit would stay in bootmode, and a PC can be
> >>connected to the unit to replace the firmware, as the bootmode allows
> >>RS232
> >>firmware update.... where corrupted firmware would not allow this. This
> >>caters for cases where the unit loses power part way through a firmware
> >>update, or uses corrupted firmware files, or if the firmware gets
> >>corrupted
> >>in some other way.... which is what happened in this case.
> >
> > I think that this means that the erasing occurred somewhere between power
> > off and the de-assertion of reset on power on.
> >
> >>This unit was
> >>actually exiting bootmode as it should on a power on reset, and entering
> >>the
> >>normal firmware. Because the first sector had been erased, it was jumping
> >>to
> >>code in this erased area or more likely some function pointers in this
> >>sector that now pointed to 0xFFFF. This caused the code to runaway, and a
> >>COP timeout which causes a COP reset that was detected by the bootcode,
> >>and
> >>it remained in boot mode. This occured every time on power on reset
> >>rendering the engine useless as the unit would not run it.
> >>
> >>I actually have no problem with the way I handle this condition..... the
> >>firmware is corrupted and should not be running the engine. I put this
> >>code
> >>in to handle firmware upgrade failures, and it does handle this.
> >>Unfortunately, I never intended for the unit to be able to corrupt its own
> >>firmware once in service. My aim is to prevent this from happening again.
> >>It
> >>has not happened before recently, and has happened to three (another unit
> >>today since yesterday..... the firmware in the units has not changed
> >>recently) units in the last batch of 100, which makes me wonder if it is
> >>hardware related.
> >
> > This suggests looking carefully at the differences between the failing
> > unit
> > and a sample of the previous batch that worked. Did anything at all
> > change?
> >
> > Perhaps the PCB material is slightly different or the board is not as well
> > cleaned. Either might change the oscillator parameters enough to cause
> > trouble.
> >
> > Changes in the power supply and input filtering components that cause less
> > protection that previously are also possible
> >
> > Perhaps the failing units are somehow installed differently from the first
> > 2 changed batches? Installed in an electrically noisier environment that
> > previous units?
> >
> > Could the failing units have been operated at more extreme temperatures
> > that previous units? Crystal oscillator startup problems are rumored to
> > show at temperature extremes.
> >
> >>I do know that when I have run this product without a reset chip before
> >>(when trying to test if the reset delay was the cause of my previous
> >>problem
> >>interfacing with other devices cause by slow bootup), and I did have one
> >>instance of the entire flash being erased when doing this (never a single
> >>sector). I have never had this problem in earlier hardware revisions when
> >>running with a reset time of 200ms using the MAX809L reset chip (with
> >>fixed
> >>reset delay of 200ms). Unfortunately this 200ms delay caused other
> >>problems,
> >>so I replaced the reset chip with the L3470-4.63V. The delay was set to
> >>10ms. This should not cause a problem but who knows?
> >
> > I think that triggering a scope on the de-assertion of reset at the MCU
> > and
> > looking at some suitable clock signal and the reset line while going
> > through bad input power scenarios would be instructive. If you can enable
> > ECLK and look at that it would be ideal.
> >
> > If I did these tests, I would also try freeze spray and heat gun to get a
> > feel for extreme temperature operation. You have to be fairly cautions
> > about changing the temperature of the board too fast. Perhaps soaking in
> > a
> > refrigerator and an oven is a better plan.
> >
> > I guess that I would also try very carefully re-reading the data sheets
> > and
> > relevant application notes for both reset chips, looking for subtle
> > differences and hints about gotchas.
> >
> >>It is a huge coincidence to have only units from the most recent batch
> >>have
> >>this problems when there is atleast double the number from the most recent
> >>batch in the field running the same firmware. I have to check for any
> >>accidental changes to this batch.
> >
> > Very carefully! The circumstantial evidence for a change is pretty
> > convincing.
> >
> >>I should say that the reset chip change
> >>has been in the last 3 batches also, and it would seem that the problem is
> >>limitted to the most recent batch. Is there anything hardware related
> >>other
> >>than the incorrect operation of the reset line that could cause flash to
> >>be
> >>erased?
> >
> > Noisy power or wildly varying clock period. Both have been seen to cause
> > wildly unpredictable operation of HCS-12 parts.
> >
> > Are the XFC filter components unchanged and well soldered?
> > I've seen some really bad behavior when switching to the on-chip PLL with
> > nothing connected to XFC.
> >
> >>Thanks heaps for your help. I am really clutching at straws to fix a
> >>problem
> >>that I cannot reproduce. I have got a unit that has done this that I have
> >>been able to see what happened (erased sector), but that is the only info
> >>I
> >>have.
> >>
> >>Cheers,
> >>
> >>Adrian
> >>
> >>----- Original Message -----
> >>From: "Steve Russell"
> >>To: <6...>
> >>Sent: Thursday, October 26, 2006 4:39 AM
> >>Subject: Re: [68HC12] Flash sector erase on S12DP256B
> >>
> >>
> >> > Adrian,
> >> >
> >> > Here are some suggestions about approaches to your problem, but no
> >> > solutions.
> >> >
> >> > Since the "12 V" voltage varies all over the place when cranking, it
> >> > may
> >> > be
> >> > that the problem can be fixed with simple power supply mods, like
> >> > larger
> >> > filter capacitors or over-voltage protection on the "12 V" input, or
> >> > both.
> >> >
> >> > Because there are so many possibilities in a situation like this, and
> >> > the testing is so difficult, I always recommend initializing ALL the
> >> > interrupt vectors to go to a tight loop, preferably one per interrupt.
> >> >
> >> > This makes development in the lab much easier if unexpected interrupts
> >> > occur. It also makes it much easier to rule out an unexpected
> >> > interrupt as a cause of field failures.
> >> >
> >> > The COP watchdog can get you out of the loop with a reset. You might
> >> > want to do something that gives better information about the
> >> > unexpected interrupt in the field, but the program shouldn't do an RTI
> >> > from an unexpected interrupt, because the program can't remove the
> >> > interrupt source, since it didn't expect the interrupt.
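
The "one tight loop per vector" idea quoted above can be sketched in portable C. This is only a host-side illustration under stated assumptions: the vector count, the names last_trap, trap_halt, timer_isr and vector_table, and the NO_TRAP marker are all hypothetical, and a real HCS12 build would place the table at the vector addresses using whatever interrupt syntax the compiler provides.

```c
#include <stdint.h>

#define NUM_VECTORS 8u      /* illustrative only; the real vector table is larger */
#define NO_TRAP     0xFFu   /* hypothetical "nothing recorded" marker */

typedef void (*isr_t)(void);

/* On the target this would sit in RAM that is not cleared by a COP reset,
   so the bootloader can report which unexpected interrupt fired. */
volatile uint8_t last_trap = NO_TRAP;

/* Hypothetical hook. On the target this would be an endless loop so that
   the COP watchdog forces a reset (no RTI). The host build just returns. */
void trap_halt(void)
{
}

/* One stub per vector, so the recorded number identifies the source. */
#define TRAP_STUB(n) void trap_##n(void) { last_trap = (n); trap_halt(); }
TRAP_STUB(0)
TRAP_STUB(1)
TRAP_STUB(2)
/* Vector 3 has a real handler below, so it gets no trap stub. */
TRAP_STUB(4)
TRAP_STUB(5)
TRAP_STUB(6)
TRAP_STUB(7)

/* Example of an interrupt the application does expect. */
void timer_isr(void)
{
}

/* Every slot is filled: either a real handler or its own trap stub. */
const isr_t vector_table[NUM_VECTORS] = {
    trap_0, trap_1, trap_2, timer_isr, trap_4, trap_5, trap_6, trap_7
};
```

Because each unused vector has its own stub, the recorded number tells you which source fired, which is the extra field information mentioned above.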
> >> >
> >> > You could also put in some defensive code just before the bootloader's
> >> > flash erasing that checks that the bootloader was properly entered and
> >> > has a good checksum. This should give you some assurance that the
> >> > bootloader was started properly and is not corrupted.
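
A minimal sketch of such a guard, assuming a simple additive checksum and an entry key that only the legitimate bootloader entry path writes. ENTRY_KEY, checksum16 and erase_permitted are hypothetical names for illustration, not anything from Adrian's firmware, and a CRC would be a stronger choice than the sum used here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical value written only by the legitimate bootloader entry path. */
#define ENTRY_KEY 0xA55Au

/* Simple 16-bit additive checksum over a block of bytes. */
uint16_t checksum16(const uint8_t *data, size_t len)
{
    uint16_t sum = 0;
    size_t i;

    for (i = 0; i < len; i++) {
        sum = (uint16_t)(sum + data[i]);
    }
    return sum;
}

/* Returns 1 only when the entry key is present and the bootloader image
   still matches its stored checksum; otherwise 0 and no erase should run. */
int erase_permitted(uint16_t entry_key,
                    const uint8_t *boot_image, size_t boot_len,
                    uint16_t stored_sum)
{
    if (entry_key != ENTRY_KEY) {
        return 0;   /* erase routine reached without a proper entry */
    }
    if (checksum16(boot_image, boot_len) != stored_sum) {
        return 0;   /* bootloader itself looks corrupted */
    }
    return 1;
}
```

The erase routine would call erase_permitted() as its first step and refuse to touch the flash command registers unless it returns 1, so runaway code that lands in the middle of the erase path is less likely to complete an erase.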
> >> >
> >> > Since you seem to have plenty of space, you could also use the "two
> >> > copies" bootloading method. You burn the new copy into an unused flash
> >> > buffer, leaving the current copy undisturbed in flash. Check the
> >> > checksum of the new copy. If it checks, the bootloader makes it the
> >> > operational program; if not, it uses the old version and forgets about
> >> > the bad copy.
> >> >
> >> > This way, you always have at least one undisturbed copy of the
> >> > operational code available no matter what accidents occur in
> >> > downloading a new copy. You still may have some exposure to disasters
> >> > in the process of making a new good copy the operational version, but
> >> > it's much less, and could possibly be eliminated completely.
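
The selection step of the "two copies" method might look roughly like the sketch below. It assumes each copy carries a stored 16-bit additive checksum; image_slot_t, image_sum, slot_valid and select_image are hypothetical names, and how the chosen copy is actually made operational (copying, remapping, or a jump) is left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one firmware copy held in flash. */
typedef struct {
    const uint8_t *data;   /* start of the image */
    size_t len;            /* image length in bytes */
    uint16_t stored_sum;   /* checksum saved when the image was written */
} image_slot_t;

/* 16-bit additive checksum; a CRC would detect more error patterns. */
uint16_t image_sum(const uint8_t *data, size_t len)
{
    uint16_t sum = 0;
    size_t i;

    for (i = 0; i < len; i++) {
        sum = (uint16_t)(sum + data[i]);
    }
    return sum;
}

/* A slot is usable only if it is non-empty and its checksum matches. */
int slot_valid(const image_slot_t *slot)
{
    return slot->data != NULL
        && slot->len > 0
        && image_sum(slot->data, slot->len) == slot->stored_sum;
}

/* Returns 1 to run the newly downloaded copy, 0 to keep the current copy,
   and -1 if neither checks out, in which case the unit stays in boot mode. */
int select_image(const image_slot_t *current, const image_slot_t *candidate)
{
    if (slot_valid(candidate)) {
        return 1;
    }
    if (slot_valid(current)) {
        return 0;
    }
    return -1;
}
```

The current copy is never erased until the candidate has passed its check, which is what keeps one good copy available if a download is interrupted.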
> >> >
> >> > Hope this helps.
> >> >
> >> > Steve Russell
> >> > Nohau Emulators
> >> >
I have been out of the office and am just catching up on this
fascinating thread. In summary, are we saying that a short reset can
cause a flash erase, or did it cause Adrian's flash erase code to
execute unintentionally due to startup problems?

Also, for those of us with tight security requirements, downloading
and executing the flash erase code will take some serious thought. I
don't like the thought that a "hacker" could download and run code to
read my flash contents. But if I trust the encryption of the
downloaded application, why shouldn't I trust the same encryption of
the downloaded flash erase & program routines? Thanks for the
thought-provoking dialog.

-Dan

--- In 6..., "theobee00" wrote:
>
> --- In 6..., "Adrian Vos" wrote:
>
> All's well, etc.
>
> As an afterthought, why have flash erasure code embedded?
> Just make it part of the download.
>
> Cheers,
>
> Theo
>
> > Unfortunately I have only been able to test one of the failed units,
> > since the others had their firmware replaced in the field to get the
> > cars running again. It is likely they will do the same thing
> > again?!?!? :(
> >
> > Anyway, further investigation down the hardware track led me to find
> > that the reset chip on this unit was faulty. The reset time was around
> > 2ms to 4ms and not consistent, changing within this range each time. I
> > then noticed it was poorly soldered on by hand. I resoldered it and it
> > was still providing the same results. I tried different values for the
> > capacitor which sets the reset time, and this made no difference to the
> > reset time. I replaced the chip and went back to the correct capacitor
> > value, and the reset time went back to the consistent 10ms it was
> > designed for. I am not sure if a 2ms reset time would be short enough
> > for the oscillator to not have started up properly, but my research
> > reveals it to be much lower than the recommended reset time, so
> > definitely a possibility.
> >
> > I have since found out that in the last batch of units, they ran out of
> > these reset chips, and returned half the batch without the reset chip
> > fitted. Someone at my company then hand soldered the rest of the reset
> > chips on when we received more stock. The soldering method seems to
> > have been pretty ordinary, as not only is the part misaligned (even
> > though the connections were good), but the part has been damaged in
> > some way (static, or overheat??) to make it ignore the external
> > capacitor that sets the reset time. One can only presume it to have had
> > other undesired effects on the reset output also (maybe at temperature
> > extremes, the reset time may have been further reduced, or even
> > eliminated??)
> >
> > At this stage, I am going to assume this is the cause of the problem,
> > given the absence of evidence for any other possibility and the lack of
> > evidence to conclusively find the problem. I am happy because this
> > should mean the previous batch field units are fine (and I do not have
> > to alter the design... but have to alter quality control), but I now
> > have to try and find and fix any other units that may have the same
> > problem. :(
> >
> > Thanks heaps for your help. It really gave me direction in looking into
> > this.
> >
> > Cheers,
> >
> > Adrian