how to debug reset caused by WDTCTL security key violation

Started by Xiaohui Liu October 28, 2012
Hi everyone,

I'm working on a sensor project that uses a TelosB, based on the msp430f1611
and running TinyOS. My program resets some time after boot-up. After the PUC,
IFG1 is found with the WDTIFG bit set, indicating that the watchdog timer
initiated the reset. This can happen in two cases:
1) Watchdog timer expiration (in watchdog mode only).
But the watchdog timer is never started, so this cannot happen.
2) Watchdog timer security key violation.
There is no place where my program explicitly writes WDTCTL (i.e., address 0x0120).
So there must be a memory access bug in my code that illegally writes
WDTCTL and causes the security key violation.
Is there any debug tool to help locate where this happens? Or any
suggestion on how I should proceed to locate the bug? My program is
thousands of lines long, so a manual check is non-trivial.
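
The reasoning above can be sketched in C. This is a minimal, host-testable sketch rather than TinyOS code: `classify_reset`, `reset_cause_t` and the `wdt_was_held` flag are hypothetical names, and the caller is assumed to pass in a copy of IFG1 read right after the PUC together with its own knowledge of whether the watchdog had been stopped.

```c
#include <stdbool.h>
#include <stdint.h>

/* WDTIFG bit position in IFG1, as in the MSP430x1xx headers. The flag is set
 * by a watchdog expiry and also by a WDTCTL security key violation. */
#define WDTIFG_BIT 0x01u

typedef enum {
    RESET_OTHER,              /* WDTIFG clear: some other reset source     */
    RESET_WDT_TIMEOUT_OR_KEY, /* WDT was counting: the two look identical  */
    RESET_WDT_KEY_VIOLATION   /* WDT was held: only a bad key explains it  */
} reset_cause_t;

/* ifg1_snapshot: IFG1 value captured immediately after the PUC.
 * wdt_was_held:  application knowledge (hypothetical input) that the
 *                watchdog counter had been stopped before the reset. */
reset_cause_t classify_reset(uint8_t ifg1_snapshot, bool wdt_was_held)
{
    if ((ifg1_snapshot & WDTIFG_BIT) == 0u) {
        return RESET_OTHER;
    }
    return wdt_was_held ? RESET_WDT_KEY_VIOLATION
                        : RESET_WDT_TIMEOUT_OR_KEY;
}
```

On the target, the snapshot would have to be taken before anything clears WDTIFG.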

Please weigh in if you have any suggestion. Thank you very much in advance.

More detailed information about the bug can be found here.

-Xiaohui Liu



You may not have enabled the watchdog, but most versions of the startup code enable the watchdog timer by default. Even if the first line of your code disables it, execution might not reach your code before the reset. This can happen if you have a lot of variables that have to be cleared and initialized, or data to be copied.

In this case you need to modify the startup code (cstart.asm) so that it does not enable the WD timer, and recompile.


Write an ISR handler for the WDT interrupt and trap it there before the PUC
can occur, then examine your stack to see where the code was executing
prior to the violation. Then take it from there.

You cannot do this after the PUC as the C start up sequences usually
screw with the evidence.

Al

Hi,

The watchdog is enabled during initialization.

Any suggestions on how to locate this bug? I've been wrestling with it for the past few weeks and still haven't found it. I'd really appreciate it if you can help.

There is a good chance that all the initialization is taking too long. In that case you have to keep the watchdog disabled in the startup code.

Hi,

But no interrupt will be requested if the watchdog timer is in watchdog mode, which is the setting in my case.
According to the datasheet, "When the WDT is configured to operate in watchdog mode, either writing to WDTCTL with an incorrect password, or expiration of the selected time interval triggers a PUC."

Is there any way to take a snapshot of the execution status right before the PUC removes all the clues, so we can trace back?

Thanks.
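
One possible way to keep some clues across the PUC is a small breadcrumb log in RAM. This is only a sketch under stated assumptions: RAM contents are taken to survive a PUC, the toolchain is assumed to offer a `.noinit` section that the startup code leaves untouched, and every name here (`crumb_log_t`, `crumb_drop`, `CRUMB_MAGIC`, and so on) is hypothetical. On a host build the `NOINIT` macro expands to nothing, so the logic can be exercised off-target.

```c
#include <stdint.h>

#if defined(__MSP430__)
/* Assumption: the linker script provides a .noinit section that the
 * startup code neither clears nor initialises. */
#define NOINIT __attribute__((section(".noinit")))
#else
#define NOINIT
#endif

#define CRUMB_SLOTS 8u      /* must be a power of two */
#define CRUMB_MAGIC 0xC0DEu /* marks the log as initialised */

typedef struct {
    uint16_t magic;             /* CRUMB_MAGIC once crumb_reset() has run */
    uint16_t next;              /* total number of crumbs dropped so far  */
    uint16_t slot[CRUMB_SLOTS]; /* ring buffer of checkpoint ids          */
} crumb_log_t;

NOINIT crumb_log_t g_crumbs;

/* Call once after a cold start, i.e. when crumb_valid() returns 0. */
void crumb_reset(void)
{
    g_crumbs.magic = CRUMB_MAGIC;
    g_crumbs.next = 0u;
    for (uint16_t i = 0u; i < CRUMB_SLOTS; i++) {
        g_crumbs.slot[i] = 0u;
    }
}

/* Non-zero if the log holds data written by a previous run. */
int crumb_valid(void)
{
    return g_crumbs.magic == CRUMB_MAGIC;
}

/* Record a checkpoint id, e.g. before each suspect pointer or array write. */
void crumb_drop(uint16_t id)
{
    g_crumbs.slot[g_crumbs.next & (CRUMB_SLOTS - 1u)] = id;
    g_crumbs.next++;
}

/* Most recently recorded checkpoint id. */
uint16_t crumb_last(void)
{
    return g_crumbs.slot[(g_crumbs.next - 1u) & (CRUMB_SLOTS - 1u)];
}
```

After a reset with WDTIFG set, reading `crumb_last()` before calling `crumb_reset()` would show the last checkpoint passed before the PUC.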

Sorry about the confusion, I mean the watchdog is *disabled* during
initialization, not *enabled*. So case (1) of watchdog timer expiration
cannot happen.


On Sun, 11 Nov 2012 17:05:56 -0500, Xiaohui Liu wrote:

>Sorry about the confusion, I mean the watchdog is *disabled* during
>initialization, not *enabled*. So case (1) of watchdog timer expiration
>cannot happen.

You earlier mentioned that you logically believe there must
be a security key violation causing the reset. I followed
your logic and, assuming all of the code was yours, I'd agree
with the conclusion. The fact is, though, that you have other
software in your system and you haven't discussed whether or
not you've gone through all of that code as well to ensure
that your statements about not turning on the WDT are valid.
It's possible it is happening in code you didn't write, given
that you include such code in your application.

Either way, why haven't you halted upon discovery of a
security key violation PUC, stopped immediately, and examined
RAM for the old stack information? It should still be there
(though of course upon reset you won't necessarily have the
stack register to examine, you can still go look at where you
know the old stack data to reside.) You can set the stack
information so that it is possible to determine the extent of
the stack at the time of WDT PUC and if you know enough about
the activation frames used, I suspect you can work out where
things were at by dumping out that data and examining it
manually.
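
The stack examination described above could look roughly like the following. It is a host-testable sketch, not TinyOS code: the fill pattern, the function names and the code address range passed in by the caller are all assumptions, and on the target the scan would have to run before the startup code or the new stack overwrites the old contents.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_PAINT 0xA5A5u /* hypothetical fill pattern (odd on purpose) */

/* Fill the stack region with a known pattern; call early, before use. */
void paint_stack(uint16_t *stack, size_t n_words)
{
    for (size_t i = 0; i < n_words; i++) {
        stack[i] = STACK_PAINT;
    }
}

/* stack[0] is the lowest address. The MSP430 stack grows downward, so the
 * untouched pattern stays at the low end. Returns the index of the first
 * overwritten word, or n_words if the pattern is fully intact. */
size_t deepest_used_word(const uint16_t *stack, size_t n_words)
{
    size_t i = 0;
    while (i < n_words && stack[i] == STACK_PAINT) {
        i++;
    }
    return i;
}

/* Collect words in the used part of the stack that are even and fall inside
 * [code_lo, code_hi]: these are candidate return addresses, listed from the
 * most recently pushed towards the oldest. Returns the number stored. */
size_t find_return_candidates(const uint16_t *stack, size_t n_words,
                              uint16_t code_lo, uint16_t code_hi,
                              uint16_t *out, size_t out_cap)
{
    size_t found = 0;
    for (size_t i = deepest_used_word(stack, n_words);
         i < n_words && found < out_cap; i++) {
        uint16_t w = stack[i];
        if (w >= code_lo && w <= code_hi && (w & 1u) == 0u) {
            out[found++] = w;
        }
    }
    return found;
}
```

The candidates still have to be matched against the map file by hand, since ordinary data words can also happen to look like code addresses.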

Is there a reason you haven't tried this? Do you have access
to the source code for the parts you didn't write, also, so
you can check for direct WDT references? How many places in
your code do you use indirection where it may result in this
problem? Have you "instrumented" that code so you compare
with the WDT address of interest before attempting an access?
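
Instrumenting an indirect write could be sketched as follows. The WDTCTL address 0x0120 comes from the original post; the helper names, the `site` identifier and the `g_bad_site` variable are hypothetical, and the idea is simply to route suspect pointer stores through a check instead of writing directly.

```c
#include <stddef.h>
#include <stdint.h>

#define WDTCTL_ADDR 0x0120u /* WDTCTL address, from the original post */
#define WDTCTL_SIZE 2u      /* 16-bit register */

/* Identifier of the last call site that tried to write WDTCTL (0 = none). */
volatile uint16_t g_bad_site;

/* Non-zero if the byte range [addr, addr + len) overlaps WDTCTL. */
int touches_wdtctl(uintptr_t addr, size_t len)
{
    return addr < WDTCTL_ADDR + WDTCTL_SIZE && addr + len > WDTCTL_ADDR;
}

/* Store through a suspect pointer. Returns 1 if the store was performed,
 * 0 if it was blocked because it would have hit WDTCTL. */
int guarded_store16(volatile uint16_t *dst, uint16_t value, uint16_t site)
{
    if (touches_wdtctl((uintptr_t)dst, sizeof *dst)) {
        g_bad_site = site; /* a debugger breakpoint here catches the culprit */
        return 0;
    }
    *dst = value;
    return 1;
}
```

Giving each instrumented write its own `site` number makes it possible to tell afterwards which access went astray.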

Jon

I should add that I don't remember if I ever saw a table
showing the values of each register upon POR and upon PUC,
separately. They do discuss these values for peripheral
registers. But the CPU registers? I forget if I ever saw
them. In particular, I'm not sure of R1's value on PUC due to
WDT security violation. I suspect R4 through R15 are
unchanged, though. I've no idea about R1. In the case of a
power or brown out though I think the registers are either
random and/or unchanged depending upon exact circumstances.
Still again, not sure about R1 on PUC or POR. But I think you
don't have to have it, though it may be nice.

Can anyone reference a TI document on this exact topic? (I
did check a few Family Guides already.)

Jon

Thanks for your kind reply. My comments are inline.

On Sun, Nov 11, 2012 at 5:28 PM, Jon Kirwan wrote:

> On Sun, 11 Nov 2012 17:05:56 -0500, Xiaohui Liu wrote:
>
> >Sorry about the confusion, I mean the watchdog is *disabled* during
> >initialization, not *enabled*. So case (1) of watchdog timer expiration
> >cannot happen.
>
> You earlier mentioned that you logically believe there must
> be a security key violation causing the reset. I followed
> your logic and, assuming all of the code was yours, I'd agree
> with the conclusion. The fact is, though, that you have other
> software in your system and you haven't discussed whether or
> not you've gone through all of that code as well to ensure
> that your statements about not turning on the WDT are valid.
> It's possible it is happening in code you didn't write, given
> that you include such code in your application.
>
> Either way, why haven't you halted upon discovery of a
> security key violation PUC, stopped immediately, and examined
> RAM for the old stack information?
>
How can I achieve this, namely, "halted upon discovery of a security key
violation PUC"?

> It should still be there
> (though of course upon reset you won't necessarily have the
> stack register to examine, you can still go look at where you
> know the old stack data to reside.) You can set the stack
> information so that it is possible to determine the extent of
> the stack at the time of WDT PUC and if you know enough about
> the activation frames used, I suspect you can work out where
> things were at by dumping out that data and examining it
> manually.
>
> Is there a reason you haven't tried this? Do you have access
> to the source code for the parts you didn't write, also, so
> you can check for direct WDT references? How many places in
> your code do you use indirection where it may result in this
> problem? Have you "instrumented" that code so you compare
> with the WDT address of interest before attempting an access?
>
I haven't tried it for the reason mentioned above.
Yes, I have access to every line of the code.
The problem is that somewhere the WDT address of interest, i.e., the WDTCTL
register, is written unintentionally because, e.g., an array index is out of
bounds. And I'm having difficulty locating where this access occurs.
