EmbeddedRelated.com
Forums

Lock up through power cycling.

Started by Kipton Moravec February 20, 2007
On Wed, 2007-02-21 at 09:57 -0600, Dan Muzzey wrote:
>
> The large caps on the power line won't do much for high frequency noise.
> If you are looking to get rid of high frequency stuff throw some small
> caps in there.

I will try that. My large capacitors were more for shock recovery with AA
batteries, in case they got banged and broke contact for an instant.

> Pay careful attention to trace lengths here. Trace
> inductance can effectively remove the caps from the circuit if you are
> not very careful.

My power trace lengths cannot physically get much shorter. They are all
less than 100 mils and most are less than 50 mils. The via from the
ground or power plane to the pin is about 10 mils from the pin. The
capacitors are directly below on the other side of the board, with
the 0.1uF closest, then the 1uF and 10uF.

> Putting a pi filter on the line can help as well.
> From your description I would be surprised if it was a power problem.
> Now, granted, I've been surprised before but I'd focus my efforts on
> that 32KHz crystal and the flash write routines.

The 32K crystal is only used for the ACLK, not for MCLK (instructions).
But I will check it.

I will remove the flash routines as they are only used for initializing
the information memory.

> TI has an app note on
> checking the stability of 32K crystals where you mess with the
> equivalent series resistance. Might want to dig into their website and
> take a look there.

MSP430 32-kHz Crystal Oscillators
SLAA322.PDF
I am reading it now.

> It's relatively new and would have saved us months of
> headaches. Just to eliminate a possibility, is it possible to power the
> circuit through a known good source to eliminate the issue completely?
> Maybe skip the switcher and feed 12V into a 3.3 regulator?

Let me look at that. I think I put a zero Ohm resistor between the
output of the switcher and the power plane.

--
Kipton Moravec


I am betting on a stack issue... why not fill the unused space with a known value like DEADBEEF ;) and have the RAM dumped at periodic intervals to see if RAM usage is growing?

Also make sure the thermal pad on the 1611 is soldered to the board. I had a long-term stability issue on a board where it was not soldered.
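
A minimal sketch of the fill-and-scan idea, assuming the stack region is treated as an array of 16-bit words and that the stack grows downward (as it does on the MSP430). The names stack_paint, stack_words_unused and the 0xBEEF fill word are made up for illustration; on a real target the region bounds would come from the linker file.

```c
#include <stddef.h>
#include <stdint.h>

#define FILL_WORD 0xBEEFu  /* hypothetical 16-bit fill value */

/* Paint the whole region with the fill value. Run once at start-up,
 * before the stack is in real use. */
void stack_paint(uint16_t *base, size_t words)
{
    for (size_t i = 0; i < words; i++) {
        base[i] = FILL_WORD;
    }
}

/* Count untouched words starting at the low end of the region. With a
 * downward-growing stack the lowest addresses are overwritten last, so
 * this is the remaining headroom. */
size_t stack_words_unused(const uint16_t *base, size_t words)
{
    size_t n = 0;
    while (n < words && base[n] == FILL_WORD) {
        n++;
    }
    return n;
}
```

Logging the result every so often shows whether the headroom shrinks over the hours before a failure. A value that stays flat points away from stack creep.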

Kipton Moravec wrote:
I got to thinking that I was going to do the SVS but had not gotten
around to it, and it may help me if it was indeed a power problem. So I
hooked up the partner board to the one I am having problems with (since
I am still waiting the 30 hours or so for it to fail) and added the SVS
code.

I am measuring 3.32V for power.

So in my initialization I added the line

mov.b #0xA8,&SVSCTL ; Set the threshold at 3.05V and POR

I set a break point on that line, and ran to it.

I single stepped (using IAR Compiler) while watching the SVSCTL register
in the debugger, and the SVSCTL register went to 0xFF, and the IAR
Debugger locked up on me. I had to get out of the IAR IDE to recover.
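
For reference, a small sketch that splits an SVSCTL value into its fields. The bit positions (VLD in the upper nibble, then PORON, SVSON, SVSOP, SVSFG) are as I recall them from the x1xx device headers and should be checked against the family user's guide. The struct and function names are invented for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Field masks as recalled from the x1xx headers; verify before relying on them. */
#define SVS_MASK_FG     0x01u  /* SVSFG: low-voltage flag       */
#define SVS_MASK_OP     0x02u  /* SVSOP: comparator output      */
#define SVS_MASK_ON     0x04u  /* SVSON: SVS is on (status bit) */
#define SVS_MASK_PORON  0x08u  /* PORON: SVS may trigger a POR  */

struct svsctl_fields {
    uint8_t vld;   /* threshold selector, 0..15 */
    bool poron;
    bool svson;
    bool svsop;
    bool svsfg;
};

/* Split a raw register value into its fields. */
struct svsctl_fields svsctl_decode(uint8_t reg)
{
    struct svsctl_fields f;
    f.vld   = (uint8_t)(reg >> 4);
    f.poron = (reg & SVS_MASK_PORON) != 0;
    f.svson = (reg & SVS_MASK_ON) != 0;
    f.svsop = (reg & SVS_MASK_OP) != 0;
    f.svsfg = (reg & SVS_MASK_FG) != 0;
    return f;
}
```

Decoding 0xA8 gives VLD = 10 with PORON set, which matches the intent of the line above. Decoding 0xFF gives VLD = 15 with every flag set. If I read the user's guide correctly, VLD = 15 selects the external SVSIN comparison rather than an internal threshold, so a readback of 0xFF is not the setting that was written.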

Now I cannot talk to the board at all. I have rebooted the PC (multiple
times). I was using the USB JTAG debugger, and switched to the parallel
port debugger. Same thing.

The message I get is:
Emulator
No device found or device disconnected.
Please reconnect the device and press Retry to reconnect
Or press Cancel to abort.

I commented out the line and grabbed another board and was able to talk
to it with the debugger. So it is not the debugger.

Back to the dead board: the power looks O.K., but neither the
32 KHz nor the 3.56 MHz crystal is oscillating. I use the 32 KHz crystal for the ACLK
and the 3.56 MHz crystal for the SMCLK, and the internal osc for the MCLK.
I think the '1611 died at that instant.

Any ideas for recovery or just solder a new chip on?

===============
As for the flash write, I only do it once to initialize the information
memory. I will comment it out after the first run, and tell the compiler
not to erase the information memory, then I will not have any code that
writes to flash in the program.

On Tue, 2007-02-20 at 16:21 +0000, Joerg Schulze-Clewing wrote:
> Hello Kip,
>
> Could there be any section in your code that might initiate a spurious
> flash write? I am a hardware guy so my first look would be at the
> supply voltage. Is the SVS/BOR configured properly (chapter 6 of
> family spec)? If so, do you have a digital scope that features nifty
> trigger qualifiers so you can set it to go off and log when the supply
> voltage falls outside a window? Tektronix TDS series or something like
> that.
>
> I don't know what else is on your circuit board but could there be
> anything that can generate spikes beyond what the bypass capacitors
> can muffle?
>
> Regards, Joerg
>
> http://www.analogconsultants.com/
--
Kipton Moravec
________________________________

From: m... [mailto:m...] On Behalf
Of Kipton Moravec
Sent: Wednesday, February 21, 2007 12:50 PM
To: m...
Subject: RE: [msp430] Lock up through power cycling.

My power trace lengths cannot physically get much shorter. They are all
less than 100 mils and most are less than 50 mils. The via from the
ground or power plane to the pin is about 10 mils from the pin. The
capacitors are directly below on the other side of the board, with
the 0.1uF closest, then the 1uF and 10uF.

Keep in mind with bypass capacitors that you add to the lead length of
the capacitor whenever you have a stub leading to the capacitor from the
ground or Vrail. Picture a T arrangement with the capacitor at the
bottom of the T: the leg of the T becomes part of the lead, and the high
frequency energy can bypass the cap. A V arrangement, where the
entire voltage trace dips down to the capacitor pad, helps to
lower the inductance. Now, all that being said, I would not anticipate
this having any noticeable effect on your device. It's something that
I've started designing in because it's theoretically true and makes
sense. It may help the device be more tolerant of high powered RF
energy and it may help get rid of really fast, stubborn transients.
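
To put rough numbers on the stub effect, here is a small sketch that treats the capacitor plus its lead or stub as an ideal C in series with a lumped L and returns the magnitude of the net reactance. ESR is ignored, and the inductance values in the note below are illustrative assumptions, not measurements from this board.

```c
#define PI 3.14159265358979323846

/* Magnitude of the net reactance, in ohms, of an ideal capacitor in
 * series with a lumped inductance. ESR is ignored. */
double bypass_reactance_ohms(double freq_hz, double cap_f, double ind_h)
{
    double w = 2.0 * PI * freq_hz;            /* angular frequency  */
    double x = w * ind_h - 1.0 / (w * cap_f); /* X_L minus X_C      */
    return x < 0.0 ? -x : x;
}
```

At 100 MHz a 0.1uF part with an assumed 1 nH of series inductance comes out at roughly 0.61 ohms, and with an assumed 5 nH at roughly 3.1 ohms. The capacitor's own reactance is only about 0.016 ohms at that frequency, so the inductance, not the capacitance, sets how well the cap bypasses.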

Good luck
>I am betting on a stack issue... why not fill the unused space with a known
>value like DEADBEEF ;) and have the RAM dumped at periodic intervals to see
>if RAM usage is growing

I agree. Because he seems to have a problem after a certain period of time
has passed, I also think it is something in his code. I would recommend in
a periodic ISR to read the value of the stack pointer and write it to a
memory location. Then, at the 29th hour, stop the processor and read the
value of the stack pointer variable. Something seems to be building up and
finally causes a problem at the 30th hour.
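
A minimal sketch of the stack pointer logging, assuming the periodic ISR can obtain the current stack pointer value however the toolchain exposes it (that part is not shown and is an assumption). The function names are hypothetical. Because the stack grows downward, the lowest value seen is the deepest usage.

```c
#include <stdint.h>

/* Lowest stack pointer value seen so far (deepest stack usage). */
static uintptr_t lowest_sp = UINTPTR_MAX;

/* Call from the periodic ISR with the current stack pointer value. */
void sp_watch_sample(uintptr_t sp)
{
    if (sp < lowest_sp) {
        lowest_sp = sp;
    }
}

/* Read back the deepest point reached, e.g. from the debugger at hour 29. */
uintptr_t sp_watch_lowest(void)
{
    return lowest_sp;
}
```

Keeping only the minimum costs one compare per tick and one word of RAM, and the variable can be inspected when the processor is halted.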

I suppose it could also be the heap. If you do any alloc/dealloc's, I would
look at that to see if somehow it is filling memory.

BTW, I like your DEADBEEF fill, I will have to try that some time. I used
to use just DEAD.

Lou

I stay away from alloc/dealloc in embedded systems. I never use them.

The stack already seems to be pre-loaded with 0xCD. I have watched it,
and so far the initialization routines use more of the stack than the
main program, which runs after the first 60 seconds once we get GPS lock.
It does not seem to be creeping.

I have removed the two subroutines that initialize information memory at
startup, and I now load the code with the option that erases main memory
only. There are no FLASH write routines in the code at all.

If it had anything to do with RAM (stack, allocations,
etc.), cycling power should fix it. It does not.

--
Kipton Moravec
I am trying to summarize possible causes of an MSP430 lockup. I
crudely classify the severity of the lockup into (a) appears dead,
(b) needs re-programming, (c) needs cycling the power or a reset.

I will start by stating what I know. I hope you guys will correct my
mistakes and add more possible causes to the following.

(a) Appears dead -- The device cannot even be re-programmed.

Aside from the obvious possible causes (such as a steamroller ran
over it), there is an interesting one.

This possible cause applies to devices with SVS. If your code sets
the VLD field of the SVSCTL register to a voltage that is higher than
your programming tool can supply, then your tool probably cannot
erase it and re-program it. For example, you can set VLD to 3.7V and
most tools can only supply up to 3.6V.
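
The condition described here can be written down as a check. The sketch below assumes the nominal VLD thresholds as I recall them from the x1xx family user's guide (please verify against the datasheet; real parts also have tolerance and hysteresis that are not modelled) and assumes the SVS only holds the part in reset when PORON is set. The function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Nominal thresholds in millivolts, indexed by VLD. Values recalled from
 * the x1xx user's guide and to be verified. 0 means no internal threshold
 * (SVS off for VLD = 0, external SVSIN compare for VLD = 15). */
static const uint16_t vld_mv[16] = {
    0, 1900, 2100, 2200, 2300, 2400, 2500, 2650,
    2800, 2900, 3050, 3200, 3350, 3500, 3700, 0
};

/* True if this SVSCTL setting would keep the part in reset at the given
 * supply voltage, i.e. the tool could not gain control after power-up. */
bool svs_holds_reset(uint8_t svsctl, uint16_t supply_mv)
{
    uint8_t vld = (uint8_t)(svsctl >> 4);
    bool poron = (svsctl & 0x08u) != 0;
    return poron && vld_mv[vld] != 0 && supply_mv < vld_mv[vld];
}
```

With these assumed values, 0xA8 (3.05V, PORON) is harmless on a 3.32V supply, while 0xE8 (3.7V, PORON) holds the part in reset even at a tool's 3.6V maximum. A check like this on the constant, before the write to SVSCTL goes into the code, is cheap insurance.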

I have two solutions to unlock such a situation. One way is to fix
the software of the tool. It should hold the nRST pin of the device
low before powering up the device. After power-up, it should set up
JTAG/BSL to control the device before it releases nRST. Most existing
programming tool software powers up the device first without
holding the nRST pin low and tries to gain JTAG/BSL control of the
device afterwards. By that time, if the existing code in the device
has already set VLD above the supply voltage, the SVS holds the device
in reset and the tool cannot gain control of the device.

The other solution is to modify the hardware of the tool so that it
can supply a voltage higher than the current VLD setting.

Are there other possible causes for an un-damaged device to resist
erase and re-programming?

(b) Needs re-programming -- Cycling the power cannot un-lock it.

For this kind of lockup, the only cause that I know of is corrupted
Flash.

In the thread "Lock up through power cycling" many different possible
causes are pointed out. But in my opinion, either the Flash is
corrupted, or the power cycle was not really carried out.

Anyone disagree with this?

(c) Needs cycling the power or a reset

Power glitches, brownout, stack/heap overflow, indexing out of bounds,
etc. may cause a lockup of this kind. Cycling the power can always
unlock it.

Anyone disagree with this?

A couple more, see below... Some of these are not actually micro problems but appear as micro problems.

________________________________

From: m... [mailto:m...] On Behalf Of old_cow_yellow
Sent: Thursday, February 22, 2007 3:16 PM
To: m...
Subject: [msp430] Re: Lock up through power cycling.

(a) Appears dead -- The device cannot even be re-programmed.

Are there other possible causes for an un-damaged device to resist
erase and re-programming?

ESD

Blown Fuse

Power Supply Problems

Bad wire on the programmer (somewhat silly, but it's happened)

Bad Clock - Reprogramming seems to fix the issue, although this is deceiving.

Computer needs to be rebooted.

Bad watchdog chip holding the device in reset.

(b) Needs re-programming -- Cycling the power cannot un-lock it.

For this kind of lockup, the only cause that I know of is corrupted
Flash.

In the thread "Lock up through power cycling" many different possible
causes are pointed out. But in my opinion, either the Flash is
corrupted, or the power cycle was not really carried out.

Anyone disagree with this?

Bad default values stored in flash that put the unit in a strange undefined state where it appears to be completely locked up but is in reality running an infinite delay loop (or something like that)
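
One way to guard against this case is to sanity-check stored values before using them. The sketch below is only an illustration: the struct layout, the magic value and the allowed range are hypothetical, and it assumes a copy of the stored block has already been read from flash.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical settings block kept in information memory. */
struct app_config {
    uint16_t magic;        /* marks a block that was written on purpose */
    uint16_t delay_ticks;  /* example setting with a limited valid range */
};

#define CONFIG_MAGIC     0x5AA5u
#define DELAY_TICKS_MIN  1u
#define DELAY_TICKS_MAX  1000u

static const struct app_config config_defaults = { CONFIG_MAGIC, 100u };

/* Reject erased, corrupted or out-of-range contents. */
bool config_is_sane(const struct app_config *c)
{
    return c->magic == CONFIG_MAGIC
        && c->delay_ticks >= DELAY_TICKS_MIN
        && c->delay_ticks <= DELAY_TICKS_MAX;
}

/* Use the stored block only if it passes the checks, else the defaults. */
struct app_config config_load(const struct app_config *stored)
{
    return config_is_sane(stored) ? *stored : config_defaults;
}
```

Erased flash reads back as all ones, so an erased block fails the magic check, and a delay of 0xFFFF ticks fails the range check instead of turning into a delay loop that looks like a lockup.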

(c) Needs cycling the power or a reset - A good external WDT should fix.

Power glitches, brownout, stack/heap overflow, indexing out of bounds,
etc. may cause a lockup of this kind. Cycling the power can always
unlock it.

ESD

Code Problems

Interrupt Problems

Bad Clock - 32K crystal needs a kick to get started.

Anyone disagree with this?
Hi,
Did you consider some abnormal electromagnetic activity? It seems that you get a burst of interrupts that hangs the CPU.
Another thing is that a flash writing routine still resides in the ROM part, and it may accidentally be triggered to write to flash.
Ezra
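
If a flash write routine has to stay in the image, a software interlock makes an accidental jump into it less likely to do damage. This is a sketch of one common pattern, not something from the code discussed in this thread: the names and token values are made up, and the real write routine would call flash_guard_take() first and return without touching the flash controller if it fails.

```c
#include <stdbool.h>
#include <stdint.h>

/* Two arming tokens with arbitrary values. Both must be set by the
 * legitimate caller before the write routine will proceed. */
#define ARM_STEP1 0xA5C3u
#define ARM_STEP2 0x3C5Au

static volatile uint16_t arm1;
static volatile uint16_t arm2;

/* Called only from the code path that really intends to write flash. */
void flash_guard_arm(void)
{
    arm1 = ARM_STEP1;
    arm2 = ARM_STEP2;
}

/* Called at the top of the write routine. Returns true once per arming,
 * and always disarms so a stray second entry is refused. */
bool flash_guard_take(void)
{
    bool ok = (arm1 == ARM_STEP1) && (arm2 == ARM_STEP2);
    arm1 = 0;
    arm2 = 0;
    return ok;
}
```

A runaway program counter that lands inside the write routine finds the tokens cleared and bails out. This lowers the odds of a spurious write; it does not make one impossible.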
Old Cow Yellow
>(b) Needs re-programming -- Cycling the power cannot un-lock it.
>
>For this kind of lockup, the only cause that I know of is corrupted
>Flash.

>In the thread "Lock up through power cycling" many different possible
>causes are pointed out. But in my opinion, either the Flash is
>corrupted, or re-cycling the power is not really carried out.

>Anyone disagree with this?

I agree, except there is another odd case if you are running code from RAM
and you are not loading the RAM correctly via the code at power up. It is
possible for the RAM to be set correctly when the code is downloaded from a
FET. Everything works fine until you have to restart the unit or it crashes.

This appears to have the symptoms above.

I know because I have done it.

Spencer
Hi all,

I am using wireless sensor node hardware with an MSP430F1611 and a
CC2420. I am having a problem similar to the one described in the
"Lock up through power cycling" discussion.

My hardware locks up nondeterministically, and a power off /
power on cycle does not make it work again.

I would like to ask if Kipton has found the problem. (I could not find
the root cause or the answer in the mailing list)

Thanks,

Tolga Coplu

www.genetlab.com
