"Morten M Jørgensen" <neax@fake.mail.com> wrote in message
news:4656ee6d$0$199$edfadb0f@dread11.news.tele.dk...
> I verified that the application does not accidentally jump to the undefined
> part of the program memory and just run until it returns to 0x0000 (Reset
> Vector). Furthermore, all volatile memory is cleared when the problem
> occurs, which tells me that my problem is a reset for sure!
Nope.
> *
> I have made a "Reset Source checking" test on the first five bits of the
> MCUSR register, which describes the cause of the last reset. I'm breaking
> the program execution at 0x0000 so I can check this register first thing
> after a reset. But when my problem occurs the register value is always
> 0x00. When resetting the device in other ways the register indicates the
> right reason, e.g. JTAG reset or External Reset.
>
> A possibility could be a stack overflow caused by the recursion - but that
> should not generate a uC reset?
Nope, but it can cause a return to 0000.
Try to imagine this:
You run a recursive routine that, at some point, tests something and keeps
calling itself until some result variable yields 0. Suppose your stack,
which also contains return addresses, grows into the data area at the very
moment your routine decides the test result is 0 and stores this value...
on the stack. You have just overwritten your return address with 0, the
routine ends, and a return is executed to... 0. Bingo! It appears as if your
processor resets, but none of the Reset Source bits is set. Try to figure
out the stack needs of your routine, implement a recursion iteration counter
as a global variable, and test/check/display this variable at reset, before
the RAM gets zeroed. This should give you pretty solid evidence of if and
when your stack overflows.
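A minimal sketch of such a counter, written as plain C so it can be tried on a PC first. The names g_depth, g_max_depth and resolve() are made up for illustration and are not from the original application. On the real target the two globals would have to be inspected with a breakpoint at 0x0000 before the startup code clears RAM, or be placed somewhere the startup code leaves alone.

```c
#include <stdint.h>

/* Hypothetical diagnostic globals (names are assumptions). */
volatile uint8_t g_depth;      /* current recursion depth */
volatile uint8_t g_max_depth;  /* deepest level reached so far */

/* Stand-in for the recursive anticollision routine: it recurses once
 * per remaining level and tracks how deep it went. */
void resolve(uint8_t levels_left)
{
    g_depth++;
    if (g_depth > g_max_depth) {
        g_max_depth = g_depth;   /* record the high-water mark */
    }
    if (levels_left > 0) {
        resolve((uint8_t)(levels_left - 1));
    }
    g_depth--;
}
```

After a crash, a g_max_depth far above what was seen in normal operation would point towards the stack, while a modest value would suggest looking elsewhere.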
You should ALWAYS limit the number of iterations of a recursive process.
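One way this limit might look, as a sketch only: the routine takes its depth as a parameter and refuses to go deeper than a fixed budget. MAX_DEPTH, walk() and the "collisions_left" model are assumptions for illustration; the real budget would have to be derived from the stack size and the per-call stack usage.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_DEPTH 8u   /* assumed budget, not a value from the application */

/* Hypothetical tree walk: each collision splits the search into two
 * branches. Returns false instead of recursing past MAX_DEPTH. */
bool walk(uint8_t depth, uint8_t collisions_left)
{
    if (depth >= MAX_DEPTH) {
        return false;            /* refuse to go deeper; caller must cope */
    }
    if (collisions_left == 0) {
        return true;             /* branch resolved */
    }
    return walk((uint8_t)(depth + 1), (uint8_t)(collisions_left - 1)) &&
           walk((uint8_t)(depth + 1), (uint8_t)(collisions_left - 1));
}
```

With this shape, too many tags produce a reported failure that the caller can handle (for example by retrying later) rather than a silent stack overflow.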
Meindert
Reply by John B ● May 25, 2007
On 25/05/2007 Morten M Jørgensen wrote:
> Hello All,
>
> I'm having a hard time struggling with a reset of my application. The
> ATmega644 is used for an RFID application and handles communication +
> protocol and all RFID operations including an anticollision scheme.
> The anticollision is implemented via recursive calls.
>
> The application is written in CodeVisionAVR C compiler vers. 1.25.2.
>
> I'm debugging in AVRstudio vers. 4.12 build 460 using JTAG mkII ICE.
>
> The problem is an uncontrolled reset of the ATmega644. The reset
> always occurs when I load the anticollision to a maximum by making
> the RFID reader handle way too many tags in the read area.
>
> Now the mysterious part is:
> *
> I have removed all Watchdog enabling/resetting to eliminate the
> possibility of a watchdog reset. I have verified that the watchdog
> registers are not written during execution.
>
> *
> I have disabled the "Brown-out enable" fuse.
>
> *
> I verified that the application does not accidentally jump to the
> undefined part of the program memory and just run until it returns
> to 0x0000 (Reset Vector). Furthermore, all volatile memory is cleared
> when the problem occurs, which tells me that my problem is a reset
> for sure!
>
> *
> I have made a "Reset Source checking" test on the first five bits
> of the MCUSR register, which describes the cause of the last reset.
> I'm breaking the program execution at 0x0000 so I can check this
> register first thing after a reset. But when my problem occurs the
> register value is always 0x00. When resetting the device in other
> ways the register indicates the right reason, e.g. JTAG reset or
> External Reset.
>
> A possibility could be a stack overflow caused by the recursion -
> but that should not generate a uC reset?
>
> I have tried getting closer to the bug using EEPROM debug variables
> and so on, but the anticollision algorithm is very timing-critical,
> so this approach just corrupted the application.
>
> Any ideas on how I can get closer to solving this annoying bug? Or
> does anyone have an idea why I see this problem?
>
> Best Regards
The source of a reset can be found in the MCUSR register. Read it on
startup and then reset it. If you are not getting a 'real' reset but
just a jump to the reset vector, the register will remain cleared.
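As a rough sketch of that check: the helper below takes a copy of MCUSR read at the very start of the program and names the cause. The bit positions are the ones the ATmega644 datasheet gives for MCUSR; the enum, the macro names and reset_cause() itself are made up for illustration. On the target, the caller would read MCUSR into a variable and then write 0 to MCUSR so the next reset starts with clean flags.

```c
#include <stdint.h>

/* MCUSR flag positions (ATmega644 datasheet); macro names are local. */
#define BIT_PORF   0   /* power-on reset  */
#define BIT_EXTRF  1   /* external reset  */
#define BIT_BORF   2   /* brown-out reset */
#define BIT_WDRF   3   /* watchdog reset  */
#define BIT_JTRF   4   /* JTAG reset      */

enum reset_cause_t {
    RESET_NONE,        /* no flag set: likely a plain jump to 0x0000 */
    RESET_POWER_ON,
    RESET_EXTERNAL,
    RESET_BROWN_OUT,
    RESET_WATCHDOG,
    RESET_JTAG
};

/* Hypothetical helper. If several flags are set, it reports the first
 * match in the (arbitrary) order tested below. */
enum reset_cause_t reset_cause(uint8_t mcusr)
{
    if (mcusr & (1u << BIT_JTRF))  return RESET_JTAG;
    if (mcusr & (1u << BIT_WDRF))  return RESET_WATCHDOG;
    if (mcusr & (1u << BIT_BORF))  return RESET_BROWN_OUT;
    if (mcusr & (1u << BIT_EXTRF)) return RESET_EXTERNAL;
    if (mcusr & (1u << BIT_PORF))  return RESET_POWER_ON;
    return RESET_NONE;
}
```

A result of RESET_NONE matches the 0x00 reading described in the original post.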
--
John B
Reply by Robert Adsett ● May 25, 2007
On May 25, 10:11 am, "Morten M Jørgensen" <n...@fake.mail.com> wrote:
> I verified that the application does not accidentally jump to the undefined
> part of the program memory and just run until it returns to 0x0000 (Reset
> Vector). Furthermore, all volatile memory is cleared when the problem
> occurs, which tells me that my problem is a reset for sure!
Nope, that just means that you have probably run a major portion of
your startup again. If you started from the 2nd, 3rd, or 4th
instruction of your startup, would the observable results be any
different from a full reset? Startup usually contains code for other
basic setup (such as stack position) before memory clearing. Since
much of that is already done, you wouldn't notice if it had been
skipped.
> A possibility could be a stack overflow caused by the recursion - but that
> should not generate a uC reset?
No, but it could still jump to your start location. All that has to
happen is that the return address on the stack gets overwritten with
the address of the start vector, and then on return you do something
similar to a reset, missing only the HW side effects.
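To make the mechanism concrete, here is a toy model in plain C. It is not AVR code: RAM is a 16-byte array, return addresses are a single made-up byte each (a real ATmega644 pushes two bytes per call), and RESULT_ADDR stands for some global variable in the data area. All names are assumptions for illustration.

```c
#include <stdint.h>

#define RAM_SIZE    16u
#define RESULT_ADDR 4u     /* toy "global variable" in the data area */

static uint8_t ram[RAM_SIZE];
static uint8_t sp;         /* toy stack pointer, grows downward */

static void push(uint8_t v) { ram[sp] = v; sp--; }
static uint8_t pop(void)    { sp++; return ram[sp]; }

/* Nest `depth` calls, each pushing a one-byte return address, then let
 * the innermost call store 0 in the result variable and return.
 * Returns the address that return would jump to (0xFF on bad input). */
uint8_t innermost_return_target(uint8_t depth)
{
    if (depth == 0 || depth > RAM_SIZE) {
        return 0xFFu;
    }
    sp = RAM_SIZE - 1u;
    for (uint8_t i = 0; i < depth; i++) {
        push((uint8_t)(0x40u + i));   /* made-up nonzero return addresses */
    }
    ram[RESULT_ADDR] = 0;             /* "result is 0" lands in the data area */
    return pop();
}
```

At a depth of 3 the innermost return goes back to its caller as expected. At a depth of 12 the stack has grown down onto RESULT_ADDR, the store of 0 overwrites the newest return address, and the return goes to address 0 with no hardware reset involved.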
> Any ideas on how I can get closer to solving this annoying bug? Or does
> anyone have an idea why I see this problem?
Limit the number of tags you'll process. Sneak up on the number that
starts causing a problem. It may be easier to diagnose with a minimal
case. Do NOT ignore odd behaviour at quantities below that required to
cause the failure; it may be an early sign of the root cause, and since
you may still have a partially operating system it might be easier
to diagnose. And take a good look at whatever memory usage you have
on a per-tag basis. If you are using dynamic memory allocation,
particularly something from the *alloc family, there is a good chance
the heap and stack are colliding, and if you are then you probably
should switch to something more robust.
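One common way to see how close the stack gets to the rest of RAM is to "paint" the stack region with a known pattern at startup and later count how much of the pattern survived. The sketch below shows the idea on an ordinary buffer; the function names and the 0xAA pattern are arbitrary choices, and wiring it to the real stack region depends on the compiler's memory layout, which is not shown here.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAINT 0xAAu   /* arbitrary fill pattern */

/* Fill the whole stack region with the pattern before the application runs. */
void stack_paint(uint8_t *region, size_t size)
{
    memset(region, PAINT, size);
}

/* Count untouched bytes from the low end of the region, which is the end
 * a downward-growing stack reaches last. */
size_t stack_unused(const uint8_t *region, size_t size)
{
    size_t n = 0;
    while (n < size && region[n] == PAINT) {
        n++;
    }
    return n;
}
```

Running with a slowly increasing number of tags and watching the unused count shrink towards zero would show the trend before the failure actually happens.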
Robert
Reply by TT_Man ● May 25, 2007
> Now the mysterious part is:
> *
> I have removed all Watchdog enabling/resetting to eliminate the
> possibility of a watchdog reset. I have verified that the watchdog
> registers are not written during execution.
>
> *
> I have disabled the "Brown-out enable" fuse.
>
> *
> I verified that the application does not accidentally jump to the undefined
> part of the program memory and just run until it returns to 0x0000 (Reset
> Vector). Furthermore, all volatile memory is cleared when the problem
> occurs, which tells me that my problem is a reset for sure!
>
> *
> I have made a "Reset Source checking" test on the first five bits of
> the MCUSR register, which describes the cause of the last reset. I'm
> breaking the program execution at 0x0000 so I can check this register
> first thing after a reset. But when my problem occurs the register value
> is always 0x00. When resetting the device in other ways the register
> indicates the right reason, e.g. JTAG reset or External Reset.
>
> A possibility could be a stack overflow caused by the recursion - but
> that should not generate a uC reset?
>
> I have tried getting closer to the bug using EEPROM debug variables and
> so on, but the anticollision algorithm is very timing-critical, so this
> approach just corrupted the application.
>
> Any ideas on how I can get closer to solving this annoying bug? Or does
> anyone have an idea why I see this problem?
>
> Best Regards
>
A stack overflow will do the strangest things...
Reply by ● May 25, 2007
Hello All,
I'm having a hard time struggling with a reset of my application. The
ATmega644 is used for an RFID application and handles communication +
protocol and all RFID operations including an anticollision scheme. The
anticollision is implemented via recursive calls.
The application is written in CodeVisionAVR C compiler vers. 1.25.2.
I'm debugging in AVRstudio vers. 4.12 build 460 using JTAG mkII ICE.
The problem is an uncontrolled reset of the ATmega644. The reset always
occurs when I load the anticollision to a maximum by making the RFID
reader handle way too many tags in the read area.
Now the mysterious part is:
*
I have removed all Watchdog enabling/resetting to eliminate the possibility
of a watchdog reset. I have verified that the watchdog registers are not
written during execution.
*
I have disabled the "Brown-out enable" fuse.
*
I verified that the application does not accidentally jump to the undefined
part of the program memory and just run until it returns to 0x0000 (Reset
Vector). Furthermore, all volatile memory is cleared when the problem
occurs, which tells me that my problem is a reset for sure!
*
I have made a "Reset Source checking" test on the first five bits of the
MCUSR register, which describes the cause of the last reset. I'm breaking
the program execution at 0x0000 so I can check this register first thing
after a reset. But when my problem occurs the register value is always
0x00. When resetting the device in other ways the register indicates the
right reason, e.g. JTAG reset or External Reset.
A possibility could be a stack overflow caused by the recursion - but that
should not generate a uC reset?
I have tried getting closer to the bug using EEPROM debug variables and so
on, but the anticollision algorithm is very timing-critical, so this
approach just corrupted the application.
Any ideas on how I can get closer to solving this annoying bug? Or does
anyone have an idea why I see this problem?
Best Regards
--
Morten M. J.
Ba.Sci.EE
(this is also posted on avrfreaks.net)