UART TX FIFO and INTs problem

Started by forum_microbit February 17, 2004
Thanks Bill,

I'll certainly keep that one in mind, because I'll be porting a mini_OS,
loosely based on Oleg Skidan's SOS, which I ported to CrossWorks for MSP430,
and sped up the preemption for manual/auto event handling and for the
timers.

Usual story I guess: when a kid's let loose with much more power (here ARM,
and the kid is me :-), that power too easily turns against you .. :-)

I am truly grateful, Bill, for sharing your experiences.
I interfaced to your UART package as a "sanity check", and found I had
the same problems as with my own code. That was a step in the right direction...

Brgds
Kris
www.microbit.com.au

----- Original Message -----
From: "Bill Knight" <>
To: <>
Sent: Monday, February 23, 2004 3:13 AM
Subject: Re: [lpc2000] UART TX FIFO and INTs problem - SOLVED

> Version 2 of the UART, Blinky LED code is now posted. I went ahead and
> protected the code which re-enables the uart interrupt in addition to
> when it is disabled. While the demo code does not have a preemptive
> OS, if it did, it would be possible (not probable) for another process
> to sneak in during the execution of the RMW sequence and modify the
> mask. The protection closes that possibility.
>
> As a side note: the disabling and restoring of the global interrupt
> flag in the cpsr can have a similar problem. It has to do with the
> processor actually having multiple psr's and preemptive processes
> doing the same thing to it while the main line code thinks it is
> either enabling or disabling the current IRQ bit. However, without
> preemptive processes or ISR's which muck with the interrupt flag bits,
> I don't believe there is a problem. For more info on this, Atmel has
> an APP note describing it on their web site in the AT91 ARM7 section.
>
> Regards
> -Bill Knight
> R O SoftWare

Hi Bill,

I wouldn't want to attempt to work around the problem with the
default handler. Rather, I would want it to raise a very specific
prompt, so I know there is an error lurking somewhere.
Maybe I should rephrase:

"Should my default handler muck with anything?"
I presume the answer would be "no" (i.e., not write to VicVectAddr etc.)
Presently it just toggles an LED, and then returns from the ISR.

Cheers,
Kris

> I don't think using the default handler to mask the problem is
> the answer. You might however, use it to inform the user &/or
> yourself of an internal error so it can be corrected in the
> next release of your code.
>
> -Bill
>
> On Mon, 23 Feb 2004 03:04:05 +1100, microbit wrote:
>
> Hi Bill,
>
> Thanks for the prompt response.
> This has actually been my main problem.
> It's not my "style" when I develop to just change my SW, and go
> "oh, yeah, well it works like that but not like that, so what".
> If I write code a particular way, then I expect it to work a particular
> way, and if it doesn't - I want to know why!
>
> Anywho, I'm glad I persisted with this.
> I had indeed noticed that I didn't have these weird problems either when I
> used global enable/disable on IRQ. Bah, what a pain.
>
> I'll use your workaround, it sounds plausible.
> I can't believe that this problem didn't show up when just printing away
> with printf(), but ONLY when running the BASIC interpreter, and with a
> specific program running!
>
> Do you think I should maintain a Default_IRQ handler (I don't plan to use
> non-vectored INTs), in case this happens again?
>
> I'm naturally concerned that if I write to VicVectAddr at the end of the
> Default_IRQ handler, I'll miss actual INTs, as opposed to spurious INTs.
>
> Your summary must be dead-on, because I found that different optimisations
> would exhibit the problem or not.
>
> Thanks a lot!
> Kris
>
> Bill wrote:
>
> > Kris
> > Thanks for spotting this. It's the old spurious interrupt problem.
> > The fix is to disable global interrupts around the first
> > read-modify-write instruction. Doing the direct write (U0IER = xxx) can
> > still allow the problem to happen. What happens is the interrupt occurs
> > and is recognized while the masking instruction is executing but before
> > it has completed. Then when the instruction does complete, the interrupt
> > can't find the vector so uses the default. So to fix:
> >
> > Disable global interrupts
> > mask interrupt
> > Enable global interrupts
> >
> > do your stuff
> > unmask interrupt
> >
> > NOTE: I don't think this last modification of the mask register needs
> > protection. Someone correct me if I am wrong.
> >
> > Regards
> > -Bill Knight
> > R O SoftWare
> >
> > On Mon, 23 Feb 2004 02:09:38 +1100, microbit wrote:
> >
> > Hi all,
> >
> > This is a hairy one - ARM gurus, please read this !!!
> > (Bill, this will cause a problem in your UART package too, since you
> > set VicDefVectAddr to the Reset Vector, and you also use a RMW
> > operation on UxIER)
> >
> > I figured out why I was getting "resets" while UART0 is TXing
> > interrupt driven.
> > The problem only occurs when I disable the THE interrupt by masking:
> >
> > U0IER &= ~0x02; /* Disable THE */
> > ......
> > U0IER |= 0x02;
> >
> > When I disable and then re-enable THE interrupts with a direct write:
> >
> > U0IER = 0x01; /* Disable THE, leave RDA enabled */
> > .....
> > U0IER = 0x03; /* Reenable THE */
> >
> > I didn't get these "resets".
> >
> > This is the problem:
> > -----------------------
> >
> > I didn't have VicDefVectAddr specified, so it was set to its default
> > 0x00000000 address. (All IRQs are disabled, except for VIC Channel # 6)
> > While U0IER &= ~0x02 is executing, there MUST be a few cycles where the
> > THE is disabled, BUT the interrupt asserts at the right moment, hence
> > causing a non-vectored assertion.... (which of course vectors to
> > 0x00000000, and ultimately results in a RESET)
> > Is this a known problem on the VIC?
> > Is this normal on the VIC?
> >
> > I put in a "dummy" function default_IRQ_handler() so I can break on it,
> > and when it hits that function, VicRawIntr reads 0x3008.
> > If I merely indicate on an LED when that function is hit, but then return
> > from interrupt, the execution happily picks up after the U0IER &= ~0x02;
> > statement, and things run as good as gold.
> > NO TX interrupts were missed by the looks of it; since the non-vectored
> > dummy function doesn't service or alter anything, the pending interrupt
> > must be serviced when THE is re-enabled.
> >
> > Are there any ARM gurus here that can shed some light on this?
> > I've been just about driven to tears over this problem.
> > I can't just use a direct write (I have an interrupt on the other UART
> > in RX, and that operates on a CRC function, the RF transmit function
> > needs to operate on the same CRC function, so I need masking of the
> > Interrupt Enables, rather than directly setting/clearing all INT sources
> > on UxIER)
> >
> > As far as I'm concerned this classifies as a silicon BUG, since UxIER
> > bits cannot assert themselves, and it is indeed a R/W register.
> >
> > BTW: How do I work out with VicRawIntr (0x3008 in my case) which
> > Slot/Int is asserted?
> > I can't find a bitmap on it.
> >
> > Best regards,
> > Kris
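
On the VicRawIntr question that closes the post above: in the VIC status registers each bit position corresponds to one interrupt channel number, so a value can be decoded by listing its set bits. A minimal sketch follows; the function names are made up for illustration, and which peripheral sits on which channel has to be looked up in the user manual for the specific part.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Stores the indices of the set bits of a 32-bit VIC status word in
   channels[] (room for 32 entries) and returns how many bits were set. */
static int vic_active_channels(uint32_t raw, int channels[32])
{
    int count = 0;

    assert(channels != NULL);
    for (int bit = 0; bit < 32; bit++) {
        if (raw & (UINT32_C(1) << bit)) {
            channels[count++] = bit;
        }
    }
    return count;
}

/* Prints the raw value followed by the channel numbers that are asserted. */
static void vic_print_channels(uint32_t raw)
{
    int channels[32];
    int n = vic_active_channels(raw, channels);

    printf("0x%08lX:", (unsigned long)raw);
    for (int i = 0; i < n; i++) {
        printf("%s channel %d", i ? "," : "", channels[i]);
    }
    printf("\n");
}
```

For 0x3008 this yields channels 3, 12 and 13. Channel 6, the only one the post says is enabled, is not among them.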




On Sun, 22 Feb 2004 11:18:03 -0500, J.C. Wren wrote:

Bill Knight wrote:

> Kris
> Thanks for spotting this. It's the old spurious interrupt problem.
> The fix is to disable global interrupts around the first read-modify-write
> instruction. Doing the direct write (U0IER = xxx) can still allow the
> problem
> to happen. What happens is the interrupt occurs and is recognized while
> the masking instruction is executing but before it has completed. Then
> when the instruction does complete, the interrupt can't find the vector
> so uses the default. So to fix:

[snip]

Why should the direct write cause this problem? The issue with
read-modify-write makes perfect sense, but the write should be an atomic
operation. How would you get an interrupt between <nothing> and the write?

--jc

==========================================================================
This problem shows up on the Coldfire list every so often. It MAY not be
a problem on the ARM. The idea is that the interrupt is recognized by the
hardware and scheduled to execute at the end of the instruction that is
modifying the register. When the instruction has completed, the interrupt
is no longer there. As I write this I realize it is less likely a problem
considering the mask being changed is in the UART while the vector being
processed is in the VIC (in hardware, after the signal coming from the
UART). The problem on the ColdFire was the mask bit was effectively in
its "VIC". So I'm guessing that simply writing the register would also
fix the problem.

-Bill
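
The protected sequence described earlier in the thread (disable global interrupts, change the mask, restore) could be sketched roughly as below. This is a host-side model, not Bill's posted code: U0IER is an ordinary variable standing in for the memory-mapped register, and irq_disable()/irq_restore() are hypothetical helpers that on a real ARM7 would set and restore the I bit in the CPSR. UIER_THRE is an assumed name for bit 1 of UxIER, the bit the thread calls "THE".

```c
#include <assert.h>
#include <stdint.h>

/* Host-side stand-ins (assumptions, see above). */
static volatile uint32_t U0IER = 0x03;  /* RDA and THRE interrupts enabled */
static volatile int irq_masked = 0;     /* models the CPSR I bit */

#define UIER_THRE 0x02u                 /* TX holding register empty enable */

/* Masks IRQs and returns the previous state so that calls can nest. */
static int irq_disable(void)
{
    int was_masked = irq_masked;
    irq_masked = 1;
    return was_masked;
}

static void irq_restore(int was_masked)
{
    irq_masked = was_masked;
}

/* Clear the THRE enable bit with IRQs held off during the read-modify-write. */
static void uart0_tx_int_disable(void)
{
    int saved = irq_disable();
    U0IER &= ~UIER_THRE;
    irq_restore(saved);
}

/* Set it again; "Version 2" in the thread protects this step as well. */
static void uart0_tx_int_enable(void)
{
    int saved = irq_disable();
    U0IER |= UIER_THRE;
    irq_restore(saved);
}

/* Usage: bracket the code that shares data with the TX ISR. */
static void tx_critical_section_demo(void)
{
    uart0_tx_int_disable();
    assert((U0IER & UIER_THRE) == 0);   /* THRE masked, RDA left alone */
    /* ... touch the TX buffer here ... */
    uart0_tx_int_enable();
    assert(U0IER == 0x03);
}
```

Saving and restoring the previous state, rather than unconditionally re-enabling, keeps the helpers safe to call from code that already runs with interrupts off.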



I haven't had any problems at all with the direct write.
The weirdest things were happening though with the masking.
For example, comm_tx[tx_head++]=ch; would cause problems
too (protected but only with U0IER=0x01 ),
whereas :
temp = (tx_head+1)%RS232_SIZE;
...
comm_tx[tx_head]=(UCHAR)c;
tx_head=temp;

wouldn't .....
Go figure....
I need to retest all these permutations.

-- Kris

----- Original Message -----
From: "J.C. Wren" <>
To: <>
Sent: Monday, February 23, 2004 3:18 AM
Subject: Re: [lpc2000] UART TX FIFO and INTs problem - SOLVED

> Bill Knight wrote:
>
> > Kris
> > Thanks for spotting this. It's the old spurious interrupt problem.
> > The fix is to disable global interrupts around the first
> > read-modify-write instruction. Doing the direct write (U0IER = xxx) can
> > still allow the problem to happen. What happens is the interrupt occurs
> > and is recognized while the masking instruction is executing but before
> > it has completed. Then when the instruction does complete, the interrupt
> > can't find the vector so uses the default. So to fix:
> >
> [snip]
>
> Why should the direct write cause this problem? The issue with
> read-modify-write makes perfect sense, but the write should be an atomic
> operation. How would you get an interrupt between <nothing> and the
> write?
>
> --jc




> ==========================================================================
> This problem shows up on the Coldfire list every so often. It MAY not be
> a problem on the ARM. The idea is that the interrupt is recognized by the
> hardware and scheduled to execute at the end of the instruction that is
> modifying the register. When the instruction has completed, the interrupt
> is no longer there. As I write this I realize it is less likely a problem
> considering the mask being changed is in the UART while the vector being
> processed is in the VIC (in hardware, after the signal coming from the
> UART). The problem on the ColdFire was the mask bit was effectively in
> its "VIC". So I'm guessing that simply writing the register would also
> fix the problem.
>
> -Bill

Hi Bill et al,

In light of the previous revelations, I tried what I described before.
Does anyone know what is going on here ?

Outputting a character into circular buffer works fine like this :
=======================================
UINT temp;

/* Handle TX buffer wrap */
temp = (tx_head+1)%RS232_SIZE;

/* Disable TX interrupts */
U0IER = 0x01;

/* Circ buffer not empty ? */
if (comm_tx_running)
{
    comm_tx[tx_head] = (UCHAR)ch;
    tx_head = temp;
}
else
{
    comm_tx_running = 1;
    U0THR = ch;
}

/* Reenable TX interrupts */
U0IER = 0x03;
return (ch);

But doing a write to the TX buffer with post increment also causes Default
IRQs :
====================================================

/* Disable TX interrupts */
U0IER = 0x01;

/* Circ buffer not empty ? */
if (comm_tx_running)
{
    comm_tx[tx_head++] = (UCHAR)ch;
    tx_head %= RS232_SIZE;
}
else
{
    comm_tx_running = 1;
    U0THR = ch;
}

/* Reenable TX interrupts */
U0IER = 0x03;
return (ch);

-------------------

Could this have to do with optimisation ?
I honestly have to admit I'm stumped with that one.

-- Kris



> In light of the previous revelations, I tried what I described before.
> Does anyone know what is going on here ?
>
> Outputting a character into circular buffer works fine like this :
> =======================================
> UINT temp;
>
> /* Handle TX buffer wrap */
> temp = (tx_head+1)%RS232_SIZE;
>
> /* Disable TX interrupts */
> U0IER = 0x01;
>
> /* Circ buffer not empty ? */
> if (comm_tx_running)
> {
> comm_tx[tx_head] = (UCHAR)ch;
> tx_head = temp;
> }
> else
> {
> comm_tx_running = 1;
> U0THR = ch;
> }
>
> /* Reenable TX interrupts */
> U0IER = 0x03;
> return (ch);
>
> But doing a write to the TX buffer with post increment also causes Default
> IRQs :
> ====================================================
>
> /* Disable TX interrupts */
> U0IER = 0x01;
>
> /* Circ buffer not empty ? */
> if (comm_tx_running)
> {
> comm_tx[tx_head++] = (UCHAR)ch;
> tx_head %= RS232_SIZE;
> }
> else
> {
> comm_tx_running = 1;
> U0THR = ch;
> }
>
> /* Reenable TX interrupts */
> U0IER = 0x03;
> return (ch);
>
> -------------------
>
> Could this have to do with optimisation ?
> I have to honestly volunteer I'm stumped with that one.
>
> -- Kris

I'd second this. I see nothing in here to stop a compiler from
moving stuff around to its heart's content. You might want
to look at adding some barrier() statements to the above code
to prevent that kind of code rearrangement.

Cheers,
David




Hi David,

> I'd second this. I see nothing in here to stop a compiler from
> moving stuff around to its heart's content. You might want
> to look at adding some barrier() statements to the above code
> to prevent that kind of code rearrangement.

I'm not too familiar with GNU (I presume barrier() is GNU).
What's a barrier() statement?
Forgive me if I come across as dumb or lazy.
I _do_ want to learn, and I don't expect others to do my homework for me.
I have noticed an "inline" statement in Bill's code; I don't even know what
that's all about.

I used to be a die-hard IAR fan, and in hindsight I find that using IAR
hasn't made me as good a programmer as I ought to be by now. It seems some
things are done "for you", giving some sort of "false sense of security".
Using other tools is suddenly like a cold shower.

-- Kris



> > I'd second this. I see nothing in here to stop a compiler from
> > moving stuff around to its heart's content. You might want
> > to look at adding some barrier() statements to the above code
> > to prevent that kind of code rearrangement.
>
> I'm not too familiar with GNU (I presume barrier() is GNU).
> What's a barrier() statement?
> Forgive me if I come across as dumb or lazy.
> I _do_ want to learn, and I don't expect others to do my homework for me.
> I have noticed an "inline" statement in Bill's code; I don't even know what
> that's all about.

barrier() is just a family of constructs which, like volatile, tell the
compiler what it's allowed to move around and what it's not. Consider the
following pseudo code:

write_to_reg(location_A, value_1);
delay();
write_to_reg(location_A, value_2);

There is no guarantee that those statements will be performed in the order
you might guess. The compiler may even optimize some of them away
entirely. If write_to_reg is a macro, the compiler may choose to remove
the first invocation as the second will just overwrite it. It may even
remove the delay if it can figure out that it's not affecting any
variables.

> I used to be a die-hard IAR fan, and in hindsight I find that using IAR
> hasn't made me
> as good a programmer as I ought to be by now, seems some things are done
> "for you",
> giving some sort of "false sense of security".
> Using other tools is like a cold shower suddenly.

That's one of the reasons that GNU folk tend to be a little condescending
to users of other tools. Using the GNU tools, you're faced daily with
the understanding that all systems are not the same. You learn, with time,
to adapt to that and stop making certain assumptions. Like you say, you
end up doing more things by yourself, but you also learn that they're
being done and why. I think it may make one a more attentive programmer.

There are two kinds of barriers that you're going to see, because there are
two places where code execution may get rearranged or put out of order:
1) the compiler and 2) the chip. I don't think there are any out-of-order
implementations of the ARM arch yet (there may be, but the ARM7TDMI
sure isn't one of them). So, that leaves the compiler. You may, from time
to time, need to communicate to the compiler that it is not allowed to move
certain statements around. The 'barrier()' call says that nothing may be
moved from above to below this call or vice versa. You may need to bracket
a statement with these if you really want everything above to happen first
and everything following to happen after. So, our example may become:

write_to_reg(location_A, value_1);
barrier();
delay();
barrier();
write_to_reg(location_A, value_2);
barrier();

The barrier after the first write_to_reg() is to make sure that the write
took place before the delay. The barrier after the delay is to ensure
that the second write_to_reg() doesn't get swapped with the delay. The
final call to barrier is to ensure that the second write takes place
*now*.
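
For the compiler-side case, one widely used GCC/Clang definition of barrier() is an empty inline asm statement with a "memory" clobber. Whether this is the exact definition meant above is an assumption. The sketch below applies it to the example; location_A is an ordinary global standing in for a device register, and pulse() is a name made up here.

```c
#include <assert.h>
#include <stdint.h>

/* Emits no instructions, but the compiler may not move memory accesses
   across it or keep memory values cached in registers over it.
   This is a compiler barrier only; it does not order anything in hardware. */
#define barrier() __asm__ __volatile__("" : : : "memory")

uint32_t location_A;    /* stand-in for a device register, not volatile */

static void write_to_reg(uint32_t *reg, uint32_t value)
{
    *reg = value;
}

static void delay(void)
{
    for (int i = 0; i < 1000; i++) {
        barrier();      /* gives the otherwise empty loop a side effect */
    }
}

/* Write 1, wait, write 2, in that order. */
static void pulse(void)
{
    write_to_reg(&location_A, 1);
    barrier();          /* first write is done before the delay starts */
    delay();
    barrier();          /* second write is not hoisted above the delay */
    write_to_reg(&location_A, 2);
    barrier();          /* second write is done before anything that follows */
    assert(location_A == 2);
}
```

Without the barriers an optimizer would be free to drop the first store, since nothing it can see reads location_A before the second one.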

C is a good way to express algorithms, but it is a bad way to express timing
of events. So, barriers allow it to do both. Yes, it's a bit cumbersome,
but for general purpose code, it's not common. You're just writing system
code, so you tend to run into it a bit more.

I have no idea what IAR uses for this kind of construct, but it must have
*something*. Maybe a little manual reading will help now that you know
what to look for.

Just be happy you're not running on a processor that does aggressive out
of order code execution. Then you have to worry about both the compiler
and the chip moving things about on you. Now, that's fun. :)

Good luck.

Cheers,
David




At 12:48 PM 2/22/04 -0500, you wrote:
> > In light of the previous revelations, I tried what I described before.
> > Does anyone know what is going on here ?
> >
> > Outputting a character into circular buffer works fine like this :
> > =======================================
> > UINT temp;
> >
> > /* Handle TX buffer wrap */
> > temp = (tx_head+1)%RS232_SIZE;
> >
> > /* Disable TX interrupts */
> > U0IER = 0x01;
> >
> > /* Circ buffer not empty ? */
> > if (comm_tx_running)
> > {
> > comm_tx[tx_head] = (UCHAR)ch;
> > tx_head = temp;
> > }
> > else
> > {
> > comm_tx_running = 1;
> > U0THR = ch;
> > }
> >
> > /* Reenable TX interrupts */
> > U0IER = 0x03;
> > return (ch);
> >
> >
> > But doing a write to the TX buffer with post increment also causes Default
> > IRQs :
> > ====================================================
> >
> > /* Disable TX interrupts */
> > U0IER = 0x01;
> >
> > /* Circ buffer not empty ? */
> > if (comm_tx_running)
> > {
> > comm_tx[tx_head++] = (UCHAR)ch;
> > tx_head %= RS232_SIZE;
> > }
> > else
> > {
> > comm_tx_running = 1;
> > U0THR = ch;
> > }
> >
> > /* Reenable TX interrupts */
> > U0IER = 0x03;
> > return (ch);
> >
> > -------------------
> >
> > Could this have to do with optimisation ?
> > I have to honestly volunteer I'm stumped with that one.
> >
> > -- Kris
>
>I'd second this. I see nothing in here to stop a compiler from
>moving stuff around to its heart's content. You might want
>to look at adding some barrier() statements to the above code
>to prevent that kind of code rearrangement.

GCC is more prone to this than other compilers I've worked with. Most of
them seem to take the approach of "no re-ordering unless provably
beneficial" whereas GNU's appears to be "reorder as long as it's not
wrong". A different emphasis that makes stepping through optimized code a
little more interesting.

Judicious use of volatile helps. Re-doing critical sections in assembly
certainly works.

If it's an ordering problem, I suspect that if both U0IER and
comm_tx_running are volatile you shouldn't have a problem, since the
compiler is not allowed to reorder accesses to volatile
variables.
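
A sketch of what that could look like for the routine posted earlier. Note that the compiler may still move non-volatile accesses across volatile ones, so this version also marks tx_head and comm_tx as volatile, which goes a step further than the suggestion above. Everything here is a host-side model: the registers are plain variables, RS232_SIZE is given an arbitrary value, the function names are made up, and the ISR side is not shown.

```c
#include <assert.h>
#include <stdint.h>

#define RS232_SIZE 64u                      /* assumed buffer size */

/* Stand-ins for the memory-mapped UART0 registers. */
static volatile uint32_t U0IER = 0x03;
static volatile uint32_t U0THR;

/* State shared with the TX interrupt handler. */
static volatile uint8_t  comm_tx[RS232_SIZE];
static volatile unsigned tx_head;
static volatile int      comm_tx_running;

static int uart0_putchar(int ch)
{
    U0IER = 0x01;                           /* disable THE, leave RDA enabled */

    if (comm_tx_running) {
        /* Transmitter busy: queue the character. */
        comm_tx[tx_head] = (uint8_t)ch;
        tx_head = (tx_head + 1u) % RS232_SIZE;
    } else {
        /* Transmitter idle: send directly. */
        comm_tx_running = 1;
        U0THR = (uint32_t)ch;
    }

    U0IER = 0x03;                           /* re-enable THE */
    return ch;
}

/* Usage: the first character goes straight out, the second is queued. */
static void putchar_demo(void)
{
    uart0_putchar('A');
    assert(U0THR == 'A' && tx_head == 0);
    uart0_putchar('B');
    assert(comm_tx[0] == 'B' && tx_head == 1);
}
```

With every shared object volatile, the buffer update has to be emitted between the two U0IER writes, in source order.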

A quick google search ( barrier GNU reordering) reveals this item

http://compilers.iecc.com/comparch/article/03-03-158

I think I'd go for full assembly for control rather than inline asm.

Hmm, actually, a point: if you want to check whether optimization is messing
with your execution order, just dump the generated assembly and read it.
Always an illuminating process :)

Robert

" 'Freedom' has no meaning of itself. There are always restrictions,
be they legal, genetic, or physical. If you don't believe me, try to
chew a radio signal. "

Kelvin Throop, III


At 10:13 AM 2/22/04 -0600, you wrote:
>As a side note. The disabling and restoring of the global interrupt
>flag in the cpsr can have a similar problem.

<snip>

>For more info on this, Atmel has
>an APP note describing it on their web site in the AT91 ARM7 section.

Thanks for the pointer.

Robert