Reply by Thiadmer Riemersma ITB CompuPhase March 25, 2006
Hello Robert,

> Something to consider is cracked capacitors.

Thanks for the tip.

Another hypothesis that I want to investigate is whether we may be
operating the LPC2138 under "off spec" conditions. Most chips will
still run (slightly) off spec, but some won't.

This would explain why resoldering all of the pins of the LPC2138 did
not help, but replacing it by another made the spurious interrupts go
away.

It is just a hypothesis.

Kind regards,
Thiadmer
	


Reply by Robert Adsett March 23, 2006
At 08:53 AM 3/23/06 +0000, Thiadmer Riemersma (ITB CompuPhase) wrote:
>A floating pin (whether "by design" or by a bad soldering) was our
>first thought too. Note that so far we have tested more than 30 boards
>and so far only 2 cause these spurious interrupts.

Something to consider is cracked capacitors.  Stress on the board (perhaps 
from bad panel separation procedures) can crack ceramic surface mount 
devices particularly if they are close to the stress origin.  I ran into a 
bunch of these once and the key to finding them initially was to breathe on 
them.  The extra humidity was all that was needed to trigger the observed 
behaviour on most.  The cracks tended not to be visible to the naked eye 
but when the different layers of the caps shorted bad things 
happened.  That's a subtle enough effect to be overlooked unless you 
specifically hunt for it.

Robert

" 'Freedom' has no meaning of itself.  There are always restrictions,   be 
they legal, genetic, or physical.  If you don't believe me, try to chew a 
radio signal. "  -- Kelvin Throop, III
http://www.aeolusdevelopment.com/
	
Reply by Thiadmer Riemersma ITB CompuPhase March 23, 2006
Hello Richard,

A floating pin (whether "by design" or by a bad soldering) was our
first thought too. Note that so far we have tested more than 30 boards
and so far only 2 cause these spurious interrupts.

Kind regards,
Thiadmer Riemersma
	
Reply by Thiadmer Riemersma ITB CompuPhase March 22, 2006
Hello Tom,

On both boards, we resoldered all pins of the existing LPC2138 before
we took the step of replacing an LPC2138 on one of the boards.

The curious (or suspicious) thing is:
- both boards have precisely the same problem
- so far, no board failed in some other way
- the problem is almost identical to the one in message 11212

But, yes, you are right. I have been jumping to conclusions.

Kind regards,
Thiadmer Riemersma
	
Reply by rtstofer March 22, 2006
--- In lpc2000@lpc2..., "Thiadmer Riemersma (ITB CompuPhase)"
<go@...> wrote:
>
> Hello Leon,
> 
> We designed the board and did a few prototypes in-house, but series
> manufacturing was done by a specialized company. So I do not really
> know whether precautions were observed.
> 
> > High-reliability systems are often 'burned-in' - run for some time at
> > a high temperature - which shows up faults like that.
> 
> Funnily, this particular problem went away at high temperature.
> 
> Kind regards,
> Thiadmer
>

Whenever I have run across a situation where touching the circuit
could make a change, it was because I had left pins floating.  I would
start with the schematic and look for pins that float.

Richard
	
Reply by Thiadmer Riemersma ITB CompuPhase March 22, 2006
Hello Leon,

We designed the board and did a few prototypes in-house, but series
manufacturing was done by a specialized company. So I do not really
know whether precautions were observed.

> High-reliability systems are often 'burned-in' - run for some time at
> a high temperature - which shows up faults like that.

Funnily, this particular problem went away at high temperature.

Kind regards,
Thiadmer
	
Reply by Thiadmer Riemersma ITB CompuPhase March 22, 2006
Hello Brandon,

Thank you for your message. Indeed there is a lot of information still
lacking in what I described.

Before replacing the LPC2138, we resoldered the leads of the old
processors (all of them). The pin connections were also visually
inspected (with a 10x magnifier).

I do not recall whether the spurious interrupt was "deterministic" on
the second board (the one that is now repaired). However, on our first
board where this occurred, the issue "appeared" deterministic. In
fact, on the first board we discovered the problem because we
experienced task starvation. This, in turn, was caused by an interrupt
being generated immediately after switching pin P0.16 from GPIO to EINT0.
We are switching the pin between GPIO and EINT for the same reason
(basically) as the one described in your original message.

I said that it "appeared" deterministic, because adding pressure on
this board made the spurious interrupts less frequent or made them
disappear altogether. However, without external pressure, the low
priority task could remain starved for a very, very long time.

Since then, I have adapted my firmware to detect this particular
spurious interrupt issue, and cause a reset with a diagnostic message
if it appears (even only once). For the second board, I therefore do
not know whether the spurious interrupt appeared deterministically
after each switch from GPIO to EINT, or just (say) 5 out of 10 times.
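
A check of this kind could look roughly like the sketch below. It is not the firmware described above: the heuristic (the EINT0 flag is already set right after the switch although the pin was sampled at its inactive level just before), the names `eint0_diag_t`, `eint0_switch_was_spurious` and `eint0_format_diag`, and the message text are all assumptions made for illustration. Reading the registers and forcing the reset are left to the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Assumption: the EINT0 flag is bit 0 of EXTINT; check the user manual. */
#define EXTINT_EINT0 (1u << 0)

typedef struct {
    uint32_t switches;  /* GPIO -> EINT0 switches checked so far */
    uint32_t spurious;  /* switches classified as spurious */
} eint0_diag_t;

/*
 * Call right after the pin has been switched from GPIO to EINT0.
 *   extint          value of EXTINT read immediately after the switch
 *   pin_idle_before true if the pin, sampled as GPIO just before the
 *                   switch, was at its inactive level
 * Returns true when the flag is set although no edge was expected.
 * A genuine edge arriving in the same instant would be misclassified,
 * so this is a heuristic, not a proof.
 */
bool eint0_switch_was_spurious(eint0_diag_t *diag, uint32_t extint,
                               bool pin_idle_before)
{
    assert(diag != NULL);
    diag->switches++;
    bool spurious = ((extint & EXTINT_EINT0) != 0) && pin_idle_before;
    if (spurious) {
        diag->spurious++;
    }
    return spurious;
}

/*
 * Format the diagnostic line to emit before forcing a reset.
 * Returns the snprintf result (characters needed, excluding the NUL).
 */
int eint0_format_diag(const eint0_diag_t *diag, char *buf, size_t len)
{
    assert(diag != NULL && buf != NULL);
    return snprintf(buf, len, "EINT0 spurious after GPIO switch: %lu of %lu",
                    (unsigned long)diag->spurious,
                    (unsigned long)diag->switches);
}
```

Resetting on the first `true` result matches the "even only once" policy; keeping the counters instead would answer the "5 out of 10 times" question on a board left running.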

Kind regards,
Thiadmer
	
Reply by Tom Walsh March 22, 2006
Thiadmer Riemersma (ITB CompuPhase) wrote:

>Hello everyone,
>
>On 8 december 2005, Brendan Murphy posted a report where switching a
>pin from GPIO to EINT (external interrupt) caused an edge-triggered
>interrupt (message http://groups.yahoo.com/group/lpc2000/message/11212).
>
>I replied to that message saying that we encountered precisely that
>error on a single PCB out of 20 that we tested. Our hypothesis was
>that the particular PCB had some kind of defect. This hypothesis was
>based on the fact that the spurious interrupts _disappeared_ when
>slightly bending the PCB.
>
>We recently encountered a second board with exactly the same problem.
>Here, too, pressure on the board influenced the occurrence of spurious
>interrupts. Also similar was that the "pressure point" was on the
>processor. In a related test, we heated the processor. This also
>influenced the occurrence of the spurious interrupts.
>
>We replaced the processor on the board (an LPC2138) by another one. No
>more spurious interrupts occurred.
>
>We are wondering whether this issue may perhaps be caused by the
>processors not being handled correctly during manufacturing (perhaps
>there was no baking before reflow soldering).
>
>  
>
Well, it is tough to assume that everything else was okay.  Another 
posting I made this morning was about how clean the finished board was, 
and your comment about replacing the chip brings it to mind.  I had a 
manufacturer call me in to try to find a problem with a group of boards 
that they could not get working.  These had been assembled in Asia and 
the majority of the lot worked, all but these few boards.

Looking at these boards under a 20X microscope, I discovered solder 
debris which had not been properly cleaned from the board.  There were 
tiny groups of solder balls + unmelted paste around many of the pins.  
The cleaning process had removed the majority of the debris around the 
pins but failed to adequately remove it from "underneath".  The debris 
was tucked up on the "inside" of the pins where it was difficult to 
see.

Several of the boards worked when I just reheated the pins with a 
soldering iron and reflowed the debris.

TomW

-- 
Tom Walsh - WN3L - Embedded Systems Consultant
http://openhardware.net, http://cyberiansoftware.com
"Windows? No thanks, I have work to do..."
----------------
	
Reply by Leon Heller March 22, 2006
----- Original Message ----- 
From: "Thiadmer Riemersma (ITB CompuPhase)" <go@...>
To: <lpc2000@lpc2...>
Sent: Wednesday, March 22, 2006 9:45 AM
Subject: [lpc2000] CONFIRMED: WARNING: problem reading state of external 
interrupt lines.
> Hello everyone,
>
> On 8 december 2005, Brendan Murphy posted a report where switching a
> pin from GPIO to EINT (external interrupt) caused an edge-triggered
> interrupt (message http://groups.yahoo.com/group/lpc2000/message/11212).
>
> I replied to that message saying that we encountered precisely that
> error on a single PCB out of 20 that we tested. Our hypothesis was
> that the particular PCB had some kind of defect. This hypothesis was
> based on the fact that the spurious interrupts _disappeared_ when
> slightly bending the PCB.
>
> We recently encountered a second board with exactly the same problem.
> Here, too, pressure on the board influenced the occurrence of spurious
> interrupts. Also similar was that the "pressure point" was on the
> processor. In a related test, we heated the processor. This also
> influenced the occurrence of the spurious interrupts.
>
> We replaced the processor on the board (an LPC2138) by another one. No
> more spurious interrupts occurred.
>
> We are wondering whether this issue may perhaps be caused by the
> processors not being handled correctly during manufacturing (perhaps
> there was no baking before reflow soldering).
>
> We have also tried to implement simple work-arounds, such as disabling
> the external interrupt in the VIC before switching the pin from GPIO
> to EINT, or clearing the interrupt in the VIC immediately after the
> switch. Both these attempts failed.
>
> I am reporting this to retract my earlier statement that the probable
> cause is in the PCB or in the PCB design. In addition, I am curious
> whether anyone can think of a work-around or fix for this issue.

Moisture getting into the package could have caused the problem, if the 
chips were removed from the protective packaging and left for some time 
before assembly. Baking before reflow soldering is only required if the 
precautions have not been observed.

High-reliability systems are often 'burned-in' - run for some time at a high 
temperature - which shows up faults like that.

Leon
	
Reply by brendanmurphy37 March 22, 2006
Thiadmer,

It's a bit difficult to offer any kind of specific advice, based on 
what you say.

The problem I described is quite deterministic: that is, if you 
configure the device in a certain way, it always responds in exactly 
the same way to the same sequence of actions. 

From what you say, your problem seems to be intermittent, which is of 
course the worst kind of problem to deal with.

It certainly sounds suspicious that it may be hardware related by the 
symptoms you describe. However, it doesn't prove it. As a general 
strategy, I'd advise trying to get the system to behave consistently, 
and then look at differences from a system that works (consistently) 
and one that doesn't. You could try, for example, reducing the software 
as much as possible (e.g. to a total of 20 or 30 lines of code) and see 
whether you can get it to behave consistently, regardless of the board 
it's running on. It was this approach that enabled us to identify the 
issue I reported.
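
A reduced test along these lines could be as small as the loop below, which switches the pin from GPIO to EINT0 repeatedly and counts how often the interrupt flag is already set. This is only a sketch: the hook table `eint0_hooks_t` and the `fake_*` functions are hypothetical, and the register mapping named in the comments (PINSEL1 bits 1:0 for P0.16, EXTINT bit 0 for EINT0) is an assumption to be checked against the user manual. The fake board exists only so the loop can be exercised on a PC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hardware hooks. On an LPC2138 these would presumably wrap PINSEL1
 * (pin function) and EXTINT (flag, write 1 to clear); assumption. */
typedef struct {
    void (*select_gpio)(void *ctx);
    void (*select_eint0)(void *ctx);
    void (*clear_eint0_flag)(void *ctx);
    bool (*eint0_flag_set)(void *ctx);
    void *ctx;
} eint0_hooks_t;

/* Switch GPIO -> EINT0 `iterations` times with the input held idle and
 * return how many switches left the EINT0 flag set. */
unsigned count_spurious_flags(const eint0_hooks_t *hw, unsigned iterations)
{
    assert(hw != NULL);
    unsigned spurious = 0;
    for (unsigned i = 0; i < iterations; ++i) {
        hw->select_gpio(hw->ctx);
        hw->clear_eint0_flag(hw->ctx);   /* start from a clean flag */
        hw->select_eint0(hw->ctx);
        if (hw->eint0_flag_set(hw->ctx)) {
            ++spurious;
        }
    }
    return spurious;
}

/* Host-side stand-in for a board: raises the flag on every
 * `fault_every`-th switch to EINT0 (0 means never). */
typedef struct {
    bool pin_is_eint;
    bool flag;
    unsigned switches;
    unsigned fault_every;
} fake_board_t;

static void fake_select_gpio(void *ctx)
{
    ((fake_board_t *)ctx)->pin_is_eint = false;
}

static void fake_select_eint0(void *ctx)
{
    fake_board_t *b = (fake_board_t *)ctx;
    b->pin_is_eint = true;
    b->switches++;
    if (b->fault_every != 0 && b->switches % b->fault_every == 0) {
        b->flag = true;
    }
}

static void fake_clear_flag(void *ctx)
{
    ((fake_board_t *)ctx)->flag = false;
}

static bool fake_flag_set(void *ctx)
{
    return ((fake_board_t *)ctx)->flag;
}

eint0_hooks_t fake_hooks(fake_board_t *board)
{
    eint0_hooks_t hw = { fake_select_gpio, fake_select_eint0,
                         fake_clear_flag, fake_flag_set, board };
    return hw;
}
```

Running the same loop on a suspect board and on a known good one gives a count per board that can be compared directly, which is easier to reason about than task starvation in the full application.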

By the way, I assume you've discounted simple issues such as 
intermittent contacts at the processor pins? Applying pressure or 
heat is a good way of turning an intermittent contact (that may well 
be generating interrupts as the pin's state changes) into a good one, 
where the pin is held to its correct state by whatever it's connected 
to. A good way of testing this is to run the relevant software on a 
known good board (e.g. a bought-in, commercially available board).

I know this is of limited help, and probably nothing more than you've 
already done: it would be interesting to get feedback on how you 
progress.

Regards
Brendan

--- In lpc2000@lpc2..., "Thiadmer Riemersma (ITB CompuPhase)" 
<go@...> wrote:
>
> Hello everyone,
> 
> On 8 december 2005, Brendan Murphy posted a report where switching a
> pin from GPIO to EINT (external interrupt) caused an edge-triggered
> interrupt (message http://groups.yahoo.com/group/lpc2000/message/11212).
> 
> I replied to that message saying that we encountered precisely that
> error on a single PCB out of 20 that we tested. Our hypothesis was
> that the particular PCB had some kind of defect. This hypothesis was
> based on the fact that the spurious interrupts _disappeared_ when
> slightly bending the PCB.
> 
> We recently encountered a second board with exactly the same problem.
> Here, too, pressure on the board influenced the occurrence of spurious
> interrupts. Also similar was that the "pressure point" was on the
> processor. In a related test, we heated the processor. This also
> influenced the occurrence of the spurious interrupts.
> 
> We replaced the processor on the board (an LPC2138) by another one. No
> more spurious interrupts occurred.
> 
> We are wondering whether this issue may perhaps be caused by the
> processors not being handled correctly during manufacturing (perhaps
> there was no baking before reflow soldering).
> 
> We have also tried to implement simple work-arounds, such as disabling
> the external interrupt in the VIC before switching the pin from GPIO
> to EINT, or clearing the interrupt in the VIC immediately after the
> switch. Both these attempts failed.
> 
> I am reporting this to retract my earlier statement that the probable
> cause is in the PCB or in the PCB design. In addition, I am curious
> whether anyone can think of a work-around or fix for this issue.
> 
> Kind regards,
> Thiadmer Riemersma
>
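
For readers who want to see the work-arounds mentioned in the quoted message spelled out, the sketch below combines both into one sequence: mask EINT0 in the VIC, switch the pin, discard any pending flag, then unmask. It is an illustration only, and the message above reports that these attempts did not help on the affected boards. The register roles and addresses in the comments are quoted from memory of the LPC213x user manual and should be verified; treating "clearing the interrupt" as a write to EXTINT is an assumption, and the struct `eint0_regs_t` is hypothetical. Registers are passed as pointers so the sequence can also be run against plain variables on a PC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PINSEL1_P016_MASK  (3u << 0)   /* PINSEL1 bits 1:0: P0.16 function */
#define PINSEL1_P016_EINT0 (1u << 0)   /* 01 selects EINT0 */
#define EXTINT_EINT0       (1u << 0)   /* write 1 to clear the EINT0 flag */
#define VIC_CH_EINT0       (1u << 14)  /* EINT0 is VIC channel 14 */

/* Pointers to the registers involved. Addresses on the target, as
 * recalled from the user manual (verify before use):
 *   PINSEL1 0xE002C004, EXTINT 0xE01FC140,
 *   VICIntEnable 0xFFFFF010, VICIntEnClr 0xFFFFF014 */
typedef struct {
    volatile uint32_t *pinsel1;
    volatile uint32_t *extint;
    volatile uint32_t *vic_int_enable;
    volatile uint32_t *vic_int_en_clr;
} eint0_regs_t;

void switch_p016_to_eint0_guarded(const eint0_regs_t *r)
{
    assert(r != NULL);

    /* 1. Mask EINT0 in the VIC so the switch cannot raise an IRQ. */
    *r->vic_int_en_clr = VIC_CH_EINT0;

    /* 2. Hand P0.16 over from GPIO to the EINT0 function. */
    *r->pinsel1 = (*r->pinsel1 & ~PINSEL1_P016_MASK) | PINSEL1_P016_EINT0;

    /* 3. Discard any flag that the switch itself may have raised. */
    *r->extint = EXTINT_EINT0;

    /* 4. Unmask EINT0 again. */
    *r->vic_int_enable = VIC_CH_EINT0;
}
```

If a flag still reaches the handler after a sequence like this, that points away from a simple ordering mistake in software, which is consistent with the hardware suspicion discussed elsewhere in the thread.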