EmbeddedRelated.com

Cyclic Ethernet frame loss with LPC2468

Started by rfinnovations August 22, 2012
Hi,

Our design uses an LPC2468 with DP83848 PHY. We lose some
Ethernet frames transmitted by our firmware. When I say that
we lose the frames, they have been transmitted as far as we
can tell from the MAC registers in the LPC, but they are not
detected by the PC at the other end of the wire (running a
Wireshark capture).

The test setup was an isolated point-to-point link with a
short crossover cable, but we have seen the issue with a
number of different network hardware configurations.

Probing the lines between the micro and PHY and between the
PHY and transceiver shows no obvious issues.

Running a ping test (1 per sec) on an isolated point-to-
point link over a number of days shows that the frame loss
follows a 24 hour cycle. At the low point, almost no frames
are lost. At the worst point almost all are lost. A plot of
frame loss percentage by hour looks like a sine or cyclic
bell curve.

Locking down the link speed from 100 Mbps to 10 Mbps
prevents the frame loss.

The kicker is that we've used this exact same design before
in another product without trouble. Same circuit, component
values, even board layout.

The main variable we identified was the revision of the
micro. We shipped our first product with Rev B LPC2468s and
the new one with Rev D. Replacing the Rev D micro on one of
our new boards with a Rev B made the problem go away.

Our sample size is only one at the moment, but does anyone
else have this issue, or have any other suggestions? Any
help is greatly appreciated.
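
A loss-by-hour plot like the one described in the post above can be produced from a plain log of ping results. The following is a minimal sketch, assuming a hypothetical log of (timestamp, reply received) pairs; the function name and the synthetic data are illustrative only and are not taken from the original test.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def loss_percent_by_hour(samples):
    """Return {hour_of_day: percent_lost} from (datetime, got_reply) pairs."""
    sent = defaultdict(int)
    lost = defaultdict(int)
    for when, got_reply in samples:
        sent[when.hour] += 1
        if not got_reply:
            lost[when.hour] += 1
    return {hour: 100.0 * lost[hour] / sent[hour] for hour in sorted(sent)}


# Synthetic data, made up for illustration: one ping per minute for an hour
# starting at 02:00 with every reply received, then one ping per minute for
# an hour starting at 14:00 with every second reply missing.
night = datetime(2012, 8, 20, 2, 0, 0)
samples = [(night + timedelta(minutes=m), True) for m in range(60)]
afternoon = datetime(2012, 8, 20, 14, 0, 0)
samples += [(afternoon + timedelta(minutes=m), m % 2 == 0) for m in range(60)]

for hour, pct in loss_percent_by_hour(samples).items():
    print(f"{hour:02d}:00  {pct:5.1f}% lost")
```

Binning by hour of day, rather than by elapsed time, is what makes a daily cycle stand out when the test runs over several days.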


--- In l..., "rfinnovations" wrote:
>
> The main variable we identified was the revision of the
> micro. We shipped our first product with Rev B LPC2468s and
> the new one with Rev D. Replacing the Rev D micro on one of
> our new boards with a Rev B made the problem go away.
>
> Our sample size is only one at the moment, but does anyone
> else have this issue, or have any other suggestions? Any
> help is greatly appreciated.
>

Have you checked the Ethernet-related items in the Errata Sheet?

http://www.nxp.com/documents/errata_sheet/ES_LPC2468.pdf

I couldn't see anything that matched your description or that would explain why it worked in Rev B and not Rev D, but it may lead you to some other lines of investigation ...

--
Chris Burrows
CFB Software
http://www.astrobe.com

Hello,

I have had the same problem, and the solution was simple.

You probably used the clock output of the PHY chip as the reference clock for the network interface on the LPC. My advice: don't. Instead, have the oscillator that drives the PHY also connected directly to the rx_clk of the LPC. That fixed the problem for us, and we discovered it by carefully reading the datasheet.

Hope this helps.
Kevin.


Hi Kevin,

Thank you very much for sharing this with us. You're right, our circuit is designed that way. I've passed on the information to the engineer that designed the circuit and we're going to try it out right away.

Thanks again.

Hi Chris,

Yes, we studied the errata pretty carefully, but it didn't really lead us anywhere. Thanks for the suggestion. It seems like Kevin may have the answer in his reply.

Hi,
You have probably fixed the problem by now. We had a similar problem in a design last year. It was due to a mistake in a reference design we used.

This is how NXP Technical Support responded ...

We have seen this with many more customers.
The ENET_REF_CLK (P1.15) should be connected directly to the 50 MHz clock source, and not to the clock output of the PHY. This is a known problem (explained in a National app note) which causes synchronization problems between the EMAC and the PHY, especially at high temperatures. With the clock taken from the PHY, everything may appear to work, but at some point the failure appears.

Some background:
In RMII mode, the 50 MHz clock is used by both the EMAC and the PHY to synchronize the data. If the EMAC is given a delayed clock (coming from the PHY instead of directly from the clock source), the data may eventually not be synchronized correctly.
You associate the failure with rev D because rev B was working fine for you, but we have seen this problem occur with rev B as well at other customers, especially at high temperatures. For this reason, we have always recommended connecting P1.15 directly to the clock source.

Conclusion:
So, rev B would not solve the problem. The fact that it works now is probably more luck than wisdom . . . Any new rev B production batch could show the same problem.
The comment advising against using pin 25 as a clock source is on page 2 of:
http://www.national.com/an/AN/AN-1405.pdf

It’s not an NXP problem. From rev B to rev D nothing has changed functionally (just errata fixes, shrinks, etc.). It’s definitely a new die, so timing differences will occur, but everything is still within spec. We have measured a phase shift of about 40% between the oscillator and the 50 MHz clock-out pin of the PHY. That’s the problem!
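
To put the quoted phase shift in perspective, here is a rough calculation. It assumes that "about 40%" means 40% of one 50 MHz clock period, which the message does not state explicitly, and it uses a 4 ns setup time as an assumed RMII input requirement that should be checked against the EMAC and PHY datasheets. The function name is illustrative only, and the model is a simple worst-case view in which the clock skew subtracts directly from the setup budget.

```python
def rmii_setup_budget(ref_clk_hz, phase_fraction, setup_ns):
    """Return (period_ns, skew_ns, remaining_ns) for a simple setup budget."""
    period_ns = 1e9 / ref_clk_hz
    skew_ns = phase_fraction * period_ns
    # Whatever is left must cover clock-to-output delay and trace delay.
    remaining_ns = period_ns - setup_ns - skew_ns
    return period_ns, skew_ns, remaining_ns


period_ns, skew_ns, remaining_ns = rmii_setup_budget(
    ref_clk_hz=50e6,      # RMII reference clock, from the thread
    phase_fraction=0.40,  # "about 40%", assumed to mean 40% of one period
    setup_ns=4.0,         # assumed setup requirement, verify in the datasheets
)
print(f"clock period:      {period_ns:.1f} ns")
print(f"clock skew:        {skew_ns:.1f} ns")
print(f"left after setup:  {remaining_ns:.1f} ns")
```

Under these assumptions the skew alone uses a large share of each clock period, so small shifts with temperature or die revision could plausibly tip the interface into failure. In RMII the reference clock stays at 50 MHz at 10 Mbit/s while each data value is held for ten clock cycles, which may be why forcing the link to 10 Mbps hid the problem.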
 

 

________________________________
From: Kevin
To: l...
Sent: Thursday, 23 August 2012, 10:21
Subject: [lpc2000] Re: Cyclic Ethernet frame loss with LPC2468



 

Hi All,

In addition to this:
it may also be worth noting that 50 MHz is a fairly high frequency.
For this reason we have to take into account trace impedance (with
reference to the ground plane), length matching, and possibly damping
resistors (low value, e.g. 22R or equivalent).
In particular, length matching of the 50 MHz reference and the MII data
(both RxD and TxD) may need special attention so that clocking does not
cause any problems.
This is certainly true at gigabit transfer speeds (which we currently
have to deal with over RGMII).

Br,

Armand ten Doesschate

On Wed, 29 Aug 2012 09:41:26 +0100 (BST), Bruce Marshall wrote:
> Hi,
> You have probably fixed the problem now. We had a similar problem in a
> design last year. This was due to a mistake in a reference design we used.

Having used the NS PHY with the LPC4350 / 4330, we found that the digital interconnect between the PHY and the LPC device, including the clock, was not that critical. I have one board where these are connected using wire mods with considerable differences in length and routing, and it works fine.

The biggest problem we experienced with the PHY was the PHY layout itself, particularly with respect to the capacitors connected to the PHY. There is a specific layout guideline in the PHY app note regarding the order in which these are connected, and not following that order seemed to cause a lot of problems.

I haven’t found it necessary to use termination (damping) resistors or to length match the traces between PHY and LPC. The timing is quite relaxed provided you use the correct clock source, i.e. NOT the clock output from the PHY. I have traces varying between 10 and 20 cm due to wire mods and see no problems.



Regards



Phil.
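
A rough calculation is consistent with the observation above that a 10 to 20 cm spread in connection length is harmless at 50 MHz. This sketch assumes a propagation delay of about 6 ns per metre, a typical ballpark for FR-4 traces; the real figure depends on the board stack-up and on any wire mods, and the names used here are illustrative only.

```python
PROP_DELAY_NS_PER_M = 6.0   # assumption: ballpark figure for FR-4 traces
REF_CLK_PERIOD_NS = 20.0    # one period of the 50 MHz RMII reference clock


def flight_time_ns(length_cm, delay_ns_per_m=PROP_DELAY_NS_PER_M):
    """Propagation delay of a connection of the given length."""
    return (length_cm / 100.0) * delay_ns_per_m


# Shortest and longest connection lengths mentioned in the post above.
mismatch_ns = flight_time_ns(20.0) - flight_time_ns(10.0)
fraction_of_period = mismatch_ns / REF_CLK_PERIOD_NS

print(f"delay mismatch: {mismatch_ns:.2f} ns")
print(f"share of one clock period: {fraction_of_period:.1%}")
```

Under this assumption the mismatch is a small fraction of a clock period, far smaller than a clock skew of several nanoseconds.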



From: l... [mailto:l...] On Behalf Of a...@mini-amd.org
Sent: 29 August 2012 10:18
To: l...
Subject: Re: [lpc2000] Re: Cyclic Ethernet frame loss with LPC2468








Hi Bruce,

Thanks for the extra detail. The clock fix does seem to have solved the problem for us. We'll also go back and modify our previous design.
