Reply by CBFalconer June 8, 2007
Tom Lucas wrote:
>
... snip ...
> The day we switched to metal enclosures was the day when the
> majority of our EMI glitches disappeared. And they are strong too
> - a fork lift truck once drove over one of our control boxes!
> Unfortunately it is not possible for everyone's applications.
Back in the good old days everything went into metal boxes. None of these oil-wasting plastic thingies. Of course, the assortment of tubes and ten-watt resistors made the boxes rather large and heavy. :-) Not to mention 4 inch tall, 2 inch radius electrolytics to get 10 uF (at 450 V).

--
cbfalconer at maineline dot net
Reply by Vladimir Vassilevsky June 8, 2007

Rob Horton wrote:

> I have a number of boards with processors that communicate to about 10
> different SPI connected devices, all on the same bus. 1 of these happens
> to be an Atmel DataFlash. The DataFlash appears to be experiencing a
> corrupted memory problem.

I am 95% sure that the memory corruption is due to software bugs and/or CPU behavior at startup or powerdown. The SPI would be the least of my worries unless the layout is really, really bad and there is something VERY noisy nearby (like a powerful switch).

Vladimir Vassilevsky
DSP and Mixed Signal Design Consultant
http://www.abvolt.com
Reply by Tom Lucas June 8, 2007
"msg" <msg@_cybertheque.org_> wrote in message 
news:136io9l2i8gigf5@corp.supernews.com...
> larwe wrote:
>
> <snip>
>>> When deliberately trying to introduce noise on a unit at work by
>>> repeatedly opening and closing some 240V contactors in close
>>
>> Oh. This is a *very* different issue from what I had imagined from
>> your initial post. If you are having trouble with transients like
>> this, the mitigation techniques are different from what you would do
>> for internally (on-board) generated noise. Repeat your contactor
>> experiment and look at what's happening to the power rails inside
>> your product.
>
> Also be concerned about radiated EMI from the contactors; I am busy
> quenching similar transient induced glitches and have resorted to
> tin-box faraday cages enclosing boards and other directly connected
> subassemblies, with a lot of ground straps to well-grounded sinks.
The day we switched to metal enclosures was the day when the majority of our EMI glitches disappeared. And they are strong too - a fork lift truck once drove over one of our control boxes! Unfortunately it is not possible for everyone's applications. The easiest way to solve the problem is to move the contactors somewhere else...
Reply by msg June 8, 2007
larwe wrote:

<snip>
>> When deliberately trying to introduce noise on a unit at work by
>> repeatedly opening and closing some 240V contactors in close
>
> Oh. This is a *very* different issue from what I had imagined from
> your initial post. If you are having trouble with transients like
> this, the mitigation techniques are different from what you would do
> for internally (on-board) generated noise. Repeat your contactor
> experiment and look at what's happening to the power rails inside your
> product.

Also be concerned about radiated EMI from the contactors; I am busy quenching similar transient induced glitches and have resorted to tin-box faraday cages enclosing boards and other directly connected subassemblies, with a lot of ground straps to well-grounded sinks.

Regards,
Michael
Reply by larwe June 8, 2007
On Jun 8, 9:23 am, Rob Horton <yahoo@mr_horton.com> wrote:
> The RAM buffer corruption could explain some effects that have been
> observed with some units on site where a few bytes at a time seem to be
This is more or less what I observed while debugging my application. In my case it was not noise, it was failure to handle brownout gracefully; it was possible for the micro to start a write op during a brownout condition.
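One way to picture the brownout guard described above is the sketch below. It is only an illustration under assumptions: the 3000 mV threshold, the `guarded_write` name, and the `sim_*` stand-ins are all hypothetical, and a real design would take the limit from the flash datasheet and would normally also rely on a hardware supervisor or the micro's brownout detector.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed threshold for illustration only; use the datasheet value. */
#define VCC_MIN_WRITE_MV 3000u

/* Hypothetical hooks that board support code would provide. */
typedef uint16_t (*read_vcc_mv_fn)(void);
typedef bool (*flash_write_fn)(uint16_t page, const uint8_t *data, uint16_t len);

typedef enum {
    WRITE_OK,
    WRITE_REFUSED_LOW_VCC,  /* supply too low, write never started */
    WRITE_FAILED,           /* driver reported an error */
    WRITE_SUSPECT           /* supply sagged while the write was in flight */
} write_result;

/* Host-side stand-ins so the sketch compiles and runs off-target. */
uint16_t sim_vcc_mv = 3300u;
unsigned sim_writes = 0;

uint16_t sim_read_vcc_mv(void) { return sim_vcc_mv; }

bool sim_flash_write(uint16_t page, const uint8_t *data, uint16_t len)
{
    (void)page; (void)data; (void)len;
    sim_writes++;
    return true;
}

/* Refuse to start a write when the rail is low, and flag the page as
 * suspect if the rail is low again once the write has finished. */
write_result guarded_write(read_vcc_mv_fn read_vcc, flash_write_fn write,
                           uint16_t page, const uint8_t *data, uint16_t len)
{
    if (read_vcc() < VCC_MIN_WRITE_MV) {
        return WRITE_REFUSED_LOW_VCC;
    }
    if (!write(page, data, len)) {
        return WRITE_FAILED;
    }
    if (read_vcc() < VCC_MIN_WRITE_MV) {
        return WRITE_SUSPECT;
    }
    return WRITE_OK;
}
```

A caller would retry or log on anything other than `WRITE_OK`; the check before the write is the part that stops the micro from starting a write op during a brownout.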
> corrupted in the stored data with a gradually worsening situation. Small
> amounts of data are written at a time so there will probably be a number
> of repetitive reads and writes to the same page.
It should be very easy to do a worst-case intensive endurance test on this to determine if that's your problem. But it doesn't sound like it.
> When deliberately trying to introduce noise on a unit at work by
> repeatedly opening and closing some 240V contactors in close
Oh. This is a *very* different issue from what I had imagined from your initial post. If you are having trouble with transients like this, the mitigation techniques are different from what you would do for internally (on-board) generated noise. Repeat your contactor experiment and look at what's happening to the power rails inside your product.
Reply by Rob Horton June 8, 2007
larwe wrote:
> On Jun 8, 8:00 am, Rob Horton <yahoo@mr_horton.com> wrote:
>
>> When writing to the flash a command is sent to the flash to copy a page
>
> I know how DataFlash works, I use it in a current design :)
>
>> sent to the flash so that it writes the entire contents of the RAM
>> buffer back into the flash memory. There is no error detection in this
>
> Statistically, your noise problem is much more likely to result in the
> contents of the RAM buffer being corrupted than the target address
> being wrong. What type of corruption are you actually observing?

The RAM buffer corruption could explain some effects that have been observed with some units on site where a few bytes at a time seem to be corrupted in the stored data with a gradually worsening situation. Small amounts of data are written at a time so there will probably be a number of repetitive reads and writes to the same page.

The system uses a Rabbit processor and Dynamic C, which supplies all of the low-level code to access the flash. As the user, I am presented with easy-to-use fLashwrite() and flashread() functions. I have studied the supplied low-level code in case of a bug and it looks fine to me.

My feeling is that we are experiencing a noise problem on site, although I am keeping an open mind. Quite often a "problem" turns out to be a combination of a number of simpler issues.

When deliberately trying to introduce noise on a unit at work by repeatedly opening and closing some 240V contactors in close proximity to the unit, I had the situation where the flash page index had been corrupted. This resulted in an entirely different page being copied into the internal RAM buffer. This was then written back to the non-corrupted page address, resulting in 1024 bytes of invalid data. I haven't seen this happen on site yet, but it has opened my eyes to the effects that noise can have. I am hoping to make some changes that will minimise or prevent this noise problem.
Reply by larwe June 8, 2007
On Jun 8, 8:00 am, Rob Horton <yahoo@mr_horton.com> wrote:


> When writing to the flash a command is sent to the flash to copy a page
I know how DataFlash works, I use it in a current design :)
> sent to the flash so that it writes the entire contents of the RAM
> buffer back into the flash memory. There is no error detection in this
Statistically, your noise problem is much more likely to result in the contents of the RAM buffer being corrupted than the target address being wrong. What type of corruption are you actually observing?
Reply by Rob Horton June 8, 2007
larwe wrote:
> On Jun 8, 6:21 am, Rob Horton <yahoo@mr_horton.com> wrote:
>
>> to be an Atmel DataFlash. The DataFlash appears to be experiencing a
>
> Are you using the write protect pin to ensure that spurious garbage
> doesn't go to the chip during powerup/powerdown? What clock speed are
> you running at? What is Vio?
>
> The good news is that SPI is very easy to buffer. Consider putting a
> buffer on SCK [at least, MOSI/MISO as well if necessary] and running
> the buffered signals separately to the DataFlash [i.e. put it on an
> isolated segment that doesn't run all over the board touching the
> other 9 SPI devices].

When writing to the flash, a command is sent to the flash to copy a page of data (1K) to its internal RAM buffer. The bytes that need to be written are then written to this internal RAM buffer. A command is then sent to the flash so that it writes the entire contents of the RAM buffer back into the flash memory.

There is no error detection in this command sequence. If there is a bit of noise on the line then the wrong area of flash is copied into the buffer, or the buffer is written back into the wrong area of flash. There is no way of telling. OK, I could read back what I have just written to verify it, but I have no way of knowing whether that is corrupted or where the original write was incorrectly written.

The buffering sounds like a good idea.
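Since the command sequence itself carries no error detection, one software-side option is to make each page self-describing, so that a wrong-page copy or corrupted bytes can at least be detected after the fact. The sketch below is only an illustration under assumptions: the layout (a 2-byte page index at the start, a CRC-16 in the last two bytes), the function names, and the choice of CRC-16/CCITT-FALSE are all hypothetical and are not part of Dynamic C or the DataFlash command set. It also costs 4 bytes of each 1024-byte page.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 1024u  /* page size quoted in the post */

/* CRC-16/CCITT-FALSE: polynomial 0x1021, initial value 0xFFFF. */
uint16_t crc16(const uint8_t *data, size_t len)
{
    uint16_t crc = 0xFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)((uint16_t)data[i] << 8);
        for (int bit = 0; bit < 8; bit++) {
            crc = (crc & 0x8000u) ? (uint16_t)((crc << 1) ^ 0x1021u)
                                  : (uint16_t)(crc << 1);
        }
    }
    return crc;
}

/* Hypothetical layout: [page index:2][payload][crc:2], big-endian.
 * Call after filling the payload and before programming the page. */
void page_seal(uint8_t *page, uint16_t index)
{
    page[0] = (uint8_t)(index >> 8);
    page[1] = (uint8_t)(index & 0xFFu);
    uint16_t crc = crc16(page, PAGE_SIZE - 2u);
    page[PAGE_SIZE - 2u] = (uint8_t)(crc >> 8);
    page[PAGE_SIZE - 1u] = (uint8_t)(crc & 0xFFu);
}

/* Returns 0 if the page is intact and is the page that was asked for,
 * -1 on a CRC mismatch, -2 if an intact but different page came back. */
int page_check(const uint8_t *page, uint16_t expected_index)
{
    uint16_t stored_crc = (uint16_t)(((uint16_t)page[PAGE_SIZE - 2u] << 8)
                                     | page[PAGE_SIZE - 1u]);
    if (crc16(page, PAGE_SIZE - 2u) != stored_crc) {
        return -1;
    }
    uint16_t stored_index = (uint16_t)(((uint16_t)page[0] << 8) | page[1]);
    if (stored_index != expected_index) {
        return -2;
    }
    return 0;
}
```

Checking a page with `page_check` after it has been read out, before modifying it, would catch the case where a different page was copied into the RAM buffer; checking again after a read-back would catch corrupted contents. It does not prevent the corruption, and it cannot say where a misdirected write landed.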
Reply by larwe June 8, 2007
On Jun 8, 6:21 am, Rob Horton <yahoo@mr_horton.com> wrote:

> to be an Atmel DataFlash. The DataFlash appears to be experiencing a
Are you using the write protect pin to ensure that spurious garbage doesn't go to the chip during powerup/powerdown? What clock speed are you running at? What is Vio? The good news is that SPI is very easy to buffer. Consider putting a buffer on SCK [at least, MOSI/MISO as well if necessary] and running the buffered signals separately to the DataFlash [i.e. put it on an isolated segment that doesn't run all over the board touching the other 9 SPI devices].
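In firmware terms, the write-protect suggestion amounts to keeping the pin asserted by default and releasing it only around an intended program operation. The sketch below models that on the host. Everything in it is hypothetical (`wp_set`, `device_program_page`, the simulated pin), and the device model is deliberately simplified: on a real DataFlash part, which pages the write-protect pin actually covers depends on the device and its protection settings, so check the datasheet.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simulated write-protect pin: false = low = protection asserted. */
bool wp_pin_high = false;
unsigned programmed_pages = 0;

/* Hypothetical board hook that drives the write-protect pin. */
void wp_set(bool high) { wp_pin_high = high; }

/* Simplified stand-in for the device: a program command is ignored
 * while the write-protect pin is held low. */
bool device_program_page(uint16_t page)
{
    (void)page;
    if (!wp_pin_high) {
        return false;
    }
    programmed_pages++;
    return true;
}

/* Release protection only for the duration of an intended write,
 * then assert it again whether or not the write succeeded. */
bool protected_program_page(uint16_t page)
{
    wp_set(true);
    bool ok = device_program_page(page);
    wp_set(false);
    return ok;
}
```

With the pin low at reset (ideally held there by a pull-down so it is defined before the micro configures its I/O), garbage clocked out during powerup or powerdown has nothing to act on in this model.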
Reply by June 8, 2007
Hi Rob,

If termination is your problem, keep in mind that for low-bandwidth
transfers you really only need to correctly terminate the clock
signal.

Termination is very simple and straightforward when there's only one
slave.  If you can, use 10 individual clocks instead of one common
clock to take advantage of the simple termination methods.  The data
pins can stay shared for reasonable clock speeds.

Regards,
Marc