CompactFlash + microcontroller weirdness

Started by H. Peter Anvin September 17, 2004
Hello,

I wanted to inquire if someone else has seen this kind of phenomenon
before...

I have a design (currently implemented in an FPGA) which drives a
CompactFlash from a microcontroller, specifically a T80 (Z80 clone)
from opencores.org, with my own firmware.  The clock frequency is
25 MHz.

The weird part is that it reads all cards fine, but when it comes to
writing I get some extremely odd behaviour.  I have two cards -- a 64
MB card from IOData and a 128 MB card from SanDisk, both used but
both of which write fine in a PC -- that very quickly give me an
unrecoverable error on write, with the error register set to either
81h or 40h.  The failing sector is filled with garbage which doesn't
look random; it has a lot of 55h bytes in particular.  Furthermore,
the sectors *after* the failing sector, up to the next 4K boundary,
are filled with a 16-bit pattern, usually, but not always, 0Fh 80h.
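
For reference, the two error register values can be decoded against
the ATA/CompactFlash error register layout.  This is only a minimal
sketch in Python, assuming the usual bit assignments (BBK = bit 7,
UNC = bit 6, IDNF = bit 4, ABRT = bit 2, AMNF = bit 0); the function
name is made up for illustration and is not part of the firmware.

```python
# Bit names follow the ATA/CompactFlash error register layout as assumed
# above; check them against the spec revision the cards claim to follow.
ERROR_BITS = {
    7: "BBK",   # bad block detected
    6: "UNC",   # uncorrectable data error
    4: "IDNF",  # requested sector ID not found
    2: "ABRT",  # command aborted
    0: "AMNF",  # address mark not found / general error
}

def decode_cf_error(value):
    """Return the names of the bits set in an 8-bit error register value."""
    return [name for bit, name in sorted(ERROR_BITS.items(), reverse=True)
            if value & (1 << bit)]

print(decode_cf_error(0x81))  # bits 7 and 0
print(decode_cf_error(0x40))  # bit 6
```

Under those assumptions 81h reads as BBK plus AMNF and 40h as UNC,
which would mean the card is reporting a media-level failure rather
than rejecting the command.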

Occasionally, too, I see entire 512-byte sectors filled with zero
without an error being reported.  After those, normal operation
resumes fine with the next sector.

However, a brand new card, 256 MB from POI, works like a charm.

It just seems very odd to me.  I'm mostly a software guy, so the
hardware aspects of this project are largely new to me.  However, I
have tried to eliminate glitches or asynchronicities.  The board is
the Altera NIOS development kit, so I have no reason to believe the
electricals are marginal as I might have if it had been a custom
board.  However, part of the reason I'd like to understand the
phenomenon is that I might want to use this design as a prototype for
a "real" hardware project in the future.

Detailed info (read only if bored):

It uses the 8-bit common memory mode of CompactFlash, reading/writing
single sectors at a time, in LBA mode.  The timing of the signals is
as follows (1 cycle = 40 ns):

   - A/Dout latched at the same time CE1# is asserted, so they should
     be stable for the duration of the transfer
   1 cycle     - CE1# asserted, WE/OE# deasserted
   3 cycles    - CE1# asserted, WE/OE# asserted
               -> WAIT# sampled during this time; if WAIT# is sampled
                  low within 2 cycles of WE/OE# assertion, the access
                  will be held until at least 1 cycle after WAIT#
                  deassertion
               - Din latched at the same time OE# is deasserted
   2 cycles    - CE1# asserted, WE/OE# deasserted
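
To put numbers on the list above, here is a small Python sketch that
converts the cycle counts into nanoseconds at 25 MHz.  The constant
and function names are invented for illustration, and modelling WAIT#
as a simple number of extra strobe cycles is an assumption, not a
description of the actual Verilog.

```python
CLOCK_MHZ = 25
CYCLE_NS = 1000 / CLOCK_MHZ   # 40 ns per cycle at 25 MHz

SETUP_CYCLES = 1    # CE1# asserted, WE/OE# deasserted
STROBE_CYCLES = 3   # CE1# asserted, WE/OE# asserted
HOLD_CYCLES = 2     # CE1# asserted, WE/OE# deasserted

def access_timing_ns(wait_cycles=0):
    """Return (setup, strobe, hold, total) in ns.

    wait_cycles is a simplified stand-in for WAIT# stretching the strobe.
    """
    setup = SETUP_CYCLES * CYCLE_NS
    strobe = (STROBE_CYCLES + wait_cycles) * CYCLE_NS
    hold = HOLD_CYCLES * CYCLE_NS
    return setup, strobe, hold, setup + strobe + hold

print(access_timing_ns())   # (40.0, 120.0, 80.0, 240.0)
```

So without WAIT# the strobe is 120 ns wide and a full access takes
240 ns; these are the figures to compare against the card's
common-memory timing requirements.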

After sending a command, the firmware will wait for BSY# assert and
deassert; the pulse is latched in hardware and then polled by
firmware.

I have not actually seen WAIT# being asserted by any card that I have
tried.

The source code (Verilog) is part of the project at:

    ftp://ftp.zytor.com/pub/fpga/abc80/abc80-10.zip

Thanks!

	-hpa
> The weird part is that it reads all cards fine, but when it comes to
> writing I get some extremely odd behaviour.  I have two cards -- a 64
> MB card from IOData and a 128 MB card from SanDisk, both used but

Is there any power-cycling going on during this test? I have
encountered a similar, but not identical problem, with certain
specific cards. But I tied our problem down to power-cycling after a
write operation.
Followup to:  <608b6569.0409171822.231b6ed6@posting.google.com>
By author:    larwe@larwe.com (Lewin A.R.W. Edwards)
In newsgroup: comp.arch.embedded
>
> > The weird part is that it reads all cards fine, but when it comes to
> > writing I get some extremely odd behaviour.  I have two cards -- a 64
> > MB card from IOData and a 128 MB card from SanDisk, both used but
>
> Is there any power-cycling going on during this test? I have
> encountered a similar, but not identical problem, with certain
> specific cards. But I tied our problem down to power-cycling after a
> write operation.
>

Generally, no, although on several occasions the system hung and I had
to remove the CF card from its slot.  However, at that point the
system was already hung.

Did you find a workaround for power cycle after write?  Intervening
read, or go into sleep mode?

	-hpa
[snip]
>The weird part is that it reads all cards fine, but when it comes to
>writing I get some extremely odd behaviour.
[snip]

Whenever I had weird problems like this, they boiled down to having
wires that were too long between the microcontroller and the CF.

HTH
Markus
> > specific cards. But I tied our problem down to power-cycling after a
> > write operation.
>
> Did you find a workaround for power cycle after write?  Intervening
> read, or go into sleep mode?

No, the problem was much more evil than that. It turned out that the
problem (zeroing of sectors) happened randomly, but ONLY if a write
operation had occurred followed - at any time - by a power-cycle.
Moreover, the problem was happening on the power-up half of the cycle.
In other words, the problem appeared if we did this:

1 write something
2 maybe read something
3 switch off
4 switch on

If we pulled the card out at step 3, and put it in a card-reader, we
observed that the data was ALWAYS OK. Furthermore, if we then put the
card back in our device, it would never have a problem. We only saw
the problem if we left the card in the device. I'm going to guess that
if we left it for several hours between steps 3-4, it probably
wouldn't have problems either, but that would have been too
time-consuming to test.

I "solved" the problem by changing the PCB layout a little, adding a
large tantalum cap on the CF power rails, and adding weak pull[up,down
as appropriate] resistors on the control lines. I don't know which of
those were important; we were under time pressure and had no time for
more than one additional board spin.
> ...
> I "solved" the problem by changing the PCB layout a little, adding a
> large tantalum cap on the CF power rails, and adding weak pull[up,down
> as appropriate] resistors on the control lines. I don't know which of
> those were important; we were under time pressure and had no time for
> more than one additional board spin.

Both are necessary, as well as a properly shielded PCB. The CF signal
lines don't have enough grounding paths. We had noise problems unless
there were power and ground planes on the PCB. However, that means
4 layers and double the cost of the PCB. After much experimentation,
we ended up with aluminum shielding covers on both sides of the CF and
PCB. That way, we could stay with 2 layers.
Followup to:  <608b6569.0409181336.702f00a1@posting.google.com>
By author:    larwe@larwe.com (Lewin A.R.W. Edwards)
In newsgroup: comp.arch.embedded
>
> No, the problem was much more evil than that. It turned out that the
> problem (zeroing of sectors) happened randomly, but ONLY if a write
> operation had occurred followed - at any time - by a power-cycle.
> Moreover, the problem was happening on the power-up half of the cycle.
> In other words, the problem appeared if we did this:
>
> 1 write something
> 2 maybe read something
> 3 switch off
> 4 switch on
>
> If we pulled the card out at step 3, and put it in a card-reader, we
> observed that the data was ALWAYS OK. Furthermore, if we then put the
> card back in our device, it would never have a problem. We only saw
> the problem if we left the card in the device. I'm going to guess that
> if we left it for several hours between steps 3-4, it probably
> wouldn't have problems either, but that would have been too
> time-consuming to test.
>
> I "solved" the problem by changing the PCB layout a little, adding a
> large tantalum cap on the CF power rails, and adding weak pull[up,down
> as appropriate] resistors on the control lines. I don't know which of
> those were important; we were under time pressure and had no time for
> more than one additional board spin.
>

Yipes.

OK, I don't have control over the board layout, this being an FPGA
development board, but I tried turning on the FPGA "slow slew rate"
I/O option in order to reduce noise.  It doesn't seem to have had any
immediate effect, but I will continue to investigate.

One more data item that I forgot to mention: at least one of the CF
cards did once work in the design, but unfortunately on a version of
the project whose sources had never made it into CVS, and were lost
due to a file server disk crash... oops.  Since this was quite a while
ago I was starting to wonder if there was an issue with the CF card
not performing releveling properly.

I'll definitely hunt for power sequencing and other such issues,
though.

If you don't mind telling, on your board, do you assert CEx# for each
transfer, or do you leave it (one or both) tied to ground?

	-hpa
After the rather unanimous comments in this group that the kind of
problems I'd been seeing are probably noise-related, I went back and
looked at the schematic for the FPGA development board (Altera Nios
Cyclone edition) I'm using.

It turns out that the power supply for the CF card is gated, but the
pin list I'd used didn't include the cf_power control pin - thus it
got left tristated and floating.  Apparently the unreliable VCC was
good enough for reading, but not writing.

Explicitly connecting cf_power to VCC solved the problem.
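
A check along the following lines would have caught the omission
earlier.  It is only a sketch in Python: the signal names other than
cf_power are hypothetical, and it assumes the board's signal list and
the design's pin assignments are both available as plain lists of
names.

```python
def unassigned_pins(board_signals, assigned_signals):
    """Return board signals that the design never assigns, sorted by name."""
    return sorted(set(board_signals) - set(assigned_signals))

# Hypothetical signal names, for illustration only.
board = ["cf_power", "cf_ce1_n", "cf_oe_n", "cf_we_n", "cf_wait_n"]
design = ["cf_ce1_n", "cf_oe_n", "cf_we_n", "cf_wait_n"]

print(unassigned_pins(board, design))   # ['cf_power']
```

Anything reported here ends up tristated unless the tools are told
otherwise, so each name deserves a look at the schematic.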

Many thanks!

	-hpa
"H. Peter Anvin" <hpa@terminus.zytor.com> wrote in message
news:ciohs3$qhu$1@terminus.zytor.com...
> After the rather unanimous comments in this group that the kind of
> problems I'd been seeing are probably noise-related, I went back and
> looked at the schematic for the FPGA development board (Altera Nios
> Cyclone edition) I'm using.
>
> It turns out that the power supply for the CF card is gated, but the
> pin list I'd used didn't include the cf_power control pin - thus it
> got left tristated and floating.  Apparently the unreliable VCC was
> good enough for reading, but not writing.

I guess the card could draw enough power via the protection diodes
from the lines that are high most of the time.

I have that board too and used the CF card in one project. Some cards
didn't work; it turned out that one signal called PACK or something
like that is connected to the LCD port on J12. I had to cut the wire
on the CF card connector to make it work properly.

You know you can also attach a hard disk to it at J11? The 40 pin
connectors are IDE compatible and from a software point of view, it
doesn't make a big difference.

Jeroen
Followup to:  <414fde72$0$21106$e4fe514c@news.xs4all.nl>
By author:    "Jeroen" <jayjay.1974@xs4all.nl>
In newsgroup: comp.arch.embedded
>
> > It turns out that the power supply for the CF card is gated, but the
> > pin list I'd used didn't include the cf_power control pin - thus it
> > got left tristated and floating.  Apparently the unreliable VCC was
> > good enough for reading, but not writing.
>
> I guess the card could draw enough power via the protection diodes from the
> lines that are high most of the time.
>

Either that or the floating input provided some power, enough that
the caps could keep Vcc up for reading.

> I have that board too and used the CF card in one project. Some cards didn't
> work, it turned out that one signal called PACK or something like that is
> connected to the LCD port on J12. I had to cut the wire on the CF card
> connector to make it work properly.

Yeah; I'm not using the LCD port for this project, for this very
reason.

> You know you can also attach a harddisk to it at J11? The 40 pin connectors
> are IDE compatible and from a software point of view, it doesn't make a big
> difference.

Interesting; I guess that explains the pin sharing, and running the CF
card at 5 V, which I otherwise found hard to understand.

In my application it wouldn't work, though, since I'm using the common
memory mode as opposed to the IDE mode of the CF interface, and 8 bits
to boot.  Not a big deal; my application is recreating a 1970's
computer; the OS can't even access more than 64 MB.

	-hpa