This list is for discussion of the design and implementation of field-programmable gate array based processors and integrated systems. It is also for discussion and community support of the XSOC Project (see http://www.fpgacpu.org/xsoc).
|
That sounds impressive, 1,000,000,0000 transistors. (10 Billion marketing gates :) ). Did you know the PDP-8/S, as marketed in 1968, had 1001 transistors?

Now, with that many transistors, how are failing/defective transistors/CLBs handled? Does one need to design error-detecting logic into the new CPU ISAs? I know the decimal machines of the 1950s often had error-detecting codes, like 2 bits out of 5, that detected not only storage problems but ALU problems too. Is there anything simple for today's binary machines, in re-coding information for storage and arithmetic, to detect possible problems?

-- Ben Franchuk - Dawn * 12/24 bit cpu * www.jetnet.ab.ca/users/bfranchuk/index.html
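For readers unfamiliar with the 2-out-of-5 code mentioned above, here is a minimal Python sketch of the idea. The digit-to-word assignment and the function names are assumptions made for this illustration, not the assignment used by any particular 1950s machine. The point is only that every valid word has exactly two 1 bits, so any single flipped bit yields an invalid word.

```python
from itertools import combinations

# The ten 5-bit words that have exactly two 1 bits, in ascending order.
# Which word represents which digit is an arbitrary choice for this sketch.
CODEWORDS = sorted(sum(1 << b for b in pair) for pair in combinations(range(5), 2))
ENCODE = dict(enumerate(CODEWORDS))          # digit -> 5-bit word
DECODE = {w: d for d, w in ENCODE.items()}   # 5-bit word -> digit


def encode_digit(digit: int) -> int:
    """Return the 5-bit code word for a decimal digit 0..9."""
    return ENCODE[digit]


def is_valid(word: int) -> bool:
    """A word is valid only if exactly two of its five bits are set."""
    return 0 <= word < 32 and bin(word).count("1") == 2


def decode_digit(word: int) -> int:
    """Decode a word, raising ValueError if the 2-of-5 check fails."""
    if not is_valid(word):
        raise ValueError(f"invalid 2-of-5 word: {word:05b}")
    return DECODE[word]
```

Because the same check can be applied to a word coming out of storage or out of an arithmetic unit, one checker covers both. Note the limit of the scheme: an error that clears one bit and sets another leaves two bits set and goes undetected.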
|
|
|
> Now with that many transistors how is failing/defective
> transistors/CLB's handled? Need one design error detecting
> logic in the new cpu ISA's? While I know the decimal machines
> of the 1950's often had error detecting codes like 2 bits out
> 5 that not only detected storage problems it detected alu
> problems too. Is there anything simple for today's binary
> machines in re-coding information for storage and arithmetic
> to detect possible problems?

Each and every one of those transistors tests out "perfectly" at the factory. I understand that the tester downloads a number of configuration bitstreams that fully exercise and cover the configuration memory, the CLBs, interconnect, etc.

(((Wacky idea: I understand that testers step over each die on the unsawn wafer, pressing probe wires to the die's pads, powering it up, and running some test circuits. I wonder, is it practical to add power, ground, and JTAG-like test paths, between dice, to interconnect the dice on the unsawn wafer and thereby test entire wafers in parallel? You would still need to step the tester over each die to check out I/O defects, but since most internal logic defects would already have been diagnosed, the tester would not need to spend much time on known bad dice. Then you collect the self-test and tester-based test results and saw and keep the good dice, the EasyPath dice, etc.)))

Altera APEX: BTW in APEX parts, Altera reportedly uses redundancy to cope with defects -- see http://www.altera.com/products/devices/apex/apx-redundancy.html

EasyPath: Since only a fraction of FPGA transistors matter for a given configuration, the keen idea of the EasyPath product, as I understand it, is to qualify partially defective dice against a fixed configuration (or at any rate, a set of test configurations) that covers the resources required by the fixed configuration.

That said, factory-perfect FPGAs may still have failures in the field. Coping with those failures is a rich subject.
Here are just a few comments.

1. You can use readback to read the configuration bitstream. You can even read it back to an internal circuit within the FPGA. There you can compute a signature on the bitstream and so detect if it has changed through some kind of SRAM upset. You can even continuously read back the configuration and check that it is pristine every second (or more often than that).

2. In one FPGA you can build two or more processors, and run them in lock step, comparing the write-back results of each processor each cycle. This can detect when one diverges from the other. I really think this *is* the easiest thing to do, at least to detect faults.

You can also build a TMR (triple modular redundancy) system. (And I think I would have more confidence in a system done across three FPGAs than all on one.)

And as in big systems you can always put EDAC (ECC) on the buses and/or RAMs in your system.

Designers of aerospace systems have to worry about this all the time. See for example the MAPLD Conference sessions: http://klabs.org/richcontent/MAPLDCon02/MAPLDCon02.html

Jan Gray, Gray Research LLC
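The readback-signature and lock-step/TMR comments above can be modelled in software. What follows is a minimal Python sketch, not an FPGA implementation: in a real design these checks would be logic inside the device, and the readback data would probably need bits that legitimately change at run time masked out first. All function names are hypothetical, and CRC-32 is just one possible choice of signature.

```python
import zlib


def signature(bitstream: bytes) -> int:
    """Signature of a configuration image (CRC-32 chosen for this sketch)."""
    return zlib.crc32(bitstream)


def config_is_pristine(readback: bytes, golden_signature: int) -> bool:
    """Compare the signature of a fresh readback against the known-good one."""
    return signature(readback) == golden_signature


def lockstep_agree(result_a: int, result_b: int) -> bool:
    """Two lock-stepped processors: detect divergence, but not which one is wrong."""
    return result_a == result_b


def tmr_vote(a: int, b: int, c: int):
    """Three units: return (majority value, index of the dissenting unit or None)."""
    if a == b:
        return a, (None if c == a else 2)
    if a == c:
        return a, 1
    if b == c:
        return b, 0
    raise RuntimeError("no two units agree; result cannot be trusted")
```

For example, `tmr_vote(5, 7, 5)` returns `(5, 1)`: the result is 5 and unit 1 disagreed. The difference between the two schemes shows up in the return values: the two-way compare can only flag a fault, while the three-way vote can also mask it and point at the suspect unit.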
|
|
|
"Jan Gray" <> wrote: > Each and every one of those transistors test out "perfectly" at the > factory. I understand that the tester downloads a number of > configuration bitstreams that fully exercise and cover the configuration ^^^^^ > memory, the CLBs, interconnect, etc. Dream on... :-) http://groups.google.com/groups?hl=en&th=4e7ce1c83a7baa68&seekm=20010731.103308.1239036029.24248%40polybus.com > (((Wacky idea: I understand that testers step over each die on the > unsawn wafer, pressing probe wires to the die's pads, powering it up, > and running some test circuits. I wonder, is it practical to add power, > ground, and JTAG-like test paths, between dice, to interconnect the dice > on the unsawn wafer and thereby test entire wafers in parallel? But there may be flaws anywhere on the wafer; actually some parts of the wafer are quite likely to be non-functional. Anything in a JTAG-like chain behind some non-functional part cannot be observed... So you have to be able to route around flaws with unpredictable locations and with uncertain distribution. That's quite a bit more involved than a JTAG chain. Worse, you'll have to get functional interconnect across the entire wafer somehow. This is tricky with a wafer stepper (i.e. most modern fabs expose wafers only one die at a time, and aren't set up to produce working circuits between those dies). I know it has been done with wafer steppers, but requires alignment between the steps and support for funny design rules; expect it to come at a cost... - Reinoud |
|
|
|
> Dream on... :-)
> http://groups.google.com/groups?hl=en&th=4e7ce1c83a7baa68&seekm=20010731.103308.1239036029.24248%40polybus.com

Thanks, great thread, which I had read, but which did not come to mind. Xilinx claims "all devices are 100% functionally tested" in their data sheets (e.g. for Virtex-II). Glaringly absent from this thread were official explanations or clarifications from Xilinx about what their 100% functionally tested assertion entails.

> > (((Wacky idea)))
> But ...

Yes, I had thought of those two concerns. I thought redundant paths might help with the first problem, and wide (10 micron) easily registered test interconnect might help with the second -- it need only be top layer metal, not active layer stuff. Of greater concern -- I assume defects can short out a die and thereby short out the whole wafer were it all "wired together".

Bear with me -- once every 10 "wacky ideas" I might stumble on to something good.

Thanks for the push-back,
Jan Gray, Gray Research LLC
|
|
|
"Jan Gray" <> wrote: > Xilinx claims "all devices are 100% functionally tested" in their data > sheets (e.g. for Virtex-II). Yeah, and Bill Gates himself claims MS products have no bugs ;-). Seriously, it seems impossible to fully test state-of-the-art chips (at reasonable cost). FPGAs are no exception, even with their simple architecture. Just look at all the possible time- and data-dependent failure modes. I wasn't surprised in the least by that c.a.fpga thread (although I'm afraid the Xilinx people actually believe their claims). > > > (((Wacky idea))) > > But ... > > Yes, I had thought of those two concerns. I thought redundant paths > might help with the first problem, and wide (10 micron) easily > registered test interconnect might help with the second -- it need only > be top layer metal, not active layer stuff. Agreed (although the 10 micron figure may be rather optimistic). > Of greater concern -- I > assume defects can short out a die and thereby short out the whole wafer > were it were all "wired together". Yeah, but those shorts will be small and disappear in a little puff of smoke in the first usec after connecting power :-). I don't think that'll be a problem. > Bear with me -- once every 10 "wacky ideas" I might stumble on to > something good. Oh, but I think it's actually a good idea... I have a terrible tendency to immediately argue opposing views, no matter my own preference (or good manners;). Sorry! And _please_ keep the 'wacky' ideas coming; life, computer architecture and everything would be so boring without them... And, while I'm at it - thanks for your mailing list, web site, postings, papers and designs. They're great. - Reinoud PS. Ever considered a more 'open source' license for your designs? |
|
Jan

>(((Wacky idea: I understand that testers step over each die on the

I don't know if there is anything left on the web, but in Cambridge, UK, there was a company called Anamartic, which has since run out of venture capital. Anyway, they were making solid state disks by taking an entire wafer, doing some magic in electronics and firmware to figure out the bad blocks, and setting up routing through the entire wafer. In the process they produced all sorts of IP for routing etc. on wafer, which might be around if you want to have a play.

Veronica