
Pipelined 6502/z80 with cache and 16x clock multiplier

Started by Brett Davis December 19, 2010
In article <4d1f64ec$0$3034$afc38c87@news.optusnet.com.au>,
 <kym@kymhorsell.com> wrote:
>In comp.arch MitchAlsup <MitchAlsup@aol.com> wrote:
>...
>> There was a rev of TOPS-10 that would timeout when accessing a
>> particular memory (OS) structure on the KIs. Either DEC added another
>> level of indirection, or rearranged the memory footprint so that the
>> timer timeout was exposed. I was going to mention this, but thought
>> "just let it go".
>
>I'm not sure what you mean. "A particular memory (OS) structure" -- does
>that mean some specific O/S table had a timeout on it, or does it
>mean that if an indirect was stuck in self ref it would get timed out?
>No matter.
>
>With this nasty @x feature you could easily hook 1000s of
>locations together and have a long loop from one to the next and back to
>the first again. I don't think anyone ever did *that*. But you
>never know who might use such a trick to (e.g.) implement a free list for
>a LISP interpreter.
There is another feature at play here, AFAIR. When the indirect chain is interrupted, the original instruction is stopped, the PC is saved, and a context swap is done. When the interrupt is done and the machine gets back to scheduling the instruction again, the whole thing has to start over and be evaluated from the beginning. With a sufficiently long chain on a sufficiently memory-starved machine, this set of events may never terminate.
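To make that livelock concrete, here is a small, self-contained C model (an editorial sketch with made-up names such as Word and resolve_effective_address, not an exact model of PDP-10 behaviour) of an effective-address walk that follows the indirect bit until it clears, and that has to be restarted from the original instruction word whenever an interrupt arrives mid-walk:

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Toy model of PDP-10 effective-address calculation.
     * Field layout and names are simplified/hypothetical. */
    typedef struct {
        bool     indirect;  /* the '@' bit: one more level of indirection */
        uint32_t address;   /* address field (index registers omitted) */
    } Word;

    #define MEMSIZE 8
    static Word memory[MEMSIZE];

    static int interrupts_left = 3;          /* pretend interrupts keep arriving */
    static bool interrupt_pending(void) { return interrupts_left-- > 0; }

    /* Follow the indirect chain. Returns true and stores the effective address
     * in *ea; returns false if an "interrupt" arrives mid-walk, in which case
     * the machine described above abandons the partial walk and re-executes
     * the whole instruction after the interrupt is serviced. */
    static bool resolve_effective_address(Word w, uint32_t *ea)
    {
        while (w.indirect) {
            if (interrupt_pending())
                return false;                /* restart from scratch later */
            w = memory[w.address % MEMSIZE]; /* one more level */
        }
        *ea = w.address;
        return true;
    }

    int main(void)
    {
        /* Build a short indirect chain: 0 -> 1 -> 2 -> final address 42. */
        memory[0] = (Word){ true, 1 };
        memory[1] = (Word){ true, 2 };
        memory[2] = (Word){ false, 42 };

        uint32_t ea;
        int attempts = 0;
        while (!resolve_effective_address(memory[0], &ea))
            attempts++;                      /* each failure = full re-execution */

        printf("effective address %u after %d restarts\n", (unsigned)ea, attempts);
        return 0;
    }

If interrupts keep arriving faster than the chain can be walked, the restart counter never stops climbing and the instruction never completes, which is exactly the non-terminating case described above.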
>> robustness
I certainly don't miss the quality issues with hardware from the era before RISC processors, RAID and real networks.

-- mrr
In article <8e86v7-266.ln1@laptop.reistad.name>,
Morten Reistad  <first@last.name> wrote:
>
>>> robustness
>
>I certainly don't miss the quality issues with hardware from the era
>before RISC processors, RAID and real networks.
So you much prefer the current failure modes? Yes, they are much rarer, but typically FAR more evil when they occur - just as with modern versus older automobiles. If you do a proper cost-benefit analysis (i.e. using game theory, not benchmarketing), modern systems aren't as much better as most people think.

Some of that could be improved by proper documentation and not just some recipes to follow when all goes well, and more could be improved by putting more resources into better and more pervasive diagnostics, but some of the degradation is fundamental. Where timing problems were rare and obscure, now they are common and ubiquitous.

Even 40 years ago, it was EXTREMELY rare to have to cancel a whole project because of a failure mode IN PRODUCTION EQUIPMENT which couldn't be located or even reduced to a tolerable level, but nowadays it is merely unusual. In a few decades, it may even become common.

Regards,
Nick Maclaren.
In comp.arch nmm1@cam.ac.uk wrote:
...
> diagnostics, but some of the degradation is fundamental. Where
> timing problems were rare and obscure, now they are common and
> ubiquitous.
>
> Even 40 years ago, it was EXTREMELY rare to have to cancel a whole
> project because of a failure mode IN PRODUCTION EQUIPMENT which
> couldn't be located or even reduced to a tolerable level, but
> nowadays it is merely unusual. In a few decades, it may even
> become common.
...

There is something to this. :)

A couple of decades back embedded work was fairly straightforward. Components may have been trivial and slow, but because of that hooking them together was generally straightforward, and the "mental model" needed to get things to work as expected was simple, too. You didn't need (as now) to rely on masses of very buggy documentation to make progress.

I remember a few projects in the "early days" making microprocessors (Z80, 6809, 68k, and even the odd 8080 in the *very* early days) do things they were never "designed" for, and generally ending up with something that did a job reliably. There were still quite a few "undocumented features" you'd run across, but they maybe tended to provide shortcuts rather than roadblocks.

Just a couple of years back I worked on an embedded system to provide simultaneous data, SMS and multi-channel voice over a 3G network. Not only was the wireless module quirky (I am being charitable), with at least 50% of its executive-summary functionality undocumented and maybe not entirely thought out, but the large multinational responsible seemed uncooperative in getting our product past the prototype stage. If it weren't for some arm twisting from our arm-twisting dept vis-a-vis some regional company rep, the project would have foundered. Timing issues abounded, and the basic design of the module seemed designed to make operation unreliable at best.

After various people assured us the provided documentation was completely up to date, the regional rep managed to send us tantalising photocopies of clearly more recent documentation that described features we needed to co-ordinate operations. Not that it entirely worked as described. :)

We ended up just having to wear the concurrency issues and put in a few "grand mal" resets, numerous sleeps and timeouts with empirically determined max parameters, etc., at judicious points to try to discourage and then recover from various races, deadlocks and starvations.

The development of consumer-level products is largely a matter of stage magic. Provided the end user (or even your supervisor :) doesn't know exactly what your gadget is doing, it can *appear* to work fine. As in the music hall, a bit of misdirection in the form of a "simplified explanation" or two, a few flashing LEDs, and a couple of potted "information messages" can convince observers the product is not only doing its job but miraculously exceeding design specs.

Just -- *please* -- don't look behind the curtain.

--
Generally, an empty answer. Try again.
  -- John Stafford <nhoj@droffats.net>, 08 Dec 2010 10:16:59 -0600
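The "sleeps, timeouts and escalating resets" recovery pattern described in the post above is easy to picture as code. Below is a minimal, self-contained C sketch of that idea; the module functions, the AT command string and the tuning constants are all hypothetical stand-ins, not the actual module's API, and the numbers would have to be found empirically just as in the post.

    #include <stdbool.h>
    #include <stdio.h>
    #include <unistd.h>   /* sleep() */

    /* Hypothetical stand-ins for the flaky wireless module; the real
     * module's API is, as the post says, largely undocumented. */
    static int flakiness = 5;   /* pretend the 6th attempt finally succeeds */

    static bool module_send_command(const char *cmd)
    {
        (void)cmd;
        return --flakiness <= 0;            /* fails a few times, then works */
    }
    static void module_soft_reset(void)  { /* reinit UART, resync, etc.      */ }
    static void module_power_cycle(void) { /* toggle the module's power rail */ }

    /* Empirically determined tuning knobs, as in the post. */
    enum {
        SETTLE_SECONDS = 1,   /* quiet time after each attempt               */
        SOFT_RETRIES   = 3,   /* soft resets before escalating               */
        HARD_RETRIES   = 2    /* "grand mal" power cycles before giving up   */
    };

    /* Send a command, recovering from races and hangs by escalating resets. */
    static bool send_with_recovery(const char *cmd)
    {
        for (int hard = 0; hard <= HARD_RETRIES; hard++) {
            for (int soft = 0; soft <= SOFT_RETRIES; soft++) {
                if (module_send_command(cmd))
                    return true;            /* it worked this time           */
                sleep(SETTLE_SECONDS);      /* let the race die down         */
                module_soft_reset();
            }
            module_power_cycle();           /* the "grand mal" reset         */
            sleep(SETTLE_SECONDS * 5);
        }
        return false;                       /* give up, log, carry on        */
    }

    int main(void)
    {
        printf("command %s\n", send_with_recovery("AT+CGATT=1") ? "accepted"
                                                                : "abandoned");
        return 0;
    }

The design point is simply that each escalation level gets its own empirically chosen settle time, and the caller only ever sees "worked" or "gave up".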
In article <ifpugs$v1f$1@gosset.csi.cam.ac.uk>,  <nmm1@cam.ac.uk> wrote:
>In article <8e86v7-266.ln1@laptop.reistad.name>,
>Morten Reistad <first@last.name> wrote:
>>
>>>> robustness
>>
>>I certainly don't miss the quality issues with hardware from the era
>>before RISC processors, RAID and real networks.
>
>So you much prefer the current failure modes? Yes, they are much
>rarer, but typically FAR more evil when they occur - just as with
>modern versus older automobiles. If you do a proper cost-benefit
>analysis (i.e. using game theory, not benchmarketing), modern
>systems aren't as much better as most people think.
But if you do a proper systems analysis, they are. Because they are cheap, you can have multiple systems. With different components.

And we have tools to handle faults.

We can use RAID for disks. And multiple power sources.

Done right, we can afford to throw one out.
>Some of that could be improved by proper documentation and not
>just some recipes to follow when all goes well, and more could be
>improved by putting more resources into better and more pervasive
>diagnostics, but some of the degradation is fundamental. Where
>timing problems were rare and obscure, now they are common and
>ubiquitous.
>
>Even 40 years ago, it was EXTREMELY rare to have to cancel a whole
>project because of a failure mode IN PRODUCTION EQUIPMENT which
>couldn't be located or even reduced to a tolerable level, but
>nowadays it is merely unusual. In a few decades, it may even
>become common.
One PPOE had a principle of _always_ having separate implementations of all critical systems, running as live as possible. I learnt a lot from that. We even found a floating point bug in hardware.

But the point you are making is important. The open hardware movements are important, because we need the transparency.

It is not just that the driver works with Linux. It is that you can actually see what it is doing.

And yes, we have to be a lot more proactive on this front.

-- mrr
In article <4d209861$0$3428$afc38c87@news.optusnet.com.au>,
 <kym@kymhorsell.com> wrote:
>In comp.arch nmm1@cam.ac.uk wrote:
>...
>> diagnostics, but some of the degradation is fundamental. Where
>> timing problems were rare and obscure, now they are common and
>> ubiquitous.
>>
>> Even 40 years ago, it was EXTREMELY rare to have to cancel a whole
>> project because of a failure mode IN PRODUCTION EQUIPMENT which
>> couldn't be located or even reduced to a tolerable level, but
>> nowadays it is merely unusual. In a few decades, it may even
>> become common.
>...
>
>There is something to this. :)
>
>A couple of decades back embedded work was fairly straightforward.
>Components may have been trivial and slow, but because of that hooking
>them together was generally straightforward, and the "mental model"
>needed to get things to work as expected was simple, too. You didn't
>need (as now) to rely on masses of very buggy documentation to make
>progress.
It is evident that this poster never handled SMD or ESMD disks, large x.25 network devices, MAU-based token ring, or pre-internet multiplexing equipment.
>I remember a few projects in the "early days" making microprocessors (Z80,
>6809, 68k, and even the odd 8080 in the *very* early days)
>do things they were never "designed" for, and generally ending up with
>something that did a job reliably. There were still quite
>a few "undocumented features" you'd run across, but they maybe tended to
>provide shortcuts rather than roadblocks.
The 6502 and the other 650x processors had a lot of surprises, and they were not exactly a showcase in terms of documentation.
>Just a couple of years back I worked on an embedded system to provide
>simultaneous data, SMS and multi-channel voice over a 3G network. Not only
>was the wireless module quirky (I am being charitable), with at least 50%
>of its executive-summary functionality undocumented and maybe not entirely
>thought out, but the large multinational responsible seemed uncooperative
>in getting our product past the prototype stage. If it weren't for some
>arm twisting from our arm-twisting dept vis-a-vis some regional company
>rep, the project would have foundered. Timing issues abounded, and the
>basic design of the module seemed designed to make operation unreliable
>at best.
Bad designs exist everywhere. But get the contract right, and they have to deliver, or perish.

The extreme top-down Telco model for implementation never worked. Not then, not now. Read RFC 875 for an ideological handle on it.
>After various people assured us the provided documentation was
>completely up to date, the regional rep managed to send us
>tantalising photocopies of clearly more recent documentation that
>described features we needed to co-ordinate operations. Not that
>it entirely worked as described. :)
>
>We ended up just having to wear the concurrency issues and put
>in a few "grand mal" resets, numerous sleeps and timeouts with
>empirically determined max parameters, etc., at judicious points to try
>to discourage and then recover from various races, deadlocks and
>starvations.
>
>The development of consumer-level products is largely a matter
>of stage magic. Provided the end user (or even your supervisor :)
>doesn't know exactly what your gadget is doing, it can *appear* to work
>fine. As in the music hall, a bit of misdirection in the form
>of a "simplified explanation" or two, a few flashing LEDs,
>and a couple of potted "information messages" can convince observers
>the product is not only doing its job but miraculously exceeding design
>specs.
>
>Just -- *please* -- don't look behind the curtain.
Perhaps you are ready for the internet model of consensus and working systems now?

-- mrr
Morten Reistad wrote:
> In article <e073c9bf-50f4-45e1-97e2-11a5354c980b@g25g2000yqn.googlegroups.com>,
> MitchAlsup <MitchAlsup@aol.com> wrote:
>> On Dec 30, 7:56 am, Terje Mathisen <"terje.mathisen at tmsw.no">
>> wrote:
>>> Morten Reistad wrote:
>>>> I keep thinking what could be done if a classic machine
>>>> like a PDP11 or a PDP10 was made with a modern process, no
>>>> microcode in core instructions, and we substitute L2 cache
>>>> for main memory, and RAM for disk. And hyperchannel for
>>>> I/O.
>>>
>>> Afair, both of them had memory indirect addressing?
>>
>> The PDP-10 had infinite indirect memory addressing--the addressed word
>> from memory contained a bit to indicate if another level of
>> indirection was to be performed.
>
> For the ones who do not know the PDP10 instruction set: The address
> calculation and the instruction execution are totally separate on this
> machine.
>
> You could do stuff like MOVEI A,10(B), which would add 10 to
> the value in register B, placing the result in register A.
That seems _very_ similar to LEA EAX,[EBX+10] on an x86, which also has separate address calc and integer execution paths.

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"
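For readers who want to see the parallel in compiler-output terms, here is a tiny C sketch (illustrative only; the function name is made up, and whether you actually get an LEA depends on the target and optimisation level): the address-generation path computes base-plus-displacement as plain integer arithmetic, with no memory reference, which is what both MOVEI A,10(B) and LEA EAX,[EBX+10] do.

    #include <stdint.h>
    #include <stdio.h>

    /* The sort of expression the effective-address hardware computes:
     * base + displacement, with no load or store. On x86-64, GCC and Clang
     * will usually compile the body to a single LEA (e.g. lea rax,[rdi+10]);
     * on a PDP-10 the analogue is MOVEI A,10(B). */
    static uintptr_t add_ten(uintptr_t b)
    {
        return b + 10;      /* address arithmetic, no memory access */
    }

    int main(void)
    {
        printf("%lu\n", (unsigned long)add_ten(32));   /* prints 42 */
        return 0;
    }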
Morten Reistad wrote:
> In article <ifpugs$v1f$1@gosset.csi.cam.ac.uk>, <nmm1@cam.ac.uk> wrote:
>> So you much prefer the current failure modes? Yes, they are much
>> rarer, but typically FAR more evil when they occur - just as with
>> modern versus older automobiles. If you do a proper cost-benefit
>> analysis (i.e. using game theory, not benchmarketing), modern
>> systems aren't as much better as most people think.
>
> But if you do a proper systems analysis, they are. Because they
> are cheap, you can have multiple systems. With different components.
The last sentence is the key:

Yes, you _CAN_ have redundant systems with different components, but I have yet to see a single vendor who will certify and/or recommend this! Instead they want you to make sure that the hardware and software are as identical as possible on each node, significantly increasing the risk of a common-mode hardware problem hitting all nodes at the same time.

E.g., NetWare's System Fault Tolerant setup mirrored the state between two servers, so that the slave could take over more or less immediately (i.e. well within the software timeout limits). I always wanted those two servers to use totally separate motherboards, CPUs, disk and network controllers, etc., but was told that the HW had to be identical. :-(
>
> And we have tools to handle faults.
>
> We can use RAID for disks. And multiple power sources.
>
> Done right, we can afford to throw one out.
[snip]
> One PPOE had a principle of _always_ having separate implementations
> of all critical systems, running as live as possible. I learnt a lot
> from that. We even found a floating point bug in hardware.
That's very interesting, I'll have to get the full story from you at some point in time. :-)

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"
In article <gaj7v7-af2.ln1@laptop.reistad.name>,
Morten Reistad  <first@last.name> wrote:
>>>
>>>>> robustness
>>>
>>>I certainly don't miss the quality issues with hardware from the era
>>>before RISC processors, RAID and real networks.
>>
>>So you much prefer the current failure modes? Yes, they are much
>>rarer, but typically FAR more evil when they occur - just as with
>>modern versus older automobiles. If you do a proper cost-benefit
>>analysis (i.e. using game theory, not benchmarketing), modern
>>systems aren't as much better as most people think.
>
>But if you do a proper systems analysis, they are. Because they
>are cheap, you can have multiple systems. With different components.
>
>And we have tools to handle faults.
>
>We can use RAID for disks. And multiple power sources.
>
>Done right, we can afford to throw one out.
I am afraid that you have completely missed the point. To a very good first approximation, any problem that is localised within a single component is trivial; the hard ones are all associated with the global infrastructure or the interfaces between components. And remember the 80:20 rule - eliminating the 80% of the problems that account for only 20% of the cost isn't a great help.

Even worse, almost all of the tools to handle faults are intended to make it possible for a trained chimpanzee to deal with the 80% of trivial faults, and completely ignore the 20% of nasty ones. In a bad case, the ONLY diagnostic information is through the tool, and it says that there is no problem, that the problem is somewhere it demonstrably isn't, or is similarly useless.

Let me give you just one example. A VERY clued-up colleague had a RAID controller that went sour, so he replaced it. Unfortunately, the dying controller had left the system slightly inconsistent, so the new controller refused to take over and wanted to reinitialise all of the disks. Yes, he had a backup, but it would have taken a week to do a complete reload (which was why he was using a fancy RAID system in the first place).

He solved the problem by mounting each disk, cleaning it up using an unrelated 'fsck', manually fiddling a few key files, and then restarting the controller. Damn few people CAN do that, because none of the relevant structure was documented, and the whole process was unsupported.

I have several times had a problem where even the vendor admitted defeat, and where a failure to at least bypass the problem would mean that a complete system would have had to be written off before going into production. Each took me over a hundred hours of hair-tearing.
>One PPOE had a principle of _always_ having separate implementations
>of all critical systems, running as live as possible. I learnt a lot
>from that. We even found a floating point bug in hardware.
Well, yes. I regret not having access to a range of systems any longer - inter alia, it makes it hard to check code for portability.
>But the point you are making is important. The open hardware
>movements are important, because we need the transparency.
>
>It is not just that the driver works with Linux. It is that you
>can actually see what it is doing.
>
>And yes, we have to be a lot more proactive on this front.
I fully agree with that.

Regards,
Nick Maclaren.
In article <ifscp5$5l3$1@gosset.csi.cam.ac.uk>,  <nmm1@cam.ac.uk> wrote:
>In article <gaj7v7-af2.ln1@laptop.reistad.name>,
>Morten Reistad <first@last.name> wrote:
>>>>
>>>>I certainly don't miss the quality issues with hardware from the era
>>>>before RISC processors, RAID and real networks.
>>>
>>>So you much prefer the current failure modes? Yes, they are much
>>>rarer, but typically FAR more evil when they occur - just as with
>>>modern versus older automobiles. If you do a proper cost-benefit
>>>analysis (i.e. using game theory, not benchmarketing), modern
>>>systems aren't as much better as most people think.
>>
>>But if you do a proper systems analysis, they are. Because they
>>are cheap, you can have multiple systems. With different components.
>>
>>And we have tools to handle faults.
>>
>>We can use RAID for disks. And multiple power sources.
>>
>>Done right, we can afford to throw one out.
>
>I am afraid that you have completely missed the point. To a very
>good first approximation, any problem that is localised within a
>single component is trivial; the hard ones are all associated with
>the global infrastructure or the interfaces between components.
>And remember the 80:20 rule - eliminating the 80% of the problems
>that account for only 20% of the cost isn't a great help.
If you want real redundancy you need sufficient separation between systems, and transparency in the failover methods. Today this rules out tightly coupled systems, like RAID controllers, "intelligent" switches and fancy hardware failovers.

RAID controllers, EtherChannel, multiple power supplies and separate processors are still used, but the performance issues are as important as the redundancy. The redundancy, or really the extra uptime, these bring is "nice to have", but not something to depend on.

For real redundancy you need separate power, as in feeds from different mains stations at least, a separate network, and physical separation.
>Even worse, almost all of the tools to handle faults are intended
>to make it possible for a trained chimpanzee to deal with the 80%
>of trivial faults, and completely ignore the 20% of nasty ones.
>In a bad case, the ONLY diagnostic information is through the tool,
>and it says that there is no problem, that the problem is somewhere
>it demonstrably isn't, or is similarly useless.
>
>Let me give you just one example. A VERY clued-up colleague had
>a RAID controller that went sour, so he replaced it. Unfortunately,
>the dying controller had left the system slightly inconsistent, so
>the new controller refused to take over and wanted to reinitialise
>all of the disks. Yes, he had a backup, but it would have taken a
>week to do a complete reload (which was why he was using a fancy
>RAID system in the first place).
>
>He solved the problem by mounting each disk, cleaning it up using
>an unrelated 'fsck', manually fiddling a few key files, and then
>restarting the controller. Damn few people CAN do that, because
>none of the relevant structure was documented, and the whole process
>was unsupported.
A "trust me" tool, tightly coupled to the system, without transparancy, from a single vendor.
>I have several times had a problem where even the vendor admitted
>defeat, and where a failure to at least bypass the problem would
>mean that a complete system would have had to be written off
>before going into production. Each took me over a hundred hours
>of hair-tearing.
At what point in the deployment did this show up?
>>One PPOE had a principle of _always_ having separate implementations
>>of all critical systems, running as live as possible. I learnt a lot
>>from that. We even found a floating point bug in hardware.
>
>Well, yes. I regret not having access to a range of systems any
>longer - inter alia, it makes it hard to check code for portability.
I keep Linux, FreeBSD and OpenBSD around. And in the cases where we only have Linux support, I deliberately install 64-bit systems in location A and 32-bit in location B.
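As an editorial aside, here is a small, self-contained C example (not from the thread) of the sort of latent bug that running the same code on both a 32-bit and a 64-bit deployment tends to expose: stashing a pointer in an int happens to work on the 32-bit box and silently truncates on the 64-bit one.

    #include <stdint.h>
    #include <stdio.h>

    /* A classic 32/64-bit portability trap: storing a pointer in an int.
     * On a 32-bit (ILP32) build this happens to round-trip; on a 64-bit
     * (LP64) build the cast drops the upper half of the address.
     * Running both builds is what catches it. */
    int main(void)
    {
        int x = 42;

        int cookie = (int)(intptr_t)&x;        /* BAD: may truncate on LP64 */
        int *p_bad = (int *)(intptr_t)cookie;  /* may no longer point at x  */

        intptr_t ok  = (intptr_t)&x;           /* GOOD: intptr_t is wide enough */
        int *p_good  = (int *)ok;

        printf("sizeof(int)=%zu sizeof(void*)=%zu\n", sizeof(int), sizeof(void *));
        printf("good: %d, bad pointer %s the original\n",
               *p_good, (p_bad == &x) ? "matches" : "does NOT match");
        return 0;
    }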
>>But the point you are making is important. The open hardware
>>movements are important, because we need the transparency.
>>
>>It is not just that the driver works with Linux. It is that you
>>can actually see what it is doing.
>>
>>And yes, we have to be a lot more proactive on this front.
>
>I fully agree with that.
-- mrr
On Mon, 03 Jan 2011 11:01:44 +0100, Terje Mathisen <"terje.mathisen at tmsw.no"> wrote:

>Morten Reistad wrote:
>> In article <ifpugs$v1f$1@gosset.csi.cam.ac.uk>, <nmm1@cam.ac.uk> wrote:
>>> So you much prefer the current failure modes? Yes, they are much
>>> rarer, but typically FAR more evil when they occur - just as with
>>> modern versus older automobiles. If you do a proper cost-benefit
>>> analysis (i.e. using game theory, not benchmarketing), modern
>>> systems aren't as much better as most people think.
>>
>> But if you do a proper systems analysis, they are. Because they
>> are cheap, you can have multiple systems. With different components.
>
>The last sentence is the key:
>
>Yes, you _CAN_ have redundant systems with different components, but I
>have yet to see a single vendor who will certify and/or recommend this!
One of my customers uses systems capable of running on various HW platforms (both big- and little-endian) on different base operating systems, and there should not be much of a problem implementing the same functionality on different HW. Still, using platform diversity did not create much interest, since, after all, the same application-level software would be used.

The only publicly known truly redundant software that I have heard of is in the US Space Shuttle, with more or less triple (voting) flight control computers and a 4th independent computer, programmed by a different team, capable of (only) landing the Shuttle.
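As a toy illustration of the voting idea (an editorial sketch, not flight software; vote3 and the values are made up), here is a 2-out-of-3 majority voter over three independently computed command values:

    #include <stdio.h>

    /* 2-out-of-3 majority vote over redundant channels. Returns the agreed
     * value and flags a miscompare so the odd channel out can be reported.
     * Only an illustration of the idea, not real flight software. */
    static int vote3(int a, int b, int c, int *disagreement)
    {
        *disagreement = !(a == b && b == c);
        if (a == b || a == c) return a;   /* a is in the majority */
        if (b == c)           return b;   /* a is the odd one out */
        *disagreement = 1;
        return a;                         /* no majority: pick one, raise alarm */
    }

    int main(void)
    {
        int bad;
        /* Channel 3 has gone off in the weeds; the other two outvote it. */
        int cmd = vote3(100, 100, 97, &bad);
        printf("commanded value %d%s\n", cmd, bad ? " (miscompare flagged)" : "");
        return 0;
    }

A real system would also record which channel disagreed so that it can be taken out of the voting set.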
