This list is for discussion of the design and implementation of field-programmable gate array based processors and integrated systems. It is also for discussion and community support of the XSOC Project (see http://www.fpgacpu.org/xsoc).
|
It shouldn't be a surprise that with a stack machine for every memory fetch there is a consequent stack manipulation. So, we're looking at 2 memory accesses for every instruction, minimum. This type of machine screams for a Harvard architecture which, heretofore, I had considered an oddity. Silly me, I get nearly a 100% speedup with the increased bandwidth. So, the issue is: which prototype board will allow me to have 2 completely separate memory systems 32 bits wide with a 17 or 18 bit address. One board that I am considering is http://www.easyfpga.com/ez3susb_features.htm This board is quite compact - it doesn't have anything except a 400k gate Spartan 3 and a USB interface. That would mean that all I/O would have to come from somewhere else over a somewhat slow USB link. But that might be very interesting - a Visual Basic program to handle console functions, perhaps some debugging information, file transfer and the like. Interesting... Another board I am considering is https://digilent.us/Sales/Product.cfm?Prod=D2FT-DIO5 where there is a possiblity of connecting 2 memory systems. The combination kit has some advantages but once I fill up the edge connectors with memory I won't have a way to get off board for I/O such as the IDE. I would eventually have to lose the peripheral card but it would be very useful for debugging. Any other suggestions for a board with a bunch of edge connections? I'm not hung up on Xilinx, Altera is certainly in the running if I can find a board with the required configuration. |
|
|
|
rtstofer wrote: > > It shouldn't be a surprise that with a stack machine for every > memory fetch there is a consequent stack manipulation. So, we're > looking at 2 memory accesses for every instruction, minimum. This > type of machine screams for a Harvard architecture which, > heretofore, I had considered an oddity. Silly me, I get nearly a > 100% speedup with the increased bandwidth. > > So, the issue is: which prototype board will allow me to have 2 > completely separate memory systems 32 bits wide with a 17 or 18 bit > address. Most stack machines are often FORTH and you only need a small amount of stack space say 256 words if you go with that langauge. How ever this stack could be internal in the FPGA. The same goes for the Forth return stack as well. Ben. |
|
|
|
At 12:17 AM 10/23/2004, you wrote:
>It shouldn't be a surprise that with a stack machine for every >memory fetch there is a consequent stack manipulation. So, we're >looking at 2 memory accesses for every instruction, minimum. This >type of machine screams for a Harvard architecture which, >heretofore, I had considered an oddity. Silly me, I get nearly a >100% speedup with the increased bandwidth. I'm not clear how you are architecting this. Most stack machines use a separate on chip memory for the stack, often two separate memories for data and return stacks. Likewise, most modern FPGAs have enough memory that you can put the program and data memory on chip as well. This greatly improves the speed and allow each memory to be separate. I found with my stack machine that trying to combine memories and stacks resulted in a lot of muxes on the address busses. This uses a lot of resources and slows down the cycle. Also, on chip memories can be dual port which may be of value. Rick Collins Arius - A Signal Processing Solutions Company Specializing in DSP and FPGA design http://www.arius.com 4 King Ave 301-682-7772 Voice Frederick, MD 21701-3110 301-682-7666 FAX |
|
|
|
All data, especially data local to a procedure, is dynamically allocated on a stack when the procedure is invoked and the stack is cut back when the procedure terminates. If the procedure was actually a function, the return value remains on the top of the stack (TOS) for the calling procedure to use. There are a bunch of exceptions but let's just assume a function returning an integer. Of course, program calling locations (return addresses) are also stored on this stack in a structure known as a mark stack frame. This 5 word data structure contains the return address as well as the contents of the stack pointer, mark stack pointer, a base pointer and space for the return value. This implies that when a procedure returns it is possible to cut the stack back to exactly the point where it was before the call. The memory used by the terminated procedure is returned to the system for reuse. Recursive functions are a good example of how this data stack grows as the function recurses and gets cut back as the function unwinds. Imagine how it would grow if the procedure had, say, 100 words of local variables. There is another data area known as the heap which is just another stack where data is allocated dynamically. A linked list, for example, will grow in this area as new nodes are created. Although it won't be implemented, this is an area where garbage collection strategies can be employed to recover space as structures are released. One stack grows upward in data memory, one stack grows downward; if they meet in the middle memory overflow results and the CPU stops. This is not a small return address stack, everything is allocated here and recursion is encouraged. The entire Pascal compiler uses recursive descent as it translates the source. In fact, one of the best features of Pascal is using recursion while traversing structures. Sure, it can be done in an indexed loop but it isn't as pretty or as intuitive. Code, once loaded into memory, is static. In concept, if there was a small OS, then there would be a small amount of resident code and other program code would be allocated space before execution and cut back when terminated. So the compiler could be loaded using 32K words, or whatever, and when it terminated with results stored on disk the code space would be available for executing (interpreting) the results. Mentally, this can be viewed as a stack where chunks of code are added and removed as different programs are run. Let's take a simple integer addition of the top two elements on the stack that pops the two operands and returns the result as the new TOS (Top Of Stack). To complicate things, let's assume that the TOS element and NOS (Next On Stack) are not really in SRAM but are held in registers connected to the ALU. Pretty simple, the ALU is already connected to TOS and NOS so the result is clocked into TOS. Now, the NOS register is trash and it needs to be reloaded with the element below in SRAM pointed to by SP. So, SP is decremented and a memory fetch refills NOS. Meanwhile, while the stack fetch is going on it is also time to fetch the next instruction and we might as well use the incoming instruction to cause a branch in the FSM to the proper point to execute the instruction - instruction decode is just a branch at the end of the fetch when the Instruction Register is loaded. Oh, and SP might as well be maintained as, mentally, SP- 1 since we will almost always be popping something off the stack. In fact, there should be an adder creating SP-1 and SP+1 at all times. In theory, the addition, the stack fetch, updating SP and instruction fetch could all occur in parallel in a single cycle. If this was interpreted in code it would likely take 4 CISC instructions: pop op1, pop op2, add, push result. Or some similar scheme using index registers. But, the point is, multiple instructions and perhaps multiple data fetches have to occur to complete the operation. Really ugly unless the architecture is designed to support stack operations. But I can't overlap instruction fetch with stack cleanup unless I have two completely separate memory spaces. It's not fatal to the design, I could still execute the integer add in 2 cycles but that has cut machine speed in half. I'm really just thinking out loud. I started playing with the instruction set, looking at how the FSM might execute instructions and one of the first things to jump out was the issue of instruction fetch colliding with stack cleanup. Now, if I really wanted to speed things up, I would interleave the data stack so that I could fetch two operands in a single cycle. I think I'll let that go... I would need 3 separate memory spaces or about 150 pins... Sure, IBM used to fetch 512 bytes at a time but they had more money to spend. There are some other interesting instructions because Pascal is truly a block structured language and nested blocks are encouraged to enforce scope rules. So, if a procedure wants to fetch an operand from its calling procedure's calling procedure (grand parent) then two levels of indirection are required. This n-level indirection occurs whenever a non-local variable is used and 'n' is related to the level of nesting. The instruction format includes the number of levels to indirect plus the offset within the final level to the specific variable. Pascal is a beautiful language but the underlying hardware can be a little awkward. --- In , Arius - Rick Collins <dsprelated@a...> wrote: > At 12:17 AM 10/23/2004, you wrote: > >It shouldn't be a surprise that with a stack machine for every > >memory fetch there is a consequent stack manipulation. So, we're > >looking at 2 memory accesses for every instruction, minimum. This > >type of machine screams for a Harvard architecture which, > >heretofore, I had considered an oddity. Silly me, I get nearly a > >100% speedup with the increased bandwidth. > > I'm not clear how you are architecting this. Most stack machines use a > separate on chip memory for the stack, often two separate memories for data > and return stacks. Likewise, most modern FPGAs have enough memory that you > can put the program and data memory on chip as well. This greatly improves > the speed and allow each memory to be separate. > > I found with my stack machine that trying to combine memories and stacks > resulted in a lot of muxes on the address busses. This uses a lot of > resources and slows down the cycle. Also, on chip memories can be dual > port which may be of value. > > Rick Collins > > rick.collins@a... > > Arius - A Signal Processing Solutions Company > Specializing in DSP and FPGA design http://www.arius.com > 4 King Ave 301-682-7772 Voice > Frederick, MD 21701-3110 301-682-7666 FAX |
|
|
|
rtstofer wrote: ... { stack stuff removed } > There are some other interesting instructions because Pascal is > truly a block structured language and nested blocks are encouraged > to enforce scope rules. So, if a procedure wants to fetch an > operand from its calling procedure's calling procedure (grand > parent) then two levels of indirection are required. This n-level > indirection occurs whenever a non-local variable is used and 'n' is > related to the level of nesting. The instruction format includes > the number of levels to indirect plus the offset within the final > level to the specific variable. Have you looked at the very old Burrough's computers like the B5000? They look to be interesting stack machines, but documents are few on the internet. |
|
|
|
--- ben franchuk <> wrote: > > rtstofer wrote: > > > > It shouldn't be a surprise that with a stack machine > for every > > memory fetch there is a consequent stack manipulation. > So, we're > > looking at 2 memory accesses for every instruction, > minimum. This > > type of machine screams for a Harvard architecture > which, > > heretofore, I had considered an oddity. Silly me, I > get nearly a > > 100% speedup with the increased bandwidth. > > > > So, the issue is: which prototype board will allow me > to have 2 > > completely separate memory systems 32 bits wide with a > 17 or 18 bit > > address. > Most stack machines are often FORTH and you only need a > small amount > of stack space say 256 words if you go with that > langauge. How ever this > stack could be internal in the FPGA. The same goes for > the Forth > return stack as well. > Ben. > Well, I do beleve, that 256 words may be not enough for more complex programs. However knowning the way stack is used (push, pop, peek and peek few elements below top of stack) you can relatively easy change internal fixed size stack to ring like based stack cache. The data are pushed on stack and if stack overflow appears memory write is generated to flush the oldest data and make place for new data. Same for pop - if there is not enough data for deepest possible peek the memory read cycle is generated. This can spare a lot of memory bandwidth and gives high performance gain compared to pure stack-in-shared-memory solution. This also gives you much more flexibility than on-chip stack only. I do beleve that 8 or 16 stack levels ring cache will be enough - most program grows stack by significant number of words then creates a lot of tiny jitter and shrinks it back in one move. The whole 'jitter' stuff will be cached and won't absorb memory. regards, Tomasz Sztejka. ___________________________________________________________ALL-NEW Yahoo! Messenger - all new features - even more fun! http://uk.messenger.yahoo.com |
|
|
|
I am aware that the old Burrough's machines were stack oriented but I haven't really researched them very much. IIRC they were developed to run Algol which is a very close ancestor of Pascal. There is a reason that stack computers aren't popular - I don't think they make efficient use of memory bandwidth or the addition of registers. Then too, most machines aren't really designed for block structured languages. They can be compiled and run, of course, but it is always less than optimal. C has a lot of advantages here in that blocks aren't nested and there are really only two data areas: global or local. In the case of Pascal with perhaps 10 levels of lexical nesting the situation on operand fetch is much more complex. Particularly if a procedure has recursed down a tree. I still don't have many of the details straight in my mind. I have a lot more thinking to do. > Have you looked at the very old Burrough's computers like the B5000? > They look to be interesting stack machines, but documents are few on the > internet. |
|
|
|
The memory system you suggest would be elegant. I'm not smart enough to build it so I'll let it pass. But I do see where it could solve the bandwidth issue. The bookkeeping would be very complex, particularly when it comes to referencing the data segments of enclosing procedures. Maybe complex is the wrong word but it could take several cycles to calculate where the stack frame for the enclosing procedure is located. It doesn't usually have to be loaded, there are a couple of pointers that are required to perform the indirect addressing. What would be ugly is the situation where the local data of a procedure was larger than the internal stack could hold. Some operands could be fetched from BlockRAM and others would require stack fetches. Now, the compiler could be adapted to know this and emit the proper code; but not by me! Still, the idea of maximizing the use of BlockRAM has great merit. I was thinking that this could contain the code space for certain standard procedures such as boot loading, device IO, maybe some functions of the file system. All I have to do is mirror the BlockRAM into the code memory space and I am off and running. If I go with the board having the USB interface then driver code will need to be included here. Oh, and the interpreter FSM better be able to handle certain IO extensions that are not part of Pascal. I have the Televideo 950 emulation in mirrored BlockRAM on my CP/M machine. I didn't want to give up program execution space by having the code in the BIOS so I mirror it in and out as needed. Yes, it slows down the terminal routines but that is a good thing - the screen scrolls by so fast it can't be read anyway. --- In , Tomasz Sztejka <sztejkat@y...> wrote: > --- ben franchuk <bfranchuk@j...> wrote: > > > > rtstofer wrote: > > > > > > It shouldn't be a surprise that with a stack machine > > for every > > > memory fetch there is a consequent stack manipulation. > > So, we're > > > looking at 2 memory accesses for every instruction, > > minimum. This > > > type of machine screams for a Harvard architecture > > which, > > > heretofore, I had considered an oddity. Silly me, I > > get nearly a > > > 100% speedup with the increased bandwidth. > > > > > > So, the issue is: which prototype board will allow me > > to have 2 > > > completely separate memory systems 32 bits wide with a > > 17 or 18 bit > > > address. > > > > > > Most stack machines are often FORTH and you only need a > > small amount > > of stack space say 256 words if you go with that > > langauge. How ever this > > stack could be internal in the FPGA. The same goes for > > the Forth > > return stack as well. > > Ben. > > > Well, I do beleve, that 256 words may be not enough for > more complex programs. However knowning the way stack is > used (push, pop, peek and peek few elements below top of > stack) you can relatively easy change internal fixed size > stack to ring like based stack cache. The data are pushed > on stack and if stack overflow appears memory write is > generated to flush the oldest data and make place for new > data. Same for pop - if there is not enough data for > deepest possible peek the memory read cycle is generated. > This can spare a lot of memory bandwidth and gives high > performance gain compared to pure stack-in-shared-memory > solution. This also gives you much more flexibility than > on-chip stack only. > I do beleve that 8 or 16 stack levels ring cache will be > enough - most program grows stack by significant number of > words then creates a lot of tiny jitter and shrinks it back > in one move. The whole 'jitter' stuff will be cached and > won't absorb memory. > regards, > Tomasz Sztejka. > ___________________________________________________________ALL-NEW Yahoo! Messenger - all new features - even more fun! http://uk.messenger.yahoo.com |
|
Whew! I forgot that this will not be programmed in Forth... :)
At 04:29 AM 10/23/2004, you wrote:
>All data, especially data local to a procedure, is dynamically >allocated on a stack when the procedure is invoked and the stack is >cut back when the procedure terminates. If the procedure was >actually a function, the return value remains on the top of the >stack (TOS) for the calling procedure to use. There are a bunch of >exceptions but let's just assume a function returning an integer. ...snip... Rick Collins Arius - A Signal Processing Solutions Company Specializing in DSP and FPGA design http://www.arius.com 4 King Ave 301-682-7772 Voice Frederick, MD 21701-3110 301-682-7666 FAX |
|
At 05:35 PM 10/23/2004, you wrote:
>I am aware that the old Burrough's machines were stack oriented but I >haven't really researched them very much. IIRC they were developed to >run Algol which is a very close ancestor of Pascal. HP also made stack oriented computers. >There is a reason that stack computers aren't popular - I don't think >they make efficient use of memory bandwidth or the addition of >registers. Then too, most machines aren't really designed for block >structured languages. They can be compiled and run, of course, but it >is always less than optimal. That depends on the hardware. If your stack is separate from memory, rather part of the CPU, then it is a lot like registers, providing fast access to variables and uses *NO* memory bandwidth. I don't think this is good for huge computers with many MB of memory, but for smaller machines it can be very efficient. >C has a lot of advantages here in that blocks aren't nested and there >are really only two data areas: global or local. In the case of >Pascal with perhaps 10 levels of lexical nesting the situation on >operand fetch is much more complex. Particularly if a procedure has >recursed down a tree. I am not aware of a significant difference between C and Pascal in the usage of stack frames. Don't they both use that for local storage? Rick Collins Arius - A Signal Processing Solutions Company Specializing in DSP and FPGA design http://www.arius.com 4 King Ave 301-682-7772 Voice Frederick, MD 21701-3110 301-682-7666 FAX |
|
|
|
Arius - Rick Collins wrote: > Whew! I forgot that this will not be programmed in Forth... :) Ya! A non FORTH stack machine for a change. PS. Another virtual stack architecture here - BCPL. http://www.cl.cam.ac.uk/users/mr/index.html |
|
|
|
<snipped all over the place!> > HP also made stack oriented computers. I didn't know that. More research to do... > >There is a reason that stack computers aren't popular - I don't think > >they make efficient use of memory bandwidth or the addition of > >registers. Then too, most machines aren't really designed for block > >structured languages. They can be compiled and run, of course, but it > >is always less than optimal. > > That depends on the hardware. If your stack is separate from memory, > rather part of the CPU, then it is a lot like registers, providing fast > access to variables and uses *NO* memory bandwidth. I don't think this is > good for huge computers with many MB of memory, but for smaller machines it > can be very efficient. The stack on a Pascal machine is HUGE. Perhaps the active stack reaches 30K words running the compiler and the heap (another stack) is probably similar. In fact, all memory in use is either on the stack or in the heap. There is no static memory for all practical purposes. > >C has a lot of advantages here in that blocks aren't nested and there > >are really only two data areas: global or local. In the case of > >Pascal with perhaps 10 levels of lexical nesting the situation on > >operand fetch is much more complex. Particularly if a procedure has > >recursed down a tree. > > I am not aware of a significant difference between C and Pascal in the > usage of stack frames. Don't they both use that for local storage? I could be all wet here but lexically there are only two levels in C: those variables that are local and usually placed on the stack and those that are global and permanently allocated. Blocks are not nested in C. Consider the nesting of a portion of the Pascal compiler: Program Pascal ..Procedure Block ....Procedure Body ......Procedure Statement ........Procedure Expression ..........Procedure SimpleExpression ............Procedure Term To get to a variable defined in Procedure Block from Procedure Term the run time interpreter (or hardware) has to do indirect addressing through the stack frames until it gets back to the one for Procedure Block. Then it can calculate an actual address by adding in the offset from the base of the stack frame to the actual variable. That's a lot of indirection. And you can't really allocate statically because of recursion. The value in a variable in the most recent invocation of a procedure following recursion does not have to be the same as an encompassing invocation. Next week I am going to sit down and try to figure out all of the op codes and how I plan to implement them. Once I have that laid out in a spreadsheet it may be easier to see what the architecture will look like. This is going to be quite a project! Richard > Rick Collins > > rick.collins@a... > > Arius - A Signal Processing Solutions Company > Specializing in DSP and FPGA design http://www.arius.com > 4 King Ave 301-682-7772 Voice > Frederick, MD 21701-3110 301-682-7666 FAX |
|
<snip> > Well, I do beleve, that 256 words may be not enough for > more complex programs. However knowning the way stack is > used (push, pop, peek and peek few elements below top of > stack) you can relatively easy change internal fixed size > stack to ring like based stack cache. The data are pushed > on stack and if stack overflow appears memory write is > generated to flush the oldest data and make place for new > data. Same for pop - if there is not enough data for > deepest possible peek the memory read cycle is generated. > This can spare a lot of memory bandwidth and gives high > performance gain compared to pure stack-in-shared-memory > solution. This also gives you much more flexibility than > on-chip stack only. > I do beleve that 8 or 16 stack levels ring cache will be > enough - most program grows stack by significant number of > words then creates a lot of tiny jitter and shrinks it back > in one move. The whole 'jitter' stuff will be cached and > won't absorb memory. > regards, > Tomasz Sztejka. Tomasz, Intellectually, I might find it challenging to keep track of what is in the ring cache and what is in RAM. I think the approach has great merit - I don't know that, at this stage in my experience, I can implement it. I know, in concept, how this would work but writing VHDL to make it happen may be above me. But, as I look further at the issues, I am beginning to think that I could implement a 2 element cache, complete with 'dirty' bits for the top two elements of the stack. This means that the stack pointer really does point to the real location in memory and I don't have to keep track of what is in registers and where the real stack is located. All I have to do is a write-through and maintain the 'dirty' bits. I think I will still implement the Harvard architecture for a separate program space later on. It means I have to change boards at some point, but not today. For now I will do a separate instruction fetch at the end of each instruction execution. I will be able to pull that state out quite easily and implement it as a parallel process. An interesting idea about the ring cache... Richard |
|
|
|
rtstofer wrote: > But, as I look further at the issues, I am beginning to think that I > could implement a 2 element cache, complete with 'dirty' bits for > the top two elements of the stack. This means that the stack > pointer really does point to the real location in memory and I don't > have to keep track of what is in registers and where the real stack > is located. All I have to do is a write-through and maintain > the 'dirty' bits. But often the compilers for stack machines are very simple you have alot of stack thrashing. What may be needed is a architecture that can access to stack machines registers as a regular machine for special cases of operations. Ben. |
|
|
|
--- rtstofer <> wrote:
> <snip> > > Well, I do beleve, that 256 words may be not enough for > > more complex programs. However knowning the way stack (...) > Intellectually, I might find it challenging to keep track > of what is > in the ring cache and what is in RAM. I think the > approach has > great merit - I don't know that, at this stage in my > experience, I > can implement it. I know, in concept, how this would > work but > writing VHDL to make it happen may be above me. This is very simple. All you have to do is to stop thinking about HDL/Verilog and start thinking in terms of gates, flip-flops and cycles. The HDL is only a langugae - if you can draw a schematic by hand then coding it in any HDL should be no problem (that's why I do prefer Verilog - it has _wires_! ) Let's say we have: SP - stack pointer, which points to first free element of stack, where we would push data N - which is the deepest peek range (or DUP_N in Forth terms) (a power of two). 0 means only peek is possible, 1 that you can look at element one below element on top of stack and so on; M - is a capacity of ring (a power of two) RP - this is a ring pointer, size necessary to carry M possitions. Points to free place in ring. K - actuall used space in ring. (below pseudocode certailny contains mistakes) on init: K = N+1 SP = xxx RP = 0 - stack contains N+1 trash words, but it does not matter - you won't ever pop them; push: ring[RP] = data RP++ SP++ K++ if (K==M) { generate memory write cycle: memory[SP-M]=ring[RP] K=M-1 } pop: --RP --SP --K return value = ring[RP] if (K==N) { generate memory read cycle: ring[RP-N-1]=memory[SP-N-1] } stack switch: push times M set K = N+1 set SP = target SP + N +1 pop N times the add operation: ring[RP-1]=ring[RP-1]+ring[RP-2] proceed like during pop operation. As you can see, as long as M-N stack moves are generated no memory cycle will be inserted. Having ring to be a dual-port memory (or extra register for ring[RP-1] element) allows all two arg operations to be executed within more or less single cycle. The second subject of this discussion appears to be how compilers works and how they do generate code. In case of stack machines it often looks diferrent that others exposed. You have following data structures: - a work stack - this is like registers set in risc/cisc machines. Local variables are not allocated there. - a return stack - this stores jump targets, at other branch bookeeping informations. It can be joined with work stack but having it separate allows speed boost and more safety; - a frame stack - this is a pure software stack - you have a frame pointer and indirect addressing opcodes. Frame stack is managed by software and is used to store local variables. Note, blocks in code ( the C or java { } ) does not move frame pointer - this is up to compiler to find right place for local variables within subroutine frame. No multiple indirections there (there are also so called 'compiled stacks' which does not use indirection at all but prevents recursion). At the entry to subroutine frame pointer is incremented by known constant and at return is decremented to restore previus value; - a rest of memory, for dynamic variables heap, static variables and so on. The more variables compiler can allocate as static or can find a fixed space in local variables frame the better it works. I recommend to read JVM specs (brief scan through it will give you a lot of informations about dos and dont's in stack machines). ===== Tomasz Sztejka POLON ALFA (work) http://www.polon-alfa.com.pl/ (private) http://www.sztejkat.prv.pl/ ___________________________________________________________ALL-NEW Yahoo! Messenger - all new features - even more fun! http://uk.messenger.yahoo.com |
|
----- Original Message ----- > The more variables compiler can allocate as static or can > find a fixed space in local variables frame the better it > works. I recommend to read JVM specs (brief scan through it > will give you a lot of informations about dos and dont's in > stack machines). I would like to jump in here. The JVM is a very interesting stack machine. It's a little bit different from FORTH. If you are interested how a JVM can be implemented in an FPGA there is a design available at: http://www.jopdesign.com/ I've also a 'not jet published' paper especially on designing a stack architecture for the JVM inside an FPGA. If you are interested I can send you a copy of it. It will get available on the website after being published. Martin |
|
|
|
Tomasz Sztejka wrote: > This is very simple. All you have to do is to stop > thinking about HDL/Verilog and start thinking in terms of > gates, flip-flops and cycles. The HDL is only a langugae - > if you can draw a schematic by hand then coding it in any > HDL should be no problem (that's why I do prefer Verilog - > it has _wires_! ) what about a real ring ... the data moves :) Q-1[ 0 mulx ] Q+1[1 ]--[D Q]-- Typical stack element for small rings. While a poor idea for a FPGA this may be a better idea if you ever want to port to real logic in a custom chip. Any how I am sticking to TTL logic rather than using FPGA's since all my designs have a front panel and I have large amount of bulbs to light up, and a large TTL 4 bit slice PCB has power to drive the panel. A CPLD device needs about 128? clb's and about 75 real i/o pins to replace the ttl pcb. I suspect you will run into a similar problem when sombody builds a machine like the IBM1130. Ben. |
|
|
|
Arius - Rick Collins wrote:
> If you try to build the 1130 out of TTL, you will likely end up with a not > so portable space heater! Minicomputers of the 70's were all built with > TTL and they were big and hot! An FPGA is a *MUCH* better way to go since > you can correct all your mistakes as if they were software, by just > recompiling and downloading. No soldering iron required! I use schematic entry for any FPGA work I have done in the past. Moving to TTL (LS) is a advantage for me since the Logic is visable. Also if I have a idea 3 am I don't need to turn on the computer and fire up the software. With a 200 MHZ computer the recompiling took often several hours before I could download the firmware, for the FPGA board I had then. The main reason I stopped development work was I could get a serial rom programmed, and the FPGA I was useing was TOO small for the CPU design I was doing at the time. > You can power plenty of light bulbs with the PSU to drive an FPGA. By > light bulbs, I assume you really mean LEDs, right? I know you are going > the nostalgia route, but light bulbs can be a PITA with the heat and short > life. Even so, if you must have bulbs, use a transistor to drive the light > bulbs from the FPGA output. Typical TTL does not have enough umph to drive > a 60 mA light bulb. I plan to use a 74ls06 driver -- ample current ( 30 ma ) for a led or a bulb. I might use CPLD's but right now I can allways port TTL to FPGA rather than the other way around. Ben. PS. I think a IBM1130 1/4 size would be neat -- every thing scaled down, but the console selectric and the disk drives would be a real pain to have functional since nobody has 'mechanical' resources nowdays to machine parts. |
|
|
|
rtstofer wrote: > > --- In , ben franchuk <bfranchuk@j...> wrote: > Well, if I build the 1130, it won't have any of the 'real' IO > devices anyway. The whole point of the IBM1130 Computing envorment is PUNCHED CARDS - NOISEY LINE PRINTER - MASIVE DISK IO. Did I also say Blinking lights and other stuff. > What would be really neat is to have an FPGA with a > network interface. Now my printer output can go to a networked > printer. How ever the printer and screen fonts need to be created for the correct printing of the characters used in FORTRAN and say APL. > I guess Disk IO could come from a networked file server and > the console functions could go to a networked PC running a Visual > Basic application. But if you do want to play with a IBM1130 you can get a software emuator today -- simh http://simh.trailing-edge.com/ Software may be hard to find nowdays. Good luck with your FPGA projects. Ben. |
|
Hi all, B u r c h E D is having a discount sale. Hopefully many of the items are of interest to FPGA-CPUers. If you are already the user of a B u r c h E D system, it's a great time to consider expanding your system with some extra plug-on modules, or to upgrade with one of our "hardware upgrade deals". The ad for the sale is below. Best regards, Tony Burch ----------------- FPGA Boards Massive Sale! 10% to 25% off the normal price of all items! Sale ends 30 November 2004. Download sale flyer at http://www.BurchED.biz/BurchEDSale.pdf Website at http://www.BurchED.biz Great savings for a limited time! * 25% off B5-X300 FPGA Boards, including download cable. Great FPGA boards for building real-world projects! Largest number of accessible I/O pins. * 20% off Super-Value-Packs. B5-X300 and plug-on modules. Super-Super value! * 20% off all plug-on modules. Widest selection, including Switches, LEDS, SRAM, CF, ... * 15% off all hardware upgrade deals, for current B u r c h E D customers. * 10% off Sydney-X1 and Sydney-X1E development systems. * 10% off all accessories, including the new B5-Easy-Clips! * 10% off all bulk-saver-packs! Great value multiplied! Set up your university lab! If you already have some other FPGA board, which was good for evaluation or demo, then maybe now you're "past it" and you need a board for building real-world designs, or advanced student projects... B u r c h E D boards have: * The largest number of accessible I/O pins * The widest selection of plug-on modules * Are ideal for prototyping real-world designs * Are economical for advanced student projects At these sale prices, it is a great time to get a B u r c h E D system, or some sets of systems. Order online now, or ask us for a quote online now - the sale ends on 30 November 2004! B u r c h E D Making FPGA Prototyping Easy http://www.BurchED.biz |