This list is for discussion of the design and implementation of field-programmable gate array based processors and integrated systems. It is also for discussion and community support of the XSOC Project (see http://www.fpgacpu.org/xsoc).
|
All, please read http://www.fpgacpu.org/log/sep02.html#IP-redux. Agree? Disagree? Discuss :-) Jan. |
|
|
|
--- In , "Jan Gray" <jsgray@a...> wrote:
> All, please read http://www.fpgacpu.org/log/sep02.html#IP-redux. Agree?
> Disagree? Discuss :-)
>
> Jan.

Jan,

I am not a developer so I don't see things from that perspective. I do remember the early days of hobby computing (before personal computing) and the myriad suppliers of components and software. I won't enumerate them but a few examples like Borland, Lotus, Word Perfect come to mind. Eventually they all rolled up into Microsoft.

In the beginning of FPGAs (from my limited perspective) there simply wasn't enough talent available, even at the manufacturing companies, to develop all the things that needed to be created. So, for a few years, there was a cottage industry for IP developers until a certain critical mass was achieved. Now, in an effort to compete for chip sales, the manufacturers are throwing in IP cores. Altera and Xilinx are in the business of selling chips, not IP.

From a personal perspective I like the 'free' cores and the 'free' tools. There is no way in the world I could afford to play with FPGAs if I had to buy the tools. They would remain unexplored, to my loss.

Even in the highly evolved PC market there are still things that don't come from Microsoft. There is specialized software like Autocad and PSpice, and there are rumors of operating systems other than Windows (yes, it's true! I have one). Even in the world of IP cores it will be possible for a few developers to provide 'best of class' products that integrate perfectly into the chip maker's platform. But only the very best will survive!

In summary: chip manufacturers are in the business of selling chips. Period. If it is necessary to give away development tools and reference cores, so what? Sell more chips to pay for the overhead. Best of class components will always demand a premium but only the very best. Critical mass has been reached; everyone better get on the same bus (pun intended?).

But what do I know, I'm retired. I will leave the battles to the younger engineers. |
|
Matlab and Simulink would probably be the best examples of a GUI/integrated IDE with libraries. Protel and Labview are also getting into the game with parameterizable VHDL/Verilog libraries. I'm not sure if they allow users to integrate their own modules, but for small freelance designers the cost of these packages is fairly prohibitive.

I would quite like to see some work around SciLab, which is free from Inria in France, to integrate an open source framework for designing VHDL or Verilog components.

There used to be a free image processing package called Khoros that had a suite of applications for designing and integrating image processing algorithms into glyphs. It was designed to run under Unix and Linux. Components communicated using intermediate files and it lacked the scientific / mathematical simulation capability.

I'm not sure if there are any other GNU packages, such as schematic capture packages, that could form the basis for a System Builder, or if they could be integrated into a mixed signal simulation package that allowed you to simulate the interaction between a microprocessor program and analog circuits.

Some years ago, a friend from the internet pointed me to a mixed signal simulation package that allowed you to simulate microprocessors, LCD displays and other components, including analog designs, and you could download and simulate microcomputer programs.

It's all out there, but whether it is affordable and an open architecture is another matter.

John.

Jan Gray wrote:
>All, please read http://www.fpgacpu.org/log/sep02.html#IP-redux. Agree?
>Disagree? Discuss :-)
>
>Jan.

-- http://members.optushome.com.au/jekent |
|
|
|
> In summary: chip manufacturers are in the business of selling > chips. Period. If it is necessary to give away development tools > and reference cores, so what? Sell more chips to pay for the > overhead. Best of class components will always demand a premium but > only the very best. Critical mass has been reached; everyone better > get on the same bus (pun intended?). Like Chris, I'm another listener -- a DSP person not ASIC. Some of Richard's posts are almost crazy -- like trying to resurrect a PDP11 on an FPGA :-) I keep waiting for his subject line "It's alive!". But they're always fun, and his point above is right on, exactly right on. However, there are differences in how chip manufacturers attempt to succeed. Although Texas Instruments beat Intel to the "first computer on a chip" in the late '70s (remember calculators), Intel was the tortoise and stuck with the hard, slow work of creating development tools and business models that helped developers. The history of PCs is the result. Texas Instruments execs -- some of whom are still in charge -- never forgot that lesson and they have since pummeled ADI and Mot (and eliminated AT&T/Lucent completely) in DSPs, in part by focusing on development tools and educational programs. I would point out to Jan that TI has no qualms about stepping on its third-party providers and stealing their products to build in qty and give away for free. Sometimes, TI is nice about it and tries to buy the 3p :-) The successful chip manufacturers tend to give away more things all the time. Jan's theory that: "If FPGA vendors give away enough free cores, the end effect could be to discourage pure IP vendors from contributing to that device vendor's value chain, reducing the supply of device optimized cores, hence design wins, hence device sales." will stick to the chip vendors like water to a duck. As Rich said, the chip vendors are in the business of selling chips. Whatever they need to do to accomplish that, they will. -Jeff |
|
|
|
I am working on a custom CPU to implement in an FPGA. I am optimizing nearly every aspect of the machine to keep the size small and the speed high while minimizing the code size (very important since there is little memory inside the FPGA). After going through several iterations of designing the instruction set, I ended up with both a relative call and an absolute call. The relative call was essentially free in terms of the hardware since the jumps are relative. It seems to me that the absolute call is the more useful of the two and I could live a rich, full life without the relative call. However, I have currently completed a first pass at the design with both call types in the instruction set and a fair amount of the design is very well optimized. My question is, will the relative call be pretty useless compared to the absolute call? Or will both be useful? What situation would make the relative call more useful? Or do I have it backwards and I should give up the absolute call? At this point there are some read/write internal register commands I could use in place of either one of these call types, but they are currently memory mapped and working fine. There is no strong need to make any changes. I am reaching a point where changes to the instruction set will require excessive amounts of redoing optimizing and debugging. So I would like to make a final decision. Any and all comments are appreciated! Did I mention that my software target is to run forth on this chip? I think that may make a difference. Rick Collins Arius - A Signal Processing Solutions Company Specializing in DSP and FPGA design http://www.arius.com 4 King Ave 301-682-7772 Voice Frederick, MD 21701-3110 301-682-7666 FAX |
|
|
|
Both relative call and absolute call take you from a fixed address to another fixed address. So you don't need both.

Relative call might be smaller.
Relative call might facilitate position independent code.

If your tools are always going to emit one or the other, there is no point having both, documenting both, testing both, except that orthogonality might make it less elegant to use the relative call opcodes for something else. Consider the pdp-11 had some addressing modes combinations that were little or never used, but it was simpler and more elegant to keep 'em than to forbid or reuse them.

You will probably also need an indirect call, I would think...

In the xr16, although you could always form any call to any address 0xABCD via the pair of 16-bit insns

    IMM 0xABC ;; JAL rd,0xD(r0)    // rd := pc, pc := 0xABCD

it was sufficiently frequent that I added a special opcode

    CALL func

that encoded all of the above in one 16-bit instruction (assuming the call target 0xABC0 is 16-byte aligned, and assuming rd==r15, the return address linkage register).

You have my permission to leave things well enough alone for now. :-) And just as you tune the ISA to the FPGA, so you should do an iteration of tuning the ISA to the software tools that generate code for it. Once you have a working tool flow, you can build and run some programs and that will give you more refined ideas about what to change next -- in particular, what to pitch that you thought you would need, but now realize you don't.

Wishing you much fun,
Jan. |
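
[Editor's illustration] The address arithmetic in Jan's xr16 example can be sketched in a few lines of Python. This is only a sketch under stated assumptions: it assumes a 16-bit CALL word made of a 4-bit opcode and a 12-bit operand field, and the opcode value and function names below are hypothetical placeholders, not taken from the xr16 documentation.

```python
CALL_OPCODE = 0xC  # hypothetical placeholder; the real xr16 opcode value is not given in the thread


def imm_jal_target(imm12: int, imm4: int) -> int:
    """Target formed by IMM (12-bit prefix) followed by JAL with a 4-bit immediate off r0."""
    return ((imm12 & 0xFFF) << 4) | (imm4 & 0xF)


def encode_call(target: int) -> int:
    """Pack a 16-byte-aligned 16-bit target into one 16-bit CALL word (assumed layout)."""
    if not 0 <= target <= 0xFFFF:
        raise ValueError("target must fit in 16 bits")
    if target & 0xF:
        raise ValueError("CALL target must be 16-byte aligned")
    return (CALL_OPCODE << 12) | (target >> 4)


def decode_call(insn: int) -> int:
    """Recover the call target from the 12-bit operand field of a CALL word."""
    return (insn & 0xFFF) << 4
```

The two-instruction form reaches any address, while the one-word form gives up the low four address bits (hence the alignment requirement) in exchange for halving the size of the most common call sequence.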
|
|
|
In my opinion (and yes, more often than not, it's crazy) I would keep the relative and decide on the absolute. The reason: when you get to building an operating system and start loading code into the machine you don't have to translate relative addresses. It's clear you need some absolute addressing if you are going to 'call' system routines that are fixed in memory. In fact, absolute plus an index would be great! Or indirect through a transfer vector, indexed of course.

And if you ever want to relocate a code block after it has been loaded, relative addressing is the thing to have. All of the absolute addresses were modified when the code was loaded so nothing changes regardless of where the code is moved.

Remember guys, I am old, I am retired and I do this stuff for fun. Oh, and the P4 machine is starting to work quite well. The hard part, 'call' and 'return' with and without parameters and return values works as do some of the arithmetic functions. Logic functions come next so I can start debugging 'if' 'then' 'else'.

My naive design is a gigantic synchronous state machine and I think I will run out of space before I finish. I really want to take another look at Jan's implementation. It is far more elegant! In fact, I downloaded it again planning to steal from the design. As near as I can tell the hardware platform is obsolete and the development tool chain isn't free. So, I can start over with XSOC or keep going as I am. Until I hit the wall, it is full steam ahead!

Richard |
|
|
|
Thanks to both you and Jan for your posts. This reply is to both of you.

At 09:25 PM 11/25/2004, you wrote:
>In my opinion (and yes, more often than not, it's crazy) I would keep
>the relative and decide on the absolute. The reason: when you get to
>building an operating system and start loading code into the machine
>you don't have to translate relative addresses. It's clear you need
>some absolute addressing if you are going to 'call' system routines
>that are fixed in memory. In fact, absolute plus an index would be
>great! Or indirect through a transfer vector, indexed of course.

I understand what you are saying. But one of the reasons that I decided to go with a stack architecture is to avoid the complication of these multiple addressing modes. I think I did not say it until the end of my message, but I am designing this CPU to optimize forth and will not be using any other languages (other than the assembly language which is a lot like the forth primitives).

Forth is "threaded" and I expect to use subroutine threading. This will require a call instruction with a fixed destination. There is not much need that I know of for relocatable code, especially since compiling is very fast. Instead of linking precompiled modules, you can just recompile the code.

So I don't think I need a relative call and I am certain I don't need indexed or calculated calls. However, if I do need that, it can be done with a pointer to a table of addresses and a call to a small routine that calculates the address from the table, puts it on the return stack and does a return, which then behaves like a jump. Since the return address from the original call is still on the stack, the routine that is jumped to will return to the original piece of code when done.

>And if you ever want to relocate a code block after it has been loaded,
>relative addressing is the thing to have. All of the absolute
>addresses were modified when the code was loaded so nothing changes
>regardless of where the code is moved.

I think my question is really more of a Forth question. I have asked the question there as well, I was just trying to cover my bases by discussing it here. I think that other languages like to see a very flexible instruction set that is not required to implement forth. In fact, the ISA of this machine basically *is* forth. I just don't know a lot about how best to implement forth either in hardware or software.

>Remember guys, I am old, I am retired and I do this stuff for fun.
>Oh, and the P4 machine is starting to work quite well. The hard part,
>'call' and 'return' with and without parameters and return values
>works as do some of the arithmetic functions. Logic functions come
>next so I can start debugging 'if' 'then' 'else'.
>
>My naive design is a gigantic synchronous state machine and I think I
>will run out of space before I finish. I really want to take another
>look at Jan's implementation. It is far more elegant! In fact, I
>downloaded it again planning to steal from the design. As near as I
>can tell the hardware platform is obsolete and the development tool
>chain isn't free. So, I can start over with XSOC or keep going as I
>am. Until I hit the wall, it is full steam ahead!

I remember some of your posts here. I believe you are recreating an older machine, right?

>--- In , "Jan Gray" <jsgray@a...> wrote:
> > Both relative call and absolute call take you from a fixed address to
> > another fixed address. So you don't need both.
> > Relative call might be smaller.
> > Relative call might facilitate position independent code.
> > If your tools are always going to emit one or the other, there is no point
> > having both, documenting both, testing both, except that orthogonality might
> > make it less elegant to use the relative call opcodes for something else.
> > Consider the pdp-11 had some addressing modes combinations that were little
> > or never used, but it was simpler and more elegant to keep 'em than to
> > forbid or reuse them.
> > You will probably also need an indirect call, I would think...

I agree that having both rel and abs calls is not all that useful. The one advantage of absolute addresses is that they are a lot more user friendly while I am hand assembling code. So that is the way I am leaning. If I had more opcode space, I would just leave it alone. But decoding a single register takes a bit of work and can end up in a critical timing path if I am not careful. With the 16 extra opcodes, I can add read and write of 8 registers as dedicated instructions or even use some of them for other functions since I only have 3 registers at the moment.

> > In the xr16, although you could always form any call to any address 0xABCD
> > via the pair of 16-bit insns
> >   IMM 0xABC ;; JAL rd,0xD(r0) // rd := pc, pc := 0xABCD
> > it was sufficiently frequent that I added a special opcode
> >   CALL func
> > that encoded all of the above in one 16-bit instruction (assuming the call
> > target 0xABC0 is 16-byte aligned, and assuming rd==r15, the return address
> > linkage register).

I have looked at a lot of FPGA CPUs on the web, but I think I remember your machine. I recall that it is *very* streamlined and is actually smaller than mine. That is pretty good considering that it has a set of 16 registers and mine just has the two stacks. I expect the efficiency comes from having the registers in LUT ram which can be very fast and includes the output multiplexor while my machine has to have explicit multiplexors for all the inputs to the stacks.

I believe my machine will have an advantage when used for implementing Forth because it executes all the essential primitives in one clock cycle while RISC machines will typically require multiple clock cycles. It also has an 8 bit opcode which is a significant issue considering the small amount of ram in the FPGA I am using, 10 blocks of 512 bytes.

> > You have my permission to leave things well enough alone for now. :-)

Thank you, I appreciate that... :)

> > And just as you tune the ISA to the FPGA, so you should do an iteration of
> > tuning the ISA to the software tools that generate code for it. Once you
> > have a working tool flow, you can build and run some programs and that will
> > give you more refined ideas about what to change next -- in particular, what
> > to pitch that you thought you would need, but now realize you don't.

That is what I don't want to do. I want to wrap up this design and move on to other things. I probably should not even try to optimize it further for now, but I may not be coming back to it again for changes.

I'm sure you know that tune!

Rick Collins |
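
[Editor's illustration] Rick's computed-call trick (look up an address in a table, push it on the return stack, then execute a return that behaves like a jump) can be checked with a toy two-stack machine in Python. Everything here is a sketch: the opcode names (LIT, CALL, LOOKUP, TO_R, RET, HALT), the addresses and the table contents are hypothetical and are not Rick's actual instruction set.

```python
def run(program, table, start=0, max_steps=100):
    """Execute a toy two-stack machine; return (data stack, list of executed addresses)."""
    pc, dstack, rstack, trace = start, [], [], []
    for _ in range(max_steps):
        op, arg = program[pc]
        trace.append(pc)
        pc += 1
        if op == "LIT":        # push a literal on the data stack
            dstack.append(arg)
        elif op == "CALL":     # absolute call: save return address, jump
            rstack.append(pc)
            pc = arg
        elif op == "LOOKUP":   # replace a table index with the address stored there
            dstack.append(table[dstack.pop()])
        elif op == "TO_R":     # move top of data stack to the return stack
            rstack.append(dstack.pop())
        elif op == "RET":      # pop the return stack into the PC
            pc = rstack.pop()
        elif op == "HALT":
            return dstack, trace
    raise RuntimeError("step limit reached")


TABLE = [20, 30]  # addresses of two callable routines
PROGRAM = {
    0: ("LIT", 1),       # select table entry 1
    1: ("CALL", 10),     # call the dispatch routine
    2: ("HALT", None),   # the selected routine returns here
    10: ("LOOKUP", None),
    11: ("TO_R", None),  # target address now sits above the caller's return address
    12: ("RET", None),   # "return" that behaves like a jump to the target
    20: ("LIT", 111), 21: ("RET", None),  # routine A
    30: ("LIT", 222), 31: ("RET", None),  # routine B
}
```

Because the dispatch routine never pops the caller's return address, the RET at the end of the selected routine goes straight back to the instruction after the original CALL, as the message describes.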
|
|
|
Arius - Rick Collins wrote: > I am working on a custom CPU to implement in an FPGA. I am optimizing > nearly every aspect of the machine to keep the size small and the speed > high while minimizing the code size (very important since there is > little memory inside the FPGA). After going through several iterations > of designing the instruction set, I ended up with both a relative call > and an absolute call. The relative call was essentially free in terms > of the hardware since the jumps are relative. It seems to me that the > absolute call is the more useful of the two and I could live a rich, > full life without the relative call. However, I have currently > completed a first pass at the design with both call types in the > instruction set and a fair amount of the design is very well optimized. > > My question is, will the relative call be pretty useless compared to the > absolute call? Or will both be useful? What situation would make the > relative call more useful? Or do I have it backwards and I should give > up the absolute call? > > At this point there are some read/write internal register commands I > could use in place of either one of these call types, but they are > currently memory mapped and working fine. There is no strong need to make > any changes. I am reaching a point where changes to the instruction set > will require excessive amounts of redoing optimizing and debugging. So I > would like to make a final decision. > > Any and all comments are appreciated! > > Did I mention that my software target is to run forth on this chip? I > think that may make a difference. Some random thoughts (I'm assuming you're fairly new to processors, correct me if I'm wrong :)) - Why is the relative call 'essentially free'? It needs an adder, whereas an absolute call doesn't. Do you need to worry about address wrap-around on your relative calls? What are the sizes of your relative and absolute calls? Is the relative call more compact, or could it be made so? 
Or are all instructions fixed-length? Out of interest, did you look at the Xilinx cores? I think there are free ones(?) - are they any good? It sounds like you're optimising the hardware before you've had a chance to run any code and verify your design - is this right? Everything else being equal (which it normally isn't), the advantage of a relative call is that it's position-independent; it doesn't require tools to carry out fixups after relocation in the way that absolute calls do. So, the $M question is, how important is relocatability to you? will you have an OS; do you intend to run only one process, or more than one? If more than one, how will you locate them in memory? Can you write the tools to do address fixup if necessary, or will it never be necessary, because you'll only ever have one program, located at address 0? What about compilers/assemblers/etc.? Do you have a Forth compiler, and do you know what it produces? Can you port it? Cheers Paul |
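
[Editor's illustration] Paul's point about fixups can be made concrete with a small Python sketch. It assumes, purely for illustration, that a relative operand is measured from the address of the call instruction itself (real designs often measure from the following instruction) and that every call targets a routine inside the block being moved; the function names are hypothetical.

```python
def resolve(site, kind, operand):
    """Return the address a call at `site` transfers to."""
    return operand if kind == "abs" else site + operand


def relocate(calls, delta):
    """Move a code block by `delta` bytes.

    `calls` is a list of (site, kind, operand) tuples whose targets lie inside
    the block. Returns the moved calls and the number of operands patched.
    """
    moved, fixups = [], 0
    for site, kind, operand in calls:
        if kind == "abs":
            operand += delta  # absolute operands must be patched by the loader
            fixups += 1
        moved.append((site + delta, kind, operand))
    return moved, fixups


# Two calls to the same routine at 0x140, one absolute and one relative.
BLOCK = [(0x100, "abs", 0x140), (0x104, "rel", 0x3C)]
MOVED, FIXUPS = relocate(BLOCK, 0x200)
```

The situation reverses for calls that leave the block, such as calls to system routines at fixed addresses: there the absolute operand survives a move unchanged and the relative one needs patching, which is the case Richard raises earlier in the thread.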
|
|
|
Rick,

I got to thinking about your original post 'after' I promoted the idea of relative addressing. You're probably right, for a strictly Forth type of machine it is probably unnecessary although I don't know much about the implementation. That's the nice thing about FPGAs - if you find out you just absolutely have to have it - add it! Talk about the ultimate Erector Set!

My first project was to use the T80 core (Z80 emulation) and get CP/M running with 16 ea 8 MB disk drives (2 CF modules), embedded graphics (thanks to John Kent) and a PS/2 keyboard. A definite retro project. Running at 12.5 MHz it is a pretty quick machine compared to the originals - heck, it is 6 times faster than my Altair 8800A and that's not counting the difference between CF and 8" floppies!

My current project is to implement the original Pascal P4 interpreter (stack machine) in hardware as opposed to software. My goal is to have instruction execution speed faster than the CDC 6600 on which it was originally implemented. Faster than a speeding mainframe!

Richard |
|
Paul Davis wrote: <snipped> Sorry - haven't got used to the latency here - most of this stuff has already been answered.. Paul |
|
I have been discussing this in several other forums and I can't remember what I have posted where, so I don't mind responding to your post even if it is a bit redundant. I'll try to keep my replies short. At 04:09 AM 11/26/2004, you wrote: >Some random thoughts (I'm assuming you're fairly new to processors, >correct me if I'm wrong :)) - Yes, you are wrong. I have been designing processors of one sort or another for some 20 years starting with a microprogrammed IO board using a Signetics sequencer chip. >Why is the relative call 'essentially free'? It needs an adder, whereas >an absolute call doesn't. Do you need to worry about address wrap-around >on your relative calls? There are two reasons that make the adder for the relative call virtually free. One is the fact that an adder is required to implement the PC <= PC + 1 used by most instructions. To calculate a relative jump or call the offset is added by the same hardware. The other is the fact that the other address calculations require a 4 input mux which is implemented as a pair of 2 input muxes combined with a third 2 input mux. In my design I use the adder as the final 2 input mux by adding enables to the first two muxes which can zero their outputs when needed. The adder uses the same number of LUTs and is required by the PC inc function anyway, so the relative jump is free while the absolute call requires one of the inputs to the mux (two actually, but they are the same two required by the relative jump/call so that is free as well). >What are the sizes of your relative and absolute calls? Is the relative >call more compact, or could it be made so? Or are all instructions >fixed-length? I have an 8 bit instruction with a 7 bit literal field which is appended to any previous literal values loaded to the return stack immediately before. This allows literals to be built up so that fewer bytes can be used to represent smaller values. 
The JMPx and CALx instructions contain a 4 bit literal value which is appended in the same way. So a jump of -8 to +7 is done with a single byte. A jump of up to -1 kB or +1 kB uses two bytes. The return stack and data stack are 16 bits wide, providing a 64 kB address space. However, the program and data memories are only 1 kB in the current implementation due to the limited RAM in the FPGA.

>Out of interest, did you look at the Xilinx cores? I think there are
>free ones(?) - are they any good?

Yes, I did. I was intrigued by the very small PicoBlaze. But when I examined it closely, I found that it was a very limited processor. I don't recall the details, but they specifically designed the processor around what you could do with the very least amount of logic. So there are various limitations to allow the special features of the Xilinx CLBs to be put to maximum use. At least that is what I seem to recall. The MicroBlaze also has a small version, but it is not free. However, someone is working on an open source duplicate, but not of the smaller version, only the middle or larger version, I don't recall which. The Nios II processor from Altera also looks very good, but they don't have a version for the ACEX chips I am using and it certainly won't work in the Spartan-3 I am also using.

>It sounds like you're optimising the hardware before you've had a chance
>to run any code and verify your design - is this right?

Yes. I don't currently have any code to run. This CPU will be used to control the operation of an FPGA that is emulating a UART interface to the PC and performing DMA to the DSP memory on the board. I think this problem is too complex for a state machine, and a fancier processor would be overkill (like the ARM chip I was planning to use). My only concern is the program space available. I don't know for sure that 1 kB is large enough. I do have room for growth, but not lots: 3 kB max if I use none for hardware buffers.
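The JMPx/CALx byte counts given above (one byte for -8 to +7, two bytes for about +/-1 kB) can be checked with a short Python sketch. It assumes the offset is a signed field made of the 4 bit jump literal plus 7 bits for each preceding literal byte; the function names are hypothetical and only the arithmetic is shown.

```python
JMP_BITS = 4  # literal field in JMPx/CALx, per the post
LIT_BITS = 7  # literal field in each preceding literal byte


def offset_range(num_bytes):
    """Signed offset range reachable with num_bytes (literal bytes + the jump)."""
    bits = JMP_BITS + LIT_BITS * (num_bytes - 1)
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1


def bytes_for_offset(offset):
    """Smallest number of bytes whose combined field can hold the offset."""
    n = 1
    while True:
        lo, hi = offset_range(n)
        if lo <= offset <= hi:
            return n
        n += 1
```

With these assumptions `offset_range(1)` is (-8, 7) and `offset_range(2)` is (-1024, 1023), which matches the figures quoted; a third byte gives 18 bits, more than the 16 bit address space needs.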
>Everything else being equal (which it normally isn't), the advantage of
>a relative call is that it's position-independent; it doesn't require
>tools to carry out fixups after relocation in the way that absolute
>calls do. So, the $M question is, how important is relocatability to
>you? will you have an OS; do you intend to run only one process, or more
>than one? If more than one, how will you locate them in memory? Can you
>write the tools to do address fixup if necessary, or will it never be
>necessary, because you'll only ever have one program, located at address 0?

Yes, that is the question I don't have an answer to: how useful relocatable code is. The OS will be Forth, which does not make use of relocatable code AFAIK. I am expecting the code to be fully recompiled any time a change is made.

>What about compilers/assemblers/etc.? Do you have a Forth compiler, and
>do you know what it produces? Can you port it?

I don't have a Forth tool yet. Right now my best choice seems to be the MPE Forth, which I will have to port to my CPU. They have a $400 version (which may be more if the USD keeps dropping) which is targeted to the Z8. I am just unsure of how much support I will need to port it.

I am also posting this to the comp.lang.forth newsgroup and the hForth Yahoo group, so I have been getting some good feedback there as well.

Thanks for your reply...

Rick Collins

Arius - A Signal Processing Solutions Company
Specializing in DSP and FPGA design
http://www.arius.com
4 King Ave                301-682-7772 Voice
Frederick, MD 21701-3110  301-682-7666 FAX
|
|
|
--- Arius - Rick Collins <> wrote:
(....)
> Yes. I don't currently have any code to run. This CPU will be used to
> control the operation of an FPGA that is emulating a UART interface to the
> PC and performing DMA to the DSP memory on the board. I think this problem
> is too complex for a state machine and a fancier processor would be
> overkill (like the ARM chip I was planning to use).

Exactly. Of course there are other processors you could use (e.g. AVR - 8 bit RISC at 16 MIPS max, less than $10 + free tools), but I assume you already have a good reason to use an FPGA in your project.

I think you may safely throw away the relative call, leaving the absolute call and the relative branch (you can save some decoder space). Those two can emulate a relative call very easily:

    call _fixed_bridge
    _fixed_bridge: branch_relative _offset

If you like to be minimal, please take a look at:

http://www.sztejkat.prv.pl/downloads/missm/index.html

This is a stack processor project I made recently. I did not implement it in hardware; only the behavioral simulation was done. You will also find a retargetable assembler, which you may find useful, with two example targets, and a set of GUI libraries for building a simulator front end linked with a Verilog behavioral simulator.

=====
Tomasz Sztejka
POLON ALFA (work) http://www.polon-alfa.com.pl/
(private) http://www.sztejkat.prv.pl/
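The bridge idea in the message above (an absolute call to a fixed location that holds a relative branch) can be traced with a toy Python interpreter. This is only a sketch: the tuple instruction format, the `run` function and the convention that a branch offset is counted from the following instruction are assumptions made for the illustration, not details of either processor discussed here.

```python
def run(program, pc=0, max_steps=100):
    """Execute a list of (op, arg) tuples; return the list of visited addresses."""
    return_stack, trace = [], []
    for _ in range(max_steps):
        op, arg = program[pc]
        trace.append(pc)
        if op == "call":      # absolute call: push return address, jump to arg
            return_stack.append(pc + 1)
            pc = arg
        elif op == "branch":  # relative branch: offset from the next instruction
            pc = pc + 1 + arg
        elif op == "ret":     # return to the address pushed by the call
            pc = return_stack.pop()
        elif op == "halt":
            return trace
        else:                 # "nop": fall through to the next instruction
            pc += 1
    raise RuntimeError("step limit reached")


program = [
    ("call", 3),     # 0: caller makes an absolute call to the bridge
    ("halt", None),  # 1: execution resumes here after the subroutine returns
    ("nop", None),   # 2: padding
    ("branch", 2),   # 3: _fixed_bridge: relative branch to the subroutine
    ("nop", None),   # 4: padding
    ("nop", None),   # 5: padding
    ("nop", None),   # 6: subroutine body
    ("ret", None),   # 7: return to address 1
]
```

Running `run(program)` visits addresses 0, 3, 6, 7 and then 1: the return address pushed by the absolute call survives the branch, so the pair behaves like a relative call at the cost of one extra instruction per bridge.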
|
|
|
At 01:04 PM 12/2/2004, you wrote:
> Exactly. Ofcourse there are other processors you could use
>(ex. AVR - 8 bit risc at 16Mips max, less than 10$ + free
>tools) but I assume, you already have a good reason to use
>FPGA in your project.

Yes, the FPGA is already there. I also have a small MCU on the board, but I am trying to minimize the parts cost, and I can use a $2 very low power MCU if I put the fast one in the FPGA. The FPGA is unpowered (along with most of the rest of the board) in a low power standby mode while the small MCU continues to run.

> I think you may safely throw away relative call leaving
>absolute call and relative branch (you can save some
>decoder's space). Those two can emulate relative call very
>easy:
>
> call _fixed_bridge
>_fixed_bridge: branch_relative _offset

Currently I am not using the relative call and am leaving the opcode space for one of two possible extensions: adding IO mapped IO fetch and store, or combining a return with about half of the current instructions. In an instruction frequency analysis of Forth by Koopman, http://www.ece.cmu.edu/~koopman/stack_computers/sec6_3.html, he found that the CALL and RETURN instructions are used very frequently, both by a dynamic measure (how often they are executed) and a static measure (how often they appear in the code). So adding a return operation to a large number of instructions saves both the code space and the execution time of a separate return. But I need to write some of my own code to find out which instructions will be optimal to combine with the return (since I can't combine all of them) and to see if this is practical to implement in the instruction space.

> If you like to be minimal, please take a look at :
>
>http://www.sztejkat.prv.pl/downloads/missm/index.html
>
> This is a project of stack processor I made recently. I
>did not implemented it in hardware - the behavioral
>simulation was done however.
> You will also find
>retargetable assembler what you may find usefull with two
>example targets and set of GUI libraries to make simulator
>frontend linked with verilog behavioral simulator.

Yes, this is a very minimal machine. But I don't think it would be optimal for my target. My instruction set is so simple that I can implement the assembler as part of my VHDL code. Once the CPU is running, I expect to use Forth as the programming language.

Thanks for your inputs.

Rick Collins
|
John Kent wrote:
>Matlab and Simulink would probably be the best examples of a
>GUI/integrated IDE with libraries.
>Protel and Labview are also getting into the game with parameterizable
>VHDL/Verilog Libraries.
>I'm not sure if they allow users to integrate their own modules, but for
>small freelance designers
>the cost of these packages are fairly prohibitive.

The only thing with DXP 2004 / Protel is that it is designed to be used with their own software, so it is not portable. It will be interesting to see if Eagle follows along (the only commercial CAD program that is available for Windows, Linux and Mac OS X).

>I would quite like to see some work around SciLab, which is free from
>Inria in France, to integrate an
>open source framework for designing VHDL or Verilog components.

Surely it would be easier to make it as a plugin for Eclipse (like Xilinx has done), only that means using Java. http://www.eclipse.org/projects/index.html

You could probably get the Eclipse tools guys to help with the add-in.

A combination of these would surely make an SoC builder app (well, at least the front end):
http://www.eclipse.org/tools/index.html
http://www.eclipse.org/technology/index.html
http://www.eclipse.org/gef/ what's the picture on the front page! (logic design app)

>There used to be a free image procesing package called Khoros that had a
>suite of applications for designing
>and integrating image processing algorithms into glyphs. It was designed
>to run under Unix and Linux.
>Components communicated using intermediate files and it lacked the
>Scientific / Mathematical simulation capability.

Not free any more. It was very buggy software that crashed very regularly in my experience (Windows or Solaris). news:comp.soft-sys.khoros

It recently became VisiQuest.
You should be able to find a copy of the free student version of Khoros floating around somewhere, or an old Linux / Unix version (lots from Google), or a 15 day eval version of VisiQuest: http://www.accusoft.com/support/evalcenter/

>I'm not sure if there are any other GNU packages, such as schematic
>capture packages, that can form
>the basis for a System Builder or if they could be integrated into a
>mixed signal simulation package
>that allowed you to simulate the interaction between a microprocessor
>program and analog circuits.
>Some years ago, a friend from the internet pointed me to a mixed signal
>simulation package that
>allowed you to simulate Microprocessor, LCD displays and other
>components, including analog designs,
>and you could download and simulate microcomputer programs.

There are a few, but not free or open source. Wasn't a compiler / IDE like sourcebuilder / PIC C compiler www.picant.com

>Its all out there, but whether it is affordable and an open
>archictecture is another matter.
>
>John.

Alex
|
|
|
Hi Alex,
Alex Gibson wrote:
>Surely it be easier to make it as a plugin for eclipse (like xiinx has
>done),
>only that means using java. http://www.eclipse.org/projects/index.html
>
>Probably could get the eclipse tools guys to help with the addin.
>
>A combination of these, would surely make a soc builder app
>(well at least the front end)
>http://www.eclipse.org/tools/index.html
>http://www.eclipse.org/technology/index.html
>http://www.eclipse.org/gef/ whats the
>picture on the front page!(logic design app)
>

I took a look at the Eclipse web site and am downloading the code to see what it's all about. The thing about Scilab is that it provides a maths and scientific package on which to model and simulate the algorithm.

>>There used to be a free image procesing package called Khoros that had a
>>
>> snip
>Not free any more.
>Was very buggy software that crashed very regularly in my experiance.
>(windows or solaris)
>news:comp.soft-sys.khoros
>
>Recently became visiquest.
>
>Should be able to find a copy of the free student of khoras verson
>floating around somewhere,
>or old linux / unix version(lots from google)
>or 15 day eval version of visiquest
>http://www.accusoft.com/support/evalcenter/
>

From what I could tell when I looked up Khoros on Google a few years ago, the business had taken a different direction; whether it was more along the lines of Eclipse I can't remember. I have an old copy of Khoros, but I think it relies on some of the old Linux libraries, which might be hard to find now. I remember, in the early days of the internet, downloading a 10 Mbyte Mathematical Morphology library from Brazil and wondering how much the download was going to cost. It's nothing now. You are probably right that it was pretty buggy.
>>I'm not sure if there are any other GNU packages, such as schematic
>>capture packages, that can form
>>the basis for a System Builder or if they could be integrated into a
>>mixed signal simulation package
>>that allowed you to simulate the interaction between a microprocessor
>>program and analog circuits.
>>
>> snip
> There are a few but not free or opensource.
> Wasn't a compiler / ide like sourcebuilder /pic c compiler
> www.picant.com

I think even the FPGA vendors are having trouble supplying an integrated environment. I have an amateur radio friend trying to use some old 3000 series FPGAs. The development software was sourced from a number of companies. The license had expired and Xilinx had fallen out with the original vendors, so there was no way of getting the software relicensed. You could ask what he was doing still using the 3000 series. The answer is that the product had a long lifetime and the 3000 package was particularly small, so it fitted in the space provided.

John.

-- http://members.optushome.com.au/jekent