Reply by Theo Markettos September 28, 2016
Paul Urbanus <urb@urbonix.com> wrote:
> > on Microzed the Vcco is in the connector so you can set it to 1.5V if
> > you need
> >
> > -Lasse
>
> picoZed also has Vcco for two banks on the connector also. For the Z7030
> picoZed these are banks 34 and 35.
Thanks. I've now found a few Altera SOMs that also expose the Vccio pins, which makes life a lot easier. Can't remember which offhand, but Altera provide a handy list of all of them:
https://www.altera.com/products/soc/ecosystem/system-on-modules.tablet.html#cyclone-v-soc

Thanks for the other discussion in this thread. It helped me work out that what I wanted was an SOM, which may expose Vccio, rather than a dev board, which typically doesn't. As it turns out the SOM form factor is also more amenable to my application.

Being able to stick to the SoC FPGAs solves a lot of problems and means not having to pretend to be a random memory interface just to get data in and out.

Theo
Reply by Paul Urbanus September 27, 2016
On 9/10/2016 10:07 AM, lasselangwadtchristensen@gmail.com wrote:
> On Friday, 9 September 2016 at 18:37:54 UTC+2, Theo Markettos wrote:
>> rickman <gnuarm@gmail.com> wrote:
>>> I think you are not going to find anything other than the memory
>>> interface. What's wrong with that? I assume you are referring to a
>>> processor that runs in the GHz range, but even then it would be hard for
>>> it to push data out of parallel I/O at "hundreds of MHz". Since this
>>> would be a very atypical use of parallel I/Os, I can't imagine a chip
>>> maker who would put the I/O pins on the fast local bus. Rather they are
>>> typically connected through a slower bus for the peripherals.
>>>
>>> Can I ask why you don't want to use high speed serial I/Os (which are
>>> intended for this) or a combined FPGA/CPU chip? Is using the memory
>>> interface too obvious or is there a reason to not use that?
>>
>> I don't have a problem with using a memory interface, just would rather
>> avoid something like PCIe - I don't have a transceiver interface to receive
>> it. I just wasn't aware of anything that exported a 'simple' high speed
>> memory interface.
>>
>> The reason for not using a combined FPGA/CPU chip is that I'll be using a
>> dev board rather than making my own board (buying serious FPGAs in small
>> quantities isn't fun and I'd rather not have to do the DDR3/etc layout). On
>> the other side of my bridge CPLD/FPGA is a 1.5V parallel interface: most ARM
>> FPGA dev boards hardwire their pins to something other than 1.5v, so I can't
>> simply use the FPGA on the combined chip. I'm also physically constrained
>> which rules out a lot of dev boards.
>
> on Microzed the Vcco is in the connector so you can set it to 1.5V if you need
>
> -Lasse
>
picoZed also has Vcco for two banks on the connector. For the Z7030 picoZed these are banks 34 and 35.
Reply by Dimiter_Popoff September 10, 2016
On 10.9.2016 г. 21:06, Tim Wescott wrote:
> On Sat, 10 Sep 2016 18:34:45 +0300, Dimiter_Popoff wrote:
>
>> On 10.9.2016 г. 17:51, Tim Wescott wrote:
>>> On Fri, 09 Sep 2016 15:34:23 +0100, Theo Markettos wrote:
>>>
>>>> I'm looking for a Cortex-A class processor that has reasonably quick
>>>> parallel I/O that might be hooked up to an FPGA. I'm aware of the
>>>> existing Zynq and Altera SoC FPGAs, but looking for something
>>>> different.
>>>>
>>>> By 'parallel I/O' I mean ideally a memory interface - either
>>>> bidirectional eg 64-bits or separate 32-bit tx and 32-bit rx. GPIO
>>>> with data valid signals/strobes is another possibility. By 'quick' I
>>>> mean hundreds of MHz, ideally with low latency.
>>>>
>>>> I know there are things like the TI PRU in eg the OMAP family, but
>>>> they seem to have a limited number of pins (16 bit tx/rx).
>>>>
>>>> Can anyone suggest anything else in this space?
>>>>
>>>> Thanks Theo
>>>
>>> I couldn't find where you asked, but here's the processor I found:
>>>
>>> http://www.ti.com/lit/ds/symlink/am3358.pdf
>>>
>>> I'm absolutely not saying "use this one" -- it's just the first one I
>>> found, and it had a conventional memory bus. I suspect that if you
>>> look around you'll find something better.
>>>
>>> I looked at the general-purpose memory interface and it doesn't look
>>> fast enough for you -- it's calling out 100MHz or 50MHz clock on a
>>> 16-bit wide bus, and I'm not sure when you can drive it at 100MHz.
>>>
>>> DDR, OTOH, will go up to a 200MHz clock (with 400MHz data rate) --
>>> that's why I was suggesting it, if things are simple enough on the FPGA
>>> side.
>>>
>>>
>> DDR data rate is sort of misleading. The data are clocked in/out at this
>> rate allright, but the processors I have seen do one cacheline at a time
>> (32 bytes) and add to that quite a number of initiation cycles.
>> Perhaps the largest ones do take advantage of open pages etc. but the
>> smaller one are unlikely to do it - which in practice yields around half
>> the clocked data rate.
>
> The one that I used, a long time ago, had a hugely programmable SDRAM
> interface. I can't remember the chip, but it was a Motorola with a
> PowerPC, which should date it. That SDRAM interface could have been
> programmed to make life easy on the FPGA, I think.

Could it have been the 8240 or 8241 (same pinout, same part really, newer version)? I have used it - it has a 64 bit wide SDRAM interface and it is hugely programmable, but it does burst one cacheline at a time all right, which is the speed killer I was talking about.

I am not sure if one could stop the refresh completely; its period and other related stuff are programmable of course, but I don't remember (maybe I never looked for it).

Dimiter

------------------------------------------------------
Dimiter Popoff, TGI http://www.tgi-sci.com
------------------------------------------------------
http://www.flickr.com/photos/didi_tgi/

>
> But -- I can't talk to this TI part; I just skimmed the data sheet.
>
Reply by Tim Wescott September 10, 2016
On Sat, 10 Sep 2016 18:34:45 +0300, Dimiter_Popoff wrote:

> On 10.9.2016 г. 17:51, Tim Wescott wrote:
>> On Fri, 09 Sep 2016 15:34:23 +0100, Theo Markettos wrote:
>>
>>> I'm looking for a Cortex-A class processor that has reasonably quick
>>> parallel I/O that might be hooked up to an FPGA. I'm aware of the
>>> existing Zynq and Altera SoC FPGAs, but looking for something
>>> different.
>>>
>>> By 'parallel I/O' I mean ideally a memory interface - either
>>> bidirectional eg 64-bits or separate 32-bit tx and 32-bit rx. GPIO
>>> with data valid signals/strobes is another possibility. By 'quick' I
>>> mean hundreds of MHz, ideally with low latency.
>>>
>>> I know there are things like the TI PRU in eg the OMAP family, but
>>> they seem to have a limited number of pins (16 bit tx/rx).
>>>
>>> Can anyone suggest anything else in this space?
>>>
>>> Thanks Theo
>>
>> I couldn't find where you asked, but here's the processor I found:
>>
>> http://www.ti.com/lit/ds/symlink/am3358.pdf
>>
>> I'm absolutely not saying "use this one" -- it's just the first one I
>> found, and it had a conventional memory bus. I suspect that if you
>> look around you'll find something better.
>>
>> I looked at the general-purpose memory interface and it doesn't look
>> fast enough for you -- it's calling out 100MHz or 50MHz clock on a
>> 16-bit wide bus, and I'm not sure when you can drive it at 100MHz.
>>
>> DDR, OTOH, will go up to a 200MHz clock (with 400MHz data rate) --
>> that's why I was suggesting it, if things are simple enough on the FPGA
>> side.
>>
>>
> DDR data rate is sort of misleading. The data are clocked in/out at this
> rate allright, but the processors I have seen do one cacheline at a time
> (32 bytes) and add to that quite a number of initiation cycles.
> Perhaps the largest ones do take advantage of open pages etc. but the
> smaller one are unlikely to do it - which in practice yields around half
> the clocked data rate.
The one that I used, a long time ago, had a hugely programmable SDRAM interface. I can't remember the chip, but it was a Motorola with a PowerPC, which should date it. That SDRAM interface could have been programmed to make life easy on the FPGA, I think.

But -- I can't talk to this TI part; I just skimmed the data sheet.

--
Tim Wescott
Control systems, embedded software and circuit design
I'm looking for work! See my website if you're interested
http://www.wescottdesign.com
Reply by Theo Markettos September 10, 2016
lasselangwadtchristensen@gmail.com wrote:
> on Microzed the Vcco is in the connector so you can set it to 1.5V if you need
Thanks, that's /very/ interesting. That makes life an awful lot simpler. I'm less familiar with the Zynq than with the Altera SoC FPGAs, but I'll have a dig.

Theo
Reply by Dimiter_Popoff September 10, 2016
On 10.9.2016 г. 18:29, Theo Markettos wrote:
> Tim Wescott <tim@seemywebsite.com> wrote:
>> On Sat, 10 Sep 2016 09:51:35 -0500, Tim Wescott wrote:
>>> I couldn't find where you asked, but here's the processor I found:
>>>
>>> http://www.ti.com/lit/ds/symlink/am3358.pdf
>>>
>>> I'm absolutely not saying "use this one" -- it's just the first one I
>>> found, and it had a conventional memory bus. I suspect that if you look
>>> around you'll find something better.
>
> Thanks. I think you struck lucky there - many SoCs don't have such an
> interface, or at least it's behind a NAND controller which adds some
> 'cleverness' you might not want.
>
> I noticed that, and I also found the iMX6 which has a 32 bit external memory
> interface at 104MHz. The trouble is most SOMs either don't or partially pin
> that out (too many parallel pins) which tends to mean 8 or 16 bits at 104MHz
> - so maybe 200MB/s is the best I'm likely to get.
Well, if you can retreat somewhat on the "hundreds of MB/s", the MPC5200B can be quite useful (and by now I could probably offer more help with it than the manufacturer :). It can do 2 cycles per 32 bit at 66 MHz (132 MB/s) on a generic bus, and it can do 66 MHz PCI (32 bit). However, while 66 MHz PCI is 1 longword per clock, it will burst only a cacheline at a time (32 bytes, 8 beats). I am not sure how many burst initiation cycles this took.

Dimiter

------------------------------------------------------
Dimiter Popoff, TGI http://www.tgi-sci.com
------------------------------------------------------
http://www.flickr.com/photos/didi_tgi/
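For a rough feel of the trade-off described above, here is a minimal Python sketch. It takes the figures from the post (66 MHz, 32-bit bus, 2 clocks per word on the generic bus, 8-beat PCI bursts) and treats the number of PCI burst initiation clocks, which the post leaves open, as a free parameter. The function names are made up for illustration and the bus model is idealised, so this is an estimate and not a data sheet figure.

```python
# Back-of-envelope throughput for the MPC5200B figures quoted above.
# Assumptions (not from a data sheet): idealised bus, 1 MB = 10**6 bytes,
# and the PCI burst initiation overhead is an unknown that we sweep over.

CLOCK_MHZ = 66
WORD_BYTES = 4      # 32-bit bus
BURST_BEATS = 8     # one 32-byte cacheline per PCI burst


def generic_bus_mb_s(clocks_per_word=2):
    """Generic bus: one 32-bit word every `clocks_per_word` clocks."""
    return CLOCK_MHZ * WORD_BYTES / clocks_per_word


def pci_burst_mb_s(initiation_clocks):
    """PCI: 8 data beats plus a hypothetical number of initiation clocks."""
    burst_bytes = BURST_BEATS * WORD_BYTES
    return CLOCK_MHZ * burst_bytes / (BURST_BEATS + initiation_clocks)


if __name__ == "__main__":
    print(f"generic bus: {generic_bus_mb_s():.0f} MB/s")
    for overhead in (0, 2, 4, 8, 12):
        rate = pci_burst_mb_s(overhead)
        print(f"PCI, {overhead:2d} initiation clocks: {rate:.0f} MB/s")
```

Under these assumptions PCI only beats the generic bus while the initiation overhead stays below 8 clocks per burst; at exactly 8 clocks the two come out equal.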
Reply by Dimiter_Popoff September 10, 2016
On 10.9.2016 г. 17:51, Tim Wescott wrote:
> On Fri, 09 Sep 2016 15:34:23 +0100, Theo Markettos wrote:
>
>> I'm looking for a Cortex-A class processor that has reasonably quick
>> parallel I/O that might be hooked up to an FPGA. I'm aware of the
>> existing Zynq and Altera SoC FPGAs, but looking for something different.
>>
>> By 'parallel I/O' I mean ideally a memory interface - either
>> bidirectional eg 64-bits or separate 32-bit tx and 32-bit rx. GPIO with
>> data valid signals/strobes is another possibility. By 'quick' I mean
>> hundreds of MHz, ideally with low latency.
>>
>> I know there are things like the TI PRU in eg the OMAP family, but they
>> seem to have a limited number of pins (16 bit tx/rx).
>>
>> Can anyone suggest anything else in this space?
>>
>> Thanks Theo
>
> I couldn't find where you asked, but here's the processor I found:
>
> http://www.ti.com/lit/ds/symlink/am3358.pdf
>
> I'm absolutely not saying "use this one" -- it's just the first one I
> found, and it had a conventional memory bus. I suspect that if you look
> around you'll find something better.
>
> I looked at the general-purpose memory interface and it doesn't look fast
> enough for you -- it's calling out 100MHz or 50MHz clock on a 16-bit wide
> bus, and I'm not sure when you can drive it at 100MHz.
>
> DDR, OTOH, will go up to a 200MHz clock (with 400MHz data rate) -- that's
> why I was suggesting it, if things are simple enough on the FPGA side.
>
DDR data rate is sort of misleading. The data are clocked in/out at this rate all right, but the processors I have seen do one cacheline at a time (32 bytes) and add to that quite a number of initiation cycles. Perhaps the largest ones do take advantage of open pages etc., but the smaller ones are unlikely to do it - which in practice yields around half the clocked data rate.

Dimiter
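To put rough numbers on the "around half" estimate, here is a minimal Python sketch. The 200MHz clock and 32-byte cacheline come from the thread; the 16-bit bus width and the 8 overhead clocks per burst are assumptions picked for illustration only, and the real figure depends on the controller and the DRAM timing.

```python
# Rough DDR burst efficiency: peak vs. one-cacheline-at-a-time access.
# Assumptions for illustration: 16-bit data bus, 8 overhead clocks
# (activate/CAS latency/precharge lumped together) per 32-byte burst.


def peak_mb_s(clock_mhz, bus_bytes):
    """Peak DDR rate: two transfers per clock across the whole bus."""
    return clock_mhz * 2 * bus_bytes


def effective_mb_s(clock_mhz, bus_bytes, burst_bytes=32, overhead_clocks=8):
    """Rate when every burst pays a fixed number of overhead clocks."""
    data_clocks = burst_bytes / (2 * bus_bytes)
    return clock_mhz * burst_bytes / (data_clocks + overhead_clocks)


if __name__ == "__main__":
    peak = peak_mb_s(200, 2)
    eff = effective_mb_s(200, 2)
    print(f"peak {peak:.0f} MB/s, effective {eff:.0f} MB/s, "
          f"ratio {eff / peak:.2f}")
```

With these assumed values a 32-byte burst needs 8 data clocks, so 8 overhead clocks halve the throughput. Fewer overhead clocks (for example with open pages) would push the ratio up.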
Reply by Theo Markettos September 10, 2016
Tim Wescott <tim@seemywebsite.com> wrote:
> On Sat, 10 Sep 2016 09:51:35 -0500, Tim Wescott wrote:
> > I couldn't find where you asked, but here's the processor I found:
> >
> > http://www.ti.com/lit/ds/symlink/am3358.pdf
> >
> > I'm absolutely not saying "use this one" -- it's just the first one I
> > found, and it had a conventional memory bus. I suspect that if you look
> > around you'll find something better.
Thanks. I think you struck lucky there - many SoCs don't have such an interface, or at least it's behind a NAND controller which adds some 'cleverness' you might not want.

I noticed that, and I also found the iMX6 which has a 32 bit external memory interface at 104MHz. The trouble is most SOMs either don't pin that out or only partially pin it out (too many parallel pins), which tends to mean 8 or 16 bits at 104MHz - so maybe 200MB/s is the best I'm likely to get. (Though it seems the bandwidth somebody measured on iMX6 is substantially worse.)
> That "faster DDR clock means faster data" assertion assumes that you have
> big chunks to send -- it'll probably be slower to send single words, but
> you'll gain a lot if you can use burst mode.
Burst is fine, but dealing with the DDR DRAM PHY and unpicking what the DDR memory controller did to you isn't nice. The AM3358 NOR flash burst (being SDR) looks a bit more sane.

Theo
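The "8 or 16 bits at 104MHz" estimate earlier in this post can be checked with a few lines of Python. This is only a sketch of the peak figure: it assumes one transfer per clock (SDR) with no wait states or turnaround, which is optimistic, and the function name is invented for the example.

```python
# Peak rate of a single-data-rate parallel bus at various pinned-out widths.
# Assumption: one transfer per clock, no wait states (an upper bound only).


def peak_mb_s(clock_mhz, width_bits):
    """Peak SDR rate in MB/s for a bus `width_bits` wide."""
    return clock_mhz * width_bits / 8


if __name__ == "__main__":
    for width in (8, 16, 32):
        print(f"{width:2d} bits at 104MHz: {peak_mb_s(104, width):.0f} MB/s")
```

So 16 bits at 104MHz tops out at 208 MB/s, which is where the "maybe 200MB/s" comes from; the full 32-bit interface would double that if a SOM pinned it all out.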
Reply by September 10, 2016
On Saturday, September 10, 2016 at 9:51:46 AM UTC-5, Tim Wescott wrote:
> On Fri, 09 Sep 2016 15:34:23 +0100, Theo Markettos wrote:
>
> > I'm looking for a Cortex-A class processor that has reasonably quick
> > parallel I/O that might be hooked up to an FPGA. I'm aware of the
> > existing Zynq and Altera SoC FPGAs, but looking for something different.
> >
> > By 'parallel I/O' I mean ideally a memory interface - either
> > bidirectional eg 64-bits or separate 32-bit tx and 32-bit rx. GPIO with
> > data valid signals/strobes is another possibility. By 'quick' I mean
> > hundreds of MHz, ideally with low latency.
> >
> > I know there are things like the TI PRU in eg the OMAP family, but they
> > seem to have a limited number of pins (16 bit tx/rx).
> >
> > Can anyone suggest anything else in this space?
> >
> > Thanks Theo
>
> I couldn't find where you asked, but here's the processor I found:
>
> http://www.ti.com/lit/ds/symlink/am3358.pdf
>
> I'm absolutely not saying "use this one" -- it's just the first one I
> found, and it had a conventional memory bus. I suspect that if you look
> around you'll find something better.
>
> I looked at the general-purpose memory interface and it doesn't look fast
> enough for you -- it's calling out 100MHz or 50MHz clock on a 16-bit wide
> bus, and I'm not sure when you can drive it at 100MHz.
>
> DDR, OTOH, will go up to a 200MHz clock (with 400MHz data rate) -- that's
> why I was suggesting it, if things are simple enough on the FPGA side.
>
> --
> Tim Wescott
> Control systems, embedded software and circuit design
> I'm looking for work! See my website if you're interested
> http://www.wescottdesign.com
]> > I'm looking for a Cortex-A class processor that has reasonably quick
]> > parallel I/O that might be hooked up to an FPGA. I'm aware of the
]> > existing Zynq and Altera SoC FPGAs, but looking for something different.

] http://www.ti.com/lit/ds/symlink/am3358.pdf

The TI part has a DRAM port and a separate second memory port with 7 chip selects. That is a lot better than a single memory port, because DDR timing is very different from what FPGA I/O ports can do. With 7 chip selects one can have distinct FPGA read and write pins, with tri-states on the read pins. Again, that helps with timing in my experience.

According to the literature, the Xilinx tools completely handle the internal interfaces within the Zynq part, and Vivado/ISE gives you your timing pass/fail. I'd go with a SoC FPGA in a minute, given the timing troubles we had with a distinct ARM chip.
Reply by September 10, 2016
On Friday, 9 September 2016 at 18:37:54 UTC+2, Theo Markettos wrote:
> rickman <gnuarm@gmail.com> wrote:
> > I think you are not going to find anything other than the memory
> > interface. What's wrong with that? I assume you are referring to a
> > processor that runs in the GHz range, but even then it would be hard for
> > it to push data out of parallel I/O at "hundreds of MHz". Since this
> > would be a very atypical use of parallel I/Os, I can't imagine a chip
> > maker who would put the I/O pins on the fast local bus. Rather they are
> > typically connected through a slower bus for the peripherals.
> >
> > Can I ask why you don't want to use high speed serial I/Os (which are
> > intended for this) or a combined FPGA/CPU chip? Is using the memory
> > interface too obvious or is there a reason to not use that?
>
> I don't have a problem with using a memory interface, just would rather
> avoid something like PCIe - I don't have a transceiver interface to receive
> it. I just wasn't aware of anything that exported a 'simple' high speed
> memory interface.
>
> The reason for not using a combined FPGA/CPU chip is that I'll be using a
> dev board rather than making my own board (buying serious FPGAs in small
> quantities isn't fun and I'd rather not have to do the DDR3/etc layout). On
> the other side of my bridge CPLD/FPGA is a 1.5V parallel interface: most ARM
> FPGA dev boards hardwire their pins to something other than 1.5v, so I can't
> simply use the FPGA on the combined chip. I'm also physically constrained
> which rules out a lot of dev boards.
on Microzed the Vcco is in the connector so you can set it to 1.5V if you need

-Lasse