
Custom CPU Designs

Started by Rick C April 16, 2020
Grant Edwards <invalid@invalid.invalid> wrote:
> Once I got a UART working so I could print messages, I just gave up on
> the JTAG BS. Another interesting quirk was that the Altera USB JTAG
> interface only worked right with a few specific models of powered USB
> hubs.
I've spent months working around such problems :( We have an application
that pushes gigabytes through JTAG UARTs and have learnt all about it...
There's a pile of specific issues:

- The USB 1.1 JTAG is an FT245 chip which basically bitbangs JTAG; it
  sends a byte containing 4 bits for the 4 JTAG wires. The software is
  literally saying "clock high, clock low, clock high, clock low" etc.
  Timing of that is not reliable. Newer development boards have a USB 2.0
  programmer where things are a bit better, but it's still bitbanging
  (the encoding is sketched just below this post).

- Being USB 1.1, if you have a cheap USB 2.0 hub it may only have a
  single transaction translator (single-TT), which means all USB 1.1
  peripherals share 12Mbps of bandwidth. In our case we have 16 FPGAs all
  trying to chat over that shared 12Mbps. Starvation occurs and nobody
  makes any progress. A better hub with multiple transaction translators
  (multi-TT) will allow multiple 12Mbps streams to share the 480Mbps USB
  2.0 bandwidth. Unfortunately, when you buy a hub this is never
  advertised or explained.

- The software daemon that generates the bitbanging data is called jtagd
  and it's single threaded. It can max out a CPU core bitbanging, and
  that can lead to unreliability. I had an Atom where it was unusable. I
  now install i7s in servers with FPGAs, purely to push bits down the
  JTAG wire.

- To parallelise downloads to multiple FPGAs, I've written some horrible
  containerisation scripts that lie to each jtagd that there's only one
  FPGA in the system. Then I can launch 16 jtagds and use all 16 cores in
  my system to push traffic through the JTAG UARTs.

- Did I mention that programming an FPGA takes about 700MB? So I need to
  fit at least 8GB of RAM to avoid memory starvation when doing parallel
  programming (if the system swaps, the bitbanging stalls and the FPGA
  programming fails).

- There's some trouble with jtagd and libudev.so.0 - if you don't have
  it, things seem to work but get unreliable. I just symlink
  libudev.so.1 on Ubuntu and that seems to fix it.

- The register-level interface of the JTAG UART can't read the state of
  the input FIFO without also dequeuing the data on it, which makes
  writing reliable device drivers almost impossible. I have a version
  that wraps the UART in a 16550 register interface to avoid this
  problem.

- If the FPGA is failing timing, the producer/consumer of the UART can
  break in interesting ways, which look a lot like a problem with the
  USB hub or similar.

It's a very precarious pile of hardware and software that falls over in
numerous ways if pushed at all hard :(

Theo

[adding comp.arch.fpga since this is relevant to those folks]
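[Editor's note: for a concrete feel of why bitbanged JTAG is slow, here
is a minimal C++ sketch of the encoding described above. The bit
positions are illustrative assumptions, not the actual Altera cable
pinout.]

    #include <cstdint>
    #include <vector>

    // Illustrative bit assignments -- assumed for this sketch, not
    // the real USB-Blaster pinout.
    constexpr uint8_t TCK = 1 << 0;
    constexpr uint8_t TMS = 1 << 1;
    constexpr uint8_t TDI = 1 << 2;

    // Shift one byte out on TDI, LSB first, TMS held low: each data
    // bit becomes two bytes on the USB wire, one with TCK low and
    // one with TCK high.
    std::vector<uint8_t> bitbang_byte(uint8_t data) {
        std::vector<uint8_t> out;
        for (int i = 0; i < 8; ++i) {
            uint8_t tdi = ((data >> i) & 1) ? TDI : 0;
            out.push_back(tdi);        // TCK low: set up TDI
            out.push_back(tdi | TCK);  // TCK high: bit is sampled
        }
        return out;  // 16 bytes of USB traffic per payload byte
    }

That 16x expansion, generated byte by byte in user space, is why jtagd
can saturate a CPU core and why the timing is at the mercy of the host
scheduler.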
On 17/04/20 17:15, David Brown wrote:
> On 17/04/2020 16:23, Tom Gardner wrote:
>> On 17/04/20 14:44, David Brown wrote:
>>> On 17/04/2020 11:49, Tom Gardner wrote:
>>>> On 17/04/20 09:02, David Brown wrote:
>
>>>> As you say, the XMOS /ecosystem/ is far more compelling,
>>>> partly because it has excellent /integration/ between the
>>>> hardware, the software and the toolchain. The latter two
>>>> are usually missing.
>>>
>>> Agreed. And the XMOS folk have learned and improved. With the first chips,
>>> they proudly showed off that you could make a 100 MBit Ethernet controller in
>>> software on an XMOS chip. Then it was pointed out to them that - impressive
>>> achievement though it was - it was basically useless because you didn't have
>>> the resources left to use it for much, and hardware Ethernet controllers were
>>> much cheaper. So they brought out new XMOS chips with hardware Ethernet
>>> controllers. The same thing happened with USB.
>>
>> It looks like a USB controller needs ~8 cores, which isn't
>> a problem on a 16 core device :)
>>
>
> I've had another look, and I was mistaken - these devices only have the USB and
> Ethernet PHYs, not the MACs, and thus require a lot of processor power, pins,
> memory and other resources. It doesn't need 8 cores, but the whole thing just
> seems so inefficient. No one is going to spend the extra cost for an XMOS with
> a USB PHY, so why not put a hardware USB controller on the chip? The silicon
> costs would surely be minor, and it would save a lot of development effort and
> release resources that are useful for other tasks. The same goes for Ethernet.
> Just because you /can/ make these things in software on the XMOS devices, does
> not make it a good idea.
Oh I agree! However, being able to do it in software is a good demonstration of the device's unique characteristics, and that "you aren't in Kansas anymore".
> Overall, the thing that bugs me about XMOS is that you can write very simple,
> elegant tasks for the cores to do various tasks. But when you do that, you run
> out of cores almost immediately. So you have to write your code in a way that
> implements your own scheduler, losing a major part of the point of the whole
> system. Or you use the XMOS FreeRTOS port on one of the virtual cores - in
> which case you could just switch to a Cortex-M microcontroller with hardware
> USB, Ethernet, PWM, UART, etc. and a fraction of the price.
I didn't know they had a FreeRTOS port, and it sounds like having a dog and barking :) Sounds like it would combine the disadvantages and negate the advantages! Having said that, they did have a chip where one of the processors was an ARM. Perhaps it was intended that the ARM run FreeRTOS?
> If the XMOS devices and software had a way of neatly multi-tasking /within/ a
> single virtual core, while keeping the same kind of inter-task communication and
> other benefits, then they would have something I could see being very nice.
There is a half-way house. If you adopt a certain coding style, the IDE will combine several processes to run on a single processor. Basically it is equivalent to appending all the processes' "startup" code into a single block, and all the "forever loop" code into a single block. The key bit is combining all the processes' select statements into a single select statement. With that understanding, the coding style requirements become obvious, not onerous, and they are checked by the compiler. A sketch of the transformation follows.
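[Editor's note: a plain C++ paraphrase of the merge Tom describes. The
task names and event sources are hypothetical stand-ins for XC channels
and ports; in XC the compiler performs this combination itself for
tasks written in the required style.]

    #include <cstdio>

    // Two logical processes, each split into "startup" code and a
    // handler for one event.
    struct UartTask {
        void init() { std::puts("uart ready"); }
        bool event_ready() { return false; }   // would poll the hardware
        void handle() { /* one select-case body */ }
    };

    struct PwmTask {
        void init() { std::puts("pwm ready"); }
        bool event_ready() { return false; }
        void handle() { /* another select-case body */ }
    };

    int main() {
        UartTask uart;
        PwmTask pwm;
        uart.init();  // all the "startup" code, appended into one block...
        pwm.init();
        // ...and the two forever loops merged into one loop with a
        // single combined select (bounded here only so the demo exits;
        // the real loop never terminates).
        for (int spin = 0; spin < 1000; ++spin) {
            if (uart.event_ready()) uart.handle();
            if (pwm.event_ready())  pwm.handle();
        }
    }

The constraint (one never-ending select loop per process) is exactly
what makes this merge mechanical enough for a compiler to check.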
>>> There is a lot to like about XMOS devices and tools, but they still strike me
>>> as a solution in search of a problem. An elegant solution, perhaps, but
>>> still missing a problem. We used them for a project many years ago for a USB
>>> Audio Class 2 device. There simply were no realistic alternatives at the
>>> time, but I can't say the XMOS solution was a good one. The device has far
>>> too little memory to make sensible buffers (this still applies to XMOS
>>> devices, last I looked), and the software at the time was painful (this I
>>> believe has improved significantly). If we were making a new version of the
>>> product, we'd drop the XMOS device in an instant and use an off-the-shelf
>>> chip instead.
>>
>> I certainly wouldn't want to comment on your use case.
>
> As I said, it was a while ago, when XMOS were relatively new - I assume the
> software, libraries and examples are better now than at that time. But for
> applications like ours, you can just get a CMedia chip and wire it up - no
> matter how good XMOS tools have become, they don't beat that.
>
> (And then all the development budget can be spent on trying to get drivers to
> work on idiotic Windows systems...)
What is this "Windows" of which you speak?
>> To me a large part of the attraction is that you can
>> /predict/ the /worst/ case latency and jitter (and hence
>> throughput), in a way that is difficult in a standard MCU
>> and easy in an FPGA.
>
> For standard MCU's, you aim to do this by using hardware peripherals (timers,
> PWM blocks, communication controllers, etc.) for the most timing-critical
> stuff. Then you don't need it in the software.
Yebbut, the toolset won't analyse and predict worst case performance. So you are back to "run it and hope we stumble upon the worst case". Yes, that is sufficient in many cases, but it is /inelegant/, dammit!
>> To that extent it allows FPGA-like performance with "traditional"
>> software development tools and methodologies. Plus a little
>> bit of "thinking parallel" that everybody will soon /have/ to
>> be doing :)
>
> It's a nice idea, and I'm sure XMOS has some good use-cases. But I can't help
> feeling they have something that is /almost/ a good system - with a bit more,
> they could be very much more useful.
There's no doubt it is niche. The world will go parallel. A major difficulty will be finding people that can think that way. (Just look at how difficult softies find it when they try to "program" VHDL/Verilog) We need all the tools and concepts we can muster; my fear is that CSP is the best! :)
On Friday, April 17, 2020 at 12:15:37 PM UTC-4, David Brown wrote:
> On 17/04/2020 16:23, Tom Gardner wrote:
> > On 17/04/20 14:44, David Brown wrote:
> >> On 17/04/2020 11:49, Tom Gardner wrote:
> >>> On 17/04/20 09:02, David Brown wrote:
>
> >>> As you say, the XMOS /ecosystem/ is far more compelling,
> >>> partly because it has excellent /integration/ between the
> >>> hardware, the software and the toolchain. The latter two
> >>> are usually missing.
> >>
> >> Agreed. And the XMOS folk have learned and improved. With the first
> >> chips, they proudly showed off that you could make a 100 MBit Ethernet
> >> controller in software on an XMOS chip. Then it was pointed out to
> >> them that - impressive achievement though it was - it was basically
> >> useless because you didn't have the resources left to use it for much,
> >> and hardware Ethernet controllers were much cheaper. So they brought
> >> out new XMOS chips with hardware Ethernet controllers. The same thing
> >> happened with USB.
> >
> > It looks like a USB controller needs ~8 cores, which isn't
> > a problem on a 16 core device :)
> >
>
> I've had another look, and I was mistaken - these devices only have the
> USB and Ethernet PHYs, not the MACs, and thus require a lot of processor
> power, pins, memory and other resources. It doesn't need 8 cores, but
> the whole thing just seems so inefficient. No one is going to spend the
> extra cost for an XMOS with a USB PHY, so why not put a hardware USB
> controller on the chip? The silicon costs would surely be minor, and it
> would save a lot of development effort and release resources that are
> useful for other tasks. The same goes for Ethernet. Just because you
> /can/ make these things in software on the XMOS devices, does not make
> it a good idea.
>
> Overall, the thing that bugs me about XMOS is that you can write very
> simple, elegant tasks for the cores to do various tasks. But when you
> do that, you run out of cores almost immediately. So you have to write
> your code in a way that implements your own scheduler, losing a major
> part of the point of the whole system. Or you use the XMOS FreeRTOS
> port on one of the virtual cores - in which case you could just switch
> to a Cortex-M microcontroller with hardware USB, Ethernet, PWM, UART,
> etc. and a fraction of the price.
Too bad the XMOS doesn't have more CPUs, like maybe 144 of them?
> If the XMOS devices and software had a way of neatly multi-tasking
> /within/ a single virtual core, while keeping the same kind of
> inter-task communication and other benefits, then they would have
> something I could see being very nice.
Is a "virtual core" one CPU? Multitasking a single CPU is the thing the XMOS is supposed to eliminate, no? Why bring it back? Oh, because there aren't enough CPUs on the XMOS for some applications! So it's back to the fast ARM processors and multitasking. I seem to recall there being asymmetric multicores from various ARM makers with one fast CPU for multitasking and a smaller CPU for handling the lesser real time tasks without interference. That's a good combination, but again, a more specific target market. It seems that is how the CPU market has gone. The volumes are so high there are many niche areas justifying their own type of SoC to address it.
> >> There is a lot to like about XMOS devices and tools, but they still
> >> strike me as a solution in search of a problem. An elegant solution,
> >> perhaps, but still missing a problem. We used them for a project many
> >> years ago for a USB Audio Class 2 device. There simply were no
> >> realistic alternatives at the time, but I can't say the XMOS solution
> >> was a good one. The device has far too little memory to make sensible
> >> buffers (this still applies to XMOS devices, last I looked), and the
> >> software at the time was painful (this I believe has improved
> >> significantly). If we were making a new version of the product, we'd
> >> drop the XMOS device in an instant and use an off-the-shelf chip instead.
> >
> > I certainly wouldn't want to comment on your use case.
>
> As I said, it was a while ago, when XMOS were relatively new - I assume
> the software, libraries and examples are better now than at that time.
> But for applications like ours, you can just get a CMedia chip and wire
> it up - no matter how good XMOS tools have become, they don't beat that.
>
> (And then all the development budget can be spent on trying to get
> drivers to work on idiotic Windows systems...)
>
> > To me a large part of the attraction is that you can
> > /predict/ the /worst/ case latency and jitter (and hence
> > throughput), in a way that is difficult in a standard MCU
> > and easy in an FPGA.
>
> For standard MCU's, you aim to do this by using hardware peripherals
> (timers, PWM blocks, communication controllers, etc.) for the most
> timing-critical stuff. Then you don't need it in the software.
And all that hardware costs chip space which you may or may not use. That's why they have so many flavors, to give the "perfect" combination of memory, peripherals and analog to minimize cost for each project. FPGAs have a cost overhead which is fading into the background as they become more and more efficient. For many designs an FPGA provides a good trade off between cost and flexibility. In many cases it also provides a functionality that can't be duplicated elsewhere.
> > To that extent it allows FPGA-like performance with "traditional"
> > software development tools and methodologies. Plus a little
> > bit of "thinking parallel" that everybody will soon /have/ to
> > be doing :)
>
> It's a nice idea, and I'm sure XMOS has some good use-cases. But I
> can't help feeling they have something that is /almost/ a good system -
> with a bit more, they could be very much more useful.
It's good, it just has its own niche of applications where it is the best solution. Nothing wrong with that!

--
Rick C.
Rick C <gnuarm.deletethisbit@gmail.com> writes:
> A single, fast CPU is harder to program than many, fast CPUs.
> Programmers have to learn a lot in order to perform multitasking on a
> single CPU.
Really it's the other way around. A typical programmer these days might not know how to implement a multitasker or OS on a bare machine, but they do know how to spawn processes and use them on a machine with an OS. Organizing a parallel or distributed program is much harder.
On Friday, April 17, 2020 at 3:17:55 PM UTC-4, Paul Rubin wrote:
> Rick C <gnuarm.deletethisbit@gmail.com> writes:
> > A single, fast CPU is harder to program than many, fast CPUs.
> > Programmers have to learn a lot in order to perform multitasking on a
> > single CPU.
>
> Really it's the other way around. A typical programmer these days might
> not know how to implement a multitasker or OS on a bare machine, but
> they do know how to spawn processes and use them on a machine with an
> OS. Organizing a parallel or distributed program is much harder.
Really? Multitasking is a lot more complex than just spawning tasks. There are potential conditions that can lock up the computer or the tasks. Managing task priorities can be a very complex issue, and learning how to do that correctly is an important part of multitasking. In real-time systems it becomes potentially the hardest part of a project.

Breaking a design down to assign tasks to various processors is a much simpler matter. It's much like hardware design, where you dedicate hardware to perform various actions and simply don't have the many problems of sharing a single CPU among many tasks.

Do I have it wrong? Is multitasking actually simple, and do the various articles I've read about the complexities overstate the matter?

--
Rick C.
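[Editor's note: an aside on the "conditions that can lock up the
tasks" mentioned above. The classic example is two tasks taking two
locks in opposite order; the C++ sketch below shows the hazard and the
standard fix. The task bodies are hypothetical.]

    #include <mutex>
    #include <thread>

    std::mutex a, b;

    // Classic lock-order inversion: if task1 took a then b while
    // task2 took b then a, each could end up holding one lock and
    // waiting forever for the other.  std::scoped_lock acquires both
    // locks with a deadlock-avoidance algorithm, so this version
    // always terminates.
    void task1() { std::scoped_lock lk(a, b); /* critical section */ }
    void task2() { std::scoped_lock lk(b, a); /* critical section */ }

    int main() {
        std::thread t1(task1), t2(task2);
        t1.join();
        t2.join();
    }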
On Fri, 17 Apr 2020 15:44:15 +0200, David Brown
<david.brown@hesbynett.no> wrote:

>
>>
>> As you say, the XMOS /ecosystem/ is far more compelling,
>> partly because it has excellent /integration/ between the
>> hardware, the software and the toolchain. The latter two
>> are usually missing.
The xCORE architecture is nice, if you are comfortable implementing whole applications mainly with interrupts. Just assign one core to each ISR. When an external HW signal (an "interrupt") occurs, restart the program on that core. The program needs to finish executing before the next "interrupt" occurs. This makes implementing e.g. an audio or video sampled-data system pleasant: the core's task just needs to be fast enough to handle one sample.
> Agreed. And the XMOS folk have learned and improved. With the first
> chips, they proudly showed off that you could make a 100 MBit Ethernet
> controller in software on an XMOS chip. Then it was pointed out to them
> that - impressive achievement though it was - it was basically useless
> because you didn't have the resources left to use it for much, and
> hardware Ethernet controllers were much cheaper. So they brought out
> new XMOS chips with hardware Ethernet controllers. The same thing
> happened with USB.
On Ethernet, the minimum MAC frame is 64 bytes, so new short frames may appear every 6.4 us and the "ISR" must execute in less than 6.4 us. If that is not possible, let one task just split the frame into header and actual payload, and use separate cores e.g. to handle the MAC and IP headers. Still, 8 cores sounds like quite a large number.
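[Editor's note: a cross-check on that wire-time budget. At 100 Mbit/s
one bit is 10 ns; a minimum frame alone is 512 bit times, and adding
preamble/SFD and the inter-frame gap gives the back-to-back arrival
rate, so the 6.4 us figure above is the right order of magnitude.]

    #include <cstdio>

    constexpr double ns_per_bit = 10.0;   // 100 Mbit/s
    constexpr int min_frame_bytes = 64;   // minimum MAC frame
    constexpr int gap_bytes = 8 + 12;     // preamble/SFD + inter-frame gap

    int main() {
        std::printf("frame alone:  %.2f us\n",
                    min_frame_bytes * 8 * ns_per_bit / 1000.0);   // 5.12 us
        std::printf("back-to-back: %.2f us\n",
                    (min_frame_bytes + gap_bytes) * 8
                        * ns_per_bit / 1000.0);                   // 6.72 us
    }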
Rick C <gnuarm.deletethisbit@gmail.com> writes:
> Do I have it wrong? Is multitasking actually simple and the various
> articles I've read about the complexities overstate the matter?
Multitasking isn't exactly simple, but we (programmers) are used to it by now. The stuff you read about lock hazards is mostly from multi-threading in a single process. If you have processes communicating through channels, there are still ways to mess up, but it's usually simpler than dealing with threads and locks.
On Friday, April 17, 2020 at 5:48:50 PM UTC-4, Paul Rubin wrote:
> Rick C <gnuarm.deletethisbit@gmail.com> writes:
> > Do I have it wrong? Is multitasking actually simple and the various
> > articles I've read about the complexities overstate the matter?
>
> Multitasking isn't exactly simple, but we (programmers) are used to it
> by now. The stuff you read about lock hazards is mostly from
> multi-threading in a single process. If you have processes
> communicating through channels, there are still ways to mess up, but
> it's usually simpler than dealing with threads and locks.
Exactly, the mindset is to use multitasking... but it can still be complex. That's my point: what you are used to is what you use, even when it's not the best approach. Splitting a design to run on independent processors is just as easy, if not more so, because of the lack of sharing issues.

The stuff you are thinking of with distributed processing is when your application doesn't suit multitasking and it needs to be distributed over a lot of processors to speed it up. That's not the same issue at all as simply getting the job done. That's the sort of stuff they have problems with on supercomputers.

I think we've been down this road before.

--
Rick C.
On 17/04/2020 18:49, Tom Gardner wrote:
> On 17/04/20 17:15, David Brown wrote:
>> On 17/04/2020 16:23, Tom Gardner wrote:
>>> On 17/04/20 14:44, David Brown wrote:
>>>> On 17/04/2020 11:49, Tom Gardner wrote:
>>>>> On 17/04/20 09:02, David Brown wrote:
>>
>>>>> As you say, the XMOS /ecosystem/ is far more compelling,
>>>>> partly because it has excellent /integration/ between the
>>>>> hardware, the software and the toolchain. The latter two
>>>>> are usually missing.
>>>>
>>>> Agreed. And the XMOS folk have learned and improved. With the
>>>> first chips, they proudly showed off that you could make a 100 MBit
>>>> Ethernet controller in software on an XMOS chip. Then it was
>>>> pointed out to them that - impressive achievement though it was - it
>>>> was basically useless because you didn't have the resources left to
>>>> use it for much, and hardware Ethernet controllers were much
>>>> cheaper. So they brought out new XMOS chips with hardware Ethernet
>>>> controllers. The same thing happened with USB.
>>>
>>> It looks like a USB controller needs ~8 cores, which isn't
>>> a problem on a 16 core device :)
>>>
>>
>> I've had another look, and I was mistaken - these devices only have
>> the USB and Ethernet PHYs, not the MACs, and thus require a lot of
>> processor power, pins, memory and other resources. It doesn't need 8
>> cores, but the whole thing just seems so inefficient. No one is going
>> to spend the extra cost for an XMOS with a USB PHY, so why not put a
>> hardware USB controller on the chip? The silicon costs would surely
>> be minor, and it would save a lot of development effort and release
>> resources that are useful for other tasks. The same goes for
>> Ethernet. Just because you /can/ make these things in software on the
>> XMOS devices, does not make it a good idea.
>
> Oh I agree! However, being able to do it in software is a
> good demonstration of the device's unique characteristics,
> and that "you aren't in Kansas anymore"
>
Indeed. The real power comes from when you want to do something that is /not/ standard, or at least not common. Implementing a standard UART in a couple of XMOS cores is a pointless waste of silicon. Implementing a UART that uses Manchester encoding for the UART signals so that you can use it on a balanced line without keeping track of which line is which - /then/ you've got something that can be done just as easily on an XMOS and is a big pain to do on a standard microcontroller. Implementing an Ethernet MAC on an XMOS is pointless. Implementing an EtherCAT slave is not going to be much harder for the XMOS than a normal Ethernet MAC, but is impossible on any microcontroller without specialised peripherals.
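[Editor's note: a sketch of why Manchester coding suits a
polarity-agnostic link. Every bit is sent as a transition, the line
carries no DC bias, and a known preamble or start bit lets the receiver
resolve which polarity it is seeing. IEEE 802.3 convention assumed
here; the inverse convention also exists.]

    #include <cstdint>
    #include <cstdio>

    // Manchester-encode one byte, LSB first (IEEE 802.3 convention:
    // 0 -> low-then-high, 1 -> high-then-low).  Each data bit becomes
    // two half-bit symbols, so one byte yields 16 line bits.
    uint16_t manchester_encode(uint8_t b) {
        uint16_t line = 0;
        for (int i = 0; i < 8; ++i) {
            uint16_t symbol = ((b >> i) & 1) ? 0b10 : 0b01;
            line |= symbol << (2 * i);
        }
        return line;
    }

    int main() {
        std::printf("0x55 -> 0x%04x\n", manchester_encode(0x55));  // 0x6666
    }

Transmitting at twice the bit rate from a timed port loop is natural on
an XMOS core; a conventional UART's fixed NRZ framing gives you no such
hook, which is David's point.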
>
>> Overall, the thing that bugs me about XMOS is that you can write very
>> simple, elegant tasks for the cores to do various tasks. But when you
>> do that, you run out of cores almost immediately. So you have to
>> write your code in a way that implements your own scheduler, losing a
>> major part of the point of the whole system. Or you use the XMOS
>> FreeRTOS port on one of the virtual cores - in which case you could
>> just switch to a Cortex-M microcontroller with hardware USB, Ethernet,
>> PWM, UART, etc. and a fraction of the price.
>
> I didn't know they had a FreeRTOS port, and it sounds
> like having a dog and barking :) Sounds like it would
> combine the disadvantages and negate the advantages!
>
Some real-time stuff needs microsecond or sub-microsecond precision - XMOS lets you do that in software on a core, while normally you'd do it in dedicated peripherals on a microcontroller. But a lot needs millisecond or sub-second precision, and FreeRTOS is absolutely fine for that. (As are other methods, such as software timers.)
> Having said that, they did have a chip where one of the
> processors was an ARM. Perhaps it was intended that the
> ARM run FreeRTOS?
I haven't seen such a chip. Do you have a link? It could be an interesting device.
>
>
>> If the XMOS devices and software had a way of neatly multi-tasking
>> /within/ a single virtual core, while keeping the same kind of
>> inter-task communication and other benefits, then they would have
>> something I could see being very nice.
>
> There is a half-way house.
>
> If you adopt a certain coding style, the IDE will combine
> several processes to run on a single processor.
>
> Basically it is equivalent to appending all the processes'
> "startup" code into a single block, and all the "forever
> loop" code into a single block. The key bit is combining
> all the processes' select statements into a single select
> statement.
>
> With that understanding, the coding style requirements
> become obvious, not onerous, and they are checked by the
> compiler.
I understand the principle, but you'd lose some of the modularity here. How well does it work if you want to have your UART, your CAN, your PWM, etc., defined in different files - and then you want to put them on the same core? I guess it will be possible.
>
>
>>>> There is a lot to like about XMOS devices and tools, but they still
>>>> strike me as a solution in search of a problem. An elegant
>>>> solution, perhaps, but still missing a problem. We used them for a
>>>> project many years ago for a USB Audio Class 2 device. There simply
>>>> were no realistic alternatives at the time, but I can't say the XMOS
>>>> solution was a good one. The device has far too little memory to
>>>> make sensible buffers (this still applies to XMOS devices, last I
>>>> looked), and the software at the time was painful (this I believe
>>>> has improved significantly). If we were making a new version of the
>>>> product, we'd drop the XMOS device in an instant and use an
>>>> off-the-shelf chip instead.
>>>
>>> I certainly wouldn't want to comment on your use case.
>>
>> As I said, it was a while ago, when XMOS were relatively new - I
>> assume the software, libraries and examples are better now than at
>> that time. But for applications like ours, you can just get a CMedia
>> chip and wire it up - no matter how good XMOS tools have become, they
>> don't beat that.
>>
>> (And then all the development budget can be spent on trying to get
>> drivers to work on idiotic Windows systems...)
>
> What is this "Windows" of which you speak?
>
Something some customers have. It is a system designed to be as inconvenient for developers as humanly possible.
>
>>> To me a large part of the attraction is that you can
>>> /predict/ the /worst/ case latency and jitter (and hence
>>> throughput), in a way that is difficult in a standard MCU
>>> and easy in an FPGA.
>>
>> For standard MCU's, you aim to do this by using hardware peripherals
>> (timers, PWM blocks, communication controllers, etc.) for the most
>> timing-critical stuff. Then you don't need it in the software.
>
> Yebbut, the toolset won't analyse and predict worst case
> performance. So you are back to "run it and hope we stumble
> upon the worst case".
>
The peripherals are independent, and specified in documentation. If the PWM timer block can do 16-bit precision at 120 MHz, then you know its limits - and it doesn't matter how many UARTs you use or how fast you want your SPI bus to run. You don't need to analyse the timings - that was done when the chip was designed. You need to check the speeds of your high-level software that uses the modules, but that applies to XMOS code too.
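[Editor's note: to make the "you know its limits" claim concrete, the
datasheet figures translate directly into guarantees. A sketch of the
arithmetic for the 16-bit/120 MHz example:]

    #include <cstdio>

    constexpr double clk_hz = 120e6;  // PWM block clock from the datasheet

    int main() {
        // slowest PWM frequency at full 16-bit resolution:
        std::printf("min freq: %.0f Hz\n", clk_hz / 65536);       // ~1831 Hz
        // duty-cycle steps available at a 20 kHz PWM frequency:
        std::printf("steps at 20 kHz: %.0f\n", clk_hz / 20e3);    // 6000
    }

No amount of UART or SPI traffic elsewhere on the chip changes these
numbers, which is the point being made above.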
> Yes, that is sufficient in many cases, but it is /inelegant/,
> dammit!
>
There is always room for improvement and extra tools!
>
>>> To that extent it allows FPGA-like performance with "traditional"
>>> software development tools and methodologies. Plus a little
>>> bit of "thinking parallel" that everybody will soon /have/ to
>>> be doing :)
>>
>> It's a nice idea, and I'm sure XMOS has some good use-cases. But I
>> can't help feeling they have something that is /almost/ a good system
>> - with a bit more, they could be very much more useful.
>
> There's no doubt it is niche.
>
> The world will go parallel. A major difficulty will be finding
> people that can think that way. (Just look at how difficult
> softies find it when they try to "program" VHDL/Verilog)
>
> We need all the tools and concepts we can muster; my fear
> is that CSP is the best! :)
I'd like a good way to do CSP stuff in C or C++. I've been looking at passing std::variant types in message queues in FreeRTOS, but I'm not happy with the results yet. I think I'll have to make my own alternative to std::variant, to make it more efficient for the task. It's fun. A different kind of fun from playing with the XMOS, but fun. If I ever get time, I will dig out my old XMOS kit and see how the current tools are looking. If I can find a neat way to get your "half-way house" system working, it will increase a good deal in its attractiveness for me.
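[Editor's note: a host-side sketch of the pattern David describes - a
CSP-style channel whose messages are a closed set of types in a
std::variant, with std::visit playing the role of a select over message
kinds. On FreeRTOS the queue would be a fixed-size xQueue; std::mutex
and std::condition_variable stand in for it here. The naive version
also shows the inefficiency he mentions: every slot is padded to the
size of the largest alternative.]

    #include <condition_variable>
    #include <cstdio>
    #include <mutex>
    #include <queue>
    #include <thread>
    #include <variant>

    using Msg = std::variant<int, float>;  // the closed set of messages

    class Channel {
        std::queue<Msg> q_;
        std::mutex m_;
        std::condition_variable cv_;
    public:
        void send(Msg v) {
            { std::lock_guard<std::mutex> lk(m_); q_.push(std::move(v)); }
            cv_.notify_one();
        }
        Msg receive() {  // blocks until a message arrives
            std::unique_lock<std::mutex> lk(m_);
            cv_.wait(lk, [this] { return !q_.empty(); });
            Msg v = std::move(q_.front());
            q_.pop();
            return v;
        }
    };

    int main() {
        Channel ch;
        std::thread producer([&] { ch.send(42); ch.send(3.14f); });
        for (int i = 0; i < 2; ++i) {
            // std::visit dispatches on whichever alternative arrived.
            std::visit([](auto v) { std::printf("got %g\n", double(v)); },
                       ch.receive());
        }
        producer.join();
    }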
On 17/04/2020 19:48, Rick C wrote:
> On Friday, April 17, 2020 at 12:15:37 PM UTC-4, David Brown wrote:
>> On 17/04/2020 16:23, Tom Gardner wrote:
>>> On 17/04/20 14:44, David Brown wrote:
>>>> On 17/04/2020 11:49, Tom Gardner wrote:
>>>>> On 17/04/20 09:02, David Brown wrote:
>>
>>>>> As you say, the XMOS /ecosystem/ is far more compelling,
>>>>> partly because it has excellent /integration/ between the
>>>>> hardware, the software and the toolchain. The latter two are
>>>>> usually missing.
>>>>
>>>> Agreed. And the XMOS folk have learned and improved. With the
>>>> first chips, they proudly showed off that you could make a 100
>>>> MBit Ethernet controller in software on an XMOS chip. Then it
>>>> was pointed out to them that - impressive achievement though it
>>>> was - it was basically useless because you didn't have the
>>>> resources left to use it for much, and hardware Ethernet
>>>> controllers were much cheaper. So they brought out new XMOS
>>>> chips with hardware Ethernet controllers. The same thing
>>>> happened with USB.
>>>
>>> It looks like a USB controller needs ~8 cores, which isn't a
>>> problem on a 16 core device :)
>>>
>>
>> I've had another look, and I was mistaken - these devices only have
>> the USB and Ethernet PHYs, not the MACs, and thus require a lot of
>> processor power, pins, memory and other resources. It doesn't need
>> 8 cores, but the whole thing just seems so inefficient. No one is
>> going to spend the extra cost for an XMOS with a USB PHY, so why
>> not put a hardware USB controller on the chip? The silicon costs
>> would surely be minor, and it would save a lot of development
>> effort and release resources that are useful for other tasks. The
>> same goes for Ethernet. Just because you /can/ make these things
>> in software on the XMOS devices, does not make it a good idea.
>>
>> Overall, the thing that bugs me about XMOS is that you can write
>> very simple, elegant tasks for the cores to do various tasks. But
>> when you do that, you run out of cores almost immediately. So you
>> have to write your code in a way that implements your own
>> scheduler, losing a major part of the point of the whole system.
>> Or you use the XMOS FreeRTOS port on one of the virtual cores - in
>> which case you could just switch to a Cortex-M microcontroller with
>> hardware USB, Ethernet, PWM, UART, etc. and a fraction of the
>> price.
>
> Too bad the XMOS doesn't have more CPUs, like maybe 144 of them?
>
144 CPUs is far more than would be useful in practice - as long as you have cores that can do useful work in a flexible way (like XMOS cores). When you have very limited cores that can barely do anything themselves, you need lots of them. (Arguably that's what you have on an FPGA - lots of tiny bits that do very little on their own. But the key difference is the tools - FPGAs would be a lot less popular if you had to code each LUT individually, do placement manually by numbering them, and write a routing file by hand.)
>
>> If the XMOS devices and software had a way of neatly multi-tasking
>> /within/ a single virtual core, while keeping the same kind of
>> inter-task communication and other benefits, then they would have
>> something I could see being very nice.
>
> Is a "virtual core" one CPU? Multitasking a single CPU is the thing
> the XMOS is supposed to eliminate, no? Why bring it back? Oh,
> because there aren't enough CPUs on the XMOS for some applications!
> So it's back to the fast ARM processors and multitasking.
>
When you are writing multi-tasking code, you often want a lot of tasks. More than 8. (Sometimes, just to be more awkward, you want them to be created dynamically.) Any specific limit is an inconvenient limit. There are always ways to get round this - like using an RTOS on one virtual core of the XMOS. But you lose some of the symmetry and convenience that way. (I understand why the XMOS is designed the way it is - any system is going to be a compromise between what the hardware designers can do practically, and what the software designers want.)
> I seem to recall there being asymmetric multicores from various ARM
> makers with one fast CPU for multitasking and a smaller CPU for
> handling the lesser real time tasks without interference.
>
Yes - you have a Cortex-A CPU that can handle high throughput, perhaps with several cores, combined with a Cortex-M device that is more deterministic and can give more real-time control.
> That's a good combination, but again, a more specific target market.
> It seems that is how the CPU market has gone. The volumes are so
> high there are many niche areas justifying their own type of SoC to
> address it.
>
These are becoming increasingly common - they are no longer niche. If you have a system that needs the processing power of a bigger CPU (for screens, image handling, embedded Linux, the convenience and low development costs of high-level languages and off-the-shelf libraries, etc.), then having a small CPU for handling ADCs, timers, PWM, UARTs, power management, keys, and that kind of thing is a big win. Even combinations of fast M7 or M4 cores with an M0 core are common, especially if you have a specific task for the small core (like running a Bluetooth stack for an embedded wireless device).
>
>>>> There is a lot to like about XMOS devices and tools, but they
>>>> still strike me as a solution in search of a problem. An
>>>> elegant solution, perhaps, but still missing a problem. We
>>>> used them for a project many years ago for a USB Audio Class 2
>>>> device. There simply were no realistic alternatives at the
>>>> time, but I can't say the XMOS solution was a good one. The
>>>> device has far too little memory to make sensible buffers (this
>>>> still applies to XMOS devices, last I looked), and the software
>>>> at the time was painful (this I believe has improved
>>>> significantly). If we were making a new version of the
>>>> product, we'd drop the XMOS device in an instant and use an
>>>> off-the-shelf chip instead.
>>>
>>> I certainly wouldn't want to comment on your use case.
>>
>> As I said, it was a while ago, when XMOS were relatively new - I
>> assume the software, libraries and examples are better now than at
>> that time. But for applications like ours, you can just get a
>> CMedia chip and wire it up - no matter how good XMOS tools have
>> become, they don't beat that.
>>
>> (And then all the development budget can be spent on trying to get
>> drivers to work on idiotic Windows systems...)
>>
>>>
>>> To me a large part of the attraction is that you can /predict/
>>> the /worst/ case latency and jitter (and hence throughput), in a
>>> way that is difficult in a standard MCU and easy in an FPGA.
>>
>> For standard MCU's, you aim to do this by using hardware
>> peripherals (timers, PWM blocks, communication controllers, etc.)
>> for the most timing-critical stuff. Then you don't need it in the
>> software.
>
> And all that hardware costs chip space which you may or may not use.
> That's why they have so many flavors, to give the "perfect"
> combination of memory, peripherals and analog to minimize cost for
> each project.
The silicon cost of basic peripherals like timers and UARTs is tiny - generally close to irrelevant. Ethernet is a bit more costly, and some kinds of peripherals have additional costs such as royalties (this used to be the case for CAN controllers until the patents ran out). Analogue parts can be the most costly in silicon space, especially if they need calibrating in some way. Much of the cost of peripherals is in the IO blocks and drivers, and the multiplexing and signal routing to support them. Memory blocks - while simple - usually take up a much bigger part of the die area.
>
> FPGAs have a cost overhead which is fading into the background as
> they become more and more efficient. For many designs an FPGA
> provides a good trade off between cost and flexibility. In many
> cases it also provides a functionality that can't be duplicated
> elsewhere.
Sure, FPGAs have their uses - including areas where they are the only sensible solution, and areas of overlap where either microcontrollers or FPGAs could do the job.

When looking at the cost of making these choices, there are three main parts: the development costs, the production costs, and the lifetime costs. How you balance these will depend on the type of product, the quantities you make, the use of the product, and its expected lifetime. So no single answer is ever going to be "right".

But one thing is very clear - for developers and companies that have done a lot of FPGA development, the costs of developing a new FPGA-based device will be far smaller than for a company that has not done such systems before. Don't assume that because FPGA design is cheap for /you/ to do, it is necessarily cheap for others. The opposite is true as well - there are plenty of boards made where an FPGA (or other programmable logic) would make things simpler and cheaper, but is not seriously considered because programmable logic is often viewed as expensive and difficult.

About the only thing you can be sure of in embedded development is that there are many possible answers. And for any serious project, by the time you have a finished product there will be new devices and new answers that could have made the whole thing cheaper!
>
>
>>> To that extent it allows FPGA-like performance with
>>> "traditional" software development tools and methodologies. Plus
>>> a little bit of "thinking parallel" that everybody will soon
>>> /have/ to be doing :)
>>
>> It's a nice idea, and I'm sure XMOS has some good use-cases. But
>> I can't help feeling they have something that is /almost/ a good
>> system - with a bit more, they could be very much more useful.
>
> It's good, it just has its own niche of applications where it is the
> best solution. Nothing wrong with that!
>
Indeed. I think what bugs me most is that those niches haven't yet turned up in the projects my customers are asking for!