EmbeddedRelated.com
Forums

MCU mimicking a SPI flash slave

Started by John Speth June 14, 2017
On 16/06/17 04:12, rickman wrote:
> Tom Gardner wrote on 6/15/2017 6:33 PM:
>> On 15/06/17 21:52, rickman wrote:
>>> Tom Gardner wrote on 6/15/2017 1:01 PM:
>>>> On 15/06/17 17:13, rickman wrote:
>>>>> Tom Gardner wrote on 6/15/2017 3:01 AM:
>>>>>> That sounds easy for the XMOS processors:
>>>>>> - multicore (up to 32 100MIPS cores)
>>>>>> - FPGA-like i/o, e.g. SERDES, buffers, programmable clocks to 250Mb/s, separate timer for each port
>>>>>> - i/o timing changes can be specified/measured in software with 4ns resolution
>>>>>> - guaranteed latencies <100ns
>>>>>> - *excellent* programming model based on CSP (summary: "Occam in C")
>>>>>> - Eclipse-based dev environment
>>>>> And yet the XMOS can't run fast enough to return a result in the time available. What a pity, all dressed up and nowhere to go.
>>>> You'll have to explain that, because I don't understand what you are referring to.
>>> Maybe I don't understand the speed of the XMOS.
>> A 30000ft overview: http://www.xmos.com/published/xcore-architecture-flyer?version=latest
>>> Nothing you write above says to me it is any faster than the 700 MIPS processor I considered. Nothing written above would allow it to perform the task any differently than the bit-banging approach I initially considered.
>> What are the *guaranteed* timings and latencies? Include all possible disturbances due to cache/TLB misses, branch predictions and interrupts. Include simultaneous USB comms to a host PC.
>> N.B. "guaranteed" /precludes/ measurements to see what is happening. "Guaranteed" requires accurate /prediction/ *before* the code executes.
> I know this is clear to you, but I'm not sure what processor you are talking about.
>>> So what are you confused about?
>>> Maybe you don't understand the problem. A bit-serial data port (two bits wide actually, but that is not relevant) provides data, address and a two-bit command word serially. The last bit of the command word is indicated by a "command" signal going high during that bit. Data is clocked in on the rising edge of the clock. When the command word is a read, the read data is provided serially starting on the subsequent falling edge. This provides 15 ns from rising edge to falling edge. It is possible to prefetch the read data based on the shifted address on each rising edge until the command signal is asserted. This helps with timing of the memory fetch, but still, that data has to be presented to the serial port output in 15 ns and updated every 30 ns.
>>> I'd like to see code that will make this work. Or maybe you weren't addressing my previous application? For the OP, can the XMOS emulate a 30 Mbps SPI port?
>> Each i/o pin can be clocked at up to 250Mb/s. That clock can come from an internal clock (up to 500MHz) or an external pin. The latter sounds relevant for your case. (Strobed and master/slave interfaces are also directly supported in hardware and software.)
>> Each I/O pin has SERDES registers, so the data rate processed by a core can be reduced by a factor of 32.
>> So yes, it does look like a /small/ fraction of an xCORE device could very comfortably support that speed.
> You still are not addressing the issue at hand. You are talking about raw I/O speeds and the problem is not about raw I/O speeds. The issue is interactivity. Stimulus followed by response in very short order. I have seen nothing to indicate the XMOS will work for this problem. That's why I asked for a code snippet.
There are many interesting snippets, plus a description of how they fit together, in the /very/ lucid xC tutorial: https://www.xmos.com/support/tools/programming?component=17653

You might care to look for the "pinseq/pinsneq" attribute of i/o ports, in conjunction with the ways of combining i/o pins, and port timers.

For more details of some i/o structures, see https://www.xmos.com/published/xs1-ports-introduction?version=latest
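To make the framing rickman describes concrete, here is a behavioural model in Python. Everything in it is illustrative guesswork: the class and function names, the command encoding, and the frame layout (8 address bits followed by the 2-bit command, shifted in 2 bits per rising edge, MSB first) are assumptions, and a software model of course says nothing about meeting the 15 ns turnaround.

```python
READ_CMD = 0b01  # hypothetical encoding of the 2-bit "read" command

class SerialSlave:
    """Behavioural model of the 2-bit-wide serial port rickman describes."""

    def __init__(self, memory):
        self.memory = memory   # 256 x 8-bit words, indexed by address
        self.shift = 0         # shift register, filled 2 bits per clock
        self.prefetched = 0

    def rising_edge(self, two_bits, cmd_flag):
        """Clock in 2 bits; cmd_flag high marks the last bit of the command word.
        Returns the read data when a read command completes, else None."""
        self.shift = ((self.shift << 2) | (two_bits & 0b11)) & 0x3FF
        # Prefetch on every edge: before CMD asserts, the low bits may
        # already hold the address, so the memory fetch can start early.
        self.prefetched = self.memory[(self.shift >> 2) & 0xFF]
        if cmd_flag:
            command = self.shift & 0b11     # the 2 command bits arrive last
            self.shift = 0
            if command == READ_CMD:
                # In hardware this word would be shifted out starting on
                # the subsequent falling edge, within the 15 ns budget.
                return self.prefetched
        return None

def clock_in_word(slave, address, command):
    """Drive one 10-bit frame (8 address bits, then 2 command bits), MSB first."""
    frame = ((address & 0xFF) << 2) | (command & 0b11)
    result = None
    for i in range(4, -1, -1):              # five clocks of 2 bits each
        bits = (frame >> (2 * i)) & 0b11
        result = slave.rising_edge(bits, cmd_flag=(i == 0))
    return result
```

Because only the last five clocks before CMD determine the 10-bit frame, the model is tolerant of extraneous clock edges between words, matching the behaviour described elsewhere in the thread.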
>>>>>> Currently I'm managing to count transitions in software on two 50Mb/s inputs, plus do simultaneous USB comms to a host computer.
>>>>> Can you output a result before the next input transition?
>>>> Yes and no.
>>>> Yes: the output (to a host processor and/or LCD) proceeds in parallel with the next capture phase.
>>> I'm referring to bit I/O using the same 50 MHz clock.
>>>> No: in the current incarnation there is a short gap between capture phases as the results are passed from one core to another and the next capture phase is started. While it isn't important in my application, I may be able to remove that limitation in future incarnations.
>>> That's a lot more than a "short" gap in the context of a fast clock.
>> Sigh.
>> My application only stops for a few microseconds when it is convenient for my application, i.e. once every 1 or 10 seconds at the end of a measurement cycle.
>> At other times it chunters away continuously at full rate without interruption - as guaranteed by design.
> Fine, but the fact that it will work for your needs does not mean it will work for mine. Again the requirement is to read the inputs looking for a command strobe, on finding that retrieve the appropriate word from memory and output it. The clock cycle is 30 ns and the total I/O time from command strobe read on the positive edge of the clock to the input of the device monitoring the output pin (with a 5 ns setup time) is 15 ns. You haven't even addressed the output delays in the I/O pins.
>>>>>> Processors plus £10 devkit available from Farnell and DigiKey.
>>>>> Not a very good processor to use in cost-conscious applications. The lowest price at Digikey is $3 at qty 1000. They can't make a part in the sub-dollar range?
>>>> Don't presume everybody has your constraints.
>>> I don't, but not everyone has *your* constraints. If a processor can't be sold in a low-cost product, it cuts off the largest volume parts of the market including the app I am looking at presently.
>> I don't think there are any surprises there.
>> But, again, your (current) requirement is only one perspective.
> I can't argue with that. But that is the need I have and low cost is not a very minor requirement. Processors are sold at much higher volumes for the low-cost products. The higher-cost, lower-volume products often can be built with a wide range of solutions; again the selected device is often chosen as the one that meets the requirements at the lowest price. So shrugging off the cost issue of *my* current need is rather disingenuous.
Of course I will shrug them off; they are of no direct interest to me.

You have a choice...

You could look at the XMOS devices, see where they might complement the existing design options (principally FPGAs and conventional MCUs), and blur the boundaries between them.

Or you could find ways in which the devices couldn't /possibly/ do what existing options can do (cf. the way h/w designers reacted to early microprocessors).

I can imagine that FPGA designers might see them as a /bit/ of a threat, and react accordingly.
Tom Gardner wrote on 6/16/2017 2:05 AM:
> [snip]
>
> I can imagine that FPGA designers might see them as a /bit/ of a threat, and react accordingly.
A threat??? That makes no sense.

-- 
Rick C
On 16/06/17 00:15, rickman wrote:
> David Brown wrote on 6/15/2017 5:37 PM:
>> On 15/06/17 22:52, rickman wrote:
>
> I think you are saying the CPU can do things with a 10 ns time resolution, no? That is the relevant number for this if bit-banging the I/O port. I assume the "dual issue" can't simultaneously execute instructions where one depends on the result from the other?
Yes, that sounds right.
>> The XMOS has parallel/serial converters for every GPIO pin. For an application like this, you would use an 8-bit SERDES on the MISO and MOSI pins, and use the clock pin to trigger the transfers.
>
> Serial-to-parallel converters may not help with this design. Data is 8 bits, address is 8 bits, command is 2 bits, the incoming data path is 2 bits wide, the outgoing data path is 1 bit wide. The design was made to be tolerant of extraneous clock edges between words. The end of the serial transfer was flagged by a CMD signal going high on the 2-bit command word transfer. I don't see how an 8-bit serial shift register would help receive the input data even if it were only 1 bit wide (or you use two CPUs), since you can't rely on the clock count to always be right, *plus* you get a single clock with 2 input bits and a flag to indicate the end of the input transfer. Then you have 15 ns (minus setup and hold time on the output pin) to fetch the data and start shifting it out.
>
> I recall even in the FPGA I was reading the output of a register mux which then had to feed a shift register. I used another 1-bit mux to select the output of the mux for the first bit and the output of the shift register for the remaining bits *and* the timing was tight.
It sounds like a particularly challenging design you have here. The XMOS will let you do many things an ordinary microcontroller cannot, but it can't do /everything/ ! There are some timing requirements that are impossible without programmable logic.
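The output path rickman describes (first bit taken straight from the register mux while the shift register loads in parallel, remaining bits from the shift register) can be sketched behaviourally. This is an illustrative Python model of that mux arrangement, with invented names and MSB-first ordering assumed:

```python
def serialize_read(word, width=8):
    """Yield the read-data bit stream, one bit per falling edge (MSB first)."""
    mask = (1 << width) - 1
    # First bit: taken combinationally from the register mux output, so it
    # is available before the shift register has been clocked even once.
    yield (word >> (width - 1)) & 1
    # Meanwhile the shift register is parallel-loaded; the first bit has
    # already been consumed, so load the word pre-shifted by one.
    shift_reg = (word << 1) & mask
    for _ in range(width - 1):            # remaining bits from the register
        yield (shift_reg >> (width - 1)) & 1
        shift_reg = (shift_reg << 1) & mask
```

The 1-bit output mux rickman mentions corresponds to the `yield` choice here: mux output for bit 0, shift-register output thereafter.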
> The data ports of the design had serial interfaces up to 50 MHz, time-correlated to a CODEC. The CODEC received a time code which was transmitted along with the digital data in packets over IP. The same board on the other end received the packets and reconstructed the data and time stamp.
>
> Once an FPGA was on the board there was no reason to use a CPU, although I would have liked to have a hybrid chip with about 1000 4-input LUTs and a moderate CPU or even a DSP. Add a 16-bit stereo CODEC and it would be perfect!
Often a soft processor can be enough for housekeeping, but an FPGA with a fast ARM core would be nice. Expensive, but nice. I think Atmel/Microchip now have a microcontroller with a bit of programmable logic - I don't know how useful that might be. (Not for your application here, of course - neither the cpu nor the PLD part are powerful enough.)
> I wonder why they can't make lower cost versions? The GA144 has 144 processors, $15 @ qty 1. It's not even a modern process node, 150 or 180 nm I think, 100 times more area than what they are using today.
I guess it is the usual matter - NRE and support costs have to be amortized. When the chip is not a big seller (and I don't imagine the GA144 is that popular), they have to make back their investment somehow. Have you used the GA144? It sounds interesting, but I haven't thought of any applications for it.
On 15/06/17 23:46, rickman wrote:
> David Brown wrote on 6/15/2017 7:04 AM:
>> On 15/06/17 10:41, Tom Gardner wrote:
>>> On 15/06/17 09:02, David Brown wrote:
>
> I think that depends on the design. Perhaps you are familiar with the GA144 with 144 processors at a cost of around $0.10 per CPU. They took a similar route with NO dedicated I/O hardware other than a SERDES receiver and transmitter pair.
>
> It is expected that all I/O would be through software. Along that line the chip boots through one of three I/O ports: async serial, SPI serial, a 2-wire interface, and I believe there is a 1-wire interface but I'm not certain. Three of the CPUs can be ganged to form a parallel port/memory interface, two CPUs with 18 bits of I/O and the third 4 bits. All of this is controlled by software.
>
> I find the device has significant limitations overall, but certainly with a peak execution rate of 700 MIPS there is a lot of potential. Much like no one focuses on the idea that using the 4-input LUT of an FPGA as an inverter is excessively wasteful, the GA144 with its 10-cent CPUs gets us out of the thinking that using a CPU as a UART is wasteful.
>
> Not trying to be negative, but I see the $2 cost of an XMOS CPU as being excessive and wasteful. Heck, you can buy complete MCU devices for a fraction of that price.
I had a little look at the GreenArrays website for the GA144. It appears to be pretty much a dead company. There is almost nothing happening - no new products, no new roadmaps, no new software, no new application notes. It is a highly specialised device, with a very unusual development process (Forth is 50 years old, and it looks it - and these devices use a weird variant of Forth). I think it would be a very risky gamble to use these devices for a real project, even though there is a lot of interesting stuff in the technology.
>>> XMOS does provide some techniques (e.g. composable and interfaces) to reduce the sharpness of the cliffs, but not to eliminate them. But then the same is true of ARM+RTOS etc etc.
>>
>> Yes, there are learning curves everywhere. And scope for getting things wrong :-) XMOS gives a different balance amongst many of the challenges facing designers - it is better in some ways, worse in other ways.
>>
>>> I wouldn't regard XMOS as being a replacement for a general-purpose processor, but there is a large overlap in many hard-realtime applications.
>>>
>>> Similarly I wouldn't regard an ARM as being a replacement for a CPLD/FPGA, but there can be an overlap in many soft-realtime applications.
>>>
>>> The XMOS devices inhabit an interesting and important niche between those two.
>>
>> Agreed.
>
> The question is whether it is worth investing the time and energy into learning the chip if you don't focus your work in this realm.
Certainly - working with XMOS means thinking a little differently. But it is not /nearly/ as different as the GA144.
> Personally I find the FPGA approach covers a *lot* of ground that others don't see, and the region left that is not so easy to address with either FPGAs or more conventional CPUs is very limited. If the XMOS isn't a good price fit, I most likely would just go with a small FPGA. I saw the XMOS has a 12-bit, 1 MSPS ADC which is nice. But again, this only makes its range of good fit slightly larger.
If you have lots of experience with FPGA development, it is natural to look to FPGAs for solutions to design problems - and that is absolutely fine. It is impossible to jump on every different development method and be an expert at them all - and most problems can be solved in a variety of ways.
>>>> XMOS devices can be a lot of fun, and I would enjoy working with them again. For some tasks, they are an ideal solution (like for this one), once you have learned to use them. But not everything is easy with them, and you have to think in a somewhat different way.
>>>
>>> Yes, but the learning curve is very short and there are no unpleasant surprises.
>>>
>>> IMNSHO "thinking in CSP" will help structure thoughts for any realtime application, whether or not it is formally part of the implementation.
>>
>> Agreed.
>
> I prefer Forth for embedded work. I believe there is a Forth available.
Available for what? The XMOS? I'd be surprised.
> I don't know if it captures any of the flavor of CSP, mostly because I know little about CSP. I did use Occam some time ago. Mostly I recall it had a lot of constraints on what the programmer was allowed to do. The project we were using the Transputer on programmed it all in C.
Occam has its own advantages and disadvantages independent of the use of CSP-style synchronisation and message passing.

I have looked at Forth a few times over the years, but I have yet to see a version that has changed for the better since I played with it as a teenager some 30 years ago. The stuff on the GA144 website is absurd - their "innovation" is that their IDE has colour highlighting, despite looking like a reject from the days of MSDOS 3.3. Words are apparently only sensitive to the first "5 to 7 characters" - so "block" and "blocks" are the same identifier. (You would think they would /know/ if the limit were 5, 6 or 7 characters.) Everything is still designed around punch card formats of 8 by 64 characters.

I can appreciate that a stack machine design is ideal for a small processor, and can give very tight code. I can appreciate that this means a sort of Forth is the natural assembly language for the system. I can even appreciate that the RPN syntax, the interactivity, and the close-to-the-metal programming appeal to some people. But I cannot understand why this cannot be done with a modern language with decent typing, static checking, optimised compilation, structured syntax, etc.
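To illustrate the point about stack machines and RPN for readers who haven't met Forth: in postfix code every word either pushes a literal or pops its operands from an implicit stack, so instructions carry no operand addresses, which is why the code can be so tight. A minimal evaluator in Python (purely illustrative; not GA144 code or any real Forth):

```python
def rpn_eval(program):
    """Evaluate a whitespace-separated postfix program; return the final stack."""
    stack = []
    binops = {"+": lambda a, b: a + b,
              "-": lambda a, b: a - b,
              "*": lambda a, b: a * b}
    for word in program.split():
        if word in binops:
            b, a = stack.pop(), stack.pop()   # operands come from the stack,
            stack.append(binops[word](a, b))  # not from the instruction
        elif word == "dup":                   # Forth-style stack-manipulation word
            stack.append(stack[-1])
        else:
            stack.append(int(word))           # literal: push it
    return stack

# "3 4 + dup *" computes (3 + 4) squared, Forth-style.
```

A register-machine instruction set would need source and destination fields in every instruction; the implicit stack removes them.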
On Fri, 16 Jun 2017 13:21:03 +0200, David Brown
<david.brown@hesbynett.no> wrote:

> I have looked at Forth a few times over the years, but I have yet to see a version that has changed for the better since I played with it as a teenager some 30 years ago.
The versions shipped by the professional companies have changed a lot. The major commercial suppliers are Forth Inc and MicroProcessor Engineering. See:

http://www.forth.com
http://www.mpeforth.com

Forth does not suit everyone, but it has changed a lot since you were a teenager.

Stephen

-- 
Stephen Pelc, stephenXXX@mpeforth.com
MicroProcessor Engineering Ltd - More Real, Less Time
133 Hill Lane, Southampton SO15 5AF, England
tel: +44 (0)23 8063 1441, fax: +44 (0)23 8033 9691
web: http://www.mpeforth.com - free VFX Forth downloads
On 16/06/17 15:55, Stephen Pelc wrote:
> On Fri, 16 Jun 2017 13:21:03 +0200, David Brown <david.brown@hesbynett.no> wrote:
>
>> I have looked at Forth a few times over the years, but I have yet to see a version that has changed for the better since I played with it as a teenager some 30 years ago.
>
> The versions shipped by the professional companies have changed a lot. The major commercial suppliers are Forth Inc and MicroProcessor Engineering. See
> http://www.forth.com
> http://www.mpeforth.com
>
> Forth does not suit everyone, but it has changed a lot since you were a teenager.
>
> Stephen
I had a look. No, FORTH has not progressed that I can see (unless you think adding colour to the editor is a revolution). Some FORTH compilers might be good at producing optimised code on microcontrollers, but that is an improvement in the implementations, not the language.
On Fri, 16 Jun 2017 16:34:55 +0200, David Brown
<david.brown@hesbynett.no> wrote:

> I had a look. No, FORTH has not progressed that I can see (unless you think adding colour to the editor is a revolution). Some FORTH compilers might be good at producing optimised code on microcontrollers, but that is an improvement in the implementations, not the language.
We use standard editors with syntax colouring files, as we have done for decades. The professional Forth compilers, whether for microcontrollers or for the desktop, produce optimised native code. The current Forth standard is Forth-2012. See:

http://www.forth200x.org/documents/forth-2012.pdf

What you used 30 years ago did not include target code for:

USB stack
FAT file system
TCP/IP stack with HTTP, FTP and Telnet servers
Embedded GUI

and so on and so on. Having been in the Forth compiler business for a very long time, I can assure you that the tools and libraries supplied in this decade are vastly superior to those of 30 years ago.

Stephen
David Brown wrote on 6/16/2017 3:25 AM:
> [snip]
>
> It sounds like a particularly challenging design you have here. The XMOS will let you do many things an ordinary microcontroller cannot, but it can't do /everything/! There are some timing requirements that are impossible without programmable logic.
I think it would have been possible with the GA144 and the 700 MIPS peak rate, but it depended on the I/O timing which I couldn't get them to provide. They suggested I build one and test it! lol
> [snip]
>
> Often a soft processor can be enough for housekeeping, but an FPGA with a fast ARM core would be nice. Expensive, but nice.
Not sure what you mean by a "fast" ARM core, but ARMs combined with FPGAs are sold by three of the four FPGA companies.
> I think Atmel/Microchip now have a microcontroller with a bit of programmable logic - I don't know how useful that might be. (Not for your application here, of course - neither the cpu nor the PLD part are powerful enough.)
You might be thinking of the PSOC devices from Cypress. They have either an 8051-type processor or an ARM CM3 with various programmable logic and analog. Not really an FPGA in any sense. They can be programmed in Verilog, but they are not terribly capable. Think of them as having highly flexible peripherals.

If you really mean the Atmel FPGAs, they are very old, very slow and very expensive. I don't consider them to be in the FPGA business; they are more in the obsolete device business, like Rochester Electronics. The device line they had that included a small 8-bit processor is long gone and was never cost competitive.
>> I wonder why they can't make lower cost versions? The GA144 has 144 processors, $15 @ qty 1. It's not even a modern process node, 150 or 180 nm I think, 100 times more area than what they are using today.
>
> I guess it is the usual matter - NRE and support costs have to be amortized. When the chip is not a big seller (and I don't imagine the GA144 is that popular), they have to make back their investment somehow.
I'm talking about the XMOS device. The GA144 could easily be sold cheaply if they use a more modern process and sold them in high volumes. But what is holding back XMOS from selling a $1 chip? My understanding is the CPU is normally a pretty small part of an MCU with memory being the lion's share of the real estate. Even having 8 CPUs shouldn't run the area and cost up since the chip is really all about the RAM. Is the RAM special in some way? I thought it was just fast and shared through multiplexing.
> Have you used the GA144? It sounds interesting, but I haven't thought
> of any applications for it.
There are a number of issues with using the GA144 in a production design, not the least of which is the lack of a reliable supply. The company runs on a shoestring with minimal offices, encouraging free help from anyone interested in writing an app note. lol

When they kicked off the GA144 there was a lot of interest from fairly fringe groups of designers (the assembly language is pretty much Forth), but I have yet to hear of any designs reaching production -- which is not the same thing as there being none. The production runs appear to be the minimum size test runs from the foundry. The chip is pretty small, so they get a *lot* from a wafer.

--
Rick C
David Brown wrote on 6/16/2017 7:21 AM:
> On 15/06/17 23:46, rickman wrote:
>> David Brown wrote on 6/15/2017 7:04 AM:
>>> On 15/06/17 10:41, Tom Gardner wrote:
>>>> On 15/06/17 09:02, David Brown wrote:
>>
>> I think that depends on the design. Perhaps you are familiar with the
>> GA144 with 144 processors at a cost of around $0.10 per CPU. They took
>> a similar route with NO dedicated I/O hardware other than a SERDES
>> receiver and transmitter pair.
>>
>> It is expected that all I/O would be through software. Along that line
>> the chip boots through one of three I/O ports, async serial, SPI serial,
>> a 2 wire interface and I believe there is a 1 wire interface but I'm not
>> certain. Three of the CPUs can be ganged to form a parallel port/memory
>> interface, two CPUs with 18 bits of I/O and the third 4 bits. All of
>> this is controlled by software.
>>
>> I find the device has significant limitations overall, but certainly
>> with a peak execution rate of 700 MIPS there is a lot of potential.
>> Much like no one focuses on the idea that using the 4 input LUT of an
>> FPGA as an inverter is excessively wasteful, the GA144 with its 10 cent
>> CPUs gets us out of the thinking that using a CPU as a UART is wasteful.
>>
>> Not trying to be negative, but I see the $2 cost of an XMOS CPU as being
>> excessive and wasteful. Heck, you can buy complete MCU devices for a
>> fraction of that price.
>
> I had a little look at the GreenArrays website for the GA144. It
> appears to be pretty much a dead company. There is almost nothing
> happening - no new products, no new roadmaps, no new software, no new
> application notes. It is a highly specialised device, with a very
> unusual development process (Forth is 50 years old, and it looks it -
> and these devices use a weird variant of Forth). I think it would be a
> very risky gamble to use these devices for a real project, even though
> there is a lot of interesting stuff in the technology.
Yeah, I don't know of any product using the GA144. I looked hard at using it in a production design where I needed to replace an EOL FPGA. Ignoring all the other issues, I wasn't sure it would meet the timing I've outlined in other posts in this thread. I needed info on the I/O timing and GA wouldn't provide it. They seemed to think I wanted to reverse engineer the transistor level design. Silly gooses.

I don't know why the age of a computer language is even a factor. C is no newcomer, and it is the most widely used programming language in embedded devices, no? The GA144 is a stack processor, so its assembly language looks a lot like Forth, which is itself based on a stack machine virtual model. I'm not sure what is "weird" about it other than the fact that most programmers aren't familiar with stack programming, outside of Open Boot, PostScript, RPL and BibTeX.
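For anyone who hasn't met the model: a stack machine evaluates RPN directly, with literals pushed onto a stack and operators consuming the top items. Here's a minimal sketch in Python -- purely illustrative, nothing like the GA144's actual instruction set:

```python
# Minimal RPN (stack machine) evaluator -- a sketch of the evaluation
# model behind Forth and stack processors like the GA144.  Purely
# illustrative; real Forth words are far richer than this.

def rpn_eval(tokens):
    stack = []
    binops = {"+": lambda a, b: a + b,
              "-": lambda a, b: a - b,
              "*": lambda a, b: a * b}
    for tok in tokens:
        if tok == "dup":                      # Forth DUP: copy top of stack
            stack.append(stack[-1])
        elif tok in binops:
            b, a = stack.pop(), stack.pop()   # operands come off the stack
            stack.append(binops[tok](a, b))   # result goes back on
        else:
            stack.append(int(tok))            # literals are pushed
    return stack

# "3 4 + dup *" computes (3 + 4) squared
print(rpn_eval("3 4 + dup *".split()))   # [49]
```

Because operands are always implicit (the top of the stack), there are no register fields to encode -- which is exactly why the F18 cores in the GA144 get away with such tiny instructions.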
>>>> XMOS does provide some techniques (e.g. composable
>>>> and interfaces) to reduce the sharpness of the cliffs,
>>>> but not to eliminate them. But then the same is true
>>>> of ARM+RTOS etc etc.
>>>
>>> Yes, there are learning curves everywhere. And scope for getting things
>>> wrong :-) XMOS gives a different balance amongst many of the challenges
>>> facing designers - it is better in some ways, worse in other ways.
>>>
>>>>
>>>> I wouldn't regard XMOS as being a replacement for
>>>> a general-purpose processor, but there is a large
>>>> overlap in many hard-realtime applications.
>>>>
>>>> Similarly I wouldn't regard an ARM as being a
>>>> replacement for a CPLD/FPGA, but there can be
>>>> an overlap in many soft-realtime applications
>>>>
>>>> The XMOS devices inhabit an interesting and important
>>>> niche between those two.
>>>
>>> Agreed.
>>
>> The question is whether it is worth investing the time and energy into
>> learning the chip if you don't focus your work in this realm.
>
> Certainly - working with XMOS means thinking a little differently. But
> it is not /nearly/ as different as the GA144.
"A little" is an understatement -- the learning curve is a *lot* larger than for any other MCU I am aware of (the GA144 aside). My point is that it can be worth it only if you do a lot of designs that would make use of its unique features. I'm not sure there really is a very large sweet spot, given that the chips are not sold at the low end and other devices will do the same job using existing techniques and tools.
>> Personally I find the FPGA approach covers a *lot* of ground that others
>> don't see and the region left that is not so easy to address with either
>> FPGAs or more conventional CPUs is very limited. If the XMOS isn't a
>> good price fit, I most likely would just go with a small FPGA. I saw
>> the XMOS has a 12 bit, 1 MSPS ADC which is nice. But again, this only
>> makes its range of good fit slightly larger.
>
> If you have lots of experience with FPGA development, it is natural to
> look to FPGAs for solutions to design problems - and that is absolutely
> fine. It is impossible to jump on every different development method
> and be an expert at them all - and most problems can be solved in a
> variety of ways.
Yes, but you don't need to know a "variety" of ways of solving problems. You only need to know ways that are highly effective for most design problems. The many misconceptions about FPGAs relegate them to situations where CPUs just can't cut the mustard. In reality they are very flexible and only limited by the lack of on-chip peripherals. Microsemi adds more peripherals to their devices, but they still don't compete directly with lower cost MCUs.
>>>>> XMOS devices can be a lot of fun, and I would enjoy working with them
>>>>> again. For some tasks, they are an ideal solution (like for this one),
>>>>> once you have learned to use them. But not everything is easy with
>>>>> them, and you have to think in a somewhat different way.
>>>>
>>>> Yes, but the learning curve is very short and
>>>> there are no unpleasant surprises.
>>>>
>>>> IMNSHO "thinking in CSP" will help structure
>>>> thoughts for any realtime application, whether
>>>> or not it is formally part of the implementation.
>>>>
>>>
>>> Agreed.
>>
>> I prefer Forth for embedded work. I believe there is a Forth
>> available.
>
> Available for what? The XMOS? I'd be surprised.
>
>> I don't know if it captures any of the flavor of CSP, mostly
>> because I know little about CSP. I did use Occam some time ago. Mostly
>> I recall it had a lot of constraints on what the programmer was allowed
>> to do. The project we were using the Transputer on programmed it all in C.
>>
>
> Occam has its own advantages and disadvantages independent of the use of
> CSP-style synchronisation and message passing.
>
> I have looked at Forth a few times over the years, but I have yet to see
> a version that has changed for the better since I played with it as a
> teenager some 30 years ago. The stuff on the GA144 website is absurd -
> their "innovation" is that their IDE has colour highlighting despite
> looking like a reject from the days of MSDOS 3.3. Words are apparently
> only sensitive to the first "5 to 7 characters" - so "block" and
> "blocks" are the same identifier. (You would think they would /know/ if
> the limit were 5, 6 or 7 characters.) Everything is still designed
> around punch card formats of 8 by 64 characters.
You seem obsessed with your perceptions of the UI rather than its utility. I don't have a problem with large fonts. Most of the designers of the system are older and have poorer eyesight (a feature I share with them).

The use of color to indicate aspects of the language is pretty much the same as the color highlighting I see in nearly every modern editor. The difference is that in ColorForth the highlighting is *part* of the language, as it distinguishes when commands are executed. Some commands in Forth are executed at compile time rather than being compiled. This is one of the many powerful features of Forth. ColorForth pushes further to allow some commands to be executed at edit time. I have not studied it in detail, so I can't give you specifics on that.

I just want to explain how you are using very simplistic perceptions and expectations to "color" your impression of ColorForth without learning anything important about it.
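To make the compile-time/run-time split concrete, here is a toy model in Python -- my own illustration, not how ColorForth or any real Forth is implemented internally. Ordinary words are appended to the definition being compiled; "immediate" words execute during compilation itself:

```python
# Toy illustration (hypothetical model, not real Forth internals) of
# Forth's two execution times: ordinary words are compiled into the
# current definition; IMMEDIATE words run during compilation itself.

compile_log = []       # records what happened while compiling
definition = []        # the compiled definition: a list of callables
data_stack = []        # the run-time data stack

def square():                         # ordinary word: runs later
    data_stack.append(data_stack.pop() ** 2)

def note():                           # immediate word: runs NOW
    compile_log.append("note ran at compile time")

ordinary = {"square": square}
immediate = {"note": note}

def compile_word(name):
    if name in immediate:
        immediate[name]()                    # executed during compilation
    else:
        definition.append(ordinary[name])    # deferred until run time

# "compile" the definition: note fires immediately, square is deferred
for w in ["square", "note", "square"]:
    compile_word(w)

# now "run" the compiled definition with 3 on the stack
data_stack.append(3)
for xt in definition:
    xt()

print(compile_log)   # ['note ran at compile time']
print(data_stack)    # [81]  -- 3 squared twice at run time
```

In real Forth the immediate mechanism is what implements control structures like IF ... THEN: the words run at compile time and lay down branch targets. ColorForth replaces the IMMEDIATE flag with the word's color, which is what makes the highlighting part of the language rather than mere decoration.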
> I can appreciate that a stack machine design is ideal for a small
> processor, and can give very tight code. I can appreciate that this
> means a sort of Forth is the natural assembly language for the system.
> I can even appreciate that the RPN syntax, the interactivity, and the
> close-to-the-metal programming appeals to some people. But I cannot
> understand why this cannot be done with a modern language with decent
> typing, static checking, optimised compilation, structured syntax, etc.
You can't understand because you have not tried to learn about Forth. I can assure you there are a number of optimizing compilers for Forth. I don't know what you are seeing that makes you think Forth doesn't have "structured syntax". Is this different from the control flow structures?

I see Stephen Pelc responded to your posts. He is the primary author of VFX from MPE. Instead of throwing a tantrum about Forth "looking" like it is 30 years old, why not engage him and learn something?

--
Rick C
On 6/15/2017 3:08 PM, John Speth wrote:
> Sorry about not being clear describing my endeavor. My only concern was the
> difficulty of making the MCU look like SPI flash. All other functionality is
> not a problem.
OK.

Just "projecting" from problems I've had glomming kludges onto existing
interfaces at clients' requests:

- network interface to replace an existing disk interface
- memory served over a serial port (imagine fetching opcodes!)
- read/write disk hanging on a serial port
- write-only disk in place of a printer port

etc.

Despite the numerous disclaimers that "what you want isn't really going
to work", clients inevitably placated me with assurances that *all* they
wanted is EXACTLY what they've specified -- only to eventually complain
when they changed their software and "broke" the kludged system (because
they now had different expectations/assumptions in play) *or* bemoaned
how they had "paid for" this extra capability but only a portion of it
was actually available ("Yes, those are the limitations I mentioned when
I was trying to discourage you from this path! You thought you had found
a clever way to get something for nothing and now realize what you
*really* bought!")

Good luck!