
Parallax Propeller

Started by Peter Jakacki January 13, 2013
On Thu, 24 Jan 2013 23:03:32 -0800 (PST), Mark Wills wrote:

> On Jan 25, 6:55 am, Hugh Aguilar <hughaguila...@yahoo.com> wrote:
>> On Jan 24, 3:03 pm, Mark Wills <markrobertwi...@yahoo.co.uk> wrote:
>>> The TMS99xx family of processors (very old) has 16 prioritised
>>> cascading interrupts. Probably inherited from mini-computer
>>> architecture. Very very powerful for its day. Since they were
>>> prioritised, a lower level interrupt would not interrupt a higher
>>> level interrupt until the higher level ISR terminated. Makes serving
>>> multiple interrupts an absolute doddle. Not bad for 1976.
>>
>> Doddle? I've never heard that word before. Is a doddle good or bad?
>
> doddle = extremely simple/easy
>
> "Did you manage to fix that bug?"
> "Yeah, it was a doddle!"
>
> :-)
As we say: "een fluitje van een cent" (literally, a one-cent whistle).
Such a whistle costs nearly nothing and can be made for nearly nothing.
There is a herb (Anthriscus sylvestris) we call Fluitekruid. Due to
near-French purism (hash-tag vs. mot-dièse) we must write Fluitenkruid,
as if it were plural.

--
Coos

CHForth, 16 bit DOS applications
http://home.hccnet.nl/j.j.haak/forth.html
rickman <gnuarm@gmail.com> writes:
> How do you know the display "fuzziness" was due to software timing? I
> would expect software timing on a clocked processor to be on par with
> other means of timing. There are other aspects of design that could
> cause fuzziness or timing ambiguities in the signal.
If you look at the inner loop driving the output pin, you can do a min/max
skew calculation which ends up with quite a bit of jitter on the table.
The product is the PockeTerm, you can pick one up at:

    http://www.brielcomputers.com/wordpress/?cat=25

It's open source, VGA_HiRes_Text.spin is the low level driver for VGA
output. Note it actually uses *two* CPUs, and is some pretty darn cool
assembly code--written by the president of the Propeller company!

Andy Valencia
Home page: http://www.vsta.org/andy/
To contact me: http://www.vsta.org/contact/andy.html
On 1/25/2013 4:57 PM, None wrote:
> rickman <gnuarm@gmail.com> writes:
>> How do you know the display "fuzziness" was due to software timing? I
>> would expect software timing on a clocked processor to be on par with
>> other means of timing. There are other aspects of design that could
>> cause fuzziness or timing ambiguities in the signal.
>
> If you look at the inner loop driving the output pin, you can do a min/max
> skew calculation which ends up with quite a bit of jitter on the table.
> The product is the PockeTerm, you can pick one up at:
>
>     http://www.brielcomputers.com/wordpress/?cat=25
>
> It's open source, VGA_HiRes_Text.spin is the low level driver for
> VGA output. Note it actually uses *two* CPUs, and is some pretty darn
> cool assembly code--written by the president of the Propeller company!
I don't follow what causes the skew you mention. Instruction timings are
deterministic, no? If not, trying to time using code is hopeless. If the
timings are deterministic, the skew should not be cumulative since they
are all based on the CPU clock. Is the CPU clock from an accurate
oscillator like a crystal? If it is using an internal RC clock, again
timing to sufficient accuracy is hopeless.

Rick
rickman <gnuarm@gmail.com> writes:
> I don't follow what causes the skew you mention. Instruction timings
> are deterministic, no?
The chip has a lower level bit stream engine which the higher level CPU
("cog") is feeding. Well, a pair of cogs. Each cog has local memory and
then a really expensive path through a central arbiter ("hub"). It fills
its image of the scanlines from the shared memory, then has to feed it
via waitvid into the lower level. Note that it's a bit stream engine
*per cog*, so you also have to worry about their sync.

So yes, instruction timings are deterministic (although your shared
memory accesses will vary modulo the hub round-robin count). You need to
reach the waitvid before it's your turn to supply the next value. But
given that, this is much like the old wait state sync feeding bytes to a
floppy controller. PLL and waitvid sync are achieved with magic
incantations from Parallax, and it is not 100%.
> If the timings are deterministic, the skew should not be cumulative
> since they are all based on the CPU clock. Is the CPU clock from an
> accurate oscillator like a crystal? If it is using an internal RC
> clock, again timing to sufficient accuracy is hopeless.
The board has a CPU clock from which the PLL derives the video output
frequency. I recall the CPU clock being based on a crystal, but not one
with any consideration for video intervals. And the PLLs are per cog,
again my comment about (potential lack of) global sync.

Anyway, you should buy one and check it out. I'd be curious to hear if
(1) you also observe the same video quality, and (2) if you think it's
the waitvid mechanism, more the PLL->SVGA generation, or the sync issues
of the paired video generators. They even supply the schematic, FWIW.

Andy Valencia
Home page: http://www.vsta.org/andy/
To contact me: http://www.vsta.org/contact/andy.html
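To make the min/max skew calculation concrete, here is a minimal C
sketch. It assumes the commonly quoted P8X32A figures (4 clocks per
ordinary cog instruction, 8 to 23 clocks per hub access depending on
where the round-robin window falls, an 80 MHz system clock); the
instruction counts are illustrative and not taken from
VGA_HiRes_Text.spin.

/* Back-of-envelope jitter estimate for a cog inner loop that does one
 * hub access per iteration. All cycle counts are assumptions, not
 * measurements from the PockeTerm. */
#include <stdio.h>

int main(void) {
    const double clk_hz  = 80e6;   /* assumed 5 MHz crystal x 16 PLL      */
    const int    fixed   = 5 * 4;  /* say, five ordinary 4-clock ops      */
    const int    hub_min = 8;      /* hub op: window hit perfectly        */
    const int    hub_max = 23;     /* hub op: just missed the window      */

    double t_min = (fixed + hub_min) / clk_hz;
    double t_max = (fixed + hub_max) / clk_hz;

    printf("loop: %.0f..%.0f ns, worst-case skew %.0f ns per iteration\n",
           t_min * 1e9, t_max * 1e9, (t_max - t_min) * 1e9);
    return 0;
}

At a 25 MHz-ish VGA dot clock a pixel is roughly 40 ns wide, so under
these assumptions one hub-phase miss is already a visible fraction of a
pixel, which fits the fuzziness being discussed.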
On Jan 26, 11:07 pm, rickman <gnu...@gmail.com> wrote:
> On 1/25/2013 4:57 PM, None wrote:
>> If you look at the inner loop driving the output pin, you can do a
>> min/max skew calculation which ends up with quite a bit of jitter on
>> the table.
>
> I don't follow what causes the skew you mention. Instruction timings
> are deterministic, no? If not, trying to time using code is hopeless.
> If the timings are deterministic, the skew should not be cumulative
> since they are all based on the CPU clock. Is the CPU clock from an
> accurate oscillator like a crystal? If it is using an internal RC
> clock, again timing to sufficient accuracy is hopeless.
>
> Rick
The instruction times are deterministic (presumably; I've never written
code on the Propeller), but when generating video in software, *per scan
line* all possible code paths have to add up to the same number of
cycles in order to completely avoid jitter. That's very hard to do.

Consider a single scan line that contains text interspersed with spaces.
For the current horizontal position the software has to:

* Determine if background or foreground (i.e. a pixel of text colour)
  should be drawn
* If background
  * select background colour to video output register
* If foreground
  * determine character under current horizontal position
  * determine offset (in pixels) into the current line of the character
  * is a pixel to be drawn?
    * If yes, load pixel colour
    * otherwise, load background colour

The second code path is a lot more complex, containing many more
instructions, yet both code paths have to balance in terms of execution
time (a sketch of the balancing trick follows below). This is just one
example.

This is how video is done on the original Atari VCS console. 100%
software, with the hardware only providing horizontal interrupts (one
per scan line) and VBLANK interrupts, IIRC.

Caveat: The above assumes that there is no interrupt per horizontal
pixel. With interrupts, it's much easier. The Propeller doesn't have any
interrupts, so software video generation would be non-trivial to say the
least. The easiest way would be to provide a pixel clock and use an I/O
pin to sync to, as Chuck found out for himself when implementing video
on the GA144.
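Here is a sketch of that balancing trick, with C standing in for
hand-counted assembly. The buffer names, glyph layout, and cycle counts
are invented for illustration; on real hardware each path is counted
instruction by instruction and the short path is padded until both cost
the same.

/* Cycle-balanced per-pixel loop, structure only. A C compiler gives no
 * cycle guarantees; this just shows the shape of the two paths. */
#include <stdint.h>

#define PIXELS 320
#define CHARS  (PIXELS / 8)

static uint8_t text_row[CHARS];   /* hypothetical one-row text buffer */
static uint8_t font[128][8];      /* hypothetical 8x8 glyph table     */
static uint8_t fg_colour, bg_colour;

static void pad_cycles(int n) { while (n--) __asm__ volatile(""); }
static void emit(uint8_t c)   { (void)c; /* would write video register */ }

void scanline(int glyph_row)
{
    for (int x = 0; x < PIXELS; x++) {
        uint8_t colour;
        uint8_t ch = text_row[x / 8];
        if (ch == ' ') {                  /* background: short path     */
            colour = bg_colour;
            pad_cycles(6);                /* pad to match the long path */
        } else {                          /* foreground: long path      */
            uint8_t bits = font[ch & 127][glyph_row & 7];
            colour = ((bits >> (7 - (x & 7))) & 1) ? fg_colour : bg_colour;
        }
        emit(colour);                     /* both paths arrive in step  */
    }
}

On a real part the padding would be NOPs counted by hand against the
longest path, and every branch in the loop gets the same treatment.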
In comp.lang.forth Mark Wills <markrobertwills@yahoo.co.uk> wrote:
> Caveat: The above assumes that there is no interrupt per horizontal
> pixel. With interrupts, it's much easier.
I really don't understand why you say this. You need to be able to sync
to a timing pulse; whether this is done with interrupts doesn't matter.

Andrew.
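In code, the polled equivalent of a sync interrupt is just a pair of
busy-waits on the pin. read_sync_pin() is a hypothetical GPIO read; on a
real part it would be a register access.

/* Wait for the rising edge of a sync pulse by polling; no interrupt
 * controller needed, only a readable input pin. */
extern int read_sync_pin(void);

void wait_for_hsync(void)
{
    while (read_sync_pin())  ;  /* wait out any pulse in progress */
    while (!read_sync_pin()) ;  /* rising edge = start of line    */
}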
On 27-Jan-13 9:50, Mark Wills wrote:
> On Jan 26, 11:07 pm, rickman <gnu...@gmail.com> wrote:
>> I don't follow what causes the skew you mention. Instruction timings
>> are deterministic, no? If not, trying to time using code is hopeless.
>
> The instruction times are deterministic (presumably; never written
> code on the propeller), but when generating video in software, *per
> scan line* all possible code-paths have to add up to the same number
> of cycles in order to completely avoid jitter. That's very hard to do.
>
> [...]
>
> This is how video is done on the original Atari VCS console. 100%
> software, with the hardware only providing horizontal interrupts (one
> per scan line) and VBLNK interrupts, IIRC.
On the Atari VCS the software did not have to send out the individual
pixels. The TIA chip had memory for a single scan line, which it
converted to a video signal autonomously. The software just had to make
sure that the right data was loaded into the TIA chip in time for each
scan line; it could finish doing that before the end of the scan line,
but not after. The TIA chip also has a function to stall the CPU until
the start of the next scan line. In other words, the software had to be
fast enough for each possible execution flow, but did not have to
complete in the exact same number of cycles.
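The shape of that pattern, sketched in C: the per-line work may take a
variable number of cycles as long as it beats the deadline, and the
stall absorbs the slack. load_tia_line() and strobe_wsync() are
hypothetical stand-ins; the real TIA is programmed by writing its
registers, and strobing WSYNC halts the 6502 until the next scan line
begins.

/* Race-the-beam with slack: any code path is fine as long as it
 * finishes within one scan line; the WSYNC stall eats the remainder. */
#include <stdint.h>

extern void load_tia_line(const uint8_t *gfx); /* set up the line's data */
extern void strobe_wsync(void);                /* stall until next line  */

void frame(const uint8_t gfx[192][16])
{
    for (int line = 0; line < 192; line++) {
        load_tia_line(gfx[line]); /* must finish before the line ends, */
        strobe_wsync();           /* but need not take a fixed time    */
    }
}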
On Jan 25, 12:03 am, Mark Wills <forthfr...@gmail.com> wrote:
> On Jan 25, 6:55 am, Hugh Aguilar <hughaguila...@yahoo.com> wrote:
>> Doddle? I've never heard that word before. Is a doddle good or bad?
>
> doddle = extremely simple/easy
>
> "Did you manage to fix that bug?"
> "Yeah, it was a doddle!"
>
> :-)
Maybe the reason why we don't have "doddle" or any similar word in
America is because we never do anything the simple/easy way here! :-)
On 1/27/2013 3:50 AM, Mark Wills wrote:
> On Jan 26, 11:07 pm, rickman <gnu...@gmail.com> wrote:
>> I don't follow what causes the skew you mention. Instruction timings
>> are deterministic, no? If not, trying to time using code is hopeless.
>
> The instruction times are deterministic (presumably; never written
> code on the propeller), but when generating video in software, *per
> scan line* all possible code-paths have to add up to the same number
> of cycles in order to completely avoid jitter. That's very hard to do.
>
> [...]
>
> The second code path is a lot more complex, containing many more
> instructions, yet both code paths have to balance in terms of
> execution time. This is just one example.
>
> This is how video is done on the original Atari VCS console. 100%
> software, with the hardware only providing horizontal interrupts (one
> per scan line) and VBLNK interrupts, IIRC.
I'm not getting it. I guess the software had to be done this way to
optimize the CPU utilization. The "proper" way to time in software is to
have the video data already calculated in a frame buffer and use spin
loops to time when pixels are shifted out. That way you don't have lots
of processing to figure out the timing for. But you spend most of your
processing time in spin loops.

Why was it done this way? To save a few bucks on video hardware? That's
just not an issue nowadays... unless you are really obsessive about not
using hardware where hardware is warranted.
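A minimal sketch of that frame-buffer-plus-spin-loop shape, assuming a
free-running cycle counter and a pixel output routine (cycle_count() and
out_pixel() are hypothetical; on a real MCU they would be a timer read
and a GPIO or DAC write):

/* Pixels are precomputed in fb[]; the loop's only job is to release
 * each one at a fixed cycle interval. */
#include <stdint.h>

extern uint32_t cycle_count(void);    /* free-running cycle counter */
extern void     out_pixel(uint8_t c); /* drive the video output     */

void shift_line(const uint8_t *fb, int npixels, uint32_t cycles_per_pixel)
{
    uint32_t next = cycle_count();
    for (int i = 0; i < npixels; i++) {
        next += cycles_per_pixel;
        while ((int32_t)(cycle_count() - next) < 0)
            ;                         /* spin: the only "work" is waiting */
        out_pixel(fb[i]);
    }
}

All the intelligence moves into preparing the frame buffer; the output
loop makes no data-dependent decisions, which is exactly why it is easy
to keep jitter-free, and also why it burns a CPU doing nothing.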
> Caveat: The above assumes that there is no interrupt per horizontal
> pixel. With interrupts, it's much easier. The Propeller doesn't have
> any interrupts so software video generation would be non-trivial to
> say the least. The easiest way would be to provide a pixel clock and
> use an I/O pin to sync to, as Chuck found out for himself when
> implementing video on the GA144.
I would have to go back and reread the web pages, but I think Chuck's
original attempt was to time the *entire* frame in software with NO
hardware timing at all. He found the timings drifted too much with
temperature (that's what async processors do, after all; they are timed
by the silicon delays, which vary with temp), so that with the monitor
he was using it would stop working once the board warmed up. I'm
surprised he had to build it to find that out. But I guess he didn't
have specs on the monitor.

His "compromise" to hardware timing was to use a horizontal *line*
interrupt (with a casual use of the word "interrupt"; it is really a
wait for a signal) which was driven from the 10 MHz oscillator node,
like you described the Atari VCS. He still did the pixel timing in a
software loop. With 144 processors it is no big deal to do that...
*OR* he could have sprinkled a few counters around the chip to be used
for *really* low power timing. Each CPU core uses 5 mW when it is
running a simple timing loop. One of the big goals of the chip is to be
low power, and software timing is the antithesis of low power in my
opinion. But then you would need an oscillator and a clock tree...

I think there is an optimal compromise between a chip with fully async
CPUs, with teeny tiny memories, no clocks, no peripherals (including
nearly no real memory interface), and a chip with a very small number
of huge CPUs, major clock trees running at very high clock rates,
massive memories (multiple types), a plethora of hardware peripherals
and a maximal bandwidth memory interface. How about an array of many
small CPUs, much like the F18 (or an F32, which rumor has it is under
development), each one with a few kB of memory, with a dedicated idle
timer connected to lower speed clock trees (is one or two small clock
trees a real power problem?), some real hardware peripherals for the
higher speed I/O standards like 100/1000 Mbps Ethernet, real USB
(including USB 3.0), some amount of on-chip block RAM and some *real*
memory interface which works at 200 or 300 MHz clock rates?

I get where Chuck is coming from with the minimal CPU thing. I have
said before that I think it is a useful chip in many ways. But so far I
haven't been able to use it. One project faced the memory interface
limitation and another found the chip to be too hard to use in the low
power modes it is supposed to be capable of, just not when you need to
do real time stuff at real low power. It only needs a few small
improvements, including *real* I/O that can work at a number of
voltages rather than just the core voltage.

Oh yeah, some real documentation on the development system would be
useful too. I think you have to read some three or more documents just
to get started with the tools. I know it was pretty hard to figure it
all out, not that I *actually* figured it out.

Rick
Weird, your posts all show up in my reader as replies to your own 
messages rather than replies to my posts.  The trimming made it hard for 
me to figure out just what we were talking about with the odd 
connections in my reader.


On 1/26/2013 10:09 PM, None wrote:
> rickman <gnuarm@gmail.com> writes:
>> I don't follow what causes the skew you mention. Instruction timings
>> are deterministic, no?
>
> The chip has a lower level bit stream engine which the higher level CPU
> ("cog") is feeding. Well, a pair of cogs. Each cog has local memory
> and then a really expensive path through a central arbiter ("hub"). It
> fills its image of the scanlines from the shared memory, then has to
> feed it via waitvid into the lower level. Note that it's a bit stream
> engine *per cog*, so you also have to worry about their sync.
I can't picture the processing with this description. I don't know about the higher level and lower level CPUs you describe. Are you saying there is some sort of dedicated hardware in each CPU for video? Or is this separate from the CPUs? Why a *pair* of COGs? I assume a COG is the Propeller term for a CPU?
> So yes, instruction timings are deterministic (although your shared
> memory accesses will vary modulo the hub round-robin count). You need
> to reach the waitvid before it's your turn to supply the next value.
> But given that, this is much like the old wait state sync feeding
> bytes to a floppy controller. PLL and waitvid sync are achieved with
> magic incantations from Parallax, and it is not 100%.
Not 100%? What does that mean? Magic? I guess this is the magic smoke you want to keep from getting out of the chip?
>> If the timings are deterministic, the skew should not be cumulative
>> since they are all based on the CPU clock. Is the CPU clock from an
>> accurate oscillator like a crystal? If it is using an internal RC
>> clock, again timing to sufficient accuracy is hopeless.
>
> The board has a CPU clock from which the PLL derives the video output
> frequency. I recall the CPU clock being based on a crystal, but not one
> with any consideration for video intervals. And the PLLs are per cog,
> again my comment about (potential lack of) global sync.
I still don't know enough about the architecture to know what this means. I don't care if the CPUs are not coordinated closely. If you have a video engine providing the clock timing, why would the CPU timing matter?
> Anyway, you should buy one and check it out. I'd be curious to hear
> if (1) you also observe the same video quality, and (2) if you think
> it's the waitvid mechanism, more the PLL->SVGA generation, or the sync
> issues of the paired video generators. They even supply the schematic,
> FWIW.
I appreciate your enthusiasm, but I have my own goals and projects. I am
currently oriented towards absurdly low power levels in digital designs
and am working on a design that will require no explicit power source;
it will scavenge power from the environment. I don't think a Propeller
is suitable for such a task, is it?

Rick