EmbeddedRelated.com

Integrated TFT controller in PIC MCUs

Started by pozz January 7, 2015
David Brown <david.brown@hesbynett.no> writes:
> (Regarding more than 32 general-purpose registers, I think the Itanium
> is the only cpu I know of with 128 integer registers and 128 floating
> point registers. It needs more registers for its EPIC architecture, but
> I don't think anyone would consider it an "optimum" design!)

MMIX (instructional architecture by Donald Knuth) has 256 registers. I don't think any MMIX chips have been made, but there are FPGA implementations around. I believe it had some resemblance to a 1990's-era processor that was actually produced but I don't remember which one.

The Intel HD Graphics processors have 128 SIMD registers if I remember right. They have a relatively conventional instruction set (resembling a typical computer) compared with other GPU's.
On 10.1.2015 г. 03:56, Dimiter_Popoff wrote:
> On 10.1.2015 г. 03:36, Simon Clubley wrote:
>> On 2015-01-09, Dimiter_Popoff <dp@tgi-sci.com> wrote:
>>>
>>> Well I really cannot simplify the concept of saving say 4 out of 32
>>> registers, using only them in an IRQ handler, then restoring only
>>> them and returning from the exception.
>>>
>>
>> In the general case, you have to push and pop all the registers every
>> time you take an interrupt.
>
> !?
>
> In the general case you do not, if you are the programmer.
>
>> In your world you may not have to always do that, but for the general
>> purpose case when you don't have absolute control of the code being
>> called from the handler you do.
>> Even when you control the code being called from the handler, you still
>> have to push all the registers the code could potentially use if it's
>> written in a high level language.
>
> Of course you can program any machine to a complete halt. Or just use
> a hammer to smash it, this will perhaps be an easier way.
>
>> Or to put this another way, your usage model when it comes to interrupt
>> handlers is not the general usage model that most other people have to
>> work with. :-)
>
> Well if programming has deteriorated by *such* a degree I really
> do not have many people to converse with about programming, this much
> is obvious :-).
>
> But this does not change the validity of the concept "save/restore only
> what you have to" when applied in the core to core comparison context.
>
> My God, I really did not think things had gone *that* bad.

I am still somewhat amazed at what was said. What on earth is there to stop people from doing something *that* simple - here are two IRQ handlers on a small MCU, I used an mcf52211 a couple of months back to make a HV source - it does the PWM/regulating, overcurrent protection/limiting, serial communication etc., all in all 4 tasks, several IRQs. Took me about 2 weeks to program (I had hoped it would take 2 days but I had completely forgotten the insides of the 52211 so I had to recall a lot which is where the 2 weeks went). A total of about 250k sources, the object code being almost 9 kilobytes.

So here are two IRQ handlers where hopefully it is obvious how only what is needed is saved and restored, pretty basic stuff:

http://tgi-sci.com/misc/hvst0q.gif

This is not VPA, just plain 68k (well, CF) assembly, so it is far from being taken from my own world. I just wonder how hopeless things must have become to question the viability of doing something that basic.

Dimiter

------------------------------------------------------
Dimiter Popoff, TGI http://www.tgi-sci.com
------------------------------------------------------
http://www.flickr.com/photos/didi_tgi/
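To put rough numbers on the "save/restore only what you have to" argument, here is a toy cost model in Python. It assumes one cycle per register saved and one per register restored, which is an illustrative figure and not a measurement from the mcf52211 or any other core; `save_restore_cycles` and the two constants are hypothetical names introduced for this sketch.

```python
# Toy model of the context save/restore overhead around an IRQ handler.
# ASSUMPTION (not from the thread): one cycle per register saved and
# one cycle per register restored. Real cores differ.
CYCLES_PER_SAVE = 1      # hypothetical figure
CYCLES_PER_RESTORE = 1   # hypothetical figure


def save_restore_cycles(num_registers: int) -> int:
    """Cycles spent saving and later restoring `num_registers` registers."""
    if num_registers < 0:
        raise ValueError("register count cannot be negative")
    return num_registers * (CYCLES_PER_SAVE + CYCLES_PER_RESTORE)


# Handler that touches only 4 registers versus a blanket save of all 32.
minimal = save_restore_cycles(4)
blanket = save_restore_cycles(32)
print(minimal, blanket, blanket - minimal)  # 8 64 56
```

Under those assumptions the selective save costs 8 cycles of overhead against 64 for the blanket save. Actual figures depend on the core, the bus width and whether a move-multiple instruction is available.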
On Fri, 9 Jan 2015, John Devereux wrote:

> Vladimir Ivanov <none@none.tld> writes:
>
>> On Fri, 9 Jan 2015, David Brown wrote:
>>
>>> On 09/01/15 10:54, Vladimir Ivanov wrote:
>>>>
>>>> On Fri, 9 Jan 2015, David Brown wrote:
>>>>
>>>>> For microcontrollers, such as the Cortex M devices, I think 16 registers
>>>>> is a good balance for a lot of typical code.
>>>>
>>>> In Thumb2 you work directly with 8 GP registers, indirectly with few
>>>> like PC and SP, and accessing the rest of the GPRs is different and/or
>>>> has penalties.
>>>
>>> As far as I understand it, accessing the other registers means 32-bit
>>> instructions rather than the short 16-bit instructions. So accessing
>>> them has penalties compared to accessing the faster registers, but not
>>> compared to normal ARM 32-bit instructions.
>>
>> Yes, longer code sequences, and most likely very limited instruction
>> forms. The latter leads to shuffling of data between the regular 8
>> GPRs and the other, "unregular" GPRs.
>>
>>>> Just trying to say that it is a moot point. And personally, I never
>>>> understood the existence of Cortex-M - why cripple the ability to switch
>>>> to native 32-bit mode, if most or all of the underlying logic is there?
>>>
>>> My knowledge of the details is weak, but AFAIK the only thing you really
>>> lose with Thumb2 compared to ARM instruction sets is the conditional
>>> execution flags - with ARM, you can use the flags with most
>>> instructions, while with Thumb2 you have the if-then-else construction.
>>> (You also lose the barrel shifter on some instructions, but that is not
>>> going to affect much code.)
>>
>> I am not Thumb2 expert, either. As a very strong personal (biased)
>> opinion, I don't find it elegant at all. MIPS16e impressed me bit more
>> with their EXTEND instruction.
>>
>> What I am trying to communicate, is that the CPU core with all the
>> blocks is there. Thumb2 is more or less a decoder, just like the ARM
>> mode is. Same with MIPS32 and MIPS16e. Why would one cripple something
>> by removing one of the decoders? The power savings are negligible.
>>
>> ARM7TDMI was more balanced in that regard.
>
> I have used both, cortex M3/M4 is just much nicer to program. The code
> is compact, and faster clock-for-clock than even 32-bit ARM7 code.

The "compact" I can easily agree with.

For the latter, are you referring to ARM7TDMI core in ARM mode, or some ARMv7 core in ARM mode?

> No more convoluted assembly language wrappers everywhere, no "thumb
> interworking", "GLUE7" segments, no half a dozen system modes+stacks to
> worry about.

The modes + stacks were okay, even useful for some things. The interworking and associated cruft I avoided by not using Thumb at all. There was no narrow bus to give Thumb advantage.

On Fri, 9 Jan 2015, Tauno Voipio wrote:

> On 9.1.15 17:30, Vladimir Ivanov wrote:
>
>>> As far as I understand it, accessing the other registers means 32-bit
>>> instructions rather than the short 16-bit instructions. So accessing
>>> them has penalties compared to accessing the faster registers, but not
>>> compared to normal ARM 32-bit instructions.
>>
>> Yes, longer code sequences, and most likely very limited instruction
>> forms. The latter leads to shuffling of data between the regular 8 GPRs
>> and the other, "unregular" GPRs.
>>
>
> This applies partially to old Thumb. Thumb2 is still shorter than 32 bit ARM
> code for the same task. The cost of r8-r15 use is two bytes in most
> instructions, but we are only in the length of regular 32-bit code in these
> expensive forms.

Yes. And there is speed penalty using R8-R15 even if code size remains not larger than ARM mode. That's why claims that Thumb2 is almost as fast as ARM mode smell of marketing. But Anders posted in the previous post a link to interesting paper on the subject, which I am going to read now.
On Sat, 10 Jan 2015, Vladimir Ivanov wrote:

> On Fri, 9 Jan 2015, Tauno Voipio wrote:
>
>> On 9.1.15 17:30, Vladimir Ivanov wrote:
>>
>>>> As far as I understand it, accessing the other registers means 32-bit
>>>> instructions rather than the short 16-bit instructions. So accessing
>>>> them has penalties compared to accessing the faster registers, but not
>>>> compared to normal ARM 32-bit instructions.
>>>
>>> Yes, longer code sequences, and most likely very limited instruction
>>> forms. The latter leads to shuffling of data between the regular 8 GPRs
>>> and the other, "unregular" GPRs.
>>>
>>
>> This applies partially to old Thumb. Thumb2 is still shorter than 32 bit
>> ARM code for the same task. The cost of r8-r15 use is two bytes in most
>> instructions, but we are only in the length of regular 32-bit code in these
>> expensive forms.
>
> Yes. And there is speed penalty using R8-R15 even if code size remains not
> larger than ARM mode. That's why claims that Thumb2 is almost as fast as ARM
> mode smell of marketing.

Or maybe I read that wrong and there is really no speed penalty if we're talking about the wider 32-bit version of the same 16-bit instruction. Do all instruction forms have a wider version to accommodate R8-R15 usage?
On Fri, 9 Jan 2015, Anders.Montonen@kapsi.spam.stop.fi.invalid wrote:

> Vladimir Ivanov <none@none.tld> wrote:
>> On Fri, 9 Jan 2015, Simon Clubley wrote:
>
>>> Do current versions of the MIPS ISA allow you to push a set of registers
>>> onto the stack in one instruction as you can with ARM or do you still have
>>> to push (and pop) them one after the other manually in your handlers ?
>> No, because it is a load/store architecture. I don't really have any
>> experience with the new MicroMIPS, but not expecting that to change.
>
> microMIPS has instructions for pushing and popping the callee-save
> registers onto the stack (LWM32/LWM16/SWM32/SWM16). This is notable in a
> way because MIPS have traditionally avoided committing an ABI into the
> architecture.

Thanks for pointing this out. There must have been a big marketing pressure to introduce multiple load/store instructions and make the hardware more hairy. Consider that all the bigger iron should also support microMIPS and this stuff.

>> MIPS16e, as present in the PIC32MX (MIPS M4K core), is comparable to
>> Thumb2.
>> MicroMIPS, as present in the newer PIC32MZ (MIPS 14K core), is even better
>> than MIPS16e.
>>
>> One benefit is that you can always switch to MIPS32 mode for performance
>> reasons, unlike the pure Thumb2 MCUs, like Cortex-M.
>
> MIPS16e is much closer to Thumb. You only have a subset of the
> registers available, and no system control instructions. microMIPS is
> comparable to Thumb-2, and the idea is the same. Shrink the code size
> while retaining performance.

I have excluded the system control instructions, since that seems mostly like shortsightedness when they devised the compressed instruction sets. But, yes, Thumb2 and microMIPS fixed that.

> MIPS32 support is optional for cores that support microMIPS. In fact,
> the latest version of Microchip's XC32 compiler includes support for an
> unreleased PIC32MM family which only supports microMIPS.

Now that you mention it, I remember seeing pointers about future PIC32MM stuck to microMIPS only. Again marketing pressure?
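As a back-of-the-envelope illustration of what a store-multiple instruction such as `SWM32` buys in code size, the sketch below compares one 32-bit store per register with a single 32-bit store-multiple. It assumes the registers being saved form a set that the instruction's register list can encode, and it counts bytes only, not cycles. The helper names are hypothetical and not taken from any toolchain.

```python
# Code-size comparison: one 32-bit store per register versus a single
# 32-bit store-multiple instruction.
# ASSUMPTION: the saved registers are a set the store-multiple encoding
# can express; execution time is not modelled at all.
INSTRUCTION_BYTES = 4  # size of one 32-bit instruction


def individual_store_bytes(num_registers: int) -> int:
    """Bytes of code when each register gets its own 32-bit store."""
    if num_registers < 0:
        raise ValueError("register count cannot be negative")
    return num_registers * INSTRUCTION_BYTES


def store_multiple_bytes(num_registers: int) -> int:
    """Bytes of code when a single store-multiple covers every register."""
    if num_registers < 0:
        raise ValueError("register count cannot be negative")
    return INSTRUCTION_BYTES if num_registers > 0 else 0


saved = 9  # example register count chosen for illustration
print(individual_store_bytes(saved), store_multiple_bytes(saved))  # 36 4
```

The saving applies once on entry and again on exit for the matching load-multiple, which is why such instructions matter most in small, frequently called functions and handlers.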
Paul Rubin <no.email@nospam.invalid> wrote:
> David Brown <david.brown@hesbynett.no> writes:
>> (Regarding more than 32 general-purpose registers, I think the Itanium
>> is the only cpu I know of with 128 integer registers and 128 floating
>> point registers. It needs more registers for its EPIC architecture, but
>> I don't think anyone would consider it an "optimum" design!)
> MMIX (instructional architecture by Donald Knuth) has 256 registers. I
> don't think any MMIX chips have been made, but there are FPGA
> implementations around. I believe it had some resemblance to a
> 1990's-era processor that was actually produced but I don't remember
> which one.
>
> The Intel HD Graphics processors have 128 SIMD registers if I remember
> right. They have a relatively conventional instruction set (resembling
> a typical computer) compared with other GPU's.

Renesas' SH-5 had 64 64-bit GPRs and 64 32-bit floating-point registers. I think there were silicon implementations, but that architecture never went anywhere.

-a
Vladimir Ivanov <none@none.tld> writes:

> On Fri, 9 Jan 2015, John Devereux wrote:
>
>> Vladimir Ivanov <none@none.tld> writes:
>>
>>> On Fri, 9 Jan 2015, David Brown wrote:
>>>
>>>> On 09/01/15 10:54, Vladimir Ivanov wrote:
>>>>>
>>>>> On Fri, 9 Jan 2015, David Brown wrote:
>>>>>
>>>>>> For microcontrollers, such as the Cortex M devices, I think 16 registers
>>>>>> is a good balance for a lot of typical code.
>>>>>
>>>>> In Thumb2 you work directly with 8 GP registers, indirectly with few
>>>>> like PC and SP, and accessing the rest of the GPRs is different and/or
>>>>> has penalties.
>>>>
>>>> As far as I understand it, accessing the other registers means 32-bit
>>>> instructions rather than the short 16-bit instructions. So accessing
>>>> them has penalties compared to accessing the faster registers, but not
>>>> compared to normal ARM 32-bit instructions.
>>>
>>> Yes, longer code sequences, and most likely very limited instruction
>>> forms. The latter leads to shuffling of data between the regular 8
>>> GPRs and the other, "unregular" GPRs.
>>>
>>>>> Just trying to say that it is a moot point. And personally, I never
>>>>> understood the existence of Cortex-M - why cripple the ability to switch
>>>>> to native 32-bit mode, if most or all of the underlying logic is there?
>>>>
>>>> My knowledge of the details is weak, but AFAIK the only thing you really
>>>> lose with Thumb2 compared to ARM instruction sets is the conditional
>>>> execution flags - with ARM, you can use the flags with most
>>>> instructions, while with Thumb2 you have the if-then-else construction.
>>>> (You also lose the barrel shifter on some instructions, but that is not
>>>> going to affect much code.)
>>>
>>> I am not Thumb2 expert, either. As a very strong personal (biased)
>>> opinion, I don't find it elegant at all. MIPS16e impressed me bit more
>>> with their EXTEND instruction.
>>>
>>> What I am trying to communicate, is that the CPU core with all the
>>> blocks is there. Thumb2 is more or less a decoder, just like the ARM
>>> mode is. Same with MIPS32 and MIPS16e. Why would one cripple something
>>> by removing one of the decoders? The power savings are negligible.
>>>
>>> ARM7TDMI was more balanced in that regard.
>>
>> I have used both, cortex M3/M4 is just much nicer to program. The
>> code is compact, and faster clock-for-clock than even 32-bit ARM7
>> code.
>
> The "compact" I can easily agree with.
>
> For the latter, are you referring to ARM7TDMI core in ARM mode, or
> some ARMv7 core in ARM mode?

Sorry I meant ARM7TDMI!

>> No more convoluted assembly language wrappers everywhere, no "thumb
>> interworking", "GLUE7" segments, no half a dozen system modes+stacks
>> to worry about.
>
> The modes + stacks were okay, even useful for some things.
> The interworking and associated cruft I avoided by not using Thumb at
> all. There was no narrow bus to give Thumb advantage.

Remember the Cortex M was for single-chip *microcontrollers* (where the majority of the chip is memory).

--
John Devereux
Vladimir Ivanov <none@none.tld> wrote:
> On Fri, 9 Jan 2015, Anders.Montonen@kapsi.spam.stop.fi.invalid wrote:
>> MIPS32 support is optional for cores that support microMIPS. In fact,
>> the latest version of Microchip's XC32 compiler includes support for an
>> unreleased PIC32MM family which only supports microMIPS.
> Now that you mention it, I remember seeing pointers about future PIC32MM
> stuck to microMIPS only. Again marketing pressure?

As far as I can tell from the header files and compiler source code, the PIC32MM could be a replacement/follow-up for the PIC32MX1xx/2xx. There's no DSP ASE, and no shadow registers, so it's clearly not a high-performance chip, and it doesn't seem like it has any special peripherals either.

Using microMIPS at the low end makes sense, as you can fit more code in a smaller flash. I don't know how much silicon area is saved by having only the one instruction set, but that kind of makes sense for a low-end chip as well.

-a
Vladimir Ivanov <none@none.tld> wrote:

> Do all instruction forms have wider version to accommodate R8-R15 usage?

I believe that is the case.

-a
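The code-size side of the r8-r15 question can be sketched with a deliberately simplified model of a single instruction: the flag-setting, three-register `ADDS Rd, Rn, Rm` outside an IT block. Its 16-bit encoding only reaches r0-r7, so using any of r8-r12 forces the 32-bit form. The function names and the restriction to r0-r12 (to sidestep SP/LR/PC special cases) are assumptions made for this sketch. Other instructions follow different rules, and some 16-bit forms, such as `MOV`, do accept high registers.

```python
# Simplified size model for Thumb-2 "ADDS Rd, Rn, Rm" (register form,
# flag-setting, outside an IT block).
# ASSUMPTION: only r0-r12 are modelled, so SP/LR/PC special cases are
# ignored. This is a sketch, not an assembler.
LOW_REGISTERS = range(0, 8)        # r0-r7, reachable by the 16-bit encoding
MODELLED_REGISTERS = range(0, 13)  # r0-r12


def adds_encoding_bytes(rd: int, rn: int, rm: int) -> int:
    """Return 2 if the 16-bit encoding applies, otherwise 4."""
    operands = (rd, rn, rm)
    if any(r not in MODELLED_REGISTERS for r in operands):
        raise ValueError("this sketch only models r0-r12")
    return 2 if all(r in LOW_REGISTERS for r in operands) else 4


def sequence_bytes(instructions) -> int:
    """Total size of a list of (rd, rn, rm) ADDS instructions."""
    return sum(adds_encoding_bytes(*ops) for ops in instructions)


low_only = [(0, 1, 2), (3, 0, 4), (5, 3, 6)]   # r0-r7 only
with_high = [(0, 1, 2), (8, 0, 4), (5, 8, 6)]  # r8 used twice
print(sequence_bytes(low_only))   # 6
print(sequence_bytes(with_high))  # 10
```

The same three instructions in ARM mode would take 12 bytes, so the sequence that uses r8 is larger than the all-low version but still no larger than its ARM-mode equivalent, which matches the "two bytes in most instructions" remark earlier in the thread.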