EmbeddedRelated.com
Forums

Getting started with AVR and C

Started by Robert Roland November 24, 2012
On 29-Nov-12 14:36, Keith Thompson wrote:
> upsidedown@downunder.com writes:
>> IMHO CHAR_BIT = 21 is the correct way to handle the Unicode range.
>>
>> On the Unicode list, I even suggested packing three 21 characters into
>> a single 64 bit data word as UTF-64 :-)
>
> I like it -- but it breaks as soon as they add U+200000 or higher, and
> I'm not aware of any guarantee that they won't.
I thought they had guaranteed they would never go above U+10FFFF, since going any higher would break UTF-16.
> I've thought of UTF-24, encoding each character in 3 octets; that's
> good for up to 16,777,216 distinct code points.

AIUI, there are some DSPs with CHAR_BIT==24 (or was that 12?).

S

--
Stephen Sprunk, CCIE #3723, K5SSS
"God does not play dice." --Albert Einstein
"God is an inveterate gambler, and He throws the dice at every possible opportunity." --Stephen Hawking
On Thu, 29 Nov 2012 22:06:08 +0000 (UTC), Grant Edwards
<invalid@invalid.invalid> wrote:

>On 2012-11-29, Jon Kirwan <jonk@infinitefactors.org> wrote:
>> On Thu, 29 Nov 2012 22:40:41 +0200, upsidedown@downunder.com
>> wrote:
>>
>>>On Thu, 29 Nov 2012 16:36:34 +0000, John Devereux
>>><john@devereux.me.uk> wrote:
>>>
>>>>Grant Edwards <invalid@invalid.invalid> writes:
>>>>
>>>>> On 2012-11-29, Tim Wescott <tim@seemywebsite.com> wrote:
>>>>>
>>>>>> It's certainly what I would expect from gcc-avr. There's no reason you
>>>>>> can't make a beautifully compliant, reasonably efficient compiler that
>>>>>> works well on the AVR.
>>>>>
>>>>> avr-gcc does indeed work very nicely as long as you don't look at the
>>>>> code generated when you use pointers. You'll go blind -- especially
>>>>> if you're used to something like the msp430. It's easy to forget that
>>>>> the AVR is an 8-bit CPU not a 16-bit CPU like the '430, and use of
>>>>> 16-bit pointers on the AVR requires a lot of overhead.
>>>>
>>>>Other problem with it is the separate program and data memory
>>>>spaces. Fine for small deeply embedded things but started to show strain
>>>>when I wanted a LCD display, menus etc. I would not use it for a new
>>>>project unless there was a very good reason, ultra-low power
>>>>perhaps. Cortex M3 is much nicer but the chips are much more complicated
>>>>of course.
>>>
>>>Except for self modifying code, why would one want data (program)
>>>access into program space (unless you are writing a linker or
>>>debugger) ??
>>>
>>>While working with PDP-11's in the 1970's, the ability to use separate
>>>I/D (Instruction/Data) space helped a lot to keep code/data in private
>>>64 KiD address spaces.
>>
>> There are good reasons for self-modifying code space.
>
>Nobody said anything about modifying code space.
Sorry I didn't interpret things well.
>The "data" that's put in code space is never modified (at least not
>in any project I've ever seen).
I've needed writable code space. Thunking is one such example.
>It's not _modifying_ the program space that's the issue (that is
>generally only done for firmware updates, where the entire flash is
>erased and reprogrammed).
While I agree with the "generally" I don't agree that this translates into 100%.
>Simply _reading_ program space _as_data_ is problematic. If you've
>got a lot of string constants or constant tables, you want to just
>leave them in flash (program space) rather than copy them all to
>(scarce) RAM on startup.
Indeed. Completely agreed. Jon
>Now you need three-byte pointers/addresses to differentiate between
>data at 0xABCD in data space and the data at 0xABCD in program space.
>Three byte pointers is how some compilers solve that problem -- but I
>don't think avr-gcc does that.
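To make the three-byte-pointer idea concrete, here is a minimal host-side sketch in C. It is not how avr-gcc or any particular compiler implements it: the two arrays merely stand in for flash and SRAM, and `fat_addr`, `fat_read`, `SPACE_DATA` and `SPACE_PROG` are hypothetical names invented for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the two address spaces. On a real AVR these would be
   flash and SRAM, not ordinary arrays (assumption for illustration). */
static const uint8_t flash_space[] = "menu text kept in flash";
static uint8_t       ram_space[32];

enum { SPACE_DATA = 0, SPACE_PROG = 1 };

/* A hypothetical "fat" address: one tag byte plus a 16-bit offset,
   three bytes of information in total. */
typedef struct {
    uint8_t  space;
    uint16_t offset;
} fat_addr;

/* Read one byte through a fat address. */
uint8_t fat_read(fat_addr a)
{
    /* Every access pays for this run-time dispatch on the tag. */
    if (a.space == SPACE_PROG) {
        assert(a.offset < sizeof flash_space);
        return flash_space[a.offset];
    }
    assert(a.offset < sizeof ram_space);
    return ram_space[a.offset];
}
```

The same offset names two different bytes depending on the tag, which is exactly the ambiguity the extra byte resolves. The price is a test and branch on every dereference.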
James Kuyper wrote:
>
> On 11/29/2012 09:07 AM, James Kuyper wrote:
> ...
> > "If an int can represent all values of the original type (as restricted
> > by the width, for a bit-field), the value is converted to an int;
> > otherwise, it is converted to an unsigned int. These are called the
> > integer promotions.58) All other types are unchanged by the integer
> > promotions."
> > (6.3.1.1p2) The first use of "integer promotions" in that clause is
> > italicized, which is an ISO convention indicating that the sentence
> > containing that phrase serves as the definition of the phrase.
>
> I just realized that the meaning of the phrase "All other types" is not
> clear without the preceding part of that clause which I snipped:
>
> > The following may be used in an expression wherever an int or
> > unsigned int may be used:
> > - An object or expression with an integer type (other than int or
> >   unsigned int) whose integer conversion rank is less than or equal
> >   to the rank of int and unsigned int.
> > - A bit-field of type _Bool, int, signed int, or unsigned int.
I recall reading some posts in this newsgroup a long time ago which claimed that, under certain circumstances, it was possible in C99 for unsigned int to promote to type signed int. But that was never the case.

In C99, 6.3.1.1 paragraph 2 read "less than" instead of "less than or equal to" as you have above, and the unsigned int type was covered by "All other types" in the last sentence.

--
pete
glen herrmannsfeldt <gah@ugcs.caltech.edu> writes:
> In comp.lang.c Jon Kirwan <jonk@infinitefactors.org> wrote:
>> On Thu, 29 Nov 2012 11:01:34 -0500, James Kuyper
>> <jameskuyper@verizon.net> wrote:
>
>>><snip>
>>>Claims have frequently been made on
>>>comp.lang.c that, while the C standard allows CHAR_BIT != 8, the
>
> As I remember the stories, the CRAY-1 had 64 bit char.

[...]

That may well be true; I never used a Cray-1. (And there was more emphasis on Fortran, or should I say FORTRAN, than on C.) By the time I started using Crays, they were running Unicos, Cray's version of Unix, so they pretty much had to have CHAR_BIT==8.

--
Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst>
Will write code for food.
"We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
Stephen Sprunk <stephen@sprunk.org> writes:
> On 29-Nov-12 14:36, Keith Thompson wrote:
>> upsidedown@downunder.com writes:
>>> IMHO CHAR_BIT = 21 is the correct way to handle the Unicode range.
>>>
>>> On the Unicode list, I even suggested packing three 21 characters into
>>> a single 64 bit data word as UTF-64 :-)
>>
>> I like it -- but it breaks as soon as they add U+200000 or higher, and
>> I'm not aware of any guarantee that they won't.
>
> I thought they had guaranteed they would never go above U+10FFFF, which
> would break UTF-16.
You're right. <http://www.unicode.org/faq/utf_bom.html> says:

    Both Unicode and ISO 10646 have policies in place that formally
    limit future code assignment to the integer range that can be
    expressed with current UTF-16 (0 to 1,114,111).
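Since 1,114,111 is 0x10FFFF and fits in 21 bits, the tongue-in-cheek "UTF-64" packing can be sketched in a few lines of C. This is only an illustration: the slot order (slot 0 in the low bits) and the names `utf64_pack`, `utf64_get`, `CP_BITS`, `CP_MASK` and `CP_MAX` are assumptions made here, not part of any real encoding.

```c
#include <assert.h>
#include <stdint.h>

#define CP_BITS 21u
#define CP_MASK ((UINT64_C(1) << CP_BITS) - 1)   /* 0x1FFFFF */
#define CP_MAX  UINT32_C(0x10FFFF)               /* highest Unicode code point */

/* Pack three code points into one 64-bit word; slot 0 is the low 21 bits.
   3 * 21 = 63, so the top bit of the word is left unused. */
uint64_t utf64_pack(uint32_t a, uint32_t b, uint32_t c)
{
    assert(a <= CP_MAX && b <= CP_MAX && c <= CP_MAX);
    return (uint64_t)a
         | ((uint64_t)b << CP_BITS)
         | ((uint64_t)c << (2 * CP_BITS));
}

/* Extract the code point stored in slot 0, 1 or 2. */
uint32_t utf64_get(uint64_t word, unsigned slot)
{
    assert(slot < 3);
    return (uint32_t)((word >> (slot * CP_BITS)) & CP_MASK);
}
```

A code point of U+200000 or above would need a 22nd bit and overflow its slot, which is the breakage discussed above. The formal limit quoted from the FAQ is what keeps the scheme safe.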
>> I've thought of UTF-24, encoding each character in 3 octets; that's
>> good for up to 16,777,216 distinct code points.
>
> AIUI, there are some DSPs with CHAR_BIT==24 (or was that 12?).

--
Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst>
Will write code for food.
"We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
On 11/29/2012 06:25 PM, pete wrote:
> James Kuyper wrote:
>>
>> On 11/29/2012 09:07 AM, James Kuyper wrote:
>> ...
>>> "If an int can represent all values of the original type (as restricted
>>> by the width, for a bit-field), the value is converted to an int;
>>> otherwise, it is converted to an unsigned int. These are called the
>>> integer promotions.58) All other types are unchanged by the integer
>>> promotions."
>>> (6.3.1.1p2) The first use of "integer promotions" in that clause is
>>> italicized, which is an ISO convention indicating that the sentence
>>> containing that phrase serves as the definition of the phrase.
>>
>> I just realized that the meaning of the phrase "All other types" is not
>> clear without the preceding part of that clause which I snipped:
>>
>>> The following may be used in an expression wherever an int or
>>> unsigned int may be used:
>>> - An object or expression with an integer type (other than int or
>>>   unsigned int) whose integer conversion rank is less than or equal
>>>   to the rank of int and unsigned int.
>>> - A bit-field of type _Bool, int, signed int, or unsigned int.
>
> I recall reading some posts in this newsgroup a long time ago, which
> claimed that under certain circumstances, that it was possible in C99,
> for unsigned int to promote to type signed int.
An unsigned type whose entire range can be represented by an int will promote to signed int, as can easily be confirmed by checking the above text, and that point has been raised in this group - there were several threads that touched on that subject in just this past summer. However, anyone who claimed that it could happen to "unsigned int" was mistaken. That clause explicitly applies only to types "other than int or unsigned int".
> But that was never the case.
>
> In C99
> 6.3.1.1 paragraph 2, read as "less than"
> instead of "less than or equal" as you have above;
> and unsigned int type was covered by "All other types"
> in the last sentence.

n1256.pdf (which is C99 with all three TCs applied, making it MORE useful than C99 itself) and n1570.pdf (which is essentially identical to C2011) both have "less than or equal to". The line is marked as being changed from C99 in n1256.pdf, implying that one of the TCs is the reason. My copy of C99 itself is inaccessible right now, so I can't confirm the nature of the change.

--
James Kuyper
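The promotion rule can be checked mechanically. The following is a small sketch that assumes a C11 compiler (for `_Generic` and `static_assert`); the macro `TYPE_CODE` and the three helper functions are names made up for this example. Unary plus is used because it applies the integer promotions to its operand.

```c
#include <assert.h>
#include <limits.h>

/* Map the type of an expression to a small code (C11 _Generic). */
#define TYPE_CODE(x) _Generic((x), \
    int:          1,               \
    unsigned int: 2,               \
    default:      0)

/* unsigned int is excluded from the promotions, so it stays unsigned int
   on every conforming implementation. */
static_assert(TYPE_CODE(+1u) == 2,
              "unsigned int is unchanged by the integer promotions");

/* Narrower unsigned types promote to int only if int can hold all of
   their values; otherwise they promote to unsigned int. */
int promoted_uchar(void)  { unsigned char  c = 200; return TYPE_CODE(+c); }
int promoted_ushort(void) { unsigned short s = 1;   return TYPE_CODE(+s); }
int promoted_uint(void)   { unsigned int   u = 1;   return TYPE_CODE(+u); }
```

On an implementation where int is wider than short, `promoted_ushort()` yields 1 (int). Where int and short are both 16 bits, as is common on 8-bit targets, USHRT_MAX exceeds INT_MAX and the result is 2 (unsigned int). `promoted_uint()` yields 2 everywhere.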
Grant Edwards <invalid@invalid.invalid> writes:

> On 2012-11-29, upsidedown@downunder.com <upsidedown@downunder.com> wrote:
>> On Thu, 29 Nov 2012 16:36:34 +0000, John Devereux
>><john@devereux.me.uk> wrote:
>>
>>>Grant Edwards <invalid@invalid.invalid> writes:
>>>
>>>> On 2012-11-29, Tim Wescott <tim@seemywebsite.com> wrote:
>>>>
>>>>> It's certainly what I would expect from gcc-avr. There's no reason you
>>>>> can't make a beautifully compliant, reasonably efficient compiler that
>>>>> works well on the AVR.
>>>>
>>>> avr-gcc does indeed work very nicely as long as you don't look at the
>>>> code generated when you use pointers. You'll go blind -- especially
>>>> if you're used to something like the msp430. It's easy to forget that
>>>> the AVR is an 8-bit CPU not a 16-bit CPU like the '430, and use of
>>>> 16-bit pointers on the AVR requires a lot of overhead.
>>>
>>>Other problem with it is the separate program and data memory
>>>spaces. Fine for small deeply embedded things but started to show strain
>>>when I wanted a LCD display, menus etc. I would not use it for a new
>>>project unless there was a very good reason, ultra-low power
>>>perhaps. Cortex M3 is much nicer but the chips are much more complicated
>>>of course.
>>
>> Except for self modifying code, why would one want data (program)
>> access into program space (unless you are writing a linker or
>> debugger) ??
>
> The "program" space was flash (non-volatile). The "data" space was
> registers and RAM (volatile). All non-volatile data (strings, screen
> templates, lookup tables, menu structures, and so) has to be in flash
> memory (IOW "program space"). It makes a _lot_ of sense to just use
> directly from flash instead of copying it all to RAM when RAM is so
> scarce.
Yes, that is precisely it. The AVRs especially tended to have lots of flash but little RAM. Access to program memory is possible on the AVR, but you have to use special attribute modifiers everywhere and the resulting objects become incompatible with the standard libraries, so you have to write special versions of these...

Another thing is that, being an 8-bit machine, int and short operations are not atomic. So you have to be very careful about protecting variables shared with interrupt handlers (or other tasks in a preemptive system). Good practice anyway, of course, but a modern CPU like the Cortex M3 is a lot more forgiving, since even 32-bit load/store operations are atomic.

[...]

--
John Devereux
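The atomicity problem is easy to reproduce on a host machine with a small simulation. This is only a sketch: the "interrupt" is an ordinary function call placed between the two byte loads, and `irq_arrives`, `irq_restore`, `irq_enabled` and the other names are hypothetical stand-ins for real interrupt hardware and its enable flag.

```c
#include <assert.h>
#include <stdint.h>

/* A 16-bit tick counter as an 8-bit CPU sees it: two separate bytes. */
static volatile uint8_t tick_lo = 0xFF, tick_hi = 0x00;   /* 0x00FF */
static int irq_enabled = 1;   /* stand-in for the global interrupt flag */
static int irq_pending = 0;

/* Simulated timer interrupt handler: increment the 16-bit counter. */
static void timer_isr(void)
{
    if (++tick_lo == 0) ++tick_hi;
}

/* Simulated interrupt request arriving at this point in the main code. */
static void irq_arrives(void)
{
    if (irq_enabled) timer_isr(); else irq_pending = 1;
}

/* Re-enable interrupts and service anything that was held off. */
static void irq_restore(void)
{
    assert(!irq_enabled);     /* must pair with a disable */
    irq_enabled = 1;
    if (irq_pending) { irq_pending = 0; timer_isr(); }
}

/* Unprotected read: the interrupt lands between the two byte loads. */
uint16_t read_ticks_unsafe(void)
{
    uint8_t lo = tick_lo;
    irq_arrives();
    uint8_t hi = tick_hi;
    return (uint16_t)((hi << 8) | lo);
}

/* Protected read: interrupts are held off for the duration of the copy. */
uint16_t read_ticks_safe(void)
{
    irq_enabled = 0;
    uint8_t lo = tick_lo;
    irq_arrives();            /* deferred until irq_restore() */
    uint8_t hi = tick_hi;
    irq_restore();
    return (uint16_t)((hi << 8) | lo);
}
```

Starting from 0x00FF, the unprotected read returns 0x01FF, a value the counter never held: the low byte is from before the increment and the high byte from after it. The protected read that follows returns a consistent 0x0100.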
James Kuyper <jameskuyper@verizon.net> wrote:
> n1256.pdf (which is C99 with all three TCs applied, making it MORE
> useful than C99 itself) and n1570.pdf (which is essentially identical to
> C2011) both have "less than or equal to". The line is marked as being
> changed from C99 in n1256.pdf, implying that one of the TCs is the
> reason. My copy of C99 itself is inaccessible right now, so I can't
> confirm the nature of the change.

It was TC2, and the change came from DR 230. It was to handle the case of enumerated types with the same rank as int; it didn't have anything to do with unsigned int.

--
Larry Jones

I'm a genius. -- Calvin
> I believe that C was implemented on the PDP-10. I didn't use
> it when I was programming the PDP-10 (I used assembly, then,
> and some other languages... but not C, until I worked on Unix
> v6 in '78.) But that was a 36-bit machine. And ASCII was
> packed into 7 bits so that 5 chars fit in a word. No one used
> 8, so far as I recall. That was the standard method. So I'm
> curious now what the C implementation did.

For Unix, serial I/O was as important as efficient storage of data. Most serial terminals can't do more than 8 bits, and usually use 7E or 7O. So, 8-bit char became standard.
On Fri, 30 Nov 2012 09:26:39 -0800 (PST),
me@linnix.info-for.us wrote:

>> I believe that C was implemented on the PDP-10. I didn't use
>> it when I was programming the PDP-10 (I used assembly, then,
>> and some other languages... but not C, until I worked on Unix
>> v6 in '78.) But that was a 36-bit machine. And ASCII was
>> packed into 7 bits so that 5 chars fit in a word. No one used
>> 8, so far as I recall. That was the standard method. So I'm
>> curious now what the C implementation did.
>
>For Unix, serial I/O was as important as efficient storage of data.
Given the cost of memory back then, primary or secondary, a great many man-hours were spent on efficient storage. Serial I/O was almost exclusively used because of how modems worked, then, for transmission over long distances. (Some may argue that it requires fewer wires, too, in cables. But that was less an issue then -- witness the 36-pin and 25-pin Centronics cables/connectors which were very wire-heavy.)

It turns out that terminals, like the ASR-33 and KSR-35, were often used without a computer for dial-up modem use over a phone line. So they used a serial interface, by design. Which meant that Unix needed to cope with it. But I wouldn't say "as important as." I worked on the v6 Unix kernel, so I was slightly aware of the situation.

Of course, I was just bringing up the PDP-10 because of its odd way of packing 7-bit codes into a 36-bit word.
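For readers who have not met this kind of packing, it can be sketched in C as follows. The layout assumed here (characters left-justified, lowest of the 36 bits unused) and the names `pack5x7` and `unpack5x7` are illustrative choices for this sketch; the 36-bit word is simply held in the low bits of a `uint64_t`.

```c
#include <assert.h>
#include <stdint.h>

/* Pack five 7-bit characters into a 36-bit word held in a uint64_t.
   Assumed layout: characters left-justified, lowest bit of the 36 unused. */
uint64_t pack5x7(const char s[5])
{
    uint64_t word = 0;
    for (int i = 0; i < 5; i++) {
        assert(((unsigned char)s[i] & 0x80) == 0);   /* 7-bit codes only */
        word = (word << 7) | (unsigned char)s[i];
    }
    return word << 1;   /* 5 * 7 = 35 bits, shifted up past the spare bit */
}

/* Extract character i (0 = leftmost) from a packed word. */
char unpack5x7(uint64_t word, int i)
{
    assert(i >= 0 && i < 5);
    return (char)((word >> (29 - 7 * i)) & 0x7F);
}
```

Five characters use 35 of the 36 bits. A C implementation that wants every char to be individually addressable with a plain pointer has to pick a different division of the word, which is why the question of what the PDP-10 C compiler did is a fair one.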
>Most serial terminal can't do more than 8 bits, and usually 7E or 7O.
>So, 8 bit char became standard.
At the time, there was no real standard at all. I saw equal numbers of machines using EBCDIC, 6-bit, and 7-bit codes (5-bit Baudot was waning by this time, but I also remember old terminals that used 5-bit). No machine used 8-bit for anything, then. The 8th bit was always just looked at as either 'don't punch it at all, so the paper tape is more durable' or else make it even or odd parity. Some of us would write programs to punch out visible English messages on the tape, which was one of the few reasons we actually wanted control over 8 bits (for those paper punch machines that punched 8.)

I honestly hoped that ASCII would win out in the end, but I didn't know whether it would. I almost had a feeling then that I'd be converting from one code to another the rest of my life, if things continued as they were. I wanted ASCII to win, though.

Side note: there was only a gradual "coming together" on the idea that an 8-bit byte was a "good idea." I think a lot of people these days imagine that it was always as obvious and as ubiquitous as it is today. But that's not entirely true. Things went to 8-bit, gradually. Partly because 8 bits is a nice 2^3 power thing, and partly because ASCII was gradually taking over as a standard and would fit into an 8-bit byte, nicely. There was a confluence of forces going on and this kind of "precipitated out" to what it is today.

Side note again: Recently, I read a "personal history" talking about the complexity of the ASR-33. The author has no idea. I also remember quite well the much more complicated KSR-35. I worked on repairing both, from time to time. By comparison, the ASR-33 was a toy, designed for less lifetime and less complex, as well. The earlier KSR-35 was made for men, so to speak -- extremely well lubricated system with real man-parts and not toy pieces. The ASR-33 had a cute little cylinder with the letters on it, not that unlike the typewriter ball. The KSR-35 had a large hammer block, instead.

Jon