Reply by Jonathan Kirwan ●December 5, 2003

On Fri, 5 Dec 2003 20:20:20 -0000, Paul wrote:

>Jon
>
>> On Fri, 5 Dec 2003 16:38:50 -0000, Paul wrote:
>> 
>> >> > I just ran this on both 1.26 and version 2 of the compiler.
>> >> > Both report that the size is 8.
>> >> 
>> >> ok, it was untested and the wrong right code, but this is the
>> >> right wrong code:
>> >> 
>> >> typedef unsigned char TUINT8;
>> >> typedef unsigned short int TUINT16;
>> >> typedef struct
>> >> {
>> >>   TUINT8 lP;
>> >>   TUINT8 lW;
>> >>   TUINT8 lT;
>> >>   TUINT16 te:5;
>> >>   TUINT16 de:3;
>> >>   TUINT16 ct;
>> >> } T_M_H_T;
>> >> #define T_M_D_T  sizeof(T_M_D_T) // Warning: The IAR compiler calculates 8 (instead of correct 7)!!!
>> >
>> >Actually, from my reading of the ISO standard, a compiler can do what
>> >it jolly well pleases when packing a structure and thus different
>> >compilers for the same architecture can return different values for the
>> >size of the same structure.
>> 
>> It can also do what it jolly well pleases when casting an 
>> unsigned int to a signed int, if the magnitude of the value 
>> being assigned is outside the positive extent of the signed 
>> int. Like deciding to generate a random number in that case, 
>> for example.  (ISO/IEC 9899:1999(E) 6.3.1.3)  Negative zero 
>> integers are also permitted.
>> 
>> Luckily, consumer pressures keep most of those weirdnesses in 
>> the closet and not in the products.
>> 
>> Sounds like IAR is using word alignments, so that 16-bit 
>> values are aligned on even address boundaries.  Do you think 
>> that the case here?  And would there be a rational argument 
>> for this, if so?
>
>The rationale is that the ANSI standard *only* specifies bit packing
>with unsigned int and signed int types, nothing else.  Thus, bitfields
>are required to be packed into the natural size of an int, usually the
>processor word size.  You can argue, however, that using "signed int"
>and "unsigned int" with field widths of 3 and 5 is no different from
>using "signed long" and "unsigned long" with field widths of 3 and 5.
>The fact is, you're still asking for *only* 3 and 5 bits of information
>so the compiler *could* pack all that into a byte and it would be
>*transparent* at the C level.  Are you with me on this?

Absolutely, Paul.  I was mostly curious, not being argumentative
at all.

The example's "bit packing" does, in fact, deal with unsigned
short int, which on the MSP430 I assume will mean 16 bits.

>Thus, you could argue that a structure is allocated like this:
>
>struct {
>  TUINT8 lP;     // byte 0
>  TUINT8 lW;     // byte 1
>  TUINT8 lT;     // byte 2
>  TUINT16 te:5;  // bits 7-3 of byte 3
>  TUINT16 de:3;  // bits 2-0 of byte 3
>  TUINT16 ct;    // bytes 4 and 5
>};

Yes, except my query wasn't about that.  It was about aligning
on even address boundaries, as in the VAX where you often found
4-byte objects aligned on 4-byte boundaries, 2-byte objects
aligned on 2-byte boundaries, 8-byte objects aligned on 8-byte
boundaries, and so on.  So I was wondering about the possibility
that the 8 bytes arrived from:

>// structure assumed to always start on even addresses, as its
>// largest atom is a 2-byte object....
>struct {
>  TUINT8 lP;     // byte 0         1-byte alignment rule
>  TUINT8 lW;     // byte 1         1-byte alignment rule
>  TUINT8 lT;     // byte 2         1-byte alignment rule
>                 // byte 3 "pad"   forced '.even' alignment
>  TUINT16 te:5;  // part of byte 4..5
>  TUINT16 de:3;  // part of byte 4..5
>  TUINT16 ct;    // byte 6..7
>};

I'd considered a smaller packing, too, as you seem to be
addressing... but that wasn't my question.  I was only curious
if it were possible that the compiler in question was using some
kind of VAX-like alignment rules and, if so, if there was logic
to that.

In other words, I was only trying to explain the 8-byte finding
in my own mind.  Not trying to argue that it should have been 6.
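
One way to settle a question like this on a given compiler is to ask the compiler directly. The sketch below is only an illustration, not IAR's documented behavior: it redeclares the structure from this thread and reports where `ct` lands and how large the whole thing is. The helper names (`ct_offset`, `layout_size`, `layout_is_word_aligned`, `print_layout`) are invented for the example. Bit-fields of type `unsigned short` go beyond what ISO C defines, so a compiler may reject them or lay them out differently; both 6 and 8 are plausible answers.

```c
#include <stddef.h>   /* offsetof, size_t */
#include <stdio.h>

typedef unsigned char TUINT8;
typedef unsigned short int TUINT16;

typedef struct {
    TUINT8  lP;
    TUINT8  lW;
    TUINT8  lT;
    TUINT16 te : 5;   /* non-int bit-field type: a compiler extension */
    TUINT16 de : 3;
    TUINT16 ct;
} T_M_H_T;

/* Byte offset of the 16-bit member (offsetof cannot be applied to bit-fields). */
size_t ct_offset(void) { return offsetof(T_M_H_T, ct); }

size_t layout_size(void) { return sizeof(T_M_H_T); }

/* Nonzero when ct sits on a TUINT16 boundary and the size is a multiple of it. */
int layout_is_word_aligned(void)
{
    return ct_offset() % sizeof(TUINT16) == 0 &&
           layout_size() % sizeof(TUINT16) == 0;
}

void print_layout(void)
{
    printf("offsetof(ct) = %lu, sizeof = %lu\n",
           (unsigned long)ct_offset(), (unsigned long)layout_size());
}
```

Calling `print_layout()` from `main` on the target shows which of the two layouts discussed here the compiler actually chose. An offset of 6 for `ct` would point to the pad-byte layout, an offset of 4 to the byte-packed one.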

>So, the structure can be allocated in *six* bytes; the packing into a
>word of the two bitfields is *not* necessary.  It's always confused me,
>and possibly other people too, what effect the type of a bitfield gives
>you other than to denote whether the field is treated as signed or
>not--I mean, you're telling the compiler how many bits you need, so
>surely it should be man enough to figure out the optimal packing for
>itself?

Hehe.

>I know that people think that using "unsigned char" means "pack this
>into a byte" and "unsigned long" usually means "pack this into a 32-bit
>word", but, well, both "unsigned char x : 3" and "unsigned long x : 3"
>are *beyond* the ANSI standard and are *not* defined by it.  Such
>constructs are noted as extensions to the ANSI standard by CrossWorks.

Agreed.

But my question remains.  And there is a market which sometimes
judges these decisions, too, regardless of what a compiler
vendor chooses to do within the specification.  I still wonder
about how the 8 bytes are arrived at.

Jon


Reply by microbit ●December 5, 2003

> Sounds like IAR is using word alignments, so that 16-bit values
> are aligned on even address boundaries.  Do you think that the
> case here?  And would there be a rational argument for this, if
> so?
> 
> Jon

The MSP430 doesn't have bus alignment, so if you read/write a word/long
to an odd address it's bye bye.
This has been a nightmare for me in my RFBasic project, as I deal with 16/32 bit
integers and 32 bit floats being transferred around between an RS232 buffer (in
modem mode), RF RX/TX buffers, parser buffer, and Flash. Each "transaction"
needs a constant check for padding inserts/deletes; it is an absolute
____nightmare____.
And, naturally, the linker cannot know at link time whether 16/32 bit values
stored in char-based arrays will be word aligned, so it needs to be runtime
checked/managed (not to mention alignment of linked lists).

It must be a 50 fold worse nightmare for compiler vendors, unless the LIB code
uses 8 bit based code, which of course flushes the MSP430's 16 bit performance
out the window.
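
A common way to sidestep the odd-address problem when 16/32-bit values live in char-based buffers is to move them a byte at a time, so that word accesses only ever touch properly aligned locals. What follows is a minimal sketch under that assumption, not code from RFBasic. The helper names (`load_u16`, `store_u16`, `load_u32`, `store_u32`) are hypothetical, `<stdint.h>` is assumed to be available, and values are kept in the CPU's native byte order.

```c
#include <stdint.h>
#include <string.h>   /* memcpy */

/* Read a 16-bit value from any address, even or odd, in native byte order. */
uint16_t load_u16(const unsigned char *p)
{
    uint16_t v;
    memcpy(&v, p, sizeof v);   /* byte-wise copy: no alignment requirement */
    return v;
}

/* Write a 16-bit value to any address, even or odd. */
void store_u16(unsigned char *p, uint16_t v)
{
    memcpy(p, &v, sizeof v);
}

/* The same pair for 32-bit values. */
uint32_t load_u32(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

void store_u32(unsigned char *p, uint32_t v)
{
    memcpy(p, &v, sizeof v);
}
```

The price is the one described above: every access turns into several byte moves instead of a single word move, so this is best kept to the buffers whose alignment really is unknown until runtime.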

Best regards,
Kris

Reply by r f ●December 5, 2003

Hi,

> The rationale is that the ANSI standard *only* specifies bit packing
> with unsigned int and signed int types, nothing else.  

Ok, the ANSI standard is not clear on that point and does not specify
things like little endian or IEEE 754 format or the radix in which
numbers are represented, but usually the same C source code is used on
PCs and MCs and therefore you usually expect that you can read/write
data produced by an MC with a PC. From a high-priced compiler like the
icc430 I do expect the same behavior or a warning! Why else should
someone pay so much money?

Maybe some other compilers are no better on that point, but the icc430
has some other bugs which show that it is no ANSI compiler. One
of these bugs is that in sprintf the conversion specifier u does not
work; the compiled code simply prints the specifier (and the preceding
%) instead of the number, although the conversion specifier u is in the
ANSI C standard!
As a workaround I use the specifier d instead of u.

I'm sure I would find dozens of other bugs with a compiler test suite.

Rolf F.

Reply by Paul Curtis ●December 5, 2003

Jon,

I'll cut the agreed bits; Anders has covered this more than a little.
Perhaps he's cursing the day he decided to join or stop trolling... ;-)


> >Thus, you could argue that a structure is allocated like this:
> >
> >struct {
> >  TUINT8 lP;     // byte 0
> >  TUINT8 lW;     // byte 1
> >  TUINT8 lT;     // byte 2
> >  TUINT16 te:5;  // bits 7-3 of byte 3
> >  TUINT16 de:3;  // bits 2-0 of byte 3
> >  TUINT16 ct;    // bytes 4 and 5
> >};
> 
> Yes, except my query wasn't about that.  It was about 
> aligning on even address boundaries, as in the VAX where you 
> often found 4-byte objects aligned on 4-byte boundaries, 
> 2-byte objects aligned on 2-byte boundaries, 8-byte objects 
> aligned on 8-byte boundaries, and so on.  So I was wondering 
> about the possibility that the 8 bytes arrived from:
> 
> >// structure assumed to always start on even addresses, as its
> >// largest atom is a 2-byte object....
> >struct {
> >  TUINT8 lP;     // byte 0         1-byte alignment rule
> >  TUINT8 lW;     // byte 1         1-byte alignment rule
> >  TUINT8 lT;     // byte 2         1-byte alignment rule
> >                 // byte 3 "pad"   forced '.even' alignment
> >  TUINT16 te:5;  // part of byte 4..5
> >  TUINT16 de:3;  // part of byte 4..5
> >  TUINT16 ct;    // byte 6..7
> >};
> 
> I'd considered a smaller packing, too, as you seem to be 
> addressing... but that wasn't my question.  I was only 
> curious if it were possible that the compiler in question was 
> using some kind of VAX-like alignment rules and, if so, if 
> there was logic to that.

Hmm, I thought the VAX didn't require alignment but would benefit from
alignment if aligned at runtime.

> In other words, I was only trying to explain the 8-byte 
> finding in my own mind.  Not trying to argue that it should 
> have been 6.

Most (if not all) compilers by default allocate fields read
top-to-bottom (lexical order) as low-to-high data in memory order.
Anders correctly points out that a field that requires alignment may
lead to padding if not aligned at the point it's seen.  On the MSP430 an
int requires word alignment (even address).  In the example, the int
fields aren't word aligned hence the forced alignment and consequent
padding bytes.  Anders also points out the possibility of "rounding" the
structure by adding extra padding bytes *after* the last member as the
alignment of the structure as a whole is equal to the maximal alignment
required by any of its members.  This neatly deals with allocating array
elements correctly so their members are all aligned correctly.

Some compilers can *reorder* a structure to better use memory *without*
a compromise in code quality.  So, for instance:

struct { char x; int y; char z; }

would be reordered to

struct { char x, z; int y; }

removing two padding bytes, but having no size increase on compiled code
size.
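
The saving is easy to measure. ISO C requires members to keep their declaration order, so the sketch below does the reordering by hand in the source rather than relying on a compiler to do it. The struct tags and helper names are invented for the example. With a 16-bit, word-aligned int one would typically expect 6 bytes against 4; a host with 32-bit int gives larger numbers but shows the same effect.

```c
#include <stddef.h>

/* Declaration order as written: a char on each side of the int. */
struct interleaved { char x; int y; char z; };

/* Same members, grouped by hand so the two chars sit next to each other. */
struct grouped { char x, z; int y; };

size_t interleaved_size(void) { return sizeof(struct interleaved); }
size_t grouped_size(void)     { return sizeof(struct grouped); }

/* Padding bytes = total size minus the bytes the members themselves need. */
size_t padding_bytes(size_t total)
{
    return total - (2 * sizeof(char) + sizeof(int));
}
```

Comparing `padding_bytes(interleaved_size())` with `padding_bytes(grouped_size())` on the target shows how many bytes the reordering recovers per instance.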

Some compilers implement packing (ahh, the infamous PACKED structures of
Pascal, the bane of my life) which will reduce the overall data size for
a program by packing data and disregarding its natural alignment
requirement, but this forces the compiler to generate much more code to
access the packed structures.  The 68020 has BFEXTU and BFEXTS to do
bitfield extraction, which helped, but many other processors use
shifting and masking.  The MSP430 isn't well endowed in the shift and
rotate department, rather unlike the 68K.  I wish the MSP430 had a
barrel shifter and instructions to use it.
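
For reference, the shifting and masking looks roughly like this when done by hand. It is a minimal sketch that assumes the byte-packed arrangement discussed in this thread (the 5-bit field in bits 7..3, the 3-bit field in bits 2..0); the macro and function names are made up for the example.

```c
#include <stdint.h>

#define TE_SHIFT 3u
#define TE_MASK  0x1Fu   /* 5 bits */
#define DE_MASK  0x07u   /* 3 bits */

/* Pack te into bits 7..3 and de into bits 2..0 of one byte. */
uint8_t pack_te_de(unsigned te, unsigned de)
{
    return (uint8_t)(((te & TE_MASK) << TE_SHIFT) | (de & DE_MASK));
}

/* Extract the 5-bit field: shift down, then mask. */
unsigned unpack_te(uint8_t b) { return (b >> TE_SHIFT) & TE_MASK; }

/* Extract the 3-bit field: mask only, no shift needed. */
unsigned unpack_de(uint8_t b) { return b & DE_MASK; }
```

Without a barrel shifter, a shift by three typically costs three single-bit shift instructions, which is why the low field is the cheaper one to read.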

--
Paul Curtis, Rowley Associates Ltd http://www.rowley.co.uk
CrossWorks for MSP430 and ARM processors

Reply by Paul Curtis ●December 5, 2003

Kris,

> The MSP430 doesn't have bus alignment, so if you read/write a 
> word/long to an odd address it's bye bye. This has been a 
> nightmare for me in my RFBasic project, as I deal with 16/32 bit 
> integers and 32 bit floats being transferred around between 
> an RS232 buffer (in modem mode), RF RX/TX buffers, parser 
> buffer, and Flash. Each "transaction" needs constant check 
> for padding insert/deletes, it is an absolute 
> ____nightmare____ And, naturally, the linker cannot know at 
> link time whether 16/32 bit values stored in char-based 
> arrays will be word aligned, so it needs to be runtime 
> checked/managed. (not to mention alignment of linked lists)

Actually, the linker *can* know whether an access will be to a
word-aligned address or not in some cases.

Consider the absolute, symbolic, and stack-relative (register-offset)
addressing modes using word addressing:

(a) Absolute, coded as x(SR), where the base reads as zero.  x must be
divisible by 2 to maintain word alignment of data.  The linker can check
this.

(b) Symbolic, coded as x(PC).  The PC is always word aligned, so x must
be divisible by two to maintain word alignment.  The linker can check
this.

(c) Stack-relative, coded as x(SP).  The SP is always word aligned, so x
must be divisible by two to maintain word alignment.  The linker can
check this.

In many cases the linker can't check for problems as it doesn't know, at
any given instance, whether a register value will be aligned correctly
or not.
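
Where the linker cannot help, the same divisibility test can be made at runtime. The sketch below is an illustration only; the function names are invented, and converting a pointer to `uintptr_t` is assumed to yield the plain address, which holds on the usual flat-address targets but is implementation-defined.

```c
#include <stdint.h>

/* Nonzero when p is on an even address and so usable for a 16-bit access. */
int is_word_aligned(const void *p)
{
    return ((uintptr_t)p & 1u) == 0;
}

/* The link-time style check: an offset from a word-aligned base (PC, SP,
   or zero for absolute addressing) keeps alignment only if it is even. */
int offset_keeps_alignment(long x)
{
    return (x % 2) == 0;
}
```

A caller can use `is_word_aligned()` to choose between a direct word access and a byte-by-byte fallback.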

Regards,

--
Paul Curtis, Rowley Associates Ltd http://www.rowley.co.uk
CrossWorks for MSP430 and ARM processors

Reply by Paul Curtis ●December 5, 2003

Rolf,

> > The rationale is that the ANSI standard *only* specifies bit packing
> > with unsigned int and signed int types, nothing else.
> 
> Ok, the ANSI standard is not clear on that point and does not specify
> things like little endian or IEEE 754 format or the radix in which
> numbers are represented, but usually the same C source code is used on
> PCs and MCs and therefore you usually expect that you can read/write
> data produced by an MC with a PC.

That's a high expectation.  I know that IEC 60559 (IEEE 754) is now more
common, but <float.h> gives you a good indication of the floating point
environment.  For a uC that's big endian, I *know* I can't read/write
data on a PC without byte swapping.
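
Both points can be put into a few lines of C. This is a rough sketch, with function names invented for the example: two byte-swap helpers for moving integers between big- and little-endian machines, and a check of the <float.h> parameters. Matching parameters only suggest the IEC 60559 single format; they do not prove full conformance.

```c
#include <float.h>
#include <stdint.h>

/* Reverse the byte order of a 16-bit value. */
uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

/* Reverse the byte order of a 32-bit value. */
uint32_t swap32(uint32_t v)
{
    return ((v & 0x000000FFul) << 24) | ((v & 0x0000FF00ul) << 8) |
           ((v & 0x00FF0000ul) >> 8)  | ((v & 0xFF000000ul) >> 24);
}

/* Nonzero when float has the radix, precision and exponent range of the
   IEC 60559 single format. */
int float_looks_like_iec60559_single(void)
{
    return FLT_RADIX == 2 && FLT_MANT_DIG == 24 &&
           FLT_MAX_EXP == 128 && FLT_MIN_EXP == -125;
}
```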

> From a high-priced compiler like the 
> icc430 I do expect the same behavior or a warning! Why else should 
> someone pay so much money?

Perceived quality.  Big company.  Big overheads.  Big cash requirement.
Therefore, big price.  QED.  ;-)

> Maybe some other compilers are no better on that point, but the icc430 
> has some other bugs which show that it is no ANSI compiler. 

It's pretty good, having studied the compiler.  I'd say it's one of the
better ANSI compilers.  Besides, I believe it's based on the EDG front
end, and that's a very good indication of ANSI compliance.  However, I
assume it has no conformance certificate, but can't say for sure.  A
compiler isn't LINT.  If you need such a tool, get a tool that's
suitable for the job.  Gimpel must be the market leader in commercial
linting, and if you need that crutch, then it's a good buy.  We don't
ask Gimpel to generate code out of lint, but a compiler would be
reckless if it didn't report a potential problem that's easy to
diagnose.  However, my compiler doesn't go out of its way to lint a
program, I'm much more concerned with its speed of compilation and the
quality of the generated code.

> One of these bugs is that in sprintf the conversion specifier u does not 
> work; the compiled code simply prints the specifier (and the preceding 
> %) instead of the number, although the conversion specifier u is in the 
> ANSI C standard!
> As a workaround I use the specifier d instead of u.

%u works, I know, I tested it in IAR's product.  However, whether %hu or
%hhu work, I'm not sure, I didn't test that with the medium formatter.
Are you sure you don't have finger/linker script trouble?
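
A quick self-check run on the target can separate a configuration problem from a library bug. The sketch below is only a suggestion, with function names invented for the example. It uses 40000 because that value fits a 16-bit unsigned int but not a 16-bit signed int, so on such a target the %d workaround would typically print it as a negative number.

```c
#include <stdio.h>
#include <string.h>

/* Nonzero when %u prints the digits rather than echoing the specifier. */
int percent_u_works(void)
{
    char buf[16];
    sprintf(buf, "%u", 40000u);
    return strcmp(buf, "40000") == 0;
}

/* The same check for unsigned short with the h length modifier. */
int percent_hu_works(void)
{
    char buf[16];
    unsigned short v = 40000u;
    sprintf(buf, "%hu", v);
    return strcmp(buf, "40000") == 0;
}
```

If `percent_u_works()` fails with one printf formatter selected and passes with a fuller one, the cause is the library configuration rather than the compiler.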

> I'm sure I would find dozens of other bugs with a compiler test suite.

That's an assertion that I find hard to swallow because you have so
little data to back up the claim.  I think your %hu/%u problem is a
configuration problem, not an ANSI compliance problem.  In micro land,
we often need to ditch the baggage of the ISO standard because if we
included everything in the library to achieve compliance for all
applications, then we'd run out of room on our devices.  However, there
*is* the possibility of a full and complete freestanding library
implementation available in IAR and CrossWorks, the compilers I've
studied the most.  It's just that most users don't want that as the
default.

I can't believe I'm standing up for IAR here... ;-)

--
Paul Curtis, Rowley Associates Ltd http://www.rowley.co.uk
CrossWorks for MSP430 and ARM processors

Reply by microbit ●December 5, 2003

Hi Paul,

I have noticed such behaviour, but I am talking specifically about
reading/writing processed 16/32 bit stuff from/into char based arrays, which
then can be written to another buffer yet again (e.g. for RF transfer).
This is totally dynamic, and depending on the user's program input can land
at even or odd addresses after tokenising.
Once it's done it's not so bad, but the initial "trapping" at various points
in runtime is woeful... !  I have no other choice there, nor does the
compiler/linker used.

Cheers,
Kris

> Actually, the linker *can* know whether an access will be to a
> word-aligned address or not in some cases.
> 
> Consider the absolute, symbolic, and stack-relative (register-offset)
> addressing modes using word addressing:
> 
> (a) Absolute, coded as x(SR), where the base reads as zero.  x must be
> divisible by 2 to maintain word alignment of data.  The linker can check
> this.
> 
> (b) Symbolic, coded as x(PC).  The PC is always word aligned, so x must
> be divisible by two to maintain word alignment.  The linker can check
> this.
> 
> (c) Stack-relative, coded as x(SP).  The SP is always word aligned, so x
> must be divisible by two to maintain word alignment.  The linker can
> check this.
> 
> In many cases the linker can't check for problems as it doesn't know, at
> any given instance, whether a register value will be aligned correctly
> or not.
> 
> Regards,
> 
> --
> Paul Curtis, Rowley Associates Ltd http://www.rowley.co.uk
> CrossWorks for MSP430 and ARM processors 

Reply by Jonathan Kirwan ●December 5, 2003

On Sat, 6 Dec 2003 08:20:35 +1100, Kris wrote:

>> Sounds like IAR is using word alignments, so that 16-bit values
>> are aligned on even address boundaries.  Do you think that the
>> case here?  And would there be a rational argument for this, if
>> so?
>> 
>> Jon
>
>The MSP430 doesn't have bus alignment, so if you read/write a word/long
>to an odd address it's bye bye.

Yes.  And this would explain why IAR would want to do a "pad
byte" in their structure and then always align the structure
itself (or arrays of them) on an even address.  I was wondering
if this would be brought up.

>This has been a nightmare for me in my RFBasic project, as I deal with 16/32 bit
>integers and 32 bit floats being transferred around between an RS232 buffer
>(in modem mode), RF RX/TX buffers, parser buffer, and Flash. Each "transaction"
>needs constant check for padding insert/deletes,
>it is an absolute ____nightmare____
>And, naturally, the linker cannot know at link time whether 16/32 bit values
>stored in char-based arrays will be word aligned, so it needs to be runtime
>checked/managed.
>(not to mention alignment of linked lists)

Yes!

>It must be a 50 fold worse nightmare for compiler vendors, unless the LIB code
>uses 8 bit based code, which of course flushes the MSP430's 16 bit performance
>out the window.

hehe.

Excellent comments, all.

Jon

Reply by Jonathan Kirwan ●December 5, 2003

On Fri, 5 Dec 2003 21:29:15 -0000, Paul wrote:

>I'll cut the agreed bits; Anders has covered this more than a little.
>Perhaps he's cursing the day he decided to join or stop trolling... ;-)
>
>
>> >Thus, you could argue that a structure is allocated like this:
>> >
>> >struct {
>> >  TUINT8 lP;     // byte 0
>> >  TUINT8 lW;     // byte 1
>> >  TUINT8 lT;     // byte 2
>> >  TUINT16 te:5;  // bits 7-3 of byte 3
>> >  TUINT16 de:3;  // bits 2-0 of byte 3
>> >  TUINT16 ct;    // bytes 4 and 5
>> >};
>> 
>> Yes, except my query wasn't about that.  It was about 
>> aligning on even address boundaries, as in the VAX where you 
>> often found 4-byte objects aligned on 4-byte boundaries, 
>> 2-byte objects aligned on 2-byte boundaries, 8-byte objects 
>> aligned on 8-byte boundaries, and so on.  So I was wondering 
>> about the possibility that the 8 bytes arrived from:
>> 
>> >// structure assumed to always start on even addresses, as its
>> >// largest atom is a 2-byte object....
>> >struct {
>> >  TUINT8 lP;     // byte 0         1-byte alignment rule
>> >  TUINT8 lW;     // byte 1         1-byte alignment rule
>> >  TUINT8 lT;     // byte 2         1-byte alignment rule
>> >                 // byte 3 "pad"   forced '.even' alignment
>> >  TUINT16 te:5;  // part of byte 4..5
>> >  TUINT16 de:3;  // part of byte 4..5
>> >  TUINT16 ct;    // byte 6..7
>> >};
>> 
>> I'd considered a smaller packing, too, as you seem to be 
>> addressing... but that wasn't my question.  I was only 
>> curious if it were possible that the compiler in question was 
>> using some kind of VAX-like alignment rules and, if so, if 
>> there was logic to that.
>
>Hmm, I thought the VAX didn't require alignment but would benefit from
>alignment if aligned at runtime.

Benefits... yes.  The rules were "suggestions."

>> In other words, I was only trying to explain the 8-byte 
>> finding in my own mind.  Not trying to argue that it should 
>> have been 6.
>
>Most (if not all) compilers by default allocate fields read
>top-to-bottom (lexical order) as low-to-high data in memory order.
>Anders correctly points out that a field that requires alignment may
>lead to padding if not aligned at the point it's seen.  On the MSP430 an
>int requires word alignment (even address).  In the example, the int
>fields aren't word aligned hence the forced alignment and consequent
>padding bytes.  Anders also points out the possibility of "rounding" the
>structure by adding extra padding bytes *after* the last member as the
>alignment of the structure as a whole is equal to the maximal alignment
>required by any of its members.  This neatly deals with allocating array
>elements correctly so their members are all aligned correctly.

My suspicion is that the alignment pad takes place *after* the
three bytes in the structure.  That seems to be the logical
place for it.

>Some compilers can *reorder* a structure to better use memory *without*
>a compromise in code quality.  So, for instance:
>
>struct { char x; int y; char z; }
>
>would be reordered to
>
>struct { char x, z; int y; }
>
>removing two padding bytes, but having no size increase on compiled code
>size.

But it cannot change it to:

  struct { char z, x; int y; }

can it?

My memory may be wrong about this.  I do believe I know that
padding may not be inserted at the beginning of a structure,
under 6.7.2.1 (13).  But is it possible to move the first item
from being the "first?"  The reason I ask this is:  If no
padding is allowed at the beginning and if it is also true that
the first item named in a structure declaration must remain the
logically first in terms of memory addressing, then it follows
that a pointer to the structure will compare equal to a pointer
to the first member of the structure, suitably cast.

Just curious if you recall the chapter and verse on this.
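
For what it's worth, both properties can be checked on a particular compiler with a few lines. My reading of 6.7.2.1 in ISO/IEC 9899:1999 is that member addresses increase in declaration order and that a pointer to a structure, suitably converted, points to its initial member, so these checks should hold everywhere. The struct tag and function names below are invented for the example.

```c
#include <stddef.h>

struct sample { char x; int y; char z; };

/* Nonzero when a pointer to the structure, suitably converted,
   equals a pointer to its first member. */
int first_member_at_start(const struct sample *s)
{
    return (const void *)s == (const void *)&s->x;
}

/* Nonzero when member offsets increase in declaration order,
   starting with no padding before the first member. */
int members_in_declaration_order(void)
{
    return offsetof(struct sample, x) == 0 &&
           offsetof(struct sample, x) < offsetof(struct sample, y) &&
           offsetof(struct sample, y) < offsetof(struct sample, z);
}
```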

>Some compilers implement packing (ahh, the infamous PACKED structures of
>Pascal, the bane of my life) which will reduce the overall data size for
>a program by packing data and disregarding its natural alignment
>requirement, but this forces the compiler to generate much more code to
>access the packed structures.  The 68020 has BFEXTU and BFEXTS to do
>bitfield extraction, which helped, but many other processors use
>shifting and masking.  The MSP430 isn't well endowed in the shift and
>rotate department, rather unlike the 68K.  I wish the MSP430 had a
>barrel shifter and instructions to use it.

Thanks for the comments.  Enjoyed them.

And yes!  My applications could very often benefit from a barrel
shifter (full, so that any number of shifts can be done in one
clock.)  I'd like the functions to include support for
normalization (finding the lead bit) as well as denorms.
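
To show what the hardware would be replacing, here is normalization done in software with single-bit shifts. It is a minimal sketch for 16-bit values; the function names are invented for the example. The loop runs once per leading zero, which is exactly the cost a full barrel shifter with lead-bit detection would collapse into one step.

```c
#include <stdint.h>

/* Count leading zero bits of a 16-bit value, one single-bit shift per step. */
unsigned leading_zeros16(uint16_t v)
{
    unsigned n = 0;
    if (v == 0)
        return 16;
    while ((v & 0x8000u) == 0) {
        v = (uint16_t)(v << 1);
        ++n;
    }
    return n;
}

/* Shift v left until its top bit is set; *shift receives the shift count.
   Zero cannot be normalized, so it is returned unchanged with *shift == 16. */
uint16_t normalize16(uint16_t v, unsigned *shift)
{
    *shift = leading_zeros16(v);
    return (*shift < 16) ? (uint16_t)(v << *shift) : 0;
}
```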

I don't want floating point in hardware in my embedded
applications as the power premium one must pay is almost always
far too high for it.  Not to mention the piece price.  I pay for
its existence even when I don't need it, in amps.  (I use the
ADSP-21xx quite a bit and it's a wonderful, dead-cold to the
touch, low-power DSP with a nice, handy barrel shifter for those
times when you want it.)

But a barrel shifter???  A very loud YES!

Jon

Reply by Clyde Stubbs ●December 5, 2003

On Fri, Dec 05, 2003 at 02:39:32PM -0800, Jonathan Kirwan wrote:
> On Fri, 5 Dec 2003 21:29:15 -0000, Paul wrote:
> >would be reordered to
> >
> >struct { char x, z; int y; }
> >
> >removing two padding bytes, but having no size increase on compiled code
> >size.
> 
> But it cannot change it to:
> 
>   struct { char z, x; int y; }
> 
> can it?

Quite right Jon, the ordering of structure members is guaranteed
by the standard to be the same as the lexical order. Not so with
scalar variables, whose memory address need bear no relationship
with order in the source file.

As far as padding goes, it may be inserted anywhere except at
the beginning. On the msp430 it is *necessary* to word-align 
structure members, and to pad structure sizes to even numbers
of bytes.

Trust me on this, I've been doing it for 25 years.

Anyone who expects to be able to transfer data from one
architecture to another just by copying structures is in for
a nasty surprise, sooner or later.
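
The usual alternative is to define the transfer format byte by byte and convert explicitly at each end. The sketch below is one possible arrangement, not a format anyone in this thread specified: a 6-byte record with the 5-bit and 3-bit fields sharing one byte and the 16-bit field sent least significant byte first. The struct, macro and function names are all invented for the example.

```c
#include <stdint.h>

/* In-memory form: plain members, laid out however the compiler likes. */
struct record {
    uint8_t  lP, lW, lT;
    uint8_t  te;   /* 0..31 */
    uint8_t  de;   /* 0..7  */
    uint16_t ct;
};

#define RECORD_WIRE_SIZE 6u

/* Write the record into exactly 6 bytes, independent of struct layout. */
void record_encode(const struct record *r, uint8_t out[RECORD_WIRE_SIZE])
{
    out[0] = r->lP;
    out[1] = r->lW;
    out[2] = r->lT;
    out[3] = (uint8_t)(((r->te & 0x1Fu) << 3) | (r->de & 0x07u));
    out[4] = (uint8_t)(r->ct & 0xFFu);   /* low byte first */
    out[5] = (uint8_t)(r->ct >> 8);
}

/* Rebuild the record from the 6-byte form. */
void record_decode(struct record *r, const uint8_t in[RECORD_WIRE_SIZE])
{
    r->lP = in[0];
    r->lW = in[1];
    r->lT = in[2];
    r->te = (uint8_t)(in[3] >> 3);
    r->de = (uint8_t)(in[3] & 0x07u);
    r->ct = (uint16_t)(in[4] | ((uint16_t)in[5] << 8));
}
```

Because only individual bytes cross the wire, neither padding, alignment nor byte order on either side affects the result.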

Cheers, Clyde

-- 
Clyde Stubbs                     |            HI-TECH Software
Email: clyde@clyd...          |          Phone            Fax
WWW:   http://www.htsoft.com/    | USA: (408) 490 2885  (408) 490 2885
PGP:   finger clyde@clyd...   | AUS: +61 7 3552 7777 +61 7 3552 7778
---
HI-TECH C: compiling the real world.
