Octets with non-8 bit bytes...

Ok, I'm sure this has been beaten to death, but google, etc. found a
lot of descriptions of the problem but none of a portable solution.

I'm working with some firmware drivers which are intended to be as
portable as possible.  Data moves thru a switchable 8- or 16-bit data
bus chip (a USB device controller specifically).  Performance is
critical so 16-bit is pretty much necessary.  Following that example,
let's look at the USB mass storage class.  You get commands from the
host in 31 octet command wrappers that look like this (endian issues
aside...):

typedef struct
{
	u32    Signature;
	u32    Tag;
	u32    TransferLength;
	u8     Flags;
	u8     Lun;
	u8     CommandLength;
	u8     Command[15];
} Cbw;

If I have 8 bit data types that's easy enough to get and deal with. 
But right now I'm working with a TMS320C55x variant with nothing
smaller than 16-bit data types.  So naturally the 8 bit types get all
mixed up when I read them and when I send back similar data every
other octet is garbage.  Some responses are filled at runtime, a few
are global constants.  I can pack things early, but then I need to
unpack, modify, and repack.  Or I can pack before transmission, but
that'd take a bite out of performance.  Or I can break things down:

typedef struct
{
	BYTE	Signature0;
	BYTE	Signature1;
	BYTE	Signature2;
	BYTE	Signature3;
	BYTE	Tag0;
	BYTE	Tag1;
	BYTE	Tag2;
	BYTE	Tag3;
	BYTE	TransferLength0;
	BYTE	TransferLength1;
	BYTE	TransferLength2;
	BYTE	TransferLength3;
	BYTE	Flags;
	BYTE	Lun;
	BYTE	CommandLength;
	BYTE	Command[15];
} Cbw;

Ugly.  I'd really like to avoid that...

Now, I see this problem described countless times (yes, yes,
sizeof(char)==sizeof(int)==1, 16 bit byte is 100% ok by the standard),
but what's the best portable solution to dealing with this?  Or at
least *mostly* portable.  All the messages I see say "don't store
binary data and don't worry about how many bits are in anything". 
Great, but that embedded command field being sent from my host
computer 5 meters away is 15 octets whether I like it or not.  I don't
care if everything's stored locally inefficiently so long as
performance is reasonable (and it's clear!  Other people *will* be
dealing with this code!)

I'm making progress getting things to work, but it's getting ugly so I
was curious how people deal with this in real life.

Thanks for whatever guidance you can provide,
alex

Reply by Guy Macon ● June 10, 2004

"Octets" and "Bytes" are always 8 bits.  The term you want is "Words."

Reply by Jerry Avins ● June 10, 2004

Guy Macon wrote:

> "Octets" and "Bytes" are always 8 bits.  ...


Not in C. A C byte is the smallest of

1) a character used by the system,
2) the smallest memory chunk that can be individually addressed, or
3) eight bits.

On most DSPs, the C compiler considers a byte to contain 16 or 32 bits.

sizeof(char) is always 1. sizeof() returns storage size in bytes. On
most DSPs, sizeof(int) is 1. On many, sizeof(long) is also 1. Try it.
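
One way to "try it" on a given toolchain, as a minimal sketch. The names BITS_OF and print_type_sizes are made up for illustration, and the code assumes a hosted environment with printf; on a bare-metal DSP you would more likely inspect the values in a debugger or route them to whatever output you have.

```c
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper macro: width of a type's storage in bits.
   sizeof counts C bytes (chars), and each char holds CHAR_BIT bits. */
#define BITS_OF(type) ((unsigned)(sizeof(type) * CHAR_BIT))

/* Hypothetical helper: report what this compiler thinks a byte is. */
static void print_type_sizes(void)
{
    printf("CHAR_BIT = %d\n", CHAR_BIT);
    printf("char:  sizeof = %u, bits = %u\n", (unsigned)sizeof(char),  BITS_OF(char));
    printf("int:   sizeof = %u, bits = %u\n", (unsigned)sizeof(int),   BITS_OF(int));
    printf("long:  sizeof = %u, bits = %u\n", (unsigned)sizeof(long),  BITS_OF(long));
}
```

The output depends entirely on the target: a desktop compiler will typically report CHAR_BIT as 8, while a compiler for a word-addressed DSP may report 16 or more.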

Jerry
-- 
Engineering is the art of making what you want from things you can get.

Reply by Jack Klein ● June 10, 2004

On 10 Jun 2004 16:59:49 -0700, usenet1@sanks.net (Alex Sanks) wrote in
comp.arch.embedded:

> Ok, I'm sure this has been beaten to death, but google, etc. found a
> lot of descriptions of the problem but none of a portable solution.
> 
> I'm working with some firmware drivers which are intended to be as
> portable as possible.  Data moves thru a switchable 8- or 16-bit data
> bus chip (a USB device controller specifically).  Performance is
> critical so 16-bit is pretty much necessary.  Following that example,
> let's look at the USB mass storage class.  You get commands from the
> host in 31 octet command wrappers that look like this (endian issues
> aside...):
> 
> typedef struct
> {
> 	u32    Signature;
>         u32    Tag;
> 	u32    TransferLength;
> 	u8     Flags;
>         u8     Lun;
> 	u8     CommandLength;
> 	u8     Command[15];
> } Cbw;
> 
> If I have 8 bit data types that's easy enough to get and deal with. 
> But right now I'm working with a TMS320C55x variant with nothing
> smaller than 16-bit data types.  So naturally the 8 bit types get all
> mixed up when I read them and when I send back similar data every
> other octet is garbage.  Some responses are filled at runtime, a few
> are global constants.  I can pack things early, but then I need to
> unpack, modify, and repack.  Or I can pack before transmission, but
> that'd take a bite out of performance.  Or I can break things down:
> 
> typedef struct
> {
> 	BYTE	Signature0;
> 	BYTE	Signature1;
> 	BYTE	Signature2;
> 	BYTE	Signature3;
> 	BYTE	Tag0;
> 	BYTE	Tag1;
> 	BYTE	Tag2;
> 	BYTE	Tag3;
> 	BYTE	TransferLength0;
> 	BYTE	TransferLength1;
> 	BYTE	TransferLength2;
> 	BYTE	TransferLength3;
> 	BYTE	Flags;
> 	BYTE	Lun;
> 	BYTE	CommandLength;
> 	BYTE	Command[15];
> } Cbw;
> 
> Ugly.  I'd really like to avoid that...
> 
> Now, I see this problem described countless times (yes, yes,
> sizeof(char)==sizeof(int)==1, 16 bit byte is 100% ok by the standard),
> but what's the best portable solution to dealing with this?  Or at
> least *mostly* portable.  All the messages I see say "don't store
> binary data and don't worry about how many bits are in anything". 
> Great, but that embedded command field being sent from my host
> computer 5 meters away is 15 octets whether I like it or not.  I don't
> care if everything's stored locally inefficiently so long as
> performance is reasonable (and it's clear!  Other people *will* be
> dealing with this code!)
> 
> I'm making progress getting things to work, but it's getting ugly so I
> was curious how people deal with this in real life.
> 
> Thanks for whatever guidance you can provide,
> alex

I ran across something similar in parsing and formatting CAN packets
for the TI 2812 DSP, which likewise has 16-bit chars and ints.  A CAN
packet may contain between 0 and 8 octets in the data field of the
frame.  In our interface, any octet may be part of an 8-bit, 16-bit,
or 32-bit value.

I wrote two low-level routines to pack/unpack between the frame's four
16-bit words and an array of eight elements holding one octet each.
When compiled with full optimization they are quite short and fast, at
least on the 2812, which has a C-friendly architecture compared to
some older DSPs.  The result was good enough that I had no need to
write them in assembly language.  In fact one of my colleagues who
wrote the other side of the interface on an ARM used the code
unchanged.

You might be able to adapt something from them:

#define	OCTET_MASK	0xFFU

static void split_frame(const uint16_t words [4], uint_least8_t *split)
{
	/* can't just walk a pointer to unsigned char through the octets of the */
	/* data frame because unsigned char is 16 bits on the 2812 DSP! */
	split [0] =  words[0]        & OCTET_MASK;
	split [1] = (words[0] >>  8) & OCTET_MASK;
	split [2] =  words[1]        & OCTET_MASK;
	split [3] = (words[1] >>  8) & OCTET_MASK;
	split [4] =  words[2]        & OCTET_MASK;
	split [5] = (words[2] >>  8) & OCTET_MASK;
	split [6] =  words[3]        & OCTET_MASK;
	split [7] = (words[3] >>  8) & OCTET_MASK;
}

static void assemble_frame(const uint_least8_t *split, uint16_t *words)
{
	/* can't just walk a pointer to unsigned char through the octets of the */
	/* data frame because unsigned char is 16 bits on the 2812 DSP! */
	words [0] = ((uint16_t)split [1] << 8) | split [0];
	words [1] = ((uint16_t)split [3] << 8) | split [2];
	words [2] = ((uint16_t)split [5] << 8) | split [4];
	words [3] = ((uint16_t)split [7] << 8) | split [6];
}

Note that TI doesn't supply a C99 <stdint.h> header with Code Composer
Studio for the 2812, so I had to write my own.  In mine for the TI, the
C99 type uint_least8_t is a typedef for unsigned int.  On the ARM
compiler, which does supply a <stdint.h>, it is unsigned char.

These things can be done in C in a portable way; it just takes a
little thought.
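
For frames that are not exactly four words long, the same idea can be written as a loop. This is only a sketch under stated assumptions: the names split_words and get_le32 are hypothetical, a <stdint.h> (supplied or home-made, as noted above) is assumed to be available, and the low octet of each word is assumed to come first, matching the routines above.

```c
#include <stddef.h>
#include <stdint.h>

#define OCTET_MASK 0xFFU

/* Hypothetical generalization of split_frame: split n 16-bit words into
   2*n octets, low octet of each word first. */
static void split_words(const uint16_t *words, size_t n, uint_least8_t *octets)
{
    size_t i;
    for (i = 0; i < n; i++) {
        octets[2 * i]     = (uint_least8_t)(words[i] & OCTET_MASK);
        octets[2 * i + 1] = (uint_least8_t)((words[i] >> 8) & OCTET_MASK);
    }
}

/* Hypothetical helper: combine four consecutive octets into a 32-bit
   value, least significant octet first. Each element is masked because
   uint_least8_t may be wider than 8 bits. */
static uint_least32_t get_le32(const uint_least8_t *octets)
{
    return  (uint_least32_t)(octets[0] & OCTET_MASK)
         | ((uint_least32_t)(octets[1] & OCTET_MASK) << 8)
         | ((uint_least32_t)(octets[2] & OCTET_MASK) << 16)
         | ((uint_least32_t)(octets[3] & OCTET_MASK) << 24);
}
```

Because every value is built with shifts and masks rather than pointer casts, the result does not depend on the CPU's byte order or on CHAR_BIT.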

-- 
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++
http://www.contrib.andrew.cmu.edu/~ajo/docs/FAQ-acllc.html

Reply by Guy Macon ● June 11, 2004

Jerry Avins <jya@ieee.org> says...
>
>Guy Macon wrote:
>
>> "Octets" and "Bytes" are always 8 bits.  ...
>
>Not in C. A C byte is the smallest of
>
>1) a character used by the system,
>2) the smallest memory chunk that can be individually addressed, or
>3) eight bits.

Yup.  One more reason to hate C.

Reply by CBFalconer ● June 11, 2004

Guy Macon wrote:
> 
> "Octets" and "Bytes" are always 8 bits.  The term you want is "Words."

In C, Octets yes, but bytes contain CHAR_BIT bits, as defined in
<limits.h>
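
As a small illustration of using CHAR_BIT rather than assuming 8, here is a sketch of a buffer-sizing helper. The name chars_for_octets is made up for this example, and it assumes octets are packed back to back with no padding.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: number of C bytes (chars) needed to hold
   n_octets octets packed back to back, rounded up. */
static size_t chars_for_octets(size_t n_octets)
{
    return (n_octets * 8u + CHAR_BIT - 1u) / CHAR_BIT;
}
```

With CHAR_BIT equal to 8 a 31-octet wrapper needs 31 chars; with CHAR_BIT equal to 16 the same formula gives 16.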

-- 
Chuck F (cbfalconer@yahoo.com) (cbfalconer@worldnet.att.net)
   Available for consulting/temporary embedded and systems.
   <http://cbfalconer.home.att.net>  USE worldnet address!

Reply by Wolfgang ● June 11, 2004

Sorry, I can't offer any guidance myself, but I do have a question
for you:
Can you tell me where to get information about the USB mass storage class?

                                        Thanks, Wolfgang

Reply by Hans-Bernhard Broeker ● June 11, 2004

In comp.arch.embedded Alex Sanks <usenet1@sanks.net> wrote:

> I'm working with some firmware drivers which are intended to be as
> portable as possible.  Data moves thru a switchable 8- or 16-bit
> data bus chip (a USB device controller specifically).

What the data bus of that chip is should be pretty much irrelevant.
What you need to know is what size the registers are.  Or more
generally, how that 16-bit layout actually works.  The makers of that
USB controller *must* be aware of this problem, so check them for app
notes.

> But right now I'm working with a TMS320C55x variant with nothing
> smaller than 16-bit data types.  So naturally the 8 bit types get all
> mixed up when I read them and when I send back similar data every
> other octet is garbage.  

So don't do that.  Marshal your incoming data into something your CPU
can use (e.g. one 16-bit word for each octet, let 32-bit words keep
32-bit words, and forget about possible waste), right at the interface
between the USB controller and the DSP.
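
A rough sketch of what marshalling at the interface could look like for the command wrapper. Everything named here (UnpackedCbw, octet_at, le32_at, unpack_cbw) is hypothetical, and two things are assumptions rather than facts from this thread: that the controller delivers the low octet of each 16-bit word first, and that the command block is 16 octets as in the Bulk-Only Transport spec (which is what makes the wrapper total 31 octets; the struct earlier in the thread shows 15). Check both against your controller and the spec.

```c
#include <stddef.h>
#include <stdint.h>

#define CBW_COMMAND_OCTETS 16u  /* assumption: per the spec, not the post's 15 */

/* CPU-friendly form: one element per value, however wide the CPU makes it. */
typedef struct {
    uint_least32_t signature;
    uint_least32_t tag;
    uint_least32_t transfer_length;
    uint_least8_t  flags;
    uint_least8_t  lun;
    uint_least8_t  command_length;
    uint_least8_t  command[CBW_COMMAND_OCTETS];
} UnpackedCbw;

/* Octet number i of the wire data, assuming low octet first in each word. */
static uint_least8_t octet_at(const uint16_t *words, size_t i)
{
    uint16_t w = words[i / 2];
    return (uint_least8_t)(((i % 2 != 0) ? (w >> 8) : w) & 0xFFu);
}

/* 32-bit little-endian value starting at octet number i. */
static uint_least32_t le32_at(const uint16_t *words, size_t i)
{
    return  (uint_least32_t)octet_at(words, i)
         | ((uint_least32_t)octet_at(words, i + 1) << 8)
         | ((uint_least32_t)octet_at(words, i + 2) << 16)
         | ((uint_least32_t)octet_at(words, i + 3) << 24);
}

/* Unpack a 31-octet wrapper held in 16 words (last high octet unused). */
static void unpack_cbw(const uint16_t words[16], UnpackedCbw *cbw)
{
    size_t i;
    cbw->signature       = le32_at(words, 0);
    cbw->tag             = le32_at(words, 4);
    cbw->transfer_length = le32_at(words, 8);
    cbw->flags           = octet_at(words, 12);
    cbw->lun             = octet_at(words, 13);
    cbw->command_length  = octet_at(words, 14);
    for (i = 0; i < CBW_COMMAND_OCTETS; i++)
        cbw->command[i] = octet_at(words, 15 + i);
}
```

The rest of the firmware then works only with UnpackedCbw and never needs to know how the octets were laid out on the bus.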

> Ugly.  I'd really like to avoid that...

You won't manage to avoid all the ugliness --- you've maneuvered
yourself into too ugly a situation for that.

> but what's the best portable solution to dealing with this?  

Essentially the same one you use to work with single bits in a C byte:
masks and shifts.  Or, only if you know your compiler will _never_
change its behaviour in that aspect, bit-fields.
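
For the outgoing direction, the masks-and-shifts approach might look like the sketch below. The names put_octet and put_le32 are made up for illustration, and the layout (low octet of each 16-bit word goes out first) is an assumption about the controller, not something stated in this thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: store one octet at octet index i of a buffer of
   16-bit words, leaving the other half of that word untouched. */
static void put_octet(uint16_t *words, size_t i, uint_least8_t value)
{
    uint16_t v = (uint16_t)(value & 0xFFu);

    if (i % 2 != 0)
        words[i / 2] = (uint16_t)((words[i / 2] & 0x00FFu) | (uint16_t)(v << 8));
    else
        words[i / 2] = (uint16_t)((words[i / 2] & 0xFF00u) | v);
}

/* Hypothetical helper: store a 32-bit value as four octets starting at
   octet index i, least significant octet first. */
static void put_le32(uint16_t *words, size_t i, uint_least32_t value)
{
    put_octet(words, i,     (uint_least8_t)( value        & 0xFFu));
    put_octet(words, i + 1, (uint_least8_t)((value >> 8)  & 0xFFu));
    put_octet(words, i + 2, (uint_least8_t)((value >> 16) & 0xFFu));
    put_octet(words, i + 3, (uint_least8_t)((value >> 24) & 0xFFu));
}
```

Writing fields straight into the packed transmit buffer this way avoids a separate unpack, modify, repack pass for responses that are filled in at runtime.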

-- 
Hans-Bernhard Broeker (broeker@physik.rwth-aachen.de)
Even if all the snow were burnt, ashes would remain.

Reply by Vadim Borshchev ● June 11, 2004

On Fri, 11 Jun 2004 08:26:45 +0200, Wolfgang <never@nowhere.com> wrote:

> Can you tell me where to get information about the USB mass storage 
> class ?

USB.org has a good collection of documents, including class specs.
http://www.usb.org/developers/devclass/

HTH,

   Vadim

Reply by Alan Balmer ● June 11, 2004

On Thu, 10 Jun 2004 21:42:56 -0400, Jerry Avins <jya@ieee.org> wrote:

>Guy Macon wrote:
>
>> "Octets" and "Bytes" are always 8 bits.  ...
>
>
>Not in C. A C byte is the smallest of
>
>1) a character used by the system,
>2) the smallest memory chunk that can be individually addressed, or
>3) eight bits.

Close. It's the smallest addressable unit which will hold a character.

From the standard:

byte
addressable unit of data storage large enough to hold any member of
the basic character set of the execution environment

NOTE 1 It is possible to express the address of each individual byte
of an object uniquely.

NOTE 2 A byte is composed of a contiguous sequence of bits, the
number of which is implementation defined. The least significant bit
is called the low-order bit; the most significant bit is called the
high-order bit.

>
>In most DSPs, a C compilers considers a byte to contain 16 or 32 bits.
>
>sizeof(char) is always 1. sizeof() returns storage size in bytes. On
>most DSPs, sizeof(int) is 1. On many, sizeof(long) is also 1. Try it.
>
>Jerry

-- 
Al Balmer
Balmer Consulting
removebalmerconsultingthis@att.net
