Octets with non-8 bit bytes...

Reply by Hans-Bernhard Broeker ●June 11, 2004

Alan Balmer <albalmer@att.net> wrote:
> On Thu, 10 Jun 2004 21:42:56 -0400, Jerry Avins <jya@ieee.org> wrote:

> >Guy Macon wrote:
> >
> >> "Octets" and "Bytes" are always 8 bits.  ...
> >
> >
> >Not in C. A C byte is the smallest of
> >
> >1) a character used by the system,
> >2) the smallest memory chunk that can be individually addressed, or
> >3) eight bits.

> Close. It's the smallest addressable unit which will hold a character.

Closer, but still no cigar.  It must be addressable, and must be able
to represent each character distinctly.  But by no means does it
*have* to be the _smallest_ addressable unit fulfilling those
requirements.  E.g. a C translation system targeting a 32-bit x86 PC
yet using 19-bit chars, although obviously a total perversion, is
quite certainly allowed by the C standard.

-- 
Hans-Bernhard Broeker (broeker@physik.rwth-aachen.de)
Even if all the snow were burnt, ashes would remain.

Reply by Chris Hills ●June 11, 2004

In article <10chvgq33qr2v00@corp.supernews.com>, Guy Macon
<http@?.guymacon.com> writes
>
>"Octets" and "Bytes" are always 8 bits.  The term you want is "Words."
>
Octets yes, but not so bytes, so I am told. Back in the depths of
computing history bytes could be other than 8 bits, hence the use of
"octet".

/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\
\/\/\/\/\ Chris Hills  Staffs  England    /\/\/\/\/\
/\/\/ chris@phaedsys.org       www.phaedsys.org \/\/
\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/

Reply by rickman ●June 11, 2004

Hans-Bernhard Broeker wrote:
> 
> Alan Balmer <albalmer@att.net> wrote:
> > On Thu, 10 Jun 2004 21:42:56 -0400, Jerry Avins <jya@ieee.org> wrote:
> 
> > >Guy Macon wrote:
> > >
> > >> "Octets" and "Bytes" are always 8 bits.  ...
> > >
> > >
> > >Not in C. A C byte is the smallest of
> > >
> > >1) a character used by the system,
> > >2) the smallest memory chunk that can be individually addressed, or
> > >3) eight bits.
> 
> > Close. It's the smallest addressable unit which will hold a character.
> 
> Closer, but still no cigar.  It must be addressable, and must be able
> to represent each character distinctly.  But by no means does it
> *have* to be the _smallest_ addressable unit fulfilling those
> requirements.  E.g. a C translation system targeting a 32-bit x86 PC
> yet using 19-bit chars, although obviously a total perversion, is
> quite certainly allowed by the C standard.

So how large is a byte on such a machine?  I'm not clear if this is 24
or 32 bits.  I guess this would be 32 bits since 24 bits would not be
"directly" addressable.  How is that different from what Alan said?  

-- 

Rick "rickman" Collins

rick.collins@XYarius.com
Ignore the reply address. To email me use the above address with the XY
removed.

Arius - A Signal Processing Solutions Company
Specializing in DSP and FPGA design      URL http://www.arius.com
4 King Ave                               301-682-7772 Voice
Frederick, MD 21701-3110                 301-682-7666 FAX

Reply by Jim Stewart ●June 11, 2004

rickman wrote:
> Hans-Bernhard Broeker wrote:
> 
>>Alan Balmer <albalmer@att.net> wrote:
>>
>>>On Thu, 10 Jun 2004 21:42:56 -0400, Jerry Avins <jya@ieee.org> wrote:
>>
>>>>Guy Macon wrote:
>>>>
>>>>
>>>>>"Octets" and "Bytes" are always 8 bits.  ...
>>>>
>>>>
>>>>Not in C. A C byte is the smallest of
>>>>
>>>>1) a character used by the system,
>>>>2) the smallest memory chunk that can be individually addressed, or
>>>>3) eight bits.
>>
>>>Close. It's the smallest addressable unit which will hold a character.
>>
>>Closer, but still no cigar.  It must be addressable, and must be able
>>to represent each character distinctly.  But by no means does it
>>*have* to be the _smallest_ addressable unit fulfilling those
>>requirements.  E.g. a C translation system targeting a 32-bit x86 PC
>>yet using 19-bit chars, although obviously a total perversion, is
>>quite certainly allowed by the C standard.
> 
> 
> So how large is a byte on such a machine?  I'm not clear if this is 24
> or 32 bits.  I guess this would be 32 bits since 24 bits would not be
> "directly" addressable.  How is that different from what Alan said?  

Just to stir the pot a little, a PDP-8 was a
12-bit machine and 12 bits was common usage
for a byte.  The machine addressed data in
12-bit bytes, but also common usage was to
store and manipulate chars as 6-bit nibbles
packed 2 per byte. This was back on TTYs
where upper case was irrelevant.

Reply by Alan Balmer ●June 11, 2004

On 11 Jun 2004 15:51:48 GMT, Hans-Bernhard Broeker
<broeker@physik.rwth-aachen.de> wrote:

> E.g. a C translation system targeting a 32-bit x86 PC
>yet using 19-bit chars, although obviously a total perversion, is
>quite certainly allowed by the C standard.

Providing that the 32-bit words are addressable in 19-bit chunks? My
head hurts.

-- 
Al Balmer
Balmer Consulting
removebalmerconsultingthis@att.net

Reply by Paul Keinanen ●June 11, 2004

On Fri, 11 Jun 2004 08:38:36 -0700, Alan Balmer <albalmer@att.net>
wrote:

>>
>>Not in C. A C byte is the smallest of
>>
>>1) a character used by the system,
>>2) the smallest memory chunk that can be individually addressed, or
>>3) eight bits.
>
>Close. It's the smallest addressable unit which will hold a character.

Now the question is, what is a character?

I can think of character sets based on 5, 6, 7, 8, (9), 16, 21 and 31
(32) bits.

Paul

Reply by Guy Macon ●June 11, 2004

Jim Stewart <jstewart@jkmicro.com> says...

>Just to stir the pot a little, a PDP-8 was a
>12-bit machine and 12 bits was common usage
>for a byte.  The machine addressed data in
>12-bit bytes, but also common usage was to
>store and manipulate chars as 6-bit nibbles
>packed 2 per byte. This was back on TTYs
>where upper case was irrelevant.

Just to stir the pot a little more...


http://www.catb.org/~esr/jargon/html/B/byte.html
byte: /bi:t/, n.
[techspeak] A unit of memory or data equal to the amount used to 
represent one character; on modern architectures this is invariably 
8 bits. Some older architectures used byte for quantities of 6, 7, 
or (especially) 9 bits, and the PDP-10 supported bytes that were 
actually bitfields of 1 to 36 bits! These usages are now obsolete, 
killed off by universal adoption of power-of-2 word sizes.

Historical note: The term was coined by Werner Buchholz in 1956 
during the early design phase for the IBM Stretch computer; originally 
it was described as 1 to 6 bits (typical I/O equipment of the period 
used 6-bit chunks of information). The move to an 8-bit byte happened 
in late 1956, and this size was later adopted and promulgated as a 
standard by the System/360. The word was coined by mutating the word 
"bite" so it would not be accidentally misspelled as bit. 
See also nybble.


http://www.catb.org/~esr/jargon/html/C/chawmp.html
chawmp: n.
[University of Florida] 16 or 18 bits (half of a machine word). 
This term was used by FORTH hackers during the late 1970s/early 
1980s; it is said to have been archaic then, and may now be 
obsolete. It was coined in revolt against the promiscuous use 
of "word" for anything between 16 and 32 bits; "word" has an 
additional special meaning for FORTH hacks that made the 
overloading intolerable. For similar reasons, /gaw'bl/ (spelled 
"gawble" or possibly "gawbul") was in use as a term for 32 or 
48 bits (presumably a full machine word, but our sources are 
unclear on this). These terms are more easily understood if 
one thinks of them as faithful phonetic spellings of "chomp" 
and "gobble" pronounced in a Florida or other Southern U.S. 
dialect. For general discussion of similar terms, see nybble.


nybble: /nib'l/, nibble, n.
[from v. nibble by analogy with "bite" -> "byte"] Four bits; one 
hex digit; a half-byte. Though "byte" is now techspeak, this 
useful relative is still jargon. Compare byte; see also bit. 
The more mundane spelling "nibble" is also commonly used. 
Apparently the "nybble" spelling is uncommon in Commonwealth 
Hackish, as British orthography would suggest the pronunciation 
/ni:'bl/.

Following "bit", "byte" and "nybble" there have been quite a 
few analogical attempts to construct unambiguous terms for 
bit blocks of other sizes. All of these are strictly jargon, 
not techspeak, and not very common jargon at that (most 
hackers would recognize them in context but not use them 
spontaneously). We collect them here for reference together 
with the ambiguous techspeak terms "word", "half-word", 
"double word", and "quad" or quad word; some (indicated) 
have substantial information separate entries.

2 bits: crumb, quad, quarter, tayste, tydbit, morsel 

4 bits: nybble 

5 bits: nickle 

10 bits: deckle 

16 bits: playte, chawmp (on a 32-bit machine), word (on a 16-bit machine), 
         half-word (on a 32-bit machine). 

18 bits: chawmp (on a 36-bit machine), half-word (on a 36-bit machine) 

32 bits: dynner, gawble (on a 32-bit machine), word (on a 32-bit machine), 
         longword (on a 16-bit machine). 

36 bits: word (on a 36-bit machine) 

48 bits: gawble (under circumstances that remain obscure) 

64 bits: double word (on a 32-bit machine) quad (on a 16-bit machine) 

128 bits: quad (on a 32-bit machine) 

The fundamental motivation for most of these jargon terms (aside from 
the normal hackerly enjoyment of punning wordplay) is the extreme 
ambiguity of the term word and its derivatives.

Also see:

http://www.catb.org/~esr/jargon/html/P/playte.html
http://www.catb.org/~esr/jargon/html/T/tayste.html
http://www.catb.org/~esr/jargon/html/Q/quarter.html
http://www.catb.org/~esr/jargon/html/B/bit.html
http://www.catb.org/~esr/jargon/
http://www.catb.org/~esr/jargon/jargoogle.html

Comment by Guy Macon: Concerning the statement "on modern 
architectures this is invariably 8 bits", in my opinion C has 
no resemblance to anything that can reasonably be called "modern."
See [ http://cm.bell-labs.com/cm/cs/who/dmr/chist.html ].


-- 
Guy Macon, Electronics Engineer & Project Manager for hire. 
Remember Doc Brown from the _Back to the Future_ movies? Do you 
have an "impossible" engineering project that only someone like 
Doc Brown can solve?  My resume is at http://www.guymacon.com/

Reply by Dave Hansen ●June 11, 2004

On Fri, 11 Jun 2004 12:17:15 -0700, Guy Macon
<http://www.guymacon.com> wrote:

[...]
>http://www.catb.org/~esr/jargon/html/C/chawmp.html
>chawmp: n.
>[University of Florida] 16 or 18 bits (half of a machine word). 
>This term was used by FORTH hackers during the late 1970s/early 
>1980s; it is said to have been archaic then, and may now be 
>obsolete. It was coined in revolt against the promiscuous use 

I first used Forth in the late 70's-early 80's (though I've never been
a "Forth Hacker"), and I've never seen this term before.  

The term I've always heard used is "cell," which is the size of a
single entry on the data stack, and at least 16 bits wide in ANSI
standard Forth.  A "cell pair" holds "double cell" values.  A
"character" is allowed (but not required) to be narrower than a
"cell."

>of "word" for anything between 16 and 32 bits; "word" has an 
>additional special meaning for FORTH hacks that made the 
>overloading intolerable. For similar reasons, /gaw'bl/ (spelled 

FWIW, a "word" in Forth is what you might call an "operator" or a
"function" in C.  Actually, it's a little more generic than that.
Almost everything in a Forth program is a word.

>"gawble" or possibly "gawbul") was in use as a term for 32 or 
>48 bits (presumably a full machine word, but our sources are 

Never heard of that one either...

[...]

>2 bits: [...] quarter,

Shave and a haircut...

Regards,

                               -=Dave
-- 
Change is inevitable, progress is not.

Reply by rickman ●June 11, 2004

Dave Hansen wrote:
> 
> On Fri, 11 Jun 2004 12:17:15 -0700, Guy Macon
> <http://www.guymacon.com> wrote:
> 
> [...]
> >http://www.catb.org/~esr/jargon/html/C/chawmp.html
> 
> >2 bits: [...] quarter,
> 
> Shave and a haircut...

Would that make 8 bits a dollar?  I've never liked calling 8 bits an
octet, it sounds like an overgrown musical group... 

Let's see...   a dollar buys an ASCII char, two dollars gets you signed
numbers from 32767 to -32768 and four dollars can buy... well you get
the idea.  :) 

I am building a 50 cent CPU!  Cool, that's what I'll call it,
FiftyCents.  

-- 

Rick "rickman" Collins

Reply by William Meyer ●June 11, 2004

On 11-Jun-04, rickman said:

> Would that make 8 bits a dollar?

Yes, and it comes from the practice of cutting a silver dollar into 8
"bits" for making change.

-- 
Bill
Posted with XanaNews Version 1.16.3.1