Binary protocol design: TLV, LTV, or else?

Started by Aleksandar Kuktin January 8, 2014
Hi all.

I'm making a protocol for communication between a PC and a peripheral 
device. The protocol is expected to, at first, run on raw Ethernet but I 
am also supposed to not make any blunders that would make it impossible 
to later use the exact same protocol on things like IP and friends.

Since I saw these kinds of things in many Internet protocols (DNS, DHCP, 
TCP options, off the top of my head - but note that these may have a 
different order of fields), I have decided to make it an array of type-
length-value triplets encapsulated in the packet frame (no header). The 
commands would fill the "type" field, "length" would specify the length 
of data ("value") following the length field, and "value" would contain 
the data for the command.
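
To make that layout concrete, here is a minimal sketch in C of building
and walking such a TLV array, assuming one-octet type and length fields
(the field widths are an assumption, not something fixed above):

#include <stddef.h>
#include <stdint.h>

/* One TLV triplet: one-octet type, one-octet length, then `len`
 * octets of value. Returns bytes written, or 0 if it won't fit. */
size_t tlv_append(uint8_t *buf, size_t cap, uint8_t type,
                  const uint8_t *value, uint8_t len)
{
    if (cap < (size_t)len + 2)
        return 0;
    buf[0] = type;
    buf[1] = len;
    for (uint8_t i = 0; i < len; i++)
        buf[2 + i] = value[i];
    return (size_t)len + 2;
}

/* Walk TLVs packed back to back in one frame, stopping cleanly on a
 * truncated triplet. */
void tlv_walk(const uint8_t *frame, size_t frame_len)
{
    size_t off = 0;
    while (off + 2 <= frame_len) {
        uint8_t type = frame[off];
        uint8_t len  = frame[off + 1];
        if (off + 2 + (size_t)len > frame_len)
            break;                  /* truncated: don't misparse */
        /* dispatch_command(type, &frame[off + 2], len); */
        (void)type;
        off += 2 + (size_t)len;
    }
}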

But I would like to hear other (read: opposing) opinions. Particularly so 
since I am self-taught; there may be considerations obvious to 
graduated engineers that I am oblivious to.

BTW, the peripheral on the other end is autonomous and rather 
intelligent, but very resource-constrained. Really, the periphery's 
resource constraints are my main problem here.


Some interesting questions:
Is omitting a packet header a good idea? In the long run?

If I put a packet header, what do I put in it? Since addressing and error 
detection and "recovery" are supposed to be done by underlying protocols, 
the only thing I can think of putting into the header is the total-length 
field, and maybe, maybe, maybe a packet-id or transaction-id field. But I 
really don't need any of these.

My reasoning with packet-id and transaction-id (and protocol-version, 
really) is that I don't need them now, so I can omit them, and if I ever 
do need them, I can just add a command which implements them. In doing 
this, am I setting myself up for a very nasty problem in the future?

Is using flexible packets like this (as opposed to the contents of, 
say, the IP header, which has strictly defined fields) a good idea, or 
am I better off rigidifying my packets?

Is there a special preference or reason as to why some protocols do TLV 
and others do LTV? (Note that I am not trying to ignite a holy war, I'm 
just asking.)

Is it good practice to require aligning the beginning of a TLV with a 
boundary, say a 16-bit word boundary?
On 2014-01-08, Aleksandar Kuktin <akuktin@gmail.com> wrote:

> I'm making a protocol for communication between a PC and a peripheral
> device. The protocol is expected to, at first, run on raw Ethernet
I've been supporting a protocol like that for many years. Doing raw
Ethernet on Windows hosts is becoming increasingly problematic due to
attempts by Microsoft to fix security issues. We anticipate it will
soon no longer be feasible and we'll be forced to switch to UDP.

I'm not the Windows guy, but as I understand it you'll have to write a
Windows kernel-mode driver to support your protocol, and users will
require admin privileges. Even then you'll have problems with various
firewall setups and anti-virus software.

If the PC is running Linux, raw Ethernet isn't nearly as problematic
as it is on Windows, but it does still require either root privileges
or special security capabilities.

If you can, I'd recommend using UDP (which is fairly low overhead).
The PC end can then be written as a normal user-space application that
doesn't require admin privileges. You'll still have problems with some
routers and NAT firewalls, but far fewer than trying to use raw
Ethernet. Using TCP will allow the easiest deployment, but TCP
requires quite a bit more overhead than UDP.

--
Grant Edwards               grant.b.edwards        Yow! HAIR TONICS, please!!
                                  at
                              gmail.com
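
A minimal sketch of the user-space PC side Grant describes, using
plain POSIX/BSD sockets; the port number and device address are
placeholders (on Windows the same calls work under Winsock after
WSAStartup()):

#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Send one protocol message to the device over UDP. Runs as an
 * ordinary unprivileged process. Address and port are placeholders. */
int send_command(const void *msg, size_t len)
{
    int s = socket(AF_INET, SOCK_DGRAM, 0);    /* no root needed */
    if (s < 0)
        return -1;

    struct sockaddr_in dev;
    memset(&dev, 0, sizeof dev);
    dev.sin_family = AF_INET;
    dev.sin_port   = htons(4950);              /* hypothetical port */
    inet_pton(AF_INET, "192.168.1.50", &dev.sin_addr);

    ssize_t n = sendto(s, msg, len, 0,
                       (struct sockaddr *)&dev, sizeof dev);
    close(s);
    return n < 0 ? -1 : 0;
}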
Hi Aleksandar,

On 1/8/2014 2:30 PM, Aleksandar Kuktin wrote:

> I'm making a protocol for communication between a PC and a peripheral
Here there be dragons...
> device. The protocol is expected to, at first, run on raw Ethernet but I
> am also supposed to not make any blunders that would make it impossible
> to later use the exact same protocol on things like IP and friends.
>
> Since I saw these kinds of things in many Internet protocols (DNS, DHCP,
> TCP options, off the top of my head - but note that these may have a
> different order of fields), I have decided to make it an array of type-
> length-value triplets encapsulated in the packet frame (no header). The
> commands would fill the "type" field, "length" would specify the length
> of data ("value") following the length field, and "value" would contain
> the data for the command.
Are you sure you have enough variety to merit the extra overhead (in
the packet *and* in the parsing of the packet)? Can you, instead,
create a single packet format whose contents are indicated by a
"packet type" specified in the header? Even if this means leaving
space for values/parameters that might not be required in every packet
type? For example:

   <header> <field1> <field2> <field3> <field4>

Where certain fields may not be used in certain packet types (their
contents then being "don't care").

Alternatively, a packet type that implicitly *defines* the format of
the balance of the packet. For example:

   type1: <header1> <fieldA> <fieldB>
   type2: <header2> <fieldA>
   type3: <header3> <fieldA> <fieldB> <fieldC> <fieldD>

(where the format of each field may vary significantly between message
types)

It seems like you are headed in the direction of:

   <header> <fields>

where the number of fields can vary, as can their individual formats.
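
A sketch of this fixed-format alternative, where the packet-type octet
implicitly selects the layout; the types, field widths, and offsets
below are illustrative only:

#include <stddef.h>
#include <stdint.h>

/* The packet-type octet in the header implicitly selects the layout
 * of the rest of the packet. Fields are read at fixed offsets (not
 * via struct casts) to avoid padding and alignment surprises. */
void parse_packet(const uint8_t *pkt, size_t len)
{
    if (len < 1)
        return;
    switch (pkt[0]) {
    case 1:                    /* type1: <header1> <fieldA> <fieldB> */
        if (len >= 5) {
            uint16_t a = (uint16_t)((pkt[1] << 8) | pkt[2]);
            uint16_t b = (uint16_t)((pkt[3] << 8) | pkt[4]);
            (void)a; (void)b;  /* hand off to the application */
        }
        break;
    case 2:                    /* type2: <header2> <fieldA> */
        if (len >= 3) {
            uint16_t a = (uint16_t)((pkt[1] << 8) | pkt[2]);
            (void)a;
        }
        break;
    default:                   /* unknown type: drop (or NAK) */
        break;
    }
}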
> But I would like to hear other (read: opposing) opinions. Particularly so
> since I am self-taught; there may be considerations obvious to
> graduated engineers that I am oblivious to.
>
> BTW, the peripheral on the other end is autonomous and rather
> intelligent, but very resource-constrained. Really, the periphery's
> resource constraints are my main problem here.
So, the less "thinking" (i.e., handling of variations) that the remote device has to handle, the better. Of course, this can be done in a variety of different ways! E.g., you could adopt a format where each field consists of: <parameterNumber> <parameterValue> and the receiving device can blindly parse the parameterNumber and plug the corresponding parameterValue into a "slot" in an array of parameters that your algorithms use. Alternatively, you could write a parser that expects an entire message to have a fixed format and plug the parameters it discovers into predefined locations in your app.
> Some interesting questions:
> Is omitting a packet header a good idea? In the long run?
Headers (and, where necessary, trailers) are intended to pass specific
data (e.g., message type) in a way that is invariant of the content of
the balance of the message. Like saying, "What follows is ...".

They also help to improve the reliability of the message, as they can
carry information that helps verify its integrity. E.g., a checksum.
Or, simply the definition of "What follows is..." allows the recipient
to perform some tests on that which follows! So, if you are claiming
that "what follows is an email address", the recipient can expect
<alphanumeric>@<domain>. Anything that doesn't fit this template
suggests something is broken -- you are claiming this is an email
address yet it doesn't conform to the template for an email address!
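
As one example of such integrity data, here is the 16-bit
ones'-complement checksum used by IP and UDP (RFC 1071), which a
header could carry over the message body:

#include <stddef.h>
#include <stdint.h>

/* RFC 1071 Internet checksum: sum 16-bit big-endian words, fold the
 * carries back in, and complement the result. */
uint16_t checksum16(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    while (len > 1) {
        sum += (uint32_t)((data[0] << 8) | data[1]);
        data += 2;
        len -= 2;
    }
    if (len)                       /* odd trailing byte */
        sum += (uint32_t)data[0] << 8;
    while (sum >> 16)              /* fold carries */
        sum = (sum & 0xFFFF) + (sum >> 16);
    return (uint16_t)~sum;
}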
> If I put a packet header, what do I put in it? Since addressing and error
> detection and "recovery" are supposed to be done by underlying protocols,
Will that ALWAYS be the case for you? What if you later decide to run
your protocol over EIA232? Will you then require inserting another
protocol *beneath* it to provide those guarantees?

Will your underlying protocol guarantee that messages are delivered IN
ORDER? *Always*? Do you expect the underlying protocol to guarantee
delivery? At most once? At least once?
> the only thing I can think of putting into the header is the total-length
> field, and maybe, maybe, maybe a packet-id or transaction-id field. But I
> really don't need any of these.
>
> My reasoning with packet-id and transaction-id (and protocol-version,
> really) is that I don't need them now, so I can omit them, and if I ever
> do need them, I can just add a command which implements them. In doing
> this, am I setting myself up for a very nasty problem in the future?
>
> Is using flexible packets like this (as opposed to the contents of, say,
> the IP header, which has strictly defined fields) a good idea, or am I
> better off rigidifying my packets?
That depends on what you expect in the future -- in terms of additions to the protocol as well as the conveyance by which your data gets to/from the device. Simpler tends to be better.
> Is there a special preference or reason as to why some protocols do TLV
> and others do LTV? (Note that I am not trying to ignite a holy war, I'm
> just asking.)
>
> Is it good practice to require aligning the beginning of a TLV with a
> boundary, say a 16-bit word boundary?
Depends on how you are processing the byte stream. E.g., for ethernet,
if you try to deal with any types bigger than single octets, you need
to resolve byte-ordering issues (so-called network byte order). If you
design your protocol to deal exclusively with octets, then you can
sidestep this (by specifying an explicit byte ordering) but then force
the receiving (and sending) tasks to demangle/mangle the data types
out of/into these forms.
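
A sketch of that octet-only approach: the protocol pins down
big-endian ("network") order explicitly, and both ends (de)serialize
multi-byte values one octet at a time, so host endianness never enters
into it:

#include <stdint.h>

/* Serialize/deserialize a 16-bit value as two explicit octets; no
 * htons()-style calls and no dependence on host byte order. */
void put_u16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v >> 8);      /* most significant octet first */
    p[1] = (uint8_t)(v & 0xFF);
}

uint16_t get_u16(const uint8_t *p)
{
    return (uint16_t)(((uint16_t)p[0] << 8) | p[1]);
}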
On Wed, 08 Jan 2014 21:30:09 +0000, Aleksandar Kuktin wrote:

> Hi all.
>
> I'm making a protocol for communication between a PC and a peripheral
> device. The protocol is expected to, at first, run on raw Ethernet but I
> am also supposed to not make any blunders that would make it impossible
> to later use the exact same protocol on things like IP and friends.
> [...]
Read the Radius protocol RFCs and see how they deal with UDP. There is
a boatload of parsing code out there in the various Radius server and
client implementations. If you start with UDP you can even cobble
together a test system using many of the scripting languages like
Perl, Python, Ruby, etc.

--
Chisolm
Republic of Texas
Aleksandar Kuktin <akuktin@gmail.com> wrote in
news:lakg10$kri$1@speranza.aioe.org: 

> Hi all.
>
> I'm making a protocol for communication between a PC and a peripheral
> device. The protocol is expected to, at first, run on raw Ethernet but
> I am also supposed to not make any blunders that would make it
> impossible to later use the exact same protocol on things like IP and
> friends.
> [...]
Hello,

I originated a product that used TLV packets back in the 90s and it is
still in use today without any problems. It was similar to a
configuration file that contained various parameters for applications
that shared data. There was a root packet header. This allowed
transmission across TCP, serial, queued pipes, and file storage.

We enforced a 4-byte alignment on fields due to the machines being
used to parse the data - we had Windows, Linux, and embedded devices
reading the data. Just be sure to define the byte order. We wrote and
maintained an RFC-like document.

One rule we followed that may help you is that once a tag is defined
it is never redefined. That prevented issues migrating forward and
backward. Tags could be removed from use, but were always supported.

One issue we had with TLV was with one of the developers taking
shortcuts. The TLVs were built in a tree, so any V started with a TL
until you got down to the lowest-level item being communicated. Anyway,
the developer in question would read the T and presume he could bypass
reading the lower-level tags because the order was fixed - it was not.
Upgraded protocols added fields (a low-level TLV) that caused read
issues. Easy to find, but frustrating that we had to re-release one of
the node devices.

The only other error you are likely to get with TLVs like this is when
the entire message isn't delivered: the follow-on data becomes part of
the previous message. That is why some encapsulation might be wise. If
you are using UDP and there is no need for multiple packets per
message (ever), that might be your encapsulation method.

Good luck,
David
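
A sketch of walking such nested TLVs with the 4-byte alignment rule
David describes, assuming two-octet big-endian tag and length fields
(the widths are assumptions). Note that the loop reads every
sub-TLV's length instead of assuming a fixed order -- the shortcut
that bit David's team:

#include <stddef.h>
#include <stdint.h>

static size_t align4(size_t n) { return (n + 3u) & ~(size_t)3u; }

/* Walk one level of TLVs; container tags recurse into their value.
 * New tags added by a later protocol revision are skipped gracefully
 * because their lengths are honored, never guessed. */
void walk_tlv(const uint8_t *buf, size_t len, int depth)
{
    size_t off = 0;
    while (off + 4 <= len) {
        uint16_t tag  = (uint16_t)((buf[off]     << 8) | buf[off + 1]);
        uint16_t tlen = (uint16_t)((buf[off + 2] << 8) | buf[off + 3]);
        if (off + 4 + (size_t)tlen > len)
            break;                 /* truncated: stop, don't misparse */
        /* If `tag` marks a container, its value is itself TLVs:
         * walk_tlv(&buf[off + 4], tlen, depth + 1);              */
        (void)tag; (void)depth;
        off = align4(off + 4 + (size_t)tlen);  /* next field 4-aligned */
    }
}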
On Wed, 8 Jan 2014 22:14:35 +0000 (UTC), Grant Edwards
<invalid@invalid.invalid> wrote:

> On 2014-01-08, Aleksandar Kuktin <akuktin@gmail.com> wrote:
>
>> I'm making a protocol for communication between a PC and a peripheral
>> device. The protocol is expected to, at first, run on raw Ethernet
>
> I've been supporting a protocol like that for many years. Doing raw
> Ethernet on Windows hosts is becoming increasingly problematic due to
> attempts by Microsoft to fix security issues. We anticipate it will
> soon no longer be feasible and we'll be forced to switch to UDP.
UDP adds very little compared to raw ethernet, some more or less
stable header bytes and a small ARP protocol (much less than a page of
code). There are a lot of tools to display the various IP and UDP
headers and standard socket drivers should work OK.

If you are using raw ethernet on a big host, you most likely would
have to put the ethernet adapter into promiscuous mode, which might be
a security / permission issue.
On Thursday, January 9, 2014 8:59:25 AM UTC+2, upsid...@downunder.com wrote:
> On Wed, 8 Jan 2014 22:14:35 +0000 (UTC), Grant Edwards
> <invalid@invalid.invalid> wrote:
>
>> On 2014-01-08, Aleksandar Kuktin <akuktin@gmail.com> wrote:
>>
>>> I'm making a protocol for communication between a PC and a peripheral
>>> device. The protocol is expected to, at first, run on raw Ethernet
>>
>> I've been supporting a protocol like that for many years. Doing raw
>> Ethernet on Windows hosts is becoming increasingly problematic due to
>> attempts by Microsoft to fix security issues. We anticipate it will
>> soon no longer be feasible and we'll be forced to switch to UDP.
>
> UDP adds very little compared to raw ethernet, some more or less
> stable header bytes and a small ARP protocol (much less than a page of
> code). There are a lot of tools to display the various IP and UDP
> headers and standard socket drivers should work OK.
I would also advocate using UDP rather than raw Ethernet. Implementing
IP can be pretty simple if one does not intend (as in this case) to
connect the device to the internet, fragment/defragment out-of-order
datagrams, etc. UDP on top of that is almost negligible. I can't see
which MCU would have an Ethernet MAC and yet lack the resources for
such an "almost IP" implementation.

Dimiter

------------------------------------------------------
Dimiter Popoff, TGI             http://www.tgi-sci.com
------------------------------------------------------
http://www.flickr.com/photos/didi_tgi/sets/72157600228621276/
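
A rough sketch (mine, not from the thread) of what such an "almost IP"
transmit path can look like on the device: fixed IPv4 and UDP headers
filled in ahead of the payload, no options, no fragmentation, and the
UDP checksum left at zero (which is legal over IPv4). The addresses
and port are caller-supplied placeholders; ARP and receive-side
filtering are omitted:

#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Ones'-complement sum over the fixed 20-byte IPv4 header. */
static uint16_t hdr_csum(const uint8_t *h)
{
    uint32_t s = 0;
    for (int i = 0; i < 20; i += 2)
        s += (uint32_t)((h[i] << 8) | h[i + 1]);
    while (s >> 16)
        s = (s & 0xFFFF) + (s >> 16);          /* fold carries */
    return (uint16_t)~s;
}

/* Build IPv4+UDP in front of `payload`. `pkt` must have room for
 * 28 + plen octets; returns total length for the Ethernet frame. */
size_t build_ipv4_udp(uint8_t *pkt, const uint8_t *src_ip,
                      const uint8_t *dst_ip, uint16_t port,
                      const uint8_t *payload, uint16_t plen)
{
    uint16_t udp_len = 8 + plen;
    uint16_t ip_len  = 20 + udp_len;

    memset(pkt, 0, 28);
    pkt[0] = 0x45;                 /* IPv4, 20-byte header, no options */
    pkt[2] = (uint8_t)(ip_len >> 8);
    pkt[3] = (uint8_t)ip_len;
    pkt[8] = 64;                   /* TTL */
    pkt[9] = 17;                   /* protocol = UDP */
    memcpy(&pkt[12], src_ip, 4);
    memcpy(&pkt[16], dst_ip, 4);
    uint16_t c = hdr_csum(pkt);    /* checksum field still zero here */
    pkt[10] = (uint8_t)(c >> 8);
    pkt[11] = (uint8_t)c;

    uint8_t *udp = pkt + 20;
    udp[0] = (uint8_t)(port >> 8); /* source port (placeholder) */
    udp[1] = (uint8_t)port;
    udp[2] = (uint8_t)(port >> 8); /* destination port (placeholder) */
    udp[3] = (uint8_t)port;
    udp[4] = (uint8_t)(udp_len >> 8);
    udp[5] = (uint8_t)udp_len;
    /* udp[6..7] stay 0: zero UDP checksum means "none" over IPv4 */
    memcpy(udp + 8, payload, plen);

    return ip_len;
}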
Hi Dimiter,

On 1/9/2014 12:37 AM, dp wrote:
> On Thursday, January 9, 2014 8:59:25 AM UTC+2, upsid...@downunder.com wrote:
>> On Wed, 8 Jan 2014 22:14:35 +0000 (UTC), Grant Edwards
>> <invalid@invalid.invalid> wrote:
>>
>>> On 2014-01-08, Aleksandar Kuktin<akuktin@gmail.com> wrote:
>>>
>>>> I'm making a protocol for communication between a PC and a peripheral
>>>> device. The protocol is expected to, at first, run on raw Ethernet
>>>
>>> I've been supporting a protocol like that for many years. Doing raw
>>> Ethernet on Windows hosts is becoming increasingly problematic due to
>>> attempts by Microsoft to fix security issues. We anticipate it will
>>> soon no longer be feasible and we'll be forced to switch to UDP.
>>
>> UDP adds very little compared to raw ethernet, some more or less
>> stable header bytes and a small ARP protocol (much less than a page of
>> code). There are a lot of tools to display the various IP and UDP
>> headers and standard socket drivers should work OK.
>
> I would also advocate using UDP rather than raw Ethernet. Implementing
> IP can be pretty simple if one does not intend (as in this case) to
> connect the device to the internet, fragment/defragment out-of-order
> datagrams, etc. UDP on top of that is almost negligible. I can't see
> which MCU would have an Ethernet MAC and yet lack the resources for
> such an "almost IP" implementation.
UDP tends to hit the "sweet spot" between "bare iron" and the bloat of TCP/IP. The implementer has probably the most leeway in deciding what he *wants* to implement vs. what he *must* implement (once you climb up into TCP, most of the "options" go away). Having said that, the OP still has a fair number of decisions to make if he chooses to layer his protocol atop UDP. MTU, ARP/RARP implementation, checksum support (I'd advocate doing this in *his* protocol if he ever intends to run it over a leaner protocol where *he* has to provide this reliability), etc. I've (we've?) been assuming he can cram an entire message into a tiny "no-fragment" packet -- that may not be the case! (Or, may prove to be a problem when run over protocols with smaller MTU's)
On Thursday, January 9, 2014 10:28:21 AM UTC+2, Don Y wrote:
> ...
> I've (we've?) been assuming he can cram an entire message into a tiny
> "no-fragment" packet -- that may not be the case! (Or, it may prove to
> be a problem when run over protocols with smaller MTUs.)
Hi Don,

UDP does not add any fragmentation overhead compared to his raw
Ethernet anyway (that is, if he stays with UDP packets fitting in
approx. 1500 bytes he will be no worse off than without UDP).

IP does add fragmentation overhead - if it is a real IP. The sender
may choose its MTU (likely a full-size Ethernet packet) but a receiver
must be ready to get that same packet fragmented into a few pieces,
out of order, and be able to defragment it.

But since he is OK with raw Ethernet he does not need a true IP
implementation, so he can just act as if everybody is fine with a
full-sized Ethernet MTU and get on with it as you suggest. He will
lose a few bytes for encapsulation, but if losing 100 bytes out of
1500 is an issue, chances are there will be a lot of other, real
problems :-).

Dimiter

------------------------------------------------------
Dimiter Popoff, TGI             http://www.tgi-sci.com
------------------------------------------------------
http://www.flickr.com/photos/didi_tgi/sets/72157600228621276/
Hi Dimiter,

On 1/9/2014 1:53 AM, dp wrote:
> On Thursday, January 9, 2014 10:28:21 AM UTC+2, Don Y wrote:
>> ...
>> I've (we've?) been assuming he can cram an entire message into a tiny
>> "no-fragment" packet -- that may not be the case! (Or, it may prove to
>> be a problem when run over protocols with smaller MTUs.)
>
> UDP does not add any fragmentation overhead compared to his raw
> Ethernet anyway (that is, if he stays with UDP packets fitting in
> approx. 1500 bytes he will be no worse off than without UDP).
I'm thinking more in terms of any other media (protocols) he may
eventually use for transport. If he doesn't want to add support for
packet reassembly in *his* protocol, then he would be wise to pick a
message format that fits in the smallest MTU "imaginable". For
ethernet, I think that is ~60+ octets (i.e., just bigger than the
frame header). I'm a big fan of ~500 byte messages (the minimum that
any node *must* be able to accommodate).

I think you have to consider any other media that may get injected
along the path from source to destination (i.e., if it is not purely
"ethernet" from end to end). IIRC, a PPP link drops the MTU to the
200-300 range.
> IP does add fragmentation overhead - if it is a real IP. The sender
> may choose its MTU (likely a full-size Ethernet packet) but a receiver
> must be ready to get that same packet fragmented into a few pieces,
> out of order, and be able to defragment it.
As above, I think if you truly want to avoid dealing with fragments, you have to be able to operate with an MTU that is little more than the header (plus 4? or 8?? octets). Even a ~500 byte message could, conceivably, appear as *100* little fragments! :-/ (and the receiving node had better be equipped to handle all 500 bytes as they trickle in!)
> But since he is OK with raw Ethernet he does not need a true IP
> implementation, so he can just act as if everybody is fine with a
> full-sized Ethernet MTU and get on with it as you suggest. He will
> lose a few bytes for encapsulation, but if losing 100 bytes out of
> 1500 is an issue, chances are there will be a lot of other, real
> problems :-).
OP hasn't really indicated how complex/big his messages need to be.
Nor what the ultimate fabric might look like.

E.g., here, I've tried really hard to keep messages *ultra* tiny by
thinking about exactly what *needs* to fit in the message and how best
to encode it. So, for example, I can build an ethernet-CAN bridge in a
heartbeat and not have to worry about trading latency and
responsiveness for packet size on the CAN bus (those nodes can have
super tiny input buffers and still handle complete messages without
having to worry about fragmentation, etc.)

It must have been entertaining for the folks who came up with
ethernet, IP, etc. way back when to start with a clean slate and
*guess* as to what would work best! :>