Designing Communication Protocols, Practical Aspects
For most embedded developers there comes a time when they have to make their embedded MCU talk to another system. That other system may be a PC, a different embedded system, a smartphone and so on. For the purpose of this article I am assuming that we are in control of the protocol between the two ends and we don’t have to follow something that is already in place on one side.
So let’s say that we have our embedded MCU, we have implemented and configured the USB stack (or just used a UART-to-USB bridge) and we can exchange data with our PC application. Moving forward we need to formalize this data exchange, i.e. design a protocol. Below I am going to talk about the most basic aspects that one needs to keep in mind in order to design and quickly bring up a reliable communication protocol.
Making it easy (to develop and maintain)
The common approach is to define a frame structure and then send frames back and forth. The frame consists of a header of some (preferably fixed) number of bytes and some payload of variable size.
The header contains the data fields that are common to all of the frames that will be exchanged. Typically these are the size of the frame, a value that defines what the payload is, perhaps a sequence number (which lets the receiver detect duplicate and missing frames) and any other data that has to do either with the protocol or the payload.
The data in the header is usually encoded in fields that are not a whole number of bytes (to save space). If you are not concerned about space, making every field a whole number of bytes removes the complexity of handling bit-fields. If you choose to handle bit-fields then consider making the header small enough to fit in a single 32-bit or 64-bit variable. This way you can use bit shifting on one variable to encode and decode the values. Encoding the header in Big Endian makes the data read naturally both in a logic analyser and in a Hex dump. It also helps if you align your fields to 4-bit boundaries (so they don’t span Hex digits). Finally, consider leaving some bits in the header free and reserved for future use, especially if you are deploying this protocol in the field.
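As a minimal sketch of the bit-shifting approach, here is a header that fits in one 32-bit word, written in Python as it might appear on the PC side. The layout (8-bit type, 8-bit sequence number, 4 reserved bits, 12-bit size) and all the names are assumptions made up for illustration, chosen so every field sits on a 4-bit boundary.

```python
import struct

# Hypothetical 32-bit header layout (an assumption for illustration), MSB first:
#   bits 31..24  frame type (8 bits)
#   bits 23..16  sequence number (8 bits)
#   bits 15..12  reserved for future use, sent as zero (4 bits)
#   bits 11..0   payload size in bytes (12 bits)
TYPE_SHIFT, TYPE_MASK = 24, 0xFF
SEQ_SHIFT, SEQ_MASK = 16, 0xFF
SIZE_SHIFT, SIZE_MASK = 0, 0xFFF


def encode_header(frame_type, seq, size):
    """Pack the three fields into 4 big-endian bytes."""
    if not (0 <= frame_type <= TYPE_MASK
            and 0 <= seq <= SEQ_MASK
            and 0 <= size <= SIZE_MASK):
        raise ValueError("header field out of range")
    word = (frame_type << TYPE_SHIFT) | (seq << SEQ_SHIFT) | (size << SIZE_SHIFT)
    return struct.pack(">I", word)


def decode_header(raw):
    """Unpack the first 4 bytes (big endian) into (frame_type, seq, size)."""
    (word,) = struct.unpack(">I", raw[:4])
    return ((word >> TYPE_SHIFT) & TYPE_MASK,
            (word >> SEQ_SHIFT) & SEQ_MASK,
            (word >> SIZE_SHIFT) & SIZE_MASK)
```

With this layout, encode_header(0x12, 0x34, 0x567) produces the bytes 12 34 05 67, so each field can be read straight off a Hex dump.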
For the payload try to avoid bit-fields, make every field a whole number of bytes and maybe also consider Big Endian (for debug-ability). If you are expecting a lot of 32-bit values in your payload then consider adding some padding so they end up in 32-bit aligned positions in memory. This will make your code both simpler and faster. If they are not aligned then you must pack and unpack them byte by byte (implementation-specific exceptions exist). If you follow the aligned approach, do it in a structured way: avoid ad hoc in-place casts and make an interface instead.
Depending on the particular application you can put limits on the protocol values, so despite having a 14-bit size field you can define that in this application the maximum frame size is 256 bytes. This will make your memory management easier (especially in a resource constrained MCU).
Managing the code
Create a file that contains the specification of the fields in your protocol. This includes both the header and the payload fields (unless the payload data is formatted following some different spec). This file must contain everything you need to read and write all the fields of your protocol. You really don’t want magic numbers scattered around your code. You also don’t want bit shifting and masking repeated all over the place, so consider making some functions or macros that work together with the specification you created. Make this file common and use it in all the places that use the protocol (i.e. same file on MCU firmware and PC app).
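One way to picture such a file is a table of named fields plus a pair of generic accessors. The sketch below is in Python, so it only shows the idea for the PC side; for C firmware the same role would be played by a shared header file. The module name, the frame types and the field table are all hypothetical.

```python
# protocol_spec.py: a hypothetical shared module, imported by every piece of
# code that touches the protocol. All names and values are illustrative.
from enum import IntEnum


class FrameType(IntEnum):
    PING = 0x01
    READ_REG = 0x02
    WRITE_REG = 0x03


# Header fields described as (bit offset from the LSB, width in bits).
HEADER_FIELDS = {
    "type": (24, 8),
    "seq": (16, 8),
    "size": (0, 12),
}


def get_field(word, name):
    """Extract the named field from a header word."""
    offset, width = HEADER_FIELDS[name]
    return (word >> offset) & ((1 << width) - 1)


def set_field(word, name, value):
    """Return a copy of the header word with the named field replaced."""
    offset, width = HEADER_FIELDS[name]
    mask = (1 << width) - 1
    if not 0 <= value <= mask:
        raise ValueError("value does not fit in field %r" % name)
    return (word & ~(mask << offset)) | (value << offset)
```

Code that uses the protocol then writes set_field(header, "size", length) and never sees a shift or a mask, so changing the layout later means editing only the table.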
Have a single place in your software that handles the protocol. This will encode and decode the protocol headers, will handle the error detection and recovery and will also provide a place to put logging and dumping of the exchanged data.
Error detection and recovery
Expect errors, all sorts of errors. You should always design with the assumption that data will be lost, the link will disconnect, duplicate data will arrive etc. Even if you are implementing something over a protocol that guarantees data integrity you will have to handle link disconnections. Also don’t forget bugs in your code. It really pays off to have your system produce a sensible error message when things don’t work.
The most common way to check data integrity is a CRC. It protects you against both corrupted and missing data. Each side calculates the CRC over the data it receives, compares it with the CRC carried in the frame and discards any frame that does not match. So what happens when data is discarded? The simple way to solve this is to always expect a response for anything you send. If you don’t see the response after some time then you retry. After a set number of consecutive retries it makes sense to just give up.
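The CRC check and the bounded retry can be sketched as follows. This is only an illustration under assumptions: it uses the CRC-32 from Python's zlib module (a real protocol may prefer a shorter CRC), it appends the CRC as the last four bytes of the frame, and `exchange` stands in for a hypothetical transport function that sends one frame and returns the raw response, or None if nothing arrived in time.

```python
import struct
import zlib

MAX_RETRIES = 3  # hypothetical limit; tune it for the application


def add_crc(data):
    """Append the CRC-32 of the data as 4 big-endian bytes."""
    return data + struct.pack(">I", zlib.crc32(data))


def check_crc(frame):
    """Return the data if the trailing CRC matches, otherwise None."""
    if len(frame) < 4:
        return None
    data = frame[:-4]
    (crc,) = struct.unpack(">I", frame[-4:])
    return data if zlib.crc32(data) == crc else None


def send_with_retry(exchange, data, retries=MAX_RETRIES):
    """Send data and wait for a valid response, giving up after `retries` tries.

    `exchange` is a hypothetical callable: it transmits one frame and returns
    the raw response bytes, or None if no response arrived before its timeout.
    """
    for _ in range(retries):
        raw = exchange(add_crc(data))
        if raw is not None:
            response = check_crc(raw)
            if response is not None:
                return response
        # No response or a corrupted one: fall through and try again.
    raise TimeoutError("no valid response after %d attempts" % retries)
```

Raising an error after the last attempt gives the application one clear place to report that the link is down.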
Synchronisation - Timeouts
The ‘after some time’ phrase above introduces the timeout concept. Timeouts are really important because they allow the systems that participate in the communication link to re-synchronise. Without timeouts the receiving side can only advance its state if it receives data (unless of course there is a side-band channel next to the main data channel). So if data is lost it is very hard to make the receiver drop the failed frame and wait for a new one.
In order to tackle the situation with timeouts, we insert a timeout window on both sides. On the transmitter, the timeout is activated after the transmitter completes transmission and it defines how long the transmitter waits for the response before trying again. On the receiver, it is activated when the reception starts and it defines how long the receiver shall wait for the current frame to complete before dropping it.
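The receiver side of this can be sketched as a small state holder that remembers when the current frame started. The frame format here (one length byte followed by the payload), the 50 ms window and the class name are assumptions for illustration. The caller passes in the current time, for example from time.monotonic(), which keeps the logic easy to test.

```python
RX_TIMEOUT_S = 0.05  # hypothetical: longest time one frame may take to arrive


class FrameReceiver:
    """Collects frames of the form [length byte][payload], with a timeout."""

    def __init__(self, timeout=RX_TIMEOUT_S):
        self.timeout = timeout
        self.buffer = bytearray()
        self.started_at = None  # arrival time of the current frame's first byte

    def expire(self, now):
        """Drop a partial frame that has been pending longer than the timeout."""
        if self.buffer and now - self.started_at > self.timeout:
            self.buffer.clear()
            return True
        return False

    def feed(self, data, now):
        """Add received bytes and return a list of completed payloads."""
        self.expire(now)
        frames = []
        for byte in data:
            if not self.buffer:
                self.started_at = now  # reception of a new frame starts here
            self.buffer.append(byte)
            if len(self.buffer) == 1 + self.buffer[0]:
                frames.append(bytes(self.buffer[1:]))
                self.buffer.clear()
        return frames
```

Calling expire() from a periodic tick as well as from feed() lets the receiver drop a stalled frame even when no further data arrives.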
Synchronisation - Start Codes
It is quite common, and can make life a bit easier, to add a fixed value as the first byte of every frame. This way the receiving end can reject random data very early on (if there is any possibility of such data appearing). It also makes it easier to visually inspect Hex dumps of frames, and it lets oscilloscopes and logic analysers trigger on this pattern, so you get a better live view of the data exchange while debugging your system.
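Rejecting leading junk with a start code can be as simple as the sketch below. The start code value 0xA5 and the function name are hypothetical.

```python
START_CODE = 0xA5  # hypothetical start-of-frame marker


def skip_to_start(buffer):
    """Discard bytes that precede the first start code.

    Returns (number of bytes discarded, remaining bytes). The remaining bytes
    begin with the start code, or are empty if no start code was found.
    """
    index = buffer.find(bytes([START_CODE]))
    if index < 0:
        return len(buffer), b""
    return index, buffer[index:]
```

For example, skip_to_start(b"\x00\xff\xa5\x01\x02") drops the two noise bytes and returns the data starting at the 0xA5 marker.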
One point to add is the need to 'escape' the bytes used to delimit frames or sub-frames in the protocol. The ASCII character set defines SOH for "start of header". If that same value appears in the payload of the message it can be interpreted as an SOH instead of a payload byte. The character set defines ESC for this purpose.
The frame creating code scans the payload for SOH characters and inserts an ESC ahead of the SOH. The decoding code scans that received payload, removes the ESC, and leaves the SOH as data. An ESC in the data is similarly escaped. The character set includes other message protocol characters like STX (start of text or payload) and ETX (end of text).
Sometimes the protocol will not only escape but then change the escaped character to a different value.
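As a rough sketch of the plain escaping scheme described above (an ESC inserted ahead of each SOH or ESC, with no change to the escaped value), using the ASCII codes SOH = 0x01 and ESC = 0x1B. The function names are made up for illustration.

```python
SOH = 0x01  # ASCII "start of header"
ESC = 0x1B  # ASCII "escape"


def escape_payload(payload):
    """Insert an ESC ahead of every SOH or ESC byte in the payload."""
    out = bytearray()
    for byte in payload:
        if byte in (SOH, ESC):
            out.append(ESC)
        out.append(byte)
    return bytes(out)


def unescape_payload(data):
    """Remove the ESC bytes added by escape_payload, keeping the data bytes."""
    out = bytearray()
    escaped = False
    for byte in data:
        if byte == ESC and not escaped:
            escaped = True  # drop this ESC, keep whatever follows as data
            continue
        out.append(byte)
        escaped = False
    if escaped:
        raise ValueError("payload ends with a dangling ESC")
    return bytes(out)
```

With this variant the SOH value still appears on the wire (after an ESC), so a receiver hunting for a frame start has to track the preceding byte; changing the escaped value, as some protocols do, avoids that.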
I'm in the process of decoding an undocumented protocol, so I'm in the midst of considering all these issues.
Very good point you are bringing up. If you are delimiting frames only by a start code you definitely need to escape that code in your payload. On the other hand, if your frame header includes a size field you can simply ignore any start codes inside your payload. This makes your framing process lighter and can even allow zero copy between the different layers of your protocol. The downside is of course that after a loss of frame sync you may have more trouble re-synchronising if the start code appears in your payload.
Good luck with your decoding work. Especially in cases where the protocol is sloppy and bodged-up in order to support required features, it can be a real nightmare.