EmbeddedRelated.com
Forums
The 2026 Embedded Online Conference

Simple BUT reliable serial protocol

Started by pozz January 3, 2016
On 4.1.16 15:46, Jack wrote:
> On Monday, 4 January 2016 02:01:44 UTC+1, pozz wrote:
>
>> I'm sure a simple solution exists, because TCP/IP is now used for many
>> applications, even critical (on-line banking and similar things), but I
>> can't find it myself.
>
> If you can set one MCU as master and the other as slave, I find that MODBUS RTU is simple and robust enough for most of the communication that small MCUs need to do.
> There are of course limits: one MCU is the master and the other(s) are slaves, which means that one asks and the other(s) answer. No communication can be initiated from slaves. It means that the communication is half-duplex (even if you use two lines, rx and tx).
>
> Bye Jack
Modbus RTU is, from the data communication standpoint, one of the very worst protocols. For the receiving station, there is no way of knowing the packet boundaries without parsing the whole protocol with all variants. The timing limits set for packet transmission (obviously an attempt to patch the original bad decision) make it impossible to use an interface with built-in FIFO, as the timing is then lost.

For simple async serial transmission, the HDLC-style encapsulation of PPP is at least one of the very best.

--
-TV
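For readers who have not met the HDLC-style encapsulation used by PPP, here is a minimal sketch of just its byte-stuffing part, as I understand it from RFC 1662: 0x7E delimits frames, and 0x7D escapes any 0x7E or 0x7D in the payload by XORing it with 0x20. The names `encode_frame` and `FrameDecoder` are made up for this illustration. The address, control and protocol fields, the FCS, and the escaping of control characters are all left out, so treat it as an illustration of why frame boundaries are easy to find, not as a PPP implementation.

```python
FLAG = 0x7E  # frame delimiter
ESC = 0x7D   # control escape
XOR = 0x20   # an escaped byte is sent XORed with this value


def encode_frame(payload: bytes) -> bytes:
    """Wrap payload between flag bytes, escaping FLAG and ESC inside it."""
    out = bytearray([FLAG])
    for b in payload:
        if b in (FLAG, ESC):
            out += bytes([ESC, b ^ XOR])
        else:
            out.append(b)
    out.append(FLAG)
    return bytes(out)


class FrameDecoder:
    """Feed received bytes in chunks of any size; returns complete payloads."""

    def __init__(self):
        self.buf = bytearray()
        self.escaped = False
        self.synced = False  # becomes True once the first flag has been seen

    def feed(self, data: bytes) -> list:
        frames = []
        for b in data:
            if b == FLAG:
                # A flag ends the current frame; ESC followed by FLAG aborts it.
                if self.synced and self.buf and not self.escaped:
                    frames.append(bytes(self.buf))
                self.buf.clear()
                self.escaped = False
                self.synced = True
            elif not self.synced:
                continue  # ignore line noise before the first flag
            elif b == ESC:
                self.escaped = True
            elif self.escaped:
                self.buf.append(b ^ XOR)
                self.escaped = False
            else:
                self.buf.append(b)
        return frames
```

The decoder needs no timing information and no knowledge of the payload format, so it works the same behind a UART FIFO. A real link would still add a CRC so that a corrupted frame is dropped rather than delivered.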
pozz <pozzugno@gmail.com> writes:
> Fortunately I don't work for medical and mission-critical
> applications. Anyway I'm curious how those kinds of problems are solved
> in those applications where I can't accept *any* error.
There's a standard dilemma in distributed systems, where you want to send messages with a guarantee that they're delivered exactly once, but you can't do that because of the two generals problem. You can choose between "at least once" (but it might be delivered more than once), and "at most once" (but it might not get there at all). So the basic idea is to choose "at least once" and design things such that multiple deliveries don't cause problems.
On Mon, 4 Jan 2016 10:29:51 +0100, pozz <pozzugno@gmail.com> wrote:

>On 04/01/2016 06:43, Robert Wessel wrote:
>> On Mon, 4 Jan 2016 01:12:59 +0000, Tom Gardner
>> <spamjunk@blueyonder.co.uk> wrote:
>>
>>> On 04/01/16 01:01, pozz wrote:
>>>> Now the big question: how can the sender be *always* sure that the message has
>>>> really arrived at (and then been processed by) the receiver?
>>>
>>> See https://en.wikipedia.org/wiki/Two_Generals%27_Problem
>>
>>
>> And a partial solution is something like the two-phase commit protocol
>> (also well described in the obviously named Wikipedia article), but
>> it's not 100% (it may require administrator intervention to resolve
>> certain failures), but that's how distributed databases maintain
>> coherency over unreliable links.
>
>Of course, it is too complicated for simple serial links, maybe between
>a PC and an embedded board.
>
>
>Fortunately I don't work for medical and mission-critical applications.
>Anyway I'm curious how those kinds of problems are solved in those
>applications where I can't accept *any* error.
At least you need to define much more accurately what you really want to do. After that, someone might be able to give some usable pointers :-)
Tom Gardner wrote:
> On 04/01/16 02:25, Les Cargill wrote:
>> It's often useful to have a state of "oh, the line is dead" based
>> on the timeout.
>
> Yes.
>
> Especially if that results in the explicit design of an FSM,
> with a corresponding simple, easily readable and easily
> modifiable implementation.
>
+1.
> The latter should strongly shape the implementation
> techniques, because it is all too common that "doing
> the simplest thing" at each modification leads to an
> unmaintainable ball of string.
>
It's absolutely vital.

--
Les Cargill
Jack wrote:
> On Monday, 4 January 2016 02:01:44 UTC+1, pozz wrote:
>
>> I'm sure a simple solution exists, because TCP/IP is now used for
>> many applications, even critical (on-line banking and similar
>> things), but I can't find it myself.
>
> If you can set one MCU as master and the other as slave, I find that
> MODBUS RTU is simple and robust enough for most of the communication
> that small MCUs need to do. There are of course limits: one MCU is the
> master and the other(s) are slaves, which means that one asks and the
> other(s) answer. No communication can be initiated from slaves. It means
> that the communication is half-duplex (even if you use two lines, rx and
> tx).
>
> Bye Jack
>
Half-duplex RTU can therefore offer only limited utilization of those half-duplex circuits. Sometimes that matters.

--
Les Cargill
On Sunday, January 3, 2016 at 8:01:44 PM UTC-5, pozz wrote:
> I'm trying to implement a simple protocol for a point-to-point
> full-duplex serial link. It could be a reliable link, such as a
> connection between two near MCUs on the same PCB, or a noisy
> link, such as an RF link.
>
> The application layer should send and receive generic messages:
> -> How are you?
> <- I'm fine, and you?
> -> That's ok here.
> The above example is a half-duplex protocol, but the link
> is full-duplex and the messages could be transmitted anytime.
>
> I'd like to isolate the reliability feature to lower protocols
> (transport, network, link layers), as TCP guarantees a
> connection-oriented session to the application layer.
>
> Even with a very reliable connection, I have to face the event
> of some error during transmission. This leads to implementing the
> mechanism of acks and retransmissions. If the sender doesn't receive
> an ack within a certain timeout, it sends the packet again.
>
> But I can't retransmit the naked packet as is, because it could be
> received twice (imagine what happens if the message is "charge
> the bank account for 1000USD").
[]
>
> I'm sure a simple solution exists, [], but I
> can't find it myself.
You got many other good comments, I just wanted to add one more about something not really covered yet. There is a reason the network layers include an application layer. That example of charging the account twice is clearly an application layer issue. Lower protocols can never solve that problem because it doesn't belong at the lower protocol levels in the stack.

HTH, ed
On 4-1-2016 at 2:01, pozz wrote:
> I'm trying to implement a simple protocol for a point-to-point
> full-duplex serial link. It could be a reliable link, such as a
> connection between two near MCUs on the same PCB, or a noisy link, such
> as an RF link.
>
> The application layer should send and receive generic messages:
> -> How are you?
> <- I'm fine, and you?
> -> That's ok here.
> The above example is a half-duplex protocol, but the link is full-duplex
> and the messages could be transmitted anytime.
>
> I'd like to isolate the reliability feature to lower protocols
> (transport, network, link layers), as TCP guarantees a
> connection-oriented session to the application layer.
>
> Even with a very reliable connection, I have to face the event of some
> error during transmission. This leads to implementing the mechanism of
> acks and retransmissions. If the sender doesn't receive an ack within a
> certain timeout, it sends the packet again.
>
> But I can't retransmit the naked packet as is, because it could be
> received twice (imagine what happens if the message is "charge the bank
> account for 1000USD").
>
> So I looked at the sequence number mechanism: every message is marked with
> a different number so the receiver can detect duplicated packets and, in
> that case, ack them again (but not process them another time).
>
> One standard protocol with those features is HDLC in Asynchronous
> Balanced Mode (ABM). It defines a good asynchronous serial framing
> (similar to SLIP) *and* introduces tx and rx sequence numbers for every
> type-I frame.
> Of course, it is very similar to TCP, which uses sequence and
> acknowledgment numbers.
>
>
> Now the big question: how can the sender be *always* sure that the
> message has really arrived at (and then been processed by) the receiver?
>
> Of course, if the sender receives the ack for the frame of interest, it
> can be sure the message has arrived and been processed.
> But what happens if the sender doesn't receive the ack, because it is
> transmitted with errors?
> It can send the message again without problem, indeed the receiver will
> detect the duplicate message and send the ack for that frame again
> (without processing the message another time).
> And what happens if the sender doesn't receive the second, third, ... ack?
> This could happen for example when an intermediate router/forwarder/hub
> has been powered off or the receiver has been physically disconnected from
> the link.
> Maybe the message arrived (and was processed) just before the connection
> trouble, so the ack will never arrive at the sender.
>
> In this odd case, the sender can't be 100% sure if its message has
> arrived. What is the solution for this situation?
>
> This scenario is very similar to an actual TCP/IP network. As said
> before, TCP is a protocol that implements acks, retransmissions and
> sequence numbers.
> After a TCP connection has been established, the two hosts start exchanging
> data. At some time, something goes wrong and one host doesn't know if
> the last message has arrived at the destination or not.
>
> I'm sure a simple solution exists, because TCP/IP is now used for many
> applications, even critical (on-line banking and similar things), but I
> can't find it myself.
It's quite simple to add a message counter. Each time a message is sent with a counter value, the remote side has to ack it with the same counter value. If no ack is received within a certain time, the message is sent again. The remote side keeps track of the last received counter value. If that value is received again, the message is not processed, but another ack is sent.

I have implemented this scheme over UDP, but with framing and a checksum/CRC it could be applied to a serial communication stream as well.

Regards, Vincent
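To make the counter scheme concrete, here is a small simulation of it, not Vincent's actual code. The `Sender`, `Receiver` and `make_channel` names, the 8-bit counter and the retry limit are all assumptions for illustration; the "channel" is a plain function call that pretends to lose the first few acks, where a real link would use a timeout.

```python
class Receiver:
    """Processes each counter value once and acks every copy it sees."""

    def __init__(self):
        self.last_seq = None
        self.processed = []

    def on_message(self, seq, payload):
        if seq != self.last_seq:      # new message: process it
            self.processed.append(payload)
            self.last_seq = seq
        return seq                    # duplicate or not, ack with the same counter


class Sender:
    """Stop-and-wait sender: one outstanding message, resend until acked."""

    def __init__(self, channel, max_tries=5):
        self.channel = channel        # callable(seq, payload) -> ack or None
        self.max_tries = max_tries
        self.seq = 0

    def send(self, payload):
        seq = self.seq
        self.seq = (self.seq + 1) % 256   # assumed 8-bit counter
        for _ in range(self.max_tries):
            if self.channel(seq, payload) == seq:
                return True           # ack seen: delivered and processed
        return False                  # gave up: delivery status is UNKNOWN


def make_channel(receiver, lost_acks):
    """Simulated link that delivers every message but loses the first acks."""
    state = {"lost": 0}

    def channel(seq, payload):
        ack = receiver.on_message(seq, payload)
        if state["lost"] < lost_acks:
            state["lost"] += 1
            return None               # the sender sees this as a timeout
        return ack

    return channel
```

With two lost acks the message is sent three times but processed once. If every ack is lost, `send` returns False even though the receiver did process the message, so a False result can only mean "unknown".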
On Thu, 14 Jan 2016 09:01:03 +0100, Vincent vB <embedded@spam.com> wrote:

>It's quite simple to add a message counter. Each time a message is sent
>with a counter value, the remote side has to ack this with the same
>counter value. If no ack is received within a certain time, the message
>is sent again. The remote side keeps track of the last received counter
>value. If that value is received again, the message is not processed,
>but another ack is sent.
>
>I have implemented this scheme over UDP, but with framing and a
>checksum/crc it could be applied to a serial communication stream as well.
Which doesn't solve the problem of the remote side receiving and processing the message, but going offline before the ACK can be sent back (perhaps the ACK was half-way out the network card at the instant the backhoe hit the cable). In that case the local node can retransmit all it wants, but it will presumably eventually give up and assume incorrectly that the remote side did *not* process the message.
On 14/01/16 19:53, Robert Wessel wrote:
>> It's quite simple to add a message counter. Each time a message is sent
>> with a counter value, the remote side has to ack this with the same
>> counter value. If no ack is received within a certain time, the message
>> is sent again.
Or, like TCP, it can wait a short period to see if further messages are closely following, and just ack the highest sequence number that forms a continuous sequence. Fewer ACKs are needed that way.

TCP also piggy-backs bi-directional ACKs on outgoing data frames, which saves frames.

In order to determine the throughput of the slowest hop, it also does a "slow start", increasing in speed until packet loss is detected (by a timeout or duplicate ACKs, since TCP has no explicit NACK), then adaptively backing off a little.

None of these algorithms are difficult, but collectively, they "make the magic".
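The "ack the highest sequence number that forms a continuous sequence" idea can be sketched as below. This is a simplified illustration, not TCP: sequence numbers count whole messages, never wrap, and start at 0, and the class and method names are invented here.

```python
class CumulativeAckReceiver:
    """Buffers out-of-order frames and acks only the contiguous prefix."""

    def __init__(self):
        self.next_expected = 0   # lowest sequence number not yet delivered
        self.pending = {}        # out-of-order frames: seq -> payload
        self.delivered = []      # payloads handed to the application, in order

    def on_frame(self, seq, payload):
        if seq >= self.next_expected:
            self.pending.setdefault(seq, payload)   # ignore duplicates
        # Deliver everything that is now contiguous.
        while self.next_expected in self.pending:
            self.delivered.append(self.pending.pop(self.next_expected))
            self.next_expected += 1

    def ack(self):
        """Highest seq with no gaps below it, or None if nothing arrived yet."""
        if self.next_expected == 0:
            return None
        return self.next_expected - 1
```

If frames 0, 1 and 3 arrive, `ack()` is 1. Once frame 2 shows up, a single ack of 3 covers all four frames, which is where the saving in ACK traffic comes from.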
> Which doesn't solve the problem of the remote side receiving and
> processing the message, but going offline before the ACK can be sent
> back (perhaps the ACK was half-way out the network card at the instant
> the backhoe hit the cable). In that case the local node can
> retransmit all it wants, but it will presumably eventually give up and
> assume incorrectly that the remote side did *not* process the message.
Why should it make that assumption? It might equally wrongly make the opposite assumption. Until a mutual closing handshake has been completed there is no certainty.

And even then, the change of state at one end might not have been persisted properly, and may be lost during a reboot.

It's important to design protocols based on "desired state" where possible, and not on "state change" requests.

Idempotence rules. So does idempotence, and idempotence.

Clifford Heath
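A toy illustration of the difference between a "state change" request and a "desired state" request when a retransmission delivers the same message several times. The `Valve` class and its methods are invented for this example.

```python
class Valve:
    """Toy device with a single piece of state."""

    def __init__(self):
        self.position = 0

    def apply_delta(self, delta):
        # "State change" request: every duplicate moves the valve again.
        self.position += delta

    def apply_target(self, target):
        # "Desired state" request: duplicates leave the result unchanged.
        self.position = target


relative = Valve()
absolute = Valve()
for _ in range(3):            # the same message delivered three times
    relative.apply_delta(10)
    absolute.apply_target(10)

print(relative.position)      # 30
print(absolute.position)      # 10
```

With the second style, a sender that is unsure whether its message got through can simply send it again.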
On 14-1-2016 at 10:06, Clifford Heath wrote:
> On 14/01/16 19:53, Robert Wessel wrote:
>>> It's quite simple to add a message counter. Each time a message is sent
>>> with a counter value, the remote side has to ack this with the same
>>> counter value. If no ack is received within a certain time, the message
>>> is sent again.
>
> Or, like TCP, it can wait a short period to see if further
> messages are closely following, and just ack the highest
> sequence number that forms a continuous sequence. Fewer
> ACKs are needed that way.
>
> TCP also piggy-backs bi-directional ACKs on outgoing data
> frames, which saves frames.
Yes, of course that is possible. I have actually implemented something similar, but with the ACK piggy-backed on the remote side's outgoing messages.
>
>> Which doesn't solve the problem of the remote side receiving and
>> processing the message, but going offline before the ACK can be sent
>> back (perhaps the ACK was half-way out the network card at the instant
>> the backhoe hit the cable). In that case the local node can
>> retransmit all it wants, but it will presumably eventually give up and
>> assume incorrectly that the remote side did *not* process the message.
>
> Why should it make that assumption? It might equally wrongly
> make the opposite assumption. Until a mutual closing handshake
> has been completed there is no certainty.
There is no way to know. Message ordinals/acks and CRCs/Checksums will make your connection quite reliable. The idea is to minimize the types and rate of errors which can occur, and to design the higher layer of the software such that it can deal with special circumstances.
>
> And even then, the change of state at one end might not have
> been persisted properly, and lost during a reboot.
>
> It's important to design protocols based on "desired state"
> where possible, and not on "state change" requests.
That would do it. If messages are lost and the connection is regained, verify the state of the remote system and update it where desired. It's all a matter of careful thinking.

Regards, Vincent