MQTT QoS vs TCP

Started by pozz May 19, 2023
I know TCP is able to guarantee the delivery of messages from sender to 
receiver, without corruption (thanks to checksums), in order and without 
duplication (thanks to sequence numbers).

So I'm confused when I read something about MQTT QoS. For example, QoS 1 
(at least once) uses ack (PUBACK) and retransmissions (with packet ids) 
to guarantee that the message is delivered at least one time.

I'm wondering how this is possible over a TCP connection that uses the 
same technique of ack and sequence numbers.

From what I know about TCP, if some data isn't acknowledged by the 
receiver, the sender automatically resends the unacknowledged segments. 
This is done by the TCP/IP stack or kernel, without the application 
knowing anything about it.

It seems to me it's impossible that an MQTT client needs to resend an MQTT 
message because it wasn't received by the broker. If this happens, TCP 
should signal the error to the application, which should close and try to 
reopen the connection.
On Friday, May 19, 2023 at 5:42:14 AM UTC-4, pozz wrote:
TCP is only one of the lower levels of the protocol stack. Data can sometimes be lost in the higher levels.

Secondly, there is the issue of resend timeouts. If TCP fails to deliver the message past the MQTT retry time limit, then MQTT will resend the message.

HTH,
Ed
On 2023-05-19, pozz <pozzugno@gmail.com> wrote:
> It seems to me it's impossible that a MQTT client needs to resend a MQTT message because it wasn't received by the broker. If this happens, TCP should signal the error to the application that should close and try to reopen the connection.

After which an MQTT client will need to retransmit its message, no? The difference between QoS 0 and QoS 1 boils down to whether the sender of the message is actually bothered to do that. (I /think/ QoS 1 also allows for reliable delivery along the entire client-to-server-to-another-client path, but I'm not sure about that.)

TCP will also signal an error when the message has successfully reached its destination, but the respective acknowledgement has not; as such, an application-level protocol running over TCP generally needs the means to weed out the duplicates that are bound to happen in this case. Which is what MQTT QoS 2 does.

-- 
FSF associate member #7257 http://am-1.org/~ivan/
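To make the duplicate case above concrete, here is a minimal sketch in Python. It is a toy model, not an MQTT implementation: the `Broker` class, `publish_with_retry` and the `lose_first_ack` switch are all invented for illustration, and the real QoS 2 exchange (PUBREC/PUBREL/PUBCOMP) is collapsed into a simple "remember the packet id" check.

```python
class Broker:
    """Toy broker: records deliveries; optionally de-duplicates by packet id."""

    def __init__(self, dedupe):
        self.dedupe = dedupe
        self.seen_ids = set()
        self.delivered = []

    def on_publish(self, packet_id, payload):
        if self.dedupe and packet_id in self.seen_ids:
            return "ack"  # already handled: acknowledge again, do not redeliver
        self.seen_ids.add(packet_id)
        self.delivered.append(payload)
        return "ack"


def publish_with_retry(broker, packet_id, payload, lose_first_ack):
    """Send until an ack is seen; the first ack may be lost with the connection."""
    attempts = 0
    while True:
        attempts += 1
        ack = broker.on_publish(packet_id, payload)
        if lose_first_ack and attempts == 1:
            continue  # message arrived, ack did not: sender has to resend
        if ack == "ack":
            return attempts


# "At least once": the broker delivers the message twice.
qos1 = Broker(dedupe=False)
publish_with_retry(qos1, 1, "temp=21", lose_first_ack=True)
print(qos1.delivered)  # ['temp=21', 'temp=21']

# "Exactly once" (simplified): the second copy is recognised and dropped.
qos2 = Broker(dedupe=True)
publish_with_retry(qos2, 1, "temp=21", lose_first_ack=True)
print(qos2.delivered)  # ['temp=21']
```

The point of the sketch is only that the sender cannot tell "message lost" from "ack lost", so a resend is the safe default and de-duplication has to happen on the receiving side.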
On 19/05/2023 17:14, Ed Prochak wrote:
> TCP is only one of the lower levels of the protocol stack. Data can sometimes be lost in the higher levels.
In this case, there's only one higher level, which is the MQTT application. How could an application running on a machine lose something? Network links aren't reliable, but applications running on a processor are. Are you thinking of an application crash, or a crash of the entire machine that needs a reboot? In that case, after the reboot, the MQTT application usually doesn't know anything about the previous connection, timeouts and lost messages... unless it saved something to non-volatile memory.
> Secondly, there is the issue of resend timeouts. If TCP fails to deliver the message past the MQTT retry time limit, then MQTT will resend the message.
What happens in this case? Suppose one TCP segment carrying a single MQTT message (just for simplicity), sent by a client to the server (the broker), is lost. After a TCP timeout, the network stack autonomously resends the segment until an ACK is received. Even if the MQTT application resends the MQTT message *before* the TCP timeout, the broker will not see it until the earlier segment has been delivered.

More exactly, on the receiving machine, the TCP layer will not pass the resent message to the application (the MQTT broker) before the lost TCP segment has been received as well. When the lost TCP segment is received, the broker will receive two MQTT messages: the "original" and the resent one. I think it's impossible for the broker to receive the second transmission without receiving the first.

So it seems to me the retransmission made at the MQTT level is completely useless... but I think I haven't got the real point here.
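The in-order argument above can be sketched with a toy model of TCP's receive side. This is an illustration only: `InOrderReceiver` is a made-up class, segments are keyed by byte offset, and the payloads are placeholder strings rather than real MQTT packets.

```python
class InOrderReceiver:
    """Toy model of TCP's receive side: data reaches the application in order."""

    def __init__(self):
        self.next_seq = 0    # next byte offset the application is waiting for
        self.pending = {}    # out-of-order segments, keyed by starting offset
        self.app_data = []   # what the application (the broker) has read so far

    def on_segment(self, seq, data):
        self.pending[seq] = data
        # Hand over every segment that is now contiguous with delivered data.
        while self.next_seq in self.pending:
            chunk = self.pending.pop(self.next_seq)
            self.app_data.append(chunk)
            self.next_seq += len(chunk)


rx = InOrderReceiver()
original = b"PUBLISH#1"
resend = b"PUBLISH#1(dup)"

# The segment carrying the original is lost; the resent copy arrives first.
rx.on_segment(len(original), resend)
print(rx.app_data)  # []  (held back: the gap before it is not filled yet)

# TCP retransmits the lost segment; both copies are released, in order.
rx.on_segment(0, original)
print(rx.app_data)  # [b'PUBLISH#1', b'PUBLISH#1(dup)']
```

Within a single connection, then, the resent copy can only ever arrive behind the original, which is exactly the head-of-line behaviour described in the post.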
On 22/05/2023 09:11, pozz wrote:
I haven't used MQTT much, but generally if an application gets a timeout and wants to retry, it will close the TCP/IP connection and open a new one. (Or rather, open a new one while the old one is closing - closing a failing TCP/IP connection can be slow.)

I would actually have thought that UDP was a more natural choice for MQTT, rather than TCP - although older versions of MQTT did not have QoS and were therefore reliant on TCP's acknowledgements and retries.

(I always think it's a shame that SCTP never caught on - among its many benefits, you don't have this head-of-line blocking issue.)
On 22/05/2023 10:09, David Brown wrote:
> I haven't used MQTT much, but generally if an application gets a timeout and wants to retry, it will close the TCP/IP connection and open a new one. (Or rather, open a new one while the old one is closing - closing a failing TCP/IP connection can be slow.)
I'm quite sure that the MQTT retransmission mechanism is *not* based on a new TCP connection. In MQTT, the TCP connection is persistent. It can stay open for days without exchanging any real data. In this case, the keepalive facility is there to detect a broken link.
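As a rough illustration of the keepalive facility mentioned here, the two checks below sketch the timing rules as I understand them from the MQTT specification: the client sends a PINGREQ when it has sent nothing else for one keep-alive interval, and the broker may drop a client it has not heard from for one and a half intervals. The function names and the use of plain numbers for timestamps are assumptions made for this sketch.

```python
def client_should_ping(now, last_sent, keep_alive):
    """True when the client has sent nothing for a full keep-alive interval."""
    return keep_alive > 0 and (now - last_sent) >= keep_alive


def broker_should_drop(now, last_received, keep_alive):
    """True when nothing has arrived within 1.5x the keep-alive interval."""
    return keep_alive > 0 and (now - last_received) > 1.5 * keep_alive


KEEP_ALIVE = 60  # seconds; a keep-alive of 0 disables the mechanism

print(client_should_ping(now=59, last_sent=0, keep_alive=KEEP_ALIVE))      # False
print(client_should_ping(now=60, last_sent=0, keep_alive=KEEP_ALIVE))      # True
print(broker_should_drop(now=90, last_received=0, keep_alive=KEEP_ALIVE))  # False
print(broker_should_drop(now=91, last_received=0, keep_alive=KEEP_ALIVE))  # True
```

This is what lets an otherwise idle connection stay open for days while still noticing a dead link within a bounded time.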
On 22/05/2023 11:08, pozz wrote:
> I'm quite sure that MQTT retransmission mechanism is *not* based on a new TCP connection. In MQTT, the TCP connection is persistent. It can stay open for days without exchanging any real data. In this case, the keepalive facility is there to detect a broken link.
If the TCP/IP connection is working correctly, messages will be transmitted correctly to the broker. If a QoS message fails to be transmitted - the MQTT client or server does not receive an acknowledgement in time - then there are two possible issues. One is that the server/broker application is in trouble. The other is that there is an issue with the network. In most cases, I would suspect the network first.

TCP/IP already has acknowledgements and timeouts, so if it is a temporary problem then it is likely to be handled there. By the time it reaches the attention of the application protocol's QoS handling, you are definitely at the point where a new TCP/IP connection is the right way to go - perhaps targeting a different IP address or via a different route.

The MQTT application already has to handle dropping and making new TCP/IP connections - even if the norm is for the connection to last for weeks at a time or more. So creating a new TCP/IP link has a lot to gain, and very little to lose, and it is the standard way to handle such issues.
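A minimal sketch of that recovery path, under several assumptions: `FakeLink` stands in for one TCP connection, its `healthy` flag decides whether an acknowledgement comes back in time, the addresses are made up, and both addresses are assumed to reach the same broker session. The resend carries a `dup` marker, in the spirit of the DUP flag MQTT sets on a re-delivered PUBLISH.

```python
class FakeLink:
    """Stand-in for one TCP connection to the broker (hypothetical)."""

    def __init__(self, address, healthy):
        self.address = address
        self.healthy = healthy
        self.sent = []

    def publish(self, packet_id, payload, dup):
        self.sent.append((packet_id, payload, dup))
        return self.healthy  # True means an ack arrived before the timeout


def publish_qos1(links, packet_id, payload):
    """Try each link in turn; every attempt after the first is marked dup."""
    for attempt, link in enumerate(links):
        if link.publish(packet_id, payload, dup=attempt > 0):
            return link.address
        # No ack in time: treat this connection as dead and move on.
    raise ConnectionError("no broker acknowledged the message")


links = [FakeLink("10.0.0.1", healthy=False), FakeLink("10.0.0.2", healthy=True)]
print(publish_qos1(links, 7, "door=open"))  # 10.0.0.2
print(links[1].sent)                        # [(7, 'door=open', True)]
```

Seen this way, the application-level resend is not competing with TCP's retransmission on the same connection; it only comes into play once that connection has been given up.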
On 22/05/2023 12:54, David Brown wrote:
> If the TCP/IP connection is working correctly, messages will be transmitted correctly to the broker. If a QoS message fails to be transmitted - the MQTT client or server does not receive an acknowledge in time - then there are two possible issues. One is that the server/broker application is in trouble. The other is that there is an issue with the network. In most cases, I would suspect the network first. TCP/IP already has acknowledges and timeouts, so if it is a temporary problem then it is likely to be handled there. By the time it reaches the attention of the application protocol's QoS handling, you are definitely at the point where a new TCP/IP connection is the right way to go - perhaps targeting a different IP address or via a different route.
Yes, this is the only solution for me too. Anyway, I don't know if this behaviour (closing and reopening TCP connection) is described in the MQTT specifications.
> The MQTT application already has to handle dropping and making new TCP/IP connections - even if the norm is for the connection to last for weeks at a time or more. So creating a new TCP/IP link has a lot to gain, and very little to lose, and it is the standard way to handle such issues.
Here [1] is the MQTT client implementation that ships with lwIP, a popular TCP/IP stack for embedded systems. When the timeout for the ACK expires, this client only calls an application callback with ERR_TIMEOUT. Maybe the decision to close and reopen a new TCP connection is left to the application.

I don't know if other MQTT clients implement a built-in mechanism that automatically tries to solve the issue of lost ACKs by reopening a TCP connection.
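The division of labour described here (the client library reports a timeout, the application decides what to do) could look roughly like the following. This is a Python sketch of the pattern, not lwIP's C API: the `App` class, the callback name and the numeric error values are invented for illustration.

```python
ERR_OK = 0
ERR_TIMEOUT = -3  # illustrative value for this sketch only


class App:
    """Application-side state driven by a hypothetical client library."""

    def __init__(self):
        self.confirmed = []
        self.reconnect_requested = False

    def on_publish_result(self, packet_id, err):
        """Called when a publish is acknowledged or its ack timer expires."""
        if err == ERR_OK:
            self.confirmed.append(packet_id)
        elif err == ERR_TIMEOUT:
            # The library only reports the timeout; tearing the connection
            # down and opening a new one is the application's decision.
            self.reconnect_requested = True


app = App()
app.on_publish_result(1, ERR_OK)
app.on_publish_result(2, ERR_TIMEOUT)
print(app.confirmed, app.reconnect_requested)  # [1] True
```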
[1] https://github.com/lwip-tcpip/lwip/blob/master/src/apps/mqtt/mqtt.c
On 23/05/2023 08:53, pozz wrote:
> Yes, this is the only solution for me too. Anyway, I don't know if this behaviour (closing and reopening TCP connection) is described in the MQTT specifications.
I haven't read the MQTT specifications - I don't even know what documentation exists for the protocol. But implementation details like this are not always covered in such documents, as it is really at a level below the protocol itself. (The specifications for HTTP, for example, don't say how many simultaneous connections a browser should have to a web server, or when it should give up and retry.) So don't be surprised if this is /not/ in the specs - that does not mean a client cannot or should not make new TCP/IP connections.
> Here[1] the MQTT client implementation of lwip, a popular TCP/IP stack for embedded systems.
This is a bit muddled. I am familiar with LWIP, but I don't know whether you are talking about an MQTT client that you wrote yourself, or which comes as part of newer LWIP, or which someone else contributed as a sample.
> When the timeout for the ACK is expired, this client only calls an application callback with ERR_TIMEOUT. Maybe the decision to close and reopen a new TCP connection is passed to the application.
Yes, that would be the normal behaviour.
> I don't know if other MQTT clients implement an embedded mechanism that automatically tries to solve the issue of lost ACKs by reopening a TCP connection.
I don't know either. I can only tell you that if you are failing to communicate on a TCP/IP connection, then making a new one (possibly after a delay) is the normal way to handle things if you want automatic recovery.
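For the "new connection, possibly after a delay" approach, one common shape is a reconnect loop with a growing delay between attempts. The sketch below is only one way to do it: `backoff_delays`, `reconnect`, and the default timings are assumptions, and `connect` and `sleep` are passed in so the example runs without a network.

```python
def backoff_delays(base=1.0, factor=2.0, cap=60.0, attempts=6):
    """Yield a growing delay (in seconds) for each attempt, limited to cap."""
    delay = base
    for _ in range(attempts):
        yield min(delay, cap)
        delay *= factor


def reconnect(connect, sleep, delays):
    """Call connect() until it succeeds, sleeping between failed attempts."""
    for delay in delays:
        if connect():
            return True
        sleep(delay)
    return False


print(list(backoff_delays()))  # [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]

# Simulated run: two failed attempts, then success; sleeps are only recorded.
outcomes = iter([False, False, True])
waits = []
print(reconnect(lambda: next(outcomes), waits.append, backoff_delays()))  # True
print(waits)  # [1.0, 2.0]
```

In a real client, `connect` would open the new TCP connection and redo the MQTT CONNECT, and `sleep` would be `time.sleep` or a timer on the target platform.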
On 23/05/2023 09:55, David Brown wrote:
>>> The MQTT application already has to handle dropping and making new TCP/IP connections - even if the norm is for the connection to last for weeks at a time or more. So creating a new TCP/IP link has a lot to gain, and very little to lose, and it is the standard way to handle such issues.
>>
>> Here[1] the MQTT client implementation of lwip, a popular TCP/IP stack for embedded systems.
>
> This is a bit muddled. I am familiar with LWIP, but I don't know whether you are talking about an MQTT client that you wrote yourself, or which comes as part of newer LWIP, or which someone else contributed as a sample.
The link points to the official MQTT client implementation of the lwip project.
>> When the timeout for the ACK is expired, this client only calls an application callback with ERR_TIMEOUT. Maybe the decision to close and reopen a new TCP connection is passed to the application.
>
> Yes, that would be the normal behaviour.
>
>> I don't know if other MQTT clients implement an embedded mechanism that automatically tries to solve the issue of lost ACKs by reopening a TCP connection.
>
> I don't know either. I can only tell you that if you are failing to communicate on a TCP/IP connection, then making a new one (possibly after a delay) is the normal way to handle things if you want automatic recovery.
>
>>>>> I would actually have thought that UDP was a more natural choice for MQTT, rather than TCP - although older versions of MQTT did not have QoS and were therefore reliant on TCP's acknowledges and retries.
>>>>>
>>>>> (I always think its a shame that SCTP never caught on - among its many benefits, you don't have this head-of-line blocking issue.)
>>
>> [1] https://github.com/lwip-tcpip/lwip/blob/master/src/apps/mqtt/mqtt.c