On 23/05/2023 09:55, David Brown wrote:
> On 23/05/2023 08:53, pozz wrote:
>> On 22/05/2023 12:54, David Brown wrote:
>>> On 22/05/2023 11:08, pozz wrote:
>>>> On 22/05/2023 10:09, David Brown wrote:
>>>>> On 22/05/2023 09:11, pozz wrote:
>>>>>> On 19/05/2023 17:14, Ed Prochak wrote:
>>>>>>> On Friday, May 19, 2023 at 5:42:14 AM UTC-4, pozz wrote:
>>>>>>>> I know TCP is able to guarantee the delivery of messages from
>>>>>>>> sender to receiver, without corruption (thanks to checksums),
>>>>>>>> in order and without duplication (thanks to sequence numbers).
>>>>>>>>
>>>>>>>> So I'm confused when I read something about MQTT QoS. For
>>>>>>>> example, QoS 1 (at least once) uses ack (PUBACK) and
>>>>>>>> retransmissions (with packet ids) to guarantee that the message
>>>>>>>> is delivered at least one time.
>>>>>>>>
>>>>>>>> I'm wondering how this is possible over a TCP connection that
>>>>>>>> uses the same technique of ack and sequence numbers.
>>>>>>>>
>>>>>>>> From what I know about TCP, if some data isn't acknowledged by
>>>>>>>> the receiver, the sender automatically resends the unacked
>>>>>>>> packets. This is done by the TCP/IP stack or the kernel, without
>>>>>>>> the application knowing anything about it.
>>>>>>>>
>>>>>>>> It seems to me it's impossible that an MQTT client needs to
>>>>>>>> resend an MQTT message because it wasn't received by the broker.
>>>>>>>> If this happens, TCP should signal the error to the application,
>>>>>>>> which should close and try to reopen the connection.
>>>>>>>
>>>>>>> TCP is only one of the lower levels of the protocol stack.
>>>>>>> Data can sometimes be lost in the higher levels.
>>>>>>
>>>>>> In this case, there's only one higher level, that is the MQTT
>>>>>> application. How could an application running on a machine lose
>>>>>> something? Network links aren't reliable, but applications running
>>>>>> on a processor are reliable.
>>>>>> Are you thinking of an application crash or an entire machine
>>>>>> crash that needs a reboot? In this case, after the reboot, the
>>>>>> MQTT application usually doesn't know anything about the previous
>>>>>> connection, timeouts and lost messages... unless it saved
>>>>>> something in non-volatile memory.
>>>>>>
>>>>>>
>>>>>>> Secondly, there is the issue of resend timeouts. If TCP fails to
>>>>>>> deliver the message past the MQTT retry time limit, then MQTT
>>>>>>> will resend the message.
>>>>>>
>>>>>> What happens in this case? Suppose one TCP segment with a single
>>>>>> MQTT message (just for simplicity) sent by a client to the server
>>>>>> (the broker) was lost. After a TCP timeout, the network stack
>>>>>> autonomously resends the segment until an ACK is received. Even if
>>>>>> the MQTT application resends the MQTT message *before* the TCP
>>>>>> timeout, it will not be sent by the TCP layer until the previous
>>>>>> segment is acked.
>>>>>> Maybe, more exactly, on the receiver machine, the TCP layer will
>>>>>> not pass the resent message to the application (the MQTT broker)
>>>>>> before the lost TCP segment is received as well. When the lost TCP
>>>>>> segment is received, the broker will receive two MQTT messages:
>>>>>> the "original" and the resent one. I think it's impossible for
>>>>>> the broker to receive the second transmission without receiving
>>>>>> the first.
>>>>>>
>>>>>> So it seems to me the retransmission made at the MQTT level is
>>>>>> completely useless... but I think I'm missing the real point here.
>>>>>
>>>>> I haven't used MQTT much, but generally if an application gets a
>>>>> timeout and wants to retry, it will close the TCP/IP connection and
>>>>> open a new one. (Or rather, open a new one while the old one is
>>>>> closing - closing a failing TCP/IP connection can be slow.)
>>>>
>>>> I'm quite sure that the MQTT retransmission mechanism is *not*
>>>> based on a new TCP connection. In MQTT, the TCP connection is
>>>> persistent. It can stay open for days without exchanging any real
>>>> data. In this case, the keepalive facility is there to detect a
>>>> broken link.
>>>
>>> If the TCP/IP connection is working correctly, messages will be
>>> transmitted correctly to the broker. If a QoS message fails to be
>>> transmitted - the MQTT client or server does not receive an
>>> acknowledgement in time - then there are two possible issues. One is
>>> that the server/broker application is in trouble. The other is that
>>> there is an issue with the network. In most cases, I would suspect
>>> the network first. TCP/IP already has acknowledgements and timeouts, so
>>> if it is a temporary problem then it is likely to be handled there.
>>> By the time it reaches the attention of the application protocol's
>>> QoS handling, you are definitely at the point where a new TCP/IP
>>> connection is the right way to go - perhaps targeting a different IP
>>> address or via a different route.
>>
>> Yes, this is the only solution for me too. Anyway, I don't know if
>> this behaviour (closing and reopening TCP connection) is described in
>> the MQTT specifications.
>>
>
> I haven't read the MQTT specifications - I don't even know what
> documentation exists for the protocol. But implementation details like
> this are not always covered in such documents, as it is really at a
> level below the protocol itself. (The specifications for HTTP, for
> example, don't say how many simultaneous connections a browser should
> have to a web server, or when it should give up and retry.) So don't be
> surprised if this is /not/ in the specs - that does not mean a client
> cannot or should not make new TCP/IP connections.
>
>>
>>> The MQTT application already has to handle dropping and making new
>>> TCP/IP connections - even if the norm is for the connection to last
>>> for weeks at a time or more. So creating a new TCP/IP link has a lot
>>> to gain, and very little to lose, and it is the standard way to
>>> handle such issues.
>>
>> Here [1] is the MQTT client implementation of lwip, a popular TCP/IP
>> stack for embedded systems.
>
> This is a bit muddled. I am familiar with LWIP, but I don't know
> whether you are talking about an MQTT client that you wrote yourself, or
> which comes as part of newer LWIP, or which someone else contributed as
> a sample.

The link points to the official MQTT client implementation of the lwip
project.

>> When the timeout for the ACK expires, this client only calls an
>> application callback with ERR_TIMEOUT. Maybe the decision to close and
>> reopen a new TCP connection is left to the application.
>
> Yes, that would be the normal behaviour.
>
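
To make that division of labour concrete, here is a minimal Python
sketch (a simulation, not lwip code) of an application that treats an
ACK timeout as a reason to drop the connection and retry on a new one.
FakeLink, App and publish_qos1 are hypothetical names I made up, and the
ERR_* values are only illustrative. The one protocol detail I rely on is
that a QoS 1 retransmission reuses the packet id and sets the DUP flag.

```python
# Hypothetical names throughout; this does not reproduce any real client API.
ERR_OK = 0
ERR_TIMEOUT = -3  # illustrative value only


class FakeLink:
    """Stands in for one TCP connection to the broker."""

    def __init__(self, acks):
        self.acks = acks      # True if a PUBACK will come back on this link
        self.closed = False

    def publish(self, packet_id, payload, dup):
        # A real client would start a timer and wait for the PUBACK here.
        return ERR_OK if self.acks else ERR_TIMEOUT

    def close(self):
        self.closed = True


class App:
    """The application layer that owns the reconnect decision."""

    def __init__(self, links):
        self.links = iter(links)      # connections to try, in order
        self.link = next(self.links)
        self.log = []                 # (packet_id, dup, err) per attempt

    def publish_qos1(self, packet_id, payload):
        dup = False
        while True:
            err = self.link.publish(packet_id, payload, dup)
            self.log.append((packet_id, dup, err))
            if err == ERR_OK:
                return True
            # Timeout: give up on this connection and open a fresh one.
            self.link.close()
            try:
                self.link = next(self.links)
            except StopIteration:
                return False          # nothing left to try
            dup = True                # same packet id, DUP flag set
```

With a first link that never acks and a second one that does,
publish_qos1() returns True after two attempts, and the second attempt
carries the same packet id with dup=True.
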
>> I don't know if other MQTT clients implement a built-in mechanism
>> that automatically tries to solve the issue of lost ACKs by reopening
>> a TCP connection.
>>
>
> I don't know either. I can only tell you that if you are failing to
> communicate on a TCP/IP connection, then making a new one (possibly
> after a delay) is the normal way to handle things if you want automatic
> recovery.
>
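
For the "possibly after a delay" part, a common choice is exponential
backoff with a cap. A small sketch in Python; reconnect_delays is a
hypothetical helper, and the defaults (1 s base, 60 s cap, 8 attempts)
are my assumptions, not values taken from any MQTT client.

```python
def reconnect_delays(base_s=1.0, cap_s=60.0, attempts=8):
    """Return the wait in seconds before each reconnect attempt.

    The delay doubles after every failed attempt and is clamped to cap_s.
    """
    delays = []
    delay = base_s
    for _ in range(attempts):
        delays.append(min(delay, cap_s))
        delay *= 2
    return delays
```

With the defaults this gives 1, 2, 4, 8, 16, 32, 60, 60 seconds. A real
client would probably add some random jitter so that many devices don't
all reconnect at the same instant after a broker restart.
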
>>
>>>>> I would actually have thought that UDP was a more natural choice
>>>>> for MQTT, rather than TCP - although older versions of MQTT did not
>>>>> have QoS and were therefore reliant on TCP's acknowledgements and
>>>>> retries.
>>>>>
>>>>> (I always think it's a shame that SCTP never caught on - among its
>>>>> many benefits, you don't have this head-of-line blocking issue.)
>>>>
>>>
>>
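
As a side note on head-of-line blocking and my earlier argument, here is
a toy model of in-order delivery on a single TCP connection, in Python.
It numbers whole segments 0, 1, 2... instead of bytes, which is a
simplification, and deliver_in_order is a hypothetical helper. It shows
that a resent MQTT message arriving first is held back until the lost
segment shows up, so the broker sees the original followed by the
duplicate, never the duplicate alone.

```python
def deliver_in_order(arrivals):
    """Model a receiver that hands data to the application strictly in order.

    arrivals: list of (segment_number, data) in the order they arrive.
    Returns one list per arrival: the data released to the application
    at that moment (empty if the receiver is still waiting for a gap).
    """
    expected = 0      # next segment number the application may see
    pending = {}      # out-of-order segments waiting for the gap to fill
    timeline = []
    for seq, data in arrivals:
        if seq >= expected:
            pending.setdefault(seq, data)   # ignore repeated copies
        released = []
        while expected in pending:
            released.append(pending.pop(expected))
            expected += 1
        timeline.append(released)
    return timeline
```

If segment 0 (the original PUBLISH) is lost and segment 1 (the MQTT-level
resend) arrives first, deliver_in_order([(1, "resend"), (0, "original")])
returns [[], ["original", "resend"]]: nothing is delivered until the TCP
retransmission of segment 0 arrives.
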
>> [1] https://github.com/lwip-tcpip/lwip/blob/master/src/apps/mqtt/mqtt.c
>