Windows tcp Rx hanging

Reply by Son of a Sea Cook ● September 11, 2009

On Fri, 11 Sep 2009 08:52:04 -0700 (PDT), Chris Stratton
<cs07024@gmail.com> wrote:

>On Sep 11, 9:42 am, Dimiter Popoff <d...@tgi-sci.com> wrote:
>
>> Retransmission is an integral part of tcp and the receiver *must*
>> be ready to rewind back and accept retransmitted data it has
>> already acked, at any offset. Further, the latest data it has
>> received has precedence over older data at that offset. But all that
>> has been standardised for decades, consult rfc793 for the details.
>
>How is a tcp implementation supposed to rewind into data it may have
>already delivered to the client application?

  What are those?  "Oooops packets"?  Severe FEC.  :-)

Reply by Proteus IIV ● September 12, 2009

On Sep 11, 3:24 am, FatBytestard
<FatBytest...@somewheronyourharddrive.org> wrote:
> On Thu, 10 Sep 2009 13:07:05 -0700 (PDT), Dimiter Popoff <d...@tgi-sci.com>
> wrote:
>
> >On Sep 10, 11:25 pm, "Paul Hovnanian P.E." <p...@hovnanian.com> wrote:
> >> You might have to adjust your system's MTU, MSS, or other networking
> >> parameters. I'm not certain how to do this in Windows, or why you would
> >> be having problems on a 100 Mbps Ethernet link.
>
> >I did play with them - and they are nothing special anyway (e.g. 1460
> >bytes tcp segment size etc.), things are pretty much "normal".
>
> > And I must say it only happens with the ftp server, so it may be its
> >fault after all (gets clogged at some point and stays stuck with the
> >socket buffer being full, hence the constant 0 window - while the rest
> >of the network is OK and other tcp connections keep on working).
>
> > I'll try to locate and install some other ftp server and see what
> >happens.
>
> >Dimiter
>
>   There are also some settings on the card itself usually (the device).
>
>   In the device settings dialogs, there is a page where hard 100Mb/s is
> selected or "auto", which is what the whole world typically uses.  Hardly
> anyone hard sets a link to 100Mb/s. Auto negotiation has usually been the
> rule.  Anyway, there are quite a few other settings there as well.
>
>   These are not registry settings; they are for the card itself in Device
> Manager (there are also other ways to get to this device dialog).
>
> Try examining that. Perhaps you hard set it to 100Mb/s, which was "auto".
> If so, that is one of your mistakes.

YOUR MISTAKE IS THINKING ANYONE IN USENET ACTUALLY BELIEVES OR NEEDS
YOUR ADVICE
YOU FLIPPING COX.NET TROLL

BEGONE !

I AM PROTEUS

Reply by Proteus IIV ● September 12, 2009

On Sep 11, 3:30 am, SoothSayer <SaySo...@TheMonastery.org> wrote:
> On Thu, 10 Sep 2009 20:24:40 -0700 (PDT), David Schwartz
> <dav...@webmaster.com> wrote:
> >On Sep 10, 6:42 pm, Dimiter Popoff <d...@tgi-sci.com> wrote:
> >> On Sep 11, 4:22 am, David Schwartz <dav...@webmaster.com> wrote:
>
> >> > ... Is this your own TCP implementation or an unusual one?
>
> >> It is mine.
>
> >Then I'll almost bet your retransmit algorithm is busted. Post a dump
> >of the last few packets before the hang and the hang.
>
> >> > You must retransmit *DATA*, not packets and not segments.
>
> >> I was thinking about that myself. There is nothing which says
> >> I need to change the size of the last segment I have retransmitted;
> >> and there is no sane reason to do that.
>
> >You *cannot* retransmit segments. That will cause TCP to deadlock. You
> >*must* retransmit data.
>
> >> In this case, it was a
> >> 1460 bytes long segment; the ack was coming for about 1200
> >> bytes (the latter figure is imprecise, don't remember the exact one).
> >> But what was making it clear it was the peer's fault was the
> >> fact that it kept repeating that same ack with a window size
> >> of 0 for 30 seconds - and the windows system was OK, the
> >> ftp server running there would even accept "ABOR" via the
> >> control connection, close the hanging data connection and
> >> recover.
>
> >Let me guess -- you kept sending it the same data it had already ACKed
> >and it kept telling you it had already ACKed it. See the problem? You
> >are refusing to honor the peer's ACK until it ACKs more data. The peer
> >is refusing to ACK more data until you accept what it has already
> >ACKed. You are in the wrong.
>
> >Again, TCP *does* *not* have any segment retransmission or packet
> >retransmission and that WILL NOT work. Really.
>
> >> Now that I saw this gets locked up the same way when uploading
> >> to that server from another tcp (windows'), things are clear.
> >> Filezilla (the server software) gets stuck with its buffer
> >> (local connection buffer, I guess) full because of some
> >> bug - mind you, this does not occur at 10 MbpS (or did I not
> >> hold it long enough... I think I did, though) - and poor thing
> >> keeps on acking what it can take (< 1 segment) and reporting
> >> a 0 window....
>
> >Why do you keep refusing to accept the peer's ACK? Why do you keep
> >retransmitting data it has already ACKed, forcing it to ACK it again?
>
> >> Anyway, none of my problem. I was eager to make sure I am
> >> going out of tcp porting mode without leaving any (known)
> >> issues behind, well, that's done.
>
> >Nonsense, unless I'm misunderstanding you.
>
> >DS
>
>   Popeye would say "Ack, ack, ack, ack, ack..."
>
>   What a bad re-ACK-tion.  :-] every ACK-tion has an equal and opposite
> re-ACK-tion.

AND TAKE ALL YOUR TROLLOPING COX.NET BUDDIES WITH YOU

I AM PROTEUS

Reply by Nobody ● September 13, 2009

On Fri, 11 Sep 2009 06:42:49 -0700, Dimiter Popoff wrote:

>> Let me guess -- you kept sending it the same data it had already ACKed
>> and it kept telling you it had already ACKed it.
> 
> It is not telling it has acked it. It is telling it has a window size
> of 0 for half a minute.
> 
>> See the problem? You
>> are refusing to honor the peer's ACK until it ACKs more data. The peer
>> is refusing to ACK more data until you accept what it has already
>> ACKed. You are in the wrong.
> 
> So who (except you) says that. If the peer reports a window size of 0
> forever it is just dead, period.

There's a big difference between "half a minute" and "forever".

If the receiving client application doesn't consume received data (e.g.
because it is suspended, or in a blocking wait on user input), the receive
buffer will fill up, resulting in the kernel's TCP implementation
reporting a window size of zero. It will continue to report a window size
of zero so long as the receive buffer is full (i.e. until the application
eventually retrieves data from the buffer).
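
The buffer behaviour described above can be sketched with a toy model. This is only an illustration under stated assumptions, not how any particular kernel works: the `ReceiveBuffer` class, its method names, and the 4096-byte capacity are all hypothetical, and the model assumes the stack accepts whatever part of a segment still fits.

```python
class ReceiveBuffer:
    """Toy model of a TCP receive buffer (hypothetical, not a real stack)."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.buffered = 0  # bytes received but not yet read by the application

    @property
    def window(self) -> int:
        """Advertised receive window: free space left in the buffer."""
        return self.capacity - self.buffered

    def deliver(self, nbytes: int) -> int:
        """Network side: accept as many bytes as fit, return the count accepted."""
        accepted = min(nbytes, self.window)
        self.buffered += accepted
        return accepted

    def app_read(self, nbytes: int) -> int:
        """Application side: consume up to nbytes, which reopens the window."""
        consumed = min(nbytes, self.buffered)
        self.buffered -= consumed
        return consumed


buf = ReceiveBuffer(capacity=4096)  # assumed size, chosen for illustration

# The application is stalled and reads nothing while three segments arrive.
for _ in range(3):
    accepted = buf.deliver(1460)
    print(accepted, buf.window)
# prints:
# 1460 2636
# 1460 1176
# 1176 0

# Once the application reads again, the window reopens.
buf.app_read(2000)
print(buf.window)  # 2000
```

With these assumed numbers the third segment is only partly accepted and the window drops to zero, which resembles the trace discussed in this thread; whether the real peer behaved this way cannot be told from the model.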

Unless the sending application implements its own timeout, the sender will
continue to probe the receive window indefinitely. The kernel won't time
out the connection so long as the receiver continues to send ACKs, even if
the ACKs report a zero window size.

>> Why do you keep refusing to accept the peer's ACK? Why do you keep
>> retransmitting data it has already ACKed, forcing it to ACK it again?
> 
> Again, it has not acked that data. Only part of it.
> Retransmission is an integral part of tcp and the receiver *must*
> be ready to rewind back and accept retransmitted data it has
> already acked, at any offset. Further, the latest data it has
> received has precedence over older data at that offset. But all that
> has been standardised for decades, consult rfc793 for the details.

The RFCs say nothing about the case where you get two conflicting versions
of a particular byte.

In many cases, the behaviour which you suggest (allow the newer version to
override the older version) is impossible, as the older version will
already have been passed up to the application.

In order for that behaviour to even be possible, the older data must still
exist in the kernel's receive window, either because the application
simply hasn't retrieved it, or because preceding data is missing.

> Anyway, let me see what you think the transmitter must do in
> that situation: it sends a 1460 bytes segment, receives an ack
> for 1200 of them with a window size of 0 (repeat that
> forever).

If the receiver reports a window size of zero, the sender should
continue to send window probes containing a single byte of data. Such
probes should have an exponential back-off capped at 60 seconds, and
should be sent until the socket is closed, either by the application or by
the protocol (i.e. either a RST or if the receiver stops ACKing the probes
for an extended period; 9 minutes is typical).
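
A minimal sketch of such a probe schedule follows. The 60-second cap comes from the paragraph above; the 1-second initial interval, the doubling factor, and the function name `probe_intervals` are assumptions made for illustration, since real stacks derive the first interval from their retransmission timer.

```python
def probe_intervals(initial: float = 1.0, cap: float = 60.0, count: int = 8) -> list[float]:
    """Return the waits (in seconds) before each of `count` zero-window probes.

    Each wait doubles the previous one until it reaches `cap`.
    `initial` is an assumed starting value, not a figure from any standard.
    """
    intervals = []
    delay = initial
    for _ in range(count):
        intervals.append(min(delay, cap))
        delay *= 2
    return intervals


print(probe_intervals())
# [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 60.0, 60.0]
```

Once the cap is reached the side with data to send keeps probing at that interval for as long as the probes are ACKed.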

> The obvious meaning is "I am done with that file, cannot
> take any more but the last 1200 bytes you sent me, please
> go away".

The correct meaning is "I'm busy, and cannot consume the data right now;
please hold".

There are any number of reasons why reception may be suspended
temporarily, e.g. the user suspended the application, or the application
is blocked waiting for a new tape to be inserted, etc. Note: "temporarily"
could easily mean "until Monday morning".

An application doesn't have to do anything special to wait indefinitely
for its data to be consumed; rather, it has to explicitly set a time-out
if it doesn't want to wait forever.
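
As a rough illustration of an application setting its own time-out, the sketch below uses Python's standard `socket` module. The helper name `send_with_timeout`, the 0.2-second limit, and the 16 MB payload are arbitrary choices for the demonstration, and it uses a local `socketpair()` whose far end never reads, rather than a real TCP connection to an FTP server.

```python
import socket


def send_with_timeout(sock: socket.socket, data: bytes, timeout_s: float) -> bool:
    """Try to send all of `data`; return False if it takes longer than timeout_s.

    Without settimeout(), sendall() on a blocking socket simply waits
    for as long as the peer leaves its buffer full.
    """
    sock.settimeout(timeout_s)
    try:
        sock.sendall(data)
        return True
    except socket.timeout:
        return False


# The peer end `b` never reads, so the buffers fill up and the sender stalls.
a, b = socket.socketpair()
ok = send_with_timeout(a, bytes(16 * 1024 * 1024), 0.2)
print(ok)
a.close()
b.close()
```

The design point is that the time-out is a policy of the sending application; the transport underneath would keep waiting.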

> Which I do, after a 30 seconds timeout.

That may be reasonable (if somewhat aggressive) behaviour for an
application, but the TCP stack shouldn't time-out the connection so long
as the window probes continue to be ACKed.

Reply by Dimiter Popoff ● September 13, 2009

On Sep 13, 10:22 am, Nobody <nob...@nowhere.com> wrote:
> ...
> There's a big difference between "half a minute" and "forever".

Well yes, of course, but on a 100 MbpS link it feels like
"forever". After having transferred hundreds of megabytes at
the speed it will sustain (approx. 8 megabytes/s; the buses of the
smaller system (mine) are the limiting factor), things stop
and it is obvious there will be no recovery - the hanging
application is not struggling, it reacts immediately to my
"ABOR" over the control connection.

> > .... Further, the latest data it has
> > received has precedence over older data at that offset. But all that
> > has been standardised for decades, consult rfc793 for the details.
>
> The RFCs say nothing about the case where you get two conflicting versions
> of a particular byte.

Page 69 of rfc793 says something about that.

> In many cases, the behaviour which you suggest (allow the newer version to
> override the older version) is impossible, as the older version will
> already have been passed up to the application.

Clearly so; but since the receiving side has no control over the
segment size except over setting its maximum, it is wise to take
the latest data as the sender may choose to resize the retransmitted
segments (which if mixed with old, differently sized/overlapping, will
be quite a mess to dig through). What I do is to take the new segments,
discard the old ones, and ignore the beginning of the first segment
of the "new" series which overlaps an old which has been consumed
by the application. I already forgot how I decide that the peer
has begun to retransmit it all (and not just retransmitting one
segment, this can be successfully caught both ways without
discarding the rest), but it works fine.

> > Anyway, let me see what you think the transmitter must do in
> > that situation: it sends a 1460 bytes segment, receives an ack
> > for 1200 of them with a window size of 0 (repeat that
> > forever).
>
> If the receiver reports a window size of zero, the sender should
> continue to send window probes containing a single byte of data.
> ....

I know, and I know where that comes from.
The clear error of the tcp implementation at the windows' side is
the fact that it chooses to take part of a segment; the sending side
(mine) sees too late that the tcp window is too small and sends
a segment which the receiver cannot accept (no matter how low my
system latencies are, at 12-13 uS/segment this can still happen).
Instead of discarding the segment, the receiver acks *part* of it;
this is illegal (rfc793, page 69: "If the RCV.WND is zero, no segments
will be acceptable, but special allowance should be made to accept
valid ACKs, URGs and RSTs". The table at that page is also quite
explicit about that).
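
For reference, the four-case acceptability table on that page of RFC 793 can be written out as a short function. This is a sketch for discussion, not code from either stack in this thread; the function name and the use of modulo-2**32 arithmetic to handle sequence-number wraparound are choices made here.

```python
MOD = 2 ** 32  # TCP sequence numbers wrap at 32 bits


def segment_acceptable(seg_seq: int, seg_len: int, rcv_nxt: int, rcv_wnd: int) -> bool:
    """RFC 793 segment acceptability test (the four-row table)."""

    def in_window(seq: int) -> bool:
        # True when RCV.NXT =< seq < RCV.NXT + RCV.WND, modulo 2**32.
        return (seq - rcv_nxt) % MOD < rcv_wnd

    if seg_len == 0:
        if rcv_wnd == 0:
            return seg_seq == rcv_nxt
        return in_window(seg_seq)
    if rcv_wnd == 0:
        return False  # data segment against a zero window
    # Acceptable if the first OR the last byte falls inside the window.
    return in_window(seg_seq) or in_window((seg_seq + seg_len - 1) % MOD)


print(segment_acceptable(1000, 1460, 1000, 0))     # False
print(segment_acceptable(1000, 1460, 1000, 1200))  # True
print(segment_acceptable(1000, 0, 1000, 0))        # True
```

Note that the test is applied against the receiver's window at the moment the segment arrives, and that with a non-zero window a segment that only partly fits still passes, because either its first or its last byte lies inside the window.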

Since I do not see that behaviour too often from the windows'
side, and never at lower speeds, I believe it is safe to say
they have some issue, shared with the application (the latter
fails to process the data, the tcp will accept the wrong
offset).

> > The obvious meaning is "I am done with that file, cannot
> > take any more but the last 1200 bytes you sent me, please
> > go away".
>
> The correct meaning is "I'm busy, and cannot consume the data right now;
> please hold".

The 0 window does mean that indeed, the 1200 ack is nonsense; and I
agree that a 30 second timeout may be somewhat aggressive but at 100 MbpS
it seems an eternity (or is it an ethernity :-) . It is my tcp sending
action which times out, but the application can set, on a per connection
basis, whether there is a timeout and how long it is; in this case
it is 30 seconds (I believe this was the default setting, also
settable).

Dimiter

------------------------------------------------------
Dimiter Popoff               Transgalactic Instruments

http://www.tgi-sci.com
------------------------------------------------------
http://www.flickr.com/photos/didi_tgi/sets/72157600228621276/

Original message: http://groups.google.com/group/comp.arch.embedded/msg/b3afab088ea95c22?dmode=source

Reply by Rocky ● September 13, 2009

On Sep 11, 3:42 pm, Dimiter Popoff <d...@tgi-sci.com> wrote:
> Anyway, let me see what you think the transmitter must do in
> that situation: it sends a 1460 bytes segment, receives an ack
> for 1200 of them with a window size of 0 (repeat that
> forever).

Just for interest, what was the window size on the previous ACK?
The receiver should always be able to handle its advertised window
size.

Reply by Dimiter Popoff ● September 13, 2009

On Sep 13, 2:05 pm, Rocky <robertg...@gmail.com> wrote:
> On Sep 11, 3:42 pm, Dimiter Popoff <d...@tgi-sci.com> wrote:
>
> > Anyway, let me see what you think the transmitter must do in
> > that situation: it sends a 1460 bytes segment, receives an ack
> > for 1200 of them with a window size of 0 (repeat that
> > forever).
>
> Just for interest, what was the window size on the previous ACK?
> The receiver should always be able to handle its advertised window
> size.

I believe it was the right one, i.e. it takes exactly as many
bytes as last advertised and acks as many (which is sheer nonsense,
partial segment ack); then it fails to handle the case of being
overflown - sometimes.

Dimiter

Reply by Rocky ● September 13, 2009

On Sep 13, 1:44 pm, Dimiter Popoff <d...@tgi-sci.com> wrote:
> On Sep 13, 2:05 pm, Rocky <robertg...@gmail.com> wrote:
>
> > On Sep 11, 3:42 pm, Dimiter Popoff <d...@tgi-sci.com> wrote:
>
> > > Anyway, let me see what you think the transmitter must do in
> > > that situation: it sends a 1460 bytes segment, receives an ack
> > > for 1200 of them with a window size of 0 (repeat that
> > > forever).
>
> > Just for interest, what was the window size on the previous ACK?
> > The receiver should always be able to handle its advertised window
> > size.
>
> I believe it was the right one, i.e. it takes exactly as many
> bytes as last advertised and acks as many (which is sheer nonsense,
> partial segment ack); then it fails to handle the case of being
> overflown - sometimes.
>
What I was getting at is that unless fragmentation took place - which
on a direct connection is somewhat unlikely :) - then the window of
the receiver should have been greater than or equal to about 1400 if a
1460 segment was sent. If it was >= 1400, but only 1200 bytes got
acked, then my view would be that the TCP engine of the receiver is
possibly broken.
If however the receiver window was only 1200 bytes and your transmitter
sent about 1400, then the transmitter is broken.

I have approximated 1400 bytes for the window, because of the TCP
overhead.

Reply by Dimiter Popoff ● September 13, 2009

On Sep 13, 6:32 pm, Rocky <robertg...@gmail.com> wrote:
> ...
> If however the receiver window was only 1200 bytes and your transmitter
> sent about 1400, then the transmitter is broken.

Mine does send at times > the actual window size - this cannot
be avoided, 12 uS per segment is negligible compared to
overall RTT, internal system latencies (both sides) etc.
There are a lot of unacked segments in transit, sometimes
the smaller window information is seen too late.
The receiving side must be prepared to receive such a segment,
discard it and stay alive (and open the window at some point).
When I get conservative enough to guarantee I will never ever
overflow the foreign window, the overall speed drops about 3-4
times; not a good deal. It is normal to have such an overflow
every now and then (a few seconds, perhaps tens of seconds apart),
recover and maintain maximum link speed. My side keeps on receiving
that all the time and handles it without any hiccups.

> I have approximated 1400 bytes for the window, because of the TCP
> overhead.

Well, not that it matters in this context but the segment size
(minus overhead) is 1460 bytes, $5b4.
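
The 1460 figure is the usual Ethernet arithmetic, worked through below as a quick check. It assumes a standard 1500-byte MTU and 20-byte IPv4 and TCP headers with no options; none of those numbers are stated in the thread itself.

```python
MTU = 1500        # assumed standard Ethernet payload size, in bytes
IP_HEADER = 20    # IPv4 header without options
TCP_HEADER = 20   # TCP header without options

mss = MTU - IP_HEADER - TCP_HEADER
print(mss, hex(mss))  # 1460 0x5b4
```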

Dimiter

------------------------------------------------------
Dimiter Popoff               Transgalactic Instruments

http://www.tgi-sci.com
------------------------------------------------------
http://www.flickr.com/photos/didi_tgi/sets/72157600228621276/

Original message: http://groups.google.com/group/comp.arch.embedded/msg/73eb2e5db82b175d?dmode=source

Reply by David Schwartz ● September 13, 2009

On Sep 13, 3:40 am, Dimiter Popoff <d...@tgi-sci.com> wrote:

> Well yes, of course, but on a 100 MbpS link it feels like
> "forever". After having transferred hundreds of megabytes at
> the speed it will sustain (approx. 8 megabytes/s; the buses of the
> smaller system (mine) are the limiting factor), things stop
> and it is obvious there will be no recovery - the hanging
> application is not struggling, it reacts immediately to my
> "ABOR" over the control connection.

The other side keeps trying to get you to accept its ACK until its
calculated RTT gets high. Once that happens, the recovery is going to
be slow.

> > If the receiver reports a window size of zero, the sender should
> > continue to send window probes containing a single byte of data.

> I know, and I know where that comes from.
> The clear error of the tcp implementation at the windows' side is
> the fact that it chooses to take part of a segment; the sending side
> (mine) sees too late that the tcp window is too small and sends
> a segment which the receiver cannot accept (no matter how low my
> system latencies are, at 12-13 uS/segment this can still happen).
> Instead of discarding the segment, the receiver acks *part* of it;
> this is illegal (rfc793, page 69: "If the RCV.WND is zero, no segments
> will be acceptable, but special allowance should be made to accept
> valid ACKs, URGs and RSTs". The table at that page is also quite
> explicit about that).

There is nothing wrong with taking part of a segment. And you cannot
rely on a zero window size advertised meaning there will still be a
zero window size when the packet is received.
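
To make "taking part of a segment" concrete, here is a small sketch of a receiver trimming an incoming segment to the bytes that fall inside its current window. It is an illustration only: the function name is invented, sequence-number wraparound is ignored for brevity, and the numbers are the approximate ones quoted in this thread.

```python
def trim_to_window(seg_seq: int, payload: bytes, rcv_nxt: int, rcv_wnd: int) -> tuple[int, bytes]:
    """Return (sequence number, data) for the part of the segment inside
    [rcv_nxt, rcv_nxt + rcv_wnd). Wraparound is ignored in this sketch."""
    start = max(seg_seq, rcv_nxt)                         # skip bytes already received
    end = min(seg_seq + len(payload), rcv_nxt + rcv_wnd)  # drop bytes past the window
    if end <= start:
        return rcv_nxt, b""
    return start, payload[start - seg_seq:end - seg_seq]


segment = bytes(1460)

# A 1460-byte segment arrives while only 1200 bytes of window remain.
seq, data = trim_to_window(1000, segment, rcv_nxt=1000, rcv_wnd=1200)
print(seq, len(data))  # 1000 1200

# The same segment retransmitted whole after the window reopens by 500 bytes:
# only the 260 bytes that were never accepted are new.
seq, data = trim_to_window(1000, segment, rcv_nxt=2200, rcv_wnd=500)
print(seq, len(data))  # 2200 260
```

In the first case the receiver would ACK 2200 with a window of 0. A sender that honours that ACK resumes from byte 2200 rather than from the start of the original segment.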

> > The correct meaning is "I'm busy, and cannot consume the data right now;
> > please hold".

> The 0 window does mean that indeed, the 1200 ack is nonsense;

How do you figure that? Why do you think the window size invalidates
the ACK?

And what should the other side infer from the fact that you keep
ignoring the ACK other than that it dropped?

DS