Windows tcp Rx hanging

Started by Didi September 10, 2009
On Thu, 10 Sep 2009 20:24:40 -0700 (PDT), David Schwartz
<davids@webmaster.com> wrote:

>On Sep 10, 6:42 pm, Dimiter Popoff <d...@tgi-sci.com> wrote:
>> On Sep 11, 4:22 am, David Schwartz <dav...@webmaster.com> wrote:
>>
>> > ... Is this your own TCP implementation or an unusual one?
>>
>> It is mine.
>
>Then I'll almost bet your retransmit algorithm is busted. Post a dump
>of the last few packets before the hang and the hang.
>
>> > You must retransmit *DATA*, not packets and not segments.
>
>> I was thinking about that myself. There is nothing which says
>> I need to change the size of the last segment I have retransmitted;
>> and there is no sane reason to do that.
>
>You *cannot* retransmit segments. That will cause TCP to deadlock. You
>*must* retransmit data.
>
>> In this case, it was a
>> 1460 bytes long segment; the ack was coming for about 1200
>> bytes (the latter figure is imprecise, don't remember the exact one).
>> But what was making it clear it was the peers fault was the
>> fact that it kept repeating that same ack with a window size
>> of 0 for 30 seconds - and the windows system was OK, the
>> ftp server running there would even accept "ABOR" via the
>> control connection, close the hanging data connection and
>> recover.
>
>Let me guess -- you kept sending it the same data it had already ACKed
>and it kept telling you it had already ACKed it. See the problem? You
>are refusing to honor the peer's ACK until it ACKs more data. The peer
>is refusing to ACK more data until you accept what it has already
>ACKed. You are in the wrong.
>
>Again, TCP *does* *not* have any segment retransmission or packet
>retransmission and that WILL NOT work. Really.
>
>> Now that I saw this gets locked up the same way when uploading
>> to that server from another tcp (windows'), things are clear.
>> Filezilla (the server software) gets stuck with its buffer
>> (local connection buffer, I guess) full because of some
>> bug - mind you, this does not occur at 10 MbpS (or did I not
>> hold it long enough... I think I did, though) - and poor thing
>> keeps on acking what it can take (< 1 segment) and reporting
>> a 0 window....
>
>Why do you keep refusing to accept the peer's ACK? Why do you keep
>retransmitting data it has already ACKed, forcing it to ACK it again?
>
>> Anyway, none of my problem. I was eager to make sure I am
>> going out of tcp porting mode without leaving any (known)
>> issues behind, well, that's done.
>
>Nonsense, unless I'm misunderstanding you.
>
>DS
Popeye would say "Ack, ack, ack, ack, ack..." What a bad re-ACK-tion. :-] Every ACK-tion has an equal and opposite re-ACK-tion.
On Sep 11, 6:24 am, David Schwartz <dav...@webmaster.com> wrote:
> ...
> Then I'll almost bet your retransmit algorithm is busted. Post a dump
> of the last few packets before the hang and the hang.
Ah, you want to see some authority stamp. But I am not interested in demonstrating my credentials to you, nor will I enter any dialogue of that kind. Anyway, feel free to bet when the "almost" is gone :-).
> You *cannot* retransmit segments. That will cause TCP to deadlock. You
> *must* retransmit data.
Uhm, you seem to have your own interpretation of the word "segment". In tcp, it means data of a certain size, encapsulated in a standardized way. The maximum segment size is agreed upon during the syn/syn-ack exchange (if not, it is close to 64k, minus the headers etc; in our case, 1460 had been agreed upon for both sides). There is no other known way of tcp data transmission except in a segment by segment manner; perhaps you will be willing to share your invention :-).
> Let me guess -- you kept sending it the same data it had already ACKed
> and it kept telling you it had already ACKed it.

It is not telling me it has acked it. It is telling me it has had a window size of 0 for half a minute.
> See the problem? You
> are refusing to honor the peer's ACK until it ACKs more data. The peer
> is refusing to ACK more data until you accept what it has already
> ACKed. You are in the wrong.

So who (except you) says that? If the peer reports a window size of 0 forever, it is just dead, period.
> Again, TCP *does* *not* have any segment retransmission or packet
> retransmission and that WILL NOT work. Really.

So why does the same peer fail exactly the same way with a different host with a different tcp implementation (I already posted the details about that in previous messages, please consult them before betting)?
> Why do you keep refusing to accept the peer's ACK? Why do you keep
> retransmitting data it has already ACKed, forcing it to ACK it again?
Again, it has not acked that data. Only part of it. Retransmission is an integral part of tcp and the receiver *must* be ready to rewind back and accept retransmitted data it has already acked, at any offset. Further, the latest data it has received has precedence over older data at that offset. But all that has been standardised for decades, consult rfc793 for the details.

Anyway, let me see what you think the transmitter must do in that situation: it sends a 1460 bytes segment, receives an ack for 1200 of them with a window size of 0 (repeat that forever). The obvious meaning is "I am done with that file, cannot take any more but the last 1200 bytes you sent me, please go away". Which I do, after a 30 second timeout. An ftp client under windows I tried does the same, so I am not the only one who thinks this is sane.
> Nonsense, unless I'm misunderstanding you.
So tell that to the implementor of the windows tcp stack. It behaves exactly as mine does in the same situation. I am fine with that - and my receiving side does not get stuck when its buffer size is not a multiple of the segment size, unlike windows. It does get slower when fed data at 100 MbpS and has to "disk allocate and write" (e.g. the file is of unknown size and the preallocated part has been filled up), but it just works. Not so with that ftp server under windows; it appears to have an issue exactly with that. Now whether it is the server's or the tcp implementation's issue I have no idea, perhaps both. None of my problem, as I said.

Dimiter

------------------------------------------------------
Dimiter Popoff Transgalactic Instruments

http://www.tgi-sci.com
------------------------------------------------------
http://www.flickr.com/photos/didi_tgi/sets/72157600228621276/

Original message: http://groups.google.com/group/comp.arch.embedded/msg/bcf570ecf7d9dd6f?dmode=source
On Sep 11, 9:42 am, Dimiter Popoff <d...@tgi-sci.com> wrote:

> Retransmission is an integral part of tcp and the receiver *must*
> be ready to rewind back and accept retransmitted data it has
> already acked, at any offset. Further, the latest data it has received
> has precedence over older data at that offset. But all that
> has been standardised for decades, consult rfc793 for the details.
How is a tcp implementation supposed to rewind into data it may have already delivered to the client application?
On Sep 11, 6:52 pm, Chris Stratton <cs07...@gmail.com> wrote:
> On Sep 11, 9:42 am, Dimiter Popoff <d...@tgi-sci.com> wrote:
>
> > Retransmission is an integral part of tcp and the receiver *must*
> > be ready to rewind back and accept retransmitted data it has
> > already acked, at any offset. Further, the latest data it has received
> > has precedence over older data at that offset. But all that
> > has been standardised for decades, consult rfc793 for the details.
>
> How is a tcp implementation supposed to rewind into data it may have
> already delivered to the client application?
Obviously it is unable to do that. I can say what mine does in that case: it treats any data up to what the application has eaten as "water under the bridge", i.e. newer data cannot take precedence over that; my tcp will ignore (part of) the segment which it receives in duplication of the already passed data and will keep on ack-ing the highest offset it has continuously buffered (it will never rewind its ack position, obviously). Crucially in the context of the failure I saw, it will report the valid window.

There is no way it will selectively ack a part of a segment it has received, this would be illegal. So in our case

- it would open the window, take the 1460 byte segment and provide the difference to the 1200 it has already acked and delivered to the application for application consumption; it would ack the 1460 and ignore the first 1200.

- it would not put itself in that position in the first place because it would never partially ack a segment; the whole 1460 byte segment will remain unacked until it can be successfully buffered (but it will handle any retransmissions it gets, it has no way of knowing which of its acks made it to the peer if it is not sending data in this direction, which is the case here).

Dimiter

------------------------------------------------------
Dimiter Popoff Transgalactic Instruments

http://www.tgi-sci.com
------------------------------------------------------
http://www.flickr.com/photos/didi_tgi/sets/72157600228621276/

Original message: http://groups.google.com/group/comp.arch.embedded/msg/d095a9e4826dfa84?dmode=source
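The receive-side behaviour described in this post (ignore the part of a retransmitted segment that duplicates data already passed on, never move the ack position backwards) can be illustrated with a minimal Python sketch. It is not Dimiter's code: the function name `accept_segment` and its parameters are hypothetical, sequence numbers are plain integers with no wraparound, and out-of-order segments are simply dropped to keep the example short.

```python
def accept_segment(rcv_nxt, seg_seq, payload, window):
    """Return (new_rcv_nxt, accepted_bytes) for one arriving segment.

    rcv_nxt  -- next byte offset the receiver expects (its ack position)
    seg_seq  -- offset of the first payload byte in the segment
    payload  -- bytes carried by the segment
    window   -- receive buffer space currently available, in bytes
    """
    # A gap before the segment: this sketch does not buffer out-of-order data.
    if seg_seq > rcv_nxt:
        return rcv_nxt, b""

    # Skip the leading part that duplicates data already received.
    overlap = rcv_nxt - seg_seq
    fresh = payload[overlap:]

    # Take only as much as the buffer can hold right now.
    fresh = fresh[:window]

    # The ack position only ever moves forward.
    return rcv_nxt + len(fresh), fresh
```

With the numbers from the thread, `accept_segment(1200, 0, bytes(1460), 4096)` keeps the last 260 bytes and moves the ack position to 1460, while the same call with a window of 0 accepts nothing and leaves the ack position at 1200.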
All else notwithstanding, asking for a packet trace is a reasonable request
when trying to diagnose TCP issues.

rick jones
-- 
The glass is neither half-empty nor half-full. The glass has a leak.
The real question is "Can it be patched?"
these opinions are mine, all mine; HP might not want them anyway... :)
feel free to post, OR email to rick.jones2 in hp.com but NOT BOTH...
On Sep 11, 6:42 am, Dimiter Popoff <d...@tgi-sci.com> wrote:

> > You *cannot* retransmit segments. That will cause TCP to deadlock. You
> > *must* retransmit data.
> Uhm, you seem to have your own interpretation of the word "segment".
A segment is a chunk of data encapsulated into a packet.
> In tcp, it means data of a certain size, encapsulated in a
> standardized way.
Right. Again, you *CANNOT* retransmit segments. That will cause TCP to deadlock.
> There is no other known way of tcp data transmission except in
> a segment by segment manner, perhaps you will be willing to share
> your invention :-).
Yes, you retransmit *data* (in a new segment).
> > Let me guess -- you kept sending it the same data it had already ACKed
> > and it kept telling you it had already ACKed it.

> It is not telling it has acked it. It is telling it has a window size
> of 0 for half a minute.
It is telling you it has a window size of 0 because you ignored its ACK. TCP implementations do not have some special way to handle an ignored ACK. They treat it like a dropped ACK.
> > See the problem? You
> > are refusing to honor the peer's ACK until it ACKs more data. The peer
> > is refusing to ACK more data until you accept what it has already
> > ACKed. You are in the wrong.

> So who (except you) says that. If the peer reports a window size of 0
> forever it is just dead, period.
And if a peer keeps ignoring an ACK and retransmitting already-ACKed data, it's just dead, period. The difference is, the peer's behavior is permitted by TCP but your behavior is not.
> > Again, TCP *does* *not* have any segment retransmission or packet
> > retransmission and that WILL NOT work. Really.

> So why does the same peer fail exactly the same way with a
> different host with a different tcp implementation (I already
> posted the details about that in previous messages, please consult
> them before betting).
Hard to say without seeing packet traces.
> > Why do you keep refusing to accept the peer's ACK? Why do you keep
> > retransmitting data it has already ACKed, forcing it to ACK it again?

> Again, it has not acked that data. Only part of them.
I AM TALKING ABOUT THE DATA IT HAS ACKED! That data, the data the peer has ACKed, why do you keep retransmitting it? That forces the peer to:

1) Conclude that its ACK has dropped. (TCP implementations do not distinguish between an "ignored ACK" and a dropped ACK.)

2) Send its ACK again.

The peer is concluding that its packets are not getting through because you are ignoring them. That is why no further progress is being made.
> Retransmission is an integral part of tcp and the receiver *must*
> be ready to rewind back and accept retransmitted data it has
> already acked, at any offset. Further, the latest data it has received
> has precedence over older data at that offset. But all that
> has been standardised for decades, consult rfc793 for the details.
All of that is not the problem. The problem is, and only is, that you are ignoring the ACK.
> Anyway, let me see what you think the transmitter must do in
> that situation: it sends a 1460 bytes segment, receives an ack
> for 1200 of them with a window size of 0 (repeat that
> forever).
It's hard to say without seeing exactly how you got there. But the one thing you must not do is refuse to honor the ACK and re-transmit data you know the peer has already ACKed. Of course, the peer will not barf on the data, but it will conclude the ACK was dropped, preventing it from opening up the window.

DS
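The sender-side rule argued for in this post (advance past whatever the peer has ACKed, even when the ACK lands in the middle of a segment, and retransmit only from that point) might look roughly like this in Python. The `Sender` class and its field names are hypothetical, borrowed loosely from the RFC 793 variable names; the one-byte probe while the window is zero is one common way to handle a closed window and is an assumption here, not something taken from the thread.

```python
from dataclasses import dataclass


@dataclass
class Sender:
    data: bytes        # the whole byte stream to be sent
    snd_una: int = 0   # oldest byte not yet acknowledged
    snd_nxt: int = 0   # next byte that has never been sent
    snd_wnd: int = 0   # window most recently advertised by the peer
    mss: int = 1460    # maximum segment size agreed at connection setup

    def on_ack(self, ack, window):
        # Honor any ACK that moves forward, even one that lands mid-segment.
        if self.snd_una < ack <= self.snd_nxt:
            self.snd_una = ack
        self.snd_wnd = window

    def next_transmission(self):
        """Return (sequence offset, payload) to send when the timer fires."""
        if self.snd_wnd == 0:
            # Closed window: probe with a single byte instead of a full segment.
            size = 1
        else:
            size = min(self.mss, self.snd_wnd)
        # Always start at snd_una, so already-ACKed data is never sent again.
        return self.snd_una, self.data[self.snd_una:self.snd_una + size]
```

In the scenario from the thread, after `on_ack(1200, 0)` the next transmission starts at offset 1200 rather than 0, and it shrinks to a one-byte probe until the peer advertises a non-zero window.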
On Sep 11, 9:20 pm, Rick Jones <rick.jon...@hp.com> wrote:
> All else notwithstanding, asking for a packet trace is a reasonable request
> when trying to diagnose TCP issues.
>
> rick jones
> --
> The glass is neither half-empty nor half-full. The glass has a leak.
> The real question is "Can it be patched?"
> these opinions are mine, all mine; HP might not want them anyway... :)
> feel free to post, OR email to rick.jones2 in hp.com but NOT BOTH...
Sure, here it is:
http://tgi-sci.com/misc/hanglog.txt (full version, 20+M),
http://tgi-sci.com/misc/hanglogb.txt (brief version, 800+k; just beg. and end),
http://tgi-sci.com/misc/hanglog.lg - binary log, one needs DPS to be able to process it into text (pppll hanglog.lg >textlog.txt ).

The packets are clipped to 48 bytes in the log, it takes a while until the failure occurs (this time I was lucky and got it almost immediately, hence the log of clipped packets is only 6M or so).

Dimiter

------------------------------------------------------
Dimiter Popoff Transgalactic Instruments

http://www.tgi-sci.com
------------------------------------------------------
http://www.flickr.com/photos/didi_tgi/sets/72157600228621276/
In comp.protocols.tcp-ip Dimiter Popoff <dp@tgi-sci.com> wrote:
> Sure, here it is:
> http://tgi-sci.com/misc/hanglog.txt (full version, 20+M),
> http://tgi-sci.com/misc/hanglogb.txt (brief version, 800+k; just beg.
> and end),
> http://tgi-sci.com/misc/hanglog.lg - binary log, one needs DPS to be
> able to process it into text (pppll hanglog.lg >textlog.txt ).
Now I remember why I like tcpdump so much :)

rick jones
-- 
the road to hell is paved with business decisions...
these opinions are mine, all mine; HP might not want them anyway... :)
feel free to post, OR email to rick.jones2 in hp.com but NOT BOTH...
In comp.protocols.tcp-ip Dimiter Popoff <dp@tgi-sci.com> wrote:
> The packets are clipped to 48 bytes in the log
Looks like they include the Ethernet header, so that means you are missing some of the TCP header, no? 14 bytes of Ethernet header + 20 bytes of IPv4 header + 20 or more bytes of TCP header (depending on options present) would be 54+ bytes, or 6 bytes more than where you say the packets are clipped. If I am recalling my headers correctly, that would suggest it would be missing the window, checksum and urgent pointer fields of the TCP headers and all the options (if present).

rick jones
-- 
a wide gulf separates "what if" from "if only"
these opinions are mine, all mine; HP might not want them anyway... :)
feel free to post, OR email to rick.jones2 in hp.com but NOT BOTH...
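The header arithmetic in this post can be checked mechanically. The Python sketch below assumes an untagged Ethernet II frame and an IPv4 header without options (both are assumptions, as is the helper name `missing_tcp_fields`) and lists which fixed TCP header fields fall beyond a given clip length. Field offsets are those of the RFC 793 header layout.

```python
ETH_HDR = 14  # Ethernet II header, no VLAN tag (assumption)
IP_HDR = 20   # IPv4 header without options (assumption)

# (field name, offset within the TCP header, length in bytes)
TCP_FIELDS = [
    ("source port", 0, 2),
    ("destination port", 2, 2),
    ("sequence number", 4, 4),
    ("acknowledgment number", 8, 4),
    ("data offset and flags", 12, 2),
    ("window", 14, 2),
    ("checksum", 16, 2),
    ("urgent pointer", 18, 2),
]


def missing_tcp_fields(snaplen):
    """Names of fixed TCP header fields not fully inside the first snaplen bytes."""
    tcp_start = ETH_HDR + IP_HDR
    return [
        name
        for name, offset, size in TCP_FIELDS
        if tcp_start + offset + size > snaplen
    ]
```

For a clip length of 48 this returns the window, checksum and urgent pointer, which matches the estimate above; for 54 or more the fixed header is complete and only options can be cut off.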
On Sep 12, 12:08 am, Rick Jones <rick.jon...@hp.com> wrote:
> In comp.protocols.tcp-ip Dimiter Popoff <d...@tgi-sci.com> wrote:
>
> > The packets are clipped to 48 bytes in the log
>
> Looks like they include the Ethernet header, so that means you are
> missing some of the TCP header no?
Hey, I may be insane but I am not impractical, you know :-) . But I clip at 64, not 48; I am tired enough not to be able to count to 4, it seems, LOL. 4 lines, 16 bytes each.... All is there, of course: the tcp header starts at offset $22 (the flags being the last byte on this line, the first two bytes on the lowest line are the tcp window; somewhere at the start you can see the scaling option being set to 3 shifts). Or look at the first few pages, you'll see the beginning of the text exchanges over the control connection (even part of the password - which is old enough not to worry about, more than 10 years, still in use only at don't care locations :-) ).
Cheers,

Dimiter

------------------------------------------------------
Dimiter Popoff Transgalactic Instruments

http://www.tgi-sci.com
------------------------------------------------------
http://www.flickr.com/photos/didi_tgi/sets/72157600228621276/

Original message: http://groups.google.com/group/comp.arch.embedded/msg/9dc565e81e458da7?dmode=source
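For anyone reading the clipped hex dumps by hand, here is a small Python sketch of pulling the flags byte and the window out of a 64-byte frame at the offsets mentioned in this post. It assumes an untagged Ethernet frame and no IP options, so that the tcp header starts at $22; the helper name `tcp_flags_and_window` is made up, and applying a shift of 3 to a given direction's window is an assumption (the scale factor is negotiated per direction and does not apply to the window field of SYN segments).

```python
import struct

TCP_OFFSET = 0x22  # 14-byte Ethernet header + 20-byte IPv4 header (assumption)


def tcp_flags_and_window(frame, shift=3):
    """Return (flags byte, raw window, scaled window) from a captured frame."""
    # Flags are byte 13 of the TCP header: offset $2F, last byte of the third
    # 16-byte dump line.
    flags = frame[TCP_OFFSET + 13]
    # The window is the big-endian 16-bit field at bytes 14-15: offset $30,
    # the first two bytes of the fourth dump line.
    (raw_window,) = struct.unpack_from("!H", frame, TCP_OFFSET + 14)
    return flags, raw_window, raw_window << shift
```

A frame whose bytes at $30 and $31 are both zero decodes to a window of 0 regardless of the shift, which is the condition the whole thread is about.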