Windows tcp Rx hanging| page 4

Reply by Dimiter Popoff ● September 13, 2009

On Sep 14, 3:33 am, David Schwartz <dav...@webmaster.com> wrote:
> ...
> And what should the other side infer from the fact that you keep
> ignoring the ACK other than that it dropped?

Please consult the paragraph past line 4264 of rfc793, it explains
that.

Thanks for your comments,

Dimiter

Reply by Nobody ● September 14, 2009

On Sun, 13 Sep 2009 03:40:32 -0700, Dimiter Popoff wrote:

>> > .... Further, the latest data it has received
>> > has precedence over older data at that offset. But all that
>> > has been standardised for decades, consult rfc793 for the details.
>>
>> The RFCs say nothing about the case where you get two conflicting versions
>> of a particular byte.
> 
> Page 69 of rfc793 says something about that.

I presume that you are referring to:

	If a segment's contents straddle the boundary between old and new, only
	the new parts should be processed.

Which implies that the old version should take precedence, regardless of
whether it has been passed up to the application.

Although it shouldn't make any difference (the sender shouldn't be sending
conflicting data for a given range of sequence numbers), this has been
suggested as a possible attack vector, to "smuggle" malicious data past
a security scanner built into a router.
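
The "only the new parts should be processed" rule can be sketched as follows. This is a minimal illustration, not code from any stack discussed in this thread; the names `seq_diff` and `trim_to_new` are hypothetical, and `rcv_nxt` stands for the RCV.NXT variable of rfc793.

```python
MOD = 1 << 32  # TCP sequence numbers wrap at 2**32


def seq_diff(a, b):
    """Signed distance a - b in sequence space, correct across wraparound."""
    return ((a - b + (1 << 31)) % MOD) - (1 << 31)


def trim_to_new(seg_seq, payload, rcv_nxt):
    """Return (seq, data) holding only the bytes at or past rcv_nxt.

    Bytes below rcv_nxt were already received, so the earlier copy wins
    and the overlapping front of this segment is dropped.
    """
    overlap = seq_diff(rcv_nxt, seg_seq)
    if overlap <= 0:
        return seg_seq, payload      # nothing old in this segment
    if overlap >= len(payload):
        return rcv_nxt, b""          # pure duplicate, nothing new
    return rcv_nxt, payload[overlap:]


# A segment starting at 1000 when 1004 bytes' worth has already arrived:
print(trim_to_new(1000, b"abcdef", 1004))
```

Because the overlapping bytes are discarded before they are compared with anything, a conflicting retransmission never replaces data that is already in the receive queue.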

>> > Anyway, let me see what you think the transmitter must do in
>> > that situation: it sends a 1460 bytes segment, receives an ack
>> > for 1200 of them with a window size of 0 (repeat that
>> > forever).
>>
>> If the receiver reports a window size of zero, the sender should
>> continue to send window probes containing a single byte of data.
>> ....
> 
> I know, and I know where that comes from.
> The clear error of the tcp implementation on the windows side is
> the fact that it chooses to take part of a segment; the sending side
> (mine) sees too late that the tcp window is too small and sends
> a segment which the receiver cannot accept (no matter how low my
> system latencies are, at 12-13 uS/segment this can still happen).
> Instead of discarding the segment, the receiver acks *part* of it;
> this is illegal (rfc793, page 69: "If the RCV.WND is zero, no segments
> will be acceptable, but special allowance should be made to accept
> valid ACKs, URGs and RSTs". The table at that page is also quite
> explicit about that).

Note that it's talking about the receive window within the receiver's TCP
stack, not the last advertised window. It's possible that the window has
just opened but this fact hasn't yet been announced. If the receiver
responds to a probe with a zero window size, then ACKs some data from the
next packet, it isn't necessarily violating the rules.
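
For reference, the four-case acceptability table on that page of rfc793 can be written out as a small function. This is a sketch under the assumption that `rcv_wnd` is the receiver's current internal RCV.WND (the point made above), not the window it last advertised; the function names are illustrative.

```python
MOD = 1 << 32  # sequence numbers are compared modulo 2**32


def in_window(seq, rcv_nxt, rcv_wnd):
    """True if rcv_nxt <= seq < rcv_nxt + rcv_wnd in sequence space."""
    return (seq - rcv_nxt) % MOD < rcv_wnd


def segment_acceptable(seg_seq, seg_len, rcv_nxt, rcv_wnd):
    """Segment acceptability test following the rfc793 table."""
    if seg_len == 0:
        if rcv_wnd == 0:
            return seg_seq == rcv_nxt
        return in_window(seg_seq, rcv_nxt, rcv_wnd)
    if rcv_wnd == 0:
        return False                 # data segment, closed window
    last = (seg_seq + seg_len - 1) % MOD
    # Acceptable if either the first or the last byte falls in the window.
    return (in_window(seg_seq, rcv_nxt, rcv_wnd)
            or in_window(last, rcv_nxt, rcv_wnd))


# 1460-byte segment against a closed window, then against a 1200-byte one.
print(segment_acceptable(1000, 1460, 1000, 0))
print(segment_acceptable(1000, 1460, 1000, 1200))
```

With a zero internal window the 1460-byte segment fails the test, but if the window has reopened to 1200 bytes by the time the segment arrives, the segment is acceptable even though only part of it fits.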

>> > The obvious meaning is "I am done with that file, cannot
>> > take any more but the last 1200 bytes you sent me, please
>> > go away".
>>
>> The correct meaning is "I'm busy, and cannot consume the data right now;
>> please hold".
> 
> The 0 window does mean that indeed, the 1200 ack is nonsense; and I
> agree that 30 seconds timeout may be somewhat aggressive but at 100 MbpS
> it seems an eternity (or is it an ethernity :-) . It is my tcp sending
> action which times out, but the application can set if there is one
> and how long it must be on a per connection basis, in this case
> it is 30 seconds (I believe this was the default setting, also
> settable).

That's an API issue. The BSD sockets API offers the SO_SNDTIMEO socket
option (for all socket families). This specifies how long a send/write/etc
call can block for; if the timeout is exceeded, the call will return a
short count (or -1 with errno set to EAGAIN), but the socket remains
valid for further operations (i.e. it doesn't terminate the connection).

One thing I'm not entirely clear on is whether the result reflects:

1. the data actually acknowledged by the receiver,
2. the data actually sent (and scheduled for retransmission until
acknowledgement), or
3. the amount of data copied into the kernel's transmit buffer.

I assume that it would be either 2 or 3; data sent but not acknowledged
may have already been received and passed up to the application, so it
cannot be "rescinded". If it's 2, the kernel can just discard any data
which hasn't been sent yet.
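
A minimal sketch of the SO_SNDTIMEO behaviour described above, using Python's socket module on a local socket pair. The helper names are hypothetical, and packing the option value as two native longs is an assumption about the platform's struct timeval layout (it holds on common Linux builds). As far as I know, on mainstream kernels the count returned by send() is the amount copied into the socket's send buffer (case 3), but treat that as an assumption to verify on your own system.

```python
import socket
import struct


def set_send_timeout(sock, seconds):
    """Set SO_SNDTIMEO; the option value is a C struct timeval."""
    sec = int(seconds)
    usec = min(int(round((seconds - sec) * 1_000_000)), 999_999)
    # Assumption: struct timeval is two native longs on this platform.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDTIMEO,
                    struct.pack("ll", sec, usec))


def send_or_timeout(sock, data):
    """Return the byte count the kernel accepted, or None on timeout."""
    try:
        return sock.send(data)
    except BlockingIOError:          # EAGAIN / EWOULDBLOCK: timeout expired
        return None


if __name__ == "__main__":
    a, b = socket.socketpair()       # b never reads, so a's buffer fills up
    set_send_timeout(a, 0.2)
    total = 0
    while (n := send_or_timeout(a, b"x" * 65536)) is not None:
        total += n
    print("accepted before timeout:", total, "socket still open:", a.fileno() >= 0)
    a.close()
    b.close()
```

The send loop stops on the first timeout, and the socket is still open afterwards; whether to retry or close is left to the caller.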

Reply by Nobody ● September 14, 2009

On Sun, 13 Sep 2009 04:05:53 -0700, Rocky wrote:

> Just for interest, what was the window size on the previous ACK?
> The receiver should always be able to handle its advertised window
> size.

OTOH, the sender should always be able to handle the receiver failing to
handle its advertised window size (i.e. window shrinking).

Reply by Dimiter Popoff ● September 14, 2009

On Sep 14, 8:40 am, Nobody <nob...@nowhere.com> wrote:
> On Sun, 13 Sep 2009 04:05:53 -0700, Rocky wrote:
> > Just for interest, what was the window size on the previous ACK?
> > The receiver should always be able to handle its advertised window
> > size.
>
> OTOH, the sender should always be able to handle the receiver failing to
> handle its advertised window size (i.e. window shrinking).

Well yes, but it cannot do much except probing and waiting.
Now why my 1 byte probing does not kick in in that case is something
I have to investigate (I have had it there for years, since day one,
actually), but it is completely irrelevant which segment size (and at
which offset) is sent to get a valid ack with a valid window size in
reply; in our case the fact is that my peer is stuck at 0 window size,
clearly messed up.
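
The "probing and waiting" on the sending side can be sketched roughly as below. This is an illustration only, not the implementation discussed in this thread; the function names, the 1 s initial delay and the 60 s ceiling are all assumptions chosen for the example.

```python
from itertools import islice

MOD = 1 << 32  # sequence numbers wrap at 2**32


def probe_intervals(initial=1.0, ceiling=60.0):
    """Yield persist-timer delays: exponential backoff up to a ceiling."""
    delay = initial
    while True:
        yield delay
        delay = min(delay * 2, ceiling)


def on_ack(ack, wnd, snd_una, unacked):
    """Drop newly acked bytes; return (snd_una, unacked, must_probe)."""
    acked = (ack - snd_una) % MOD
    if acked > len(unacked):
        acked = 0                    # ack outside what was sent: ignored here
    snd_una = (snd_una + acked) % MOD
    return snd_una, unacked[acked:], wnd == 0


def build_probe(snd_una, unacked):
    """One byte of data starting at the oldest unacknowledged position."""
    return snd_una, unacked[:1]


# The case from this thread: 1460 bytes sent at 1000, ack for 1200, window 0.
snd_una, rest, must_probe = on_ack(2200, 0, 1000, bytes(1460))
print(snd_una, len(rest), must_probe, build_probe(snd_una, rest)[0])
print(list(islice(probe_intervals(), 4)))
```

Each probe starts at the position the peer last acknowledged, so a peer that reopens its window can accept the byte and answer with a non-zero window.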

Dimiter

------------------------------------------------------
Dimiter Popoff               Transgalactic Instruments

http://www.tgi-sci.com
------------------------------------------------------
http://www.flickr.com/photos/didi_tgi/sets/72157600228621276/

Reply by Dimiter Popoff ● September 14, 2009

On Sep 14, 8:37 am, Nobody <nob...@nowhere.com> wrote:
> ....
> > Page 69 of rfc793 says something about that.
>
> I presume that you are referring to:
>
>         If a segment's contents straddle the boundary between old and new, only
>         the new parts should be processed.
>
> Which implies that the old version should take precedence, regardless of
> whether it has been passed up to the application.

You are right, this is the correct interpretation. Come to think of
it, this is how I do it (I set the newcomer's offset to the first byte
high enough to skip over the data overlapping with the previous segment,
which does exactly that...). But I am sure I had read somewhere about
newer stuff taking precedence, it must have been in rfc791 regarding
defragmentation, so I am not completely making that up :-). It's been
several years since I wrote that implementation, now I am inside it
because it is seeing 100 MbpS for the first time.

> .....
> > The 0 window does mean that indeed, the 1200 ack is nonsense; and I
> > agree that 30 seconds timeout may be somewhat aggressive but at 100 MbpS
> > it seems an eternity (or is it an ethernity :-) . It is my tcp sending
> > action which times out, but the application can set if there is one
> > and how long it must be on a per connection basis, in this case
> > it is 30 seconds (I believe this was the default setting, also
> > settable).
>
> That's an API issue. The BSD sockets API offers the SO_SNDTIMEO socket
> option (for all socket families). This specifies how long a send/write/etc
> call can block for; if the timeout is exceeded, the call will return a
> short count (or -1 with errno set to EAGAIN), but the socket remains
> valid for further operations (i.e. it doesn't terminate the connection).

Very similar behaviour here. Obviously my "send" (send and wait for
ack), sendq (queue for sending and return - can be polled for status),
and a new out25 (...:-), which allows the application to serve the
connection in a loop while getting back key parameters (ack position,
queued position etc.) all just return with the proper status upon
timeout; it is up to the application to decide whether to close the
connection. The only case where it will get closed automatically is if
the task is killed.

Dimiter

------------------------------------------------------
Dimiter Popoff               Transgalactic Instruments

http://www.tgi-sci.com
------------------------------------------------------
http://www.flickr.com/photos/didi_tgi/sets/72157600228621276/

Original message: http://groups.google.com/group/comp.arch.embedded/msg/07496ba8c54db547?dmode=source

Reply by David Schwartz ● September 14, 2009

On Sep 13, 11:30 pm, Dimiter Popoff <d...@tgi-sci.com> wrote:

> in our case the fact is that my peer is stuck at 0 window size,
> clearly messed up.

It's stuck at 0 window size because it believes its ACK keeps
dropping.

DS

Reply by Jorgen Grahn ● September 14, 2009

["Followup-To:" header set to comp.protocols.tcp-ip.]

On Sun, 13 Sep 2009 18:24:43 -0700 (PDT), Dimiter Popoff <dp@tgi-sci.com> wrote:
> On Sep 14, 3:33 am, David Schwartz <dav...@webmaster.com> wrote:
>> ...
>> And what should the other side infer from the fact that you keep
>> ignoring the ACK other than that it dropped?
>
> Please consult the paragraph past line 4264 of rfc793, it explains
> that.

FYI, he's probably referring to this text on page 69 of 85:

        Segments are processed in sequence.  Initial tests on arrival
        are used to discard old duplicates, but further processing is
        done in SEG.SEQ order.  If a segment's contents straddle the
        boundary between old and new, only the new parts should be
        processed.

/Jorgen

-- 
  // Jorgen Grahn <grahn@  Oo  o.   .  .
\X/     snipabacken.se>   O  o   .

Reply by Dimiter Popoff ● September 21, 2009

On Sep 14, 9:30 am, Dimiter Popoff <d...@tgi-sci.com> wrote:
> On Sep 14, 8:40 am, Nobody <nob...@nowhere.com> wrote:
>
> > ....
> > OTOH, the sender should always be able to handle the receiver failing to
> > handle its advertised window size (i.e. window shrinking).
>
> Well yes, but it cannot do much except probing and waiting.
> ...

Out of curiosity - and after a wasted week because of a running
nose and a head full of what should have been some rubber glue - I
tried to make the stuck windows host happy, i.e. I began probing
it not by repeating the segment it was partially ack-ing, but just
with the part past what it had acked in the hope it would
eventually recover.

No such luck, though. Absolutely no change, keeps on repeating
the same (acks the position it had last acked, window size 0).
Clearly dead - not that it was not obvious before that, I know
a messed up system when I see it, but I did try and thought
I'd post the result as well.
Only with filezilla, though - another ftp server, xlight something,
does not do it. A difference between the two I notice is the fact that
only filezilla opens the data connection using window scaling;
this may have to do with their problem (I see a much larger window
advertised than the buffer size which is currently set, not that
changing that buffer size had any effect during earlier tests).

With both servers, the upload speed is about 7.5 Mbytes/S - the
window scaling is not really needed, this is a local (via a buffering
switch) connection.

I tried to install an ftp server under linux to do the same
test, but after wasting an hour trying to make that run (under ubuntu)
without success I gave up, no time for that now.
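
On the window scaling observation above: the 16-bit window field is shifted left by the count the peer announced in its SYN, which is one way an advertised window can look much larger than a configured buffer. The sketch below only illustrates that arithmetic; the function name and the `scaling_enabled` flag are assumptions made for the example, and the cap of 14 is the limit defined for the window scale option.

```python
MAX_SHIFT = 14  # largest shift count permitted for the window scale option


def effective_window(raw_window, peer_shift, scaling_enabled=True):
    """Window in bytes from the 16-bit header field and the peer's shift.

    scaling_enabled should be True only if both ends sent the option in
    their SYN segments; otherwise the raw field is used unscaled.
    """
    if not 0 <= raw_window <= 0xFFFF:
        raise ValueError("window field is 16 bits")
    shift = min(peer_shift, MAX_SHIFT) if scaling_enabled else 0
    return raw_window << shift


print(effective_window(0xFFFF, 2))   # raw 65535 with a shift of 2
print(effective_window(0, 7))        # a zero field stays zero at any shift
```

A raw field of zero stays zero whatever the shift, so scaling by itself does not explain a window that never reopens.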

Dimiter

------------------------------------------------------
Dimiter Popoff               Transgalactic Instruments

http://www.tgi-sci.com
------------------------------------------------------
http://www.flickr.com/photos/didi_tgi/sets/72157600228621276/

Reply by Paul Carpenter ● September 21, 2009

In article <b42d5037-38ef-4720-a11d-1b92e8ba5477@z34g2000vbl.googlegroups.com>, dp@tgi-sci.com says...
> On Sep 14, 9:30 am, Dimiter Popoff <d...@tgi-sci.com> wrote:
> > On Sep 14, 8:40 am, Nobody <nob...@nowhere.com> wrote:
> >
> > > ....
> > > OTOH, the sender should always be able to handle the receiver failing to
> > > handle its advertised window size (i.e. window shrinking).
> >
> > Well yes, but it cannot do much except probing and waiting.
> > ...
>
> Out of curiosity - and after a wasted week because of a running
> nose and a head full of what should have been some rubber glue - I
> tried to make the stuck windows host happy, i.e. I began probing
> it not by repeating the segment it was partially ack-ing, but just
> with the part past what it had acked in the hope it would
> eventually recover.
>
> No such luck, though. Absolutely no change, keeps on repeating
> the same (acks the position it had last acked, window size 0).
> Clearly dead - not that it was not obvious before that, I know
> a messed up system when I see it, but I did try and thought
> I'd post the result as well.
> Only with filezilla, though - another ftp server, xlight something,
> does not do it.

Considering the problems I have seen with the filezilla client's
passive mode support when connecting through an ftp-proxy to external
ftp servers, I would blame filezilla.

In my case I replaced the client with a 10 year old ftp client
on the SAME system and everything worked.

...

> I tried to install an ftp server under linux to do the same
> test, but after wasting an hour to make that run (under ubuntu)
> without success I gave up, no time for that now.

If you can get it going you will probably find it works.

Now back to sorting some bugs in some USB to SPI controller
driver, that does not work for all CPOL and CPHA modes as
advertised.

-- 
Paul Carpenter          | paul@pcserviceselectronics.co.uk
<http://www.pcserviceselectronics.co.uk/>    PC Services
<http://www.pcserviceselectronics.co.uk/fonts/> Timing Diagram Font
<http://www.gnuh8.org.uk/>  GNU H8 - compiler & Renesas H8/H8S/H8 Tiny
<http://www.badweb.org.uk/> For those web sites you hate
