Multicasting and Switches

Started by D Yuniskis November 8, 2010
Hi Clifford,

Clifford Heath wrote:
> D Yuniskis wrote:
>> Clifford Heath wrote:
>>> When I implemented a P2P file distribution platform, I
>>> decided that multicast wasn't useful and went instead for
>> Why? --------------------^^^^^^^^^^^^^

Sorry, I should have been more precise in my comment. I intended "multicast" to encompass broadcast :-/ I should have said "non-unicast" to be more clear.

> For this app, the computers targeted are almost always
> on the same subnet, so broadcasts naturally get propagated
> to just the places they're needed (by default, broadcast
> traffic stops at subnet boundaries). LAN traffic is regarded
> as essentially free, it's WAN traffic that needs to be
> limited and shared.

Understood. See below.
>>> broadcasts. The LDSS protocol (Local Download Sharing
>> I'll have to look at the RFC's...
>
> It got to a Draft, which has expired, but I've attached it
> below.
>
> Very simple protocol; two messages only (since files are
> only identified by SHA-1 hash); just NeedFile and WillSend,
> having the same packet structure.
The control packets are broadcast. But, it is unclear as to how the actual payload is delivered (seems to be left to the application to decide?). E.g., the protocol seems to imply each "Need"-ing host can use whatever (supported) protocol to fetch the payload from the "Have"-ing host(s). I don't see anything akin to reliable multicast inherent in the protocol (though, conceivably, a host that loses/drops payload can subsequently reissue a "NeedFile" request).
>>> normal system operations. It was an interesting project!
>> This is geared towards asynchronous sharing, no doubt.
>
> Software distribution. All machines fetch an individual
> policy file saying what software they should install, and
> in most cases there is overlap; more than one machine
> needs the same software. Rather than all downloading a
> separate copy, they announce their plans, progress, and
> ETA, so others know they can wait for a LAN transfer when
> it's done.
OK, different usage model than what I was targeting. E.g., consider N (N being large) diskless workstations powering up simultaneously and all wanting to fetch the (identical) image from a single/few server(s). Clearly, a broadcast/reliable multicast scheme would best utilize the network bandwidth (in this case).
>> How would you (re)consider your design choices in a
>> *synchronous* environment?
>
> In the same-subnet scenario, I'd probably still use broadcast,
> unless the utilization is likely to reach a significant
> percentage (say, >25%) of the media's capability, or there
> is a likelihood of multiple synchronized groups which won't
> necessarily have to pass traffic across the same link.
>
> The latter case is pretty rare, actually - a home media subnet
> is likely all going through one switch and hence limited by
> its capability. Using an IGMP-aware router is unlikely to help.
Here, I'm looking at the scenario where many devices are powered up simultaneously (as above) and need to fetch images over a network that is already being used for other traffic (MM and otherwise). Or, when their collective "mode of operation" changes at run-time and they need to (all) load an "overlay", etc. Unicast transfers (of any sort) mean that the "image server" sees a higher load as it has to push the same image out N times. (A P2P scheme shares that load among the nodes themselves but you still have a longer theoretical time until all nodes have valid images -- unless your P2P algorithm carefully schedules which packets go where to maximize network utilization). Ideally, the "image server" would coincide with the "media server" (or at least *one* such media server) so that host is already busy with some sort of load. I *think* (unsupported) that reliable multicast/broadcast gives you the shortest time to "everyone having a clean image" -- of course, cases exist where any protocol can be boundless.
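As a rough back-of-the-envelope (Python; the numbers are invented, just to show the scale of the difference between N unicast pushes and one multicast):

    # Time for N clients to receive an S-byte image over a link of
    # R bytes/sec, ignoring protocol overhead, loss, and repair.
    N = 50                # diskless workstations (hypothetical)
    S = 64 * 2**20        # 64 MiB image
    R = 12.5 * 10**6      # ~100 Mb/s Ethernet, in bytes/sec

    unicast = N * S / R   # server pushes the same image N times
    multicast = S / R     # server sends it once; everyone listens

    print(f"unicast:   {unicast:.1f} s")    # ~268 s
    print(f"multicast: {multicast:.1f} s")  # ~5.4 s

Any repair traffic eats into the multicast figure, of course, but it starts roughly two orders of magnitude ahead.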
>> How would you (re)consider that same scenario in a wireless
>> network (with nodes closely located -- "tight weave"
>> instead of a "loose mesh")?
>
> I'm not familiar with the implementation of broadcast/multicast
> IP in a wireless environment, but I can't imagine that it would
> change very much.
I was only mentioning wireless in the sense that it can exploit broadcast easily if there is an underlying protocol to govern access to the "medium" (hence the distinction between loose/tight meshes).
Resend, last didn't appear. Sorry if you get it twice.

D Yuniskis wrote:
>> Very simple protocol; two messages only (since files are
>> only identified by SHA-1 hash); just NeedFile and WillSend,
>> having the same packet structure.
>
> The control packets are broadcast. But, it is unclear as
> to how the actual payload is delivered (seems to be left
> to the application to decide?).
TCP - it's documented under the heading "TCP file transfers". Either a raw stream is sent in response to a NeedFile request (if the needer advertised a port number), or a simplified HTTP-style raw request-response, if the needer responds to an advertised port in a WillSend promise. The two forms are needed in case one system has a (software?) firewall.
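For illustration, the two forms might look roughly like this (a Python sketch; the request line is invented, the draft defines the real wire format):

    import socket

    def push_file(needer_addr, needer_port, data):
        # Form 1: the needer advertised a port in its NeedFile,
        # so the file holder connects and sends the raw stream.
        with socket.create_connection((needer_addr, needer_port)) as s:
            s.sendall(data)

    def pull_file(sender_addr, sender_port, sha1_hex):
        # Form 2: the holder advertised a port in WillSend, so the
        # needer connects and issues a simplified HTTP-style request.
        with socket.create_connection((sender_addr, sender_port)) as s:
            s.sendall(f"GET {sha1_hex}\r\n\r\n".encode())
            chunks = []
            while chunk := s.recv(65536):
                chunks.append(chunk)
        return b"".join(chunks)

Having both directions available means the transfer can still happen when either endpoint, but not both, is behind a firewall that blocks inbound connections.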
> I don't see anything akin to reliable multicast inherent
> in the protocol (though, conceivably, a host that loses/drops
> payload can subsequently reissue a "NeedFile" request).
Everything is checked through SHA-1 hashes. If a file reaches its expected size but the hash doesn't match, the whole file is dropped (since there's no way to know where the error occurred) and the search starts afresh.
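In outline, the whole-file check amounts to something like this (a Python sketch, not the actual LDSS code):

    import hashlib

    def file_ok(path, expected_sha1_hex):
        # Whole-file integrity check: there are no per-piece hashes,
        # so on a mismatch the entire file is discarded and the
        # search starts afresh.
        h = hashlib.sha1()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(65536), b""):
                h.update(chunk)
        return h.hexdigest() == expected_sha1_hex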
> OK, different usage model than what I was targeting.
Yes.
> E.g., consider N (N being large) diskless workstations
> powering up simultaneously and all wanting to fetch
> the (identical) image from a single/few server(s).
> Clearly, a broadcast/reliable multicast scheme would
> best utilize the network bandwidth (in this case).
Yes, but beware that a single dropped packet at the source will cause every recipient to send a NACK. This is the problem with massive wide-area reliable multicast protocols: they get you out of the data fan-out problem but replace it with a NACK fan-in problem instead. That's also why some routers have been taught how to aggregate such NACKs.

If you're dealing with tens or hundreds of machines on a LAN, it's probably not an issue - the error rate is low and it's possible to cope with the NACKs. Even if many of them are dropped, it only requires one to get through and the requested section will be multicast again.
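One classic way to blunt the fan-in (not part of LDSS; this is the SRM-style suppression trick) is for each receiver to wait a random interval before NACKing, and to stay silent if it hears someone else NACK the same block first. A minimal Python sketch:

    import random, threading

    class NackScheduler:
        # Delay each NACK by a random interval; cancel it if another
        # receiver's NACK for the same block is overheard first.
        def __init__(self, send_nack):
            self.send_nack = send_nack   # callable taking a block id
            self.pending = {}            # block_id -> Timer

        def block_missing(self, block_id, max_wait=0.5):
            if block_id in self.pending:
                return                   # already scheduled
            t = threading.Timer(random.uniform(0.0, max_wait),
                                self._fire, args=(block_id,))
            self.pending[block_id] = t
            t.start()

        def _fire(self, block_id):
            self.pending.pop(block_id, None)
            self.send_nack(block_id)     # nobody beat us to it

        def heard_nack(self, block_id):
            t = self.pending.pop(block_id, None)
            if t:
                t.cancel()   # suppressed; the retransmission will be
                             # multicast and satisfy us too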
> Here, I'm looking at the scenario where many devices are powered
> up simultaneously (as above) and need to fetch images over
> a network
I expect that due to startup timing differences, you'd need to sit and listen for a relevant multicast to start for a few seconds before requesting it... or to wait after receiving such a request for a few seconds before starting to send. Include identification inside the stream so latecomers realise what they're missing and can re-fetch the earlier parts.

The other thing we considered doing with massive multicast involving overlapping sets of parties was to allocate 2**N multicast groups, and take N bits of the SHA-1 of the file (or channel ID, if not using content-addressing) to decide which IGMP group to send it to. That way a smart router can refrain from sending packets down links where no-one might be interested.

Clifford Heath
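A sketch of that group-selection idea (Python; the group count and address range are invented for illustration):

    import hashlib

    GROUP_BITS = 4                 # 2**4 = 16 groups (hypothetical)
    BASE_GROUP = "239.255.8."      # administratively-scoped range

    def group_for(digest: bytes) -> str:
        # Take the top N bits of the SHA-1 (or channel ID) to pick
        # one of 2**N multicast groups, so an IGMP-aware router only
        # forwards a stream down links with subscribers for it.
        n = digest[0] >> (8 - GROUP_BITS)
        return BASE_GROUP + str(n)    # 239.255.8.0 .. 239.255.8.15

    digest = hashlib.sha1(b"example file contents").digest()
    print(group_for(digest))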
On Tue, 16 Nov 2010 12:38:18 -0700, D Yuniskis
<not.going.to.be@seen.com> wrote:

> OK, different usage model than what I was targeting.
> E.g., consider N (N being large) diskless workstations
> powering up simultaneously and all wanting to fetch
> the (identical) image from a single/few server(s).
> Clearly, a broadcast/reliable multicast scheme would
> best utilize the network bandwidth (in this case).
One way would be to break up the message into numbered blocks with CRCs and use a simple carousel to repeatedly broadcast those blocks. This is used e.g. for firmware updates for TV STBs, in which no return channel is available. Each receiver accumulates blocks, and if you did not get all blocks during the first cycle, you wait for the next carousel cycle to pick up the missing blocks.

If there is a return channel, each slave could initially request all blocks, after one full cycle check which blocks are missing, and only request those missing blocks. After each full cycle, the server would check all the update requests received during the cycle and drop those blocks from the carousel which have not been requested, thus speeding up the update cycle, finally shutting down the carousel.

If the expected error rate is low, the missing blocks could be requested even with unicasts.

If the expected error rate is high, such as in some radio links with large blocks, a memory ARQ system could be used, in which blocks failing the CRC are stored, and if subsequent reception(s) of the same block also fail the CRC check, the previously received blocks are accumulated until the accumulated block passes the CRC.

Alternatively, the few missing blocks could be transmitted again with a better ECC coding, or just send the actual error correction bits to be combined with the ordinary received data block bits (assuming proper interleaving) at the receiver.
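A minimal sketch of the receive side of such a carousel (Python; the 8-byte header layout is invented for illustration):

    import struct, zlib

    HDR = struct.Struct("!HHI")   # block number, total blocks, CRC32

    def parse_block(pkt):
        # Returns (seq, total, payload), or None if the CRC fails
        # (a damaged block simply waits for the next carousel cycle).
        seq, total, crc = HDR.unpack_from(pkt)
        payload = pkt[HDR.size:]
        if zlib.crc32(payload) != crc:
            return None
        return seq, total, payload

    def receive(packets):
        # Accumulate numbered blocks across cycles until complete.
        got, total = {}, None
        for pkt in packets:
            parsed = parse_block(pkt)
            if parsed is None:
                continue
            seq, total, payload = parsed
            got[seq] = payload
            if len(got) == total:
                return b"".join(got[i] for i in range(total))
        return None   # still missing blocks; keep listening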
Hi Clifford,

Clifford Heath wrote:
> Resend, last didn't appear. Sorry if you get it twice.
>
> D Yuniskis wrote:
>>> Very simple protocol; two messages only (since files are
>>> only identified by SHA-1 hash); just NeedFile and WillSend,
>>> having the same packet structure.
>>
>> The control packets are broadcast. But, it is unclear as
>> to how the actual payload is delivered (seems to be left
>> to the application to decide?).
>
> TCP - it's documented under the heading "TCP file transfers".
> Either a raw stream is sent in response to a NeedFile
> request (if the needer advertised a port number), or a
> simplified HTTP-style raw request-response, if the needer
> responds to an advertised port in a WillSend promise.
Yes, but all "unicast" (i.e., a connection-oriented protocol)... the "need-er" and the "have-er" engage in a dedicated dialog.
> The two forms are needed in case one system has a (software?)
> firewall.
Hmmm... not sure I see why (though my brain is frozen from lying on the roof for the past hour :-/ )
>> I don't see anything akin to reliable multicast inherent
>> in the protocol (though, conceivably, a host that loses/drops
>> payload can subsequently reissue a "NeedFile" request).
>
> Everything is checked through SHA-1 hashes. If a file reaches
> its expected size but the hash doesn't match, the whole file
> is dropped (since there's no way to know where the error occurred)
> and the search starts afresh.
Understood. You don't split the file into "pieces" (though, conceivably, one could "pre-split" the REAL *files* into smaller "files" at the expense of a tiny bit more overhead).
>> OK, different usage model than what I was targeting.
>
> Yes.
>
>> E.g., consider N (N being large) diskless workstations
>> powering up simultaneously and all wanting to fetch
>> the (identical) image from a single/few server(s).
>> Clearly, a broadcast/reliable multicast scheme would
>> best utilize the network bandwidth (in this case).
>
> Yes, but beware that a single dropped packet at the source
> will cause every recipient to send a NACK. This is the
> problem with massive wide-area reliable multicast protocols:
> they get you out of the data fan-out problem but replace it
> with a NACK fan-in problem instead. That's also why some routers
> have been taught how to aggregate such NACKs.
Yes.
> If you're dealing with tens or hundreds of machines on a
> LAN, it's probably not an issue - the error rate is low and it's
> possible to cope with the NACKs. Even if many of them are
> dropped, it only requires one to get through and the requested
> section will be multicast again.
I'm looking at a hybrid approach. As always, the initial assumptions drive the design...

Let "the" image server multicast (or even broadcast, depending on the domain of the recipients) *THE* image. Let hosts that end up "missing" parts of that image request those parts from their peers (assuming *some* peers have received the parts). So, this can happen concurrent with the rest of the "main" image's delivery (i.e., it doesn't need to be a serial activity).

I have to look at the model and how it would apply in mesh networks (where your peer is often responsible for forwarding traffic from other nodes -- i.e., the "image server") to see what the overall traffic pattern looks like. It might be a win for that peer to broadcast/multicast that "piece" in the event other hosts (i.e., those downstream from *you*) have missed the piece as well.

<frown> Things seem to get harder, not easier :>
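A toy sketch of the repair-assignment step in that hybrid (Python; all names are illustrative):

    import random

    def repair_plan(missing, peer_pieces, image_server="server"):
        # For each missing piece, pick a random peer known to hold
        # it (spreading the repair load); fall back to the image
        # server only when no peer has the piece.
        plan = {}
        for piece in missing:
            holders = [p for p, have in peer_pieces.items()
                       if piece in have]
            plan[piece] = (random.choice(holders) if holders
                           else image_server)
        return plan

    # e.g. we lost pieces 2 and 7; peers A and B caught most of it
    print(repair_plan({2, 7}, {"A": {1, 2, 3}, "B": {2, 5}}))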
>> Here, I'm looking at the scenario where many devices are powered
>> up simultaneously (as above) and need to fetch images over
>> a network
>
> I expect that due to startup timing differences, you'd need
> to sit and listen for a relevant multicast to start for a
> few seconds before requesting it... or to wait after receiving
> such a request for a few seconds before starting to send.
Yes. A short delay, and allow "need-ers" to pick up the stream at arbitrary points (by cutting it into pieces) instead of having to listen from the beginning (and request the parts they missed, later).
> Include identification inside the stream so latecomers realise
> what they're missing and can re-fetch the earlier parts.
Exactly. And, if they can request those parts from peers to distribute the traffic better...
> The other thing we considered doing with massive multicast
> involving overlapping sets of parties was to allocate 2**N
> multicast groups, and take N bits of the SHA-1 of the file
> (or channel ID, if not using content-addressing) to decide
> which IGMP group to send it to. That way a smart router can
> refrain from sending packets down links where no-one might
> be interested.
I'm not sure I follow -- isn't *everyone* interested?
On Tue, 09 Nov 2010 23:57:21 +0100, D Yuniskis
<not.going.to.be@seen.com> wrote:
> Jim Stewart wrote:
>> D Yuniskis wrote:
>>> Jim Stewart wrote:
>>
>>>> Falling back to the educated guess disclaimer,
>>>> I'd say the maximum latency is indeterminate.
>>>>
>>>> It seems that by definition, that if the multicast
>>>> packet collides with another packet, the latency
>>>> will be indeterminate.
>>>
>>> That depends on the buffering in the switch. And,
>>> how the multicast packet is treated *by* the switch.
>> Since to the best of my knowledge, in the event of
>> an ethernet collision, both senders back off a random
>> amount of time then retransmit, I can't see how the
>> switch buffering would make any difference.
>
> The time a packet (*any* packet) spends buffered in
> the switch looks like an artificial transport delay
> (there's really nothing "artificial" about it :> ).
> Hence my comment re: "speed of light" delays.
>
> When you have multicast traffic, the delay through
> the switch can vary depending on the historical
> traffic seen by each targeted port. I.e., if port A
> has a packet already buffered/queued while port B
> does not, then the multicast packet will get *to*
> the device on port B quicker than on port A.
>
> If you have two or more streams and are hoping to
> impose a temporal relationship on them, you need to
> know how they will get to their respective consumers.
Or use RTCP timestamps to synchronize the streams.
>> For that matter, does the sender even monitor for
>> collisions and retransmit in a multicast environment.
>> I guess I don't know...
>
> Multicast is like "shouting from the rooftop -- WITH A
> DEAF EAR". If it gets heard, great. If not, <shrug>.
>
> There are reliable multicast protocols that can be built
> on top of this. They allow "consumers" to request
> retransmission of portions of the "broadcast" that they
> may have lost (since the packet may have been dropped
> at their doorstep or anyplace along the way).
>
> With AV use, this gets to be problematic because you
> want to reduce buffering in the consumers, minimize
> latency,
Latency? Why would you have noticeable latency? You can start playing the media before the buffer is full, then stretch it a bit to allow the buffer to catch up.
> etc. So, the time required to detect a
> missing packet, request a new copy of it and accept
> that replacement copy (there is no guarantee that
> you will receive this in a fixed time period!) conflicts
> with those other goals (assuming you want to avoid
> audio dropouts, video pixelation, etc.).
>
> Remember that any protocol overhead you *add* contributes
> to the problem, to some extent (as it represents more
> network traffic and more processing requirements).
> The "ideal" is just to blast UDP packets down the pipe
> and *pray* they all get caught.
--
Made with Opera's revolutionary e-mail program: http://www.opera.com/mail/
(remove the obvious prefix to reply by mail)
Hi Paul,

Paul Keinanen wrote:
> On Tue, 16 Nov 2010 12:38:18 -0700, D Yuniskis
> <not.going.to.be@seen.com> wrote:
>
>> OK, different usage model than what I was targeting.
>> E.g., consider N (N being large) diskless workstations
>> powering up simultaneously and all wanting to fetch
>> the (identical) image from a single/few server(s).
>> Clearly, a broadcast/reliable multicast scheme would
>> best utilize the network bandwidth (in this case).
>
> One way would be to break up the message into numbered blocks with
> CRCs and use a simple carousel to repeatedly broadcast those blocks.
> This is used e.g. for firmware updates for TV STBs, in which no return
> channel is available.
The back channel, here, would see very little traffic (when compared to the forward channel). So, aside from the requirement that it places on the "image server", its impact is relatively small.

I was thinking of a protocol that could offload this "missed block" portion of the process to peers who *may* have correctly received the block. This should fare well when transposed to a mesh topology -- where your peer may, in fact, be your actual "upstream link" (so why propagate the request all the way upstream if your peer can -- and, ultimately *will* -- handle it?)
> Each receiver accumulates blocks, and if you did not get all blocks
> during the first cycle, you wait for the next carousel cycle to pick
> up the missing blocks.
>
> If there is a return channel, each slave could initially request all
> blocks, after one full cycle check which blocks are missing, and only
> request those missing blocks. After each full cycle, the server would
> check all the update requests received during the cycle and drop those
> blocks from the carousel which have not been requested, thus speeding
> up the update cycle, finally shutting down the carousel.
Rather than "requesting all blocks", I envision requesting a larger object (e.g., a file or an entire image). *Assume* it will arrive intact at each consumer (concurrently). Then, handle the missing parts as I described above. So, the image server is, effectively, the "peer of last resort" if no other *true* peer can satisfy the request -- this might be handled with something as simple as a timeout (i.e., the image server deliberately ignores these requests for some period of time to allow "peers" to attempt to satisfy it, instead).
> If the expected error rate is low, the missing blocks could be
> requested even with unicasts.
>
> If the expected error rate is high, such as in some radio links with
> large blocks, a memory ARQ system could be used, in which blocks
> failing the CRC are stored, and if subsequent reception(s) of the same
> block also fail the CRC check, the previously received blocks are
> accumulated until the accumulated block passes the CRC.
Huh? Perhaps you meant "until the failing block also passes the CRC"?
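For what it's worth, one common flavour of the accumulation Paul describes is bitwise majority voting across the stored (corrupt) copies, kept until the combined block finally passes the CRC. A Python sketch, assuming equal-length copies and (ideally) an odd number of them:

    import zlib

    def majority_combine(copies):
        # Bitwise majority vote across several corrupt receptions
        # of the same block.
        n = len(copies)
        out = bytearray(len(copies[0]))
        for i in range(len(out)):
            for bit in range(8):
                ones = sum((c[i] >> bit) & 1 for c in copies)
                if ones * 2 > n:
                    out[i] |= 1 << bit
        return bytes(out)

    def try_recover(copies, expected_crc):
        # Keep calling this as further bad copies arrive.
        combined = majority_combine(copies)
        return combined if zlib.crc32(combined) == expected_crc else None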
> Alternatively, the few missing blocks could be transmitted again with
> a better ECC coding, or just send the actual error correction bits to
> be combined with the ordinary received data block bits (assuming
> proper interleaving) at the receiver.
Now you've lost me. Why change ECC (you've got horsepower on the Rx end so a "less effective" CRC doesn't really buy you much of anything)?
Hi Boudewijn,

Boudewijn Dijkstra wrote:
> On Tue, 09 Nov 2010 23:57:21 +0100, D Yuniskis
> <not.going.to.be@seen.com> wrote:
>> Jim Stewart wrote:
>>> D Yuniskis wrote:
>>>> Jim Stewart wrote:
>>>
>>>>> Falling back to the educated guess disclaimer,
>>>>> I'd say the maximum latency is indeterminate.
>>>>>
>>>>> It seems that by definition, that if the multicast
>>>>> packet collides with another packet, the latency
>>>>> will be indeterminate.
>>>>
>>>> That depends on the buffering in the switch. And,
>>>> how the multicast packet is treated *by* the switch.
>>> Since to the best of my knowledge, in the event of
>>> an ethernet collision, both senders back off a random
>>> amount of time then retransmit, I can't see how the
>>> switch buffering would make any difference.
>>
>> The time a packet (*any* packet) spends buffered in
>> the switch looks like an artificial transport delay
>> (there's really nothing "artificial" about it :> ).
>> Hence my comment re: "speed of light" delays.
>>
>> When you have multicast traffic, the delay through
>> the switch can vary depending on the historical
>> traffic seen by each targeted port. I.e., if port A
>> has a packet already buffered/queued while port B
>> does not, then the multicast packet will get *to*
>> the device on port B quicker than on port A.
>>
>> If you have two or more streams and are hoping to
>> impose a temporal relationship on them, you need to
>> know how they will get to their respective consumers.
>
> Or use RTCP timestamps to synchronize the streams.
But you can only synchronize to the granularity that the "buffering discrepancy" in the switch allows.
>>> For that matter, does the sender even monitor for
>>> collisions and retransmit in a multicast environment.
>>> I guess I don't know...
>>
>> Multicast is like "shouting from the rooftop -- WITH A
>> DEAF EAR". If it gets heard, great. If not, <shrug>.
>>
>> There are reliable multicast protocols that can be built
>> on top of this. They allow "consumers" to request
>> retransmission of portions of the "broadcast" that they
>> may have lost (since the packet may have been dropped
>> at their doorstep or anyplace along the way).
>>
>> With AV use, this gets to be problematic because you
>> want to reduce buffering in the consumers, minimize
>> latency,
>
> Latency? Why would you have noticeable latency?
Each consumer would need to be designed with a "deep enough" buffer to be able to handle any dropped packets, short-term network overload, etc. I.e., if, statistically, it requires T time to restart/resume an interrupted stream, then your buffer has to be able to maintain the integrity of the "A/V signal" for that entire duration (else the failure to do so becomes a "noticeable event" to the user).

Consider that some causes of "missed packets" can be system-wide. I.e., *many* nodes could have lost the same packet -- or packets adjacent (temporally). In that case, multiple (N) retransmission requests can be destined for the server simultaneously. If those are processed as unicast requests, then multiple packet times (N) may elapse before a particular node's request is satisfied.
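As a crude illustration of the sizing argument (Python; all the timings are invented):

    # How deep must the playout buffer be to ride out one repair?
    detect  = 0.020   # s to notice the gap in the sequence numbers
    request = 0.005   # s for the retransmission request to get out
    resend  = 0.030   # s until the replacement arrives (queuing!)
    margin  = 2.0     # headroom for N near-simultaneous repairs

    depth = margin * (detect + request + resend)
    print(f"playout buffer >= {depth * 1000:.0f} ms of media")  # ~110 ms

Every term there is stochastic, which is why the buffer has to cover the statistical worst case you're willing to tolerate, not the mean.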
> You can start playing
> the media before the buffer is full, then stretch it a bit to allow the
> buffer to catch up.
With video, the user can *tolerate* (though not *enjoy*!) the occasional "frozen frame" -- as long as it doesn't become frequent (persistence of vision). With audio, it's much harder to span any gaps. You can't just replay the past T of the audio stream without it *really* being noticeable.

You also have to be able to re-synchronize the streams *after* the "dropout" (imagine two displays/speakers side by side; the image/sound from each must be in phase for a "pleasing A/V experience" :>). So, you can't just "stretch time" to span the dropout.
>> etc. So, the time required to detect a
>> missing packet, request a new copy of it and accept
>> that replacement copy (there is no guarantee that
>> you will receive this in a fixed time period!) conflicts
>> with those other goals (assuming you want to avoid
>> audio dropouts, video pixelation, etc.).
>>
>> Remember that any protocol overhead you *add* contributes
>> to the problem, to some extent (as it represents more
>> network traffic and more processing requirements).
>> The "ideal" is just to blast UDP packets down the pipe
>> and *pray* they all get caught.