Re: [PATCHv2 net-next 00/14] sctp: implement RFC8899: Packetization Layer Path MTU Discovery for SCTP transport

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Jun 23, 2021 at 5:50 AM David Laight <David.Laight@xxxxxxxxxx> wrote:
>
> From: Xin Long
> > Sent: 23 June 2021 04:49
> ...
> > [103] pl_send: PLPMTUD: state: 1, size: 1200, high: 0 <--[a]
> > [103] pl_recv: PLPMTUD: state: 1, size: 1200, high: 0
> ...
> > [103] pl_send: PLPMTUD: state: 2, size: 1456, high: 0
> > [103] pl_recv: PLPMTUD: state: 2, size: 1456, high: 0  <--[b]
> > [103] pl_send: PLPMTUD: state: 2, size: 1488, high: 0
> > [108] pl_send: PLPMTUD: state: 2, size: 1488, high: 0
> > [113] pl_send: PLPMTUD: state: 2, size: 1488, high: 0
> > [118] pl_send: PLPMTUD: state: 2, size: 1488, high: 0
> > [118] pl_recv: PLPMTUD: state: 2, size: 1456, high: 1488 <---[c]
> > [118] pl_send: PLPMTUD: state: 2, size: 1460, high: 1488
> > [118] pl_recv: PLPMTUD: state: 2, size: 1460, high: 1488 <--- [d]
> > [118] pl_send: PLPMTUD: state: 2, size: 1464, high: 1488
> > [124] pl_send: PLPMTUD: state: 2, size: 1464, high: 1488
> > [129] pl_send: PLPMTUD: state: 2, size: 1464, high: 1488
> > [134] pl_send: PLPMTUD: state: 2, size: 1464, high: 1488
> > [134] pl_recv: PLPMTUD: state: 2, size: 1460, high: 1464 <-- around
> > 30s "search complete from 1200 bytes"
> > [287] pl_send: PLPMTUD: state: 3, size: 1460, high: 0
> > [287] pl_recv: PLPMTUD: state: 3, size: 1460, high: 0
> > [287] pl_send: PLPMTUD: state: 2, size: 1464, high: 0 <-- [aa]
> > [292] pl_send: PLPMTUD: state: 2, size: 1464, high: 0
> > [298] pl_send: PLPMTUD: state: 2, size: 1464, high: 0
> > [303] pl_send: PLPMTUD: state: 2, size: 1464, high: 0
> > [303] pl_recv: PLPMTUD: state: 2, size: 1460, high: 1464  <--[bb]  <--
> > around 15s "re-search complete from current pmtu"
> >
> > So since no interval to send the next probe when the ACK is received
> > for the last one,
> > it won't take much time from [a] to [b], and [c] to [d],
> > and there are at most 2 failures to find the right pmtu, each failure
> > takes 5s * 3 = 15s.
> >
> > when it goes back to search from search complete after a long timeout,
> > it will take only 1 failure to get the right pmtu from [aa] to [bb].
>
> What mtu is being used during the 'failures' ?
> I hope it is the last working one.
Yes, it's always the working one, which was set in search complete state.
More specifically, it changes in 3 cases:
a) at the beginning, it's using the dst->mtu;
b) set to the optimal one in search complete state after searching is done.
c) 'black hole' found, it sets to 1200, and starts to probe from 1200.
d) if still fails with 1200, the mtu is set to MIN_PLPMTU, but still
probes with 1200 until it succeeds.

>
> Also, what actually happen if the network route changes from
> one that supports 1460 bytes to one that only supports 1200
> and where ICMP errors are not generated?
then it's the case c) and d) above.

For the "black hole" detection, I'd like to probe it with the current
mtu and 3 times (15s in total)

>
> The first protocol retry is (probably) after 2 seconds.
> But it will use the 1460 byte mtu and fail again.
yeah, but 15s is the time we have to wait for confirming this is
caused by really the mtu change.

>
> Notwithstanding the standards, what pmtu actually exist
> 'in the wild' for normal networks?
PLPMTUD is trying to probe the max payload of SCTP, not
the real pmtu.

> Are there actually any others apart from 'full sized ethernet'
> and 'PPPoE'?
sctp over UDP if that's what you mean.
With the probe, we don't really care about the outer header, because
what we get with the probe is the real available size for sctp payload.

Like we probe with 1400, if it gets ACKed from the peer, the payload 1400
will be able to go though the path.

> So would it actually better to send two probes one for
> each of those two sizes and see which ones respond?

we could only if regardless of the RFC8899:

   "To avoid excessive load, the interval between individual probe
   packets MUST be at least one RTT..."

>
> (I'm not sure we ever manage to send full length packets.
> Our data is M3UA (mostly SMS) and sent with Nagle disabled.
> So even the customers sending 1000s of SMS/sec are unlikely
> to fill packets.)
>
>         David
>
> -
> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
> Registration No: 1397386 (Wales)



[Index of Archives]     [Linux Networking Development]     [Linux OMAP]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux