Re: The new sysctl and socket option added for PLPMTUD (RFC8899)


 



On Wed, Jul 7, 2021 at 8:36 AM Timo Völker <timo.voelker@xxxxxxxxxxxxxx> wrote:
>
> > On 6. Jul 2021, at 18:01, Xin Long <lucien.xin@xxxxxxxxx> wrote:
> >
> > On Tue, Jul 6, 2021 at 5:13 AM Timo Völker <timo.voelker@xxxxxxxxxxxxxx> wrote:
> >>
> >>
> >> Hi Xin,
> >>
> >> I implemented RFC8899 for an SCTP simulation model.
> > Great, may I ask which one that is?
>
> I used the SCTP implementation in INET. INET is a simulation model suite for OMNeT++.
Thanks.

>
> >
> >>
> >> Comments follow inline.
> >>
> >>> Begin forwarded message:
> >>>
> >>> From: Xin Long <lucien.xin@xxxxxxxxx>
> >>> Subject: Re: The new sysctl and socket option added for PLPMTUD (RFC8899)
> >>> Date: 12. June 2021 at 19:32:02 CEST
> >>> To: Michael Tuexen <tuexen@xxxxxxxxxxx>
> >>> Cc: "linux-sctp @ vger . kernel . org" <linux-sctp@xxxxxxxxxxxxxxx>, Marcelo Ricardo Leitner <marcelo.leitner@xxxxxxxxx>
> >>>
> >>> On Fri, Jun 11, 2021 at 4:42 PM <tuexen@xxxxxxxxxxx> wrote:
> >>>>
> >>>>> On 11. Jun 2021, at 22:20, Xin Long <lucien.xin@xxxxxxxxx> wrote:
> >>>>>
> >>>>> Hi, Michael,
> >>>>>
> >>>>> In the linux implementation of RFC8899, we decided to introduce one
> >>>>> sysctl and one socket option for users to set up the PLPMTUD probe:
> >>>>>
> >>>>> 1. sysctl -w net.sctp.plpmtud_probe_interval=1
> >>>>>
> >>>>> plpmtud_probe_interval - INTEGER
> >>>>>      The interval (in milliseconds) between PLPMTUD probe chunks. These
> >>>>>      chunks are sent at the specified interval with a variable size to
> >>>>>      probe the mtu of a given path between 2 associations. PLPMTUD will
> >>>> I guess you mean "between 2 end points" instead of "between 2 associations".
> >>>>
> >>>> I'm not sure what it means:
> >>>>
> >>>> I assume, you have candidate 1400, 1420, 1460, 1480, and 1500.
> >>>>
> >>>> Assume you sent a probe packet for 1400. Aren't you sending the
> >>>> probe packet for 1420 as soon as you get an ACK for the probe packet
> >>>> of size 1400? Or are you waiting for plpmtud_probe_interval ms?
> >>> It will wait for "plpmtud_probe_interval" ms in searching state, but in
> >>> the Search Complete state it will wait for "plpmtud_probe_interval * 30" ms.
> >>
> >> Does this mean you always wait for plpmtud_probe_interval ms? Even if you receive an ack for a probe packet or a PTB?
> >>
> >> In my implementation, I start with the next probe immediately when receiving an ack or PTB.
> > yeah, we should do it immediately to make this more efficient, and I
> > already fixed it in linux for ACK.
> >
> > For PTB, I currently only set probe_size as the pmtu from ICMP packet
> > when pmtu > 'current pmtu' && pmtu < probe_size, and wait until next
> > probe_timer. But probably better to send it immediately too, I need to
> > confirm.
>
> I think so. At least I don't know what to wait for.
I'm not sure about this, as it says:

   PLPMTU < PL_PTB_SIZE < PROBED_SIZE
   ...
      *  The PL can use the reported PL_PTB_SIZE from the PTB message as
         the next search point when it resumes the search algorithm.

It doesn't seem to say that the next probe has to be sent immediately.
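
For reference, the PTB handling I described above (take the reported size as the next probe size only when it lies strictly between the current pmtu and the probed size, then wait for the next probe_timer) can be sketched roughly as below. This is only an illustration: the function name and signature are made up for this mail and are not the actual Linux SCTP code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not taken from the kernel.
 * Returns the size to use for the next probe after an ICMP PTB arrives.
 *
 *   pmtu       - current (already confirmed) PLPMTU
 *   probe_size - size of the probe that is currently outstanding
 *   ptb_size   - PL_PTB_SIZE reported by the PTB message
 */
uint32_t ptb_next_probe_size(uint32_t pmtu, uint32_t probe_size,
                             uint32_t ptb_size)
{
    /* A probe is always larger than the confirmed PLPMTU. */
    assert(pmtu < probe_size);

    /* PLPMTU < PL_PTB_SIZE < PROBED_SIZE: use it as the next search point. */
    if (ptb_size > pmtu && ptb_size < probe_size)
        return ptb_size;

    /* Any other reported size leaves the probe size unchanged here. */
    return probe_size;
}
```

With the numbers from earlier in the thread, a PTB reporting 1500 while 1528 is being probed and 1496 is confirmed would make 1500 the next probe size. Whether that probe goes out immediately or at the next timer expiry is exactly the open question above, so the sketch leaves it out.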


>
> >
> >>
> >>>
> >>> The step we are using is 32; when a probe fails, we reduce the step to 4. For example:
> >>> 1400, 1432, 1464, 1496, 1528 (failed), 1500 (1496 + 4), 1504 (failed,
> >>> so 1500 is the PMTU).
> >>
> >> What does failed mean? Does it mean that you have sent MAX_PROBES (=3?) probe packets and waited for each plpmtud_probe_interval ms without receiving a response?
> > yes
> >
> >>
> >> If so, it might make sense to continue with smaller candidates earlier. For example, after one unanswered probe packet.
> > Sounds like a good way to go, and it would save 2 intervals to get the
> > optimal value in the normal case.
> > But if the failure is false (like the link is unstable), it may also
> > take some time to catch up to the bigger candidate.
>
> Right, it's a trade-off. Which is better depends on the probability of a probe packet being lost for a reason other than its size.
>
> I chose to do something like this, when searching for a PMTU of 1472:
>
> 1400 ack
> 1432 ack
> 1464 timeout (false negative)
> 1436 ack
> 1440 ack
> 1444 ack
> 1448 ack
> 1452 ack
> 1456 ack
> 1460 ack
> 1464 ack
> 1496 timeout
> 1468 ack
> 1472 ack
> 1476 timeout
> 1476 timeout
> 1476 timeout
> done with PMTU=1472
Looks good to me. :-)
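
To check that I read the sequence correctly, here is a small C sketch of how I understand it. It is my own reconstruction from the trace, not Timo's INET code: all names (plpmtud_search, sim_path, sim_probe, max_size) are hypothetical, the base size is assumed to be confirmed already, and the rule "go back to the step of 32 once the candidate that timed out has been acked with the step of 4" is my interpretation of the 1464 -> 1496 transition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BIG_STEP   32
#define SMALL_STEP 4
#define MAX_PROBES 3

/* Stand-in for "send a probe of this size, true on ack, false on timeout". */
typedef bool (*probe_fn)(uint32_t size, void *ctx);

/* Simulated path: drops one probe of size drop_once regardless of the MTU. */
struct sim_path {
    uint32_t mtu;
    uint32_t drop_once;
    int sent;
};

bool sim_probe(uint32_t size, void *ctx)
{
    struct sim_path *p = ctx;

    p->sent++;
    if (size == p->drop_once) {
        p->drop_once = 0;           /* false negative happens only once */
        return false;
    }
    return size <= p->mtu;
}

/* Returns the largest acked size, starting from an already confirmed base. */
uint32_t plpmtud_search(uint32_t base, uint32_t max_size,
                        probe_fn probe, void *ctx)
{
    uint32_t pmtu = base;      /* largest size acked so far */
    uint32_t failed_at = 0;    /* coarse candidate that timed out, 0 if none */

    assert(probe != NULL && base <= max_size);

    for (;;) {
        if (failed_at == 0) {
            /* Coarse phase: one probe per candidate, give up on it early. */
            uint32_t cand = pmtu + BIG_STEP;

            if (cand <= max_size && probe(cand, ctx))
                pmtu = cand;
            else
                failed_at = cand;
            continue;
        }

        /* Fine phase: up to MAX_PROBES probes before declaring failure. */
        uint32_t cand = pmtu + SMALL_STEP;
        bool acked = false;

        if (cand > max_size)
            return pmtu;
        for (int i = 0; i < MAX_PROBES && !acked; i++)
            acked = probe(cand, ctx);
        if (!acked)
            return pmtu;

        pmtu = cand;
        if (pmtu >= failed_at)
            failed_at = 0;     /* earlier timeout was a false negative */
    }
}
```

Running it with a simulated PMTU of 1472, a base of 1400 and a single dropped probe at 1464 sends the same probes as in the list above (minus the initial 1400) and ends with 1472.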

>
> >
> >>
> >>>
> >>> Sorry, "sysctl -w net.sctp.plpmtud_probe_interval=1" won't work.
> >>> As plpmtud_probe_interval is the probe interval TIME for the timer.
> >>> Apart from 0, the minimal value is 5000ms.
> >>>
> >>> So it should be:
> >>>
> >>> plpmtud_probe_interval - INTEGER
> >>>       The time interval (in milliseconds) for sending PLPMTUD probe chunks.
> >>>       These chunks are sent at the specified interval with a variable size
> >>>       to probe the mtu of a given path between 2 endpoints. PLPMTUD will
> >>>       be disabled when 0 is set.
> >>>
> >>>       Default: 0
> >>
> >> What do you mean by probe chunks? You are sending probe *packets* containing a HEARTBEAT and a PAD chunk, right?
> > yes.
> >
> >>
> >> RFC8899 contains:
> >> The PROBE_TIMER is configured to expire after a period longer than the maximum time to receive an acknowledgment to a probe packet.
> >>
> >> So, how about plpmtud_probe_max_ack_time?
> > "plpmtud_probe_interval" I got the name from tcp's sysctl plpmtud in
> > linux. I was hoping to keep this consistent in sysctl and sockopt
> > between Linux and BSD.  Note this parameter is also the interval to
> > send a probe for the current pmtu in Search Complete status.
>
> Do you send probe packets in Search Complete to confirm the current PMTU estimation?
>
> RFC8899 suggests to do this only for non-reliable PLs. For a reliable PL like SCTP, it suggests to use the loss of (data) packets as indication instead.
Can you point out the place in RFC8899 saying so?

What I saw is:

   Search Complete:  The Search Complete Phase is entered when the
      PLPMTU is supported across the network path.  A PL can use a
      CONFIRMATION_TIMER to periodically repeat a probe packet for the
      current PLPMTU size.  If the sender is unable to confirm
      reachability (e.g., if the CONFIRMATION_TIMER expires) or the PL
      signals a lack of reachability, a black hole has been detected and
      DPLPMTUD enters the Base Phase.

It doesn't matter whether it's a reliable or non-reliable PL, no?

>
> >
> >>
> >> Also, I think more parameters would be helpful. For example,
> >>
> >> plpmtud_enable - boolean to control whether to use PLPMTUD (it is more explicit than plpmtud_probe_interval=0 or plpmtud_probe_max_ack_time=0)
> >> plpmtud_max_probes - controls the number of probe packets sent for one candidate.
> >> plpmtud_raise_time - time to wait before probing for a larger PMTU in search complete (0 to disable it).
> >> plpmtud_use_ptb - boolean to control whether to process an ICMP PTB.
> > With these, the control will be more detailed for sure.
> > But I didn't want to introduce too many parameters for this feature,
> > as you know, these parameters could also be per socket/asoc/transport,
> > and doing set/get with sockopt.
> >
> > instead, we keep most fixed:
> >
> > plpmtud_use_ptb = 1
> > plpmtud_raise_time = 30 * plpmtud_probe_max_ack_time(plpmtud_probe_interval)
> > plpmtud_max_probes = 3
> > plpmtud_enable = !! plpmtud_probe_interval
> >
> > Only one variable:
> > plpmtud_probe_interval >= 5000ms
>
> OK
>
> >
> > So I think this is up to the implementation, if you want more things
> > to tune, you can go ahead with these all parameters exposed to users.
>
> Agreed. It is probably a good idea not to add too many parameters.
>
> >
> >>
> >> Timo
> >>
> >>>
> >>> Thanks.
> >>>>>      be disabled when 0 is set.
> >>>>>
> >>>>>      Default: 0
> >>>>>
> >>>>> 2. a socket option that can be used per socket, assoc or transport
> >>>>>
> >>>>> /* PLPMTUD Probe Interval socket option */
> >>>>> struct sctp_probeinterval {
> >>>>>      sctp_assoc_t spi_assoc_id;
> >>>>>      struct sockaddr_storage spi_address;
> >>>>>      __u32 spi_interval;
> >>>>> };
> >>>>>
> >>>>> #define SCTP_PLPMTUD_PROBE_INTERVAL    133
> >>>>>
> >>>>>
> >>>>> The value above will enable/disable the PLPMTUD probe by setting up the probe
> >>>>> interval for the timer. When it's 0, the timer will also stop and
> >>>>> PLPMTUD is disabled.
> >>>>> This way, we don't need to introduce more options.
> >>>> OK.
> >>>>>
> >>>>> We're expecting to keep consistent with BSD on this, pls check and
> >>>>> share your thoughts.
> >>>> Looks good to me.
> >>>>
> >>>> Best regards
> >>>> Michael
> >>>>>
> >>>>> Thanks.
> >>>>
> >>
> >>
>
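
For anyone who wants to try the socket option quoted above, a minimal sketch of how it might be set from user space follows. The struct and the option number are copied from this thread; everything else is an assumption made for illustration: the helper name set_probe_interval is made up, sctp_assoc_t is declared locally as a 32-bit integer so the example builds without the SCTP headers, and a zeroed spi_address is assumed to mean "not a specific transport".

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

#ifndef IPPROTO_SCTP
#define IPPROTO_SCTP 132
#endif

/* Local copies of the definitions quoted in this thread (assumption:
 * sctp_assoc_t is a 32-bit signed integer). */
typedef int32_t sctp_assoc_t;

struct sctp_probeinterval {
    sctp_assoc_t spi_assoc_id;
    struct sockaddr_storage spi_address;
    uint32_t spi_interval;
};

#define SCTP_PLPMTUD_PROBE_INTERVAL 133
#define PROBE_INTERVAL_MIN_MS       5000  /* "Apart from 0, the minimal value is 5000ms" */

/* Hypothetical helper: 0 disables PLPMTUD, otherwise interval_ms >= 5000. */
int set_probe_interval(int fd, sctp_assoc_t assoc_id, uint32_t interval_ms)
{
    struct sctp_probeinterval params;

    /* Reject values the thread says are not accepted, before the syscall. */
    if (interval_ms != 0 && interval_ms < PROBE_INTERVAL_MIN_MS) {
        errno = EINVAL;
        return -1;
    }

    memset(&params, 0, sizeof(params));   /* zeroed address: no transport chosen */
    params.spi_assoc_id = assoc_id;
    params.spi_interval = interval_ms;

    return setsockopt(fd, IPPROTO_SCTP, SCTP_PLPMTUD_PROBE_INTERVAL,
                      &params, sizeof(params));
}
```

On an SCTP socket fd, set_probe_interval(fd, 0, 5000) would ask for a 5 second probe interval and set_probe_interval(fd, 0, 0) would turn PLPMTUD off again. On a kernel without this option the call is expected to fail, so the return value should be checked.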



