Re: The new sysctl and socket option added for PLPMTUD (RFC8899)

> On 6. Jul 2021, at 18:01, Xin Long <lucien.xin@xxxxxxxxx> wrote:
> 
> On Tue, Jul 6, 2021 at 5:13 AM Timo Völker <timo.voelker@xxxxxxxxxxxxxx> wrote:
>> 
>> 
>> Hi Xin,
>> 
>> I implemented RFC8899 for an SCTP simulation model.
> Great, may I ask which one that is?

I used the SCTP implementation in INET. INET is a simulation model suite for OMNeT++.

> 
>> 
>> Comments follow inline.
>> 
>>> Begin forwarded message:
>>> 
>>> From: Xin Long <lucien.xin@xxxxxxxxx>
>>> Subject: Re: The new sysctl and socket option added for PLPMTUD (RFC8899)
>>> Date: 12. June 2021 at 19:32:02 CEST
>>> To: Michael Tuexen <tuexen@xxxxxxxxxxx>
>>> Cc: "linux-sctp @ vger . kernel . org" <linux-sctp@xxxxxxxxxxxxxxx>, Marcelo Ricardo Leitner <marcelo.leitner@xxxxxxxxx>
>>> 
>>> On Fri, Jun 11, 2021 at 4:42 PM <tuexen@xxxxxxxxxxx> wrote:
>>>> 
>>>>> On 11. Jun 2021, at 22:20, Xin Long <lucien.xin@xxxxxxxxx> wrote:
>>>>> 
>>>>> Hi, Michael,
>>>>> 
>>>>> In the linux implementation of RFC8899, we decided to introduce one
>>>>> sysctl and one socket option for users to set up the PLPMTUD probe:
>>>>> 
>>>>> 1. sysctl -w net.sctp.plpmtud_probe_interval=1
>>>>> 
>>>>> plpmtud_probe_interval - INTEGER
>>>>>      The interval (in milliseconds) between PLPMTUD probe chunks. These
>>>>>      chunks are sent at the specified interval with a variable size to
>>>>>      probe the mtu of a given path between 2 associations. PLPMTUD will
>>>> I guess you mean "between 2 end points" instead of "between 2 associations".
>>>> 
>>>> I'm not sure what it means:
>>>> 
>>>> I assume you have candidates 1400, 1420, 1460, 1480, and 1500.
>>>> 
>>>> Assume you sent a probe packet for 1400. Aren't you sending the
>>>> probe packet for 1420 as soon as you get an ACK for the probe packet
>>>> of size 1400? Or are you waiting for plpmtud_probe_interval ms?
>>> It will wait for "plpmtud_probe_interval" ms in searching state, but in
>>> searching complete it will be "plpmtud_probe_interval * 30" ms.
>> 
>> Does this mean you always wait for plpmtud_probe_interval ms? Even if you receive an ack for a probe packet or a PTB?
>> 
>> In my implementation, I start with the next probe immediately when receiving an ack or PTB.
> Yes, we should do it immediately to make this more efficient, and I
> already fixed it in Linux for the ACK case.
> 
> For PTB, I currently only set probe_size to the pmtu from the ICMP packet
> when pmtu > 'current pmtu' && pmtu < probe_size, and wait until the next
> probe_timer expiry. But it is probably better to send it immediately
> too; I need to confirm.

I think so. At least, I don't see what there would be to wait for.
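
To make the PTB rule above concrete, here is a minimal sketch in C. The struct and function names are hypothetical (they are not taken from the Linux code), and the sketch only covers the case described in this thread: a PTB whose MTU lies between the current PMTU and the size of the probe in flight. What to do with a PTB at or below the current PMTU is not discussed here and is simply ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-path PLPMTUD state, for illustration only. */
struct pl_state {
    uint32_t pmtu;        /* currently confirmed PMTU */
    uint32_t probe_size;  /* size of the probe packet in flight */
};

/*
 * Apply the rule from the thread: accept the MTU reported by an ICMP PTB
 * only if it is larger than the current PMTU and smaller than the probe
 * size. Returns true when the caller should send the next probe right
 * away instead of waiting for the probe timer.
 */
static bool plpmtud_handle_ptb(struct pl_state *st, uint32_t ptb_mtu)
{
    assert(st != NULL);
    if (ptb_mtu > st->pmtu && ptb_mtu < st->probe_size) {
        st->probe_size = ptb_mtu;
        return true;
    }
    return false;  /* outside the useful range: leave the state untouched */
}
```

Returning a flag rather than sending from inside the handler keeps the "send immediately or wait for the timer" decision with the caller, which is the point still open above.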

> 
>> 
>>> 
>>> The step we are using is 32; when a probe fails, we reduce the step to 4.
>>> For example: 1400, 1432, 1464, 1496, 1528 (failed), 1500 (1496 + 4),
>>> 1504 (failed, so 1500 is the PMTU).
>> 
>> What does failed mean? Does it mean that you have sent MAX_PROBES (=3?) probe packets and waited plpmtud_probe_interval ms for each without receiving a response?
> yes
> 
>> 
>> If so, it might make sense to continue with smaller candidates earlier. For example, after one unanswered probe packet.
> Sounds like a good way to go, and it would save 2 intervals to get the
> optimal value in the normal case.
> But if the failure is a false negative (e.g. the link is unstable), it
> may also take some time to catch up to the bigger candidate.

Right, it's a trade-off. Which is better depends on the probability of a probe packet being lost for a reason other than its size.

I chose to do something like this, when searching for a PMTU of 1472:

1400 ack
1432 ack
1464 timeout (false negative)
1436 ack
1440 ack
1444 ack
1448 ack
1452 ack
1456 ack
1460 ack
1464 ack
1496 timeout
1468 ack
1472 ack
1476 timeout
1476 timeout
1476 timeout
done with PMTU=1472
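
The trace above can be reproduced with a small simulation. This is only a sketch of one way to read it, not the INET or Linux code: a single unanswered probe at the coarse step (32) switches to the fine step (4) starting from the last acknowledged size, an acknowledgment of the previously failed coarse candidate switches back to coarse steps, and a fine candidate is only given up after MAX_PROBES unanswered probes. The names `plpmtud_search`, `probe_fn`, `sim_path` and `sim_probe` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define BIG_STEP   32  /* coarse step, as in the thread */
#define SMALL_STEP 4   /* fine step, as in the thread */
#define MAX_PROBES 3   /* unanswered probes before a candidate is given up */

/* Hypothetical callback: sends one probe of `size` bytes and reports
 * whether it was acknowledged before the probe timer expired. */
typedef bool (*probe_fn)(int size, void *ctx);

/* Returns the largest acknowledged size, or 0 if even `base` failed. */
static int plpmtud_search(int base, int max_size, probe_fn probe, void *ctx)
{
    int acked = 0;    /* largest size confirmed so far */
    int ceiling = 0;  /* coarse candidate that timed out; 0 = coarse mode */
    int tries = 0;    /* consecutive timeouts for the current candidate */
    int cand = base;

    assert(probe != NULL);
    for (;;) {
        bool in_range = cand <= max_size;

        if (in_range && probe(cand, ctx)) {
            acked = cand;
            tries = 0;
            if (ceiling != 0 && cand >= ceiling)
                ceiling = 0;  /* the earlier timeout was a false negative */
            cand = acked + (ceiling != 0 ? SMALL_STEP : BIG_STEP);
        } else if (ceiling == 0 && acked != 0) {
            /* One unanswered coarse probe: continue with fine steps. */
            ceiling = cand;
            cand = acked + SMALL_STEP;
        } else if (!in_range || ++tries >= MAX_PROBES) {
            break;  /* candidate failed for good: search is complete */
        }
    }
    return acked;
}

/* Simulated path, for illustration only. */
struct sim_path {
    int pmtu;          /* real path MTU */
    int drop_once_at;  /* size whose first probe is lost; 0 for none */
    int probes_sent;   /* counter, to compare against the trace */
};

static bool sim_probe(int size, void *ctx)
{
    struct sim_path *p = ctx;

    p->probes_sent++;
    if (size == p->drop_once_at) {
        p->drop_once_at = 0;  /* lose this size only once */
        return false;
    }
    return size <= p->pmtu;
}
```

Calling `plpmtud_search(1400, 9000, sim_probe, &path)` with `path.pmtu = 1472` and `path.drop_once_at = 1464` sends the same 17 probes as listed above and returns 1472. The upper bound of 9000 is an arbitrary choice for the simulation.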

> 
>> 
>>> 
>>> Sorry, "sysctl -w net.sctp.plpmtud_probe_interval=1" won't work.
>>> As plpmtud_probe_interval is the probe interval TIME for the timer.
>>> Apart from 0, the minimal value is 5000ms.
>>> 
>>> So it should be:
>>> 
>>> plpmtud_probe_interval - INTEGER
>>>       The time interval (in milliseconds) for sending PLPMTUD probe chunks.
>>>       These chunks are sent at the specified interval with a variable size
>>>       to probe the mtu of a given path between 2 endpoints. PLPMTUD will
>>>       be disabled when 0 is set.
>>> 
>>>       Default: 0
>> 
>> What do you mean by probe chunks? You are sending probe *packets* containing a HEARTBEAT and a PAD chunk, right?
> yes.
> 
>> 
>> RFC8899 contains:
>> The PROBE_TIMER is configured to expire after a period longer than the maximum time to receive an acknowledgment to a probe packet.
>> 
>> So, how about plpmtud_probe_max_ack_time?
> I took the name "plpmtud_probe_interval" from TCP's PLPMTUD sysctl in
> Linux. I was hoping to keep this consistent in sysctl and sockopt
> between Linux and BSD.  Note that this parameter is also the interval
> for sending a probe for the current pmtu in the Search Complete state.

Do you send probe packets in Search Complete to confirm the current PMTU estimation?

RFC8899 suggests doing this only for unreliable PLs. For a reliable PL like SCTP, it suggests using the loss of (data) packets as an indication instead.

> 
>> 
>> Also, I think more parameters would be helpful. For example,
>> 
>> plpmtud_enable - boolean to control whether to use PLPMTUD (it is more explicit than plpmtud_probe_interval=0 or plpmtud_probe_max_ack_time=0)
>> plpmtud_max_probes - controls the number of probe packets sent for one candidate.
>> plpmtud_raise_time - time to wait before probing for a larger PMTU in search complete (0 to disable it).
>> plpmtud_use_ptb - boolean to control whether to process an ICMP PTB.
> With these, the control will be more detailed for sure.
> But I didn't want to introduce too many parameters for this feature,
> as you know, these parameters could also be per socket/asoc/transport,
> and doing set/get with sockopt.
> 
> instead, we keep most fixed:
> 
> plpmtud_use_ptb = 1
> plpmtud_raise_time = 30 * plpmtud_probe_max_ack_time(plpmtud_probe_interval)
> plpmtud_max_probes = 3
> plpmtud_enable = !! plpmtud_probe_interval
> 
> Only one variable:
> plpmtud_probe_interval >= 5000ms

OK
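
For reference, the mapping quoted above from the single tunable to the fixed values can be written down as a small helper. This is a sketch only: the struct and function names are hypothetical, and rejecting a non-zero value below 5000 ms with an error is my assumption (an implementation could just as well clamp it).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PLPMTUD_MIN_INTERVAL_MS 5000u  /* minimum non-zero value, per the thread */
#define PLPMTUD_RAISE_FACTOR    30u    /* raise time = 30 * probe interval */
#define PLPMTUD_MAX_PROBES      3u     /* fixed number of probes per candidate */

/* Hypothetical container for the derived settings. */
struct plpmtud_params {
    bool     enable;             /* !!plpmtud_probe_interval */
    bool     use_ptb;            /* always on in this scheme */
    uint32_t max_probes;
    uint32_t probe_interval_ms;
    uint64_t raise_time_ms;      /* 64-bit so that 30 * interval cannot overflow */
};

/* Returns 0 on success, -1 if interval_ms is neither 0 nor >= 5000. */
static int plpmtud_params_from_interval(uint32_t interval_ms,
                                        struct plpmtud_params *out)
{
    assert(out != NULL);
    if (interval_ms != 0 && interval_ms < PLPMTUD_MIN_INTERVAL_MS)
        return -1;  /* assumption: reject rather than clamp */

    out->enable = interval_ms != 0;
    out->use_ptb = true;
    out->max_probes = PLPMTUD_MAX_PROBES;
    out->probe_interval_ms = interval_ms;
    out->raise_time_ms = (uint64_t)PLPMTUD_RAISE_FACTOR * interval_ms;
    return 0;
}
```

With the minimum interval of 5000 ms this gives a raise time of 150000 ms, i.e. 2.5 minutes between attempts to probe for a larger PMTU in search complete.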

> 
> So I think this is up to the implementation; if you want more things
> to tune, you can go ahead and expose all of these parameters to users.

Agreed. It is probably a good idea not to add too many parameters.

> 
>> 
>> Timo
>> 
>>> 
>>> Thanks.
>>>>>      be disabled when 0 is set.
>>>>> 
>>>>>      Default: 0
>>>>> 
>>>>> 2. a socket option that can be used per socket, assoc or transport
>>>>> 
>>>>> /* PLPMTUD Probe Interval socket option */
>>>>> struct sctp_probeinterval {
>>>>>      sctp_assoc_t spi_assoc_id;
>>>>>      struct sockaddr_storage spi_address;
>>>>>      __u32 spi_interval;
>>>>> };
>>>>> 
>>>>> #define SCTP_PLPMTUD_PROBE_INTERVAL    133
>>>>> 
>>>>> 
>>>>> The value above will enable/disable the PLPMTUD probe by setting up the probe
>>>>> interval for the timer. When it's 0, the timer will also stop and
>>>>> PLPMTUD is disabled.
>>>>> This way, we don't need to introduce more options.
>>>> OK.
>>>>> 
>>>>> We're expecting to keep consistent with BSD on this, pls check and
>>>>> share your thoughts.
>>>> Looks good to me.
>>>> 
>>>> Best regards
>>>> Michael
>>>>> 
>>>>> Thanks.
>>>> 
>> 
>> 
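
A usage sketch for the socket option quoted above, assuming the layout and option number proposed in this thread. To keep it self-contained, the struct is redeclared locally under a hypothetical name (`probeinterval_req`) with `int32_t` standing in for `sctp_assoc_t`; a real program would use the definitions from the system's SCTP header instead. How a zero association id and a zeroed address are interpreted (socket-wide versus a specific association or transport) is implementation-specific and not something this sketch decides.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

#ifndef IPPROTO_SCTP
#define IPPROTO_SCTP 132
#endif

#ifndef SCTP_PLPMTUD_PROBE_INTERVAL
#define SCTP_PLPMTUD_PROBE_INTERVAL 133  /* value proposed in the thread */
#endif

/* Local mirror of the proposed struct sctp_probeinterval (hypothetical name). */
struct probeinterval_req {
    int32_t                 spi_assoc_id;  /* stands in for sctp_assoc_t */
    struct sockaddr_storage spi_address;
    uint32_t                spi_interval;  /* in ms; 0 disables PLPMTUD */
};

/*
 * Set the PLPMTUD probe interval on an SCTP socket.
 * Returns the setsockopt() result: 0 on success, -1 with errno set.
 */
static int set_plpmtud_probe_interval(int fd, uint32_t interval_ms)
{
    struct probeinterval_req req;

    /* Per the thread: 0 disables, otherwise at least 5000 ms. */
    assert(interval_ms == 0 || interval_ms >= 5000);

    memset(&req, 0, sizeof(req));  /* assoc id 0, address left unspecified */
    req.spi_interval = interval_ms;

    return setsockopt(fd, IPPROTO_SCTP, SCTP_PLPMTUD_PROBE_INTERVAL,
                      &req, sizeof(req));
}
```

For example, `set_plpmtud_probe_interval(fd, 5000)` would enable probing with the minimum interval, and `set_plpmtud_probe_interval(fd, 0)` would disable it again.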
