On Wed, May 30, 2018 at 1:06 AM, Marcelo Ricardo Leitner
<marcelo.leitner@xxxxxxxxx> wrote:
> On Tue, May 29, 2018 at 12:03:46PM -0400, Neal Cardwell wrote:
>> On Tue, May 29, 2018 at 11:45 AM Marcelo Ricardo Leitner <
>> marcelo.leitner@xxxxxxxxx> wrote:
>> > - patch2 - fix rtx attack vector
>> >   - Add the floor value to rto_min to HZ/20 (which fits the values
>> >     that Michael shared on the other email)
>>
>> I would encourage allowing minimum RTO values down to 5ms, if the ACK
>> policy in the receiver makes this feasible. Our experience is that in
>> datacenter environments it can be advantageous to allow timer-based loss
>> recoveries using timeout values as low as 5ms, e.g.:
>
> Thanks Neal. On Xin's tests, the heartbeat timer becomes an issue at
> ~25ms already. Xin, can you share more details on the hw, which CPU
> was used?

It was on a KVM guest, "-smp 2,cores=1,threads=1,sockets=2"

# lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                2
On-line CPU(s) list:   0,1
Thread(s) per core:    1
Core(s) per socket:    1
Socket(s):             2
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 13
Model name:            QEMU Virtual CPU version 1.5.3
Stepping:              3
CPU MHz:               2397.222
BogoMIPS:              4794.44
Hypervisor vendor:     KVM
Virtualization type:   full
L1d cache:             32K
L1i cache:             32K
L2 cache:              4096K
NUMA node0 CPU(s):     0,1
Flags:                 fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pse36 clflush mmx fxsr sse sse2 syscall nx lm rep_good nopl cpuid pni cx16 hypervisor lahf_lm abm pti

If we're counting on max_t to fix the CPU getting stuck, it should not
matter much whether min rto is below the value that causes it.

>
> Anyway, what about we add a floor to rto_max too, so that RTO can
> actually grow into something bigger that doesn't hog the CPU?
> Like:
> rto_min floor = 5ms
> rto_max floor = 50ms
>
> Marcelo
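
To make the proposed floors concrete, here is a rough userspace sketch
of how they could be applied. This is not the actual SCTP code: the
names rto_bounds, rto_apply_floors() and rto_clamp() are hypothetical,
values are in milliseconds rather than jiffies, and max_u32() only
stands in for the kernel's max_t(). It assumes the 5ms/50ms floors
suggested above.

```c
#include <assert.h>
#include <stdint.h>

/* Floors suggested in this thread (assumed values, in milliseconds). */
#define RTO_MIN_FLOOR_MS 5u
#define RTO_MAX_FLOOR_MS 50u

/* Hypothetical container for the two user-tunable RTO bounds. */
struct rto_bounds {
	uint32_t rto_min_ms;
	uint32_t rto_max_ms;
};

/* Larger of two values; a fixed-type stand-in for the kernel's max_t(). */
static uint32_t max_u32(uint32_t a, uint32_t b)
{
	return a > b ? a : b;
}

/*
 * Raise the requested bounds to the floors, then force
 * rto_max >= rto_min so the RTO always has a valid range to back off into.
 */
static struct rto_bounds rto_apply_floors(uint32_t req_min_ms,
					  uint32_t req_max_ms)
{
	struct rto_bounds b;

	b.rto_min_ms = max_u32(req_min_ms, RTO_MIN_FLOOR_MS);
	b.rto_max_ms = max_u32(req_max_ms, RTO_MAX_FLOOR_MS);
	b.rto_max_ms = max_u32(b.rto_max_ms, b.rto_min_ms);
	return b;
}

/* Clamp a computed RTO into [rto_min, rto_max]. */
static uint32_t rto_clamp(const struct rto_bounds *b, uint32_t rto_ms)
{
	assert(b->rto_min_ms <= b->rto_max_ms);

	if (rto_ms < b->rto_min_ms)
		return b->rto_min_ms;
	if (rto_ms > b->rto_max_ms)
		return b->rto_max_ms;
	return rto_ms;
}
```

With this, a request of rto_min = rto_max = 1ms ends up as 5ms/50ms, so
a timer that keeps expiring can still back off to 50ms instead of
re-arming every millisecond.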