Re: [PATCH net-next] ipvs: allow connection reuse for unconfirmed conntrack

Simon Horman <horms@xxxxxxxxxxxx> · Thu, 2 Jul 2020 11:18:29 +0200

On Wed, Jul 01, 2020 at 06:17:19PM +0300, Julian Anastasov wrote:
> YangYuxi is reporting that connection reuse
> causes a one-second delay when a SYN hits
> an existing connection in TIME_WAIT state.
> This delay was added to give time to expire
> both the IPVS connection and the corresponding
> conntrack. This was considered a rare case
> at that time, but it is causing problems for
> some environments such as Kubernetes.
> 
> As nf_conntrack_tcp_packet() can decide to
> release the conntrack in TIME_WAIT state and
> to replace it with a fresh NEW conntrack, we
> can use this to allow rescheduling just by
> tuning our check: if the conntrack is
> confirmed, we cannot schedule it to a different
> real server and the one-second delay still
> applies, but if a new conntrack was created,
> we are free to select a new real server without
> any delay.
> 
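
As a reading aid, the tuned check described above can be modeled
roughly as below. This is a standalone userspace sketch, not the
patch itself: the enum, decide_reuse() and reuse_action_name() are
hypothetical names made up for illustration, and the only input
modeled is whether the conntrack attached to the incoming SYN is
already confirmed.

```c
#include <stdbool.h>

/* Hypothetical outcomes for a SYN that hits an IPVS connection in TIME_WAIT. */
enum reuse_action {
	REUSE_RESCHEDULE_NOW,  /* fresh NEW conntrack: pick a real server without delay */
	REUSE_DROP_AND_EXPIRE  /* confirmed conntrack: old behaviour, one-second delay */
};

/*
 * Illustrative decision only. A confirmed conntrack is still bound to the
 * previous real server, so it cannot be rescheduled right away. If conntrack
 * replaced the TIME_WAIT entry with a fresh (unconfirmed) one, nothing ties
 * the flow to the old destination anymore.
 */
static enum reuse_action decide_reuse(bool ct_confirmed)
{
	return ct_confirmed ? REUSE_DROP_AND_EXPIRE : REUSE_RESCHEDULE_NOW;
}

/* Human-readable label, handy when printing the decision in a test. */
static const char *reuse_action_name(enum reuse_action action)
{
	return action == REUSE_RESCHEDULE_NOW ? "reschedule" : "drop-and-expire";
}
```

In this model, decide_reuse(false) corresponds to the new fast path
and decide_reuse(true) to the existing delayed path.
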
> YangYuxi lists some of the problem reports:
> 
> - One second connection delay in masquerading mode:
> https://marc.info/?t=151683118100004&r=1&w=2
> 
> - IPVS low throughput #70747
> https://github.com/kubernetes/kubernetes/issues/70747
> 
> - Apache Bench can fill up ipvs service proxy in seconds #544
> https://github.com/cloudnativelabs/kube-router/issues/544
> 
> - Additional 1s latency in `host -> service IP -> pod`
> https://github.com/kubernetes/kubernetes/issues/90854
> 
> Fixes: f719e3754ee2 ("ipvs: drop first packet to redirect conntrack")
> Co-developed-by: YangYuxi <yx.atom1@xxxxxxxxx>
> Signed-off-by: YangYuxi <yx.atom1@xxxxxxxxx>
> Signed-off-by: Julian Anastasov <ja@xxxxxx>

Thanks, this looks good to me.

Reviewed-by: Simon Horman <horms@xxxxxxxxxxxx>

Pablo, could you consider applying this to nf-next?