Hello,

On Sat, 9 Apr 2016, Marco Angaroni wrote:

> When using OPS mode in conjunction with SIP persistent-engine, packets
> originating from the same ip-address/port could be balanced to different
> real servers, and (to properly handle SIP responses) OPS connections
> are created in the in-out direction too, where ip_vs_update_conntrack()
> is called to modify the reply tuple.
>
> As a result, there can be collision of conntrack tuples, causing random
> packet drops, as explained below:
>
> conntrack1: orig=CIP->VIP, reply=RIP1->CIP
> conntrack2: orig=RIP2->CIP, reply=CIP->VIP
>
> Tuple CIP->VIP is both in orig of conntrack1 and reply of conntrack2.
> The collision triggers packet drop inside nf_conntrack processing.
>
> In addition, the current implementation deletes the conntrack object at
> every expire of an OPS connection (once every forwarded packet), to have
> it recreated from scratch at next packet traversing IPVS.
>
> Since in OPS mode, by definition, we don't expect any associated
> response, the choices implemented in this patch are:
> a) don't call nf_conntrack_alter_reply() for OPS connections inside
>    ip_vs_update_conntrack().
> b) don't delete the conntrack object at OPS connection expire.
>
> The result is that created conntrack objects for each tuple CIP->VIP,
> RIP-N->CIP, etc. are left in UNREPLIED state and not modified by IPVS
> OPS connection management. This eliminates packet drops and leaves
> a single conntrack object for each tuple packets are sent from.
>
> Signed-off-by: Marco Angaroni <marcoangaroni@xxxxxxxxx>

Thanks! Looks good to me. Simon, please apply.

Signed-off-by: Julian Anastasov <ja@xxxxxx>

> ---
>  net/netfilter/ipvs/ip_vs_conn.c | 3 ++-
>  net/netfilter/ipvs/ip_vs_nfct.c | 4 ++++
>  2 files changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/net/netfilter/ipvs/ip_vs_conn.c b/net/netfilter/ipvs/ip_vs_conn.c
> index dd75d41..292365f 100644
> --- a/net/netfilter/ipvs/ip_vs_conn.c
> +++ b/net/netfilter/ipvs/ip_vs_conn.c
> @@ -836,7 +836,8 @@ static void ip_vs_conn_expire(unsigned long data)
>  		if (cp->control)
>  			ip_vs_control_del(cp);
>
> -		if (cp->flags & IP_VS_CONN_F_NFCT) {
> +		if ((cp->flags & IP_VS_CONN_F_NFCT) &&
> +		    !(cp->flags & IP_VS_CONN_F_ONE_PACKET)) {
>  			/* Do not access conntracks during subsys cleanup
>  			 * because nf_conntrack_find_get can not be used after
>  			 * conntrack cleanup for the net.
> diff --git a/net/netfilter/ipvs/ip_vs_nfct.c b/net/netfilter/ipvs/ip_vs_nfct.c
> index 30434fb..f04fd8d 100644
> --- a/net/netfilter/ipvs/ip_vs_nfct.c
> +++ b/net/netfilter/ipvs/ip_vs_nfct.c
> @@ -93,6 +93,10 @@ ip_vs_update_conntrack(struct sk_buff *skb, struct ip_vs_conn *cp, int outin)
>  	if (IP_VS_FWD_METHOD(cp) != IP_VS_CONN_F_MASQ)
>  		return;
>
> +	/* Never alter conntrack for OPS conns (no reply is expected) */
> +	if (cp->flags & IP_VS_CONN_F_ONE_PACKET)
> +		return;
> +
>  	/* Alter reply only in original direction */
>  	if (CTINFO2DIR(ctinfo) != IP_CT_DIR_ORIGINAL)
>  		return;
> --
> 1.8.3.1

Regards

--
Julian Anastasov <ja@xxxxxx>
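
For readers following the thread, the tuple collision described in the quoted commit message can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code: the types and functions below (struct tuple, struct ct_table, ct_insert(), ops_entries_coexist()) are hypothetical names made up for illustration, and the assumption that a clash is rejected at insert time is a simplification of what nf_conntrack does. The model assumes the default reply tuple is the original tuple with source and destination swapped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified model: a tuple is just a (src, dst) name pair. */
struct tuple { const char *src; const char *dst; };

/* Each modelled conntrack entry owns an original and a reply tuple. */
struct conntrack { struct tuple orig; struct tuple reply; };

#define MAX_CT 8

struct ct_table { struct conntrack entries[MAX_CT]; size_t count; };

static bool tuple_equal(struct tuple a, struct tuple b)
{
	return strcmp(a.src, b.src) == 0 && strcmp(a.dst, b.dst) == 0;
}

/* True if tuple t is already claimed (as orig or reply) by any entry. */
static bool tuple_taken(const struct ct_table *tbl, struct tuple t)
{
	for (size_t i = 0; i < tbl->count; i++) {
		if (tuple_equal(tbl->entries[i].orig, t) ||
		    tuple_equal(tbl->entries[i].reply, t))
			return true;
	}
	return false;
}

/* Insert an entry; a refusal stands in for the packet drop on a clash. */
static bool ct_insert(struct ct_table *tbl, struct conntrack ct)
{
	assert(tbl->count < MAX_CT);
	if (tuple_taken(tbl, ct.orig) || tuple_taken(tbl, ct.reply))
		return false;
	tbl->entries[tbl->count++] = ct;
	return true;
}

/* Assumed default reply tuple: the original with src and dst swapped. */
static struct tuple tuple_invert(struct tuple t)
{
	struct tuple r = { t.dst, t.src };
	return r;
}

/*
 * Build conntrack1 (CIP->VIP) and conntrack2 (RIP2->CIP) and report
 * whether both fit in the table. With alter_reply set, the reply tuples
 * are rewritten as in the commit message; without it they keep the
 * assumed defaults, as for OPS connections after the patch.
 */
bool ops_entries_coexist(bool alter_reply)
{
	struct ct_table tbl = { .count = 0 };
	struct conntrack ct1 = { { "CIP", "VIP" }, { NULL, NULL } };
	struct conntrack ct2 = { { "RIP2", "CIP" }, { NULL, NULL } };

	ct1.reply = tuple_invert(ct1.orig);	/* VIP->CIP */
	ct2.reply = tuple_invert(ct2.orig);	/* CIP->RIP2 */

	if (alter_reply) {
		ct1.reply = (struct tuple){ "RIP1", "CIP" };
		ct2.reply = (struct tuple){ "CIP", "VIP" };	/* equals ct1.orig */
	}

	return ct_insert(&tbl, ct1) && ct_insert(&tbl, ct2);
}
```

In this model, ops_entries_coexist(true) fails because the reply tuple of conntrack2 equals the original tuple of conntrack1, while ops_entries_coexist(false) succeeds because the four tuples are distinct.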