Re: [PATCH net v2 RESEND] netfilter: fix conntrack flows stuck issue on cleanup.

> Hi,
> 
> On Wed, Nov 03, 2021 at 11:31:36AM +0200, Volodymyr Mytnyk wrote:
> > From: Volodymyr Mytnyk <vmytnyk@xxxxxxxxxxx>
> > 
> > On a busy system with a large number (a few thousand) of HW offloaded flows, it
> > is possible to hit a situation where some of the conntrack flows are
> > stuck in the conntrack table (as offloaded) and cannot be removed by the user.
> > 
> > This behaviour happens if the user has configured conntrack using the tc subsystem,
> > offloaded those flows to HW and then deleted the tc configuration from the Linux
> > system by deleting the tc qdiscs.
> > 
> > When qdiscs are removed, the nf_flow_table_free() is called to do the
> > cleanup of HW offloaded flows in conntrack table.
> > 
> > ...
> > process_one_work
> >   tcf_ct_flow_table_cleanup_work()
> >     nf_flow_table_free()
> > 
> > The nf_flow_table_free() does the following things:
> > 
> >   1. cancels gc workqueue
> >   2. marks all flows as teardown
> >   3. executes nf_flow_offload_gc_step() once for each flow to
> >      trigger the correct flow teardown procedure (e.g., allocates
> >      work to delete the HW flow and marks the flow as "dying").
> >   4. waits for all scheduled flow offload works to be finished.
> >   5. executes nf_flow_offload_gc_step() once for each flow to
> >      trigger the deleting of flows.
> > 
> > Root cause:
> > 
> > In step 3, nf_flow_offload_gc_step() expects to move the flow to the "dying"
> > state by using nf_flow_offload_del() and to delete the flow in the next
> > nf_flow_offload_gc_step() iteration. But if the flow is in the "pending" state
> > for some reason (e.g., reading HW stats), it will not be moved to the
> > "dying" state as expected by nf_flow_offload_gc_step() and will not
> > be marked as "dead" for deletion.
> > 
> > In step 5, nf_flow_offload_gc_step() assumes that all flows are marked
> > as "dead" and will be deleted by this call, but this is not true since
> > the state was not set during the previous nf_flow_offload_gc_step()
> > call.
> > 
> > This issue causes some of the flows to get stuck in the connection tracking
> > system or not to be released properly.
> > 
> > To fix this problem, add a nf_flow_table_offload_flush() call between steps 2
> > and 3, to make sure no other flow offload works are in the "pending" state
> > during step 3.
> 
> Thanks for the detailed report.
> 
> I'm attaching two patches, the first one is a preparation patch. The
> second patch flushes the pending work, then it sets the teardown flag
> to all flows in the flowtable and it forces a garbage collector run to
> queue work to remove the flows from hardware, then it flushes this new
> pending work and (finally) it forces another garbage collector run to
> remove the entry from the software flowtable. Compile-tested only.
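
For readers following the thread, the ordering problem described above can be sketched as a small toy model in plain C. This is only an illustration under stated assumptions, not kernel code: struct toy_flow, toy_gc_step(), toy_flush() and toy_table_free() are hypothetical names, and the boolean fields merely stand in for the "teardown", "pending", "dying" and "dead" states mentioned in the report. The flush_before_gc switch models the one thing the proposed fix and the sequence in the reply have in common: pending work is flushed before the first garbage collector pass.

```c
#include <stdbool.h>

/* Toy stand-in for one offloaded flow; these are NOT the kernel's flag names. */
struct toy_flow {
    bool teardown; /* flow has been marked for teardown */
    bool pending;  /* an offload work item for this flow is still queued */
    bool dying;    /* HW delete work has been queued */
    bool dead;     /* HW delete work has completed */
    bool freed;    /* entry removed from the software table */
};

/* One garbage collector pass over a single flow (hypothetical model). */
static void toy_gc_step(struct toy_flow *f)
{
    if (f->freed || !f->teardown)
        return;
    if (f->dead) {
        f->freed = true;
        return;
    }
    if (f->dying || f->pending)
        return; /* delete already queued, or blocked by other pending work */
    f->dying = true;
    f->pending = true; /* the delete work itself is now pending */
}

/* Wait for queued work; a completed delete marks the flow "dead". */
static void toy_flush(struct toy_flow *f)
{
    if (!f->pending)
        return;
    f->pending = false;
    if (f->dying)
        f->dead = true;
}

/* Returns true if the flow was removed by the end of the cleanup. */
static bool toy_table_free(bool pending_at_start, bool flush_before_gc)
{
    struct toy_flow f = { .pending = pending_at_start };

    if (flush_before_gc)
        toy_flush(&f);   /* drain e.g. a queued stats read first */
    f.teardown = true;   /* mark the flow as teardown */
    toy_gc_step(&f);     /* first pass: queue the HW delete */
    toy_flush(&f);       /* wait for the HW delete to finish */
    toy_gc_step(&f);     /* second pass: remove the entry */
    return f.freed;
}
```

In this model, toy_table_free(true, false) leaves the flow in place: the first pass is blocked by the pending work, so the second pass only queues the delete instead of removing the entry. With flush_before_gc set, the same flow is removed.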

Hi Pablo,

Thanks for reviewing the changes and investigating the problem.

I will check the provided patches and get back to you.

Regards,
  Volodymyr


