On Mon, May 11, 2020 at 11:32:36AM +0300, Paul Blakey wrote: > On 5/11/2020 1:14 AM, Pablo Neira Ayuso wrote: [...] > >> @@ -831,9 +832,14 @@ static void flow_offload_queue_work(struct flow_offload_work *offload) > >> { > >> struct flow_offload_work *offload; > >> > >> + if (test_and_set_bit(NF_FLOW_HW_PENDING, &flow->flags)) > >> + return NULL; > > In case of stats, it's fine to lose work. > > > > But how does this work for the deletion case? Does this falls back to > > the timeout deletion? > > We get to nf_flow_table_offload_del (delete) in these cases: > > >-------if (nf_flow_has_expired(flow) || nf_ct_is_dying(flow->ct) || > >------- test_bit(NF_FLOW_TEARDOWN, &flow->flags) { > >------->------- .... > >------->------- nf_flow_offload_del(flow_table, flow); > > Which are all persistent once set but the nf_flow_has_expired(flow). So we will > try the delete > again and again till pending flag is unset or the flow is 'saved' by the already > queued stats updating the timeout. > A pending stats update can't save the flow once it's marked for teardown or > (flow->ct is dying), only delay it. Thanks for explaining. > We didn't mention flush, like in table free. I guess we need to flush the > hardware workqueue > of any pending stats work, then queue the deletion, and flush again: > Adding nf_flow_table_offload_flush(flow_table), after > cancel_delayed_work_sync(&flow_table->gc_work); The "flush" makes sure that stats work runs before the deletion, to ensure no races happen for in-transit work objects, right? We might use alloc_ordered_workqueue() and let the workqueue handle this problem?