On Tue 24 Jan 2023 at 11:19, Vlad Buslov <vladbu@xxxxxxxxxx> wrote: > On Tue 24 Jan 2023 at 09:41, Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx> wrote: >> Hi Vlad, >> >> On Tue, Jan 24, 2023 at 09:06:13AM +0200, Vlad Buslov wrote: >>> >>> On Fri 20 Jan 2023 at 12:41, Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx> wrote: >>> > Hi Vlad, >>> > >>> > On Thu, Jan 19, 2023 at 08:51:01PM +0100, Vlad Buslov wrote: >>> >> Following patches in series need to update flowtable rule several times >>> >> during its lifetime in order to synchronize hardware offload with actual ct >>> >> status. However, reusing existing 'refresh' logic in act_ct would cause >>> >> data path to potentially schedule significant amount of spurious tasks in >>> >> 'add' workqueue since it is executed per-packet. Instead, introduce a new >>> >> flow 'update' flag and use it to schedule async flow refresh in flowtable >>> >> gc which will only be executed once per gc iteration. >>> > >>> > So the idea is to use a NF_FLOW_HW_UPDATE which triggers the update >>> > from the garbage collector. I understand the motivation here is to >>> > avoid adding more work to the workqueue, by simply letting the gc >>> > thread pick up for the update. >>> > >>> > I already proposed in the last year alternative approaches to improve >>> > the workqueue logic, including cancelation of useless work. For >>> > example, cancel a flying "add" work if "delete" just arrive and the >>> > work is still sitting in the queue. Same approach could be use for >>> > this update logic, ie. cancel an add UDP unidirectional or upgrade it >>> > to bidirectional if, by the time we see traffic in both directions, >>> > then work is still sitting in the queue. >>> >>> Thanks for the suggestion. I'll try to make this work over regular >>> workqueues without further extending the flow flags and/or putting more >>> stuff into gc. >> >> Let me make a second pass to sort out thoughts on this. >> >> Either we use regular workqueues (without new flags) or we explore >> fully consolidating this hardware offload workqueue infrastructure >> around flags, ie. use flags not only for update events, but also for >> new and delete. >> >> This would go more in the direction of your _UPDATE flag idea: >> >> - Update the hardware offload workqueue to iterate over the >> flowtable. The hardware offload workqueue would be "scanning" for >> entries in the flowtable that require some sort of update in the >> hardware. The flags would tell what kind of action is needed. >> >> - Add these flags: >> >> NF_FLOW_HW_NEW >> NF_FLOW_HW_UPDATE >> NF_FLOW_HW_DELETE >> >> and remove the work object (flow_offload_work) and the existing list. >> If the workqueue finds an entry with: >> >> NEW|DELETE, this means this is short lived flow, not worth to waste >> cycles to offload it. >> NEW|UPDATE, this means this is an UDP flow that is bidirectional. >> >> Then, there will be no more work allocation + "flying" work objects to >> the hardware offload workqueue. Instead, the hardware offload >> workqueue will be iterating. >> >> This approach would need _DONE flags to annotate if the offload >> updates have been applied to hardware already (similar to the >> conntrack _DONE flags). >> >> (Oh well, this proposal is adding even more flags. But I think flags >> are not the issue, but the mixture of the existing flow_offload_work >> approach with this new _UPDATE flag and the gc changes). >> >> If flow_offload_work is removed, we would also need to add a: >> >> struct nf_flowtable *flowtable; >> >> field to the flow_offload entry, which is an entry field that is >> passed via flow_offload_work. So it is one extra field for the each >> flow_offload entry. >> >> The other alternative is to use the existing nf_flow_offload_add_wq >> with UPDATE command, which might result in more flying objects in >> turn. I think this is what you are trying to avoid with the _UPDATE >> flag approach. > > This looks interesting, but is very ambitious and will probably be a > bigger change than this whole series. I have an idea how we can leverage > existing 'refresh' mechanism for updating flow state that doesn't > involve large-scale refactoring of existing offload infrastructure, > which I would prefer to try first. WDYT? Update: to illustrate this point I prepared V4 that uses regular refresh to update the flow and also tries to prevent excessive wq spam or updating flow offload to a state that is already outdated.