On Sat, Aug 27, 2022 at 08:41:50PM +0300, Julian Anastasov wrote:
> Hello,
>
> This patchset implements stats estimation in
>kthread context. Simple tests do not show any problem.
>Please review, comment, test, etc.

Hi, Julian:

Thanks a lot for your work!

I tested the patchset and, so far, it all works well.
On my test server with 64 CPUs and 1 million rules, the total CPU
cost of all ipvs kthreads is about 67% of 1 CPU (31 ipvs threads).
No ping slowdown detected.

Tested-by: Dust Li <dust.li@xxxxxxxxxxxxxxxxx>

>
> Overview of the basic concepts. More in the
>commit messages...
>
>RCU Locking:
>
>- when RCU preemption is enabled the kthreads use just RCU
>lock for walking the chains and we do not need to reschedule.
>May be this is the common case for distribution kernels.
>In this case ip_vs_stop_estimator() is completely lockless.
>
>- when RCU preemption is not enabled, we reschedule by using
>refcnt for every estimator to track if the currently removed
>estimator is used at the same time by kthread for estimation.
>As RCU lock is unlocked during rescheduling, the deletion
>should wait kd->mutex, so that a new RCU lock is applied
>before the estimator is freed with RCU callback.
>
>- As stats are now RCU-locked, tot_stats, svc and dest which
>hold estimator structures are now always freed from RCU
>callback. This ensures RCU grace period after the
>ip_vs_stop_estimator() call.
>
>Kthread data:
>
>- every kthread works over its own data structure and all
>such structures are attached to array
>
>- even while there can be a kthread structure, its task
>may not be running, eg. before first service is added or
>while the sysctl var is set to an empty cpulist or
>when run_estimation is 0.
>
>- a task and its structure may be released if all
>estimators are unlinked from its chains, leaving the
>slot in the array empty
>
>- to add new estimators we use the last added kthread
>context (est_add_ktid).
>The new estimators are linked to
>the chain just before the estimated one, based on add_row.
>This ensures their estimation will start after 2 seconds.
>If estimators are added in bursts, common case if all
>services and dests are initially configured, we may
>spread the estimators to more chains. This will reduce
>the chain imbalance.
>
>- the chain imbalance is not so fatal when we use
>kthreads. We design each kthread for part of the
>possible CPU usage, so even if some chain exceeds its
>time slot it would happen all the time or sporadic
>depending on the scheduling but still keeping the
>2-second interval. The cpulist isolation can make
>the things more stable as a 2-second time interval
>per estimator.
>
>Julian Anastasov (4):
>  ipvs: add rcu protection to stats
>  ipvs: use kthreads for stats estimation
>  ipvs: add est_cpulist and est_nice sysctl vars
>  ipvs: run_estimation should control the kthread tasks
>
> Documentation/networking/ipvs-sysctl.rst |  24 +-
> include/net/ip_vs.h                      | 144 +++++++-
> net/netfilter/ipvs/ip_vs_core.c          |  10 +-
> net/netfilter/ipvs/ip_vs_ctl.c           | 287 ++++++++++++++--
> net/netfilter/ipvs/ip_vs_est.c           | 408 +++++++++++++++++++----
> 5 files changed, 771 insertions(+), 102 deletions(-)
>
>--
>2.37.2