Hello, On Wed, Jan 24, 2024 at 10:01:47AM +0800, Kemeng Shi wrote: > Hi Tejun, thanks for reply. For cgroup wb, it will belongs to a global wb > domain and a cgroup domain. I agree the way how we calculate wb's threshold > in global domain as you described above. This patch tries to fix calculation > of wb's threshold in cgroup domain which now is wb_calc_thresh(mdtc->wb, > mdtc->bg_thresh)), means: > (wb bandwidth) / (system bandwidth) * (*cgroup domain threshold*) > The cgroup domain threshold is > (memory of cgroup domain) / (memory of system) * (system threshold). > Then the wb's threshold in cgroup will be smaller than expected. > > Consider following domain hierarchy: > global domain (100G) > / \ > cgroup domain1(50G) cgroup domain2(50G) > | | > bdi wb1 wb2 > Assume wb1 and wb2 has the same bandwidth. > We have global domain bg_thresh 10G, cgroup domain bg_thresh 5G. > Then we have: > wb's thresh in global domain = 10G * (wb bandwidth) / (system bandwidth) > = 10G * 1/2 = 5G > wb's thresh in cgroup domain = 5G * (wb bandwidth) / (system bandwidth) > = 5G * 1/2 = 2.5G > At last, wb1 and wb2 will be limited at 2.5G, the system will be limited > at 5G which is less than global domain bg_thresh 10G. > > After the fix, threshold in cgroup domain will be: > (wb bandwidth) / (cgroup bandwidth) * (cgroup domain threshold) > The wb1 and wb2 will be limited at 5G, the system will be limited at > 10G which equals to global domain bg_thresh 10G. > > As I didn't take a deep look into memory cgroup, please correct me if > anything is wrong. Thanks! > > Also, how is this tested? Was there a case where the existing code > > misbehaved that's improved by this patch? Or is this just from reading code? > > This is jut from reading code. Would the case showed above convince you > a bit. Look forward to your reply, thanks!. So, the explanation makes some sense to me but can you please construct a case to actually demonstrate the problem and fix? I don't think it'd be wise to apply the change without actually observing the code change does what it says it does. Thanks. -- tejun