On Thu 17-11-22 15:19:20, Zhongkun He wrote:
> Hi Michal, thanks for your reply.
> 
> > It would be better to add the patch that has been tested.
> 
> OK.
> 
> > One way to deal with that would be to use a similar model as css_tryget
> 
> percpu_ref is a good way to reduce the memory footprint in the fast
> path, but it has the potential to make mempolicy heavy. The size of
> struct mempolicy is 32 bytes, and it may not have a long lifetime since
> it is duplicated from the parent in fork(). If we change atomic_t to
> percpu_ref, reads in the fast path get more efficient, but creation and
> deletion get less efficient and the occupied space grows significantly.
> I am not really sure it is worth it.
> 
> atomic_t: 4 bytes
> sizeof(percpu_ref) + sizeof(percpu_ref_data) + cpus * sizeof(unsigned long):
> 16 + 56 + cpus * 8 bytes (e.g. 584 bytes on a 64-CPU machine)

Yes, the memory consumption is going to increase, but the question is
whether this is a real problem. Is it really common to have many vmas
with a dedicated policy?

What I am arguing here is that there are essentially two ways forward.
Either we continue to build on top of the existing and arguably very
fragile code and make it even more subtle, or we follow the general
pattern of proper reference counting (with the usual tricks to reduce
cache line bouncing and similar issues). I do not really see why memory
policies should be any different and require very special treatment.

> > Btw. have you tried to profile those slowdowns to identify hotspots?
> > 
> > Thanks
> 
> Yes, it degrades performance by about 2%-3%, probably because of the
> task_lock and the atomic operations on the reference count, as shown
> in the previous email.
> 
> New hotspots in perf:
>   1.34%  [kernel]  [k] __mpol_put
>   0.53%  [kernel]  [k] _raw_spin_lock
>   0.44%  [kernel]  [k] get_task_policy

Thanks!
-- 
Michal Hocko
SUSE Labs
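
[Editor's note: to make the css_tryget-like pattern discussed above
concrete, below is a minimal, untested sketch of what a percpu_ref-backed
mempolicy could look like. It is not a patch from this thread; the helper
names mpol_alloc()/mpol_tryget()/mpol_destroy() and the trimmed struct
layout are invented for illustration.]

/*
 * Illustrative sketch only -- untested, not from this thread.  It shows
 * the generic percpu_ref pattern (the same one css_tryget is built on)
 * applied to struct mempolicy.
 */
#include <linux/percpu-refcount.h>
#include <linux/nodemask.h>
#include <linux/slab.h>

struct mempolicy {
	struct percpu_ref refcnt;	/* was: atomic_t refcnt */
	unsigned short mode;
	unsigned short flags;
	nodemask_t nodes;
	/* ... */
};

/* Runs once the last reference is dropped after percpu_ref_kill(). */
static void mpol_release(struct percpu_ref *ref)
{
	struct mempolicy *pol = container_of(ref, struct mempolicy, refcnt);

	percpu_ref_exit(&pol->refcnt);	/* frees the per-cpu counters */
	kfree(pol);
}

static struct mempolicy *mpol_alloc(void)
{
	struct mempolicy *pol = kzalloc(sizeof(*pol), GFP_KERNEL);

	if (!pol)
		return NULL;
	/* Starts in per-cpu mode with one reference held by the owner. */
	if (percpu_ref_init(&pol->refcnt, mpol_release, 0, GFP_KERNEL)) {
		kfree(pol);
		return NULL;
	}
	return pol;
}

/*
 * Fast path: readers pin the policy with a per-cpu increment, with no
 * task_lock and no cache line bouncing on a shared counter.
 */
static struct mempolicy *mpol_tryget(struct mempolicy *pol)
{
	return percpu_ref_tryget(&pol->refcnt) ? pol : NULL;
}

static void mpol_put(struct mempolicy *pol)
{
	percpu_ref_put(&pol->refcnt);
}

/*
 * Owner path: when the policy is replaced or its owner exits, drop the
 * initial reference.  The counter degrades to atomic mode and
 * mpol_release() fires once the last reader calls mpol_put().
 */
static void mpol_destroy(struct mempolicy *pol)
{
	percpu_ref_kill(&pol->refcnt);
}

This is also where the per-policy overhead quoted above comes from:
struct percpu_ref embeds 16 bytes inline and allocates a 56-byte
percpu_ref_data plus one unsigned long per CPU behind it.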