On Tue 15-03-22 21:42:29, Miaohe Lin wrote:
> On 2022/3/15 0:44, Michal Hocko wrote:
> > On Fri 11-03-22 17:36:24, Miaohe Lin wrote:
> >> If mpol_new is allocated but not used in restart loop, mpol_new will be
> >> freed via mpol_put before returning to the caller. But refcnt is not
> >> initialized yet, so mpol_put could not do the right things and might
> >> leak the unused mpol_new.
> >
> > The code is really hideous but is there really any bug there? AFAICS the
> > new policy is only allocated in if (n->end > end) branch and that one
> > will set the reference count on the retry. Or am I missing something?
> >
>
> Many thanks for your comment.
> IIUC, new policy is allocated via the below code:
>
> shared_policy_replace:
>   alloc_new:
>     write_unlock(&sp->lock);
>     ret = -ENOMEM;
>     n_new = kmem_cache_alloc(sn_cache, GFP_KERNEL);
>     if (!n_new)
>         goto err_out;
>     mpol_new = kmem_cache_alloc(policy_cache, GFP_KERNEL);
>     if (!mpol_new)
>         goto err_out;
>     goto restart;
>
> And mpol_new' reference count will be set before used in n->end > end case. But
> if that is "not" the case, i.e. mpol_new is not inserted into the rb_tree, mpol_new
> will be freed via mpol_put before return:

One thing I missed previously is that the lock is dropped during the
allocation, so I guess the memory policy could have been changed during
that time. Is this possible? Have you explored this possibility? Is this
a theoretical problem, or can it be triggered intentionally?

These details would be really interesting for the changelog so that we
can judge how important this would be.

-- 
Michal Hocko
SUSE Labs