Michal Hocko wrote: > On Wed 22-11-17 19:53:59, Tetsuo Handa wrote: > > Tetsuo Handa wrote: > > > Andrew Morton wrote: > > > > On Tue, 21 Nov 2017 21:02:37 +0900 Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx> wrote: > > > > > > > > > There are users not checking for register_shrinker() failure. > > > > > Continuing with ignoring failure can lead to later oops at > > > > > unregister_shrinker(). > > > > > > > > > > ... > > > > > > > > > > --- a/include/linux/shrinker.h > > > > > +++ b/include/linux/shrinker.h > > > > > @@ -75,6 +75,6 @@ struct shrinker { > > > > > #define SHRINKER_NUMA_AWARE (1 << 0) > > > > > #define SHRINKER_MEMCG_AWARE (1 << 1) > > > > > > > > > > -extern int register_shrinker(struct shrinker *); > > > > > +extern __must_check int register_shrinker(struct shrinker *); > > > > > extern void unregister_shrinker(struct shrinker *); > > > > > #endif > > > > > > > > hm, well, OK, it's a small kmalloc(GFP_KERNEL). That won't be > > > > failing. > > > > > > It failed by fault injection and resulted in a report at > > > http://lkml.kernel.org/r/001a113f996099503a055e793dd3@xxxxxxxxxx . > > > > Since kzalloc() can become > 32KB allocation if CONFIG_NODES_SHIFT > 12 > > (which might not be impossible in near future), register_shrinker() can > > potentially become a costly allocation which might fail without invoking > > the OOM killer. It is a good opportunity to think whether we should allow > > register_shrinker() to fail. > > Is it really that hard to fix callers to handle the error? At least adding __must_check will encourage callers to check for an error. But > > > > > Affected code seems to be fs/xfs, fs/super.c, fs/quota, > > > > arch/x86/kvm/mmu, drivers/gpu/drm/ttm, drivers/md and a bunch of > > > > staging stuff. > > > > > > > > I'm not sure this is worth bothering about? > > > > > > > > > > Continuing with failed register_shrinker() is almost always wrong. > > > Though I don't know whether mm/zsmalloc.c case can make sense. > > > > > > > Thinking from the fact that register_shrinker() had been "void" until Linux 3.11 > > and we did not take appropriate precautions when changing to "int" in Linux 3.12, > > we need to consider making register_shrinker() "void" again. > > > > If we could agree with opening up the use of __GFP_NOFAIL for allocating a few > > non-contiguous pages on large systems, we can make register_shrinker() "void" > > again. (Draft patch is shown below. I choose array of kmalloc(PAGE_SIZE) > > rather than kvmalloc() in order to use __GFP_NOFAIL.) > > I am not sure we want to overcomplicate the code too much. Most > architectures do not have that many numa nodes to care. If we really > need to care maybe we should rethink and get rid of the per numa > deferred count altogether. the amount of changes needed for checking for an error will exceed the amount of changes needed for making register_shrinker() not to return an error. Do we want to overcomplicate register_shrinker() callers? I think that making register_shrinker() not to fail is less error-prone than updating register_shrinker() callers to check for failure. Thus, I appreciate if we could agree with __GFP_NOFAIL for register_shrinker(). -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>