On Sat, 5 Nov 2022 at 05:16, Qi Zheng <zhengqi.arch@xxxxxxxxxxxxx> wrote: > > On 2022/11/5 02:21, Dmitry Vyukov wrote: > >> On Fri, 4 Nov 2022 at 11:03, Jason Gunthorpe <jgg@xxxxxxxxxx> wrote: > >>> > >>> On Fri, Nov 04, 2022 at 10:47:17AM -0700, Dmitry Vyukov wrote: > >>>>>> Do we know how common/useful such an allocation pattern is? > >>>>> > >>>>> I have coded something like this a few times, in my cases it is > >>>>> usually something like: try to allocate a big chunk of memory hoping > >>>>> for a huge page, then fall back to a smaller allocation > >>>>> > >>>>> Most likely the key consideration is that the callsites are using > >>>>> GFP_NOWARN, so perhaps we can just avoid decrementing the nth on a > >>>>> NOWARN case assuming that another allocation attempt will closely > >>>>> follow? > >>>> > >>>> GFP_NOWARN is also extensively used for allocations with > >>>> user-controlled size, e.g.: > >>>> https://elixir.bootlin.com/linux/v6.1-rc3/source/net/unix/af_unix.c#L3451 > >>>> > >>>> That's different and these allocations are usually not repeated. > >>>> So looking at GFP_NOWARN does not look like the right thing to do. > >>> > >>> This may be the best option then, arguably perhaps even more > >>> "realistic" than normal fail_nth as in a real system if this stuff > >>> starts failing there is a good chance things from then on will fail > >>> too during the error cleanup. > >>> > >>>>> However, this would also have to fix the obnoxious behavior of fail > >>>>> nth where it fails its own copy_from_user on its write system call - > >>>>> meaning there would be no way to turn it off. > >>>> > >>>> Oh, interesting. We added failing of copy_from/to_user later and did > >>>> not consider such interaction. > >>>> Filed https://bugzilla.kernel.org/show_bug.cgi?id=216660 for this. > >>> > >>> Oh, I will tell you the other two bugish things I noticed > >>> > >>> __should_failslab() has this: > >>> > >>> if (gfpflags & __GFP_NOWARN) > >>> failslab.attr.no_warn = true; > >>> > >>> return should_fail(&failslab.attr, s->object_size); > >>> > >>> Which always permanently turns off no_warn for slab during early > >>> boot. This is why syzkaller reports are so confusing. They trigger a > >>> slab fault injection, which in all other cases gives a notification > >>> backtrace, but in slab cases there is no hint about the fault > >>> injection in the log. > >> > >> Ouch, this looks like a bug in: > >> > >> commit 3f913fc5f9745613088d3c569778c9813ab9c129 > >> Author: Qi Zheng <zhengqi.arch@xxxxxxxxxxxxx> > >> Date: Thu May 19 14:08:55 2022 -0700 > >> mm: fix missing handler for __GFP_NOWARN > >> > >> +Qi could you please fix it? > >> > >> At the very least the local gfpflags should not alter the global > >> failslab.attr that is persistent and shared by all tasks. > > > > Oh, It indeed shouldn't alter the global failslab.attr, I'll fix it. > > How about the following changes? If it's ok, I will send this fix patch. > Thanks. :) I think the interface design question is mostly to Akinobu as fault injection maintainer. > diff --git a/include/linux/fault-inject.h b/include/linux/fault-inject.h > index 9f6e25467844..b61a3fb7a2a3 100644 > --- a/include/linux/fault-inject.h > +++ b/include/linux/fault-inject.h > @@ -20,7 +20,6 @@ struct fault_attr { > atomic_t space; > unsigned long verbose; > bool task_filter; > - bool no_warn; > unsigned long stacktrace_depth; > unsigned long require_start; > unsigned long require_end; > @@ -40,12 +39,12 @@ struct fault_attr { > .ratelimit_state = RATELIMIT_STATE_INIT_DISABLED, \ > .verbose = 2, \ > .dname = NULL, \ > - .no_warn = false, \ > } > > #define DECLARE_FAULT_ATTR(name) struct fault_attr name = > FAULT_ATTR_INITIALIZER > int setup_fault_attr(struct fault_attr *attr, char *str); > bool should_fail(struct fault_attr *attr, ssize_t size); > +bool should_fail_gfp(struct fault_attr *attr, ssize_t size, gfp_t > gfpflags); > > #ifdef CONFIG_FAULT_INJECTION_DEBUG_FS > > diff --git a/lib/fault-inject.c b/lib/fault-inject.c > index 4b8fafce415c..95af50832770 100644 > --- a/lib/fault-inject.c > +++ b/lib/fault-inject.c > @@ -41,9 +41,6 @@ EXPORT_SYMBOL_GPL(setup_fault_attr); > > static void fail_dump(struct fault_attr *attr) > { > - if (attr->no_warn) > - return; > - > if (attr->verbose > 0 && __ratelimit(&attr->ratelimit_state)) { > printk(KERN_NOTICE "FAULT_INJECTION: forcing a failure.\n" > "name %pd, interval %lu, probability %lu, " > @@ -98,12 +95,7 @@ static inline bool fail_stacktrace(struct fault_attr > *attr) > > #endif /* CONFIG_FAULT_INJECTION_STACKTRACE_FILTER */ > > -/* > - * This code is stolen from failmalloc-1.0 > - * http://www.nongnu.org/failmalloc/ > - */ > - > -bool should_fail(struct fault_attr *attr, ssize_t size) > +bool should_fail_check(struct fault_attr *attr, ssize_t size) > { > bool stack_checked = false; > > @@ -118,7 +110,7 @@ bool should_fail(struct fault_attr *attr, ssize_t size) > fail_nth--; > WRITE_ONCE(current->fail_nth, fail_nth); > if (!fail_nth) > - goto fail; > + return true; > > return false; > } > @@ -151,7 +143,19 @@ bool should_fail(struct fault_attr *attr, ssize_t size) > if (attr->probability <= get_random_u32_below(100)) > return false; > > -fail: > + return true; > +} > + > +/* > + * This code is stolen from failmalloc-1.0 > + * http://www.nongnu.org/failmalloc/ > + */ > + > +bool should_fail(struct fault_attr *attr, ssize_t size) > +{ > + if (!should_fail_check(attr, size)) > + return false; > + > fail_dump(attr); > > if (atomic_read(&attr->times) != -1) > @@ -161,6 +165,21 @@ bool should_fail(struct fault_attr *attr, ssize_t size) > } > EXPORT_SYMBOL_GPL(should_fail); > > +bool should_fail_gfp(struct fault_attr *attr, ssize_t size, gfp_t gfpflags) > +{ > + if (!should_fail_check(attr, size)) > + return false; > + > + if (!(gfpflags & __GFP_NOWARN)) > + fail_dump(attr); > + > + if (atomic_read(&attr->times) != -1) > + atomic_dec_not_zero(&attr->times); > + > + return true; > +} > +EXPORT_SYMBOL_GPL(should_fail_gfp); should_fail/should_fail_gfp duplicate some code + gfp is slab-specific (while clumsy to use for other fault injection types + we won't be able to pass any runtime flags that are not present in gfp). So I would go either with: bool should_fail(struct fault_attr *attr, ssize_t size, bool nowarn); or even more extensible interface would be: enum fault_flags { fault_nowarn = 1 << 0, }; bool should_fail(struct fault_attr *attr, ssize_t size, fault_flags flags); And if we don't want to change all callers to avoid code duplication: bool should_fail_ex(struct fault_attr *attr, ssize_t size, fault_flags flags); bool should_fail(struct fault_attr *attr, ssize_t size) { return should_fail_ex(attr, size, 0); } > #ifdef CONFIG_FAULT_INJECTION_DEBUG_FS > > static int debugfs_ul_set(void *data, u64 val) > diff --git a/mm/failslab.c b/mm/failslab.c > index 58df9789f1d2..21338b256791 100644 > --- a/mm/failslab.c > +++ b/mm/failslab.c > @@ -30,10 +30,7 @@ bool __should_failslab(struct kmem_cache *s, gfp_t > gfpflags) > if (failslab.cache_filter && !(s->flags & SLAB_FAILSLAB)) > return false; > > - if (gfpflags & __GFP_NOWARN) > - failslab.attr.no_warn = true; > - > - return should_fail(&failslab.attr, s->object_size); > + return should_fail_gfp(&failslab.attr, s->object_size, gfpflags); > } > > static int __init setup_failslab(char *str) > diff --git a/mm/page_alloc.c b/mm/page_alloc.c > index 7192ded44ad0..4e70b5599ada 100644 > --- a/mm/page_alloc.c > +++ b/mm/page_alloc.c > @@ -3912,10 +3912,7 @@ static bool __should_fail_alloc_page(gfp_t > gfp_mask, unsigned int order) > (gfp_mask & __GFP_DIRECT_RECLAIM)) > return false; > > - if (gfp_mask & __GFP_NOWARN) > - fail_page_alloc.attr.no_warn = true; > - > - return should_fail(&fail_page_alloc.attr, 1 << order); > + return should_fail_gfp(&fail_page_alloc.attr, 1 << order, gfp_mask); > } > > #ifdef CONFIG_FAULT_INJECTION_DEBUG_FS > > > > > But a warning should not be printed for callers that currently specify > > __GFP_NOWARN, because that could lead to deadlocks, such as the deadlock > > case mentioned in commit 6b9dbedbe349 ("tty: fix deadlock caused by > > calling printk() under tty_port->lock"). > > > > Thanks, > > Qi > > > >> > >> But I am not sure if we really don't want to issue the fault injection > >> stack in this case. It's not a WARNING, it's merely an information > >> message. It looks useful in all cases, even with GFP_NOWARN. Why > >> should it be suppressed? > >> > >> > >>> Once that is fixed we can quickly explain why the socketpair() example > >>> in the docs shows success ret codes in the middle of the sweep when > >>> run on syzkaller kernels > >>> > >>> fail_nth interacts badly with other kernel features typically enabled > >>> in syzkaller kernels. Eg it fails in hidden kmemleak instrumentation: > >>> > >>> [ 18.499559] FAULT_INJECTION: forcing a failure. > >>> [ 18.499559] name failslab, interval 1, probability 0, space 0, > >>> times 0 > >>> [ 18.499720] CPU: 10 PID: 386 Comm: iommufd_fail_nt Not tainted > >>> 6.1.0-rc3+ #34 > >>> [ 18.499826] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), > >>> BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 > >>> [ 18.499971] Call Trace: > >>> [ 18.500010] <TASK> > >>> [ 18.500048] show_stack+0x3d/0x3f > >>> [ 18.500114] dump_stack_lvl+0x92/0xbd > >>> [ 18.500171] dump_stack+0x15/0x17 > >>> [ 18.500232] should_fail.cold+0x5/0xa > >>> [ 18.500291] __should_failslab+0xb6/0x100 > >>> [ 18.500349] should_failslab+0x9/0x20 > >>> [ 18.500416] kmem_cache_alloc+0x64/0x4e0 > >>> [ 18.500477] ? __create_object+0x40/0xc50 > >>> [ 18.500539] __create_object+0x40/0xc50 > >>> [ 18.500620] ? kasan_poison+0x3a/0x50 > >>> [ 18.500690] ? kasan_unpoison+0x28/0x50 > >>> [***18.500753] kmemleak_alloc+0x24/0x30 > >>> [ 18.500816] __kmem_cache_alloc_node+0x1de/0x400 > >>> [ 18.500900] ? iopt_alloc_area_pages+0x95/0x560 [iommufd] > >>> [ 18.500993] kmalloc_trace+0x26/0x110 > >>> [ 18.501059] iopt_alloc_area_pages+0x95/0x560 [iommufd] > >>> > >>> Which has the consequence of syzkaller wasting half its fail_nth > >>> effort because it is triggering failures in hidden instrumentation > >>> that has no impact on the main code path. > >>> > >>> Maybe a kmem_cache_alloc_no_fault_inject() would be helpful for a few > >>> cases. > >>> > >>> Jason > > > > -- > Thanks, > Qi > > -- > You received this message because you are subscribed to the Google Groups "syzkaller-bugs" group. > To unsubscribe from this group and stop receiving emails from it, send an email to syzkaller-bugs+unsubscribe@xxxxxxxxxxxxxxxx. > To view this discussion on the web visit https://groups.google.com/d/msgid/syzkaller-bugs/be6a67b0-479f-db0a-fa69-764713135d70%40bytedance.com.