On Mon, 23 Aug 2010 12:35:22 -0700 (PDT) David Rientjes <rientjes@xxxxxxxxxx> wrote: > On Mon, 23 Aug 2010, Andrew Morton wrote: > > > > diff --git a/drivers/md/dm-region-hash.c b/drivers/md/dm-region-hash.c > > > --- a/drivers/md/dm-region-hash.c > > > +++ b/drivers/md/dm-region-hash.c > > > @@ -289,8 +289,12 @@ static struct dm_region *__rh_alloc(struct dm_region_hash *rh, region_t region) > > > struct dm_region *reg, *nreg; > > > > > > nreg = mempool_alloc(rh->region_pool, GFP_ATOMIC); > > > - if (unlikely(!nreg)) > > > - nreg = kmalloc(sizeof(*nreg), GFP_NOIO | __GFP_NOFAIL); > > > + if (unlikely(!nreg)) { > > > + /* FIXME: this may potentially loop forever */ > > > + do { > > > + nreg = kmalloc(sizeof(*nreg), GFP_NOIO); > > > + } while (!nreg); > > > + } > > > > > > nreg->state = rh->log->type->in_sync(rh->log, region, 1) ? > > > DM_RH_CLEAN : DM_RH_NOSYNC; > > > > erm. > > > > The reason for adding GFP_NOFAIL in the first place was my observation > > that the kernel contained lots of open-coded retry-for-ever loops. > > > > All of these are wrong, bad, buggy and mustfix. So we consolidated the > > wrongbadbuggymustfix concept into the core MM so that miscreants could > > be easily identified and hopefully fixed. > > > > That consolidation would have been unnecessary, then, since all > allocations with order < PAGE_ALLOC_COSTLY_ORDER automatically loop > indefinitely in the page allocator. The difference is that an order-0 !__GFP_NOFAIL allocation attempt can fail due to oom-killing. Unless someone broke that. > struct dm_region allocations would > already do that. > > So this retry loop doesn't actually do anything that the page allocator > already doesn't, with or without __GFP_NOFAIL. The difference here is > that > > - it doesn't depend on the page allocator's implementation, which may > change over time, and > > - it adds documentation so that the subsystems doing these loops can > (hopefully) fix these problems later, although their appear to be > geniune cases where little other options are available. > > > I think that simply undoing that change is a bad idea - it allows the > > wrongbadbuggymustfix code to hide from view. > > > > It removes several branches from the page allocator. None on the fast path. > > The correct way to remove __GFP_NOFAIL is to fix the > > wrongbadbuggymustfix code properly. > > > > If the prerequisite for removing __GFP_NOFAIL is that nobody must ever > loop indefinitely looking for memory or smaller order allocations don't > implicitly retry, then there's little chance it'll ever get removed since > they've existed for years without anybody cleaning them up. The JBD one is hard - I haven't looked at the others. We should fix them. -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html