On Fri, 11 Feb 2011, Andrea Arcangeli wrote:
> On Thu, Feb 10, 2011 at 11:02:50PM -0800, Hugh Dickins wrote:
> > There is a separate little issue here, Andrea.
> >
> > Although we went to some trouble for bad_page() to take the page out
> > of circulation yet let the system continue, your VM_BUG_ON(!PageBuddy)
> > inside __ClearPageBuddy(page), from two callsites in bad_page(), is
> > turning it into a fatal error when CONFIG_DEBUG_VM.
>
> I see what you mean. Of course it is only a problem after bad_page
> already triggered.... but then it triggers a BUG_ON instead of only a
> bad_page.
>
> > You could argue that only MM developers switch CONFIG_DEBUG_VM=y, and they
> > would like bad_page() to be fatal; maybe, but if so we should do that
> > as an intentional patch, rather than as an unexpected side-effect ;)
>
> Fedora kernels are built with CONFIG_DEBUG_VM, all my kernels run
> with CONFIG_DEBUG_VM too, so we want it to be as "production" as
> possible, and we don't want DEBUG_VM to decrease any reliability (only
> to increase it of course).

Oh, I hadn't realized Fedora use it.  I wonder if that's wise, I thought
Nick introduced it partly for the more expensive checks, and there might
be one or two of those around - those bad_range()s in page_alloc.c?

>
> > I noticed this a few days ago, but hadn't quite decided whether just to
> > remove the VM_BUG_ON, or move it to __ClearPageBuddy's third callsite,
> > or... doesn't matter much.
> >
> > I do also wonder if PageBuddy would better be _mapcount -something else:
> > if we've got a miscounted page (itself unlikely of course), there's a
> > chance that its _mapcount will be further decremented after it has been
> > freed: whereupon it will go from -1 to -2, PageBuddy at present.  The
> > special avoidance of PageBuddy being that it can pull a whole block of
> > pages into misuse if it's mistaken.
>
> Agreed. What about the below?
>
> =====
> Subject: mm: PageBuddy cleanups
>
> From: Andrea Arcangeli <aarcange@xxxxxxxxxx>
>
> bad_page could VM_BUG_ON(!PageBuddy(page)) inside __ClearPageBuddy().
> I prefer to keep the VM_BUG_ON for safety and to add an if to solve it.

Too much iffery: I ended up preferring it in rmv_page_order() myself.

>
> Change the _mapcount value indicating PageBuddy from -2 to -1024 for more
> robustness against page_mapcount() underflows.

But the patch actually says -1024*1024: either would do.

>
> Signed-off-by: Andrea Arcangeli <aarcange@xxxxxxxxxx>
> Reported-by: Hugh Dickins <hughd@xxxxxxxxxx>
> ---
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index f6385fc..fa16ba0 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -402,16 +402,22 @@ static inline void init_page_count(struct page *page)
>  /*
>   * PageBuddy() indicate that the page is free and in the buddy system
>   * (see mm/page_alloc.c).
> + *
> + * PAGE_BUDDY_MAPCOUNT_VALUE must be <= -2 but better not too close to
> + * -2 so that an underflow of the page_mapcount() won't be mistaken
> + * for a genuine PAGE_BUDDY_MAPCOUNT_VALUE.

Yes, good to comment that, thanks.

>   */
> +#define PAGE_BUDDY_MAPCOUNT_VALUE (-1024*1024)
> +
>  static inline int PageBuddy(struct page *page)
>  {
> -	return atomic_read(&page->_mapcount) == -2;
> +	return atomic_read(&page->_mapcount) == PAGE_BUDDY_MAPCOUNT_VALUE;
>  }
>
>  static inline void __SetPageBuddy(struct page *page)
>  {
>  	VM_BUG_ON(atomic_read(&page->_mapcount) != -1);
> -	atomic_set(&page->_mapcount, -2);
> +	atomic_set(&page->_mapcount, PAGE_BUDDY_MAPCOUNT_VALUE);
>  }
>
>  static inline void __ClearPageBuddy(struct page *page)

Yes, that's fine, 0xfff00000 looks unlikely enough (and my imagination
for "deadbeef"-like magic is too drowsy today).

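To make the underflow argument concrete, here is a rough userspace model,
not kernel code.  struct page_model, is_buddy(),
looks_buddy_after_underflow() and as_u32() are hypothetical names made up
for illustration, and a plain int stands in for the atomic _mapcount.  It
only assumes what the thread says: a freed page sits at -1, and a stray
decrement moves it one step lower.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct page; only the field we care about. */
struct page_model {
	int mapcount;	/* models page->_mapcount, without atomics */
};

/* The two candidate marker values discussed in the thread. */
enum {
	OLD_BUDDY_VALUE = -2,
	NEW_BUDDY_VALUE = -1024 * 1024
};

/* Models PageBuddy(): the page is "buddy" iff mapcount equals the marker. */
static int is_buddy(const struct page_model *p, int buddy_value)
{
	return p->mapcount == buddy_value;
}

/*
 * Start from a freed page (mapcount == -1), apply `extra` stray
 * decrements as a miscounted page might suffer, and report whether the
 * result would be mistaken for a buddy page.
 */
static int looks_buddy_after_underflow(int buddy_value, int extra)
{
	struct page_model page = { .mapcount = -1 };

	assert(extra >= 0);
	page.mapcount -= extra;
	return is_buddy(&page, buddy_value);
}

/* The marker viewed as a 32-bit pattern, as it would show up in a dump. */
static uint32_t as_u32(int value)
{
	return (uint32_t)value;
}
```

With the old marker a single stray decrement is enough to be mistaken for
PageBuddy; with -1024*1024 it would take over a million of them, and the
value reads as 0xfff00000 when viewed as an unsigned 32-bit quantity.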
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index a873e61..8aac134 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -286,7 +286,9 @@ static void bad_page(struct page *page)
>
>  	/* Don't complain about poisoned pages */
>  	if (PageHWPoison(page)) {
> -		__ClearPageBuddy(page);
> +		/* __ClearPageBuddy VM_BUG_ON(!PageBuddy(page)) */
> +		if (PageBuddy(page))
> +			__ClearPageBuddy(page);
>  		return;
>  	}
>
> @@ -317,7 +319,8 @@ static void bad_page(struct page *page)
>  	dump_stack();
>  out:
>  	/* Leave bad fields for debug, except PageBuddy could make trouble */
> -	__ClearPageBuddy(page);
> +	if (PageBuddy(page)) /* __ClearPageBuddy VM_BUG_ON(!PageBuddy(page)) */
> +		__ClearPageBuddy(page);
>  	add_taint(TAINT_BAD_PAGE);
>  }
>

Okay I suppose: it seems rather laboured to me, I think I'd have just
moved the VM_BUG_ON into rmv_page_order() if I'd done the patch; but
since I was too lazy to do it, I'd better be grateful for yours!

Hugh
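
For comparison, the alternative mentioned above (moving the check into
rmv_page_order() rather than guarding each bad_page() callsite) might
look roughly like this.  It is a userspace sketch only: struct
page_model and the *_model functions are hypothetical, assert() stands
in for VM_BUG_ON(), and it assumes that clearing the buddy state simply
resets the count to -1 and that rmv_page_order() also zeroes the page's
private field.

```c
#include <assert.h>

#define PAGE_BUDDY_MAPCOUNT_VALUE (-1024 * 1024)

/* Hypothetical stand-in for struct page. */
struct page_model {
	int mapcount;		/* models page->_mapcount */
	unsigned long private;	/* models the stored buddy order */
};

/* Models PageBuddy(). */
static int page_buddy_model(const struct page_model *p)
{
	return p->mapcount == PAGE_BUDDY_MAPCOUNT_VALUE;
}

/*
 * Models __ClearPageBuddy() with the assertion removed, so it is safe to
 * call on a page that is not (or no longer) in the buddy state.
 */
static void clear_page_buddy_model(struct page_model *p)
{
	p->mapcount = -1;
}

/*
 * Strict caller: the allocator path expects a genuine buddy page, so the
 * assertion lives here instead of in the clear helper.
 */
static void rmv_page_order_model(struct page_model *p)
{
	assert(page_buddy_model(p));	/* stands in for VM_BUG_ON(!PageBuddy(page)) */
	clear_page_buddy_model(p);
	p->private = 0;
}

/*
 * Tolerant caller: bad_page() may see any state, and can now clear the
 * buddy marker unconditionally without tripping an assertion.
 */
static void bad_page_model(struct page_model *p)
{
	clear_page_buddy_model(p);
}
```

The trade-off is that the clear helper no longer checks itself, so any
future strict caller has to carry its own assertion; in exchange,
bad_page() needs no "if (PageBuddy(page))" at either callsite.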