On Tue, Aug 17, 2010 at 11:21:15AM +0900, Minchan Kim wrote:
> Hi, Mel.
>
> On Mon, Aug 16, 2010 at 6:42 PM, Mel Gorman <mel@xxxxxxxxx> wrote:
> > When allocating a page, the system uses NR_FREE_PAGES counters to determine
> > if watermarks would remain intact after the allocation was made. This
> > check is made without interrupts disabled or the zone lock held and so is
> > race-prone by nature. Unfortunately, when pages are being freed in batch,
> > the counters are updated before the pages are added on the list. During this
> > window, the counters are misleading as the pages do not exist yet. When
> > under significant pressure on systems with large numbers of CPUs, it's
> > possible for processes to make progress even though they should have been
> > stalled. This is particularly problematic if a number of the processes are
> > using GFP_ATOMIC as the min watermark can be accidentally breached and in
> > extreme cases, the system can livelock.
> >
> > This patch updates the counters after the pages have been added to the
> > list. This makes the allocator more cautious with respect to preserving
> > the watermarks and mitigates livelock possibilities.
> >
> > Signed-off-by: Mel Gorman <mel@xxxxxxxxx>
> Reviewed-by: Minchan Kim <minchan.kim@xxxxxxxxx>
>
> Page free path looks good by your patch.
>

Thanks

> Now allocation path decrease NR_FREE_PAGES _after_ it remove pages from buddy.
> It can make that actually we don't have enough pages in buddy but
> pretend to have enough pages.
> It could make same situation with free path which is your concern.
> So I think it can confuse watermark check in extreme case.
>
> So don't we need to consider _allocation_ path with conservative?
>

I considered it and it would be desirable. The downside was that the
paths became more complicated. Take rmqueue_bulk() for example. It could
start by modifying the counters but there then needs to be a recovery
path if all the requested pages were not allocated.

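
As a rough illustration of that recovery path, here is a user-space
sketch of the idea. It is not the kernel's rmqueue_bulk(); the struct
zone_model type, its fields and the rmqueue_bulk_model() function are
made-up names, and locking, per-cpu lists and migrate types are all
ignored. It only shows the extra branch needed once the counter is
updated before the pages are taken.

```c
#include <assert.h>

/* Hypothetical user-space model of a zone; not a kernel structure. */
struct zone_model {
	long nr_free_pages;	/* stands in for the NR_FREE_PAGES counter */
	long pages_on_list;	/* pages really present on the free lists */
};

/*
 * Conservative bulk allocation: account for the pages first, then take
 * them, and hand back the shortfall if the lists ran dry early.
 * Returns the number of pages actually taken.
 */
static long rmqueue_bulk_model(struct zone_model *z, long count)
{
	long taken = 0;

	/* Decrement up front so a racing watermark check sees the
	 * pessimistic value rather than pages that are about to vanish. */
	z->nr_free_pages -= count;

	/* Remove pages one at a time; this loop can stop short. */
	while (taken < count && z->pages_on_list > 0) {
		z->pages_on_list--;
		taken++;
	}

	/* Recovery path: the new branch. Undo the part of the up-front
	 * decrement that did not correspond to a real page. */
	if (taken < count)
		z->nr_free_pages += count - taken;

	return taken;
}
```

With five pages on the list and a request for eight, this returns 5 and
leaves the counter at 0. Between the up-front decrement and the recovery
step the counter briefly reads -3, so the window errs on the cautious
side, at the cost of one more branch in the allocation path.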
It'd be nice to see if these patches on their own were enough to
alleviate the worst of the per-cpu-counter drift before adding new
branches to the allocation path.

Does that make sense?

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab