On Fri, Mar 26, 2010 at 12:05:06PM -0700, David Rientjes wrote:
> On Fri, 26 Mar 2010, Mel Gorman wrote:
>
> > > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > > --- a/mm/page_alloc.c
> > > +++ b/mm/page_alloc.c
> > > @@ -2582,7 +2582,7 @@ static int default_zonelist_order(void)
> > >      * ZONE_DMA and ZONE_DMA32 can be very small area in the sytem.
> > >      * If they are really small and used heavily, the system can fall
> > >      * into OOM very easily.
> > > -    * This function detect ZONE_DMA/DMA32 size and confgigures zone order.
> > > +    * This function detect ZONE_DMA/DMA32 size and configures zone order.
> > >      */
> >
> > Spurious change here but it's not very important.
> >
> > >     /* Is there ZONE_NORMAL ? (ex. ppc has only DMA zone..) */
> > >     low_kmem_size = 0;
> > > @@ -2594,6 +2594,15 @@ static int default_zonelist_order(void)
> > >                 if (zone_type < ZONE_NORMAL)
> > >                     low_kmem_size += z->present_pages;
> > >                 total_size += z->present_pages;
> > > +           } else if (zone_type == ZONE_NORMAL) {
> > > +               /*
> >
> > What if it was ZONE_DMA32?
> >
>
> This is part of a zone iteration for each node, so if the node consists of
> only ZONE_DMA then it wouldn't have a populated ZONE_NORMAL either and
> will return ZONELIST_ORDER_NODE on the next iteration.
>

Yep. Made sense when I wrote out an example.

> > > +                * If any node has only lowmem, then node order
> > > +                * is preferred to allow kernel allocations
> > > +                * locally; otherwise, they can easily infringe
> > > +                * on other nodes when there is an abundance of
> > > +                * lowmem available to allocate from.
> > > +                */
> > > +               return ZONELIST_ORDER_NODE;
> >
> > It might be clearer if it was done as a similar check later
> >
> >     if (low_kmem_size &&
> >         total_size > average_size && /* ignore small node */
> >         low_kmem_size > total_size * 70/100)
> >             return ZONELIST_ORDER_NODE;
> >
> > This is saying if low memory is > 70% of total, then use nodes. To take
> > yours into account, it'd look something like;
> >
> >     if (low_kmem_size && total_size > average_size) {
> >         if (low_kmem_size == total_size)
> >             return ZONELIST_ORDER_ZONE;
> >
> >         if (low_kmem_size > total_size * 70/100)
> >             return ZONELIST_ORDER_NODE;
> >     }
>
> There's no guarantee that we'd ever detect the node consisting of solely
> lowmem here since it may be asymmetrically smaller than the average node
> size.
>

True. I wasn't sure if it was intentional or not to take even small nodes
into account for this ordering. If it's intentional, I see no problem with
the patch. It seems like a reasonable default decision to me.

Acked-by: Mel Gorman <mel@xxxxxxxxx>

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab
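
To make the point about small nodes concrete, here is a minimal user-space sketch of the two checks discussed in the thread. It is not kernel code: the toy_* names, the struct layout, and passing average_size in as a parameter are all assumptions made for illustration, and any other checks the real default_zonelist_order() performs are left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's zone types and zonelist orders. */
enum toy_zone { TOY_ZONE_DMA, TOY_ZONE_DMA32, TOY_ZONE_NORMAL, TOY_NR_ZONES };
enum toy_order { TOY_ORDER_NODE, TOY_ORDER_ZONE };

/* One NUMA node: pages present in each zone, 0 meaning "not populated". */
struct toy_node {
    unsigned long present_pages[TOY_NR_ZONES];
};

/* The patch's rule: a node with no populated ZONE_NORMAL forces node order. */
static bool any_node_lacks_normal(const struct toy_node *nodes, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (nodes[i].present_pages[TOY_ZONE_NORMAL] == 0)
            return true;
    return false;
}

/*
 * The later per-node check quoted in the thread: a node counts only if it
 * is larger than average_size and more than 70% of it is lowmem.
 * average_size is supplied by the caller (an assumption of this sketch).
 */
static bool any_big_node_mostly_lowmem(const struct toy_node *nodes, size_t n,
                                       unsigned long average_size)
{
    for (size_t i = 0; i < n; i++) {
        unsigned long low = 0, total = 0;

        for (int z = 0; z < TOY_NR_ZONES; z++) {
            if (z < TOY_ZONE_NORMAL)
                low += nodes[i].present_pages[z];
            total += nodes[i].present_pages[z];
        }
        if (low && total > average_size && low > total * 70 / 100)
            return true;
    }
    return false;
}

/* Simplified decision: early return from the patch, then the 70% check. */
static enum toy_order toy_default_order(const struct toy_node *nodes, size_t n,
                                        unsigned long average_size)
{
    if (any_node_lacks_normal(nodes, n))
        return TOY_ORDER_NODE;
    if (any_big_node_mostly_lowmem(nodes, n, average_size))
        return TOY_ORDER_NODE;
    return TOY_ORDER_ZONE;
}
```

With made-up numbers, take one large node holding 4000/500000/1500000 pages in DMA/DMA32/NORMAL, one small node holding only 100000 pages of DMA32, and an average_size of 1052000. The 70% check alone finds nothing, because the lowmem-only node is below the average size and is skipped. The early return still selects node order, which is the case David describes.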