On Tue, 2011-04-19 at 13:35 -0500, Christoph Lameter wrote:
> On Tue, 19 Apr 2011, James Bottomley wrote:
>
> > > > }
> > >
> > > How in the world did you get a zone setup in node 1 with a !NUMA config?
> >
> > I told you ... I forced an allocation into the first discontiguous
> > region. That will return 1 for page_to_nid().
>
> How? The kernel has no concept of a node 1 without CONFIG_NUMA and so you
> cannot tell the page allocator to allocate from node 1.

Yes, it does, as I explained in the email.

> zone_to_nid is used as a fallback mechanism for page_to_nid() and as shown
> will always return 0 for !NUMA configs.
>
> page_to_nid(x) == zone_to_nid(page_zone(x)) must hold true. It is not
> here.
>
> > > The problem seems to be that the kernel seems to allow a
> > > definition of a page_to_nid() function that returns non zero in the !NUMA
> > > case.
> >
> > This is called reality, yes.
>
> There you have the bug. Fix that and things will work fine.

Why don't you file the bug against reality? I'm not sure I have enough
credibility ...

> > right, that's what I told you: slub is broken because it's making a
> > wrong assumption. Look in asm-generic/memory_model.h it shows how the
> > page_to_nid() is used in finding the pfn array. DISCONTIGMEM uses some
> > of the numa properties (including assigning zones to the discontiguous
> > regions).
>
> Bitrotted code?

Don't be silly: alpha, ia64, m32r, m68k, mips, parisc, tile and even x86
all use the discontigmem memory model in some configurations.

> If it uses numa properties then it must use a zone field
> in struct zone. So DISCONTIGMEM seems to require CONFIG_NUMA.

No ... you're giving me back your assumptions. They're not based on
what the kernel does. CONFIG_NUMA may or may not be defined with
CONFIG_DISCONTIGMEM. Of all the above, only x86 always had NUMA with
DISCONTIGMEM.
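The point that discontiguous regions carry their own ids even without
CONFIG_NUMA can be sketched in user space. This is a simplified model under
stated assumptions, not kernel code: every model_* name, the two-region
layout and the pfn ranges are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
#define MODEL_NR_REGIONS   2
#define MODEL_REGION_PAGES 4

struct model_page {
    int nid;    /* region id kept per page, loosely like the node bits in page->flags */
};

struct model_region {
    unsigned long start_pfn;                        /* first pfn covered by the region */
    struct model_page mem_map[MODEL_REGION_PAGES];  /* per-region page array */
};

/* Two regions with a hole between them: pfns 0..3 and pfns 16..19. */
static struct model_region model_regions[MODEL_NR_REGIONS] = {
    { 0,  { {0}, {0}, {0}, {0} } },
    { 16, { {1}, {1}, {1}, {1} } },
};

/* Find the region holding a pfn; returns -1 for a pfn that falls in a hole. */
static int model_pfn_to_nid(unsigned long pfn)
{
    for (int nid = 0; nid < MODEL_NR_REGIONS; nid++) {
        unsigned long start = model_regions[nid].start_pfn;

        if (pfn >= start && pfn < start + MODEL_REGION_PAGES)
            return nid;
    }
    return -1;
}

/* pfn -> page: pick the region first, then index into that region's array. */
static struct model_page *model_pfn_to_page(unsigned long pfn)
{
    int nid = model_pfn_to_nid(pfn);

    if (nid < 0)
        return NULL;
    return &model_regions[nid].mem_map[pfn - model_regions[nid].start_pfn];
}

/* The id read back from a page is the region id, whether or not NUMA exists. */
static int model_page_to_nid(const struct model_page *page)
{
    assert(page != NULL);
    return page->nid;
}
```

In this model a page obtained for pfn 17 reports id 1 although nothing in it
is NUMA-aware, which is the situation described above for an allocation
forced into the first discontiguous region.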

> > > If you think that is broken then we have brokenness all over the kernel
> > > whenever we determine the node from a page and use that to do a lookup.
> >
> > Not really. The rest of the kernel uses the proper macros. in
> > DISCONTIGMEM but !NUMA configs, the numa macros expand correctly.
> > You've cut across that with all the CONFIG_NUMA checks in slub.
>
> What are "the proper macros"? AFAICT page_to_nid() is the proper way to
> access the node of a page. If page_to_nid() returns 1 then you have a zone
> that the kernel knows of as being in node 0 having a page on a different
> node.

Well it depends what you want. If you only want the actual NUMA node,
then pfn_to_nid() probably isn't what you want, because in a
DISCONTIGMEM model, there may be multiple nids per actual numa node.

> We can likely force page_to_nid to ignore the node information that have
> been erroneously placed there but this looks like something deeper is
> wrong here. The node field in struct page is not only used for the Linux
> support of a NUMA node but also for blocks of memory. Those should be
> separate things.

Look, it's not wrong, it's by design. The assumption that non-numa
systems don't use nodes is the wrong one.

> ---
>  include/linux/mm.h | 4 ++++
>  1 file changed, 4 insertions(+)
>
> Index: linux-2.6/include/linux/mm.h
> ===================================================================
> --- linux-2.6.orig/include/linux/mm.h 2011-04-19 13:20:20.092521248 -0500
> +++ linux-2.6/include/linux/mm.h 2011-04-19 13:21:05.962521196 -0500
> @@ -665,6 +665,7 @@ static inline int zone_to_nid(struct zon
>  #endif
>  }
>
> +#ifdef CONFIG_NUMA
>  #ifdef NODE_NOT_IN_PAGE_FLAGS
>  extern int page_to_nid(struct page *page);
>  #else
> @@ -673,6 +674,9 @@ static inline int page_to_nid(struct pag
>  return (page->flags >> NODES_PGSHIFT) & NODES_MASK;
>  }
>  #endif
> +#else
> +#define page_to_nid(x) 0
> +#endif

Don't be silly ...
that breaks asm-generic/memory_model.h

James
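Why a page_to_nid() that always returns 0 would upset a page-to-pfn
conversion that selects its region by id can be illustrated with a toy model.
It is a sketch under assumptions, not the code in asm-generic/memory_model.h:
all toy_* names, the single backing array and the pfn values are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model; the toy_* names are illustrative only. */
struct toy_page {
    int nid;                        /* id of the region the page belongs to */
};

struct toy_region {
    unsigned long start_pfn;        /* first pfn covered by the region */
    const struct toy_page *mem_map; /* first page descriptor of the region */
};

/*
 * One backing array holds both regions so that pointer subtraction stays
 * well defined: entries 0..3 are region 0, entries 4..7 are region 1.
 */
static struct toy_page toy_backing[8] = {
    {0}, {0}, {0}, {0}, {1}, {1}, {1}, {1},
};

/* Region 0 covers pfns 0..3, region 1 covers pfns 16..19. */
static const struct toy_region toy_regions[2] = {
    { 0,  &toy_backing[0] },
    { 16, &toy_backing[4] },
};

/* Reads the id stored in the page. */
static int toy_page_to_nid(const struct toy_page *page)
{
    return page->nid;
}

/* Models "#define page_to_nid(x) 0": the stored id is ignored. */
static int toy_page_to_nid_zero(const struct toy_page *page)
{
    (void)page;
    return 0;
}

typedef int (*toy_nid_fn)(const struct toy_page *page);

/* page -> pfn: choose the region via nid_of(), then add the page's offset. */
static long toy_page_to_pfn(const struct toy_page *page, toy_nid_fn nid_of)
{
    int nid = nid_of(page);

    assert(nid >= 0 && nid < 2);
    const struct toy_region *region = &toy_regions[nid];
    ptrdiff_t offset = page - region->mem_map;

    return (long)offset + (long)region->start_pfn;
}
```

With the stored id, the page for pfn 17 (toy_backing[5]) converts back to 17.
With the id forced to 0, the same page is measured against region 0 and comes
back as pfn 5, so only pages that really live in region 0 survive the round
trip.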