On 04/02/2015 12:30 PM, Christoph Hellwig wrote: > On Wed, Apr 01, 2015 at 05:25:22PM +0300, Boaz Harrosh wrote: >> pfn = PFN_DOWN(ei->addr + ei->size); >> >> - switch (ei->type) { >> - case E820_RAM: >> - case E820_PRAM: >> - case E820_RESERVED_KERN: >> - break; >> - default: >> + if (ei->type != E820_RAM && ei->type != E820_RESERVED_KERN) >> register_nosave_region(PFN_UP(ei->addr), pfn); >> - } > > I guess this makes sense - if the content is persistent already we don't need > to save it. > Yes >> - if (e820.map[i].type != E820_RESERVED || res->start < (1ULL<<20)) { >> - if (e820.map[i].type != E820_PRAM) >> - res->flags |= IORESOURCE_BUSY; >> + if (((e820.map[i].type != E820_RESERVED) && >> + (e820.map[i].type != E820_PRAM)) || >> + res->start < (1ULL<<20)) { > > So now we also trigger for PRAM regions under 1ULL<<20, was that the > intentional change? Honestly I don't really understand this 1ULL<<20 > magic here even for the existing case. Guess this is magic from the > old ISA PC days? > OK so by now I have had a chance to test all cases and figure out what happened. I was running on your V1 and then V2. And had huge problems with NUMA. The thing that actually fixed everything and brought the system back to life, was already in your V3. It was the remove of reserve_pmem() and the call to memblock_reserve() and init_memory_mapping() from within. It was making a mess. So your V3 was already running nice. But I already fixed everything on my side by the time you sent V3, and what I sent here is the diff from what I had and your V3. But these are all good fixes still. (Though not fatal as V2 was) 3 small fixes here: * Adding back the memmap=! now that everything works perfectly as expected. * The one above that fixes register_nosave_region and ... >> + res->flags |= IORESOURCE_BUSY; > > Guess this is the real change, and I'd love to understand why this > makes a difference for you. IORESOURCE_BUSY is checked almost never, > and is intented to mean it's a driver mapping. This here is a very minor thing. I have lots of experience with this code and with the internals of insert_resource() from my old e820.c fixes (Which are BTW still relevant but no longer needed for any current platform) So what happens here is just the adding of the IORESOURCE_BUSY flag. Actually all these resources are already in the resource list and this code is just changing the flag. So if you are not changing the flag there is just no point in calling insert_resource(). It will change nothing. >From what I understand from the history of this file and from prints that I put in this file and at the resource manager. The logic is that it wants to make all E820_XXX busy so to not let any driver try and claim them. And Also the code does not want to allow any kind of e820 resource below the 1M be allowed for drivers, reserved or otherwise. Any E820_RESERVED regions it allows for driver use by not setting IORESOURCE_BUSY. Your code and mine are effectively the same only that I optimize out the call to insert_resource() since the flags were not changed. Testing status: We had some birth problems with the new system and the APIs that changed so the testing rig broke half through the night. But we have wrinkled out the last problems and the machines are pumping away full steam, NUMA and everything. So far it looks good. I will very soon now, Send you a tested-by: That confirms these patches are just as good as what we had until now out-of-tree. (I'm running with my page-struct patches on top of your pmem driver. Will publish trees soon) Thank you for your patience Boaz -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html