On Tue, 2013-06-18 at 19:05 +0200, Vasilis Liaskovitis wrote:
> Hi,
>
> On Thu, Jun 13, 2013 at 09:03:52PM +0800, Tang Chen wrote:
> > The following patch-set from Yinghai allocates pagetables to local nodes.
> > v1: https://lkml.org/lkml/2013/3/7/642
> > v2: https://lkml.org/lkml/2013/3/10/47
> > v3: https://lkml.org/lkml/2013/4/4/639
> > v4: https://lkml.org/lkml/2013/4/11/829
> >
> > Since pagetable pages are used by the kernel, they cannot be offlined.
> > As a result, they cannot be hot-removed.
> >
> > This patch fixes this problem with the following solution:
> >
> > 1. Introduce a new bootmem type LOCAL_NODE_DATAL, and register local
> >    pagetable pages as LOCAL_NODE_DATAL by setting page->lru.next to
> >    LOCAL_NODE_DATAL, just like we register SECTION_INFO pages.
> >
> > 2. Skip LOCAL_NODE_DATAL pages in offline/online procedures. When the
> >    whole memory block they reside in is offlined, the kernel can
> >    still access the pagetables.
> >    (This changes the semantics of offline/online a little bit.)
>
> This could be a design problem of part3: if we allow local pagetable memory
> to not be offlined but allow the offlining to return successfully, then
> hot-remove is going to succeed. But the direct mapped pagetable pages are still
> mapped in the kernel. The hot-removed memblocks will suddenly disappear (think
> physical DIMMs getting disabled in real hardware, or in a VM case the
> corresponding guest memory getting freed from the emulator e.g. qemu/kvm). The
> system can crash as a result.
>
> I think these local pagetables do need to be unmapped from kernel, offlined and
> removed somehow - otherwise hot-remove should fail. Could they be migrated
> alternatively e.g. to node 0 memory? But IIUC direct mapped pages cannot be
> migrated, correct?
>
> What is the original reason for local node pagetable allocation with regards
> to memory hotplug? I assume we want to have hotplugged nodes use only their local
> memory, so that there are no inter-node memory dependencies for hot-add/remove.
> Are there other reasons that I am missing?

I second Vasilis. The part1/2/3 series could be much simpler and less
risky if we focus on the SRAT changes first, and make the local node
pagetable changes a separate item. Is there a particular reason why they
have to be done at the same time?

Thanks,
-Toshi