On Thu, 2013-10-10 at 01:00 +0000, Tejun Heo wrote:
> Hello, Toshi.
>
> On Wed, Oct 09, 2013 at 05:58:55PM -0600, Toshi Kani wrote:
> > Well, there was a plan before, which considered to enhance it to a
> > memory device granularity at step 3. But we had a major replan at step
> > 1 per your suggestion.
> >
> > https://lkml.org/lkml/2013/6/19/73
>
> Where?
>
> "3. Improve memory hotplug to support local device pagetable."
>
> How can the above possibly be considered as a plan for finer
> granularity? Forget about the "how" part. The stated goal doesn't
> even mention finer granularity.

The word "device" above refers to memory-device-level granularity.

> Are firmware writers gonna be
> required to split SRAT entries into multiple sub-nodes to support it?

Yes, and that's part of the ACPI spec. It is not something the OS
requests. If a memory range has a different attribute, firmware has to
put it in a separate entry.

> Is segregating zones further for this even a good idea? Adding more
> NUMA nodes has its own overhead and the mm code isn't written
> expecting it to be repurposed for segmenting the same NUMA node for
> hotplug underneath it.

I agree. But my point is that this is an issue today with the current
kernel implementation. It is not introduced by using SRAT.

> Maybe zoning is a viable approach. Maybe it is not. I don't know,
> but you guys don't seem to be too interested in actual long term
> planning while pushing for something invasive which may or may not be
> viable in the longer term, which can often lead to silly situations.
> It isn't even clear whether SRAT is the right interface for this. If
> it's gonna require firmware writers' cooperation anyway, why not
> provide the information as extended part of e820? It doesn't seem to
> have much to do with NUMA or zones. The only information the kernel
> needs to know is whether certain memory areas should only be used for
> page cache.

SRAT and the _EJ0 method are the only interfaces that define
ejectability in the standard spec. Are you suggesting that we change
the e820 spec, or that we not comply with the spec? I do not think
either approach works.

> At this point, at least to me, it doesn't seem reasonably clear how
> this is gonna develop and the whole thing feels like a kludge, which
> can be fine too, but seriously if you guys wanna push for an invasive
> approach, it should really be backed by longer term plan, vision,
> justification and the ability to make the necessary changes in the
> various involved layers. Maybe I'm being too pessimistic but I feel
> that there are a lot missing in most of those areas, which makes it
> quite risky to commit to invasive changes.
>
> If the zone based kludgy approach is something meaningfully useful,
> I'd suggest sticking to it at least for now. Some of it would be
> useful anyway and if it doesn't fan out the added maintenance overhead
> is fairly low.

I think memory hotplug was originally implemented on ia64 with node
granularity. I share your concerns, but that was done a long time ago.
It's too late to complain about the past. This SRAT work is not
introducing that restriction.

Thanks,
-Toshi