Hello

On 08/24/2013 12:14 AM, Toshi Kani wrote:
> Hello,
>
> On Fri, 2013-08-23 at 09:04 -0400, Tejun Heo wrote:
>> On Thu, Aug 22, 2013 at 04:17:41PM -0600, Toshi Kani wrote:
>>> I am relatively new to Linux, so I am not a good person to elaborate
>>> this. From my experience on other OS, huge pages helped for the kernel,
>>> but did not necessarily help user applications. It depended on
>>> applications, which were not niche cases. But Linux may be different,
>>> so I asked since you seemed confident. I'd appreciate if you can point
>>> us some data that endorses your statement.
>>
>> We are talking about the kernel linear mapping which is created during
>> early boot, so if it's available and useable there's no reason not to
>> use it. Exceptions would be earlier processors which didn't do 1G
>> mappings or e820 maps with a lot of holes. For CPUs used in NUMA
>> configurations, the former has been history for a bit now. Can't be
>> sure about the latter but it'd be surprising for that to affect large
>> amount of memory in the systems that are of interest here. Ooh, that
>> reminds me that we probably wanna go back to 1G + MTRR mapping under
>> 4G. We're currently creating a lot of mapping holes.
>
> Thanks for the explanation.
>
>>> My worry is that the code is unlikely tested with the special logic when
>>> someone makes code changes to the page tables. Such code can easily be
>>> broken in future.
>>
>> Well, I wouldn't consider flipping the direction of allocation to be
>> particularly difficult to get right especially when compared to
>> bringing in ACPI tables into the mix.
>>
>>> To answer your other question/email, I believe Tang's next step is to
>>> support local page tables. This is why we think pursuing SRAT earlier is
>>> the right direction.
>>
>> Given 1G mappings, is that even a worthwhile effort? I'm getting even
>> more skeptical.
>
> With 1G mappings, I agree that it won't make much difference.
>
> I still think acpi table info should be available earlier, but I do not
> think I can convince you on this. This can be religious debate.
>
> Tang, what do you think? Are you OK to try Tejun's suggestion as well?
>

By TJ's suggestion, do you mean that we let memblock control the
behaviour, that is, that we do early allocations near the kernel image
range before we get the SRAT info?

If so, yes, we have been working in this direction. Doing this involves
two main changes:

1. Change some of memblock's APIs so that it can allocate memory from
   low addresses.

2. Set up the kernel page table bottom-up. Concretely, we first map the
   memory from just after the kernel image to the top, and then we map
   0 - kernel image end.

Do you guys think this is reasonable and acceptable?

--
Thanks.
Zhang Yanfei
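
To make the two changes above concrete, here is a minimal user-space
sketch, not kernel code. Every name in it (struct map_range,
struct region, ALLOC_FAILED, plan_bottom_up_mapping, find_bottom_up) is
hypothetical and is not a memblock or x86 mm API. It assumes the free
regions are sorted by ascending base address and that addresses are small
enough that the alignment arithmetic cannot overflow.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One physical range [start, end) to be direct-mapped. */
struct map_range {
	uint64_t start;
	uint64_t end;
};

/* One free physical region [base, base + size). */
struct region {
	uint64_t base;
	uint64_t size;
};

#define ALLOC_FAILED UINT64_MAX

/*
 * Change 2 (hypothetical helper): compute the order in which ranges
 * would be mapped.  First [kernel_end, mem_end) in 'step'-sized chunks,
 * lowest address first, then [0, kernel_end).  Returns the number of
 * ranges written to 'out' (at most 'max').
 */
size_t plan_bottom_up_mapping(uint64_t kernel_end, uint64_t mem_end,
			      uint64_t step, struct map_range *out,
			      size_t max)
{
	size_t n = 0;
	uint64_t addr = kernel_end;

	assert(step > 0 && kernel_end <= mem_end);

	while (addr < mem_end && n < max) {
		uint64_t next = (mem_end - addr > step) ? addr + step : mem_end;

		out[n].start = addr;
		out[n].end = next;
		n++;
		addr = next;
	}

	/* Finally, the low part below and including the kernel image. */
	if (kernel_end > 0 && n < max) {
		out[n].start = 0;
		out[n].end = kernel_end;
		n++;
	}
	return n;
}

/*
 * Change 1 (hypothetical helper): find the lowest aligned address at or
 * above 'floor' that fits 'size' bytes.  'free_regions' must be sorted
 * by ascending base.  Returns ALLOC_FAILED if nothing fits.
 */
uint64_t find_bottom_up(const struct region *free_regions, size_t count,
			uint64_t floor, uint64_t size, uint64_t align)
{
	size_t i;

	/* 'align' must be a non-zero power of two. */
	assert(size > 0 && align > 0 && (align & (align - 1)) == 0);

	for (i = 0; i < count; i++) {
		uint64_t lo = free_regions[i].base;
		uint64_t hi = lo + free_regions[i].size;
		uint64_t cand;

		if (lo < floor)
			lo = floor;
		cand = (lo + align - 1) & ~(align - 1);
		if (cand < hi && hi - cand >= size)
			return cand;
	}
	return ALLOC_FAILED;
}
```

Passing the kernel image end as 'floor' keeps early allocations (page
tables included) just above the kernel image, which is the behaviour
discussed above for the period before SRAT is parsed.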