On Mon, 27 Apr 2015 15:48:42 +0200
Alexander Graf <agraf@xxxxxxx> wrote:

> On 04/23/2015 02:13 PM, Martin Schwidefsky wrote:
> > On Thu, 23 Apr 2015 14:01:23 +0200
> > Alexander Graf <agraf@xxxxxxx> wrote:
> >
> >> As far as alternative approaches go, I don't have a great idea otoh.
> >> We could have an elf flag indicating that this process needs 4k page
> >> tables to limit the impact to a single process. In fact, could we
> >> maybe still limit the scope to non-global? A personality may work
> >> as well. Or ulimit?
> > I tried the ELF flag approach, does not work. The trouble is that
> > allocate_mm() has to create the page tables with 4K tables if you
> > want to change the page table layout later on. We have learned the
> > hard way that the direction 2K to 4K does not work due to races
> > in the mm.
> >
> > Now there are two major cases: 1) fork + execve and 2) fork only.
> > The ELF flag can be used to reduce from 4K to 2K for 1) but not 2).
> > 2) is required for apps that use lots of forking, e.g. database or
> > web servers. Same goes for the approach with a personality flag or
> > ulimit.
> >
> > We would have to distinguish the two cases for allocate_mm(),
> > if the new mm is allocated for a fork the current mm decides
> > 2K vs. 4K. If the new mm is allocated by binfmt_elf, then start
> > with 4K and do the downgrade after the ELF flag has been evaluated.
>
> Well, you could also make it a personality flag for example, no? Then
> every new process below a certain one always gets 4k page tables until
> they drop the personality, at which point each child would only get 2k
> page tables again.
>
> I'm mostly concerned that people will end up mixing VMs and other
> workloads on the same LPAR, so I don't think there's a one-shoe-fits-all
> solution.

If I add an argument to mm_init() to indicate if this context
is for fork() or execve() then the ELF header flag approach works.

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.
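
For illustration only, the fork() vs. execve() distinction discussed above can be modelled in a few lines of user-space C. This is a rough sketch, not kernel code and not the actual patch: struct mm_model, mm_init_model(), apply_elf_flag_model() and both enums are hypothetical names invented for this example. It assumes only what the thread states: a fork inherits the 2K/4K choice from the current mm, an execve starts with 4K, and only the 4K to 2K downgrade is allowed once the ELF flag has been evaluated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
enum pgtable_size { PGTABLE_2K = 2048, PGTABLE_4K = 4096 };
enum mm_origin { MM_FOR_FORK, MM_FOR_EXECVE };

struct mm_model {
    enum pgtable_size pgtable;
};

/*
 * Model of an mm_init() that takes an extra argument describing why the
 * new mm is being created. For fork the current (parent) mm decides
 * 2K vs. 4K; for execve the new mm always starts with 4K tables.
 */
static void mm_init_model(struct mm_model *mm, const struct mm_model *parent,
                          enum mm_origin origin)
{
    if (origin == MM_FOR_FORK && parent != NULL)
        mm->pgtable = parent->pgtable;
    else
        mm->pgtable = PGTABLE_4K;
}

/*
 * Model of the step after the ELF header has been evaluated. Only the
 * 4K -> 2K direction is modelled, because the thread reports that
 * 2K -> 4K does not work due to races in the mm.
 */
static void apply_elf_flag_model(struct mm_model *mm, bool elf_needs_4k)
{
    assert(mm->pgtable == PGTABLE_4K);
    if (!elf_needs_4k)
        mm->pgtable = PGTABLE_2K;
}
```

In this model a fork-only workload whose parent already runs with 4K tables keeps them in every child, while a fork + execve of an unflagged binary ends up on 2K tables again.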