Hi Joerg,

On Mon, Jul 19, 2021 at 02:34:31PM +0200, Joerg Roedel wrote:
> On Fri, Jul 16, 2021 at 02:09:58AM -0400, Peilin Ye wrote:
> > This information is out-of-date, and it took me quite some time of
> > ftrace'ing before I figured it out... I think it would be beneficial to
> > update, or at least remove it.
> >
> > As a proof that I understand what I am talking about, on my x86_64 box:
> >
> > 1. I allocated a vmalloc() area containing linear address `addr`;
> > 2. I manually pagewalked `addr` in different page tables, including
> >    `init_mm.pgd`;
> > 3. The corresponding PGD entries for `addr` in different page tables,
> >    they all immediately pointed at the same PUD table (my box uses
> >    4-level paging), at the same physical address;
> > 4. No "lazy synchronization" via page fault handling happened at all,
> >    since it is the same PUD table pre-allocated by
> >    preallocate_vmalloc_pages() during boot time.
>
> Yes, this is the story for x86-64, because all PUD/P4D pages for the vmalloc
> area are pre-allocated at boot. So no faulting or synchronization needs
> to happen.
>
> On x86-32 this is a bit different. Pre-allocation of PMD/PTE pages is
> not an option there (even less when 4MB large-pages with 2-level paging
> come into the picture).
>
> So what happens there is that vmalloc related changes to the init_mm.pgd
> are synchronized to all page-tables in the system. But this
> synchronization is subject to race conditions in a way that another CPU
> might vmalloc an area below a PMD which is not fully synchronized yet.
>
> When this happens there is a fault, which is handled as a vmalloc()
> fault on x86-32 just as before. So vmalloc faults still exist on 32-bit,
> they are just less likely as they used to be.

Thanks a lot for the information! I will improve my commit message and
send a v2 soon.
I think for this patch, removing that out-of-date statement is sufficient,
since mm.rst is x86-64-specific, but maybe we should document this
behavior for x86-32 somewhere as well...

Thank you,
Peilin Ye