> On Aug 31, 2023, at 06:47, Mike Kravetz <mike.kravetz@xxxxxxxxxx> wrote:
>
> On 08/30/23 15:26, Muchun Song wrote:
>>
>>
>> On 2023/8/26 03:04, Mike Kravetz wrote:
>>> At the beginning of hugetlb_vmemmap_optimize, optimistically set
>>> the HPageVmemmapOptimized flag in the head page.  Clear the flag
>>> if the operation fails.
>>>
>>> No change in behavior.  However, this will become important in
>>> subsequent patches where we batch delay TLB flushing.  We need to
>>> make sure the content in the old and new vmemmap pages are the same.
>>
>> Sorry, I didn't get the point here. Could you elaborate it?
>>
>
> Sorry, this really could use a better explanation.
>
>>>
>>> Signed-off-by: Mike Kravetz <mike.kravetz@xxxxxxxxxx>
>>> ---
>>>   mm/hugetlb_vmemmap.c | 8 +++++---
>>>   1 file changed, 5 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
>>> index e390170c0887..500a118915ff 100644
>>> --- a/mm/hugetlb_vmemmap.c
>>> +++ b/mm/hugetlb_vmemmap.c
>>> @@ -566,7 +566,9 @@ static void __hugetlb_vmemmap_optimize(const struct hstate *h,
>>>   	if (!vmemmap_should_optimize(h, head))
>>>   		return;
>>>
>>> +	/* Optimistically assume success */
>>>   	static_branch_inc(&hugetlb_optimize_vmemmap_key);
>>> +	SetHPageVmemmapOptimized(head);
>>>
>>>   	vmemmap_end	= vmemmap_start + hugetlb_vmemmap_size(h);
>>>   	vmemmap_reuse	= vmemmap_start;
>>> @@ -577,10 +579,10 @@ static void __hugetlb_vmemmap_optimize(const struct hstate *h,
>>>   	 * to the page which @vmemmap_reuse is mapped to, then free the pages
>>>   	 * which the range [@vmemmap_start, @vmemmap_end] is mapped to.
>>>   	 */
>>> -	if (vmemmap_remap_free(vmemmap_start, vmemmap_end, vmemmap_reuse, bulk_pages))
>>> +	if (vmemmap_remap_free(vmemmap_start, vmemmap_end, vmemmap_reuse, bulk_pages)) {
>>>   		static_branch_dec(&hugetlb_optimize_vmemmap_key);
>>> -	else
>>> -		SetHPageVmemmapOptimized(head);
>>> +		ClearHPageVmemmapOptimized(head);
>>> +	}
>
> Consider the case where we have successfully remapped vmemmap AND
> - we have replaced the page table page (pte page) containing the struct
>   page of the hugetlb head page.  Joao's commit 11aad2631bf7
>   'mm/hugetlb_vmemmap: remap head page to newly allocated page'.
> - we have NOT flushed the TLB after remapping due to batching the
>   operations before flush.
>
> In this case, it is possible that the old head page is still in the TLB
> and caches, and SetHPageVmemmapOptimized(head) will actually set the flag
> in the old pte page.  We then have an optimized hugetlb page without the
> HPageVmemmapOptimized flag set.  When developing this series, we
> experienced various BUGs as a result of this situation.

Now I get it. Thanks for your elaboration.

>
> In the case of an error during optimization, we do a TLB flush, so if
> we need to clear the flag we will write to the correct pte page.

Right.

>
> Hope that makes sense.
>
> I will add an explanation like this to the commit message, and perhaps put
> this closer to/or squash with the patch that batches operations before
> flushing TLB.

Yes. But I'd also like a big comment here explaining what is going on,
instead of just "Optimistically assume success". That one-liner really
makes it look like an optimization rather than a mandatory premise.

Thanks.

> --
> Mike Kravetz
>
>>>   }
>>>
>>>   /**