On 7/30/20 4:26 PM, Peter Xu wrote: > Hi, Mike, > > On Thu, Jul 30, 2020 at 02:49:18PM -0700, Mike Kravetz wrote: >> On 7/30/20 1:16 PM, Peter Xu wrote: >>> This is found by code observation only. >>> >>> Firstly, the worst case scenario should assume the whole range was covered by >>> pmd sharing. The old algorithm might not work as expected for ranges >>> like (1g-2m, 1g+2m), where the adjusted range should be (0, 1g+2m) but the >>> expected range should be (0, 2g). >>> >>> Since at it, remove the loop since it should not be required. With that, the >>> new code should be faster too when the invalidating range is huge. >> >> Thanks Peter! >> >> That is certainly much simpler than the loop in current code. You say there >> are instances where old code 'might not work' for ranges like (1g-2m, 1g+2m). >> Not sure I understand what you mean by adjusted and expected ranges in the >> message. Both are possible 'adjusted' ranges depending on vma size. >> >> Just trying to figure out if there is an actual problem in the existing code >> that needs to be fixed in stable. I think the existing code is correct, just >> inefficient. > > Thanks for the quick review! > > I'm not sure whether that will cause a real problem, but iiuc in my previous > example of (1g-2m, 1g+2m) in the commit message, the old code will extend the > range to (0, 1g+2m). In this case, if unluckily the (1g, 2g) range is a pud > with shared pmd, then imho we face the risk of partial tlb flushing with the > old code, because it will only flush tlb for range (0, 1g+2m) but not (0, 2g). > If that's the case, maybe it worths cc stable. > > Anyway, I'd like to double confirm with you in case I missed something. You are correct. With range (1g-2m, 1g+2m) within a vma (0, 2g) the existing code will only adjust to (0, 1g+2m) which is incorrect. We should cc stable. The original reason for adjusting the range was to prevent data corruption (getting wrong page). Since the range is not always adjusted correctly, the potential for corruption still exists. However, I am fairly confident that adjust_range_if_pmd_sharing_possible is only gong to be called in two cases: 1) for a single page 2) for range == entire vma In those cases, the current code should produce the correct results. To be safe, let's just cc stable. Also, Fixes: 017b1660df89 ("mm: migration: fix migration of huge PMD shared pages") Reviewed-by: Mike Kravetz <mike.kravetz@xxxxxxxxxx> -- Mike Kravetz