On 2020/3/23 10:54, Mike Kravetz wrote: > On 3/22/20 7:03 PM, Longpeng (Mike, Cloud Infrastructure Service Product Dept.) wrote: >> >> >> On 2020/3/22 7:38, Mike Kravetz wrote: >>> On 2/21/20 7:33 PM, Longpeng(Mike) wrote: >>>> From: Longpeng <longpeng2@xxxxxxxxxx> >>>> >>>> Our machine encountered a panic(addressing exception) after run >>>> for a long time and the calltrace is: [snip] >>>> >>>> We can avoid this race by read the pud only once. What's more, we also use >>>> READ_ONCE to access the entries for safe(e.g. avoid the compilier mischief) >>>> >>>> Cc: Matthew Wilcox <willy@xxxxxxxxxxxxx> >>>> Cc: Sean Christopherson <sean.j.christopherson@xxxxxxxxx> >>>> Cc: Mike Kravetz <mike.kravetz@xxxxxxxxxx> >>>> Cc: stable@xxxxxxxxxxxxxxx >>>> Signed-off-by: Longpeng <longpeng2@xxxxxxxxxx> >>> >>> Andrew dropped this patch from his tree which caused me to go back and >>> look at the status of this patch/issue. >>> >>> It is pretty obvious that code in the current huge_pte_offset routine >>> is racy. I checked out the assembly code produced by my compiler and >>> verified that the line, >>> >>> if (pud_huge(*pud) || !pud_present(*pud)) >>> >>> does actually dereference *pud twice. So, the value could change between >>> those two dereferences. Longpeng (Mike) could easlily recreate the issue >>> if he put a delay between the two dereferences. I believe the only >>> reservations/concerns about the patch below was the use of READ_ONCE(). >>> Is that correct? >>> >> Hi Mike, >> >> It seems I've missed your another mail in my client, I found it here >> (https://lkml.org/lkml/2020/2/27/1927) just now. >> >> I think we have reached an agreement that the pud/pmd need READ_ONCE in >> huge_pte_offset() and disagreement is whether the pgd/p4d also need READ_ONCE, >> right ? > > Correct. > > Sorry, I did not reply to the mail thread with more context. > >>> Are there any objections to the patch if the READ_ONCE() calls are removed? >>> >> Because the pgd/p4g are only accessed and dereferenced once here, so some guys >> want to remove it. >> >> But we must make sure they are *really* accessed once, in other words, this >> makes we need to care about both the implementation of pgd_present/p4d_present >> and the behavior of any compiler, for example: >> >> ''' >> static inline int func(int val) >> { >> return subfunc1(val) & subfunc2(val); >> } >> >> func(*p); // int *p >> ''' >> We must make sure there's no strange compiler to generate an assemble code that >> access and dereference 'p' more than once. >> >> I've not found any backwards with READ_ONCE here. However, if you also agree to >> remove READ_ONCE around pgd/p4d, I'll do. >> > > I would like to remove the READ_ONCE calls and move the patch forward. It > does address a real issue you are seeing. > > To be honest, I am more worried about the races in lookup_address_in_pgd() > than using or not using READ_ONCE for pgd/p4d in this patch. > I had the same worry, we've discussed in another thread (https://lkml.org/lkml/2020/2/20/1182) where I asked you `Is it possible the pud changes from pud_huge() to pud_none() while another CPU is walking the pagetable` and you thought it's possible. The reason why I didn't do something in lookup_address_in_pgd together is just because I haven't went into trouble caused by it yet. > I have not looked closely at the generated code for lookup_address_in_pgd. > It appears that it would dereference p4d, pud and pmd multiple times. Sean > seemed to think there was something about the calling context that would > make issues like those seen with huge_pte_offset less likely to happen. I > do not know if this is accurate or not. > > Let's remove the two READ_ONCE calls and move this patch forward. We can > look closer at lookup_address_in_pgd and generate another patch if that needs > to be fixed as well. > OK, I'll remove them in v3. I'll do some fault injection or add some delays in lookup_address_in_pgd to test if it can work well. > Thanks > --- Regards, Longpeng(Mike)