On Friday, 11 June 2021 9:04:19 AM AEST Peter Xu wrote:
> On Fri, Jun 11, 2021 at 12:21:26AM +1000, Alistair Popple wrote:
> > > Hmm, the thing is.. to me FOLL_SPLIT_PMD should have similar effect to explicit
> > > call split_huge_pmd_address(), afaict. Since both of them use __split_huge_pmd()
> > > internally which will generate that unwanted CLEAR notify.
> >
> > Agree that gup calls __split_huge_pmd() via split_huge_pmd_address()
> > which will always CLEAR. However gup only calls split_huge_pmd_address() if it
> > finds a thp pmd. In follow_pmd_mask() we have:
> >
> >         if (likely(!pmd_trans_huge(pmdval)))
> >                 return follow_page_pte(vma, address, pmd, flags, &ctx->pgmap);
> >
> > So I don't think we have a problem here.
>
> Sorry I didn't follow here.. We do FOLL_SPLIT_PMD after this check, right? I
> mean, if it's a thp for the current mm, afaict pmd_trans_huge() should return
> true above, so we'll skip follow_page_pte(); then we'll check FOLL_SPLIT_PMD
> and do the split, then the CLEAR notify. Hmm.. Did I miss something?

That seems correct - if the thp is not mapped with a pmd we won't split and we
won't CLEAR. If there is a thp pmd we will split and CLEAR, but in that case it
is fine - we will retry, and the retry won't CLEAR because the pmd has already
been split.

The issue with doing the split unconditionally in make device exclusive is that
you *always* CLEAR, even if there is no thp pmd to split. Or at least that's my
understanding; please let me know if it doesn't make sense.

 - Alistair

> --
> Peter Xu
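
To illustrate the difference described above, here is a toy userspace model. It is only a sketch and not kernel code: every toy_* name is made up for illustration, and it assumes (as stated in the thread) that the split path notifies CLEAR on every call, whether or not a thp pmd is present.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names are kernel APIs. */
struct toy_pmd {
	bool trans_huge;   /* stands in for pmd_trans_huge() returning true */
	int clear_events;  /* number of CLEAR notifications generated so far */
};

/*
 * Models the behaviour described for the split path: it notifies CLEAR
 * every time it is called, even if there is no thp pmd left to split.
 */
static void toy_split(struct toy_pmd *pmd)
{
	pmd->clear_events++;
	pmd->trans_huge = false;
}

/* gup-style attempt: only split when a thp pmd is actually found. */
static void toy_attempt_conditional(struct toy_pmd *pmd)
{
	if (pmd->trans_huge)
		toy_split(pmd);
}

/* The unconditional variant under discussion: split on every attempt. */
static void toy_attempt_unconditional(struct toy_pmd *pmd)
{
	toy_split(pmd);
}

/* Run `attempts` tries and return how many CLEAR events were generated. */
static int toy_count_clears(bool start_huge, bool unconditional, int attempts)
{
	struct toy_pmd pmd = { .trans_huge = start_huge, .clear_events = 0 };

	for (int i = 0; i < attempts; i++) {
		if (unconditional)
			toy_attempt_unconditional(&pmd);
		else
			toy_attempt_conditional(&pmd);
	}
	return pmd.clear_events;
}
```

In this model, toy_count_clears(true, false, 3) returns 1: the first attempt splits and CLEARs, and the retries see no thp pmd and stay quiet. With the unconditional variant, toy_count_clears(false, true, 3) returns 3, i.e. every attempt CLEARs even though there was never a thp pmd to split.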