On 06.08.22 01:13, Peter Xu wrote: > On Fri, Aug 05, 2022 at 01:48:35PM -0700, Mike Kravetz wrote: >> On 08/05/22 20:57, David Hildenbrand wrote: >>> On 05.08.22 20:33, Mike Kravetz wrote: >>>> On 08/05/22 20:25, David Hildenbrand wrote: >>>>> On 05.08.22 20:23, Mike Kravetz wrote: >>>>>> On 08/05/22 14:14, Peter Xu wrote: >>>>>>> On Fri, Aug 05, 2022 at 01:03:28PM +0200, David Hildenbrand wrote: >>>>>>>> diff --git a/mm/mmap.c b/mm/mmap.c >>>>>>>> index 61e6135c54ef..462a6b0344ac 100644 >>>>>>>> --- a/mm/mmap.c >>>>>>>> +++ b/mm/mmap.c >>>>>>>> @@ -1683,6 +1683,13 @@ int vma_wants_writenotify(struct vm_area_struct *vma, pgprot_t vm_page_prot) >>>>>>>> if ((vm_flags & (VM_WRITE|VM_SHARED)) != ((VM_WRITE|VM_SHARED))) >>>>>>>> return 0; >>>>>>>> >>>>>>>> + /* >>>>>>>> + * Hugetlb does not require/support writenotify; especially, it does not >>>>>>>> + * support softdirty tracking. >>>>>>>> + */ >>>>>>>> + if (is_vm_hugetlb_page(vma)) >>>>>>>> + return 0; >>>>>>> >>>>>>> I'm kind of confused here.. you seems to be fixing up soft-dirty for >>>>>>> hugetlb but here it's explicitly forbidden. >>>>>>> >>>>>>> Could you explain a bit more on why this patch is needed if (assume >>>>>>> there'll be a working) patch 2 being provided? >>>>>>> >>>>>> >>>>>> No comments on the patch, but ... >>>>>> >>>>>> Since it required little thought, I ran the test program on next-20220802 and >>>>>> was surprised that the issue did not recreate. Even added a simple printk >>>>>> to make sure we were getting into vma_wants_writenotify with a hugetlb vma. >>>>>> We were. >>>>> >>>>> >>>>> ... does your config have CONFIG_MEM_SOFT_DIRTY enabled? >>>>> >>>> >>>> No, Duh! >>>> >>>> FYI - Some time back, I started looking at adding soft dirty support for >>>> hugetlb mappings. I did not finish that work. But, I seem to recall >>>> places where code was operating on hugetlb mappings when perhaps it should >>>> not. >>>> >>>> Perhaps, it would also be good to just disable soft dirty for hugetlb at >>>> the source? >>> >>> I thought about that as well. But I came to the conclusion that without >>> patch #2, hugetlb VMAs cannot possibly support write-notify, so there is >>> no need to bother in vma_wants_writenotify() at all. >>> >>> The "root" would be places where we clear VM_SOFTDIRTY. That should only >>> be fs/proc/task_mmu.c:clear_refs_write() IIRC. >>> >>> So I don't particularly care, I consider this patch a bit cleaner and >>> more generic, but I can adjust clear_refs_write() instead of there is a >>> preference. >>> >> >> After a closer look, I agree that this may be the simplest/cleanest way to >> proceed. I was going to suggest that you note hugetlb does not support >> softdirty, but see you did in the comment. >> >> Acked-by: Mike Kravetz <mike.kravetz@xxxxxxxxxx> > > Filtering out hugetlbfs in vma_wants_writenotify() is still a bit hard to > follow to me, since it's not clear why hugetlbfs never wants writenotify. Well, because the write-fault handler as is cannot deal with write-faults in shared mappings. It cannot possibly work or ever have worked. > > If it's only about soft-dirty, we could have added the hugetlbfs check into > vma_soft_dirty_enabled(), then I think it'll achieve the same thing and > much clearer - with the soft-dirty check constantly returning false for it, > hugetlbfs shared vmas should have vma_wants_writenotify() naturally return > 0 already. I considered that an option, but went with this approach for simplicity and because I don't see a need to check for hugetlb in other code paths (especially, in the !hugetlb mprotect variant). I mean, with patch #2 we can remove it anytime we do support softdirty tracking -- or if there is need for write-notify we can move it into the softdirty check only. > > For the long term - shouldn't we just enable soft-dirty for hugetlbfs? I > remember Mike used to have that in todo. Since we've got patch 2 already, > I feel like that's really much close (is the only missing piece the clear > refs write part? or maybe some more that I didn't notice). My gut feeling is that there isn't too much missing to have it working. Define a PTE bit, implement hugetlb variant of clearing, and set it when setting the PTE dirty. Thanks! -- Thanks, David / dhildenb