Re: [RFC PATCH v4 3/4] mm: add do_set_pte_range()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 2/7/2023 1:35 AM, David Hildenbrand wrote:
> On 06.02.23 18:10, Matthew Wilcox wrote:
>> On Mon, Feb 06, 2023 at 05:49:20PM +0100, David Hildenbrand wrote:
>>> We have
>>>
>>> +    if (!cow) {
>>> +        folio_add_file_rmap_range(folio, start, nr, vma, false);
>>> +        add_mm_counter(vma->vm_mm, mm_counter_file(page), nr);
>>> +    } else {
>>> +        /*
>>> +         * rmap code is not ready to handle COW with anonymous
>>> +         * large folio yet. Capture and warn if large folio
>>> +         * is given.
>>> +         */
>>> +        VM_WARN_ON_FOLIO(folio_test_large(folio), folio);
>>> +    }
>>>
>>> now.
>>>
>>> What are we supposed to add instead on the else branch instead that would be
>>> correct in the future? Or not look weird?
>>
>> Right now, I think this patch should look something like this.
>>
>> diff --git a/mm/memory.c b/mm/memory.c
>> index 7a04a1130ec1..2f6173f83d8b 100644
>> --- a/mm/memory.c
>> +++ b/mm/memory.c
>> @@ -4257,15 +4257,18 @@ vm_fault_t do_set_pmd(struct vm_fault *vmf, struct page *page)
>>   }
>>   #endif
>>   -void do_set_pte(struct vm_fault *vmf, struct page *page, unsigned long addr)
>> +void set_pte_range(struct vm_fault *vmf, struct folio *folio,
>> +        struct page *page, unsigned int nr, unsigned long addr)
>>   {
>>       struct vm_area_struct *vma = vmf->vma;
>>       bool uffd_wp = pte_marker_uffd_wp(vmf->orig_pte);
>>       bool write = vmf->flags & FAULT_FLAG_WRITE;
>>       bool prefault = vmf->address != addr;
>>       pte_t entry;
>> +    unsigned int i;
>>   -    flush_icache_page(vma, page);
>> +    for (i = 0; i < nr; i++)
>> +        flush_icache_page(vma, page + i);
>>       entry = mk_pte(page, vma->vm_page_prot);
>>         if (prefault && arch_wants_old_prefaulted_pte())
>> @@ -4279,14 +4282,15 @@ void do_set_pte(struct vm_fault *vmf, struct page *page, unsigned long addr)
>>           entry = pte_mkuffd_wp(entry);
>>       /* copy-on-write page */
>>       if (write && !(vma->vm_flags & VM_SHARED)) {
>> -        inc_mm_counter(vma->vm_mm, MM_ANONPAGES);
>> -        page_add_new_anon_rmap(page, vma, addr);
>> -        lru_cache_add_inactive_or_unevictable(page, vma);
>> +        add_mm_counter(vma->vm_mm, MM_ANONPAGES, nr);
>> +        VM_BUG_ON_FOLIO(folio, nr != 1);
> 
> ^ what I asked for (WARN would be sufficient for IMHO). I don't precisely care how precisely we tell the educated reader that this function only handles this special case (I could even be convinced that a comment is good enough ;) ).
David, Thanks. I am going to move the cow and !cow case to
set_pte_range() and WARN if the large folio is passed in.

> 
>> +        folio_add_new_anon_rmap(folio, vma, addr);
>> +        folio_add_lru_vma(folio, vma);
>>       } else {
>> -        inc_mm_counter(vma->vm_mm, mm_counter_file(page));
>> -        page_add_file_rmap(page, vma, false);
>> +        add_mm_counter(vma->vm_mm, mm_counter_file(page), nr);
>> +        folio_add_file_rmap_range(folio, page, nr, vma, false);
>>       }
>> -    set_pte_at(vma->vm_mm, addr, vmf->pte, entry);
>> +    set_ptes(vma->vm_mm, addr, vmf->pte, entry, nr);
Matthew, I think we can't do this set_ptes() until pte_next() is ready.
I will make it still a loop to generate correct entry. And we can replace
the loop with set_ptes() once the pte_next() is ready.

Let me know if I get something wrong. Thanks.

Regards
Yin, Fengwei

>>   }
>>     static bool vmf_pte_changed(struct vm_fault *vmf)
>> @@ -4359,7 +4363,9 @@ vm_fault_t finish_fault(struct vm_fault *vmf)
>>         /* Re-check under ptl */
>>       if (likely(!vmf_pte_changed(vmf))) {
>> -        do_set_pte(vmf, page, vmf->address);
>> +        struct folio *folio = page_folio(page);
>> +
>> +        set_pte_range(vmf, folio, page, 1, vmf->address);
>>             /* no need to invalidate: a not-present page won't be cached */
>>           update_mmu_cache(vma, vmf->address, vmf->pte);
>>
>>> Go on, scream louder at me, I don't care.
>>
>> I'm not even shouting.  I just think you're wrong ;-)
>>
> 
> Good ;)
> 




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux