Re: [RFC PATCH 4/6] mm: Add COW PTE fallback function

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




Le 19/05/2022 à 20:31, Chih-En Lin a écrit :
> The lifetime of COW PTE will handle by ownership and a reference count.
> When the process wants to write the COW PTE, which reference count is 1,
> it will reuse the COW PTE instead of copying then free.
> 
> Only the owner will update its RSS state and the record of page table
> bytes allocation. So we need to handle when the non-owner process gets
> the fallback COW PTE.
> 
> This commit prepares for the following implementation of the reference
> count for COW PTE.
> 
> Signed-off-by: Chih-En Lin <shiyn.lin@xxxxxxxxx>
> ---
>   mm/memory.c | 66 +++++++++++++++++++++++++++++++++++++++++++++++++++++
>   1 file changed, 66 insertions(+)
> 
> diff --git a/mm/memory.c b/mm/memory.c
> index 76e3af9639d9..dcb678cbb051 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -1000,6 +1000,34 @@ page_copy_prealloc(struct mm_struct *src_mm, struct vm_area_struct *vma,
>          return new_page;
>   }
> 
> +static inline void cow_pte_rss(struct mm_struct *mm, struct vm_area_struct *vma,
> +       pmd_t *pmdp, unsigned long addr, unsigned long end, bool inc_dec)

Parenthesis alignment is not correct.

You should run 'scripts/checkpatch.pl --strict' on you patch.

> +{
> +       int rss[NR_MM_COUNTERS];
> +       pte_t *orig_ptep, *ptep;
> +       struct page *page;
> +
> +       init_rss_vec(rss);
> +
> +       ptep = pte_offset_map(pmdp, addr);
> +       orig_ptep = ptep;
> +       arch_enter_lazy_mmu_mode();
> +       do {
> +               if (pte_none(*ptep) || pte_special(*ptep))
> +                       continue;
> +
> +               page = vm_normal_page(vma, addr, *ptep);
> +               if (page) {
> +                       if (inc_dec)
> +                               rss[mm_counter(page)]++;
> +                       else
> +                               rss[mm_counter(page)]--;
> +               }
> +       } while (ptep++, addr += PAGE_SIZE, addr != end);
> +       arch_leave_lazy_mmu_mode();
> +       add_mm_rss_vec(mm, rss);
> +}
> +
>   static int
>   copy_pte_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma,
>                 pmd_t *dst_pmd, pmd_t *src_pmd, unsigned long addr,
> @@ -4554,6 +4582,44 @@ static vm_fault_t wp_huge_pud(struct vm_fault *vmf, pud_t orig_pud)
>          return VM_FAULT_FALLBACK;
>   }
> 
> +/* COW PTE fallback to normal PTE:
> + * - two state here
> + *   - After break child :   [parent, rss=1, ref=1, write=NO , owner=parent]
> + *                        to [parent, rss=1, ref=1, write=YES, owner=NULL  ]
> + *   - After break parent:   [child , rss=0, ref=1, write=NO , owner=NULL  ]
> + *                        to [child , rss=1, ref=1, write=YES, owner=NULL  ]
> + */
> +void cow_pte_fallback(struct vm_area_struct *vma, pmd_t *pmd,
> +               unsigned long addr)

There should be a prototype in a header somewhere for a non static function.

You are encouraged to run 'make mm/memory.o C=2' to check sparse reports.

> +{
> +       struct mm_struct *mm = vma->vm_mm;
> +       unsigned long start, end;
> +       pmd_t new;
> +
> +       BUG_ON(pmd_write(*pmd));

You seem to add a lot of BUG_ONs(). Are they really necessary ? See 
https://docs.kernel.org/process/deprecated.html?highlight=bug_on#bug-and-bug-on

You may also use VM_BUG_ON().

> +
> +       start = addr & PMD_MASK;
> +       end = (addr + PMD_SIZE) & PMD_MASK;
> +
> +       /* If pmd is not owner, it needs to increase the rss.
> +        * Since only the owner has the RSS state for the COW PTE.
> +        */
> +       if (!cow_pte_owner_is_same(pmd, pmd)) {
> +               cow_pte_rss(mm, vma, pmd, start, end, true /* inc */);
> +               mm_inc_nr_ptes(mm);
> +               smp_wmb();
> +               pmd_populate(mm, pmd, pmd_page(*pmd));
> +       }
> +
> +       /* Reuse the pte page */
> +       set_cow_pte_owner(pmd, NULL);
> +       new = pmd_mkwrite(*pmd);
> +       set_pmd_at(mm, addr, pmd, new);
> +
> +       BUG_ON(!pmd_write(*pmd));
> +       BUG_ON(pmd_page(*pmd)->cow_pte_owner);
> +}
> +
>   /*
>    * These routines also need to handle stuff like marking pages dirty
>    * and/or accessed for architectures that don't do it in hardware (most
> --
> 2.36.1
> 




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux