Re: [PATCH] mm/swapfile: fix swapoff vs. software dirty bits

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Sep 17, 2015 at 10:58:59AM +0200, Martin Schwidefsky wrote:
> Fixes a regression introduced with commit 179ef71cbc085252
> "mm: save soft-dirty bits on swapped pages"
> 
> The maybe_same_pte() function is used to match a swap pte independent
> of the swap software dirty bit set with pte_swp_mksoft_dirty().
> 
> For CONFIG_HAVE_ARCH_SOFT_DIRTY=y but CONFIG_MEM_SOFT_DIRTY=n the
> software dirty bit may be set but maybe_same_pte() will not recognize
> a software dirty swap pte. Due to this a 'swapoff -a' will hang.
> 
> The straightforward solution is to replace CONFIG_MEM_SOFT_DIRTY
> with HAVE_ARCH_SOFT_DIRTY in maybe_same_pte().
> 
> Cc: linux-mm@xxxxxxxxx
> Cc: Cyrill Gorcunov <gorcunov@xxxxxxxxx>
> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> Cc: Michal Hocko <mhocko@xxxxxxx>
> Signed-off-by: Martin Schwidefsky <schwidefsky@xxxxxxxxxx>
> ---
>  mm/swapfile.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index 5887731..bf7da58 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -1113,7 +1113,7 @@ unsigned int count_swap_pages(int type, int free)
>  
>  static inline int maybe_same_pte(pte_t pte, pte_t swp_pte)
>  {
> -#ifdef CONFIG_MEM_SOFT_DIRTY
> +#ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY
>  	/*
>  	 * When pte keeps soft dirty bit the pte generated
>  	 * from swap entry does not has it, still it's same

You know, I seem to miss how this might help. If CONFIG_MEM_SOFT_DIRTY=n
then all related helpers are nop'ed.

In particular in the commit you mentioned

+static inline int maybe_same_pte(pte_t pte, pte_t swp_pte)
+{
+#ifdef CONFIG_MEM_SOFT_DIRTY
+       /*
+        * When pte keeps soft dirty bit the pte generated
+        * from swap entry does not has it, still it's same
+        * pte from logical point of view.
+        */
+       pte_t swp_pte_dirty = pte_swp_mksoft_dirty(swp_pte);
+       return pte_same(pte, swp_pte) || pte_same(pte, swp_pte_dirty);
+#else
+       return pte_same(pte, swp_pte);
+#endif
+}
+
...
@@ -892,7 +907,7 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd,
        }
 
        pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
-       if (unlikely(!pte_same(*pte, swp_entry_to_pte(entry)))) {
+       if (unlikely(!maybe_same_pte(*pte, swp_entry_to_pte(entry)))) {
                mem_cgroup_cancel_charge_swapin(memcg);
                ret = 0;
                goto out;
@@ -947,7 +962,7 @@ static int unuse_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
                 * swapoff spends a _lot_ of time in this loop!
                 * Test inline before going to call unuse_pte.
                 */
-               if (unlikely(pte_same(*pte, swp_pte))) {
+               if (unlikely(maybe_same_pte(*pte, swp_pte))) {
                        pte_unmap(pte);
                        ret = unuse_pte(vma, pmd, addr, entry, page);
                        if (ret)

Thus when CONFIG_MEM_SOFT_DIRTY = n, the unuse_pte will be the same
as it were without the patch, calling pte_same.

Now to the bit itself

#ifdef CONFIG_MEM_SOFT_DIRTY
#define _PAGE_SWP_SOFT_DIRTY_PAGE_PSE
#else
#define _PAGE_SWP_SOFT_DIRTY_PAGE_PSE(_AT(pteval_t, 0))
#endif

it's 0 if CONFIG_MEM_SOFT_DIRTY=n, so any setup of this
bit will simply become nop

#ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY
static inline pte_t pte_swp_mksoft_dirty(pte_t pte)
{
	return pte_set_flags(pte, _PAGE_SWP_SOFT_DIRTY);
}

static inline int pte_swp_soft_dirty(pte_t pte)
{
	return pte_flags(pte) & _PAGE_SWP_SOFT_DIRTY;
}

static inline pte_t pte_swp_clear_soft_dirty(pte_t pte)
{
	return pte_clear_flags(pte, _PAGE_SWP_SOFT_DIRTY);
}
#endif

So I fear I'm lost where this "set" of the bit comes from
when CONFIG_MEM_SOFT_DIRTY=n.

Martin, could you please elaborate? Seems I'm missing
something obvious.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>



[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]