Hugh Dickins <hughd@xxxxxxxxxx> writes: > On Mon, 11 Jan 2016, Aneesh Kumar K.V wrote: >> Hugh Dickins <hughd@xxxxxxxxxx> writes: >> >> > Swapoff after swapping hangs on the G5, when CONFIG_CHECKPOINT_RESTORE=y >> > but CONFIG_MEM_SOFT_DIRTY is not set. That's because the non-zero >> > _PAGE_SWP_SOFT_DIRTY bit, added by CONFIG_HAVE_ARCH_SOFT_DIRTY=y, is not >> > discounted when CONFIG_MEM_SOFT_DIRTY is not set: so swap ptes cannot be >> > recognized. >> > >> > (I suspect that the peculiar dependence of HAVE_ARCH_SOFT_DIRTY on >> > CHECKPOINT_RESTORE in arch/powerpc/Kconfig comes from an incomplete >> > attempt to solve this problem.) >> > >> > It's true that the relationship between CONFIG_HAVE_ARCH_SOFT_DIRTY and >> > and CONFIG_MEM_SOFT_DIRTY is too confusing, and it's true that swapoff >> > should be made more robust; but nevertheless, fix up the powerpc ifdefs >> > as x86_64 and s390 (which met the same problem) have them, defining the >> > bits as 0 if CONFIG_MEM_SOFT_DIRTY is not set. >> >> Do we need this patch, if we make the maybe_same_pte() more robust. The >> #ifdef with pte bits is always a confusing one and IMHO, we should avoid >> that if we can ? > > If maybe_same_pte() were more robust (as in the pte_same_as_swp() patch), > this patch here becomes an optimization rather than a correctness patch: > without this patch here, pte_same_as_swp() will perform an unnecessary > transformation (masking out _PAGE_SWP_SOFT_DIRTY) from every one of the > millions of ptes it has to examine, on configs where it couldn't be set. > Or perhaps the processor gets that all nicely lined up without any actual > delay, I don't know. But we have #ifndef CONFIG_HAVE_ARCH_SOFT_DIRTY static inline pte_t pte_swp_clear_soft_dirty(pte_t pte) { return pte; } #endif If we fix the CONFIG_HAVE_ARCH_SOFT_DIRTY correctly, we can do the same optmization without the #ifdef of pte bits right ? > > I've already agreed that the way SOFT_DIRTY is currently config'ed is > too confusing; but until that's improved, I strongly recommend that you > follow the same way of handling this as x86_64 and s390 are doing - going > off and doing it differently is liable to lead to error, as we have seen. > > So I recommend using the patch below too, whether or not you care for > the optimization. > > Hugh -aneesh -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>