Re: [PATCH] thp, mm: remove comments on serializion of THP split vs. gup_fast

On Thu, Feb 25, 2016 at 10:50:14PM -0800, Hugh Dickins wrote:
> It's a useful suggestion from Gerald, and your THP rework may have
> brought us closer to being able to rely on RCU locking rather than
> IRQ disablement there; but you were right just to delete the comment,
> there are other reasons why fast GUP still depends on IRQs disabled.
> 
> For example, see the fallback tlb_remove_table_one() in mm/memory.c:
> that one uses smp_call_function() sending IPI to all CPUs concerned,
> without waiting an RCU grace period at all.

I fully agree: the refcounting change just drops THP splitting from
the equation, but everything else remains. It's not like x86 is using
RCU for gup_fast when CONFIG_TRANSPARENT_HUGEPAGE=n.

The main issue, which Peter also pointed out, is how waiting for an RCU
grace period can be faster than sending an IPI only to the CPUs that
have an active_mm matching the one the page belongs to. I'm not exactly
sure that avoiding the cost of disabling irqs in gup_fast is going to
pay off. It's not just swap: a large munmap should be able to free up
pagetables, or the pagetables would get a footprint out of proportion
with the Rss of the process. In turn, munmap would have to either block
synchronously for a long time before returning to userland, or return
to userland while the pagetable memory is still not free. Userland may
then mmap and munmap again in a loop, legitimately so, with unclear
side effects with regard to false positive OOM.

Then there's another issue with synchronize_sched():
__get_user_pages_fast has to be safe to run from irq (note the
local_irq_save instead of local_irq_disable) and KVM leverages it. KVM
just requires it to be atomic so it can run from inside a preempt
disabled section (i.e. inside a spinlock). I'm fairly certain the
irq-safe guarantee could be dropped without pain and
rcu_read_lock_sched() would be enough, but the documentation of the
IRQ-safe guarantees provided by __get_user_pages_fast would also have
to be altered if we were to use synchronize_sched(), and that's a
symbol exported to GPL modules too.
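
To illustrate the local_irq_save vs. local_irq_disable note above, here
is a toy userspace model in C, not kernel code: a single bool stands in
for the CPU's interrupt-enable state, and every model_* and walk_* name
is hypothetical. It only tries to show why an unconditional
disable/enable pair would be wrong for a caller that already has irqs
off, while save/restore puts back whatever the caller had.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model (assumption): one flag stands in for the CPU irq-enable bit. */
static bool irqs_enabled = true;

/* Models the unconditional pair local_irq_disable()/local_irq_enable(). */
static void model_irq_disable(void) { irqs_enabled = false; }
static void model_irq_enable(void)  { irqs_enabled = true; }

/* Models local_irq_save(): remember the caller's state, then disable. */
static bool model_irq_save(void)
{
	bool flags = irqs_enabled;

	irqs_enabled = false;
	return flags;
}

/* Models local_irq_restore(): put back exactly what the caller had. */
static void model_irq_restore(bool flags) { irqs_enabled = flags; }

/* Hypothetical lockless walker built on the unconditional pair. */
static void walk_with_disable_enable(void)
{
	model_irq_disable();
	assert(!irqs_enabled);	/* the walk relies on irqs being off */
	model_irq_enable();	/* wrong if the caller entered with irqs off */
}

/* Hypothetical lockless walker built on save/restore. */
static void walk_with_save_restore(void)
{
	bool flags = model_irq_save();

	assert(!irqs_enabled);	/* the walk relies on irqs being off */
	model_irq_restore(flags);	/* caller's state is preserved */
}
```

In this model, a caller that enters walk_with_save_restore() with
irqs_enabled == false leaves with it still false, whereas
walk_with_disable_enable() hands it back as true, which is the kind of
breakage an irq-context caller could not tolerate.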

Overall, my main concern with switching x86 to RCU gup-fast is the
performance of synchronize_sched() in large munmap pagetable teardown.

Thanks,
Andrea
