On 6/26/18 12:43 AM, Peter Zijlstra wrote:
> On Mon, Jun 25, 2018 at 05:06:23PM -0700, Yang Shi wrote:
>> Looking at this more deeply, we may not be able to cover the whole
>> unmapping range with VM_DEAD, for example, if the start addr is in the
>> middle of a vma. We can't set VM_DEAD on that vma since that would
>> trigger SIGSEGV for the still mapped area.
>>
>> Splitting can't be done with read mmap_sem held, so maybe just set
>> VM_DEAD on non-overlapped vmas. Access to overlapped vmas (first and
>> last) will still have undefined behavior.
>
> Acquire mmap_sem for writing, split, mark VM_DEAD, drop mmap_sem. Acquire
> mmap_sem for reading, madv_free, drop mmap_sem. Acquire mmap_sem for
> writing, free everything left, drop mmap_sem.
>
> ?
>
> Sure, you acquire the lock 3 times, but both write instances should be
> 'short', and I suppose you can do a demote between 1 and 2 if you care.

Thanks, Peter. Yes, after looking at the code and trying two different
approaches, this one looks like the most straightforward. Splitting the
vmas up front can save a lot of pain later. Holding mmap_sem for write to
do this before zapping the mappings sounds worth the cost (a very short
write critical section).

Also, VM_DEAD can be set exclusively under write mmap_sem without racing
with page faults, which gives us consistent behavior for the race between
PF and munmap. And we don't need to care about overlapped vmas, since they
have already been split.

Yang