On 8/8/18 2:22 AM, Vlastimil Babka wrote:
On 08/08/2018 03:51 AM, Yang Shi wrote:
On 8/6/18 10:45 PM, Michal Hocko wrote:
On Mon 06-08-18 15:19:06, Yang Shi wrote:
On 8/6/18 1:52 PM, Michal Hocko wrote:
On Mon 06-08-18 13:48:35, Yang Shi wrote:
On 8/6/18 1:41 PM, Michal Hocko wrote:
On Mon 06-08-18 09:46:30, Yang Shi wrote:
On 8/6/18 2:40 AM, Michal Hocko wrote:
On Fri 03-08-18 14:01:58, Yang Shi wrote:
On 8/3/18 2:07 AM, Michal Hocko wrote:
On Fri 27-07-18 02:10:14, Yang Shi wrote:
[...]
If the vma has VM_LOCKED | VM_HUGETLB | VM_PFNMAP or uprobe, they are
considered as special mappings. They will be dealt with before zapping
pages with write mmap_sem held. Basically, just update vm_flags.
Well, I think it would be safer to simply fall back to the current
implementation with these mappings and deal with them on top. This would
make potential issues easier to bisect, and partial reverts easier as well.
Do you mean just call do_munmap()? It sounds ok. Although we may waste some
cycles repeating what has already been done, it does not sound too bad since
those special mappings should not be very common.
VM_HUGETLB is quite widespread. Especially for DB workloads.
Wait a minute. In this way, it sounds like we go back to my old implementation
with special handling for those mappings with write mmap_sem held, right?
Yes, I would really start simple and add further enhancements on top.
If updating vm_flags with read lock is safe in this case, we don't have to
do this. The only reason for this special handling is about vm_flags update.
Yes, maybe you are right that this is safe. I would still argue to have
it in a separate patch for easier review, bisectability etc...
Sorry, I'm a little bit confused. Do you mean I should have the patch
*without* handling the special case (i.e. just assume it is safe to
update vm_flags with read lock), then have the other patch on top of it,
which simply calls do_munmap() to deal with the special cases?
Just skip those special cases in the initial implementation and handle
each special case in its own patch on top.
Thanks. VM_LOCKED area will not be handled specially since it is easy to
handle it, just follow what do_munmap does. The special cases will just
handle VM_HUGETLB, VM_PFNMAP and uprobe mappings.
So I think you could maybe structure code like this: instead of
introducing do_munmap_zap_rlock() and all those "bool skip_vm_flags"
additions, add a boolean parameter in do_munmap() to use the new
behavior, with only the first user SYSCALL_DEFINE2(munmap) setting it to
true. If true, do_munmap() will do the
- down_write_killable() itself instead of assuming it's already locked
- munmap_lookup_vma()
- check if any of the vma's in the range is "special", if yes, change
the boolean param to "false", and continue like previously, e.g. no mmap
sem downgrade etc.
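
For illustration only, the flow suggested above could be sketched as a small
userspace mock. This is not kernel code: struct mock_mm, struct mock_vma,
the MOCK_VM_* bits, vma_is_special() and mock_do_munmap() are all
hypothetical names, and the lock is modelled as a plain state variable
rather than a real rw_semaphore. The "special" test assumes the later
agreement in this thread (VM_HUGETLB, VM_PFNMAP and uprobe mappings, with
VM_LOCKED handled as in do_munmap()).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for vm_flags bits; the values are illustrative only. */
#define MOCK_VM_HUGETLB 0x2u
#define MOCK_VM_PFNMAP  0x4u

struct mock_vma {
    unsigned long start, end;   /* covers [start, end) */
    unsigned int flags;
    bool has_uprobe;            /* assumption: uprobe presence reduced to a flag */
};

enum lock_state { UNLOCKED, WRITE_LOCKED, READ_LOCKED };

struct mock_mm {
    enum lock_state sem;        /* models mmap_sem as a simple state */
    int downgrades;             /* counts how often the downgrade path was taken */
};

/* Assumed definition of "special": hugetlb, pfnmap or uprobe mappings. */
static bool vma_is_special(const struct mock_vma *v)
{
    return (v->flags & (MOCK_VM_HUGETLB | MOCK_VM_PFNMAP)) || v->has_uprobe;
}

/*
 * Returns 1 if the range was unmapped with the lock downgraded to read,
 * 0 if the legacy write-locked path was used, -1 on a locking error.
 */
static int mock_do_munmap(struct mock_mm *mm, const struct mock_vma *vmas,
                          size_t n, unsigned long start, unsigned long len,
                          bool downgrade)
{
    const bool took_lock = downgrade;   /* the new mode locks by itself */
    const unsigned long end = start + len;
    size_t i;

    assert(mm != NULL && (vmas != NULL || n == 0));

    if (took_lock) {
        if (mm->sem != UNLOCKED)
            return -1;
        mm->sem = WRITE_LOCKED;         /* stands in for down_write_killable() */
    } else if (mm->sem != WRITE_LOCKED) {
        return -1;                      /* legacy callers must hold the write lock */
    }

    /* Lookup step: any special vma in the range forces the old behavior. */
    for (i = 0; i < n; i++) {
        bool overlaps = vmas[i].start < end && start < vmas[i].end;

        if (overlaps && vma_is_special(&vmas[i]))
            downgrade = false;
    }

    /* Detaching the vmas would happen here, still under the write lock. */

    if (downgrade) {
        mm->sem = READ_LOCKED;          /* stands in for downgrade_write() */
        mm->downgrades++;
    }

    /* Zapping pages and freeing page tables would happen here. */

    if (took_lock)
        mm->sem = UNLOCKED;             /* release whichever lock mode we hold */

    return downgrade ? 1 : 0;
}
```

In this sketch only a caller that passes true (the munmap syscall in the
proposal) ever sees the downgrade; callers passing false keep today's
contract of entering and leaving with the write lock held.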
Thanks for the suggestion. Actually, I did a similar thing in the v1
patches, which added a bool parameter in vm_munmap() to tell if
releasing mmap_sem is acceptable for some code paths. But, it got pushed
back by tglx since vm_munmap() is called by x86 specific code too (and
some other architectures). He suggested to define a new function to do
the optimization. So, I followed this approach in the later versions.
Yang
That would be a basis for further optimizing the special vma cases in
subsequent patches (maybe it's really ok to touch the vma flags with
mmap sem for read as vma's are detached), and to eventually convert more
do_munmap() callers to the new mode.
HTH,
Vlastimil