Anthony Liguori wrote:
Chris Wright wrote:
* Anthony Liguori (anthony@xxxxxxxxxxxxx) wrote:
The ioctl() interface is quite bad for what you're doing. You're
telling the kernel extra information about a VA range in
userspace. That's what madvise is for. You're tweaking simple
read/write values of kernel infrastructure. That's what sysfs is for.
I agree re: sysfs (brought it up myself before). As far as madvise vs.
ioctl, the one thing that comes from the ioctl is fops->release to
automagically unregister memory on exit.
This is precisely why ioctl() is a bad interface. fops->release isn't
tied to the process but rather tied to the open file. The file can
stay open long after the process exits either by a fork()'d child
inheriting the file descriptor or through something more sinister like
SCM_RIGHTS.
In fact, a common mistake is to leak file descriptors by not closing
them when exec()'ing a process. Instead of just delaying a close, if
you rely on this behavior to unregister memory regions, you could
potentially have badness happen in the kernel if ksm attempted to
access an invalid memory region.
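
The lifetime point above (the open file outliving the process that opened it) can be shown from userspace. What follows is only a minimal sketch, not ksm code: a pipe stands in for the device file, the function name read_after_owner_exit is hypothetical, and it assumes a Unix-like system where os.fork() is available. The "owner" process exits right away, but a fork()'d child has inherited the descriptor, so the reader sees no end-of-file (the userspace analogue of the file not yet being released) until that child is gone too.

```python
import os
import time


def read_after_owner_exit():
    """Return the bytes read from a pipe whose 'owner' process has already exited."""
    r, w = os.pipe()
    owner = os.fork()
    if owner == 0:
        # "Owner" process: holds the write end, then forks and exits at once.
        os.close(r)
        if os.fork() == 0:
            # Grandchild: inherited the write end and keeps the file open.
            time.sleep(0.2)
            os.write(w, b"descriptor still open")
            os._exit(0)
        os._exit(0)

    # Original process: drop our own copy of the write end and wait until
    # the owner is really gone.
    os.close(w)
    os.waitpid(owner, 0)

    # If the file had been closed when the owner exited, this loop would hit
    # end-of-file immediately and return b"". Instead it blocks until the
    # grandchild writes, and sees end-of-file only after the grandchild exits.
    chunks = []
    while True:
        chunk = os.read(r, 4096)
        if not chunk:
            break
        chunks.append(chunk)
    os.close(r)
    return b"".join(chunks)


if __name__ == "__main__":
    print(read_after_owner_exit())
```

The same reasoning applies to a descriptor passed over a Unix socket with SCM_RIGHTS: the receiver holds its own reference to the open file, independent of the sender's lifetime.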
How could such badness ever happen in the kernel?
Ksm works by virtual addresses! It fetches the pages using
get_user_pages(), the mm struct is protected by get_task_mm(), and in
addition we take down_read(mmap_sem).
So how could ksm ever access an invalid memory region, unless the host
page table or get_task_mm() stopped working?
When someone registers memory for scanning, we do get_task_mm(). The
registration ends when the file is closed, or when he says that he doesn't
want the memory registered anymore by calling the unregister ioctl.
You can argue about the API, but saying that ksm is insecure is a
mathematical claim; please show me a scenario!