Re: [PATCH V6 01/10] Use copy_process in vhost layer

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 12/17/21 1:26 PM, Eric W. Biederman wrote:
> Mike Christie <michael.christie@xxxxxxxxxx> writes:
> 
>> The following patches made over Linus's tree, allow the vhost layer to do
>> a copy_process on the thread that does the VHOST_SET_OWNER ioctl like how
>> io_uring does a copy_process against its userspace app. This allows the
>> vhost layer's worker threads to inherit cgroups, namespaces, address
>> space, etc and this worker thread will also be accounted for against that
>> owner/parent process's RLIMIT_NPROC limit.
>>
>> If you are not familiar with qemu and vhost here is more detailed
>> problem description:
>>
>> Qemu will create vhost devices in the kernel which perform network, SCSI,
>> etc IO and management operations from worker threads created by the
>> kthread API. Because the kthread API does a copy_process on the kthreadd
>> thread, the vhost layer has to use kthread_use_mm to access the Qemu
>> thread's memory and cgroup_attach_task_all to add itself to the Qemu
>> thread's cgroups.
>>
>> The problem with this approach is that we then have to add new functions/
>> args/functionality for every thing we want to inherit. I started doing
>> that here:
>>
>> https://urldefense.com/v3/__https://lkml.org/lkml/2021/6/23/1233__;!!ACWV5N9M2RV99hQ!eIaEe9V8mCgGU6vyvaWTKGi3Zlnz0rgk5Y-0nsBXRbsuVZsM8lYfHr8I8GRuoLYPYrOB$ 
>>
>> for the RLIMIT_NPROC check, but it seems it might be easier to just
>> inherit everything from the beginning, becuase I'd need to do something
>> like that patch several times.
> 
> I read through the code and I don't see why you want to make these
> almost threads of a process not actually threads of that process
> (like the io_uring threads are).
> 
> As a separate process there are many things that will continue to be
> disjoint.  If the thread changes cgroups for example your new process
> won't follow.
> 
> If you want them to be actually processes with an lifetime independent
> of the creating process I expect you want to reparent them to the local
> init process.  Just so they don't confuse the process tree.  Plus init
> processes know how to handle unexpected children.
> 
> What are the semantics you are aiming for?
> 

Hi Eric,

Right now, for vhost we need the thread being created:

1. added to the caller's cgroup.
2. to share the mm struct with the caller.
3. to be accounted for under the caller's nproc rlimit value.

For 1 and 2, we have cgroup_attach_task_all and get_task_mm
already.

This patchset started with me just trying to handle #3 by modifying kthreads
like here:

https://lkml.org/lkml/2021/6/23/1234

So we can use kthreads and the existing helpers and add:

A. a ucounts version of the above patches in the link

or

B. a helper that does something like copy_process's use of
is_ucounts_overlimit and vhost can call that.

instead of this patchset.


Before we even get to the next section below, do you consider items 1 - 3
something we need an API based on copy_process for?

Do you think I should just do A or B above, or do you have another idea? If
so can we get agreement on that from everyone?

I thought my patches in that link were a little hacky in how they passed
around the user/creds info. I thought maybe it shouldn't be passed around like
that, so switched to the copy_process based approach which did everything for
me. And I thought io_uring needed something similar as us so I made it generic.

I don't have a preference. You and Christian are the experts, so I'll leave it
to you guys.


> I can see sense in generalizing some of the pieces of create_io_thread
> but I think generalizing create_io_thread itself is premature.  The code
> lives in kernel/fork.c because it is a very special thing that we want
> to keep our eyes on.
> 
> Some of your generalization makes it much more difficult to tell what
> your code is going to use because you remove hard codes that are there
> to simplify the analysis of the situation.
> 
> My gut says adding a new create_vhost_worker and putting that in
> kernel/fork.c is a lot safer and will allow much better code analysis.
> 
> If there a really are commonalities between creating a userspace process
> that runs completely in the kernel and creating an additional userspace
> thread we refactor the code and simplify things.
> 
> I am especially nervous about generalizing the io_uring code as it's
> signal handling just barely works, and any generalization will cause it
> to break.  So you are in the process of generalizing code that simply
> can not handle the general case.  That scares me
_______________________________________________
Virtualization mailing list
Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linuxfoundation.org/mailman/listinfo/virtualization



[Index of Archives]     [KVM Development]     [Libvirt Development]     [Libvirt Users]     [CentOS Virtualization]     [Netdev]     [Ethernet Bridging]     [Linux Wireless]     [Kernel Newbies]     [Security]     [Linux for Hams]     [Netfilter]     [Bugtraq]     [Yosemite Forum]     [MIPS Linux]     [ARM Linux]     [Linux RAID]     [Linux Admin]     [Samba]

  Powered by Linux