Re: [PATCH] Export mm_update_next_owner function for vhost-net(Internet mail)


 



>>   Under normal circumstances, when do_exit() runs, mm->owner will
>>   be updated in exit_mm(). But when a kernel thread calls
>>   unuse_mm() and then exits, mm->owner cannot be updated, and it
>>   will point to a task that has been released.
>>
>>   Below is my issue on vhost_net:
>>      A and B are two kernel threads (such as vhost_worker),
>>      C is a user-space process (such as qemu), and all
>>      three use the mm of the user process C.
>>      Now, because user process C exits abnormally, the owner of this
>>      mm becomes A. When A calls unuse_mm() and exits, mm->owner
>>      still points to A, which has been released.
>>      When B accesses mm->owner again, A has already been released.


Thank you for taking a look, and I apologize for the disturbance.

>Could you describe how you reproduce this issue?
Sorry, this issue is hard for me to reproduce, but such a corner case can occur.

>I believe vhost process should exit before process C?
Yes, A and B will normally exit before C, because C usually closes the open fd and then exits.
However, if C exits abnormally, for example when it is killed by a fatal signal, A may exit before C has finished exiting.

The current issue flow is as follows:
Process C                 Process A                 Process B
qemu-system-x86_64        kernel: vhost_net         kernel: vhost_net
open /dev/vhost-net
  VHOST_SET_OWNER         create kthread vhost-%d   create kthread vhost-%d
  network init            use_mm()                  use_mm()
  ...                     ...
  Abnormal exit
  ...
  do_exit()
  exit_mm()
  update mm->owner to A
  exit_files()
   close_files()
   kthread_should_stop()  unuse_mm()
    Stop Process A        tsk->mm = NULL
                          do_exit()
                          can't update owner
                          A exit completed          vhost-%d receives first packet
                                                    vhost-%d builds rcv buffer for vq
                                                    page fault
                                                    access mm & mm->owner
                                                    NOW, mm->owner still points to A
                                                    kernel NULL pointer dereference in mem_cgroup_from_task()
    Stop Process B
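
The flow above can be replayed in a small userspace model. This is only a sketch of the pointer lifetime problem, not kernel code: struct task, struct mm, and every model_* function are hypothetical stand-ins for task_struct, mm_struct, exit_mm(), unuse_mm() and do_exit(), and no locking or RCU is modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct mm;

/* Hypothetical userspace stand-in for task_struct. */
struct task {
	const char *name;
	bool released;		/* true once the task has finished exiting */
	struct mm *mm;		/* NULL after unuse_mm() */
};

/* Hypothetical userspace stand-in for mm_struct. */
struct mm {
	struct task *owner;	/* models mm->owner */
};

/* Models exit_mm() handing ownership to another user of the mm. */
static void model_update_owner(struct mm *mm, struct task *next)
{
	mm->owner = next;
}

/*
 * Models unuse_mm() followed by do_exit() in a kernel thread: tsk->mm is
 * already NULL when the exit path runs, so nothing updates mm->owner.
 */
static void model_kthread_exit(struct task *tsk)
{
	tsk->mm = NULL;		/* unuse_mm() */
	tsk->released = true;	/* do_exit() completes, owner untouched */
}

/* Returns true if mm->owner refers to a task that has been released. */
static bool model_owner_is_stale(const struct mm *mm)
{
	return mm->owner != NULL && mm->owner->released;
}

/* Replays the flow: C exits, the owner moves to A, A exits, B checks. */
static bool replay_flow(void)
{
	struct mm mm = { NULL };
	struct task c = { "C", false, &mm };
	struct task a = { "A", false, &mm };
	struct task b = { "B", false, &mm };

	mm.owner = &c;
	model_update_owner(&mm, &a);	/* C: exit_mm() picks A */
	c.mm = NULL;
	c.released = true;		/* C has exited */
	model_kthread_exit(&a);		/* A: unuse_mm() + do_exit() */
	assert(b.mm == &mm);		/* B is still using the mm */
	return model_owner_is_stale(&mm);
}
```

In this model replay_flow() returns true: at the point where B would take its page fault, the owner recorded in the mm is A, which has already been released.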

>>
>> Cc: "Michael S. Tsirkin" <mst@xxxxxxxxxx>
>> Cc: Jason Wang <jasowang@xxxxxxxxxx>
>> Cc: kvm@xxxxxxxxxxxxxxx
>> Cc: virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx
>> Cc: netdev@xxxxxxxxxxxxxxx
>> Cc: linux-kernel@xxxxxxxxxxxxxxx
>> Cc: "Eric W. Biederman" <ebiederm@xxxxxxxxxxxx>
>> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
>> Cc: Sudip Mukherjee <sudipm.mukherjee@xxxxxxxxx>
>> Cc: "Luis R. Rodriguez" <mcgrof@xxxxxxxxxx>
>> Cc: Dominik Brodowski <linux@xxxxxxxxxxxxxxxxxxxx>
>> Signed-off-by: guomin chen <gchen.guomin@xxxxxxxxx>
>> ---
>>   drivers/vhost/vhost.c | 1 +
>>   kernel/exit.c         | 1 +
>>   2 files changed, 2 insertions(+)
>>
>> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
>> index 6b98d8e..7c09087 100644
>> --- a/drivers/vhost/vhost.c
>> +++ b/drivers/vhost/vhost.c
>> @@ -368,6 +368,7 @@ static int vhost_worker(void *data)
>>   		}
>>   	}
>>   	unuse_mm(dev->mm);
>> +	mm_update_next_owner(dev->mm);


>If your analysis is correct, this is still racy, isn't it? (E.g. a page fault
>happens between unuse_mm() and mm_update_next_owner()).

No, I think this is not racy.
When a page fault happens between unuse_mm() and mm_update_next_owner(), tsk->mm is already NULL,
but tsk has not exited yet, so mm->owner, which still points to tsk, can still be accessed safely.
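
The argument can be checked with the same kind of userspace sketch. Again, struct task, struct mm, and the model_* helpers are hypothetical stand-ins, the next-owner selection is a simplification made up for illustration, and only the ordering of the steps is modelled, not the kernel's locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct mm;

/* Hypothetical userspace stand-ins for task_struct and mm_struct. */
struct task { bool released; struct mm *mm; };
struct mm { struct task *owner; };

/* Returns true if mm->owner refers to a task that has been released. */
static bool owner_is_stale(const struct mm *mm)
{
	return mm->owner != NULL && mm->owner->released;
}

/*
 * Simplified model of choosing the next owner: the first live candidate
 * that is still attached to this mm, or NULL if there is none.
 */
static void model_update_next_owner(struct mm *mm, struct task **cands,
				    size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (cands[i]->mm == mm && !cands[i]->released) {
			mm->owner = cands[i];
			return;
		}
	}
	mm->owner = NULL;
}

struct window_result {
	bool stale_in_window;	/* between unuse_mm() and the owner update */
	bool stale_after_exit;	/* after A has finished exiting */
};

/* Replays A's exit path with the extra owner update from the patch. */
static struct window_result replay_patched_flow(void)
{
	struct mm mm = { NULL };
	struct task a = { false, &mm };
	struct task b = { false, &mm };
	struct task *cands[] = { &a, &b };
	struct window_result r;

	mm.owner = &a;				/* state after C's exit_mm() */
	a.mm = NULL;				/* A: unuse_mm() */
	r.stale_in_window = owner_is_stale(&mm);	/* A is still alive */
	model_update_next_owner(&mm, cands, 2);	/* step added by the patch */
	assert(mm.owner == &b);			/* ownership moved to B */
	a.released = true;			/* A: do_exit() completes */
	r.stale_after_exit = owner_is_stale(&mm);
	return r;
}
```

Both fields of the result are false in this model: inside the window the owner is A, which has not been released yet, and after A exits the owner is B.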

Thanks and regards




