On 6/6/23 2:22 PM, Michael S. Tsirkin wrote: > On Tue, Jun 06, 2023 at 12:19:10PM -0500, Mike Christie wrote: >> On 6/6/23 4:49 AM, Stefano Garzarella wrote: >>> On Mon, Jun 05, 2023 at 01:57:30PM -0500, Mike Christie wrote: >>>> If userspace does VHOST_VSOCK_SET_GUEST_CID before VHOST_SET_OWNER we >>>> can race where: >>>> 1. thread0 calls vhost_transport_send_pkt -> vhost_work_queue >>>> 2. thread1 does VHOST_SET_OWNER which calls vhost_worker_create. >>>> 3. vhost_worker_create will set the dev->worker pointer before setting >>>> the worker->vtsk pointer. >>>> 4. thread0's vhost_work_queue will see the dev->worker pointer is >>>> set and try to call vhost_task_wake using not yet set worker->vtsk >>>> pointer. >>>> 5. We then crash since vtsk is NULL. >>>> >>>> Before commit 6e890c5d5021 ("vhost: use vhost_tasks for worker >>>> threads"), we only had the worker pointer so we could just check it to >>>> see if VHOST_SET_OWNER has been done. After that commit we have the >>>> vhost_worker and vhost_task pointers, so we can now hit the bug above. >>>> >>>> This patch embeds the vhost_worker in the vhost_dev, so we can just >>>> check the worker.vtsk pointer to check if VHOST_SET_OWNER has been done >>>> like before. >>>> >>>> Fixes: 6e890c5d5021 ("vhost: use vhost_tasks for worker threads") >>> >>> We should add: >>> >>> Reported-by: syzbot+d0d442c22fa8db45ff0e@xxxxxxxxxxxxxxxxxxxxxxxxx >> >> >> Ok. Will do. >> >> >>>> - } >>>> + vtsk = vhost_task_create(vhost_worker, &dev->worker, name); >>>> + if (!vtsk) >>>> + return -ENOMEM; >>>> >>>> - worker->vtsk = vtsk; >>>> + dev->worker.kcov_handle = kcov_common_handle(); >>>> + dev->worker.vtsk = vtsk; >>> >>> vhost_work_queue() is called by vhost_transport_send_pkt() without >>> holding vhost_dev.mutex (like vhost_poll_queue() in several places). >>> >>> If vhost_work_queue() finds dev->worker.vtsk not NULL, how can we >>> be sure that for example `work_list` has been initialized? >>> >>> Maybe I'm overthinking since we didn't have this problem before or the >>> race is really short that it never happened. >> >> Yeah, I dropped the READ/WRITE_ONCE use because I didn't think we needed >> it for the vhost_vsock_start case, and for the case you mentioned above, I >> wondered the same thing as you but was not sure so I added the same >> behavior as before. When I read memory-barriers.txt, it sounds like we've >> been getting lucky. > > Yea READ/WRITE_ONCE is one of these things. Once you start adding > them it's hard to stop, you begin to think it's needed everywhere. > To actually know you need a language lawyer (READ/WRITE_ONCE > is a compiler thing not a CPU thing). I am worried about the compiler reordering when init_llist_head sets the llist->first pointer to NULL and when we set the vtsk pointer. If they are in reverse order, then vhost_work_queue could detect there is a vtsk, but then after we add work to the work_list we could set llist->first to NULL. However, to avoid this type of thing we need to have init_llist_head use WRITE_ONCE right as well as vhost_worker_create when it sets vtsk in the patch. One other thing is that we could just move the init_llist_head to vhost_dev_init like it was before. We should be safe like before. _______________________________________________ Virtualization mailing list Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx https://lists.linuxfoundation.org/mailman/listinfo/virtualization