On Mon, Jul 10, 2023 at 11:02:33PM +0800, Wen Yang wrote:
>
> On 2023/7/10 22:12, Christian Brauner wrote:
> > On Sun, Jul 09, 2023 at 02:54:51PM +0800, wenyang.linux@xxxxxxxxxxx wrote:
> > > From: Wen Yang <wenyang.linux@xxxxxxxxxxx>
> > >
> > > For eventfd with flag EFD_SEMAPHORE, when its ctx->count is 0, calling
> > > eventfd_ctx_do_read will cause ctx->count to overflow to ULLONG_MAX.
> > >
> > > Fixes: cb289d6244a3 ("eventfd - allow atomic read and waitqueue remove")
> > > Signed-off-by: Wen Yang <wenyang.linux@xxxxxxxxxxx>
> > > Cc: Alexander Viro <viro@xxxxxxxxxxxxxxxxxx>
> > > Cc: Jens Axboe <axboe@xxxxxxxxx>
> > > Cc: Christian Brauner <brauner@xxxxxxxxxx>
> > > Cc: Christoph Hellwig <hch@xxxxxx>
> > > Cc: Dylan Yudaken <dylany@xxxxxx>
> > > Cc: David Woodhouse <dwmw@xxxxxxxxxxxx>
> > > Cc: Matthew Wilcox <willy@xxxxxxxxxxxxx>
> > > Cc: linux-fsdevel@xxxxxxxxxxxxxxx
> > > Cc: linux-kernel@xxxxxxxxxxxxxxx
> > > ---
> > So this looks ok but I would like to see an analysis how the overflow
> > can happen. I'm looking at the callers and it seems that once ctx->count
> > hits 0 eventfd_read() won't call eventfd_ctx_do_read() anymore. So is
> > there a caller that can call directly or indirectly
> > eventfd_ctx_do_read() on a ctx->count == 0?
> eventfd_read() ensures that ctx->count is not 0 before calling
> eventfd_ctx_do_read() and it is correct.
>
> But it is not appropriate for eventfd_ctx_remove_wait_queue() to call
> eventfd_ctx_do_read() unconditionally,
>
> as it may not only causes ctx->count to overflow, but also unnecessarily
> calls wake_up_locked_poll().

Hm, so I think you're right and an underflow can be triggered for at
least three subsystems:

(1) virt/kvm/eventfd.c
(2) drivers/vfio/virqfd.c
(3) drivers/virt/acrn/irqfd.c

where (2) and (3) are just modeled after (1). The eventfd must've been
set to EFD_SEMAPHORE and ctx->count must be zero or have been
decremented to zero.

The only way I can see the _underflow_ happening is if the irqfd is shut
down through an ioctl() like KVM_IRQFD with KVM_IRQFD_FLAG_DEASSIGN
raised while ctx->count is zero:

kvm_vm_ioctl()
-> kvm_irqfd()
   -> kvm_irqfd_deassign()
      -> irqfd_deactivate()
         -> irqfd_shutdown()
            -> eventfd_ctx_remove_wait_queue(&cnt)

which would underflow @cnt and cause a spurious wakeup. Userspace would
still read one because of EFD_SEMAPHORE semantics and wouldn't notice
the underflow.

I think it's probably not that bad because afaict, this really can only
happen when (1)-(3) are shut down. But we should still fix it ofc.

>
>
> I am sorry for just adding the following string in the patch:
> Fixes: cb289d6244a3 ("eventfd - allow atomic read and waitqueue remove")
>
>
> Looking forward to your suggestions.

What I usually look for is some callchain/analysis that explains under
what circumstances the bug being fixed can happen. That makes life a lot
easier for reviewers because they don't have to dig out that work
themselves, which takes time.
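
To make the arithmetic concrete, here is a small userspace model of the
read step discussed above. It is only a sketch and not the kernel code:
struct model_ctx, MODEL_EFD_SEMAPHORE and both function names are
hypothetical stand-ins, and the guarded variant is just one possible way
to avoid the wraparound, not necessarily what the final patch does.

```c
#include <stdint.h>

/* Hypothetical stand-in for EFD_SEMAPHORE; the value is arbitrary here. */
#define MODEL_EFD_SEMAPHORE 1u

/* Hypothetical stand-in for the relevant fields of struct eventfd_ctx. */
struct model_ctx {
	uint64_t count;     /* models ctx->count */
	unsigned int flags; /* models ctx->flags */
};

/*
 * Unguarded read: in semaphore mode it always takes 1, even when the
 * counter is already 0, so the unsigned subtraction wraps around to
 * UINT64_MAX (the "overflow to ULLONG_MAX" from the patch description).
 */
void model_do_read_unguarded(struct model_ctx *ctx, uint64_t *cnt)
{
	*cnt = (ctx->flags & MODEL_EFD_SEMAPHORE) ? 1 : ctx->count;
	ctx->count -= *cnt;
}

/*
 * Guarded read (assumed fix): in semaphore mode take 1 only if the
 * counter is nonzero; otherwise take ctx->count, which is 0, so the
 * counter is left untouched.
 */
void model_do_read_guarded(struct model_ctx *ctx, uint64_t *cnt)
{
	*cnt = ((ctx->flags & MODEL_EFD_SEMAPHORE) && ctx->count)
		? 1 : ctx->count;
	ctx->count -= *cnt;
}
```

Calling model_do_read_unguarded() on a context with count == 0 and
MODEL_EFD_SEMAPHORE set hands back 1 and leaves count at UINT64_MAX,
which corresponds to the shutdown path above. The guarded variant hands
back 0 and leaves count at 0, and behaves the same as the unguarded one
whenever count is nonzero.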