Michael S. Tsirkin wrote: > On Mon, Jul 06, 2009 at 10:56:02AM -0400, Gregory Haskins wrote: > >> Avi Kivity wrote: >> >>> On 07/02/2009 06:50 PM, Avi Kivity wrote: >>> >>>> On 07/02/2009 06:37 PM, Gregory Haskins wrote: >>>> >>>>> (Applies to kvm.git/master:1f9050fd) >>>>> >>>>> The following is the latest attempt to fix the races in >>>>> irqfd/eventfd, as >>>>> well as restore DEASSIGN support. For more details, please read the >>>>> patch >>>>> headers. >>>>> >>>>> As always, this series has been tested against the kvm-eventfd unit >>>>> test >>>>> and everything appears to be functioning properly. You can download >>>>> this >>>>> test here: >>>>> >>>> Applied, thanks. >>>> >>>> >>> ... and unapplied. There's a refcounting mismatch in irqfd_cleanup: a >>> reference is taken for each irqfd, but dropped for each guest. This >>> causes an oops if a guest with no irqfds is created and destroyed: >>> >> I was able to reproduce this issue. The problem turned out to be that I >> inadvertently always did a flush_workqueue(), even if the work-queue was >> never initialized. >> >> The following interdiff applied to the reverted patch has been confirmed >> to fix the issue: >> > > Could you document the init boolean and its locking rules? > The best place to put it would be where the field is declared btw. > Will do > Is it true that init === list_empty(&kvm->irqfds.items)? > If yes maybe we don't need this field at all. > > No, because its more difficult to maintain the work-queue when referenced against active irqfds (*). So instead, its maintained against guests that use irqfd, whether they have an active irqfd or not. Otherwise you have to contend with the eventfd-side release, which is a little tricky. (*) I'm sure its not rocket science to get this working, but it was getting more complex than I thought it was worth, so I simplified the model to be per-vm. Note that this design decision/limitation is declared in the patch header. > >> ------------------- >> >> diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c >> index fcc3469..52b0e04 100644 >> --- a/virt/kvm/eventfd.c >> +++ b/virt/kvm/eventfd.c >> @@ -318,6 +318,9 @@ kvm_irqfd_deassign(struct kvm *kvm, int fd, int gsi) >> struct _irqfd *irqfd, *tmp; >> struct eventfd_ctx *eventfd; >> >> + if (!kvm->irqfds.init) >> + return -ENOENT; >> + >> eventfd = eventfd_ctx_fdget(fd); >> if (IS_ERR(eventfd)) >> return PTR_ERR(eventfd); >> > > wouldn't it be cleaner to error out in the for each loop if we don't > find an entry to deactivate? Might be helpful for apps to get an error > if they didn't deassign anything. > Again, irqfds.init is somewhat orthogonal to whether the list is populated or not. This check is for sanity (how can you deassign if you didnt assign, etc). Normally this would be a simple BUG_ON() sanity check, but I don't want a malicious/broken userspace to gain an easy attack vector ;) > >> @@ -360,6 +363,9 @@ kvm_irqfd_release(struct kvm *kvm) >> { >> struct _irqfd *irqfd, *tmp; >> >> + if (!kvm->irqfds.init) >> + return; >> + >> > > So here, I recall some old comment that flush below was > needed even if list is empty. Is this no longer true? > If you are using irqfd, its true. If irqfds.init == false, you are not using irqfd and thus the flush cannot be needed. > If not it might be cleaner to only flush if list is not empty. > > You have to flush if irqfds.init == true even if the list is empty because you need to be sure that eventfd-side releases complete. They may have already removed themselves from the list, but the work-item is still in flight. Regards, -Greg
Attachment:
signature.asc
Description: OpenPGP digital signature