On 2012-08-14 17:34, Alex Williamson wrote:
> On Tue, 2012-08-14 at 17:15 +0200, Jan Kiszka wrote:
>> On 2012-08-14 16:31, Alex Williamson wrote:
>>> On Tue, 2012-08-14 at 16:10 +0200, Jan Kiszka wrote:
>>>> On 2012-08-14 16:05, Alex Williamson wrote:
>>>>> On Tue, 2012-08-14 at 15:48 +0200, Jan Kiszka wrote:
>>>>>> Hi Alex,
>>>>>>
>>>>>> you once wrote this comment in device-assignment.c, msix_mmio_write:
>>>>>>
>>>>>>     if (!msix_masked(&orig) && msix_masked(entry)) {
>>>>>>         /*
>>>>>>          * Vector masked, disable it
>>>>>>          *
>>>>>>          * XXX It's not clear if we can or should actually attempt
>>>>>>          * to mask or disable the interrupt.  KVM doesn't have
>>>>>>          * support for pending bits and kvm_assign_set_msix_entry
>>>>>>          * doesn't modify the device hardware mask.  Interrupts
>>>>>>          * while masked are simply not injected to the guest, so
>>>>>>          * are lost.  Can we get away with always injecting an
>>>>>>          * interrupt on unmask?
>>>>>>          */
>>>>>>
>>>>>> I'm wondering what made you think that we won't inject if the vector
>>>>>> is masked like this (i.e. in the shadow MSI-X table). Can you recall
>>>>>> the details?
>>>>>>
>>>>>> I'm trying to refactor this code so that the KVM interface
>>>>>> encapsulates the kernel interface details a bit more, not fixing
>>>>>> anything. Still, I would also like to avoid introducing regressions.
>>>>>
>>>>> Yeah, I didn't leave a very good comment there.  I'm sure it made more
>>>>> sense to me at the time.  I think I was trying to say that not only do
>>>>> we not have a way to mask the physical hardware, but if we did, we
>>>>> don't have a way to retrieve the pending bits, so any pending
>>>>> interrupts while masked would be lost.  We might be able to deal with
>>>>> that by posting a spurious interrupt on unmask, but for now we do
>>>>> nothing, as masking is usually done just to update the vector.
>>>>> Thanks,
>>>>
>>>> Ok, thanks for the clarification.
>>>>
>>>> As we are at it, do you also recall if this
>>>>
>>>> --- a/hw/device-assignment.c
>>>> +++ b/hw/device-assignment.c
>>>> @@ -1573,28 +1573,7 @@ static void msix_mmio_write(void *opaque, target_phys_addr_t addr,
>>>>           */
>>>>      } else if (msix_masked(&orig) && !msix_masked(entry)) {
>>>>          /* Vector unmasked */
>>>> -        if (i >= adev->irq_entries_nr || !adev->entry[i].type) {
>>>> -            /* Previously unassigned vector, start from scratch */
>>>> -            assigned_dev_update_msix(pdev);
>>>> -            return;
>>>> -        } else {
>>>> -            /* Update an existing, previously masked vector */
>>>> -            struct kvm_irq_routing_entry orig = adev->entry[i];
>>>> -            int ret;
>>>> -
>>>> -            adev->entry[i].u.msi.address_lo = entry->addr_lo;
>>>> -            adev->entry[i].u.msi.address_hi = entry->addr_hi;
>>>> -            adev->entry[i].u.msi.data = entry->data;
>>>> -
>>>> -            ret = kvm_update_routing_entry(&orig, &adev->entry[i]);
>>>> -            if (ret) {
>>>> -                fprintf(stderr,
>>>> -                        "Error updating irq routing entry (%d)\n", ret);
>>>> -                return;
>>>> -            }
>>>> -
>>>> -            kvm_irqchip_commit_routes(kvm_state);
>>>> -        }
>>>> +        assigned_dev_update_msix(pdev);
>>>>      }
>>>>  }
>>>> }
>>>>
>>>> would make a relevant difference for known workloads? I'm trying to
>>>> get rid of direct routing table manipulations, but I would also like
>>>> to avoid introducing things like kvm_irqchip_update_msi_route unless
>>>> really necessary. Or could VFIO make use of that as well?
>>>
>>> It makes me a little nervous, but I don't know that it won't work.
>>> There's a lot more latency in turning off MSI-X and completely
>>> rebuilding it than there is in updating the routing of a single vector.
>>> You can imagine that irqbalance could be triggering this path pretty
>>> regularly.  Increasing vectors beyond what was previously set up is
>>> more of an init-time event, so the latency doesn't bother me as much.
>>> We'd probably have to send some spurious interrupts for anything we
>>> might have missed if we take the high-latency path.
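[Editor's aside: the msix_masked() test that gates both transitions in the quoted code checks the per-vector mask bit of the guest-visible (shadow) MSI-X table entry. Per the PCI spec, each MSI-X table entry is 16 bytes and bit 0 of its Vector Control DWORD is that mask bit. A minimal self-contained sketch of the check — illustrative plain C, not the actual QEMU code:]

```c
#include <stdbool.h>
#include <stdint.h>

/* One MSI-X table entry as laid out by the PCI spec (16 bytes). */
struct msix_table_entry {
    uint32_t addr_lo;   /* Message Address, low DWORD */
    uint32_t addr_hi;   /* Message Address, high DWORD */
    uint32_t data;      /* Message Data */
    uint32_t ctrl;      /* Vector Control: bit 0 = per-vector mask */
};

#define MSIX_ENTRY_CTRL_MASKBIT 1u

/* True if the vector is masked in the (shadow) table entry. */
static bool msix_masked(const struct msix_table_entry *entry)
{
    return entry->ctrl & MSIX_ENTRY_CTRL_MASKBIT;
}
```

[The mask→unmask and unmask→mask transitions the thread discusses are then simply `msix_masked(&orig) != msix_masked(entry)` comparisons between the old and newly written entry.]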
>>
>> Yeah, good points.
>>
>>>
>>> VFIO is already a little more abstracted, making use of the msix vector
>>> use and release interface, but we do still make use of the kvm_irqchip
>>> irqfd/virq interfaces.
>>
>> Hmm, but due to the nature of the callbacks, we always disable/re-enable
>> on mask/unmask. So VFIO will be slower than current device assignment
>> in this regard.
>
> It's a bit awkward, I'm not thrilled with those msix callbacks, but they
> seem to work.  I have a similar comment in vfio_msix_vector_release()
> that maybe we should just disable direct injection on mask so that
> qemu-msix can do the masking and fill in the PBA.

That will require an enhancement of the callback mechanism: so far it
does not let us distinguish per-vector masking from general disabling.
When the latter happens, we still want to release resources, I suppose.

With such an enhancement in place, we could even consider keeping the
VIRQ and MSI route active (and providing a route update service) to
avoid the tear-down/recreate overhead on fast mask/unmask cycles, e.g.
for IRQ migration.

>
>> BTW, how do you handle the device's PBA? Pass it through to the guest?
>
> We could, but I'm trying to use the qemu-msix infrastructure, which
> handles the PBA.  We've been working happily w/o good PBA support for so
> long that I haven't bothered to work on a channel to get to the physical
> PBA yet.

I think bouncing should be OK performance-wise - until some strange
guest pops up that actually polls the PBA in high-load scenarios.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux
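[Editor's note, as background on the PBA discussed above: MSI-X defines a Pending Bit Array, a QWORD array in which bit n is set while an interrupt for masked vector n is pending, per the PCI spec. A minimal sketch of the bookkeeping an emulated PBA has to do — illustrative C with hypothetical helper names, not QEMU's actual qemu-msix code:]

```c
#include <stdbool.h>
#include <stdint.h>

/* The PBA is an array of 64-bit words; bit n of the array tracks the
 * pending state of vector n.  Emulation latches the bit when a masked
 * vector fires and clears it when the interrupt is delivered on unmask. */

static void msix_pba_set(uint64_t *pba, unsigned int vector)
{
    pba[vector / 64] |= 1ULL << (vector % 64);
}

static void msix_pba_clear(uint64_t *pba, unsigned int vector)
{
    pba[vector / 64] &= ~(1ULL << (vector % 64));
}

static bool msix_pba_pending(const uint64_t *pba, unsigned int vector)
{
    return pba[vector / 64] & (1ULL << (vector % 64));
}
```

[Bouncing guest PBA reads through such emulated state, rather than mapping the physical PBA, is the design the thread settles on as acceptable.]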