On Thu, Jul 25, 2024 at 04:18:43PM +0100, David Woodhouse wrote: > On Thu, 2024-07-25 at 10:11 -0400, Michael S. Tsirkin wrote: > > On Thu, Jul 25, 2024 at 02:50:50PM +0100, David Woodhouse wrote: > > > Even if the virtio-rtc specification were official today, and I was > > > able to expose it via PCI, I probably wouldn't do it that way. There's > > > just far more in virtio-rtc than we need; the simple shared memory > > > region is perfectly sufficient for most needs, and especially ours. > > > > I can't stop amazon from shipping whatever in its hypervisor, > > I'd just like to understand this better, if there is a use-case > > not addressed here then we can change virtio to address it. > > > > The rtc driver patch posted is 900 lines, yours is 700 lines, does not > > look like a big difference. As for using a memory region, this is > > valid, but maybe rtc should be changed to do exactly that? > > I'm certainly aiming for virtio-rtc to include that as an *option*, > because I think I don't think it makes sense for an RTC specification > aimed at virtual machines *not* to deal with the live migration > problem. > > AFAICT the only ways to deal with the LM problem are either to make a > hypercall/virtio transaction for *every* clock read which needs to be > accurate, or expose a memory region for the guest to do it "vDSO- > style". virtio can support the second option, we already have VIRTIO_PCI_CAP_SHARED_MEMORY_CFG, I'd just use it. > And similarly, unless we want guest userspace to have to make a > *system* call every time, that memory region needs to be mappable all > the way to userspace. This part is classic for pci, mapping pci bar has been well studied. > The use case isn't necessarily for all users of gettimeofday(), of > course; this is for those applications which *need* precision time. > Like distributed databases which rely on timestamps for coherency, and > users who get fined millions of dollars when LM messes up their clocks > and they put wrong timestamps on financial transactions. I would however worry that with all this pass through, applications have to be coded to each hypervisor or even version of the hypervisor. I don't really know the use-case well enough - is sending an interrupt to linux and having linux create a device independent structure not workable? > > E.g. we can easily add a capability describing such a region. > > or put it in device config space. > > I think it has to be memory, not config space. But yes. virtio config space, which is just a region in a BAR. But yes, maybe VIRTIO_PCI_CAP_SHARED_MEMORY_CFG is cleaner. > The intent is that my driver would be usable with the shared memory > region from a virtio-rtc device too. It'd need a tiny amount of > refactoring of the discovery code in vmclock_probe(), which I haven't > done yet as it would be premature optimisation. > > > I mean yes, we can build a new transport for each specific need but in > > the end we'll get a ton of interfaces with unclear compatibility > > requirements. If effort is instead spent improving common interfaces, > > we get consistency and everyone benefits. That's why I'm trying to > > understand the need here. > > It's simplicity. Because this isn't even a "transport". It's just a > simple breadcrumb given to the guest to tell it where the information > is. > In the fullness of time assuming this becomes part of virtio-rtc too, > the fact that it can *also* be discovered by ACPI is just a tiny > detail. And it allows hypervisors to implement it a *whole* lot more > simply. > > The addition of an ACPI method to enable the timekeeping does make it a > tiny bit more than a 'breadcrump', I concede — but that's still > basically trivial to implement. A whole lot simpler than a full virtio > device. virtio has been developed with the painful experience that we keep making mistakes, or coming up with new needed features, and that maintaining forward and backward compatibility becomes a whole lot harder than it seems in the beginning. -- MST