On Thu, 2024-07-11 at 09:25 +0200, Peter Hilber wrote:
>
> IMHO this phrasing is better, since it directly refers to the state of the
> structure.

Thanks. I'll update it.

> AFAIU if there would be abnormal delays in store buffers, causing some
> driver to still see the old clock for some time, the monotonicity could be
> violated:
>
> 1. device writes new, much slower clock to store buffer
> 2. some time passes
> 3. driver reads old, much faster clock
> 4. device writes store buffer to cache
> 5. driver reads new, much slower clock
>
> But I hope such delays do not occur.

For the case of the hypervisor←→guest interface this should be handled
by the use of memory barriers and the seqcount lock. The guest driver
reads the seqcount, performs a read memory barrier, then reads the
contents of the structure. Then performs *another* read memory barrier,
and checks the seqcount hasn't changed:

https://git.infradead.org/?p=users/dwmw2/linux.git;a=blob;f=drivers/ptp/ptp_vmclock.c;hb=vmclock#l351

The converse happens with write barriers on the hypervisor side:

https://git.infradead.org/?p=users/dwmw2/qemu.git;a=blob;f=hw/acpi/vmclock.c;hb=vmclock#l68

Do we need to think harder about the ordering across a real PCI bus? It
isn't entirely unreasonable for this to be implemented in hardware if
we eventually add a counter_id value for a bus-visible counter like the
Intel Always Running Timer (ART).

I'm also OK with saying that device implementations may only provide
the shared memory structure if they can ensure memory ordering.
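
To make the ordering concrete, here's a rough sketch of both sides in
plain C. This is not the actual ptp_vmclock.c / vmclock.c code; the
structure layout, field names and barrier macros below are illustrative
stand-ins (the kernel side uses virt_rmb()/virt_wmb() and the real ABI
layout):

  #include <stdint.h>

  /* Illustrative shared-memory layout; an odd seq_count means an
   * update is in progress. Not the real vmclock ABI. */
  struct vmclock_shm {
          uint32_t seq_count;
          uint64_t counter_value;
          uint64_t time_sec;
          uint32_t time_frac;
          /* ... */
  };

  /* Stand-ins for the read/write memory barriers. */
  #define read_barrier()  __atomic_thread_fence(__ATOMIC_ACQUIRE)
  #define write_barrier() __atomic_thread_fence(__ATOMIC_RELEASE)

  /* Guest side: take a consistent snapshot of the structure. */
  static void vmclock_read_snapshot(const volatile struct vmclock_shm *shm,
                                    struct vmclock_shm *snap)
  {
          uint32_t seq;

          do {
                  /* Wait out any in-progress update (odd seq_count). */
                  do {
                          seq = shm->seq_count;
                  } while (seq & 1);

                  read_barrier();   /* seq read before data reads */

                  snap->counter_value = shm->counter_value;
                  snap->time_sec = shm->time_sec;
                  snap->time_frac = shm->time_frac;

                  read_barrier();   /* data reads before the re-check */
          } while (shm->seq_count != seq);  /* retry if it changed */
  }

  /* Hypervisor/device side: the converse, with write barriers. */
  static void vmclock_update(volatile struct vmclock_shm *shm,
                             const struct vmclock_shm *new_val)
  {
          shm->seq_count++;        /* now odd: update in progress */
          write_barrier();         /* seq bump before data writes */

          shm->counter_value = new_val->counter_value;
          shm->time_sec = new_val->time_sec;
          shm->time_frac = new_val->time_frac;

          write_barrier();         /* data writes before final bump */
          shm->seq_count++;        /* even again: update complete */
  }

The point of the odd/even seq_count is that the guest either sees a
fully consistent snapshot or notices the count changed and retries; the
barriers keep the data accesses from being reordered around the
seq_count accesses.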