On Thu, Sep 5, 2019 at 3:07 AM Michael S. Tsirkin <mst@xxxxxxxxxx> wrote:
> > The host may write multiple 0 or 1's and the endpoint can only
> > detect one of them in an notif_poll usleep interval.
>
> Right. Notifications weren't designed to be implemented on top of RW
> memory like this: the assumption was all notifications are buffered.

I can implement notification as a counter instead of a pending bit to
simulate a buffer. However, that introduces troublesome cases of its
own, illustrated by the following example. The host sends a
notification about available buffers 0-3. Because notifications are
polled with a delay, the endpoint may already have consumed buffers
0-5 by the time it sees that notification. For some subsequent
notifications, the endpoint will then find no corresponding available
buffers to consume; those useless function calls waste cycles.

> So if you implement modern instead, different queues can use
> different addresses.

I will start working on this after switching the endpoint to vringh.c.

> > The host may write
> > some non-2 value as the endpoint code just finishes detecting the last
> > non-2 value and reverting that value back to 2, effectively nullifying
> > the new non-2 value.
> >
> > The host may decide to write a non-2 value
> > immediately after the endpoint revert that value back to 2 but before
> > the endpoint code finishes the current loop of execution, effectively
> > making the value not reverted back to 2.
> >
> > All these and other problems are made worse by the fact that the PCI
> > host Linux usually runs on much faster cores than the one on PCI
> > endpoint. This is why relying completely on pending bits is not always
> > safe. Hence the "fallback" check using usleep hackery exists.
> > Nevertheless I welcome any suggestion, because I do not like this
> > treatment myself either.
>
> As long as you have a small number of queues, you can poll both
> of them. And to resolve racing with host, re-check
> rings after you write 2 into the selector

I assume your suggestion is based on modern virtio, since vrings in
legacy virtio share a common notification read-write area.

> (btw you also need a bunch of memory barriers, atomics don't
> imply them automatically).

Thank you for the reminder. In this doc,
https://www.kernel.org/doc/html/latest/core-api/atomic_ops.html, it
says "atomic_cmpxchg must provide explicit memory barriers around the
operation, although if the comparison fails then no memory ordering
guarantees are required". My understanding of this sentence is that
the arch-specific implementation of atomic_cmpxchg already surrounds
the operation with barriers in an efficient way, and that the second
part of the sentence implies the doc's intended audience is the
implementer of atomic_cmpxchg. Please correct me if I have
misunderstood the doc.

Thank you for your feedback.

Best,
Haotian
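
P.S. For concreteness, below is a rough sketch of the counter-based
notification polling I had in mind. It is only a sketch: the names
(epf_vq_notify, notify_count, consume_avail_bufs, NOTIF_POLL_US) are
illustrative and not from the actual patch. The idea is that the host
increments a counter for every kick, and the endpoint only calls the
consume path when the counter has moved since its last poll, so a
burst of kicks collapses into one wake-up instead of being lost.

    #include <linux/types.h>
    #include <linux/atomic.h>
    #include <linux/delay.h>

    #define NOTIF_POLL_US	50	/* poll interval, illustrative value */

    static void consume_avail_bufs(void);	/* hypothetical consume path */

    /* Illustrative only: notification area shared with the host. */
    struct epf_vq_notify {
    	atomic_t notify_count;	/* host increments this on every kick */
    };

    /* One iteration of the endpoint poll loop: act only when the
     * counter has moved since the last iteration.
     */
    static void epf_poll_notify(struct epf_vq_notify *n, u32 *last_seen)
    {
    	u32 cur = (u32)atomic_read(&n->notify_count);

    	if (cur != *last_seen) {
    		/* Order the counter read before reading ring contents;
    		 * pairs with the host's write ordering.
    		 */
    		smp_rmb();
    		*last_seen = cur;
    		consume_avail_bufs();
    	}

    	usleep_range(NOTIF_POLL_US, NOTIF_POLL_US * 2);
    }

This does not remove the "no new buffers to consume" case I described
above (the endpoint can still race ahead of a stale kick), but it does
avoid losing notifications when the host kicks several times within
one poll interval.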