On 23/07/21 11:48, Hillf Danton wrote:
On Fri, 23 Jul 2021 09:59:55 +0200 Paolo Bonzini wrote:
First and foremost, I'm not sure what you are trying to fix.
The change proposed builds the return value without assuming that the
event count is stable across poll_wait(). If it is unstable then we know
there are concurrent reader and/or writer who both are ingnored currently.
Concurrent reads or writes have their own wake_up_locked_poll calls.
Why are they not enough?
Second, the patch is wrong even without taking into account the lockless
accesses, because the condition for returning EPOLLOUT is certainly wrong.
Given it is detected that event count was consumed, there is room, though
as racy as it is, in the event count for writer to make some progress.
For one, you do not return EPOLLIN|EPOLLOUT correctly.
Third, barriers very rarely speak for themselves. In particular what
do they pair with?
There is no need to consider pair frankly. Barriers are just readded for
removing the seep in the comment. Then the comment goes with the seep.
Then you would need an smp_mb() to order a spin_unlock() against a
READ_ONCE(). But adding memory barriers randomly is the worst way to
write a lockless algorithm that you can explain to others, and "there is
no need to consider pair frankly" all but convinces me that you've put
enough thought in the change you're proposing.
(Shameless plug: https://lwn.net/Articles/844224/).
What the comment does not cover is the cases of more-than-two-party race.
But, you still haven't explained what's the bug there.
Paolo