On 02/11, Jeremy Fitzhardinge wrote:
>
> On 02/11/2015 09:24 AM, Oleg Nesterov wrote:
> > I agree, and I have to admit I am not sure I fully understand why
> > unlock uses the locked add. Except we need a barrier to avoid the race
> > with the enter_slowpath() users, of course. Perhaps this is the only
> > reason?
>
> Right now it needs to be a locked operation to prevent read-reordering.
> x86 memory ordering rules state that all writes are seen in a globally
> consistent order, and are globally ordered wrt reads *on the same
> addresses*, but reads to different addresses can be reordered wrt writes.
>
> So, if the unlocking add were not a locked operation:
>
>	__add(&lock->tickets.head, TICKET_LOCK_INC);	/* not locked */
>
>	if (unlikely(lock->tickets.tail & TICKET_SLOWPATH_FLAG))
>		__ticket_unlock_slowpath(lock, prev);
>
> Then the read of lock->tickets.tail can be reordered before the unlock,
> which introduces a race:

Yes, yes, thanks, but this is what I meant. We need a barrier. Even if
"Every store is a release" as Linus mentioned.

> This *might* be OK, but I think it's on dubious ground:
>
>	__add(&lock->tickets.head, TICKET_LOCK_INC);	/* not locked */
>
>	/* read overlaps write, and so is ordered */
>	if (unlikely(lock->head_tail & (TICKET_SLOWPATH_FLAG << TICKET_SHIFT)))
>		__ticket_unlock_slowpath(lock, prev);
>
> because I think Intel and AMD differed in interpretation about how
> overlapping but different-sized reads & writes are ordered (or it simply
> isn't architecturally defined).

Can't comment, I simply do not know how the hardware works.

> If the slowpath flag is moved to head, then it would always have to be
> locked anyway, because it needs to be atomic against other CPU's RMW
> operations setting the flag.

Yes, this is true.

But again, if we want to avoid the read-after-unlock, we need to update
this lock and read SLOWPATH atomically, so it seems that we can't avoid
the locked insn.

Oleg.