On Wed, Jul 18, 2012 at 02:08:43PM +0300, Michael S. Tsirkin wrote:
> On Wed, Jul 18, 2012 at 01:53:15PM +0300, Gleb Natapov wrote:
> > On Wed, Jul 18, 2012 at 01:51:05PM +0300, Michael S. Tsirkin wrote:
> > > On Wed, Jul 18, 2012 at 01:36:08PM +0300, Gleb Natapov wrote:
> > > > On Wed, Jul 18, 2012 at 01:33:35PM +0300, Michael S. Tsirkin wrote:
> > > > > On Wed, Jul 18, 2012 at 01:27:39PM +0300, Gleb Natapov wrote:
> > > > > > On Wed, Jul 18, 2012 at 01:20:29PM +0300, Michael S. Tsirkin wrote:
> > > > > > > On Wed, Jul 18, 2012 at 09:27:42AM +0300, Gleb Natapov wrote:
> > > > > > > > On Tue, Jul 17, 2012 at 07:14:52PM +0300, Michael S. Tsirkin wrote:
> > > > > > > > > > _Seems_ racy, or _is_ racy? Please identify the race.
> > > > > > > > >
> > > > > > > > > Look at this:
> > > > > > > > >
> > > > > > > > > static inline int kvm_irq_line_state(unsigned long *irq_state,
> > > > > > > > >                                      int irq_source_id, int level)
> > > > > > > > > {
> > > > > > > > >         /* Logical OR for level trig interrupt */
> > > > > > > > >         if (level)
> > > > > > > > >                 set_bit(irq_source_id, irq_state);
> > > > > > > > >         else
> > > > > > > > >                 clear_bit(irq_source_id, irq_state);
> > > > > > > > >
> > > > > > > > >         return !!(*irq_state);
> > > > > > > > > }
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Now:
> > > > > > > > > If other CPU changes some other bit after the atomic change,
> > > > > > > > > it looks like !!(*irq_state) might return a stale value.
> > > > > > > > >
> > > > > > > > > CPU 0 clears bit 0. CPU 1 sets bit 1. CPU 1 sets level to 1.
> > > > > > > > > If CPU 0 sees a stale value now it will return 0 here
> > > > > > > > > and interrupt will get cleared.
> > > > > > > > >
> > > > > > > > This will hardly happen on x86 especially since bit is set with
> > > > > > > > serialized instruction.
> > > > > > >
> > > > > > > Probably. But it does make me a bit uneasy.
> > > > > > > Why don't we pass
> > > > > > > irq_source_id to kvm_pic_set_irq/kvm_ioapic_set_irq, and move
> > > > > > > kvm_irq_line_state to under pic_lock/ioapic_lock? We can then use
> > > > > > > __set_bit/__clear_bit in kvm_irq_line_state, making the ordering simpler
> > > > > > > and saving an atomic op in the process.
> > > > > > >
> > > > > > With my patch I do not see why we can't change them to unlocked variant
> > > > > > without moving them anywhere. The only requirement is to not use RMW
> > > > > > sequence to set/clear bits. The ordering of setting does not matter. The
> > > > > > ordering of reading is.
> > > > >
> > > > > You want to use __set_bit/__clear_bit on the same word
> > > > > from multiple CPUs, without locking?
> > > > > Why won't this lose information?
> > > > Because it is not RMW. If it is then yes, you can't do that.
> > >
> > > You are saying __set_bit does not do RMW on x86? Interesting.
> > I think it doesn't.
>
> Anywhere I can read about this?
>
Well actually SDM says LOCK prefix is needed, so yes we cannot use
__set_bit/__clear_bit without moving it under lock.

> > > It's probably not a good idea to rely on this I think.
> > >
> > The code is not in arch/x86 so probably no. Although it is used only on
> > x86 (and ia64 which has broken kvm anyway).
>
> Yes but exactly the reverse is documented.
>
> /**
>  * __set_bit - Set a bit in memory
>  * @nr: the bit to set
>  * @addr: the address to start counting from
>  *
>  * Unlike set_bit(), this function is non-atomic and may be reordered.
>
>
> >>>> pls note the below
>
>  * If it's called on the same region of memory simultaneously, the effect
>  * may be that only one operation succeeds.
>
> >>>> until here
>
>  */
> static inline void __set_bit(int nr, volatile unsigned long *addr)
> {
>         asm volatile("bts %1,%0" : ADDR : "Ir" (nr) : "memory");
> }
>
> > > > >
> > > > > In any case, it seems simpler and safer to do accesses under lock
> > > > > than rely on specific use.
> > > > >
> > > > > > --
> > > > > > Gleb.
> > > >
> > > > --
> > > > Gleb.
> >
> > --
> > Gleb.

--
			Gleb.
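
For illustration only (this is not code from the thread or from the kernel): a minimal user-space C sketch of why a non-atomic read-modify-write on a shared word can lose information, which is the concern raised above about using __set_bit/__clear_bit without a lock. The function names replay_nonatomic_set_bits() and replay_atomic_set_bits() are hypothetical. The "CPUs" are replayed as a fixed interleaving in a single thread, so the outcome is deterministic rather than a real race, and C11 atomic_fetch_or() stands in for a LOCK-prefixed instruction.

```c
#include <stdatomic.h>

/*
 * Hypothetical replay of two CPUs setting different bits in the same word
 * with a non-atomic read-modify-write. Each "CPU" is split into explicit
 * load / modify / store steps so the bad interleaving can be shown
 * deterministically in one thread.
 */
static unsigned long replay_nonatomic_set_bits(void)
{
	unsigned long word = 0;

	unsigned long cpu0 = word;	/* CPU 0 loads the word (0) */
	unsigned long cpu1 = word;	/* CPU 1 loads the same old value (0) */

	cpu0 |= 1UL << 0;		/* CPU 0 sets bit 0 in its private copy */
	cpu1 |= 1UL << 1;		/* CPU 1 sets bit 1 in its private copy */

	word = cpu0;			/* CPU 0 stores 0x1 */
	word = cpu1;			/* CPU 1 stores 0x2, overwriting bit 0 */

	return word;			/* bit 0 has been lost */
}

/*
 * Same two updates done as atomic read-modify-write operations. No
 * interleaving can separate a load from its store, so both bits survive.
 */
static unsigned long replay_atomic_set_bits(void)
{
	atomic_ulong word = 0;

	atomic_fetch_or(&word, 1UL << 0);	/* "CPU 0" sets bit 0 */
	atomic_fetch_or(&word, 1UL << 1);	/* "CPU 1" sets bit 1 */

	return atomic_load(&word);
}
```

With this interleaving the non-atomic replay ends with only bit 1 set, while the atomic replay ends with both bits set. Doing the non-atomic update under a lock shared by both CPUs would rule out the bad interleaving as well, at the cost of taking the lock.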