On Mon, May 24, 2010 at 09:37:05AM +0300, Avi Kivity wrote:
> On 05/23/2010 07:30 PM, Michael S. Tsirkin wrote:
>>>> Maybe we should use atomics on index then?
>>>>
>>> This should only be helpful if you access the cacheline several times in
>>> a row.  That's not the case in virtio (or here).
>>>
>> So why does it help?
>>
>
> We actually do access the cacheline several times in a row here (but not
> in virtio?):
>
>> 	case SHARE:
>> 		while (count < MAX_BOUNCES) {
>> 			/* Spin waiting for other side to change it. */
>> 			while (counter->cacheline1 != count);
>>
>
> Broadcast a read request.
>
>> 			count++;
>> 			counter->cacheline1 = count;
>>
>
> Broadcast an invalidate request.
>
>> 			count++;
>> 		}
>> 		break;
>>
>> 	case LOCKSHARE:
>> 		while (count < MAX_BOUNCES) {
>> 			/* Spin waiting for other side to change it. */
>> 			while (__sync_val_compare_and_swap(&counter->cacheline1, count, count+1)
>> 				!= count);
>>
>
> Broadcast a 'read for ownership' request.
>
>> 			count += 2;
>> 		}
>> 		break;
>>
>
> So RMW should certainly be faster using single-instruction RMW
> operations (or using prefetchw).

Okay, but why is lockunshare faster than unshare?

> --
> Do not meddle in the internals of kernels, for they are subtle and quick to panic.
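
For anyone who wants to poke at the two quoted loops in isolation, here is a
rough standalone sketch of how they might be driven from two threads. Only the
loop bodies come from the code quoted above; the harness around them
(struct worker, run_bounce(), the max_bounces parameter, the 64-byte alignment)
is made up for illustration and is not the benchmark under discussion. The
plain load and store in the SHARE case use GCC's __atomic builtins so the
compiler cannot cache the value, which is also an assumption about how the
original declares the field. The unshare/lockunshare cases are not quoted in
this thread, so they are left out.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

enum mode { SHARE, LOCKSHARE };

struct counter {
	/* Assumed 64-byte cachelines: give the word a line of its own. */
	unsigned int cacheline1 __attribute__((aligned(64)));
};

/* Hypothetical per-thread state; not part of the quoted code. */
struct worker {
	struct counter *counter;
	enum mode mode;
	unsigned int count;		/* 0 on one side, 1 on the other */
	unsigned int max_bounces;
};

static void *bounce(void *arg)
{
	struct worker *w = arg;
	struct counter *counter = w->counter;
	unsigned int count = w->count;

	switch (w->mode) {
	case SHARE:
		while (count < w->max_bounces) {
			/* Spin waiting for other side to change it (plain read). */
			while (__atomic_load_n(&counter->cacheline1,
					       __ATOMIC_ACQUIRE) != count)
				;
			count++;
			/* Separate plain write to the same line. */
			__atomic_store_n(&counter->cacheline1, count,
					 __ATOMIC_RELEASE);
			count++;
		}
		break;
	case LOCKSHARE:
		while (count < w->max_bounces) {
			/* Wait and update in a single locked RMW. */
			while (__sync_val_compare_and_swap(&counter->cacheline1,
							   count, count + 1)
			       != count)
				;
			count += 2;
		}
		break;
	}
	return NULL;
}

/* Run both sides to completion and return the final counter value. */
static unsigned int run_bounce(enum mode mode, unsigned int max_bounces)
{
	struct counter counter = { 0 };
	struct worker w[2] = {
		{ &counter, mode, 0, max_bounces },
		{ &counter, mode, 1, max_bounces },
	};
	pthread_t t[2];
	int i, err;

	for (i = 0; i < 2; i++) {
		err = pthread_create(&t[i], NULL, bounce, &w[i]);
		assert(err == 0);	/* sketch only: no real error handling */
	}
	for (i = 0; i < 2; i++)
		pthread_join(t[i], NULL);
	return counter.cacheline1;
}
```

Calling run_bounce(SHARE, n) and run_bounce(LOCKSHARE, n) under a timer (and
with the two threads pinned to different CPUs, which this sketch does not do)
should be enough to compare the two variants. In both cases the counter ends up
equal to n, since the sides alternate incrementing it by one.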