On Mon, Mar 30, 2020 at 02:45:54PM +0200, Vitaly Kuznetsov wrote:
> Andrea Parri <parri.andrea@xxxxxxxxx> writes:
>
> >> Correct me if I'm wrong, but currently vmbus_chan_sched() accesses
> >> per-cpu list of channels on the same CPU so we don't need a spinlock to
> >> guarantee that during an interrupt we'll be able to see the update if it
> >> happened before the interrupt (in chronological order). With a global
> >> list of relids, who guarantees that an interrupt handler on another CPU
> >> will actually see the modified list?
> >
> > Thanks for pointing this out!
> >
> > The offer/resume path presents implicit full memory barriers,
> > program-order after the array store which should guarantee the
> > visibility of the store to *all* CPUs before the offer/resume can
> > complete (c.f.,
> >
> >   tools/memory-model/Documentation/explanation.txt, Sect. #13
> >
> > and assuming that the offer/resume for a channel must complete before
> > the corresponding handler, which seems to be the case considered that
> > some essential channel fields are initialized only later...)
> >
> > IIUC, the spin lock approach you suggested will work and be "simpler";
> > an obvious side effect would be, well, a global synchronization point
> > in vmbus_chan_sched()...
> >
> > Thoughts?
>
> This is, of course, very theoretical as if we're seeing an interrupt for
> a channel at the same time we're writing its relid we're already in
> trouble. I can, however, try to suggest one tiny improvement:

Indeed.  I think the idea (still quite informal) is that:

  1) the mapping of the channel relid is propagated to (visible from)
     all CPUs before add_channel_work is queued (full barrier in
     queue_work()),

  2) add_channel_work is queued before the channel is opened (aka,
     before the channel ring buffer is allocated/initialized and the
     OPENCHANNEL msg is sent and acked by Hyper-V, cf. OPEN_STATE),

  3) the channel is opened before Hyper-V can start sending interrupts
     for the channel, and hence before vmbus_chan_sched() can find the
     channel relid set in recv_int_page,

  4) vmbus_chan_sched() finds the channel's relid set in recv_int_page
     before it searches/loads from the channel array (full barrier in
     sync_test_and_clear_bit()).

This is for the "normal"/not resuming from hibernation case; for the
latter, notice that:

  a) vmbus_isr() (and vmbus_chan_sched()) cannot run until
     vmbus_bus_resume() has finished (@resume_noirq callback),

  b) vmbus_bus_resume() cannot complete before nr_chan_fixup_on_resume
     equals 0 in check_ready_for_resume_event()

(and check_ready_for_resume_event() also provides a full barrier).

If this makes sense to you, I'll try to add some of the above in
comments.

Thanks,
  Andrea

>
> vmbus_chan_sched() now clean the bit in the event page and then searches
> for a channel with this relid; in case we allow the search to
> (temporary) fail we can reverse the logic: search for the channel and
> clean the bit only if we succeed. In case we fail, next time (next IRQ)
> we'll try again and likely succeed. The only purpose is to make sure no
> interrupts are ever lost. This may be an overkill, we may want to try
> to count how many times (if ever) this happens.
>
> Just a thought though.
>
> --
> Vitaly
>
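
For illustration only, the reversed logic suggested in the quoted text
above (search for the channel first, clear the bit only on success) could
be modelled in user space roughly as below. This is a sketch with C11
atomics, not the vmbus code: the names channels, map_relid,
raise_interrupt, chan_sched and is_pending, the struct layout and the
MAX_RELIDS bound are all hypothetical, and the sequentially consistent
atomics only stand in for the full barriers discussed in the thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_RELIDS 64	/* hypothetical bound, one bit per relid */

struct channel {
	unsigned int relid;
	unsigned int callbacks_run;	/* stands in for the channel callback */
};

/* Hypothetical global relid -> channel map (NULL means "not mapped yet"). */
static _Atomic(struct channel *) channels[MAX_RELIDS];

/* Models the pending bits of recv_int_page. */
static atomic_ullong recv_int_page;

/* Offer/resume side: publish the mapping (seq_cst store). */
static void map_relid(struct channel *ch)
{
	assert(ch->relid < MAX_RELIDS);
	atomic_store(&channels[ch->relid], ch);
}

/* Host side: mark the relid as having pending work. */
static void raise_interrupt(unsigned int relid)
{
	assert(relid < MAX_RELIDS);
	atomic_fetch_or(&recv_int_page, 1ULL << relid);
}

static bool is_pending(unsigned int relid)
{
	assert(relid < MAX_RELIDS);
	return atomic_load(&recv_int_page) & (1ULL << relid);
}

/*
 * Interrupt side: look the channel up first and clear the pending bit
 * only if the lookup succeeds.  A failed lookup leaves the bit set, so
 * a later pass can retry.  Returns the number of channels handled.
 */
static unsigned int chan_sched(void)
{
	unsigned long long pending = atomic_load(&recv_int_page);
	unsigned int handled = 0;

	for (unsigned int relid = 0; relid < MAX_RELIDS; relid++) {
		unsigned long long bit = 1ULL << relid;
		struct channel *ch;

		if (!(pending & bit))
			continue;

		ch = atomic_load(&channels[relid]);
		if (!ch)
			continue;	/* not visible yet: keep the bit set */

		atomic_fetch_and(&recv_int_page, ~bit);
		ch->callbacks_run++;
		handled++;
	}
	return handled;
}
```

In this model a relid whose mapping is not yet visible stays pending and
is picked up by a later chan_sched() pass; a counter incremented on the
failed-lookup branch would be one way to measure how often (if ever)
that happens.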