On Tue, 4 Apr 2023 18:00:38 +0200 Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:

> On Tue, Apr 04, 2023 at 05:12:17PM +0200, Peter Zijlstra wrote:
>
> > > case 2:
> > > CPU-A                                   CPU-B
> > >
> > > modify pagetables
> > > tlb_flush (memory barrier)
> > >                                         state == CONTEXT_USER
> > > int state = atomic_read(&ct->state);
> > >                                         Kernel-enter:
> > >                                         state == CONTEXT_KERNEL
> > >                                         READ(pagetable values)
> > > if ((state & CT_STATE_MASK) == CONTEXT_USER)
>
> Hmm, hold up; what about memory ordering? We need store-load ordering
> between the page-table write and the context-tracking load, and
> store-load ordering between the context-tracking update and the
> software page-table walker's loads.
>
> Now, IIRC page-table modification is done under pte_lock (or
> page_table_lock), and that only provides a RELEASE barrier on this end,
> which is insufficient to order against a later load.
>
> Is there anything else?

Hmm, Yair, could you specify what you meant by the memory barrier after
tlb_flush in the diagram above?

> On the state tracking side, we have ct_state_inc(), which is
> atomic_add_return(); that provides a full barrier and is sufficient.
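For reference, the two sides of the diagram can be modeled in plain C11
atomics. This is only a minimal single-threaded sketch: the
CT_STATE_MASK / CONTEXT_* values and the needs_flush_ipi() helper are
made up for illustration and are not the kernel's actual
context-tracking definitions. A seq_cst RMW stands in for
atomic_add_return()'s full barrier.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Illustrative values only; the kernel's real definitions differ. */
#define CT_STATE_MASK  0x3
#define CONTEXT_KERNEL 0x0
#define CONTEXT_USER   0x1

static atomic_int ct_state = CONTEXT_USER;

/*
 * CPU-A side of the diagram: after modifying the page tables and
 * issuing the flush, read the remote CPU's context-tracking state.
 * If that CPU was in user mode, its TLB will be flushed on kernel
 * entry anyway, so the IPI can be skipped.  Note the parentheses:
 * == binds tighter than & in C.
 */
static bool needs_flush_ipi(void)
{
    int state = atomic_load(&ct_state);             /* seq_cst load */
    return (state & CT_STATE_MASK) != CONTEXT_USER;
}

/*
 * CPU-B side: kernel entry updates the state with a seq_cst RMW, the
 * analogue of ct_state_inc()/atomic_add_return() providing a full
 * barrier, so the update is ordered before any later page-table loads.
 */
static void kernel_enter(void)
{
    atomic_exchange(&ct_state, CONTEXT_KERNEL);     /* full-barrier RMW */
}
```

The ordering question raised above is exactly whether CPU-A's load is
ordered after its page-table store (a RELEASE from pte_lock alone is
not enough for that store-load case).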