On Thu, 20 Sep 2018, John David Anglin wrote: > On 2018-09-20 2:39 PM, Mikulas Patocka wrote: > > > > On Thu, 20 Sep 2018, John David Anglin wrote: > > > > > On 2018-09-20 12:11 PM, Mikulas Patocka wrote: > > > > On Thu, 20 Sep 2018, Mikulas Patocka wrote: > > > > > > > > > But there's code like this that needs to be changed. It needs to first > > > > > set > > > > > the PTE and then purge the tlb - otherwise the old pte value could be > > > > > reloaded by a different CPU after purge_tlb_entries() because the TLB > > > > > reload code no longer takes the lock. > > > > > > > > > > static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned > > > > > long > > > > > addr, pte_t *ptep) > > > > > { > > > > > unsigned long flags; > > > > > spin_lock_irqsave(&pa_tlb_lock, flags); > > > > > purge_tlb_entries(mm, addr); > > > > > set_pte(ptep, pte_wrprotect(*ptep)); > > > > > spin_unlock_irqrestore(&pa_tlb_lock, flags); > > > > > } > > > > > > > > > > > > > > > > > > > > The accessed bit is in the same byte as the present bit, so what will > > > > > happen if the TLB fault races with pte clearing? > > > > > > > > > > Linux also encodes the swap location in non-present PTEs - could that > > > > > race > > > > > with the update of the byte containing the accessed bit? > > > > > > > > > > Perhaps the best solution would be to store the accessed and dirty > > > > > flags > > > > > in the two topmost bytes of the PTE and make sure that these bytes are > > > > > not > > > > > used for anything else. > > > > > > > > > > Mikulas > > > > And I realized that even if you don't modify the PTE, dropping the lock > > > > from the TLB handler is wrong. > > > > > > > > If you drop the lock, the CPU can read valid PTE, then wait unlimited > > > > amount of time (due to some microarchitectural event) and then insert > > > > the > > > > PTE into TLB - long after the PTE was cleared and flushed by some other > > > > CPU. > > > > > > > > If you drop the lock, you must stop relying on the instructions that > > > > broadcast TLB shootdown and you must send IPIs to flush TLBs just like > > > > on > > > > x86. > > > I don't believe the lock helps with the insert. As far as I can tell > > > > If you insert into TLB without holding the lock, you may be inserting > > arbitrarily old value. > > There is only a problem if this is done after purge. Yes - and it may happen - I've posted an example: https://www.spinics.net/lists/linux-parisc/msg09131.html > On PA 2.0, there are two reorder buffers - one for accesses and the > other for integer operations. Some instructions go in both. The arch > does not say TLB inserts are accesses. TLB purges are accesses. So, > integer operations can leak from a spin lock except when there is a > dependency on an access. This problem happens even if the CPU didn't reorder instructions at all. The TLB handler first reads a PTE from memory and then inserts it into the TLB. These two operations (load and insert) are not atomic. Between the load and insert, the PTE value in memory may change arbitrarily. Mikulas