Re: [PATCH] parisc: Remove locking from TLB handler and set page accessed

Mikulas Patocka <mpatocka@xxxxxxxxxx> · Thu, 20 Sep 2018 13:40:21 -0400 (EDT)

On Thu, 20 Sep 2018, John David Anglin wrote:

> You ask good questions.  If the races that you describe are possible, this
> argues for continuing
> to use the lock.  It has never been clear to me whether the kernel changes
> page table entries
> while code using the entries or not.

The kernel doesn't stop processes while changing the pagetables - so the 
change may happen simultaneously with another CPU accessing the PTE that 
is being changed.

In x86 the kernel changes the pagetable and then broadcasts an IPI to 
other processors to flush the TLB.

On parisc, the request to flush the TLB may be broadcasted by the hardware 
- but we need the lock in the TLB handler to make sure that it doesn't 
load an entry after it was invalidated.

> On 2018-09-20 10:42 AM, Mikulas Patocka wrote:
> > But there's code like this that needs to be changed. It needs to first set
> > the PTE and then purge the tlb - otherwise the old pte value could be
> > reloaded by a different CPU after purge_tlb_entries() because the TLB
> > reload code no longer takes the lock.
> > 
> > static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long
> > addr, pte_t *ptep)
> > {
> >          unsigned long flags;
> >          spin_lock_irqsave(&pa_tlb_lock, flags);
> >          purge_tlb_entries(mm, addr);
> >          set_pte(ptep, pte_wrprotect(*ptep));
> >          spin_unlock_irqrestore(&pa_tlb_lock, flags);
> > }
> Should be tried.

But I think this isn't sufficient - see my other email. There's this race 
possible if we drop the lock from the fault handler:

CPU1 (executing a page fault handler):
load *pte from pagetable

CPU2 (executing ptep_set_wrprotect)
spin_lock_irqsave(&pa_tlb_lock, flags);
set_pte(ptep, pte_wrprotect(*ptep));
purge_tlb_entries(mm, addr);
spin_unlock_irqrestore(&pa_tlb_lock, flags);

CPU1 (continuing the page fault handler)
insert pte into TLB - now CPU1 is running with stale pte entry in TLB

Does anyone have an idea how does HP-UX handle the TLB faults?

Mikulas