Re: [PATCH v13 06/10] iommu: Add a page fault handler

Jean-Philippe Brucker <jean-philippe@xxxxxxxxxx> · Tue, 23 Mar 2021 11:53:10 +0100

On Tue, Mar 02, 2021 at 09:57:27PM -0800, Raj, Ashok wrote:
> > +	ret = handle_mm_fault(vma, prm->addr, fault_flags, NULL);
> 
> Should we add a trace similar to trace_page_fault_user() or kernel in
> arch/x86/kernel/mm/fault.c 

Yes that would definitely be useful for debugging hardware and developping
applications. I can send a separate patch to add tracepoints here and in
the lower-level device fault path.

> or maybe add a perf_sw_event() for device faults? 

It does seem like that would fit well alongside the existing
PERF_COUNT_SW_PAGE_FAULTS, but I don't think it would be useful in
practice, because we can't provide a context for the event. Since we're
handling these faults remotely, the only way for a user to get IOPF events
is to enable them on all CPUs and all tasks. Tracepoints can have 'pasid'
and 'device' fields to identify an event, but the perf_sw_event wouldn't
have any useful metadata apart from the faulting address.

We could also add tracepoints on bind(), so users can get the PASID
obtained with the PID they care about and filter fault events based on
that.

I've been wondering about associating a PASID with a PID internally,
because we don't currently have anywhere to send SEGV signals for
unhandled page faults. But I think it would be best to notify the device
driver on unhandled fault and let them deal with it. They'll probably want
that information anyway.

Thanks,
Jean