On Wed, 2015-10-07 at 09:28 -0700, Jesse Barnes wrote:
> On 10/07/2015 09:14 AM, Daniel Vetter wrote:
> > On Wed, Oct 07, 2015 at 08:16:42AM -0700, Jesse Barnes wrote:
> > > On 10/07/2015 06:00 AM, David Woodhouse wrote:
> > > > On Fri, 2015-09-04 at 09:59 -0700, Jesse Barnes wrote:
> > > > > +
> > > > > +	ret = handle_mm_fault(mm, vma, address,
> > > > > +			      desc.wr_req ? FAULT_FLAG_WRITE : 0);
> > > > > +	if (ret & VM_FAULT_ERROR) {
> > > > > +		gpu_mm_segv(tsk, address, SEGV_ACCERR); /* ? */
> > > > > +		goto out_unlock;
> > > > > +	}
> > > > > +
> > > >
> > > > Hm, do you need to force the SEGV there, in what ought to be generic
> > > > IOMMU code?
> > > >
> > > > Can you instead just let the fault handler return an appropriate
> > > > failure code to the IOMMU request queue and then deal with the
> > > > resulting error on the i915 device side?
> > >
> > > I'm not sure if we get enough info on the i915 side to handle it
> > > reasonably, we'll have to test that out.
> >
> > We do know precisely which context blew up, but without the TDR work we
> > can't yet just kill the offender selective without affecting the other
> > active gpu contexts.
>
> How? The notification from the IOMMU queue is asynchronous...

The page request, and the response, include 'private data' which an
endpoint can use to carry that kind of information.

In §7.5.1.1 of the VT-d specification it tells us:

	"Private Data: The Private Data field can be used by
	 Root-Complex integrated endpoints to uniquely identify
	 device-specific private information associated with an
	 individual page request.

	"For Intel® Processor Graphics device, the Private Data field
	 specifies the identity of the GPU advanced-context (see
	 Section 3.10) sending the page request."

> > But besides that I really don't see a reason why we need to kill the
> > process if the gpu faults. After all if a thread sigfaults then signal
> > goes to that thread and not some random one (or the one thread that forked
> > the thread that blew up). And we do have interfaces to tell userspace that
> > something bad happened with the gpu work it submitted.

I certainly don't want the core IOMMU code killing things. I really
want to just complete the page request with an appropriate failure
code, and let the endpoint device deal with it from there.

-- 
David Woodhouse                            Open Source Technology Centre
David.Woodhouse@xxxxxxxxx                              Intel Corporation
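
A minimal sketch of the split argued for above, written as standalone C rather than kernel code. Everything in it is an assumption made for illustration: the struct layouts, the response-code names and values, `iommu_handle_page_req()`, `device_handle_page_resp()` and `demo_resolve_readonly()` are hypothetical and are not the VT-d wire format or any existing kernel interface. The generic side only resolves the fault or reports failure, echoing the private data back; the device side uses that private data to find the offending context.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical response codes; the real encoding is defined by the VT-d spec. */
enum page_resp_code {
	PAGE_RESP_SUCCESS = 0,
	PAGE_RESP_INVALID,	/* the fault could not be resolved */
};

/* Hypothetical page request: faulting address, access type, and the
 * opaque private data supplied by the endpoint. */
struct page_req {
	uint64_t address;
	bool wr_req;
	uint32_t private_data;
};

/* Hypothetical page response: result code plus the echoed private data. */
struct page_resp {
	enum page_resp_code code;
	uint32_t private_data;
};

/* Stand-in for the real fault resolution (handle_mm_fault() in the patch). */
typedef bool (*fault_fn)(uint64_t address, bool write);

/* Generic side: resolve the fault or report failure. It never signals
 * or kills a task; it only fills in the response. */
struct page_resp iommu_handle_page_req(const struct page_req *req,
				       fault_fn resolve)
{
	struct page_resp resp;

	assert(req != NULL && resolve != NULL);

	resp.code = PAGE_RESP_SUCCESS;
	resp.private_data = req->private_data;	/* echoed back unchanged */
	if (!resolve(req->address, req->wr_req))
		resp.code = PAGE_RESP_INVALID;
	return resp;
}

#define MAX_CONTEXTS 8

/* Hypothetical per-context state on the device side. */
struct gpu_context {
	bool banned;
};

/* Device side: on failure, use the private data as a context index and
 * mark only that context. Returns the index marked, or -1 if none. */
int device_handle_page_resp(struct gpu_context *ctxs,
			    const struct page_resp *resp)
{
	if (resp->code == PAGE_RESP_SUCCESS)
		return -1;
	if (resp->private_data >= MAX_CONTEXTS)
		return -1;
	ctxs[resp->private_data].banned = true;
	return (int)resp->private_data;
}

/* Demo resolver for the sketch: reads succeed, writes fail. */
bool demo_resolve_readonly(uint64_t address, bool write)
{
	(void)address;
	return !write;
}
```

What the device does with the failing context (ban it, reset it, report it to userspace) is deliberately left to the endpoint driver here; the sketch only shows that the generic handler needs no knowledge of it.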
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/intel-gfx