Re: [PATCH 7/7] drm/i915/gem: Acquire all vma/objects under reservation_ww_class

Quoting Chris Wilson (2020-06-25 18:42:41)
> Quoting Christian König (2020-06-25 16:47:09)
> > Am 25.06.20 um 17:10 schrieb Chris Wilson:
> > > We have the DAG of fences, we can use that information to avoid adding
> > > an implicit coupling between execution contexts.
> > 
> > No, we can't. And it sounds like you still have not understood the 
> > underlying problem.
> > 
> > See, this has nothing to do with the fences themselves or their DAG.
> > 
> > When you depend on userspace to do another submission so your fence can 
> > start processing you end up depending on whatever userspace does.
> 
> HW dependency on userspace is explicit in the ABI and client APIs, and
> the direct control userspace has over the HW.
> 
> > This in turn means that when userspace makes a system call (or takes a
> > page fault), it is possible that this ends up in the reclaim code path.
> 
> We have both said the very same thing.
>  
> > And while we want to avoid it, Daniel and I have already discussed this
> > multiple times, and we agree that being able to do fence waits in the
> > reclaim code path is still a must-have.
> 
> But came to the opposite conclusion. For doing that wait harms the
> unrelated caller and the reclaim is opportunistic. There is no need for
> that caller to reclaim that page, when it can have any other. Why did you
> even choose that page to reclaim? Inducing latency in the caller is a bug,
> has been reported previously as a bug, and is still considered a bug. [But at
> the end of the day, if the system is out of memory, then you have to pick
> a victim.]

An example:

Thread A				Thread B

	submit(vkCmdWaitEvents)
	recvfrom(ThreadB)	...	sendto(ThreadB)
					\- alloc_page
					 \- direct reclaim
					  \- dma_fence_wait(A)
	vkSetEvent()
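
The same interleaving can be reproduced in a small userspace sketch. This is
only an illustration, not kernel or driver code: the two threading.Event
objects stand in for the fence of A's submission and for the socket message,
run_scenario() is a hypothetical name, and the timeouts exist solely so that
the demonstration terminates. In the real case neither wait is bounded.

```python
import threading


def run_scenario(timeout=0.2):
    """Replay the two-thread diagram and report which side made progress."""
    fence_a = threading.Event()  # stand-in for the fence of A's submission
    message = threading.Event()  # stand-in for the sendto()/recvfrom() pair
    progress = {"A": False, "B": False}

    def thread_a():
        # A's submitted work waits on an event that A only sets after
        # it has received the message from B.
        if message.wait(timeout):   # recvfrom(ThreadB)
            fence_a.set()           # set the event, letting the fence signal
            progress["A"] = True

    def thread_b():
        # B needs a page before it can send; the allocation enters direct
        # reclaim, which waits on A's fence.
        if fence_a.wait(timeout):   # dma_fence_wait(A)
            message.set()           # sendto()
            progress["B"] = True

    threads = [threading.Thread(target=thread_a),
               threading.Thread(target=thread_b)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return progress


if __name__ == "__main__":
    print(run_scenario())
```

Each side waits for the other, so neither flag is ever set; without the
timeouts both threads would block forever.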

Regardless of that actual deadlock, waiting on an arbitrary fence incurs
an unbounded latency which is unacceptable for direct reclaim.
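
One way to picture the alternative is a reclaim pass that never blocks: it
takes pages whose fences have already signaled and skips the rest. The sketch
below is a hypothetical userspace model, not the i915 shrinker; Page,
fence_signaled and reclaim_without_waiting() are names invented for the
illustration.

```python
from dataclasses import dataclass


@dataclass
class Page:
    name: str
    fence_signaled: bool  # True once the work using this page has completed


def reclaim_without_waiting(pages, wanted):
    """Return the names of up to `wanted` pages that are idle right now."""
    reclaimed = []
    for page in pages:
        if len(reclaimed) == wanted:
            break
        if not page.fence_signaled:
            # Busy page: skip it instead of stalling the unrelated caller.
            continue
        reclaimed.append(page.name)
    return reclaimed


if __name__ == "__main__":
    pool = [Page("busy", False), Page("idle0", True), Page("idle1", True)]
    print(reclaim_without_waiting(pool, 1))
```

The pass may return fewer pages than requested, which is the point where a
truly out-of-memory system still has to pick a victim.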

Online debugging can indefinitely suspend fence signaling, and the only
guarantee we make of forward progress, in some cases, is process
termination.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx



