From: Intel-xe <intel-xe-bounces@xxxxxxxxxxxxxxxxxxxxx> on behalf of Ghimiray, Himal Prasad <himal.prasad.ghimiray@xxxxxxxxx>
Sent: Friday, February 14, 2025 2:38:10 pm
To: Thomas Hellström <thomas.hellstrom@xxxxxxxxxxxxxxx>; Demi Marie Obenour <demi@xxxxxxxxxxxxxxxxxxxxxx>; Brost, Matthew <matthew.brost@xxxxxxxxx>; intel-xe@xxxxxxxxxxxxxxxxxxxxx <intel-xe@xxxxxxxxxxxxxxxxxxxxx>; dri-devel@xxxxxxxxxxxxxxxxxxxxx <dri-devel@xxxxxxxxxxxxxxxxxxxxx>
Cc: apopple@xxxxxxxxxx <apopple@xxxxxxxxxx>; airlied@xxxxxxxxx <airlied@xxxxxxxxx>; simona.vetter@xxxxxxxx <simona.vetter@xxxxxxxx>; felix.kuehling@xxxxxxx <felix.kuehling@xxxxxxx>; dakr@xxxxxxxxxx <dakr@xxxxxxxxxx>
Subject: Re: [PATCH v5 00/32] Introduce GPU SVM and Xe SVM implementation
k,
asx.ddk
please ignore
From: Thomas Hellström <thomas.hellstrom@xxxxxxxxxxxxxxx>
Sent: Friday, February 14, 2025 2:17:13 PM
To: Demi Marie Obenour <demi@xxxxxxxxxxxxxxxxxxxxxx>; Brost, Matthew <matthew.brost@xxxxxxxxx>; intel-xe@xxxxxxxxxxxxxxxxxxxxx <intel-xe@xxxxxxxxxxxxxxxxxxxxx>; dri-devel@xxxxxxxxxxxxxxxxxxxxx <dri-devel@xxxxxxxxxxxxxxxxxxxxx>
Cc: Ghimiray, Himal Prasad <himal.prasad.ghimiray@xxxxxxxxx>; apopple@xxxxxxxxxx <apopple@xxxxxxxxxx>; airlied@xxxxxxxxx <airlied@xxxxxxxxx>; simona.vetter@xxxxxxxx <simona.vetter@xxxxxxxx>; felix.kuehling@xxxxxxx <felix.kuehling@xxxxxxx>; dakr@xxxxxxxxxx <dakr@xxxxxxxxxx>
Subject: Re: [PATCH v5 00/32] Introduce GPU SVM and Xe SVM implementation
Hi
On Thu, 2025-02-13 at 16:23 -0500, Demi Marie Obenour wrote:
> On Wed, Feb 12, 2025 at 06:10:40PM -0800, Matthew Brost wrote:
> > Version 5 of GPU SVM. Thanks to everyone (especially Sima, Thomas,
> > Alistair, Himal) for their numerous reviews on revision 1, 2, 3 and
> > for helping to address many design issues.
> >
> > This version has been tested with IGT [1] on PVC, BMG, and LNL. Also
> > tested with level0 (UMD) PR [2].
>
> What is the plan to deal with not being able to preempt while a page
> fault is pending? This seems like an easy DoS vector. My understanding
> is that SVM is mostly used by compute workloads on headless systems.
> Recent AMD client GPUs don't support SVM, so programs that want to run
> on client systems should not require SVM if they wish to be portable.
>
> Given the potential for abuse, I think it would be best to require
> explicit administrator opt-in to enable SVM, along with possibly having
> a timeout to resolve a page fault (after which the context is killed).
> Since I expect most uses of SVM to be in the datacenter space (for the
> reasons mentioned above), I don't believe this will be a major
> limitation in practice. Programs that wish to run on client systems
> already need to use explicit memory transfer or pinned userptr, and
> administrators of compute clusters should be willing to enable this
> feature because only one workload will be using a GPU at a time.
While this doesn't directly address the potential DoS issue you mention,
there is an associated deadlock possibility arising from not being able
to preempt a pending pagefault: a dma-fence job may require the same
resources held up by the pending pagefault, while servicing of that
pagefault in turn depends, in one way or another, on that dma-fence
being signaled.
That deadlock is handled by allowing only one job type at a time, either
page-faulting jobs or dma-fence jobs, on a resource (hw engine or hw
engine group) that can be used by both, blocking synchronously in the
exec IOCTL until the resource is available for the submitted job type.
That means LR jobs wait for all dma-fence jobs to complete, and
dma-fence jobs wait for all LR jobs to preempt. So a dma-fence job wait
could easily mean "wait for all outstanding pagefaults to be serviced".
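To illustrate the idea (this is only a rough user-space model in plain C,
not the actual Xe code; the struct and helper names below are made up for
the example), the exec-time gate could look roughly like this:

/*
 * Hypothetical model of the exec-time gate described above: a hw engine
 * group admits either long-running (LR, page-faulting) jobs or dma-fence
 * jobs at a time, and the exec path blocks until the group is free of
 * the other job type.
 */
#include <pthread.h>

enum job_mode { MODE_NONE, MODE_LR, MODE_DMA_FENCE };

struct engine_group {
	pthread_mutex_t lock;
	pthread_cond_t  cond;
	enum job_mode   mode;    /* job type currently owning the group */
	int             active;  /* jobs of that type still outstanding */
};

/* Called from the exec path: block until the group accepts 'want'. */
static void group_enter(struct engine_group *g, enum job_mode want)
{
	pthread_mutex_lock(&g->lock);
	/*
	 * A dma-fence job waits for all LR jobs to drain (preempt), and
	 * an LR job waits for all dma-fence jobs to complete. This is
	 * where a dma-fence submission may effectively wait for
	 * outstanding pagefaults to be serviced.
	 */
	while (g->mode != MODE_NONE && g->mode != want)
		pthread_cond_wait(&g->cond, &g->lock);
	g->mode = want;
	g->active++;
	pthread_mutex_unlock(&g->lock);
}

/* Called when a job completes (dma-fence) or preempts off the hw (LR). */
static void group_exit(struct engine_group *g)
{
	pthread_mutex_lock(&g->lock);
	if (--g->active == 0) {
		g->mode = MODE_NONE;
		pthread_cond_broadcast(&g->cond);
	}
	pthread_mutex_unlock(&g->lock);
}

The key property is that the exec path blocks synchronously until the
resource is free of the other job type, rather than ever mixing the two
job types on the same hw resource.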
Whether, on the other hand, that is a real DoS we need to care about is
probably a topic for debate. The directions we've had so far are that
it's not: nothing is held up indefinitely, what's held up can be
Ctrl-C'd by the user, and core mm memory management is not blocked,
since mmu_notifiers can run to completion and shrinkers / eviction can
execute while a pagefault is pending.
Thanks,
Thomas