From: Intel-xe <intel-xe-bounces@xxxxxxxxxxxxxxxxxxxxx> on behalf of Ghimiray, Himal Prasad <himal.prasad.ghimiray@xxxxxxxxx>
Sent: Friday, February 14, 2025 2:38:10 pm
To: Thomas Hellström <thomas.hellstrom@xxxxxxxxxxxxxxx>; Demi Marie Obenour <demi@xxxxxxxxxxxxxxxxxxxxxx>; Brost, Matthew <matthew.brost@xxxxxxxxx>; intel-xe@xxxxxxxxxxxxxxxxxxxxx <intel-xe@xxxxxxxxxxxxxxxxxxxxx>; dri-devel@xxxxxxxxxxxxxxxxxxxxx <dri-devel@xxxxxxxxxxxxxxxxxxxxx>
Cc: apopple@xxxxxxxxxx <apopple@xxxxxxxxxx>; airlied@xxxxxxxxx <airlied@xxxxxxxxx>; simona.vetter@xxxxxxxx <simona.vetter@xxxxxxxx>; felix.kuehling@xxxxxxx <felix.kuehling@xxxxxxx>; dakr@xxxxxxxxxx <dakr@xxxxxxxxxx>
Subject: Re: [PATCH v5 00/32] Introduce GPU SVM and Xe SVM implementation
k,
asx.ddk
please ignore
From: Thomas Hellström <thomas.hellstrom@xxxxxxxxxxxxxxx>
Sent: Friday, February 14, 2025 2:17:13 PM
To: Demi Marie Obenour <demi@xxxxxxxxxxxxxxxxxxxxxx>; Brost, Matthew <matthew.brost@xxxxxxxxx>; intel-xe@xxxxxxxxxxxxxxxxxxxxx <intel-xe@xxxxxxxxxxxxxxxxxxxxx>; dri-devel@xxxxxxxxxxxxxxxxxxxxx <dri-devel@xxxxxxxxxxxxxxxxxxxxx>
Cc: Ghimiray, Himal Prasad <himal.prasad.ghimiray@xxxxxxxxx>; apopple@xxxxxxxxxx <apopple@xxxxxxxxxx>; airlied@xxxxxxxxx <airlied@xxxxxxxxx>; simona.vetter@xxxxxxxx <simona.vetter@xxxxxxxx>; felix.kuehling@xxxxxxx <felix.kuehling@xxxxxxx>; dakr@xxxxxxxxxx <dakr@xxxxxxxxxx>
Subject: Re: [PATCH v5 00/32] Introduce GPU SVM and Xe SVM implementation
Hi
On Thu, 2025-02-13 at 16:23 -0500, Demi Marie Obenour wrote:
> On Wed, Feb 12, 2025 at 06:10:40PM -0800, Matthew Brost wrote:
> > Version 5 of GPU SVM. Thanks to everyone (especially Sima, Thomas,
> > Alistair, Himal) for their numerous reviews on revision 1, 2, 3 and
> > for helping to address many design issues.
> >
> > This version has been tested with IGT [1] on PVC, BMG, and LNL. Also
> > tested with level0 (UMD) PR [2].
>
> What is the plan to deal with not being able to preempt while a page
> fault is pending? This seems like an easy DoS vector. My understanding
> is that SVM is mostly used by compute workloads on headless systems.
> Recent AMD client GPUs don't support SVM, so programs that want to run
> on client systems should not require SVM if they wish to be portable.
>
> Given the potential for abuse, I think it would be best to require
> explicit administrator opt-in to enable SVM, along with possibly having
> a timeout to resolve a page fault (after which the context is killed).
> Since I expect most uses of SVM to be in the datacenter space (for the
> reasons mentioned above), I don't believe this will be a major
> limitation in practice. Programs that wish to run on client systems
> already need to use explicit memory transfer or pinned userptr, and
> administrators of compute clusters should be willing to enable this
> feature because only one workload will be using a GPU at a time.
While this doesn't directly address the potential DoS issue you mention,
there is an associated deadlock possibility arising from not being able
to preempt a pending pagefault: a dma-fence job may require the same
resources held up by the pending pagefault, while servicing of that
pagefault in turn depends, in one way or another, on that dma-fence
being signaled.
That deadlock is handled by allowing only one job type at a time, either
page-faulting jobs or dma-fence jobs, on a resource (hw engine or hw
engine group) that can be used by both, blocking synchronously in the
exec IOCTL until the resource is available for the submitted job type.
That means LR jobs wait for all dma-fence jobs to complete, and
dma-fence jobs wait for all LR jobs to preempt. So a dma-fence job wait
could easily mean "wait for all outstanding pagefaults to be serviced".
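To illustrate the idea (this is only a rough user-space model in plain C,
not the actual Xe code; the struct and helper names below are made up for
the example), the exec-time gate could look roughly like this:

/*
 * Hypothetical model of the exec-time gate described above: a hw engine
 * group admits either long-running (LR, page-faulting) jobs or dma-fence
 * jobs at a time, and the exec path blocks until the group is free of
 * the other job type.
 */
#include <pthread.h>

enum job_mode { MODE_NONE, MODE_LR, MODE_DMA_FENCE };

struct engine_group {
	pthread_mutex_t lock;
	pthread_cond_t  cond;
	enum job_mode   mode;    /* job type currently owning the group */
	int             active;  /* jobs of that type still outstanding */
};

/* Called from the exec path: block until the group accepts 'want'. */
static void group_enter(struct engine_group *g, enum job_mode want)
{
	pthread_mutex_lock(&g->lock);
	/*
	 * A dma-fence job waits for all LR jobs to drain (preempt), and
	 * an LR job waits for all dma-fence jobs to complete. This is
	 * where a dma-fence submission may effectively wait for
	 * outstanding pagefaults to be serviced.
	 */
	while (g->mode != MODE_NONE && g->mode != want)
		pthread_cond_wait(&g->cond, &g->lock);
	g->mode = want;
	g->active++;
	pthread_mutex_unlock(&g->lock);
}

/* Called when a job completes (dma-fence) or preempts off the hw (LR). */
static void group_exit(struct engine_group *g)
{
	pthread_mutex_lock(&g->lock);
	if (--g->active == 0) {
		g->mode = MODE_NONE;
		pthread_cond_broadcast(&g->cond);
	}
	pthread_mutex_unlock(&g->lock);
}

The key property is that the exec path blocks synchronously until the
resource is free of the other job type, rather than ever mixing the two
job types on the same hw resource.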
Whether, on the other hand, that is a real DoS we need to care about is
probably a topic for debate. The directions we've had so far are that
it's not: nothing is held up indefinitely, what's held up can be
Ctrl-C'd by the user, and core mm memory management is not blocked,
since mmu_notifiers can run to completion and shrinkers / eviction can
execute while a pagefault is pending.
Thanks,
Thomas