[AMD Official Use Only - General]

Thanks Felix for your feedback. I will work on applying your suggested approach and upload a second version when it is ready.

David B.

> -----Original Message-----
> From: Kuehling, Felix <Felix.Kuehling@xxxxxxx>
> Sent: Monday, March 6, 2023 6:46 PM
> To: amd-gfx@xxxxxxxxxxxxxxxxxxxxx; Belanger, David <David.Belanger@xxxxxxx>
> Subject: Re: [PATCH] drm/amdkfd: Fixed kfd_process cleanup on module exit.
>
>
> Am 2023-03-06 um 16:58 schrieb David Belanger:
> > Handle case when module is unloaded (kfd_exit) before a process space
> > (mm_struct) is released.
> >
> > Signed-off-by: David Belanger <david.belanger@xxxxxxx>
> > ---
> >  drivers/gpu/drm/amd/amdkfd/kfd_module.c  |  4 ++
> >  drivers/gpu/drm/amd/amdkfd/kfd_process.c | 57 ++++++++++++++++++++++++
> >  2 files changed, 61 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_module.c b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
> > index 09b966dc3768..8ef4bd9e4f7d 100644
> > --- a/drivers/gpu/drm/amd/amdkfd/kfd_module.c
> > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
> > @@ -26,6 +26,9 @@
> >  #include "kfd_priv.h"
> >  #include "amdgpu_amdkfd.h"
> >
> > +void kfd_cleanup_processes(void);
> > +
> > +
> >  static int kfd_init(void)
> >  {
> >  	int err;
> > @@ -77,6 +80,7 @@ static int kfd_init(void)
> >
> >  static void kfd_exit(void)
> >  {
> > +	kfd_cleanup_processes();
> >  	kfd_debugfs_fini();
> >  	kfd_process_destroy_wq();
> >  	kfd_procfs_shutdown();
> > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> > index ebabe92f7edb..b5b28a32639d 100644
> > --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> > @@ -1181,6 +1181,17 @@ static void kfd_process_notifier_release(struct mmu_notifier *mn,
> >  		return;
> >
> >  	mutex_lock(&kfd_processes_mutex);
> > +	/*
> > +	 * Do early return if p is not in the table.
> > +	 *
> > +	 * This could potentially happen if this function is called concurrently
> > +	 * by mmu_notifier and by kfd_cleanup_processes.
> > +	 *
> > +	 */
> > +	if (!hash_hashed(&p->kfd_processes)) {
> > +		mutex_unlock(&kfd_processes_mutex);
> > +		return;
> > +	}
> >  	hash_del_rcu(&p->kfd_processes);
> >  	mutex_unlock(&kfd_processes_mutex);
> >  	synchronize_srcu(&kfd_processes_srcu);
> > @@ -1200,6 +1211,52 @@ static const struct mmu_notifier_ops kfd_process_mmu_notifier_ops = {
> >  	.free_notifier = kfd_process_free_notifier,
> >  };
> >
> > +
> > +void kfd_cleanup_processes(void)
> > +{
> > +	struct kfd_process *p;
> > +	unsigned int temp;
> > +
> > +	/*
> > +	 * Iterate over remaining processes in table, calling notifier release
> > +	 * to free up notifier and process resources.
> > +	 *
> > +	 * This code handles the case when driver is unloaded before all
> > +	 * mm_struct are released.
> > +	 */
> > +	int idx = srcu_read_lock(&kfd_processes_srcu);
> > +
> > +	hash_for_each_rcu(kfd_processes_table, temp, p, kfd_processes) {
> > +		if (p) {
> > +			/*
> > +			 * Obtain a reference on p to avoid a late mmu_notifier
> > +			 * release call triggering freeing the process.
> > +			 */
> > +			kref_get(&p->ref);
> > +
> > +			srcu_read_unlock(&kfd_processes_srcu, idx);
>
> I don't think it's valid to drop the lock in the middle of the loop. You
> need to hold the lock throughout the loop to protect the consistency of
> the data structure. I guess you're doing this because you got a deadlock
> from synchronize_srcu in kfd_process_notifier_release.
>
> > +
> > +			kfd_process_notifier_release(&p->mmu_notifier, p->mm);
>
> This calls hash_del_rcu to remove the process from the hash table. To
> make this safe, you need to hold the kfd_processes_mutex.
>
> Since this is outside the RCU read lock, the entry in the hlist can be
> freed, which can cause problems when the hash_for_each_rcu loop tries to
> find the next entry in the hlist.
> > +
> > +			kfd_unref_process(p);
>
> This schedules a worker that can free the process at any time, which
> also frees the hlist_node p->kfd_processes that is still needed by
> hash_for_each_rcu to find the next entry. If you're unlucky, the worker
> will be scheduled before the next loop iteration, and you can get a
> kernel oops.
>
> I suggest a safer strategy: Make a loop using hash_for_each_safe to move
> all the processes into a new hlist. You can do that while holding the
> kfd_processes_mutex, so you can safely remove all entries from the hash
> table and move them into your own private hlist.
>
> Then you can safely release all the processes from your private hlist
> without having to hold either the srcu read lock or the mutex.
>
> Regards,
>   Felix
>
> > +
> > +			idx = srcu_read_lock(&kfd_processes_srcu);
> > +		}
> > +	}
> > +	srcu_read_unlock(&kfd_processes_srcu, idx);
> > +
> > +	/*
> > +	 * Must be called after all mmu_notifier_put are done and before
> > +	 * kfd_process_wq is released.
> > +	 *
> > +	 * Ensures that all outstanding free_notifier gets called, triggering
> > +	 * the release of the process.
> > +	 */
> > +	mmu_notifier_synchronize();
> > +}
> > +
> > +
> >  static int kfd_process_init_cwsr_apu(struct kfd_process *p, struct file *filep)
> >  {
> >  	unsigned long offset;
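For reference, the drain-then-release strategy Felix describes could look roughly like the sketch below. This is only a sketch of a possible v2, not a posted patch: `kfd_process_notifier_release_internal` is a hypothetical helper name, standing for the body of `kfd_process_notifier_release` with the hash-table removal factored out so it can run on an entry that has already been unhashed.

```c
/*
 * Sketch only (hypothetical v2): drain kfd_processes_table into a
 * private hlist while holding kfd_processes_mutex, then release each
 * process with no locks held.
 */
void kfd_cleanup_processes(void)
{
	struct kfd_process *p;
	struct hlist_node *p_temp;
	unsigned int temp;
	HLIST_HEAD(cleanup_list);

	/*
	 * Move all remaining processes out of the hash table under the
	 * mutex, so a concurrent kfd_process_notifier_release sees a
	 * consistent table and takes its early-return path.
	 */
	mutex_lock(&kfd_processes_mutex);
	hash_for_each_safe(kfd_processes_table, temp, p_temp, p, kfd_processes) {
		hash_del_rcu(&p->kfd_processes);
		synchronize_srcu(&kfd_processes_srcu);
		hlist_add_head(&p->kfd_processes, &cleanup_list);
	}
	mutex_unlock(&kfd_processes_mutex);

	/* Release everything on the private list with no locks held. */
	hlist_for_each_entry_safe(p, p_temp, &cleanup_list, kfd_processes)
		kfd_process_notifier_release_internal(p); /* hypothetical helper */

	/*
	 * Ensure all outstanding free_notifier callbacks have run before
	 * kfd_process_destroy_wq tears down the workqueues.
	 */
	mmu_notifier_synchronize();
}
```

Reusing the `kfd_processes` hlist_node for the private list works because each entry has just been removed from the table and `synchronize_srcu` has guaranteed no SRCU reader still traverses it.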