RE: [PATCH 2/4] drm/xe/client: add missing bo locking in show_meminfo()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




> -----Original Message-----
> From: Auld, Matthew <matthew.auld@xxxxxxxxx>
> Sent: Wednesday, September 11, 2024 2:05 PM
> To: Upadhyay, Tejas <tejas.upadhyay@xxxxxxxxx>; intel-
> xe@xxxxxxxxxxxxxxxxxxxxx
> Cc: Ghimiray, Himal Prasad <himal.prasad.ghimiray@xxxxxxxxx>; Thomas
> Hellström <thomas.hellstrom@xxxxxxxxxxxxxxx>; stable@xxxxxxxxxxxxxxx
> Subject: Re: [PATCH 2/4] drm/xe/client: add missing bo locking in
> show_meminfo()
> 
> Hi,
> 
> On 11/09/2024 06:39, Upadhyay, Tejas wrote:
> >
> >
> >> -----Original Message-----
> >> From: Auld, Matthew <matthew.auld@xxxxxxxxx>
> >> Sent: Tuesday, September 10, 2024 6:42 PM
> >> To: intel-xe@xxxxxxxxxxxxxxxxxxxxx
> >> Cc: Ghimiray, Himal Prasad <himal.prasad.ghimiray@xxxxxxxxx>;
> >> Upadhyay, Tejas <tejas.upadhyay@xxxxxxxxx>; Thomas Hellström
> >> <thomas.hellstrom@xxxxxxxxxxxxxxx>; stable@xxxxxxxxxxxxxxx
> >> Subject: [PATCH 2/4] drm/xe/client: add missing bo locking in
> >> show_meminfo()
> >>
> >> bo_meminfo() wants to inspect bo state like tt and the ttm resource,
> >> however this state can change at any point leading to stuff like NPD
> >> and UAF, if the bo lock is not held. Grab the bo lock when calling
> >> bo_meminfo(), ensuring we drop any spinlocks first. In the case of
> >> object_idr we now also need to hold a ref.
> >>
> >> Fixes: 0845233388f8 ("drm/xe: Implement fdinfo memory stats
> >> printing")
> >> Signed-off-by: Matthew Auld <matthew.auld@xxxxxxxxx>
> >> Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@xxxxxxxxx>
> >> Cc: Tejas Upadhyay <tejas.upadhyay@xxxxxxxxx>
> >> Cc: "Thomas Hellström" <thomas.hellstrom@xxxxxxxxxxxxxxx>
> >> Cc: <stable@xxxxxxxxxxxxxxx> # v6.8+
> >> ---
> >>   drivers/gpu/drm/xe/xe_drm_client.c | 37
> +++++++++++++++++++++++++++---
> >>   1 file changed, 34 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/xe/xe_drm_client.c
> >> b/drivers/gpu/drm/xe/xe_drm_client.c
> >> index badfa045ead8..3cca741c500c 100644
> >> --- a/drivers/gpu/drm/xe/xe_drm_client.c
> >> +++ b/drivers/gpu/drm/xe/xe_drm_client.c
> >> @@ -10,6 +10,7 @@
> >>   #include <linux/slab.h>
> >>   #include <linux/types.h>
> >>
> >> +#include "xe_assert.h"
> >>   #include "xe_bo.h"
> >>   #include "xe_bo_types.h"
> >>   #include "xe_device_types.h"
> >> @@ -151,10 +152,13 @@ void xe_drm_client_add_bo(struct
> xe_drm_client
> >> *client,
> >>    */
> >>   void xe_drm_client_remove_bo(struct xe_bo *bo)  {
> >> +	struct xe_device *xe = ttm_to_xe_device(bo->ttm.bdev);
> >>   	struct xe_drm_client *client = bo->client;
> >>
> >> +	xe_assert(xe, !kref_read(&bo->ttm.base.refcount));
> >> +
> >>   	spin_lock(&client->bos_lock);
> >> -	list_del(&bo->client_link);
> >> +	list_del_init(&bo->client_link);
> >>   	spin_unlock(&client->bos_lock);
> >>
> >>   	xe_drm_client_put(client);
> >> @@ -207,7 +211,20 @@ static void show_meminfo(struct drm_printer *p,
> >> struct drm_file *file)
> >>   	idr_for_each_entry(&file->object_idr, obj, id) {
> >>   		struct xe_bo *bo = gem_to_xe_bo(obj);
> >>
> >> -		bo_meminfo(bo, stats);
> >> +		if (dma_resv_trylock(bo->ttm.base.resv)) {
> >> +			bo_meminfo(bo, stats);
> >> +			xe_bo_unlock(bo);
> >> +		} else {
> >> +			xe_bo_get(bo);
> >> +			spin_unlock(&file->table_lock);
> >> +
> >> +			xe_bo_lock(bo, false);
> >> +			bo_meminfo(bo, stats);
> >> +			xe_bo_unlock(bo);
> >> +
> >> +			xe_bo_put(bo);
> >> +			spin_lock(&file->table_lock);
> >> +		}
> >>   	}
> >>   	spin_unlock(&file->table_lock);
> >>
> >> @@ -217,7 +234,21 @@ static void show_meminfo(struct drm_printer *p,
> >> struct drm_file *file)
> >>   		if (!kref_get_unless_zero(&bo->ttm.base.refcount))
> >>   			continue;
> >>
> >
> > While we have ref to BO, why would it need lock here, can you please
> explain if I am missing something. I though BO cant be deleted till will hold
> ref?
> 
> The ref is just about protecting the lifetime of the bo, however the internal bo
> state in particular the ttm stuff, is all protected by holding the dma-resv bo
> lock.
> 
> For example the bo can be moved/evicted around at will and the object state
> changes with it, but that should be done only when also holding the bo lock.
> If we are holding the bo lock here then the object state should be stable,
> making it safe to inspect stuff like bo->ttm.ttm and
> bo->ttm.resource. As an example, if you look at ttm_bo_move_null() and
> imagine xe_bo_has_pages() racing with that, then NPD or UAF is possible.
> 

Ok, meaning ref will protect deletion of bo while in used, but internal state of bo's fields can still change with holding ref. Got it,
Reviewed-by: Tejas Upadhyay <tejas.upadhyay@xxxxxxxxx>

Thanks,
Tejas
> >
> > Thanks,
> > Tejas
> >> -		bo_meminfo(bo, stats);
> >> +		if (dma_resv_trylock(bo->ttm.base.resv)) {
> >> +			bo_meminfo(bo, stats);
> >> +			xe_bo_unlock(bo);
> >> +		} else {
> >> +			spin_unlock(&client->bos_lock);
> >> +
> >> +			xe_bo_lock(bo, false);
> >> +			bo_meminfo(bo, stats);
> >> +			xe_bo_unlock(bo);
> >> +
> >> +			spin_lock(&client->bos_lock);
> >> +			/* The bo ref will prevent this bo from being removed
> >> from the list */
> >> +			xe_assert(xef->xe, !list_empty(&bo->client_link));
> >> +		}
> >> +
> >>   		xe_bo_put_deferred(bo, &deferred);
> >>   	}
> >>   	spin_unlock(&client->bos_lock);
> >> --
> >> 2.46.0
> >




[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux