RE: [PATCH v3 1/5] drm/i915/gem: Implement object migration

"Ruhl, Michael J" <michael.j.ruhl@xxxxxxxxx> · Mon, 28 Jun 2021 19:50:18 +0000

>-----Original Message-----
>From: Thomas Hellström <thomas.hellstrom@xxxxxxxxxxxxxxx>
>Sent: Monday, June 28, 2021 3:03 PM
>To: Ruhl, Michael J <michael.j.ruhl@xxxxxxxxx>; intel-
>gfx@xxxxxxxxxxxxxxxxxxxxx; dri-devel@xxxxxxxxxxxxxxxxxxxxx
>Cc: Auld, Matthew <matthew.auld@xxxxxxxxx>; lkp <lkp@xxxxxxxxx>
>Subject: Re: [PATCH v3 1/5] drm/i915/gem: Implement object migration
>
>
>On 6/28/21 8:11 PM, Ruhl, Michael J wrote:
>>> -----Original Message-----
>>> From: dri-devel <dri-devel-bounces@xxxxxxxxxxxxxxxxxxxxx> On Behalf Of
>>> Thomas Hellström
>>> Sent: Monday, June 28, 2021 10:46 AM
>>> To: intel-gfx@xxxxxxxxxxxxxxxxxxxxx; dri-devel@xxxxxxxxxxxxxxxxxxxxx
>>> Cc: Thomas Hellström <thomas.hellstrom@xxxxxxxxxxxxxxx>; Auld,
>Matthew
>>> <matthew.auld@xxxxxxxxx>; lkp <lkp@xxxxxxxxx>
>>> Subject: [PATCH v3 1/5] drm/i915/gem: Implement object migration
>>>
>>> Introduce an interface to migrate objects between regions.
>>> This is primarily intended to migrate objects to LMEM for display and
>>> to SYSTEM for dma-buf, but might be reused in one form or another for
>>> performance-based migration.
>>>
>>> v2:
>>> - Verify that the memory region given as an id really exists.
>>>   (Reported by Matthew Auld)
>>> - Call i915_gem_object_{init,release}_memory_region() when switching
>>> region
>>>   to handle also switching region lists. (Reported by Matthew Auld)
>>> v3:
>>> - Fix i915_gem_object_can_migrate() to return true if object is already in
>>>   the correct region, even if the object ops doesn't have a migrate()
>>>   callback.
>>> - Update typo in commit message.
>>> - Fix kerneldoc of i915_gem_object_wait_migration().
>>>
>>> Reported-by: kernel test robot <lkp@xxxxxxxxx>
>>> Signed-off-by: Thomas Hellström <thomas.hellstrom@xxxxxxxxxxxxxxx>
>>> ---
>>> drivers/gpu/drm/i915/gem/i915_gem_object.c    | 96
>>> +++++++++++++++++++
>>> drivers/gpu/drm/i915/gem/i915_gem_object.h    | 12 +++
>>> .../gpu/drm/i915/gem/i915_gem_object_types.h  |  9 ++
>>> drivers/gpu/drm/i915/gem/i915_gem_ttm.c       | 69 +++++++++----
>>> drivers/gpu/drm/i915/gem/i915_gem_wait.c      | 19 ++++
>>> 5 files changed, 188 insertions(+), 17 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c
>>> b/drivers/gpu/drm/i915/gem/i915_gem_object.c
>>> index 07e8ff9a8aae..1c18be067b58 100644
>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
>>> @@ -513,6 +513,102 @@ bool i915_gem_object_has_iomem(const struct
>>> drm_i915_gem_object *obj)
>>> 	return obj->mem_flags & I915_BO_FLAG_IOMEM;
>>> }
>>>
>>> +/**
>>> + * i915_gem_object_can_migrate - Whether an object likely can be
>migrated
>>> + *
>>> + * @obj: The object to migrate
>>> + * @id: The region intended to migrate to
>>> + *
>>> + * Check whether the object backend supports migration to the
>>> + * given region. Note that pinning may affect the ability to migrate.
>>> + *
>>> + * Return: true if migration is possible, false otherwise.
>>> + */
>>> +bool i915_gem_object_can_migrate(struct drm_i915_gem_object *obj,
>>> +				 enum intel_region_id id)
>>> +{
>>> +	struct drm_i915_private *i915 = to_i915(obj->base.dev);
>>> +	unsigned int num_allowed = obj->mm.n_placements;
>>> +	struct intel_memory_region *mr;
>>> +	unsigned int i;
>>> +
>>> +	GEM_BUG_ON(id >= INTEL_REGION_UNKNOWN);
>>> +	GEM_BUG_ON(obj->mm.madv != I915_MADV_WILLNEED);
>>> +
>>> +	mr = i915->mm.regions[id];
>>> +	if (!mr)
>>> +		return false;
>>> +
>>> +	if (obj->mm.region == mr)
>>> +		return true;
>>> +
>>> +	if (!i915_gem_object_evictable(obj))
>>> +		return false;
>>> +
>>> +	if (!obj->ops->migrate)
>>> +		return false;
>>> +
>>> +	if (!(obj->flags & I915_BO_ALLOC_USER))
>>> +		return true;
>>> +
>>> +	if (num_allowed == 0)
>>> +		return false;
>>> +
>>> +	for (i = 0; i < num_allowed; ++i) {
>>> +		if (mr == obj->mm.placements[i])
>>> +			return true;
>>> +	}
>> Hi Thomas,
>>
>> I am a little confused over the difference in checks between this function
>> and i915_gem_object_migrate().
>>
>> Why is the lack of an mr a BUG_ON in _object_migrate(), but here it just
>> false?
>>
>> So that means that under certain circumstances, you could not have a mr?
>>
>> If that is the case, when?
>>
>> Would that be when the I915_BO_ALLOC_USER is set?
>>
>> If so, should there be a check for "non" user vs user?
>>
>> Or maybe document this function pointing out why there are differences
>> and why?
>
>Yes, I'll give it some more documentation. The basic idea is that the
>above function also could be
>used to validate user supplied data, whereas there might be cases where
>we want to use the gem_object_migrate() function and override the above.
>
>
>>
>>> +	return false;
>>> +}
>>> +
>>> +/**
>>> + * i915_gem_object_migrate - Migrate an object to the desired region id
>>> + * @obj: The object to migrate.
>>> + * @ww: An optional struct i915_gem_ww_ctx. If NULL, the backend may
>>> + * not be successful in evicting other objects to make room for this object.
>> Is the ww for future consideration?  (I don't see any use of it in the patch).
>
>Yes, but it will remain optional.
>
>
>>
>>> + * @id: The region id to migrate to.
>>> + *
>>> + * Attempt to migrate the object to the desired memory region. The
>>> + * object backend must support migration and the object may not be
>>> + * pinned, (explicitly pinned pages or pinned vmas). The object must
>>> + * be locked.
>>> + * On successful completion, the object will have pages pointing to
>>> + * memory in the new region, but an async migration task may not have
>>> + * completed yet, and to accomplish that,
>>> i915_gem_object_wait_migration()
>>> + * must be called.
>>> + *
>>> + * Return: 0 on success. Negative error code on failure. In particular may
>>> + * return -ENXIO on lack of region space, -EDEADLK for deadlock
>avoidance
>>> + * if @ww is set, -EINTR or -ERESTARTSYS if signal pending, and
>>> + * -EBUSY if the object is pinned.
>>> + */
>>> +int i915_gem_object_migrate(struct drm_i915_gem_object *obj,
>>> +			    struct i915_gem_ww_ctx *ww,
>>> +			    enum intel_region_id id)
>>> +{
>>> +	struct drm_i915_private *i915 = to_i915(obj->base.dev);
>>> +	struct intel_memory_region *mr;
>>> +
>>> +	GEM_BUG_ON(id >= INTEL_REGION_UNKNOWN);
>>> +	GEM_BUG_ON(obj->mm.madv != I915_MADV_WILLNEED);
>>> +	assert_object_held(obj);
>>> +
>>> +	mr = i915->mm.regions[id];
>>> +	GEM_BUG_ON(!mr);
>>> +
>>> +	if (obj->mm.region == mr)
>>> +		return 0;
>>> +
>>> +	if (!i915_gem_object_evictable(obj))
>>> +		return -EBUSY;
>>> +
>>> +	if (!obj->ops->migrate)
>>> +		return -EOPNOTSUPP;
>> Why aren't you using _can_migrate here?
>
>It's just in case we want to override. I'll add some more comments about
>this.
>
>>
>>> +	return obj->ops->migrate(obj, mr);
>>> +}
>>> +
>>> void i915_gem_init__objects(struct drm_i915_private *i915)
>>> {
>>> 	INIT_WORK(&i915->mm.free_work, __i915_gem_free_work);
>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h
>>> b/drivers/gpu/drm/i915/gem/i915_gem_object.h
>>> index ea3224a480c4..8cbd7a5334e2 100644
>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
>>> @@ -17,6 +17,8 @@
>>> #include "i915_gem_ww.h"
>>> #include "i915_vma_types.h"
>>>
>>> +enum intel_region_id;
>>> +
>>> /*
>>>   * XXX: There is a prevalence of the assumption that we fit the
>>>   * object's page count inside a 32bit _signed_ variable. Let's document
>>> @@ -597,6 +599,16 @@ bool i915_gem_object_migratable(struct
>>> drm_i915_gem_object *obj);
>>>
>>> bool i915_gem_object_validates_to_lmem(struct drm_i915_gem_object
>>> *obj);
>>>
>>> +int i915_gem_object_migrate(struct drm_i915_gem_object *obj,
>>> +			    struct i915_gem_ww_ctx *ww,
>>> +			    enum intel_region_id id);
>>> +
>>> +bool i915_gem_object_can_migrate(struct drm_i915_gem_object *obj,
>>> +				 enum intel_region_id id);
>>> +
>>> +int i915_gem_object_wait_migration(struct drm_i915_gem_object *obj,
>>> +				   unsigned int flags);
>>> +
>>> #ifdef CONFIG_MMU_NOTIFIER
>>> static inline bool
>>> i915_gem_object_is_userptr(struct drm_i915_gem_object *obj)
>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
>>> b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
>>> index 441f913c87e6..ef3de2ae9723 100644
>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
>>> @@ -18,6 +18,7 @@
>>>
>>> struct drm_i915_gem_object;
>>> struct intel_fronbuffer;
>>> +struct intel_memory_region;
>>>
>>> /*
>>>   * struct i915_lut_handle tracks the fast lookups from handle to vma used
>>> @@ -77,6 +78,14 @@ struct drm_i915_gem_object_ops {
>>> 	 * delayed_free - Override the default delayed free implementation
>>> 	 */
>>> 	void (*delayed_free)(struct drm_i915_gem_object *obj);
>>> +
>>> +	/**
>>> +	 * migrate - Migrate object to a different region either for
>>> +	 * pinning or for as long as the object lock is held.
>>> +	 */
>>> +	int (*migrate)(struct drm_i915_gem_object *obj,
>>> +		       struct intel_memory_region *mr);
>>> +
>>> 	void (*release)(struct drm_i915_gem_object *obj);
>>>
>>> 	const struct vm_operations_struct *mmap_ops;
>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
>>> b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
>>> index c39d982c4fa6..8f89185b6507 100644
>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
>>> @@ -617,7 +617,8 @@ struct ttm_device_funcs *i915_ttm_driver(void)
>>> 	return &i915_ttm_bo_driver;
>>> }
>>>
>>> -static int i915_ttm_get_pages(struct drm_i915_gem_object *obj)
>>> +static int __i915_ttm_get_pages(struct drm_i915_gem_object *obj,
>>> +				struct ttm_placement *placement)
>>> {
>>> 	struct ttm_buffer_object *bo = i915_gem_to_ttm(obj);
>>> 	struct ttm_operation_ctx ctx = {
>>> @@ -625,19 +626,12 @@ static int i915_ttm_get_pages(struct
>>> drm_i915_gem_object *obj)
>>> 		.no_wait_gpu = false,
>>> 	};
>>> 	struct sg_table *st;
>>> -	struct ttm_place requested, busy[I915_TTM_MAX_PLACEMENTS];
>>> -	struct ttm_placement placement;
>>> 	int real_num_busy;
>>> 	int ret;
>>>
>>> -	GEM_BUG_ON(obj->mm.n_placements >
>>> I915_TTM_MAX_PLACEMENTS);
>>> -
>>> -	/* Move to the requested placement. */
>>> -	i915_ttm_placement_from_obj(obj, &requested, busy, &placement);
>>> -
>>> 	/* First try only the requested placement. No eviction. */
>>> -	real_num_busy =
>>> fetch_and_zero(&placement.num_busy_placement);
>>> -	ret = ttm_bo_validate(bo, &placement, &ctx);
>>> +	real_num_busy = fetch_and_zero(&placement-
>>>> num_busy_placement);
>>> +	ret = ttm_bo_validate(bo, placement, &ctx);
>>> 	if (ret) {
>>> 		ret = i915_ttm_err_to_gem(ret);
>>> 		/*
>>> @@ -652,8 +646,8 @@ static int i915_ttm_get_pages(struct
>>> drm_i915_gem_object *obj)
>>> 		 * If the initial attempt fails, allow all accepted placements,
>>> 		 * evicting if necessary.
>>> 		 */
>>> -		placement.num_busy_placement = real_num_busy;
>>> -		ret = ttm_bo_validate(bo, &placement, &ctx);
>>> +		placement->num_busy_placement = real_num_busy;
>>> +		ret = ttm_bo_validate(bo, placement, &ctx);
>>> 		if (ret)
>>> 			return i915_ttm_err_to_gem(ret);
>>> 	}
>>> @@ -668,16 +662,56 @@ static int i915_ttm_get_pages(struct
>>> drm_i915_gem_object *obj)
>>> 		i915_ttm_adjust_gem_after_move(obj);
>>> 	}
>>>
>>> -	/* Object either has a page vector or is an iomem object */
>>> -	st = bo->ttm ? i915_ttm_tt_get_st(bo->ttm) : obj->ttm.cached_io_st;
>>> -	if (IS_ERR(st))
>>> -		return PTR_ERR(st);
>>> +	if (!obj->mm.pages) {
>>> +		/* Object either has a page vector or is an iomem object */
>>> +		st = bo->ttm ? i915_ttm_tt_get_st(bo->ttm) : obj-
>>>> ttm.cached_io_st;
>>> +		if (IS_ERR(st))
>>> +			return PTR_ERR(st);
>>>
>>> -	__i915_gem_object_set_pages(obj, st, i915_sg_dma_sizes(st->sgl));
>>> +		__i915_gem_object_set_pages(obj, st,
>>> i915_sg_dma_sizes(st->sgl));
>>> +	}
>>>
>>> 	return ret;
>>> }
>>>
>>> +static int i915_ttm_get_pages(struct drm_i915_gem_object *obj)
>>> +{
>>> +	struct ttm_place requested, busy[I915_TTM_MAX_PLACEMENTS];
>>> +	struct ttm_placement placement;
>>> +
>>> +	GEM_BUG_ON(obj->mm.n_placements >
>>> I915_TTM_MAX_PLACEMENTS);
>>> +
>>> +	/* Move to the requested placement. */
>>> +	i915_ttm_placement_from_obj(obj, &requested, busy, &placement);
>>> +
>>> +	return __i915_ttm_get_pages(obj, &placement);
>>> +}
>>> +
>>> +static int i915_ttm_migrate(struct drm_i915_gem_object *obj,
>>> +			    struct intel_memory_region *mr)
>>> +{
>>> +	struct ttm_place requested;
>>> +	struct ttm_placement placement;
>>> +	int ret;
>>> +
>>> +	i915_ttm_place_from_region(mr, &requested, obj->flags);
>>> +	placement.num_placement = 1;
>>> +	placement.num_busy_placement = 1;
>>> +	placement.placement = &requested;
>>> +	placement.busy_placement = &requested;
>>> +
>>> +	ret = __i915_ttm_get_pages(obj, &placement);
>>> +	if (ret)
>>> +		return ret;
>>> +
>>> +	if (obj->mm.region != mr) {
>>> +		i915_gem_object_release_memory_region(obj);
>>> +		i915_gem_object_init_memory_region(obj, mr);
>>> +	}
>> Perhaps a minor nit:
>>
>> Doing this after we have done the _get_pages() just doesn't seem right.
>>
>> I.e. we do work on the object, and then we init some portion of it.
>>
>> Do we need to do this incase the migration/placement fails?  If so,
>> maybe a comment to that effect?
>
>This is simply switching memory region under the lock, and to also move
>to another memory region list. Is it the naming _release_ and _init_
>that is confusing?

Hmm, re-reading my question, I am wondering if I was unclear.

My "real" question was, can the release/init occur before the _get_pages()?

But looking at this some more, I can see answer is probably no.

I was going to suggest calling _init_ _set_, but when I looked at it, it was
doing init things as well as setting things

Maybe just a comment like:

/* Complete the migration by updating the memory region info. */
if (object->mm...)

?

M

>/Thomas
>
>
>>
>> Thanks,
>>
>> Mike
>>
>>> +	return 0;
>>> +}
>>> +
>>> static void i915_ttm_put_pages(struct drm_i915_gem_object *obj,
>>> 			       struct sg_table *st)
>>> {
>>> @@ -814,6 +848,7 @@ static const struct drm_i915_gem_object_ops
>>> i915_gem_ttm_obj_ops = {
>>> 	.truncate = i915_ttm_purge,
>>> 	.adjust_lru = i915_ttm_adjust_lru,
>>> 	.delayed_free = i915_ttm_delayed_free,
>>> +	.migrate = i915_ttm_migrate,
>>> 	.mmap_offset = i915_ttm_mmap_offset,
>>> 	.mmap_ops = &vm_ops_ttm,
>>> };
>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_wait.c
>>> b/drivers/gpu/drm/i915/gem/i915_gem_wait.c
>>> index 1070d3afdce7..f909aaa09d9c 100644
>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_wait.c
>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_wait.c
>>> @@ -290,3 +290,22 @@ i915_gem_wait_ioctl(struct drm_device *dev, void
>>> *data, struct drm_file *file)
>>> 	i915_gem_object_put(obj);
>>> 	return ret;
>>> }
>>> +
>>> +/**
>>> + * i915_gem_object_wait_migration - Sync an accelerated migration
>>> operation
>>> + * @obj: The migrating object.
>>> + * @flags: waiting flags. Currently supports only
>I915_WAIT_INTERRUPTIBLE.
>>> + *
>>> + * Wait for any pending async migration operation on the object,
>>> + * whether it's explicitly (i915_gem_object_migrate()) or implicitly
>>> + * (swapin, initial clearing) initiated.
>>> + *
>>> + * Return: 0 if successful, -ERESTARTSYS if a signal was hit during waiting.
>>> + */
>>> +int i915_gem_object_wait_migration(struct drm_i915_gem_object *obj,
>>> +				   unsigned int flags)
>>> +{
>>> +	might_sleep();
>>> +	/* NOP for now. */
>>> +	return 0;
>>> +}
>>> --
>>> 2.31.1