Re: [PATCH 6/6] drm/i915: Migrate stolen objects before hibernation

Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxxxxxxx> · Fri, 11 Dec 2015 12:33:16 +0000

On 11/12/15 05:16, Ankitprasad Sharma wrote:
On Thu, 2015-12-10 at 14:15 +0000, Tvrtko Ursulin wrote:
On 10/12/15 13:17, Ankitprasad Sharma wrote:
On Thu, 2015-12-10 at 09:43 +0000, Tvrtko Ursulin wrote:
Hi,

Two more comments below:

On 09/12/15 12:46, ankitprasad.r.sharma@xxxxxxxxx wrote:
From: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>

Ville reminded us that stolen memory is not preserved across
hibernation, and a result of this was that context objects now being
allocated from stolen were being corrupted on S4 and promptly hanging
the GPU on resume.

We want to utilise stolen for as much as possible (nothing else will use
that wasted memory otherwise), so we need a strategy for handling
general objects allocated from stolen and hibernation. A simple solution
is to do a CPU copy through the GTT of the stolen object into a fresh
shmemfs backing store and thenceforth treat it as a normal objects. This
can be refined in future to either use a GPU copy to avoid the slow
uncached reads (though it's hibernation!) and recreate stolen objects
upon resume/first-use. For now, a simple approach should suffice for
testing the object migration.

Mention of "testing" in the commit message and absence of a path to
migrate the objects back to stolen memory on resume makes me think this
is kind of half finished and note really ready for review / merge ?

Because I don't see how it is useful to migrate it one way and never
move back?
I think that this is not much of a problem, as the purpose here is to
keep the object intact, to avoid breaking anything.
So as far as objects are concerned they will be in shmem and can be used
without any issue, and the stolen memory will be free again for other
usage from the user.

I am not sure that is a good state of things.

One of the things it means is that when user wanted to create an object
in stolen memory, after resume it will not be any more. So what is the
point in failing stolen object creation when area is full in the first
place? We could just return a normal object instead.
I agree with you, but the absence of a reverse path will not affect the
user in any way, though the user may be under the wrong impression that
the buffer is residing inside the stolen area.

Yes, and since suspend-resume is rather a frequent use case it brings 
the whole point of stolen into question for me unless implemented fully.



Then the question of objects which are allocated in stolen by the
driver. Are they being re-allocated on resume or will also be stuck in
shmemfs from then onward?
Objects allocated by the driver need not be preserved (we use a
internal_volatile flag for those). These are not migrated to the shmemfs
and are later re-populated by the driver, when used again after resume.

Good then, it wasn't clear to me from that comment about HEAD and TAIL etc.


And finally, one corner case might be that shmemfs plus stolen is a
larger sum which will be attempted to restored in shmemfs only on
resume. Will that always work if everything is fully populated and what
will happen if we run out of space?
As per my understanding,
shmemfs size will get increased, due to migration, before the
hibernation itself. And if not everything from shmemfs can be stored in
RAM, swap-out will take care of it.
Whatever was stored in the RAM will be restored on resume, rest all will
remain in the swap.

Yes but still there is a change it won't fit is someone is running at 
the limit and maybe with no swap, no?


At minimum all this should be discussed and explicitly documented in the
commit message.

Would it be difficult to implement the reverse path?
I will try to explore the reverse path as well. But that can be
submitted separately as a follow-up patch.

I don't think so because of what I wrote above. Especially since Ville 
said corruption can even happen in S3, but even if we only think of S4, 
for me that is such a common on frequent use-case that having the 
ability to request object be in stolen, only for that to be silently 
changed on resume is not worth it.

SO until the migration both ways is there I don't see what is the point 
of allowing stolen objects and even failing the creation if there is not 
enough space there.


v2:
Swap PTE for pinned bindings over to the shmemfs. This adds a
complicated dance, but is required as many stolen objects are likely to
be pinned for use by the hardware. Swapping the PTEs should not result
in externally visible behaviour, as each PTE update should be atomic and
the two pages identical. (danvet)

safe-by-default, or the principle of least surprise. We need a new flag
to mark objects that we can wilfully discard and recreate across
hibernation. (danvet)

Just use the global_list rather than invent a new stolen_list. This is
the slowpath hibernate and so adding a new list and the associated
complexity isn't worth it.

v3: Rebased on drm-intel-nightly (Ankit)

v4: Use insert_page to map stolen memory backed pages for migration to
shmem (Chris)

v5: Acquire mutex lock while copying stolen buffer objects to shmem (Chris)

Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
Signed-off-by: Ankitprasad Sharma <ankitprasad.r.sharma@xxxxxxxxx>
---
    drivers/gpu/drm/i915/i915_drv.c         |  17 ++-
    drivers/gpu/drm/i915/i915_drv.h         |   7 +
    drivers/gpu/drm/i915/i915_gem.c         | 232 ++++++++++++++++++++++++++++++--
    drivers/gpu/drm/i915/intel_display.c    |   3 +
    drivers/gpu/drm/i915/intel_fbdev.c      |   6 +
    drivers/gpu/drm/i915/intel_pm.c         |   2 +
    drivers/gpu/drm/i915/intel_ringbuffer.c |   6 +
    7 files changed, 261 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 9f55209..2bb9e9e 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1036,6 +1036,21 @@ static int i915_pm_suspend(struct device *dev)
    	return i915_drm_suspend(drm_dev);
    }

+static int i915_pm_freeze(struct device *dev)
+{
+	int ret;
+
+	ret = i915_gem_freeze(pci_get_drvdata(to_pci_dev(dev)));
+	if (ret)
+		return ret;

One of the first steps in idling GEM seems to be idling the GPU and
retiring requests.

Would it also make sense to do those steps before attempting to migrate
the stolen objects?
Here, we do that implicitly when trying to do a vma_unbind for the
object.

Code paths are not the same so it makes me uncomfortable.  It looks more
logical to do the migration after the existing i915_gem_suspend. It
would mean some code duplication, true (maybe split i915_drm_suspend in
two and call i915_gem_freeze in between), but to me it looks more like a
proper place to do it.

All inactive stolen objects will be migrated immediately and the active
ones will be implicitly synchronized in vma_unbind.
Would it be more appropriate to rename i915_gem_freeze to
i915_gem_migrate_stolen?

But anyway, we will wait for Chris or Ville's inputs.

Ping! :)

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/intel-gfx