On Sat, Mar 17, 2012 at 3:09 PM, Hugh Dickins <hughd@xxxxxxxxxx> wrote: > > I've got to go out for an hour: I'll digest more and think more about > this when I get back. If someone could explain the original problem > with _MOVABLE, that would help me: I do not believe we actually ever uncovered the original problem with _MOVABLE: the problem was bisected down to the stable-backported version of commit 4bdadb978569 ("drm/i915: Selectively enable self-reclaim"), and I looked at the changes and decided that one of the main ones was the removal of the mapping_set_gfp_mask() - which resulted in __GFP_MOVABLE being on for the mapping. So we just tried re-introducing that, and it fixed the problem. Exactly *why* movable pages are a problem was never all that clear. The assumption was (and I guess still is) that there are physical pointers to the pages in the DRM pages themselves. Quoting from the original thread: "I could easily see that something would get very unhappy and corrupt memory if the suspend-to-disk phase ends up compacting memory and moving the pages around from under the i915 driver." but I didn't actually see why the i915 page pinning would be defeated by __GFP_MOVABLE. The code does get a reference to them afaik. So no, we don't have a real first cause. We just have a band-aid for the symptom. It might be a pure bug in hibernation that just assumes that any pages in MOVABLE zones can always be moved without even checking any refcounts, so not using __GFP_MOVABLE might just be working around problems elsewhere. Or it's entirely possible that the DRM layer does *not* increment the refcounts of the pages properly, and that hibernate with movable pages is just a very efficient way of showing the problems that results in. > A factor which might turn out to be relevant: swapin_readahead() (or > swapoff) brings in pages from swap before it knows who they belong to, > so the swapped-in swapcache pages don't necessarily have the limitations > mapping_set_gfp_mask() has asked for. It's been discussed with Alan, > we know it will need fixing for gma500 when eventually that comes to > be used on machines with more than 4GB, but for now I wasn't aware of > a problem. So i915 shouldn't much care about highmem vs non-highmem, it's just that it does hold on to pages and take their physical address. But as far as I can tell, whenever the physical address is in use, it does have a refcount to the page. So for example, i915_gem_object_get_pages_gtt() will use shmem_read_mapping_page_gfp() which will increment the page count for the page it gets, so all the obj->pages[] entries should have properly incremented page counts. And they get released by i915_gem_object_put_pages_gtt(), but maybe that is called too early while the pages are still in use by the GFX unit... If there is a refcounting issue, hibernate + __GFP_MOVABLE might just be a great way to see it. I don't really know the code, though. Linus _______________________________________________ dri-devel mailing list dri-devel@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/dri-devel