On Tue, May 24, 2022 at 7:57 AM Rob Clark <robdclark@xxxxxxxxx> wrote:
>
> On Tue, May 24, 2022 at 6:45 AM Tvrtko Ursulin
> <tvrtko.ursulin@xxxxxxxxxxxxxxx> wrote:
> >
> > On 23/05/2022 23:53, Rob Clark wrote:
> > >
> > > btw, one fun (but unrelated) issue I'm hitting with scheduler... I'm
> > > trying to add an igt test to stress shrinker/eviction, similar to the
> > > existing tests/i915/gem_shrink.c.  But we hit an unfortunate
> > > combination of circumstances:
> > > 1. Pinning memory happens in the synchronous part of the submit ioctl,
> > > before enqueuing the job for the kthread to handle.
> > > 2. The first run_job() callback incurs a slight delay (~1.5ms) while
> > > resuming the GPU.
> > > 3. Because of that delay, userspace has a chance to queue up enough
> > > more jobs to require locking/pinning more than the available system
> > > RAM..
> >
> > Is that one or multiple threads submitting jobs?
>
> In this case multiple.. but I think it could also happen with a single
> thread (provided it didn't stall on a fence, directly or indirectly,
> from an earlier submit), because of how resume and actual job
> submission happen from the scheduler kthread.
>
> > > I'm not sure if we want a way to prevent userspace from getting *too*
> > > far ahead of the kthread.  Or maybe at some point the shrinker should
> > > sleep on non-idle buffers?
> >
> > On the direct reclaim path when invoked from the submit ioctl? In i915
> > we only shrink idle objects on direct reclaim and leave active ones for
> > the swapper. Whether you could do the same depends on what your locking
> > looks like, and whether there would be coupling of locks and fs-reclaim
> > context.
>
> I think the locking is more or less ok, although lockdep is unhappy
> about one thing[1], which I think is a false warning (i.e. it doesn't
> recognize that we'd already successfully acquired the obj lock via
> trylock).  We can already reclaim idle bo's in this path.  But the
> problem with a bunch of submits queued up in the scheduler is that
> they are already considered pinned and active.  So at some point we
> need to sleep (hopefully interruptibly) until they are no longer
> active, i.e. to throttle userspace trying to shove in more submits
> until some of the enqueued ones have had a chance to run and complete.
>
> BR,
> -R
>
> [1] https://gitlab.freedesktop.org/drm/msm/-/issues/14

btw, one thing I'm thinking about is __GFP_RETRY_MAYFAIL for gem
bo's..  I'd need to think about the various code paths that could
trigger us to need to allocate pages, but short-circuiting the
out_of_memory() path deep in drm_gem_get_pages() ->
shmem_read_mapping_page() -> ... -> __alloc_pages_may_oom(), and
letting the driver decide itself whether there is queued work worth
waiting on (and, if not, calling out_of_memory() directly itself),
seems like a possible solution (rough sketch below)..  That also
untangles the interrupted-syscall case, so we don't end up having to
block in a non-interruptible way.  Seems like it might work?

BR,
-R
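
PS, in case it isn't clear what I mean, a rough (completely untested)
sketch of the idea.  msm_gpu_has_queued_submits() and
msm_gpu_wait_for_retire() are made-up placeholders for whatever the
driver would actually use to check for and throttle on enqueued work:

static struct page **get_pages_retry(struct drm_gem_object *obj)
{
	struct page **pages;
	int ret;

	/*
	 * Fail the allocation instead of invoking the OOM killer deep
	 * inside shmem_read_mapping_page():
	 */
	mapping_set_gfp_mask(obj->filp->f_mapping,
			     GFP_HIGHUSER | __GFP_RETRY_MAYFAIL);

	for (;;) {
		pages = drm_gem_get_pages(obj);
		if (!IS_ERR(pages) || (PTR_ERR(pages) != -ENOMEM))
			return pages;

		/*
		 * -ENOMEM: if there are submits queued up which will
		 * release pinned memory once they retire, sleep
		 * (interruptibly) and retry.  Otherwise give up and
		 * let the caller decide whether to call
		 * out_of_memory() itself.
		 */
		if (!msm_gpu_has_queued_submits())
			return pages;

		ret = msm_gpu_wait_for_retire();
		if (ret)
			return ERR_PTR(ret);  /* interrupted, restart syscall */
	}
}

The important part is that the allocation failure propagates back out
to the driver, where we can sleep interruptibly on queued work instead
of blocking non-interruptibly inside the allocator.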