On Wed, 13 Aug 2014 17:13:56 +0200
Thomas Hellstrom <thellstrom@xxxxxxxxxx> wrote:

> On 08/13/2014 03:01 PM, Daniel Vetter wrote:
> > On Wed, Aug 13, 2014 at 02:35:52PM +0200, Thomas Hellstrom wrote:
> >> On 08/13/2014 12:42 PM, Daniel Vetter wrote:
> >>> On Wed, Aug 13, 2014 at 11:06:25AM +0200, Thomas Hellstrom wrote:
> >>>> On 08/13/2014 05:52 AM, Jérôme Glisse wrote:
> >>>>> From: Jérôme Glisse <jglisse@xxxxxxxxxx>
> >>>>>
> >>>>> When experiencing memory pressure we want to minimize pool size so that
> >>>>> memory we just shrinked is not added back again just as the next thing.
> >>>>>
> >>>>> This will divide by 2 the maximum pool size for each device each time
> >>>>> the pool have to shrink. The limit is bumped again is next allocation
> >>>>> happen after one second since the last shrink. The one second delay is
> >>>>> obviously an arbitrary choice.
> >>>> Jérôme,
> >>>>
> >>>> I don't like this patch. It adds extra complexity and its usefulness is
> >>>> highly questionable.
> >>>> There are a number of caches in the system, and if all of them added
> >>>> some sort of voluntary shrink heuristics like this, we'd end up with
> >>>> impossible-to-debug unpredictable performance issues.
> >>>>
> >>>> We should let the memory subsystem decide when to reclaim pages from
> >>>> caches and what caches to reclaim them from.
> >>> Yeah, artificially limiting your cache from growing when your shrinker
> >>> gets called will just break the equal-memory pressure the core mm uses to
> >>> rebalance between all caches when workload changes. In i915 we let
> >>> everything grow without artificial bounds and only rely upon the shrinker
> >>> callbacks to ensure we don't consume more than our fair share of available
> >>> memory overall.
> >>> -Daniel
> >> Now when you bring i915 memory usage up, Daniel,
> >> I can't refrain from bringing up the old user-space unreclaimable kernel
> >> memory issue, for which gem open is a good example ;) Each time
> >> user-space opens a gem handle, some un-reclaimable kernel memory is
> >> allocated, for which there is no accounting, so theoretically I think a
> >> user can bring a system to unusability this way.
> >>
> >> Typically there are various limits on unreclaimable objects like this,
> >> like open file descriptors, and IIRC the kernel even has an internal
> >> limit on the number of struct files you initialize, based on the
> >> available system memory, so dma-buf / prime should already have some
> >> sort of protection.
> > Oh yeah, we have zero cgroups limits or similar stuff for gem allocations,
> > so there's not really a way to isolate gpu memory usage in a sane way for
> > specific processes. But there's also zero limits on actual gpu usage
> > itself (timeslices or whatever) so I guess no one asked for this yet.
>
> In its simplest form (like in TTM if correctly implemented by drivers)
> this type of accounting stops non-privileged malicious GPU-users from
> exhausting all system physical memory causing grief for other kernel
> systems but not from causing grief for other GPU users. I think that's
> the minimum level that's intended also for example also for the struct
> file accounting.
>
> > My comment really was about balancing mm users under the assumption that
> > they're all unlimited.
>
> Yeah, sorry for stealing the thread. I usually bring this up now and
> again but nowadays with an exponential backoff.

Yeah I agree we're missing some good limits stuff in i915 and DRM in
general probably.  Chris started looking at this awhile back, but I
haven't seen anything recently.  Tying into the ulimits/rlimits might
make sense, and at the very least we need to account for things
properly so we can add new limits where needed.
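
As a rough illustration of the "account for things properly" part, here
is a userspace sketch of charging each unreclaimable allocation against
a per-user counter with a ceiling.  Everything in it is an assumption
made for illustration: struct gem_account, gem_account_charge() and
gem_account_uncharge() are made-up names, not existing DRM, i915 or TTM
code, and where the limit would come from (an rlimit or something else)
is left open.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical per-user accounting record; not an existing kernel structure. */
struct gem_account {
	size_t used;	/* bytes of unreclaimable memory currently charged */
	size_t limit;	/* ceiling, e.g. taken from an rlimit-style setting */
};

/*
 * Charge `bytes` before allocating.  Returns 0 on success or -ENOMEM if
 * the charge would push the account past its limit.
 */
static int gem_account_charge(struct gem_account *acct, size_t bytes)
{
	/* The limit may have been lowered below current usage. */
	if (acct->used > acct->limit)
		return -ENOMEM;
	/* Compare against the remaining room so used + bytes cannot overflow. */
	if (bytes > acct->limit - acct->used)
		return -ENOMEM;
	acct->used += bytes;
	return 0;
}

/* Give back a previous charge when the object is freed. */
static void gem_account_uncharge(struct gem_account *acct, size_t bytes)
{
	assert(bytes <= acct->used);	/* uncharging more than charged is a bug */
	acct->used -= bytes;
}
```

A caller would charge before allocating and fail the open/create with
the returned error, then uncharge on free.  This only covers the
minimum level discussed above (one user cannot exhaust system memory);
it does nothing for fairness between GPU users, and a real version
would need atomic counters or locking.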

--
Jesse Barnes, Intel Open Source Technology Center
_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/dri-devel