Re: [RFC] Per file OOM badness

Eric Anholt <eric@xxxxxxxxxx> · Sun, 21 Jan 2018 17:50:39 +1100

Michel Dänzer <michel@xxxxxxxxxxx> writes:

> On 2018-01-19 11:02 AM, Michel Dänzer wrote:
>> On 2018-01-19 10:58 AM, Christian König wrote:
>>> Am 19.01.2018 um 10:32 schrieb Michel Dänzer:
>>>> On 2018-01-19 09:39 AM, Christian König wrote:
>>>>> Am 19.01.2018 um 09:20 schrieb Michal Hocko:
>>>>>> OK, in that case I would propose a different approach. We already
>>>>>> have rss_stat. So why don't we simply add a new counter there
>>>>>> MM_KERNELPAGES and consider those in oom_badness? The rule would be
>>>>>> that such a memory is bound to the process life time. I guess we will
>>>>>> find more users for this later.
>>>>> I already tried that and the problem with that approach is that some
>>>>> buffers are not created by the application which actually uses them.
>>>>>
>>>>> For example X/Wayland is creating and handing out render buffers to
>>>>> applications which want to use OpenGL.
>>>>>
>>>>> So the result is when you always account the application who created the
>>>>> buffer the OOM killer will certainly reap X/Wayland first. And that is
>>>>> exactly what we want to avoid here.
>>>> FWIW, what you describe is true with DRI2, but not with DRI3 or Wayland
>>>> anymore. With DRI3 and Wayland, buffers are allocated by the clients and
>>>> then shared with the X / Wayland server.
>>>
>>> Good point, when I initially looked at that problem DRI3 wasn't widely
>>> used yet.
>>>
>>>> Also, in all cases, the amount of memory allocated for buffers shared
>>>> between DRI/Wayland clients and the server should be relatively small
>>>> compared to the amount of memory allocated for buffers used only locally
>>>> in the client, particularly for clients which create significant memory
>>>> pressure.
>>>
>>> That is unfortunately only partially true. When you have a single
>>> runaway application which tries to allocate everything it would indeed
>>> work as you described.
>>>
>>> But when I tested this a few years ago with an X based desktop, the
>>> applications which actually used most of the memory were Firefox and
>>> Thunderbird. Unfortunately they never got accounted for that.
>>>
>>> Now, on my current Wayland based desktop it actually doesn't look much
>>> better. Taking a look at radeon_gem_info/amdgpu_gem_info the majority of
>>> all memory was allocated either by gnome-shell or Xwayland.
>> 
>> My guess would be this is due to pixmaps, which allow X clients to cause
>> the X server to allocate essentially unlimited amounts of memory. It's a
>> separate issue, which would require a different solution than what we're
>> discussing in this thread. Maybe something that would allow the X server
>> to tell the kernel that some of the memory it allocates is for the
>> client process.
>
> Of course, such a mechanism could probably be abused to incorrectly
> blame other processes for one's own memory consumption...
>
>
> I'm not sure if the pixmap issue can be solved for the OOM killer. It's
> an X design issue which is fixed with Wayland. So it's probably better
> to ignore it for this discussion.
>
> Also, I really think the issue with DRM buffers being shared between
> processes isn't significant for the OOM killer compared to DRM buffers
> only used in the same process that allocates them. So I suggest focusing
> on the latter.

Agreed.  The 95% case is non-shared buffers, so just don't account for
the shared ones and we'll have a solution good enough that we probably
never need to handle the shared case.  On the DRM side, removing buffers
from the accounting once they get shared would be easy.
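
One way to picture that accounting rule is the userspace sketch below.
Everything in it is made up for illustration: struct proc_acct, struct
buffer and the buffer_*() helpers are hypothetical names, not existing DRM
or mm interfaces, and real code would need locking and would have to hook
into the actual buffer object lifetime.  It only shows the idea: charge
the allocating process while a buffer is private, and drop the charge for
good the first time the buffer is shared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-process counter, standing in for an rss_stat-style entry. */
struct proc_acct {
    size_t buffer_pages;        /* pages of private buffers charged here */
};

/* Hypothetical buffer object. */
struct buffer {
    struct proc_acct *owner;    /* process charged for it, NULL once uncharged */
    size_t pages;
    bool shared;
};

/* Charge a freshly allocated, still private buffer to its creator. */
static void buffer_create(struct buffer *bo, struct proc_acct *owner,
                          size_t pages)
{
    bo->owner = owner;
    bo->pages = pages;
    bo->shared = false;
    owner->buffer_pages += pages;
}

/* Drop the charge, if there still is one.  Safe to call more than once. */
static void buffer_uncharge(struct buffer *bo)
{
    if (bo->owner) {
        assert(bo->owner->buffer_pages >= bo->pages);
        bo->owner->buffer_pages -= bo->pages;
        bo->owner = NULL;
    }
}

/* First export to another process: stop accounting the buffer at all. */
static void buffer_share(struct buffer *bo)
{
    bo->shared = true;
    buffer_uncharge(bo);
}

/* Freeing a buffer removes whatever charge is left (none if it was shared). */
static void buffer_destroy(struct buffer *bo)
{
    buffer_uncharge(bo);
    bo->pages = 0;
}

/* What an OOM heuristic could add on top of the ordinary RSS figure. */
static size_t proc_badness_pages(const struct proc_acct *p, size_t rss_pages)
{
    return rss_pages + p->buffer_pages;
}
```

With this rule a shared buffer is charged to nobody, so the display server
is not blamed for it, while a client that hoards private buffers still
sees them in its score.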