On Mon 15-03-21 08:48:26, Shakeel Butt wrote:
> On Mon, Mar 15, 2021 at 6:27 AM Borislav Petkov <bp@xxxxxxxxx> wrote:
> >
> > On Mon, Mar 15, 2021 at 03:24:01PM +0300, Vasily Averin wrote:
> > > Unprivileged user inside memcg-limited container can create
> > > non-accounted multi-page per-thread kernel objects for LDT
> >
> > I have hard time parsing this commit message.
> >
> > And I'm CCed only on patch 8 of what looks like a patchset.
> >
> > And that patchset is not on lkml so I can't find the rest to read about
> > it, perhaps linux-mm.
> >
> > /me goes and finds it on lore
> >
> > I can see some bits and pieces, this, for example:
> >
> > https://lore.kernel.org/linux-mm/05c448c7-d992-8d80-b423-b80bf5446d7c@xxxxxxxxxxxxx/
> >
> > ( Btw, that version has your SOB and this patch doesn't even have a
> > Signed-off-by. Next time, run your whole set through checkpatch please
> > before sending. )
> >
> > Now, this URL above talks about OOM, ok, that gets me close to the "why"
> > this patch.
> >
> > From a quick look at the ldt.c code, we allow a single LDT struct per
> > mm. Manpage says so too:
> >
> > DESCRIPTION
> >        modify_ldt() reads or writes the local descriptor table (LDT) for a process.
> >        The LDT is an array of segment descriptors that can be referenced by user code.
> >        Linux allows processes to configure a per-process (actually per-mm) LDT.
> >
> > We allow
> >
> > /* Maximum number of LDT entries supported. */
> > #define LDT_ENTRIES     8192
> >
> > so there's an upper limit per mm.
> >
> > Now, please explain what is this accounting for?
> >
>
> Let me try to provide the reasoning at least from my perspective.
> There are legitimate workloads with hundreds of processes and there
> can be hundreds of workloads running on large machines. The
> unaccounted memory can cause isolation issues between the workloads
> particularly on highly utilized machines.

It would be better to be explicit:

	8192 entries * 8 bytes = 64kB per LDT, times number_of_tasks

so realistically this is in the range of lower megabytes. Is this worth the
memcg accounting overhead? Maybe yes, but what kind of workloads really
care?
-- 
Michal Hocko
SUSE Labs
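
To make the bound above concrete, here is a minimal sketch of the arithmetic in C. It assumes only the two figures given in the thread (8192 entries per LDT, 8 bytes per entry). The helper name ldt_worst_case_bytes() and the LDT_ENTRY_BYTES macro are made up for illustration and are not kernel API. It also assumes every task has its own mm with a fully populated LDT, which is the worst case rather than a typical one.

```c
#include <stdint.h>

/* Figures quoted in the thread: up to 8192 LDT entries per mm. */
#define LDT_ENTRIES      8192u
/* 8 bytes per entry, as used in the estimate above (illustrative name). */
#define LDT_ENTRY_BYTES  8u

/*
 * Hypothetical helper: upper bound on LDT memory, in bytes, for
 * nr_tasks tasks that each own a separate mm with a full LDT.
 */
static uint64_t ldt_worst_case_bytes(uint64_t nr_tasks)
{
	return (uint64_t)LDT_ENTRIES * LDT_ENTRY_BYTES * nr_tasks;
}
```

With these assumptions, ldt_worst_case_bytes(1) is 65536 bytes (64kB) and ldt_worst_case_bytes(100) is 6553600 bytes (6.25MB), which is consistent with the "lower megabytes" estimate for a workload with hundreds of processes.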