On Wed, Jul 18, 2018 at 03:36:13PM +0200, Vlastimil Babka wrote: > v3 changes: > - fix missing hunk in patch 5/7 > - more verbose cover letter and patch 6/7 commit log > v2 changes: > - shorten cache names to kmalloc-rcl-<SIZE> > - last patch shortens <SIZE> for all kmalloc caches to e.g. "1k", "4M" > - include dma caches to the 2D kmalloc_caches[] array to avoid a branch > - vmstat counter nr_indirectly_reclaimable_bytes renamed to > nr_kernel_misc_reclaimable, doesn't include kmalloc-rcl-* > - /proc/meminfo counter renamed to KReclaimable, includes kmalloc-rcl* > and nr_kernel_misc_reclaimable > > Hi, > > as discussed at LSF/MM [1] here's a patchset that introduces > kmalloc-reclaimable caches (more details in the second patch) and uses them for > SLAB freelists and dcache external names. The latter allows us to repurpose the > NR_INDIRECTLY_RECLAIMABLE_BYTES counter later in the series. > > With patch 4/7, dcache external names are allocated from kmalloc-rcl-* > caches, eliminating the need for manual accounting. More importantly, it > also ensures the reclaimable kmalloc allocations are grouped in pages > separate from the regular kmalloc allocations. The need for proper > accounting of dcache external names has shown it's easy for misbehaving > process to allocate lots of them, causing premature OOMs. Without the > added grouping, it's likely that a similar workload can interleave the > dcache external names allocations with regular kmalloc allocations > (note: I haven't searched myself for an example of such regular kmalloc > allocation, but I would be very surprised if there wasn't some). A > pathological case would be e.g. one 64byte regular allocations with 63 > external dcache names in a page (64x64=4096), which means the page is > not freed even after reclaiming after all dcache names, and the process > can thus "steal" the whole page with single 64byte allocation. > > If there other kmalloc users similar to dcache external names become > identified, they can also benefit from the new functionality simply by > adding __GFP_RECLAIMABLE to the kmalloc calls. > > Side benefits of the patchset (that could be also merged separately) > include removed branch for detecting __GFP_DMA kmalloc(), and shortening > kmalloc cache names in /proc/slabinfo output. The latter is potentially > an ABI break in case there are tools parsing the names and expecting the > values to be in bytes. > > This is how /proc/slabinfo looks like after booting in virtme: > > ... > kmalloc-rcl-4M 0 0 4194304 1 1024 : tunables 1 1 0 : slabdata 0 0 0 > ... > kmalloc-rcl-96 7 32 128 32 1 : tunables 120 60 8 : slabdata 1 1 0 > kmalloc-rcl-64 25 128 64 64 1 : tunables 120 60 8 : slabdata 2 2 0 > kmalloc-rcl-32 0 0 32 124 1 : tunables 120 60 8 : slabdata 0 0 0 > kmalloc-4M 0 0 4194304 1 1024 : tunables 1 1 0 : slabdata 0 0 0 > kmalloc-2M 0 0 2097152 1 512 : tunables 1 1 0 : slabdata 0 0 0 > kmalloc-1M 0 0 1048576 1 256 : tunables 1 1 0 : slabdata 0 0 0 > ... > > /proc/vmstat with renamed nr_indirectly_reclaimable_bytes counter: > > ... > nr_slab_reclaimable 2817 > nr_slab_unreclaimable 1781 > ... > nr_kernel_misc_reclaimable 0 > ... > > /proc/meminfo with new KReclaimable counter: > > ... > Shmem: 564 kB > KReclaimable: 11260 kB > Slab: 18368 kB > SReclaimable: 11260 kB > SUnreclaim: 7108 kB > KernelStack: 1248 kB > ... > > Thanks, > Vlastimil Hi, Vlastimil! Overall the patchset looks solid to me. Please, feel free to add Acked-by: Roman Gushchin <guro@xxxxxx> Two small nits: 1) The last patch is unrelated to the main idea, and can potentially cause ABI breakage. I'd separate it from the rest of the patchset. 2) It's actually re-opening the security issue for SLOB users. Is the memory overhead really big enough to justify that? Thanks!