On 08/03/2018 04:13 PM, Marinko Catovic wrote:
> Thanks for the analysis.
> 
> So since I am no mem management dev, what exactly does this mean?
> Is there any way of workaround or quickfix or something that can/will
> be fixed at some point in time?

A workaround would be the manual / periodic cache flushing, unfortunately.
Maybe a memcg with a kmemcg limit? Michal would know more. There is a rough
sketch of both at the end of this mail.
A long-term generic solution will be much harder to find :(

> I can not imagine that I am the only one who is affected by this, nor do I
> know why my use case would be so much different from any other.
> Most 'cloud' services should be affected as well.

Hmm, your workload might be specific in being hungry for fs metadata and not
so much for data (page cache). And/or there is some source of high-order
allocations that others don't have, possibly related to some piece of
hardware?

> Tell me if you need any other snapshots or whatever info.
> 
> 2018-08-02 18:15 GMT+02:00 Vlastimil Babka <vbabka@xxxxxxx>:
> 
>     On 07/31/2018 12:08 AM, Marinko Catovic wrote:
>     >
>     >> Can you provide (a single snapshot) /proc/pagetypeinfo and
>     >> /proc/slabinfo from a system that's currently experiencing the issue,
>     >> also with /proc/vmstat and /proc/zoneinfo to verify? Thanks.
>     >
>     > your request came in just one day after I 2>drop_caches again when the
>     > ram usage was really really low again. Up until now it did not reoccur
>     > on any of the 2 hosts, where one shows 550MB/11G with 37G of totally
>     > free ram for now - so not that low like last time when I dropped it,
>     > I think it was like 300M/8G or so, but I hope it helps:
> 
>     Thanks.
> 
>     > /proc/pagetypeinfo https://pastebin.com/6QWEZagL
> 
>     Yep, looks like fragmented by reclaimable slabs:
> 
>     Node    0, zone   Normal, type    Unmovable  29101  32754   8372   2790   1334    354     23      3      4      0      0
>     Node    0, zone   Normal, type      Movable 142449  83386  99426  69177  36761  12931   1378     24      0      0      0
>     Node    0, zone   Normal, type  Reclaimable 467195 530638 355045 192638  80358  15627   2029    231     18      0      0
> 
>     Number of blocks type     Unmovable      Movable  Reclaimable   HighAtomic      Isolate
>     Node 0, zone      DMA            1            7            0            0            0
>     Node 0, zone    DMA32           34          703          375            0            0
>     Node 0, zone   Normal         1672        14276        15659            1            0
> 
>     Half of the memory is marked as reclaimable (2 megabyte) pageblocks.
>     zoneinfo has nr_slab_reclaimable 1679817, so the reclaimable slabs occupy
>     only about 3280 pageblocks (~6.4G), yet they are spread over 5 times as
>     many. It's also possible they pollute the Movable pageblocks as well, but
>     the stats can't tell us. Either the page grouping mobility heuristics are
>     broken here, or the worst-case scenario happened - memory was at some
>     point really wholly filled with reclaimable slabs, and the rather random
>     reclaim did not result in whole pageblocks being freed.
> 
>     > /proc/slabinfo https://pastebin.com/81QAFgke
> 
>     Largest caches seem to be:
>     # name             <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
>     ext4_inode_cache   3107754  3759573   1080    3    1 : tunables   24   12    8 : slabdata 1253191 1253191      0
>     dentry             2840237  7328181    192   21    1 : tunables  120   60    8 : slabdata  348961  348961    120
> 
>     The internal fragmentation of the dentry cache is significant as well.
>     Dunno if some of those objects pin movable pages as well...
> 
>     So it looks like there's insufficient slab reclaim (shrinker activity),
>     and possibly problems with page grouping by mobility heuristics as well...
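
(As an aside, the estimate above is easy to re-check directly from /proc on
a live system. A rough sketch, assuming 4K pages, 2MB/order-9 pageblocks and
the column layout shown above; run as root:

    # reclaimable slab pages, and how many 2MB pageblocks (512 pages each)
    # they would need if perfectly packed
    awk '$1 == "nr_slab_reclaimable" { printf "%d slab pages (~%d pageblocks)\n", $2, int($2 / 512) }' /proc/vmstat

    # pageblocks currently marked Reclaimable: third count column of the
    # "Number of blocks type" section (those lines carry no ", type" field)
    awk '/^Node/ && !/type/ { sum += $7 } END { print sum, "Reclaimable pageblocks" }' /proc/pagetypeinfo

    # internal fragmentation of the dentry cache: live vs. allocated objects
    awk '$1 == "dentry" { printf "dentry: %d of %d objects live (%.0f%%)\n", $2, $3, 100 * $2 / $3 }' /proc/slabinfo

With the numbers quoted above this gives ~3280 pageblocks worth of
reclaimable slab spread over ~16000 Reclaimable-marked pageblocks, and only
~39% of the dentry objects still live.)
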
> 
>     > /proc/vmstat https://pastebin.com/S7mrQx1s
>     > /proc/zoneinfo https://pastebin.com/csGeqNyX
>     >
>     > also please note - whether this makes any difference: there is no swap
>     > file/partition, I am using this without swap space. imho this should
>     > not be necessary, since applications running on the hosts would not
>     > consume more than 20GB; the rest should be used by buffers/cache.
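
The two workarounds mentioned at the top would look roughly like the
following. This is an untested sketch; the cgroup name "webhosting" and the
8G limit are made-up examples, and whether a kmem limit alone helps with
this particular fragmentation is exactly the open question:

    # 1) periodic flushing of reclaimable slab objects (dentries and inodes;
    #    echoing 3 instead of 2 would also drop the page cache), e.g. from a
    #    daily cron job:
    sync
    echo 2 > /proc/sys/vm/drop_caches

    # 2) confine the workload to a memcg with a kernel memory limit
    #    (cgroup v1 interface):
    mkdir /sys/fs/cgroup/memory/webhosting
    echo 8G > /sys/fs/cgroup/memory/webhosting/memory.kmem.limit_in_bytes
    echo $$ > /sys/fs/cgroup/memory/webhosting/cgroup.procs
    # ...then start the services from this shell so they inherit the memcg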