On 09.09.19 at 13:49, Vlastimil Babka wrote:
> On 9/9/19 10:54 AM, Stefan Priebe - Profihost AG wrote:
>>> Do you have more snapshots of /proc/vmstat as suggested by Vlastimil and
>>> me earlier in this thread? Seeing the overall progress would tell us
>>> much more than before and after. Or have I missed this data?
>>
>> I needed to wait until today to catch such a situation again, but from
>> what I know it is very clear that MemFree is low and then the kernel
>> starts to drop the caches.
>>
>> Attached you'll find two log files.
>
> Thanks, what about my other requests/suggestions from earlier?

Sorry, I missed your email.

> 1. How does /proc/pagetypeinfo look like?

# cat /proc/pagetypeinfo
Page block order: 9
Pages per block:  512

Free pages count per migrate type at order       0      1      2      3      4      5      6      7      8      9     10
Node    0, zone      DMA, type    Unmovable      1      0      0      1      2      1      1      0      1      0      0
Node    0, zone      DMA, type      Movable      0      0      0      0      0      0      0      0      0      1      3
Node    0, zone      DMA, type  Reclaimable      0      0      0      0      0      0      0      0      0      0      0
Node    0, zone      DMA, type   HighAtomic      0      0      0      0      0      0      0      0      0      0      0
Node    0, zone      DMA, type      Isolate      0      0      0      0      0      0      0      0      0      0      0
Node    0, zone    DMA32, type    Unmovable   1141    970    903    628    302    106     27      4      0      0      0
Node    0, zone    DMA32, type      Movable    274    269    368    396    342    265    214    178    113     12     13
Node    0, zone    DMA32, type  Reclaimable     81     57    134    114     60     50     25      4      2      0      0
Node    0, zone    DMA32, type   HighAtomic      0      0      0      0      0      0      0      0      0      0      0
Node    0, zone    DMA32, type      Isolate      0      0      0      0      0      0      0      0      0      0      0
Node    0, zone   Normal, type    Unmovable     39     36  13257   3474   1333    317     42      0      0      0      0
Node    0, zone   Normal, type      Movable   1087   9678   1104   4250   2391   1946   1768    691    141      0      0
Node    0, zone   Normal, type  Reclaimable      1   1782   1153   2455   1927    986    330      7      2      0      0
Node    0, zone   Normal, type   HighAtomic      1      1      2      2      2      0      1      1      1      0      0
Node    0, zone   Normal, type      Isolate      0      0      0      0      0      0      0      0      0      0      0

Number of blocks type    Unmovable      Movable  Reclaimable   HighAtomic      Isolate
Node 0, zone      DMA            1            7            0            0            0
Node 0, zone    DMA32           52         1461           15            0            0
Node 0, zone   Normal          824         5448          383            1            0

> 2. Could you also try if the bad trend stops after you execute:
>     echo never > /sys/kernel/mm/transparent_hugepage/defrag
> and report the result?

It's pretty difficult to catch those moments. Is it OK to set the value
now and monitor whether it happens again?

Just to let you know: I now also have some more servers where MemFree shows
10-20 GB but the cache suddenly drops and memory PSI rises.

Greets,
Stefan
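
Since these moments are hard to catch by hand, periodic sampling could be
scripted. The following is only a minimal sketch, not something used in this
thread: the function names, the output directory and the sampling interval
are all assumptions chosen for illustration. It reads MemFree and Cached from
/proc/meminfo and the "some avg10" value from /proc/pressure/memory (present
only on kernels with PSI enabled), and keeps a raw copy of /proc/vmstat per
sample so the progression over time can be compared later.

```python
import time
from pathlib import Path


def parse_meminfo(text):
    """Map each /proc/meminfo field to its integer value (kB for most fields)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields:
            values[key.strip()] = int(fields[0])
    return values


def parse_psi(text):
    """Parse a PSI file such as /proc/pressure/memory into nested dicts."""
    result = {}
    for line in text.splitlines():
        parts = line.split()
        if parts:
            # parts[0] is "some" or "full"; the rest are key=value pairs.
            pairs = (p.split("=", 1) for p in parts[1:])
            result[parts[0]] = {k: float(v) for k, v in pairs}
    return result


def sample(proc=Path("/proc")):
    """Take one sample of MemFree, Cached and short-term memory pressure."""
    mem = parse_meminfo((proc / "meminfo").read_text())
    psi_path = proc / "pressure" / "memory"  # missing on kernels without PSI
    psi = parse_psi(psi_path.read_text()) if psi_path.exists() else {}
    return {
        "time": int(time.time()),
        "memfree_kb": mem.get("MemFree"),
        "cached_kb": mem.get("Cached"),
        "psi_some_avg10": psi.get("some", {}).get("avg10"),
    }


def monitor(outdir, interval=10, count=360):
    """Hypothetical helper: print a summary and save /proc/vmstat each interval."""
    out = Path(outdir)
    out.mkdir(parents=True, exist_ok=True)
    for _ in range(count):
        s = sample()
        print(s, flush=True)
        raw = Path("/proc/vmstat").read_text()
        (out / f"vmstat.{s['time']}").write_text(raw)
        time.sleep(interval)
```

Calling monitor("/var/tmp/vmstat-snapshots") (a made-up path) would then
collect one sample every 10 seconds for an hour. Running it before and after
changing the defrag setting would give comparable data for both cases.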