On Wed, Dec 05 2018, Richard Alloway wrote:

>
> ================================================
> # egrep '^#|raid' /proc/slabinfo | sed 's/^#//' | column -t
> name       <active_objs>  <num_objs>  <objsize>  <objperslab>  <pagesperslab>  :  tunables  <limit>  <batchcount>  <sharedfactor>  :  slabdata  <active_slabs>  <num_slabs>  <sharedavail>
> raid6-md0  272            272         1864       17            8               :  tunables  0        0             0               :  slabdata  16              16           0
> ================================================
>
> The array is empty – no filesystems, partitions, or anything, so the disks are idle.
>
> If I trigger a raid-check manually, and then re-examine the slabinfo:
>
> ================================================
> # /usr/sbin/raid-check ; egrep '^#|raid' /proc/slabinfo | sed 's/^#//' | column -t
> name       <active_objs>  <num_objs>  <objsize>  <objperslab>  <pagesperslab>  :  tunables  <limit>  <batchcount>  <sharedfactor>  :  slabdata  <active_slabs>  <num_slabs>  <sharedavail>
> raid6-md0  3060           3060        1864       17            8               :  tunables  0        0             0               :  slabdata  180             180          0
> ================================================
>
> Executing the raid-check a second time, the memory usage increases again:
>
> ================================================
> # /usr/sbin/raid-check ; egrep '^#|raid' /proc/slabinfo | sed 's/^#//' | column -t
> name       <active_objs>  <num_objs>  <objsize>  <objperslab>  <pagesperslab>  :  tunables  <limit>  <batchcount>  <sharedfactor>  :  slabdata  <active_slabs>  <num_slabs>  <sharedavail>
> raid6-md0  4420           4420        1864       17            8               :  tunables  0        0             0               :  slabdata  260             260          0
> ================================================
>
> So, this accounts for the loss of available memory.

This is useful information, thanks.
Can you repeat the experiment and also check the value in
   /sys/block/md0/md/stripe_cache_active

This number can grow large, but should shrink again when there is memory
pressure, but maybe that isn't happening.
If stripe_cache_active has a similar value to slabinfo, then memory
isn't getting lost, but the shrinker isn't working.
If it has a much smaller value then memory is getting lost.

If it appears to be the former, try to stop the check, then
   echo 3 > /proc/sys/vm/drop_caches

that should aggressively flush lots of caches, including the stripe
cache.

NeilBrown
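
For convenience, the comparison suggested above could be scripted roughly as follows. This is only a sketch: it assumes the slab cache is named raid6-md0 and the array is md0, as in the quoted output, and the SLABINFO and STRIPE_ACTIVE override variables and the two function names are made up for this example. Reading /proc/slabinfo normally requires root.

```shell
#!/bin/sh
# Sketch: print the raid6-md0 slab object count next to stripe_cache_active.
# SLABINFO and STRIPE_ACTIVE are hypothetical overrides for this sketch;
# the defaults are the paths mentioned in this thread.
SLABINFO="${SLABINFO:-/proc/slabinfo}"
STRIPE_ACTIVE="${STRIPE_ACTIVE:-/sys/block/md0/md/stripe_cache_active}"

# Print the <active_objs> column (field 2) of the named slab cache.
# $1 = slabinfo file, $2 = cache name
slab_active_objs() {
    awk -v name="$2" '$1 == name { print $2 }' "$1"
}

# Show both numbers side by side so they can be compared by eye.
report() {
    slab=$(slab_active_objs "$SLABINFO" raid6-md0)
    active=$(cat "$STRIPE_ACTIVE")
    echo "slab active_objs:    ${slab:-unknown}"
    echo "stripe_cache_active: ${active:-unknown}"
}

# Only report when both files are readable on this machine.
if [ -r "$SLABINFO" ] && [ -r "$STRIPE_ACTIVE" ]; then
    report
fi
```

Run it as root before and after each raid-check. If the two numbers track each other, that points at the shrinker; if stripe_cache_active is much smaller than the slab count, that points at lost memory.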