On Wed, Jan 22, 2020 at 12:24 AM Patrick Donnelly <pdonnell@xxxxxxxxxx> wrote:
On Tue, Jan 21, 2020 at 8:32 AM John Madden <jmadden.com@xxxxxxxxx> wrote:
>
> On 14.2.5 but also present in Luminous, buffer_anon memory use spirals
> out of control when scanning many thousands of files. The use case is
> more or less "look up this file and if it exists append this chunk to
> it, otherwise create it with this chunk." The memory is recovered as
> soon as the workload stops, and at most only 20-100 files are ever
> open at one time.
>
> The cache gets oversized, but that's more or less expected; it's
> pretty much always/immediately in some warn state. That makes me
> wonder whether a much larger cache might help buffer_anon use, and
> I'm looking for advice there. This is on a deeply-hashed directory,
> but overall very little data (<20GB), lots of tiny files.
>
> As I typed this post the pool went from ~60GB to ~110GB. I've
> resorted to a cronjob that restarts the active MDS when it starts
> swapping, just to keep the cluster alive.
This looks like it will be fixed by
https://tracker.ceph.com/issues/42943
That will be available in v14.2.7.
Couldn't John confirm that this is the issue by checking the heap stats
and triggering the release?

    ceph tell mds.mds1 heap stats
    ceph tell mds.mds1 heap release

(This would be much less disruptive than restarting the MDS.)
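As a sketch of how that check could be automated before resorting to an MDS restart: the script below parses the "page heap freelist" line that tcmalloc prints in `heap stats` output and only suggests a release past a threshold. The sample stats line, the 4 GiB threshold, and the MDS name `mds1` are illustrative assumptions, not Ceph-documented values; the real script would capture `ceph tell mds.mds1 heap stats` output instead of the hard-coded sample.

```shell
#!/bin/sh
# Hedged sketch: decide whether `ceph tell mds.<id> heap release` is worth
# running, based on how many bytes tcmalloc is holding in its page heap
# freelist. The sample line and threshold are assumptions for illustration.

# Stand-in for: stats=$(ceph tell mds.mds1 heap stats 2>&1)
stats='MALLOC:   +  9663676416 (  9216.0 MiB) Bytes in page heap freelist'

# Field 3 of the freelist line is the raw byte count.
freelist_bytes=$(printf '%s\n' "$stats" |
  awk '/Bytes in page heap freelist/ {print $3}')

# Only bother releasing once more than ~4 GiB sits in the freelist
# (arbitrary cutoff; tune to taste).
threshold=$((4 * 1024 * 1024 * 1024))
if [ "$freelist_bytes" -gt "$threshold" ]; then
  decision="release"
  echo "would run: ceph tell mds.mds1 heap release"
else
  decision="skip"
  echo "freelist is small; nothing to do"
fi
```

Dropped into cron in place of the restart job, this keeps the MDS (and its cache) alive while still handing unused pages back to the OS.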
-- Dan
--
Patrick Donnelly, Ph.D.
He / Him / His
Senior Software Engineer
Red Hat Sunnyvale, CA
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com