[MDS] Pacific memory leak

Hi,

For the last 2 months, our MDS has frequently been failing over to another one because of a sudden memory leak. The host has 128G RAM, and most of the time the MDS occupies ~20% of memory. Then, in less than 3 minutes, its usage climbs to 100% and it crashes with "tcmalloc: allocation failed".

We tried running heap stats / perf dump on the host, but we couldn't find any reason why the memory used by the MDS explodes so quickly. The MDS log is available here: https://filesender.renater.fr/?s=download&token=c1e60c3c-7f02-4f1e-b23e-f5b25c0cd2a8
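
Since the growth happens in under 3 minutes, it is hard to catch by hand. Below is a minimal sketch (not something we run in production) of a watcher that samples the resident memory of the MDS process and flags a sudden rise, so that heap stats could be captured before the crash. The function names, the 5 GiB/min threshold, the sampling interval and the `on_alert` callback are all assumptions made for illustration; it only relies on the Linux `/proc/<pid>/status` file.

```python
import time


def read_rss_kib(pid):
    """Return the resident set size (VmRSS) of a process in KiB (Linux only)."""
    with open(f"/proc/{pid}/status") as f:
        for line in f:
            if line.startswith("VmRSS:"):
                # Line looks like: "VmRSS:   1234567 kB"
                return int(line.split()[1])
    raise RuntimeError(f"VmRSS not found for pid {pid}")


def growth_gib_per_min(samples):
    """Average RSS growth in GiB/min between the oldest and newest sample.

    samples: list of (timestamp_seconds, rss_kib), oldest first.
    """
    (t0, r0), (t1, r1) = samples[0], samples[-1]
    if t1 <= t0:
        return 0.0
    gib = (r1 - r0) / (1024 * 1024)  # KiB -> GiB
    return gib / ((t1 - t0) / 60.0)


def watch(pid, threshold_gib_per_min=5.0, interval_s=5, window=6, on_alert=print):
    """Poll the process and call on_alert(rate) when growth exceeds the threshold.

    threshold_gib_per_min, interval_s and window are illustrative values only.
    """
    samples = []
    while True:
        samples.append((time.time(), read_rss_kib(pid)))
        samples = samples[-window:]  # keep a short sliding window
        if len(samples) >= 2:
            rate = growth_gib_per_min(samples)
            if rate > threshold_gib_per_min:
                on_alert(rate)
        time.sleep(interval_s)
```

For scale, going from ~20% to 100% of 128G in 3 minutes is roughly 34 GiB/min, so a threshold of a few GiB/min should separate this event from normal cache growth. The `on_alert` callback is where one could trigger `ceph tell mds.<name> heap stats` or save a perf dump; that wiring is left out here.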

Any idea what could lead to this memory leak? Is there anything we can try to understand what is happening, or to prevent it?
We use Pacific 16.2.14.

Cheers,
Adrien


  cluster:
    id:     86cd8a68-7649-11ed-b2be-5cba2c7fdb30
    health: HEALTH_OK

  services:
    mon: 5 daemons, quorum cccephadm40,cccephadm42,cccephadm41,cccephadm43,cccephadm44 (age 3d)
    mgr: cccephadm40.ucqhkr(active, since 105m), standbys: cccephadm41.osnxsd
    mds: 4/4 daemons up, 1 standby
    osd: 720 osds: 720 up (since 9d), 720 in (since 7w)
    rgw: 1 daemon active (1 hosts, 1 zones)

  data:
    volumes: 1/1 healthy
    pools:   12 pools, 8801 pgs
    objects: 1.25G objects, 2.7 PiB
    usage:   3.5 PiB used, 2.6 PiB / 6.1 PiB avail
    pgs:     8731 active+clean
             59   active+clean+scrubbing+deep
             11   active+clean+scrubbing

  io:
    client:   2.1 GiB/s rd, 1.2 GiB/s wr, 3.63k op/s rd, 17.47k op/s wr


ceph fs status
cephfs_astro - 1628 clients
============
RANK  STATE                 MDS                   ACTIVITY     DNS    INOS   DIRS   CAPS
 0    active  cephfs_astro.cccephadm41.nbgvxk  Reqs: 7205 /s  11.2M  11.1M   708k  2782k
 1    active  cephfs_astro.cccephadm42.ieexgj  Reqs:  812 /s  11.6M  11.5M   554k  1366k
 2    active  cephfs_astro.cccephadm40.aafhps  Reqs:    1 /s  11.4M  11.3M  1346k   462k
 3    active  cephfs_astro.cccephadm44.jodfcx  Reqs:    3 /s  11.4M  11.4M  1094k  29.3k
      POOL         TYPE     USED  AVAIL
cephfs_metadata  metadata   484G  13.5T
 cephfs_default    data       0   13.5T
 cephfs_EC_data    data    3463T  1830T
   cephfs_ssd      data    5314G  13.5T
          STANDBY MDS
cephfs_astro.cccephadm43.wetlrp
MDS version: ceph version 16.2.14 (238ba602515df21ea7ffc75c88db29f9e5ef12c9) pacific (stable)


_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx