Annoying MDS_CLIENT_RECALL Warning

Hi all,

We are consistently seeing the MDS_CLIENT_RECALL warning in our cluster. It seems harmless, but it keeps the cluster from ever reaching HEALTH_OK, which is annoying.

The clients reported as failing to respond to cache pressure are constantly changing; at any given time, 1-5 of our ~20 clients are listed. All of them are kernel clients running the Ubuntu 20.04 HWE kernel (5.11). The load is fairly low.
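
For reference, this is roughly how we check which clients are currently being flagged (the MDS daemon name is taken from the "ceph fs status" output below; "session ls" shows per-client cap counts and client hostnames):

# ceph health detail
# ceph tell mds.cephfs.gpu018.ovxvoz session ls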

We read datasets consisting of millions of small files from CephFS, so we have tuned a few settings for performance. Possibly relevant entries from "ceph config dump":

WHO       LEVEL     OPTION                   VALUE
  mds     basic     mds_cache_memory_limit   51539607552
  mds     advanced  mds_max_caps_per_client  8388608
  client  basic     client_cache_size        32768
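
(For completeness, these were applied centrally via "ceph config set"; the commands below are just a sketch of that, with the values from the table above.)

# ceph config set mds mds_cache_memory_limit 51539607552
# ceph config set mds mds_max_caps_per_client 8388608
# ceph config set client client_cache_size 32768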

We also manually pinned almost every directory to either rank 0 or rank 1.
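
The pinning is done with the ceph.dir.pin xattr from a kernel mount; the directory paths below are placeholders, not our real paths:

# setfattr -n ceph.dir.pin -v 0 /mnt/cephfs/<dataset-dir-a>
# setfattr -n ceph.dir.pin -v 1 /mnt/cephfs/<dataset-dir-b>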

Any thoughts on what causes the warning, or how we can get rid of it?

Thanks,
Weiwen Hu


# ceph -s
  cluster:
    id:     e88d509a-f6fc-11ea-b25d-a0423f3ac864
    health: HEALTH_WARN
            4 clients failing to respond to cache pressure

  services:
    mon: 5 daemons, quorum gpu024,gpu006,gpu023,gpu013,gpu018 (age 7d)
    mgr: gpu014.kwbqcf(active, since 2w), standbys: gpu024.bapbcz
    mds: 2/2 daemons up, 2 hot standby
    osd: 45 osds: 45 up (since 2h), 45 in (since 5d)
    rgw: 2 daemons active (2 hosts, 1 zones)

  data:
    volumes: 1/1 healthy
    pools:   16 pools, 1713 pgs
    objects: 265.84M objects, 55 TiB
    usage:   115 TiB used, 93 TiB / 208 TiB avail
    pgs:     1711 active+clean
             2    active+clean+scrubbing+deep

  io:
    client:   55 MiB/s rd, 5.2 MiB/s wr, 513 op/s rd, 14 op/s wr


# ceph fs status
cephfs - 23 clients
======
RANK      STATE               MDS              ACTIVITY     DNS    INOS   DIRS   CAPS
 0        active      cephfs.gpu018.ovxvoz  Reqs:  241 /s  17.3M  17.3M  41.3k  5054k
 1        active      cephfs.gpu023.aetiph  Reqs:    1 /s  13.1M  12.1M   864k   586k
1-s   standby-replay  cephfs.gpu006.ddpekw  Evts:    2 /s  2517k  2393k   216k     0
0-s   standby-replay  cephfs.gpu024.rpfbnh  Evts:   17 /s  9587k  9587k   214k     0
          POOL              TYPE     USED  AVAIL
   cephfs.cephfs.meta     metadata   126G   350G
   cephfs.cephfs.data       data     102T  25.9T
 cephfs.cephfs.data_ssd     data       0    525G
cephfs.cephfs.data_mixed    data    9.81T   350G
 cephfs.cephfs.data_ec      data     166G  41.4T
MDS version: ceph version 16.2.6 (ee28fb57e47e9f88813e24bbf4c14496ca299d31) pacific (stable)


