CephFS MDS not trimming after cluster outage

Hello,

We are having issues with our CephFS cluster.
Any help would be appreciated.

We are still running 18.2.0.
During the holidays we had an outage caused by the root filesystem filling up. OSDs started dying at random, and for a while not all PGs were active. That issue is resolved and all OSDs work fine, but we are stuck with some MDS issues.

The warnings we are concerned about:

[WRN] MDS_SLOW_METADATA_IO: 1 MDSs report slow metadata IOs
mds.arm-vol.k02r04nvm01.zaqebs(mds.0): 29 slow metadata IOs are blocked > 30 secs, oldest blocked for 1899 secs
[WRN] MDS_TRIM: 1 MDSs behind on trimming
mds.arm-vol.k02r04nvm01.zaqebs(mds.0): Behind on trimming (4851/128) max_segments: 128, num_segments: 4851
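
To put the MDS_TRIM numbers above in perspective, here is a minimal sketch that pulls the two segment counts out of the detail line and prints how many times over the limit the journal is. It assumes the message format shown above; trim_backlog is a hypothetical helper name, not a Ceph command.

```shell
# Print "num_segments max_segments ratio" for an MDS_TRIM detail line.
# trim_backlog is a hypothetical helper, not part of Ceph.
trim_backlog() {
    printf '%s\n' "$1" |
        # Capture both counters; emit them as "num_segments max_segments".
        sed -n 's/.*max_segments: \([0-9][0-9]*\), num_segments: \([0-9][0-9]*\).*/\2 \1/p' |
        # Integer ratio of current segments to the limit (skip a zero limit).
        awk '$2 > 0 { printf "%d %d %d\n", $1, $2, int($1 / $2) }'
}

line='mds.arm-vol.k02r04nvm01.zaqebs(mds.0): Behind on trimming (4851/128) max_segments: 128, num_segments: 4851'
trim_backlog "$line"
```

For the line above this prints 4851 128 37, i.e. the journal holds roughly 37 times as many segments as max_segments allows.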

1. Our MDSs are not trimming.
2. Our active MDS reports slow metadata ops that we cannot explain.

CephFS status looks OK and the main MDS is active.
All metadata pool PGs are active and working; there are no laggy PGs.

Dumping the ops in flight from the MDS doesn't help either:

ceph daemon ./ceph-mds.arm-vol.k02r04nvm01.zaqebs.asok dump_ops_in_flight
{
    "ops": [],
    "num_ops": 0
}
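
For comparison, dump_ops_in_flight lists the operations the MDS itself is tracking, while the slow metadata IO warning concerns the MDS's own requests to the OSDs. A minimal sketch of listing those outstanding OSD requests through the same admin socket follows, assuming the objecter_requests admin socket command is available on this release. mds_osd_requests is a hypothetical wrapper name and CEPH a hypothetical override for the ceph binary.

```shell
# List the requests this MDS currently has outstanding against the OSDs.
# mds_osd_requests and the CEPH override are hypothetical conveniences;
# CEPH defaults to the regular "ceph" binary.
mds_osd_requests() {
    "${CEPH:-ceph}" daemon "$1" objecter_requests
}

# Usage, run next to the admin socket on the MDS host:
# mds_osd_requests ./ceph-mds.arm-vol.k02r04nvm01.zaqebs.asok
```

If entries show up there, the OSD and PG named in each one would be the natural place to look next.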

Neither an MDS failover nor an MDS restart helps.
The slow metadata ops always return after an MDS restart (all MDSs have this issue).
After a failover, the main MDS stays stuck in the rejoin state for a long time.
We used the mds_wipe_sessions config option to bring it into the active state quickly.

My guess is that the slow metadata ops are preventing the MDS from trimming, but we cannot figure out what is causing them.

Best regards
Adam Prycki

