Hi all,

Recently we ran into a problem: the journal of the MDS daemon could not be trimmed, so a large amount of space in the metadata pool was occupied. The only thing we could think of was flushing the journal via the admin socket command, but that made things worse: the admin socket thread of the MDS also got stuck, and after that we could no longer change the log level. After analyzing the code, we found that some segments never get out of the expiring queue, but we don't know why, or where exactly void LogSegment::try_to_expire(MDSRank *mds, MDSGatherBuilder &gather_bld, int op_prio) is stuck. Any ideas or advice? Thanks a lot.

Here is some cluster information:
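For context, these are the kinds of commands involved in this diagnosis; a rough sketch, assuming a Luminous cluster (the daemon name `mds.a` is a placeholder for your own MDS):

```
# Check whether the journal ever advances (expire_pos should move forward).
cephfs-journal-tool journal inspect
cephfs-journal-tool header get

# Ask the MDS what it is waiting on via the admin socket
# (this is the path that got stuck for us after "flush journal").
ceph daemon mds.a dump_ops_in_flight
ceph daemon mds.a objecter_requests

# If the admin socket thread is wedged, the log level can sometimes
# still be changed over the network path instead of the socket:
ceph tell mds.a injectargs '--debug_mds 10'
```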
Version:
luminous(v12.2.12)
MDS debug log:
5 mds.0.log trim already expiring segment 3658103659/11516554553473, 980 events
5 mds.0.log trim already expiring segment 3658104639/11516556356904, 1024 events
5 mds.0.log trim already expiring segment 3658105663/11516558241475, 1024 events
cephfs-journal-tool:
{
  "magic": "ceph fs volume v011",
  "write_pos": 11836049063598,
  "expire_pos": 11516554553473,
  "trimmed_pos": 11516552151040,
  "stream_format": 1,
  "layout": {
    "stripe_unit": 4194304,
    "stripe_count": 1,
    "object_size": 4194304,
    "pool_id": 2,
    "pool_ns": ""
  }
}
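The gap between write_pos and expire_pos is the un-expired journal backlog, which is what keeps the metadata pool space pinned while expiry is stuck. A quick sanity check on the numbers from the header dump above:

```python
# Journal positions copied from the cephfs-journal-tool header dump.
write_pos = 11836049063598
expire_pos = 11516554553473
trimmed_pos = 11516552151040

# Bytes written but not yet expired -- this grows while
# LogSegment::try_to_expire() never completes.
backlog = write_pos - expire_pos
print(f"un-expired backlog: {backlog} bytes (~{backlog / 2**30:.1f} GiB)")

# Bytes expired but not yet trimmed (normally a small window).
untrimmed = expire_pos - trimmed_pos
print(f"expired-but-untrimmed: {untrimmed} bytes")
```

So roughly 300 GiB of journal is sitting behind the stuck segments, while the expired-but-untrimmed window itself is only a couple of MiB.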
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx