mds slow request, getattr currently failed to rdlock. Kraken with Bluestore

Daniel K <sathackr@xxxxxxxxx> · Tue, 23 May 2017 18:41:42 -0400

I have a 20-OSD cluster (my first Ceph cluster) with another 400 OSDs en route.
I was "beating up" on the cluster and had been writing to a 6TB file in CephFS for several hours, during which I changed the crushmap to better match my environment, generating a bunch of recovery IO. After about 5.8TB had been written, one of the OSD hosts crashed. That host had 5 OSDs on it and is also a MON (soon to be rectified). After rebooting it, I have this in ceph -s (I assume the degraded/misplaced warnings are because the cluster hasn't finished rebalancing after I changed the crushmap):

2017-05-23 18:33:13.775924 7ff9d3230700 -1 WARNING: the following dangerous and experimental features are enabled: bluestore
2017-05-23 18:33:13.781732 7ff9d3230700 -1 WARNING: the following dangerous and experimental features are enabled: bluestore
    cluster e92e20ca-0fe6-4012-86cc-aa51e0466661
     health HEALTH_WARN
            440 pgs backfill_wait
            7 pgs backfilling
            85 pgs degraded
            5 pgs recovery_wait
            85 pgs stuck degraded
            452 pgs stuck unclean
            77 pgs stuck undersized
            77 pgs undersized
            recovery 196526/3554278 objects degraded (5.529%)
            recovery 1690392/3554278 objects misplaced (47.559%)
            mds0: 1 slow requests are blocked > 30 sec
     monmap e4: 3 mons at {stor-vm1=10.0.15.51:6789/0,stor-vm2=10.0.15.52:6789/0,stor-vm3=10.0.15.53:6789/0}
            election epoch 136, quorum 0,1,2 stor-vm1,stor-vm2,stor-vm3
      fsmap e21: 1/1/1 up {0=stor-vm4=up:active}
        mgr active: stor-vm1 standbys: stor-vm2
     osdmap e4655: 20 osds: 20 up, 20 in; 450 remapped pgs
            flags sortbitwise,require_jewel_osds,require_kraken_osds
      pgmap v192589: 1428 pgs, 5 pools, 5379 GB data, 1345 kobjects
            11041 GB used, 16901 GB / 27943 GB avail
            196526/3554278 objects degraded (5.529%)
            1690392/3554278 objects misplaced (47.559%)
                 975 active+clean
                 364 active+remapped+backfill_wait
                  76 active+undersized+degraded+remapped+backfill_wait
                   3 active+recovery_wait+degraded+remapped
                   3 active+remapped+backfilling
                   3 active+degraded+remapped+backfilling
                   2 active+recovery_wait+degraded
                   1 active+clean+scrubbing+deep
                   1 active+undersized+degraded+remapped+backfilling
recovery io 112 MB/s, 28 objects/s
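
As a sanity check on the status above, the per-state PG counts can be tallied to see how they line up with the health summary. This is only a minimal Python sketch: the counts are copied by hand from the pgmap section of the ceph -s output, and the helper name summarize_pg_states is made up for illustration.

```python
# PG state counts copied by hand from the pgmap section of the ceph -s output.
PG_STATES = {
    "active+clean": 975,
    "active+remapped+backfill_wait": 364,
    "active+undersized+degraded+remapped+backfill_wait": 76,
    "active+recovery_wait+degraded+remapped": 3,
    "active+remapped+backfilling": 3,
    "active+degraded+remapped+backfilling": 3,
    "active+recovery_wait+degraded": 2,
    "active+clean+scrubbing+deep": 1,
    "active+undersized+degraded+remapped+backfilling": 1,
}


def summarize_pg_states(states):
    """Return (total, clean, degraded, remapped) PG counts (hypothetical helper)."""
    def count(flag):
        # A PG state is a '+'-joined list of flags; match whole flags only.
        return sum(n for state, n in states.items() if flag in state.split("+"))

    total = sum(states.values())
    return total, count("clean"), count("degraded"), count("remapped")


total, clean, degraded, remapped = summarize_pg_states(PG_STATES)
print(f"total={total} clean={clean} degraded={degraded} remapped={remapped}")
print(f"not clean={total - clean}")
```

The tallies agree with the summary lines: 1428 PGs in total, 85 degraded, 450 remapped, and 1428 - 976 = 452 not clean, which matches "452 pgs stuck unclean".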

Seems related to the "corrupted rbd filesystems since jewel" thread.

Log entries on the MDS server:

2017-05-23 18:27:12.966218 7f95ed6c0700  0 log_channel(cluster) log [WRN] : slow request 243.113407 seconds old, received at 2017-05-23 18:23:09.852729: client_request(client.204100:5 getattr pAsLsXsFs #100000003ec 2017-05-23 17:48:23.770852 RETRY=2 caller_uid=0, caller_gid=0{}) currently failed to rdlock, waiting
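
To track which inode and client a slow request belongs to across many such log lines, the line can be picked apart with a regular expression. This is a rough Python sketch that assumes the log format is exactly as in the line above; the pattern, the field names, and the parse_slow_request helper are my own and not part of Ceph.

```python
import re

# The MDS log line from above, split across string literals for readability.
LINE = (
    "2017-05-23 18:27:12.966218 7f95ed6c0700  0 log_channel(cluster) log [WRN] : "
    "slow request 243.113407 seconds old, received at 2017-05-23 18:23:09.852729: "
    "client_request(client.204100:5 getattr pAsLsXsFs #100000003ec "
    "2017-05-23 17:48:23.770852 RETRY=2 caller_uid=0, caller_gid=0{}) "
    "currently failed to rdlock, waiting"
)

# Assumed layout: age, then client_request(<client>:<tid> <op> <mask> #<inode> ...),
# then the current state of the request.
PATTERN = re.compile(
    r"slow request (?P<age>[\d.]+) seconds old, .*?"
    r"client_request\((?P<client>client\.\d+):(?P<tid>\d+) "
    r"(?P<op>\w+) (?P<mask>\S+) #(?P<ino>[0-9a-f]+) .*?\) "
    r"currently (?P<state>.+)$"
)


def parse_slow_request(line):
    """Return the fields of one slow-request line, or None if it does not match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["age_seconds"] = float(fields.pop("age"))
    return fields


parsed = parse_slow_request(LINE)
print(parsed["client"], parsed["op"], "inode", parsed["ino"], "->", parsed["state"])
```

For the line above this reports a getattr from client.204100 on inode 100000003ec that is waiting on a read lock.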

Output of ceph daemon mds.stor-vm4 objecter_requests (it changes each time I run it):
root@stor-vm4:/var/log/ceph# ceph daemon mds.stor-vm4 objecter_requests
{
    "ops": [
        {
            "tid": 66700,
            "pg": "1.60e95c32",
            "osd": 4,
            "object_id": "100000003ec.003efb9f",
            "object_locator": "@1",
            "target_object_id": "100000003ec.003efb9f",
            "target_object_locator": "@1",
            "paused": 0,
            "used_replica": 0,
            "precalc_pgid": 0,
            "last_sent": "1.47461e+06s",
            "attempts": 1,
            "snapid": "head",
            "snap_context": "0=[]",
            "mtime": "1969-12-31 19:00:00.000000s",
            "osd_ops": [
                "stat"
            ]
        }
    ],
    "linger_ops": [],
    "pool_ops": [],
    "pool_stat_ops": [],
    "statfs_ops": [],
    "command_ops": []
}
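
Since the output changes on every run, it helps to reduce each snapshot to the few fields that matter. The Python sketch below does that for a trimmed copy of the JSON above; the summarize_ops helper is hypothetical, and splitting object_id at the dot relies on CephFS naming data objects as the inode number in hex followed by the object index in hex.

```python
import json

# Trimmed copy of the objecter_requests output above (only the fields used here).
SAMPLE = """
{
    "ops": [
        {
            "tid": 66700,
            "pg": "1.60e95c32",
            "osd": 4,
            "object_id": "100000003ec.003efb9f",
            "attempts": 1,
            "osd_ops": ["stat"]
        }
    ],
    "linger_ops": []
}
"""


def summarize_ops(text):
    """Return one small dict per in-flight op in an objecter_requests dump."""
    doc = json.loads(text)
    rows = []
    for op in doc.get("ops", []):
        # Split "<inode hex>.<object index hex>" at the first dot.
        ino, _, index = op["object_id"].partition(".")
        rows.append({
            "tid": op["tid"],
            "osd": op["osd"],
            "pg": op["pg"],
            "inode": ino,
            "object_index": int(index, 16),
            "osd_ops": op["osd_ops"],
        })
    return rows


for row in summarize_ops(SAMPLE):
    print(row)
```

The inode part of the object name, 100000003ec, is the same inode that appears in the slow getattr request in the MDS log, so the MDS is issuing stat operations against data objects of the file that is blocked.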

I've tried restarting the MDS daemon (systemctl stop ceph-mds\*.service ceph-mds.target && systemctl start ceph-mds\*.service ceph-mds.target).

IO to the file that was being accessed when the host crashed is blocked.

Suggestions?

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com