In luminous, is osd_recovery_threads = osd_disk_threads? And is osd_recovery_sleep = osd_recovery_sleep_hdd? Or is speeding up recovery done quite differently in luminous?

[@~]# ceph daemon osd.0 config show | grep osd | grep thread
    "osd_command_thread_suicide_timeout": "900",
    "osd_command_thread_timeout": "600",
    "osd_disk_thread_ioprio_class": "",
    "osd_disk_thread_ioprio_priority": "-1",
    "osd_disk_threads": "1",
    "osd_op_num_threads_per_shard": "0",
    "osd_op_num_threads_per_shard_hdd": "1",
    "osd_op_num_threads_per_shard_ssd": "2",
    "osd_op_thread_suicide_timeout": "150",
    "osd_op_thread_timeout": "15",
    "osd_peering_wq_threads": "2",
    "osd_recovery_thread_suicide_timeout": "300",
    "osd_recovery_thread_timeout": "30",
    "osd_remove_thread_suicide_timeout": "36000",
    "osd_remove_thread_timeout": "3600",

-----Original Message-----
From: Webert de Souza Lima [mailto:webert.boss@xxxxxxxxx]
Sent: Friday, 11 May 2018 20:34
To: ceph-users
Subject: Re: Node crash, filesystem not usable

This message seems very concerning:

> mds0: Metadata damage detected

but for the rest, the cluster still seems to be recovering. You could try to speed things up with ceph tell, like:

ceph tell osd.* injectargs --osd_max_backfills=10
ceph tell osd.* injectargs --osd_recovery_sleep=0.0
ceph tell osd.* injectargs --osd_recovery_threads=2

Regards,

Webert Lima
DevOps Engineer at MAV Tecnologia
Belo Horizonte - Brasil
IRC NICK - WebertRLZ
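On the luminous question at the top of the thread: the admin socket shows which recovery settings an OSD is actually running with, which is the quickest way to compare against the jewel-era names. Below is a minimal sketch of checking this, assuming osd.0 is on the local host and backed by a rotational disk (my understanding is that the _hdd/_ssd variants of osd_recovery_sleep are what apply when the generic option is left at 0, but treat that as an assumption). Note also that osd_recovery_threads does not appear in the dump above at all, which suggests that knob no longer drives recovery speed in luminous.

# recovery/backfill options currently in effect on osd.0
ceph daemon osd.0 config show | grep -E 'osd_recovery_sleep|osd_recovery_max_active|osd_max_backfills'
# query a single option directly, e.g. the hdd-specific recovery sleep
ceph daemon osd.0 config get osd_recovery_sleep_hdd

The same "config show" can be re-run after injectargs (as in the commands above) to confirm that each OSD actually picked up the new values.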
On Fri, May 11, 2018 at 3:06 PM Daniel Davidson <danield@xxxxxxxxxxxxxxxx> wrote:

Below is the information you were asking for. I think they are size=2, min_size=1.

Dan

# ceph status
    cluster 7bffce86-9d7b-4bdf-a9c9-67670e68ca77
     health HEALTH_ERR
            140 pgs are stuck inactive for more than 300 seconds
            64 pgs backfill_wait
            76 pgs backfilling
            140 pgs degraded
            140 pgs stuck degraded
            140 pgs stuck inactive
            140 pgs stuck unclean
            140 pgs stuck undersized
            140 pgs undersized
            210 requests are blocked > 32 sec
            recovery 38725029/695508092 objects degraded (5.568%)
            recovery 10844554/695508092 objects misplaced (1.559%)
            mds0: Metadata damage detected
            mds0: Behind on trimming (71/30)
            noscrub,nodeep-scrub flag(s) set
     monmap e3: 4 mons at {ceph-0=172.16.31.1:6789/0,ceph-1=172.16.31.2:6789/0,ceph-2=172.16.31.3:6789/0,ceph-3=172.16.31.4:6789/0}
            election epoch 824, quorum 0,1,2,3 ceph-0,ceph-1,ceph-2,ceph-3
      fsmap e144928: 1/1/1 up {0=ceph-0=up:active}, 1 up:standby
     osdmap e35814: 32 osds: 30 up, 30 in; 140 remapped pgs
            flags noscrub,nodeep-scrub,sortbitwise,require_jewel_osds
      pgmap v43142427: 1536 pgs, 2 pools, 762 TB data, 331 Mobjects
            1444 TB used, 1011 TB / 2455 TB avail
            38725029/695508092 objects degraded (5.568%)
            10844554/695508092 objects misplaced (1.559%)
                1396 active+clean
                  76 undersized+degraded+remapped+backfilling+peered
                  64 undersized+degraded+remapped+wait_backfill+peered
recovery io 1244 MB/s, 1612 keys/s, 705 objects/s

# ceph osd tree
ID  WEIGHT     TYPE NAME       UP/DOWN REWEIGHT PRIMARY-AFFINITY
 -1 2619.54541 root default
 -2  163.72159     host ceph-0
  0   81.86079         osd.0        up  1.00000          1.00000
  1   81.86079         osd.1        up  1.00000          1.00000
 -3  163.72159     host ceph-1
  2   81.86079         osd.2        up  1.00000          1.00000
  3   81.86079         osd.3        up  1.00000          1.00000
 -4  163.72159     host ceph-2
  8   81.86079         osd.8        up  1.00000          1.00000
  9   81.86079         osd.9        up  1.00000          1.00000
 -5  163.72159     host ceph-3
 10   81.86079         osd.10       up  1.00000          1.00000
 11   81.86079         osd.11       up  1.00000          1.00000
 -6  163.72159     host ceph-4
  4   81.86079         osd.4        up  1.00000          1.00000
  5   81.86079         osd.5        up  1.00000          1.00000
 -7  163.72159     host ceph-5
  6   81.86079         osd.6        up  1.00000          1.00000
  7   81.86079         osd.7        up  1.00000          1.00000
 -8  163.72159     host ceph-6
 12   81.86079         osd.12       up  0.79999          1.00000
 13   81.86079         osd.13       up  1.00000          1.00000
 -9  163.72159     host ceph-7
 14   81.86079         osd.14       up  1.00000          1.00000
 15   81.86079         osd.15       up  1.00000          1.00000
-10  163.72159     host ceph-8
 16   81.86079         osd.16       up  1.00000          1.00000
 17   81.86079         osd.17       up  1.00000          1.00000
-11  163.72159     host ceph-9
 18   81.86079         osd.18       up  1.00000          1.00000
 19   81.86079         osd.19       up  1.00000          1.00000
-12  163.72159     host ceph-10
 20   81.86079         osd.20       up  1.00000          1.00000
 21   81.86079         osd.21       up  1.00000          1.00000
-13  163.72159     host ceph-11
 22   81.86079         osd.22       up  1.00000          1.00000
 23   81.86079         osd.23       up  1.00000          1.00000
-14  163.72159     host ceph-12
 24   81.86079         osd.24       up  1.00000          1.00000
 25   81.86079         osd.25       up  1.00000          1.00000
-15  163.72159     host ceph-13
 26   81.86079         osd.26     down        0          1.00000
 27   81.86079         osd.27     down        0          1.00000
-16  163.72159     host ceph-14
 28   81.86079         osd.28       up  1.00000          1.00000
 29   81.86079         osd.29       up  1.00000          1.00000
-17  163.72159     host ceph-15
 30   81.86079         osd.30       up  1.00000          1.00000
 31   81.86079         osd.31       up  1.00000          1.00000

On 05/11/2018 11:56 AM, David Turner wrote:

What are some outputs of commands to show us the state of your cluster? Most notable is `ceph status`, but `ceph osd tree` would also be helpful. What are the sizes of the pools in your cluster? Are they all size=3, min_size=2?

On Fri, May 11, 2018 at 12:05 PM Daniel Davidson <danield@xxxxxxxxxxxxxxxx> wrote:

Hello,

Today we had a node crash, and looking at it, it seems there is a problem with the RAID controller, so it is not coming back up, maybe ever. It corrupted the local filesystem for the ceph storage there. The remainder of our storage (10.2.10) cluster is running, it looks to be repairing, and our min_size is set to 2.

Normally I would expect that the system would keep running normally from an end-user perspective when this happens, but the system is down. All mounts that were up when this started look to be stale, and new mounts give the following error:

# mount -t ceph ceph-0:/ /test/ -o name=admin,secretfile=/etc/ceph/admin.secret,noatime,_netdev,rbytes
mount error 5 = Input/output error

Any suggestions?

Dan
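Regarding the original question of why the filesystem is unusable: with 140 PGs stuck in an undersized+degraded+...+peered (i.e. not active) state, any client or MDS I/O that touches those PGs will simply block, and the metadata damage will keep CephFS unhealthy even once recovery completes. Below is a minimal sketch of narrowing this down from an admin node; the pg id 1.2f3 is only a placeholder, and the last command assumes this jewel release exposes the MDS damage table over the admin socket, that the active MDS daemon is named ceph-0 (as the fsmap above shows), and that it is run on the host where that MDS lives.

# which PGs are inactive, and which requests are blocked
ceph health detail | grep -E 'inactive|blocked'
ceph pg dump_stuck inactive
# inspect one of the reported PGs in depth (replace 1.2f3 with a real pg id from the list above)
ceph pg 1.2f3 query
# on the active MDS host: list the metadata damage the MDS has recorded
ceph daemon mds.ceph-0 damage ls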