Hi,

issue solved!
I stopped the active MGR service and waited until a standby MGR became
active. Then I started the (previously stopped) MGR service again in
order to have 2 standbys. A short command sketch of this failover is
appended below the quoted thread.

Thanks, Eugen

On 21.11.2019 at 15:23, Eugen Block wrote:
> Hi,
>
> check if the active MGR is hanging.
> I had this when testing pg_autoscaler: after some time every command
> would hang. Restarting the MGR helped for a short period of time, then
> I disabled pg_autoscaler. This is an upgraded cluster, currently on
> Nautilus.
>
> Regards,
> Eugen
>
>
> Quoting Thomas Schneider <74cmonty@xxxxxxxxx>:
>
>> Hi,
>> the command ceph osd df does not return any output.
>> Based on the strace output, there is a timeout.
>> [...]
>> mmap(NULL, 262144, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f53006b9000
>> brk(0x55c2579b6000) = 0x55c2579b6000
>> brk(0x55c2579d7000) = 0x55c2579d7000
>> brk(0x55c2579f9000) = 0x55c2579f9000
>> brk(0x55c257a1a000) = 0x55c257a1a000
>> mmap(NULL, 262144, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f5300679000
>> brk(0x55c257a3b000) = 0x55c257a3b000
>> brk(0x55c257a5c000) = 0x55c257a5c000
>> brk(0x55c257a7d000) = 0x55c257a7d000
>> clone(child_stack=0x7f53095c1fb0, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x7f53095c29d0, tls=0x7f53095c2700, child_tidptr=0x7f53095c29d0) = 3261669
>> futex(0x55c257489940, FUTEX_WAKE_PRIVATE, 1) = 1
>> futex(0x55c2576246e0, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, FUTEX_BITSET_MATCH_ANY) = -1 EAGAIN (Resource temporarily unavailable)
>> select(0, NULL, NULL, NULL, {tv_sec=0, tv_usec=1000}) = 0 (Timeout)
>> select(0, NULL, NULL, NULL, {tv_sec=0, tv_usec=2000}) = 0 (Timeout)
>> select(0, NULL, NULL, NULL, {tv_sec=0, tv_usec=4000}) = 0 (Timeout)
>> select(0, NULL, NULL, NULL, {tv_sec=0, tv_usec=8000}) = 0 (Timeout)
>> select(0, NULL, NULL, NULL, {tv_sec=0, tv_usec=16000}) = 0 (Timeout)
>> select(0, NULL, NULL, NULL, {tv_sec=0, tv_usec=32000}) = 0 (Timeout)
>> select(0, NULL, NULL, NULL, {tv_sec=0, tv_usec=50000}) = 0 (Timeout)
>> select(0, NULL, NULL, NULL, {tv_sec=0, tv_usec=50000}) = 0 (Timeout)
>> select(0, NULL, NULL, NULL, {tv_sec=0, tv_usec=50000}) = 0 (Timeout)
>> select(0, NULL, NULL, NULL, {tv_sec=0, tv_usec=50000}) = 0 (Timeout)
>> select(0, NULL, NULL, NULL, {tv_sec=0, tv_usec=50000}) = 0 (Timeout)
>> select(0, NULL, NULL, NULL, {tv_sec=0, tv_usec=50000}^Cstrace: Process 3261645 detached
>> <detached ...>
>> Interrupted
>> Traceback (most recent call last):
>>   File "/usr/bin/ceph", line 1263, in <module>
>>     retval = main()
>>   File "/usr/bin/ceph", line 1194, in main
>>     verbose)
>>   File "/usr/bin/ceph", line 619, in new_style_command
>>     ret, outbuf, outs = do_command(parsed_args, target, cmdargs, sigdict, inbuf, verbose)
>>   File "/usr/bin/ceph", line 593, in do_command
>>     return ret, '', ''
>> UnboundLocalError: local variable 'ret' referenced before assignment
>>
>>
>> How can I fix this?
>> Do you need the full strace output to analyse this issue?
>>
>> This Ceph health status has been reported for hours and I cannot
>> identify any progress. Not sure if this is related to the issue with
>> ceph osd df, though.
>>
>> 2019-11-21 15:00:00.000262 mon.ld5505 [ERR] overall HEALTH_ERR 1
>> filesystem is degraded; 1 filesystem has a failed mds daemon; 1
>> filesystem is offline; insufficient standby MDS daemons available;
>> nodown,noout,noscrub,nodeep-scrub flag(s) set; 81 osds down; Reduced
>> data availability: 1366 pgs inactive, 241 pgs peering; Degraded data
>> redundancy: 6437/190964568 objects degraded (0.003%), 7 pgs degraded, 7
>> pgs undersized; 1 subtrees have overcommitted pool target_size_bytes; 1
>> subtrees have overcommitted pool target_size_ratio
>>
>> THX
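
For reference, the MGR failover described at the top of this mail boils
down to something like the following. This is only a minimal sketch: it
assumes systemd-managed daemons, and "mgr-a" is a placeholder for the
name of the hanging active MGR (check ceph -s or ceph mgr stat for the
real daemon names in your cluster).

  # show the currently active MGR and the number of standbys
  ceph mgr stat

  # stop the hanging active MGR; one of the standbys should take over
  systemctl stop ceph-mgr@mgr-a

  # wait until the "mgr:" line of the status output shows a new active daemon
  ceph -s

  # start the stopped daemon again so it rejoins as a standby
  systemctl start ceph-mgr@mgr-a

Alternatively, "ceph mgr fail mgr-a" asks the monitors to mark the
active MGR as failed and promote a standby without touching the systemd
unit; the hung daemon may still need a restart afterwards.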