Re: Command ceph osd df hangs

Hi,

Check whether the active MGR is hanging.
I ran into this while testing pg_autoscaler: after some time, every command would hang. Restarting the MGR helped only for a short while, so I ended up disabling pg_autoscaler. This is an upgraded cluster, currently on Nautilus.
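
One way to tell a hanging MGR apart from unreachable monitors is to run one command of each kind with a timeout. The following is a minimal sketch, not a definitive check: the helper names probe and diagnose are made up for illustration, and it assumes that ceph mgr dump is answered by the monitors while ceph osd df is served through the active MGR.

```python
import subprocess


def probe(cmd, timeout_s=10, run=subprocess.run):
    """Return True if cmd finishes within timeout_s seconds, False if it hangs."""
    try:
        run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL,
            timeout=timeout_s, check=False)
        return True
    except subprocess.TimeoutExpired:
        return False


def diagnose(timeout_s=10, run=subprocess.run):
    """Classify the hang. Assumption: 'mgr dump' is handled by the monitors,
    'osd df' is handled by the active MGR."""
    mon_ok = probe(["ceph", "mgr", "dump"], timeout_s, run)
    mgr_ok = probe(["ceph", "osd", "df"], timeout_s, run)
    if not mon_ok:
        return "monitors not answering"
    if not mgr_ok:
        return "active MGR probably hanging"
    return "ok"


if __name__ == "__main__":
    print(diagnose())
```

If only the MGR-backed command hangs, failing over with ceph mgr fail <name> or restarting the ceph-mgr service are the usual next steps. On Nautilus, ceph mgr module disable pg_autoscaler should turn the autoscaler module off, but verify that against your release first.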

Regards,
Eugen


Quoting Thomas Schneider <74cmonty@xxxxxxxxx>:

Hi,
the command ceph osd df does not return any output.
Based on the strace output, there is a timeout.
[...]
mmap(NULL, 262144, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f53006b9000
brk(0x55c2579b6000)                     = 0x55c2579b6000
brk(0x55c2579d7000)                     = 0x55c2579d7000
brk(0x55c2579f9000)                     = 0x55c2579f9000
brk(0x55c257a1a000)                     = 0x55c257a1a000
mmap(NULL, 262144, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f5300679000
brk(0x55c257a3b000)                     = 0x55c257a3b000
brk(0x55c257a5c000)                     = 0x55c257a5c000
brk(0x55c257a7d000)                     = 0x55c257a7d000
clone(child_stack=0x7f53095c1fb0, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x7f53095c29d0, tls=0x7f53095c2700, child_tidptr=0x7f53095c29d0) = 3261669
futex(0x55c257489940, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x55c2576246e0, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, FUTEX_BITSET_MATCH_ANY) = -1 EAGAIN (Resource temporarily unavailable)
select(0, NULL, NULL, NULL, {tv_sec=0, tv_usec=1000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {tv_sec=0, tv_usec=2000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {tv_sec=0, tv_usec=4000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {tv_sec=0, tv_usec=8000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {tv_sec=0, tv_usec=16000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {tv_sec=0, tv_usec=32000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {tv_sec=0, tv_usec=50000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {tv_sec=0, tv_usec=50000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {tv_sec=0, tv_usec=50000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {tv_sec=0, tv_usec=50000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {tv_sec=0, tv_usec=50000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {tv_sec=0, tv_usec=50000}^Cstrace: Process 3261645 detached
 <detached ...>
Interrupted
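
The select() timeouts above double from 1000 to 32000 microseconds and then stay at 50000. That matches the polling loop that Python 2's threading.Condition.wait() uses while waiting with a timeout. If the ceph CLI was running under Python 2 here (an assumption, since the traceback does not show the interpreter version), these lines most likely show the client waiting for a reply rather than an error of its own. A small sketch of that delay schedule follows; poll_delays is an illustrative name, and the real loop also caps each delay at the time remaining:

```python
def poll_delays(n, start=0.0005, cap=0.05):
    """Return the first n sleep intervals (in seconds) of a doubling backoff
    that starts at 2 * start and is capped at cap."""
    delays = []
    delay = start
    for _ in range(n):
        delay = min(delay * 2, cap)  # double each round, never above the cap
        delays.append(delay)
    return delays


if __name__ == "__main__":
    # Print in microseconds for comparison with the tv_usec values above.
    print([int(round(d * 1e6)) for d in poll_delays(8)])
```

The first eight intervals come out as 1000, 2000, 4000, 8000, 16000, 32000, 50000 and 50000 microseconds, the same sequence as in the trace.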
Traceback (most recent call last):
  File "/usr/bin/ceph", line 1263, in <module>
    retval = main()
  File "/usr/bin/ceph", line 1194, in main

    verbose)
  File "/usr/bin/ceph", line 619, in new_style_command
    ret, outbuf, outs = do_command(parsed_args, target, cmdargs, sigdict, inbuf, verbose)
  File "/usr/bin/ceph", line 593, in do_command
    return ret, '', ''
UnboundLocalError: local variable 'ret' referenced before assignment
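
The UnboundLocalError itself looks like a side effect of pressing Ctrl-C rather than the cause of the hang: the interrupt arrives before ret has been assigned, and the code then tries to return it. The sketch below reproduces that shape. It is a hypothetical illustration, not the actual /usr/bin/ceph source, and do_command_sketch and send are made-up names.

```python
def do_command_sketch(send):
    """Mimic a handler that returns 'ret' after an interrupt, even though
    'ret' is only assigned when send() completes."""
    try:
        ret, outbuf, outs = send()
    except KeyboardInterrupt:
        print("Interrupted")
        return ret, '', ''  # 'ret' was never bound: raises UnboundLocalError
    return ret, outbuf, outs


def interrupted_send():
    # Stand-in for a request that is still waiting when the user hits Ctrl-C.
    raise KeyboardInterrupt


if __name__ == "__main__":
    try:
        do_command_sketch(interrupted_send)
    except UnboundLocalError as exc:
        print("UnboundLocalError:", exc)
```

So the traceback mostly confirms that the command was still waiting for an answer when it was interrupted.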


How can I fix this?
Do you need the full strace output to analyse this issue?

Ceph has been reporting the following health status for hours, and I
cannot see any progress. I am not sure whether this is related to the
issue with ceph osd df, though.

2019-11-21 15:00:00.000262 mon.ld5505 [ERR] overall HEALTH_ERR 1
filesystem is degraded; 1 filesystem has a failed mds daemon; 1
filesystem is offline; insufficient standby MDS daemons available;
nodown,noout,noscrub,nodeep-scrub flag(s) set; 81 osds down; Reduced
data availability: 1366 pgs inactive, 241 pgs peering; Degraded data
redundancy: 6437/190964568 objects degraded (0.003%), 7 pgs degraded, 7
pgs undersized; 1 subtrees have overcommitted pool target_size_bytes; 1
subtrees have overcommitted pool target_size_ratio

THX
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


