No active PG; No disk activity

Good morning everyone!

Today we had an atypical situation in our cluster: all three machines shut
down.

After powering them back on, the cluster came up and the monitors formed
quorum with no problems, but all of the PGs show as Working
(peering/activating) and I see no disk activity on the machines. No PG is
active.
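
To enumerate the stuck PGs and ask a single one why it is not going active,
these are roughly the queries I am using (the pg ID 6.c3 is just an example
taken from the health detail further down):

[ceph: root@dcs1 /]# ceph pg dump_stuck inactive
[ceph: root@dcs1 /]# ceph pg 6.c3 query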




[ceph: root@dcs1 /]# ceph osd tree
ID  CLASS  WEIGHT    TYPE NAME       STATUS  REWEIGHT  PRI-AFF
-1         98.24359  root default
-3         32.74786      host dcs1
 0    hdd   2.72899          osd.0       up   1.00000  1.00000
 1    hdd   2.72899          osd.1       up   1.00000  1.00000
 2    hdd   2.72899          osd.2       up   1.00000  1.00000
 3    hdd   2.72899          osd.3       up   1.00000  1.00000
 4    hdd   2.72899          osd.4       up   1.00000  1.00000
 5    hdd   2.72899          osd.5       up   1.00000  1.00000
 6    hdd   2.72899          osd.6       up   1.00000  1.00000
 7    hdd   2.72899          osd.7       up   1.00000  1.00000
 8    hdd   2.72899          osd.8       up   1.00000  1.00000
 9    hdd   2.72899          osd.9       up   1.00000  1.00000
10    hdd   2.72899          osd.10      up   1.00000  1.00000
11    hdd   2.72899          osd.11      up   1.00000  1.00000
-5         32.74786      host dcs2
12    hdd   2.72899          osd.12      up   1.00000  1.00000
13    hdd   2.72899          osd.13      up   1.00000  1.00000
14    hdd   2.72899          osd.14      up   1.00000  1.00000
15    hdd   2.72899          osd.15      up   1.00000  1.00000
16    hdd   2.72899          osd.16      up   1.00000  1.00000
17    hdd   2.72899          osd.17      up   1.00000  1.00000
18    hdd   2.72899          osd.18      up   1.00000  1.00000
19    hdd   2.72899          osd.19      up   1.00000  1.00000
20    hdd   2.72899          osd.20      up   1.00000  1.00000
21    hdd   2.72899          osd.21      up   1.00000  1.00000
22    hdd   2.72899          osd.22      up   1.00000  1.00000
23    hdd   2.72899          osd.23      up   1.00000  1.00000
-7         32.74786      host dcs3
24    hdd   2.72899          osd.24      up   1.00000  1.00000
25    hdd   2.72899          osd.25      up   1.00000  1.00000
26    hdd   2.72899          osd.26      up   1.00000  1.00000
27    hdd   2.72899          osd.27      up   1.00000  1.00000
28    hdd   2.72899          osd.28      up   1.00000  1.00000
29    hdd   2.72899          osd.29      up   1.00000  1.00000
30    hdd   2.72899          osd.30      up   1.00000  1.00000
31    hdd   2.72899          osd.31      up   1.00000  1.00000
32    hdd   2.72899          osd.32      up   1.00000  1.00000
33    hdd   2.72899          osd.33      up   1.00000  1.00000
34    hdd   2.72899          osd.34      up   1.00000  1.00000
35    hdd   2.72899          osd.35      up   1.00000  1.00000




[ceph: root@dcs1 /]# ceph -s
  cluster:
    id:     58bbb950-538b-11ed-b237-2c59e53b80cc
    health: HEALTH_WARN
            4 filesystems are degraded
            4 MDSs report slow metadata IOs
            Reduced data availability: 1153 pgs inactive, 1101 pgs peering
            26 slow ops, oldest one blocked for 563 sec, daemons
[osd.10,osd.13,osd.14,osd.15,osd.16,osd.18,osd.20,osd.21,osd.24,osd.25]...
have slow ops.

  services:
    mon: 3 daemons, quorum dcs1.evocorp,dcs2,dcs3 (age 7m)
    mgr: dcs1.evocorp.kyqfcd(active, since 15m), standbys: dcs2.rirtyl
    mds: 4/4 daemons up, 4 standby
    osd: 36 osds: 36 up (since 6m), 36 in (since 47m); 65 remapped pgs

  data:
    volumes: 0/4 healthy, 4 recovering
    pools:   10 pools, 1153 pgs
    objects: 254.72k objects, 994 GiB
    usage:   2.8 TiB used, 95 TiB / 98 TiB avail
    pgs:     100.000% pgs not active
             1036 peering
             65   remapped+peering
             52   activating
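
Since peering depends on the OSDs being able to reach each other, one thing I
am checking (just a first sketch, nothing conclusive yet) is whether any
particular OSD is blocking its peers:

[ceph: root@dcs1 /]# ceph osd blocked-by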




[ceph: root@dcs1 /]# ceph health detail
HEALTH_WARN 4 filesystems are degraded; 4 MDSs report slow metadata IOs;
Reduced data availability: 1153 pgs inactive, 1101 pgs peering; 26 slow
ops, oldest one blocked for 673 sec, daemons
[osd.10,osd.13,osd.14,osd.15,osd.16,osd.18,osd.20,osd.21,osd.24,osd.25]...
have slow ops.
[WRN] FS_DEGRADED: 4 filesystems are degraded
    fs dc_ovirt is degraded
    fs dc_iso is degraded
    fs dc_sas is degraded
    fs pool_tester is degraded
[WRN] MDS_SLOW_METADATA_IO: 4 MDSs report slow metadata IOs
    mds.dc_sas.dcs1.wbyuik(mds.0): 4 slow metadata IOs are blocked > 30
secs, oldest blocked for 1063 secs
    mds.dc_ovirt.dcs1.lpcazs(mds.0): 4 slow metadata IOs are blocked > 30
secs, oldest blocked for 1058 secs
    mds.pool_tester.dcs1.ixkkfs(mds.0): 4 slow metadata IOs are blocked >
30 secs, oldest blocked for 1058 secs
    mds.dc_iso.dcs1.jxqqjd(mds.0): 4 slow metadata IOs are blocked > 30
secs, oldest blocked for 1058 secs
[WRN] PG_AVAILABILITY: Reduced data availability: 1153 pgs inactive, 1101
pgs peering
    pg 6.c3 is stuck inactive for 50m, current state peering, last acting
[30,15,11]
    pg 6.c4 is stuck peering for 10h, current state peering, last acting
[12,0,26]
    pg 6.c5 is stuck peering for 10h, current state peering, last acting
[12,32,6]
    pg 6.c6 is stuck peering for 11h, current state peering, last acting
[30,4,22]
    pg 6.c7 is stuck peering for 10h, current state peering, last acting
[4,14,26]
    pg 6.c8 is stuck peering for 10h, current state peering, last acting
[0,22,32]
    pg 6.c9 is stuck peering for 11h, current state peering, last acting
[32,20,0]
    pg 6.ca is stuck peering for 11h, current state peering, last acting
[31,0,23]
    pg 6.cb is stuck peering for 10h, current state peering, last acting
[8,35,16]
    pg 6.cc is stuck peering for 10h, current state peering, last acting
[8,24,13]
    pg 6.cd is stuck peering for 10h, current state peering, last acting
[15,25,1]
    pg 6.ce is stuck peering for 11h, current state peering, last acting
[27,23,4]
    pg 6.cf is stuck peering for 11h, current state peering, last acting
[25,4,20]
    pg 7.c4 is stuck peering for 11m, current state remapped+peering, last
acting [19,8]
    pg 7.c5 is stuck peering for 10h, current state peering, last acting
[6,14,32]
    pg 7.c6 is stuck peering for 10h, current state peering, last acting
[14,35,5]
    pg 7.c7 is stuck peering for 10h, current state remapped+peering, last
acting [11,14]
    pg 7.c8 is stuck peering for 10h, current state peering, last acting
[21,9,28]
    pg 7.c9 is stuck peering for 10h, current state peering, last acting
[0,30,15]
    pg 7.ca is stuck peering for 10h, current state peering, last acting
[23,2,26]
    pg 7.cb is stuck peering for 10h, current state peering, last acting
[23,9,24]
    pg 7.cc is stuck peering for 10h, current state peering, last acting
[23,27,0]
    pg 7.cd is stuck peering for 11m, current state remapped+peering, last
acting [13,6]
    pg 7.ce is stuck peering for 10h, current state peering, last acting
[16,1,25]
    pg 7.cf is stuck peering for 11h, current state peering, last acting
[24,16,8]
    pg 9.c0 is stuck peering for 10h, current state peering, last acting
[21,28]
    pg 9.c1 is stuck peering for 10h, current state peering, last acting
[12,31]
    pg 9.c2 is stuck peering for 10h, current state peering, last acting
[6,27]
    pg 9.c3 is stuck peering for 10h, current state peering, last acting
[9,27]
    pg 9.c4 is stuck peering for 50m, current state peering, last acting
[17,34]
    pg 9.c5 is stuck peering for 11h, current state peering, last acting
[31,8]
    pg 9.c6 is stuck peering for 10h, current state peering, last acting
[1,29]
    pg 9.c7 is stuck peering for 10h, current state peering, last acting
[12,30]
    pg 9.c8 is stuck peering for 11h, current state peering, last acting
[26,3]
    pg 9.c9 is stuck peering for 11h, current state peering, last acting
[29,13]
    pg 9.ca is stuck peering for 11h, current state peering, last acting
[25,6]
    pg 9.cb is stuck peering for 10h, current state peering, last acting
[16,9]
    pg 9.cc is stuck peering for 4h, current state peering, last acting
[4,29]
    pg 10.c0 is stuck peering for 11h, current state peering, last acting
[32,19]
    pg 10.c1 is stuck peering for 10h, current state peering, last acting
[23,6]
    pg 10.c2 is stuck peering for 11h, current state peering, last acting
[24,7]
    pg 10.c3 is stuck peering for 38m, current state peering, last acting
[5,20]
    pg 10.c4 is stuck peering for 10h, current state peering, last acting
[21,4]
    pg 10.c5 is stuck peering for 10h, current state peering, last acting
[12,8]
    pg 10.c6 is stuck peering for 11h, current state peering, last acting
[34,7]
    pg 10.c7 is stuck peering for 10h, current state peering, last acting
[17,30]
    pg 10.c8 is stuck peering for 11h, current state peering, last acting
[24,19]
    pg 10.c9 is stuck inactive for 54m, current state activating, last
acting [13,3]
    pg 10.ca is stuck peering for 10h, current state peering, last acting
[16,6]
    pg 10.cb is stuck peering for 11h, current state peering, last acting
[26,13]
    pg 10.cf is stuck peering for 50m, current state peering, last acting
[21,24]
[WRN] SLOW_OPS: 26 slow ops, oldest one blocked for 673 sec, daemons
[osd.10,osd.13,osd.14,osd.15,osd.16,osd.18,osd.20,osd.21,osd.24,osd.25]...
have slow ops.
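
For the slow ops I intend to look at the blocked requests on one of the OSDs
listed above via its admin socket (osd.10 is just the first daemon in that
list, and these commands have to be run on the host/container where that OSD
actually lives):

[ceph: root@dcs1 /]# ceph daemon osd.10 dump_blocked_ops
[ceph: root@dcs1 /]# ceph daemon osd.10 dump_ops_in_flight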