Re: Another cluster completely hang

pool 0 'rbd' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 512 pgp_num 512 last_change 9313 flags hashpspool stripe_width 0
       removed_snaps [1~3]
pool 1 'rbd2' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 512 pgp_num 512 last_change 9314 flags hashpspool stripe_width 0
       removed_snaps [1~3]
pool 2 'rbd3' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 512 pgp_num 512 last_change 10537 flags hashpspool stripe_width 0
       removed_snaps [1~3]


ID WEIGHT  REWEIGHT SIZE   USE   AVAIL %USE  VAR   
5 1.81000  1.00000  1857G  984G  872G 53.00 0.86  
6 1.81000  1.00000  1857G 1202G  655G 64.73 1.05  
2 1.81000  1.00000  1857G 1158G  698G 62.38 1.01  
3 1.35999  1.00000  1391G  906G  485G 65.12 1.06  
4 0.89999  1.00000   926G  702G  223G 75.88 1.23  
7 1.81000  1.00000  1857G 1063G  793G 57.27 0.93  
8 1.81000  1.00000  1857G 1011G  846G 54.44 0.88  
9 0.89999  1.00000   926G  573G  352G 61.91 1.01  
0 1.81000  1.00000  1857G 1227G  629G 66.10 1.07  
13 0.45000  1.00000   460G  307G  153G 66.74 1.08  
             TOTAL 14846G 9136G 5710G 61.54       
MIN/MAX VAR: 0.86/1.23  STDDEV: 6.47
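
As an aid to reading the ceph osd df table above, the %USE and VAR columns can be recomputed from SIZE and USE. This is a minimal Python sketch, assuming VAR is each OSD's utilization divided by the cluster-wide average utilization (an assumption that matches the numbers shown). The values are copied from the table; the names OSDS and utilization_report are made up for illustration.

```python
# (size_GB, used_GB) per OSD id, copied from the `ceph osd df` output above.
OSDS = {
    5: (1857, 984), 6: (1857, 1202), 2: (1857, 1158), 3: (1391, 906),
    4: (926, 702), 7: (1857, 1063), 8: (1857, 1011), 9: (926, 573),
    0: (1857, 1227), 13: (460, 307),
}


def utilization_report(osds):
    """Return {osd_id: (percent_used, var)}; var is relative to the cluster average."""
    total_size = sum(size for size, _ in osds.values())
    total_used = sum(used for _, used in osds.values())
    average = total_used / total_size  # assumed definition of the VAR baseline
    return {
        osd_id: (100.0 * used / size, (used / size) / average)
        for osd_id, (size, used) in osds.items()
    }


if __name__ == "__main__":
    report = utilization_report(OSDS)
    # Print the fullest OSDs first.
    for osd_id, (pct, var) in sorted(report.items(), key=lambda kv: -kv[1][0]):
        print(f"osd.{osd_id}: {pct:.2f}% used, VAR {var:.2f}")
```

Because the table rounds sizes to whole gigabytes, the recomputed percentages can differ slightly from the printed ones. The fullest OSD here is osd.4, at roughly 76%.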



ceph version 0.94.7 (d56bdf93ced6b80b07397d57e3fa68fe68304432)

http://pastebin.com/SvGfcSHb
http://pastebin.com/gYFatsNS
http://pastebin.com/VZD7j2vN

I do not understand why I/O on the ENTIRE cluster is blocked when only a few PGs are incomplete.

Many thanks,
Mario
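
On the question of why so few incomplete PGs can stall everything: an RBD image is striped into many objects that are hashed across the PGs of its pool, and Ceph blocks requests to an inactive PG instead of failing them. The sketch below estimates how likely it is that a given image has at least one object in an inactive PG. It is only a back-of-the-envelope model: it assumes the default 4 MiB RBD object size, uniform and independent placement of objects, and that the 19 inactive PGs are spread evenly over all 1024 PGs reported by ceph -s. The function name is made up for illustration.

```python
# Figures taken from the `ceph -s` output quoted in this thread.
TOTAL_PGS = 1024
INACTIVE_PGS = 19  # 16 incomplete + 3 down+incomplete

# Assumption: the default RBD object size of 4 MiB.
OBJECT_SIZE_MIB = 4


def prob_image_hits_inactive_pg(image_size_gib, allocated_fraction=1.0):
    """Estimate the chance that at least one object of an image lives in an inactive PG.

    Object placement is treated as uniform and independent across PGs,
    which only approximates real CRUSH/rjenkins placement.
    """
    n_objects = int(image_size_gib * 1024 / OBJECT_SIZE_MIB * allocated_fraction)
    p_object_ok = 1 - INACTIVE_PGS / TOTAL_PGS
    return 1 - p_object_ok ** n_objects


if __name__ == "__main__":
    for size_gib in (1, 10, 32):
        p = prob_image_hits_inactive_pg(size_gib)
        print(f"{size_gib:>3} GiB image: P(touches an inactive PG) = {p:.4f}")
```

Under these assumptions even a fully allocated 1 GiB image has a better than 99% chance of owning an object in an inactive PG, so nearly every VM disk will sooner or later issue a request that blocks.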


On Tue, 28 Jun 2016 at 19:34, Stefan Priebe - Profihost AG <s.priebe@xxxxxxxxxxxx> wrote:
And the output of ceph health detail, please.

Stefan

Please excuse any typos; sent from my mobile phone.

On 28.06.2016 at 19:28, Oliver Dzombic <info@xxxxxxxxxxxxxxxxx> wrote:

Hi Mario,

Please give us some more details.

Please post the output of:

ceph osd pool ls detail
ceph osd df
ceph --version

ceph -w for 10 seconds (please use http://pastebin.com/)

ceph osd crush dump (pastebin as well, please)
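
The commands requested in this thread can be captured in one go. The following is a rough Python sketch, not an official tool: it assumes the ceph CLI is on the PATH of the node where it runs, and the names COMMANDS, collect and the output directory ceph-diag are made up for illustration. Since ceph -w never exits on its own, it is given a 10 second timeout.

```python
import subprocess
from pathlib import Path

# name -> (argv, timeout in seconds or None). `ceph -w` streams forever,
# so it is cut off after 10 seconds as requested in the thread.
COMMANDS = {
    "pool_ls_detail": (["ceph", "osd", "pool", "ls", "detail"], None),
    "osd_df": (["ceph", "osd", "df"], None),
    "version": (["ceph", "--version"], None),
    "health_detail": (["ceph", "health", "detail"], None),
    "watch_10s": (["ceph", "-w"], 10),
    "crush_dump": (["ceph", "osd", "crush", "dump"], None),
}


def collect(commands, out_dir):
    """Run each command and write its stdout and stderr to <out_dir>/<name>.txt."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for name, (argv, timeout) in commands.items():
        path = out_dir / f"{name}.txt"
        with open(path, "w") as fh:
            try:
                subprocess.run(argv, stdout=fh, stderr=subprocess.STDOUT, timeout=timeout)
            except subprocess.TimeoutExpired:
                pass  # expected for `ceph -w`; output so far is already in the file
            except FileNotFoundError:
                fh.write(f"command not found: {argv[0]}\n")
        written.append(path)
    return written


if __name__ == "__main__":
    for path in collect(COMMANDS, "ceph-diag"):
        print(path)
```

The resulting text files can then be pasted to a pastebin or attached to a reply.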

--
Best regards

Oliver Dzombic
IP-Interactive

mailto:info@xxxxxxxxxxxxxxxxx

Address:

IP Interactive UG ( haftungsbeschraenkt )
Zum Sonnenberg 1-3
63571 Gelnhausen

HRB 93402, Amtsgericht Hanau (district court)
Managing director: Oliver Dzombic

Tax no.: 35 236 3622 1
VAT ID: DE274086107


On 28.06.2016 at 18:59, Mario Giammarco wrote:
Hello,
this is the second time this has happened to me. I hope that someone can
explain what I can do.
Proxmox Ceph cluster with 8 servers, 11 HDDs. min_size=1, size=2.

One hdd goes down due to bad sectors.
Ceph recovers but it ends with:

cluster f2a8dd7d-949a-4a29-acab-11d4900249f4
    health HEALTH_WARN
           3 pgs down
           19 pgs incomplete
           19 pgs stuck inactive
           19 pgs stuck unclean
           7 requests are blocked > 32 sec
    monmap e11: 7 mons at {0=192.168.0.204:6789/0,1=192.168.0.201:6789/0,2=192.168.0.203:6789/0,3=192.168.0.205:6789/0,4=192.168.0.202:6789/0,5=192.168.0.206:6789/0,6=192.168.0.207:6789/0}
           election epoch 722, quorum 0,1,2,3,4,5,6 1,4,2,0,3,5,6
    osdmap e10182: 10 osds: 10 up, 10 in
     pgmap v3295880: 1024 pgs, 2 pools, 4563 GB data, 1143 kobjects
           9136 GB used, 5710 GB / 14846 GB avail
               1005 active+clean
                 16 incomplete
                  3 down+incomplete

Unfortunately, "7 requests are blocked" means that no virtual machine can
boot, because Ceph has stopped I/O.

I can accept losing some data, but not ALL of it!
Can you help me, please?
Thanks,
Mario

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
