Re: Another cluster completely hang

Hi Mario,

in my opinion you should

1. fix

 too many PGs per OSD (307 > max 300)

2. stop scrubbing / deep scrubbing
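
For example, something along these lines (a rough sketch, not tested on
your cluster; 400 is just an example threshold, and raising it only
silences the warning, the real fix is more OSDs or fewer/smaller pools):

ceph osd set noscrub
ceph osd set nodeep-scrub

ceph tell mon.* injectargs '--mon_pg_warn_max_per_osd 400'

You can re-enable scrubbing later with "ceph osd unset noscrub" and
"ceph osd unset nodeep-scrub" once the cluster is healthy again.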

--------------

What does your current

ceph osd tree

look like?



-- 
Mit freundlichen Gruessen / Best regards

Oliver Dzombic
IP-Interactive

mailto:info@xxxxxxxxxxxxxxxxx

Anschrift:

IP Interactive UG ( haftungsbeschraenkt )
Zum Sonnenberg 1-3
63571 Gelnhausen

HRB 93402 beim Amtsgericht Hanau
Geschäftsführung: Oliver Dzombic

Steuer Nr.: 35 236 3622 1
UST ID: DE274086107


On 29.06.2016 at 09:50, Mario Giammarco wrote:
> I have searched google and I see that there is no official procedure.
> 
> On Wed, 29 Jun 2016 at 09:43, Mario Giammarco
> <mgiammarco@xxxxxxxxx> wrote:
> 
>     I have read the post "incomplete pgs, oh my" many times, but I
>     think my case is different: the broken disk is completely dead.
>     So how can I simply mark the incomplete PGs as complete?
>     Should I stop Ceph first?
> 
> 
>     On Wed, 29 Jun 2016 at 09:36, Tomasz Kuzemko
>     <tomasz.kuzemko@xxxxxxxxxxxx> wrote:
> 
>         Hi,
>         if you need fast access to your remaining data, you can use
>         ceph-objectstore-tool to mark those PGs as complete; however,
>         this will irreversibly lose the missing data.
> 
>         If you understand the risks, the procedure is explained pretty
>         well here:
>         http://ceph.com/community/incomplete-pgs-oh-my/
> 
>         Since that article was written, ceph-objectstore-tool has gained
>         a feature that was not available at the time: "--op mark-complete".
>         I think in your case it will be necessary to call --op
>         mark-complete after you import the PG into the temporary OSD
>         (between steps 12 and 13).
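> 
>         Roughly, that extra step would look something like this (just a
>         sketch; <id> and <pgid> are placeholders for the temporary OSD
>         and the incomplete PG, and that OSD must be stopped while you
>         run it):
> 
>             ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-<id> \
>                 --journal-path /var/lib/ceph/osd/ceph-<id>/journal \
>                 --pgid <pgid> --op mark-complete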
> 
>         On 29.06.2016 09:09, Mario Giammarco wrote:
>         > Now I have also discovered that, by mistake, someone has put
>         > production data on a virtual machine in the cluster. I need
>         > Ceph to resume I/O so I can boot that virtual machine.
>         > Can I mark the incomplete PGs as valid?
>         > If needed, where can I buy some paid support?
>         > Thanks again,
>         > Mario
>         >
>         > On Wed, 29 Jun 2016 at 08:02, Mario Giammarco
>         > <mgiammarco@xxxxxxxxx> wrote:
>         >
>         >     pool 0 'rbd' replicated size 2 min_size 1 crush_ruleset 0
>         >     object_hash rjenkins pg_num 512 pgp_num 512 last_change 9313
>         >     flags hashpspool stripe_width 0
>         >            removed_snaps [1~3]
>         >     pool 1 'rbd2' replicated size 2 min_size 1 crush_ruleset 0
>         >     object_hash rjenkins pg_num 512 pgp_num 512 last_change 9314
>         >     flags hashpspool stripe_width 0
>         >            removed_snaps [1~3]
>         >     pool 2 'rbd3' replicated size 2 min_size 1 crush_ruleset 0
>         >     object_hash rjenkins pg_num 512 pgp_num 512 last_change 10537
>         >     flags hashpspool stripe_width 0
>         >            removed_snaps [1~3]
>         >
>         >
>         >     ID WEIGHT  REWEIGHT SIZE   USE   AVAIL %USE  VAR
>         >     5 1.81000  1.00000  1857G  984G  872G 53.00 0.86
>         >     6 1.81000  1.00000  1857G 1202G  655G 64.73 1.05
>         >     2 1.81000  1.00000  1857G 1158G  698G 62.38 1.01
>         >     3 1.35999  1.00000  1391G  906G  485G 65.12 1.06
>         >     4 0.89999  1.00000   926G  702G  223G 75.88 1.23
>         >     7 1.81000  1.00000  1857G 1063G  793G 57.27 0.93
>         >     8 1.81000  1.00000  1857G 1011G  846G 54.44 0.88
>         >     9 0.89999  1.00000   926G  573G  352G 61.91 1.01
>         >     0 1.81000  1.00000  1857G 1227G  629G 66.10 1.07
>         >     13 0.45000  1.00000   460G  307G  153G 66.74 1.08
>         >                  TOTAL 14846G 9136G 5710G 61.54
>         >     MIN/MAX VAR: 0.86/1.23  STDDEV: 6.47
>         >
>         >
>         >
>         >     ceph version 0.94.7 (d56bdf93ced6b80b07397d57e3fa68fe68304432)
>         >
>         >     http://pastebin.com/SvGfcSHb
>         >     http://pastebin.com/gYFatsNS
>         >     http://pastebin.com/VZD7j2vN
>         >
>         >     I do not understand why I/O on the ENTIRE cluster is blocked
>         >     when only a few PGs are incomplete.
>         >
>         >     Many thanks,
>         >     Mario
>         >
>         >
>         >     On Tue, 28 Jun 2016 at 19:34, Stefan Priebe - Profihost AG
>         >     <s.priebe@xxxxxxxxxxxx> wrote:
>         >
>         >         And ceph health detail
>         >
>         >         Stefan
>         >
>         >         Excuse my typo sent from my mobile phone.
>         >
>         >         On 28.06.2016 at 19:28, Oliver Dzombic
>         >         <info@xxxxxxxxxxxxxxxxx> wrote:
>         >
>         >>         Hi Mario,
>         >>
>         >>         please give some more details:
>         >>
>         >>         Please post the output of:
>         >>
>         >>         ceph osd pool ls detail
>         >>         ceph osd df
>         >>         ceph --version
>         >>
>         >>         ceph -w for 10 seconds ( use http://pastebin.com/ please )
>         >>
>         >>         ceph osd crush dump ( also pastebin pls )
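>         >>
>         >>         For example, to capture those (just a sketch; the file
>         >>         names are arbitrary):
>         >>
>         >>         timeout 10 ceph -w > ceph-w.txt 2>&1
>         >>         ceph osd crush dump > crush-dump.txt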
>         >>
>         >>         --
>         >>         Mit freundlichen Gruessen / Best regards
>         >>
>         >>         Oliver Dzombic
>         >>         IP-Interactive
>         >>
>         >>         mailto:info@xxxxxxxxxxxxxxxxx
>         >>
>         >>         Anschrift:
>         >>
>         >>         IP Interactive UG ( haftungsbeschraenkt )
>         >>         Zum Sonnenberg 1-3
>         >>         63571 Gelnhausen
>         >>
>         >>         HRB 93402 beim Amtsgericht Hanau
>         >>         Geschäftsführung: Oliver Dzombic
>         >>
>         >>         Steuer Nr.: 35 236 3622 1
>         >>         UST ID: DE274086107
>         >>
>         >>
>         >>         On 28.06.2016 at 18:59, Mario Giammarco wrote:
>         >>>         Hello,
>         >>>         this is the second time this has happened to me; I hope
>         >>>         someone can explain what I can do.
>         >>>         Proxmox Ceph cluster with 8 servers, 11 HDDs.
>         >>>         min_size=1, size=2.
>         >>>
>         >>>         One HDD went down due to bad sectors.
>         >>>         Ceph recovered, but it ended with:
>         >>>
>         >>>         cluster f2a8dd7d-949a-4a29-acab-11d4900249f4
>         >>>             health HEALTH_WARN
>         >>>                    3 pgs down
>         >>>                    19 pgs incomplete
>         >>>                    19 pgs stuck inactive
>         >>>                    19 pgs stuck unclean
>         >>>                    7 requests are blocked > 32 sec
>         >>>             monmap e11: 7 mons at
>         >>>         {0=192.168.0.204:6789/0,1=192.168.0.201:6789/0,2=192.168.0.203:6789/0,3=192.168.0.205:6789/0,4=192.168.0.202:6789/0,5=192.168.0.206:6789/0,6=192.168.0.207:6789/0}
>         >>>                    election epoch 722, quorum 0,1,2,3,4,5,6 1,4,2,0,3,5,6
>         >>>             osdmap e10182: 10 osds: 10 up, 10 in
>         >>>              pgmap v3295880: 1024 pgs, 2 pools, 4563 GB data, 1143 kobjects
>         >>>                    9136 GB used, 5710 GB / 14846 GB avail
>         >>>                        1005 active+clean
>         >>>                          16 incomplete
>         >>>                           3 down+incomplete
>         >>>
>         >>>         Unfortunately, "7 requests blocked" means no virtual
>         >>>         machine can boot, because Ceph has stopped I/O.
>         >>>
>         >>>         I can accept losing some data, but not ALL of it!
>         >>>         Can you help me please?
>         >>>         Thanks,
>         >>>         Mario
>         >>>
> 
>         --
>         Tomasz Kuzemko
>         tomasz.kuzemko@xxxxxxxxxxxx
> 
> 
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



