Re: Another cluster completely hang

As far as I know there isn't, which is a shame. We have rehearsed a
situation like this in our dev environment to be ready for it in
production, and it worked; however, be aware that any data Ceph
believes is missing will be lost once you mark a PG complete.

In your situation I would find the OSD that has the most complete copy
of each incomplete PG by looking at the files in
/var/lib/ceph/osd/*/current (based on size, or maybe the mtime of the
files) and export it using ceph-objectstore-tool. After that you can
follow the procedure described in "incomplete pgs, oh my", with the
addition of "--op mark-complete" between steps 12 and 13; a rough
sketch of the commands is below.
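
To illustrate, assuming OSD 5 holds the best copy of a hypothetical PG
1.2f, OSD 20 is the temporary OSD from the article, and the paths match
the default FileStore layout on a Hammer (0.94.x) cluster -- these IDs
and paths are placeholders, so adjust them to your own setup before
running anything:

    # compare the copies of the PG across OSDs (size / mtime heuristic)
    du -sh /var/lib/ceph/osd/ceph-*/current/1.2f_head

    # stop the source OSD so its object store is quiescent, then export
    # the PG (on some setups this is "service ceph stop osd.5" instead)
    systemctl stop ceph-osd@5
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-5 \
        --journal-path /var/lib/ceph/osd/ceph-5/journal \
        --pgid 1.2f --op export --file /root/1.2f.export

    # on the temporary OSD (after step 12 of the article), import the
    # PG and mark it complete before starting that OSD again (step 13)
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-20 \
        --journal-path /var/lib/ceph/osd/ceph-20/journal \
        --pgid 1.2f --op import --file /root/1.2f.export
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-20 \
        --journal-path /var/lib/ceph/osd/ceph-20/journal \
        --pgid 1.2f --op mark-complete

Once mark-complete succeeds and the PG goes active, whatever Ceph still
considered missing in it is gone for good.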

On 29.06.2016 09:50, Mario Giammarco wrote:
> I have searched Google and I see that there is no official procedure.
> 
> On Wed, 29 Jun 2016 at 09:43, Mario Giammarco
> <mgiammarco@xxxxxxxxx> wrote:
> 
>     I have read the post "incomplete pgs, oh my" many times.
>     I think my case is different:
>     the failed disk is completely dead.
>     So how can I simply mark the incomplete PGs as complete?
>     Should I stop Ceph first?
> 
> 
>     On Wed, 29 Jun 2016 at 09:36, Tomasz Kuzemko
>     <tomasz.kuzemko@xxxxxxxxxxxx> wrote:
> 
>         Hi,
>         if you need fast access to your remaining data, you can use
>         ceph-objectstore-tool to mark those PGs as complete; however,
>         this will irreversibly lose the missing data.
>
>         If you understand the risks, the procedure is explained pretty
>         well here:
>         http://ceph.com/community/incomplete-pgs-oh-my/
> 
>         Since that article was written, ceph-objectstore-tool has gained
>         a feature that was not available at the time: "--op mark-complete".
>         I think in your case it will be necessary to call --op mark-complete
>         after you import the PG into the temporary OSD (between steps 12
>         and 13).
> 
>         On 29.06.2016 09:09, Mario Giammarco wrote:
>         > Now I have also discovered that, by mistake, someone has put
>         > production data on a virtual machine in the cluster. I need
>         > Ceph to resume I/O so I can boot that virtual machine.
>         > Can I mark the incomplete PGs as valid?
>         > If needed, where can I buy some paid support?
>         > Thanks again,
>         > Mario
>         >
>         > On Wed, 29 Jun 2016 at 08:02, Mario Giammarco
>         > <mgiammarco@xxxxxxxxx> wrote:
>         >
>         >     pool 0 'rbd' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 512 pgp_num 512 last_change 9313 flags hashpspool stripe_width 0
>         >            removed_snaps [1~3]
>         >     pool 1 'rbd2' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 512 pgp_num 512 last_change 9314 flags hashpspool stripe_width 0
>         >            removed_snaps [1~3]
>         >     pool 2 'rbd3' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 512 pgp_num 512 last_change 10537 flags hashpspool stripe_width 0
>         >            removed_snaps [1~3]
>         >
>         >
>         >     ID WEIGHT  REWEIGHT SIZE   USE   AVAIL %USE  VAR
>         >     5 1.81000  1.00000  1857G  984G  872G 53.00 0.86
>         >     6 1.81000  1.00000  1857G 1202G  655G 64.73 1.05
>         >     2 1.81000  1.00000  1857G 1158G  698G 62.38 1.01
>         >     3 1.35999  1.00000  1391G  906G  485G 65.12 1.06
>         >     4 0.89999  1.00000   926G  702G  223G 75.88 1.23
>         >     7 1.81000  1.00000  1857G 1063G  793G 57.27 0.93
>         >     8 1.81000  1.00000  1857G 1011G  846G 54.44 0.88
>         >     9 0.89999  1.00000   926G  573G  352G 61.91 1.01
>         >     0 1.81000  1.00000  1857G 1227G  629G 66.10 1.07
>         >     13 0.45000  1.00000   460G  307G  153G 66.74 1.08
>         >                  TOTAL 14846G 9136G 5710G 61.54
>         >     MIN/MAX VAR: 0.86/1.23  STDDEV: 6.47
>         >
>         >
>         >
>         >     ceph version 0.94.7 (d56bdf93ced6b80b07397d57e3fa68fe68304432)
>         >
>         >     http://pastebin.com/SvGfcSHb
>         >     http://pastebin.com/gYFatsNS
>         >     http://pastebin.com/VZD7j2vN
>         >
>         >     I do not understand why I/O on the ENTIRE cluster is blocked
>         >     when only a few PGs are incomplete.
>         >
>         >     Many thanks,
>         >     Mario
>         >
>         >
>         >     On Tue, 28 Jun 2016 at 19:34, Stefan Priebe - Profihost AG
>         >     <s.priebe@xxxxxxxxxxxx> wrote:
>         >
>         >         And the output of "ceph health detail", please.
>         >
>         >         Stefan
>         >
>         >         Excuse my typos; sent from my mobile phone.
>         >
>         >         On 28.06.2016 at 19:28, Oliver Dzombic
>         >         <info@xxxxxxxxxxxxxxxxx> wrote:
>         >
>         >>         Hi Mario,
>         >>
>         >>         please give some more details:
>         >>
>         >>         Please provide the output of:
>         >>
>         >>         ceph osd pool ls detail
>         >>         ceph osd df
>         >>         ceph --version
>         >>
>         >>         ceph -w for 10 seconds ( use http://pastebin.com/ please )
>         >>
>         >>         ceph osd crush dump ( also pastebin pls )
>         >>
>         >>         --
>         >>         Mit freundlichen Gruessen / Best regards
>         >>
>         >>         Oliver Dzombic
>         >>         IP-Interactive
>         >>
>         >>         mailto:info@xxxxxxxxxxxxxxxxx
>         >>
>         >>         Address:
>         >>
>         >>         IP Interactive UG ( haftungsbeschraenkt )
>         >>         Zum Sonnenberg 1-3
>         >>         63571 Gelnhausen
>         >>
>         >>         HRB 93402, Amtsgericht Hanau
>         >>         Managing Director: Oliver Dzombic
>         >>
>         >>         Tax No.: 35 236 3622 1
>         >>         VAT ID: DE274086107
>         >>
>         >>
>         >>         On 28.06.2016 at 18:59, Mario Giammarco wrote:
>         >>>         Hello,
>         >>>         this is the second time this has happened to me; I hope
>         >>>         someone can explain what I can do.
>         >>>         Proxmox Ceph cluster with 8 servers, 11 HDDs.
>         >>>         min_size=1, size=2.
>         >>>
>         >>>         One HDD went down due to bad sectors.
>         >>>         Ceph recovered, but it ended with:
>         >>>
>         >>>         cluster f2a8dd7d-949a-4a29-acab-11d4900249f4
>         >>>             health HEALTH_WARN
>         >>>                    3 pgs down
>         >>>                    19 pgs incomplete
>         >>>                    19 pgs stuck inactive
>         >>>                    19 pgs stuck unclean
>         >>>                    7 requests are blocked > 32 sec
>         >>>             monmap e11: 7 mons at
>         >>>         {0=192.168.0.204:6789/0,1=192.168.0.201:6789/0,2=192.168.0.203:6789/0,3=192.168.0.205:6789/0,4=192.168.0.202:6789/0,5=192.168.0.206:6789/0,6=192.168.0.207:6789/0}
>         >>>                    election epoch 722, quorum 0,1,2,3,4,5,6 1,4,2,0,3,5,6
>         >>>             osdmap e10182: 10 osds: 10 up, 10 in
>         >>>              pgmap v3295880: 1024 pgs, 2 pools, 4563 GB data, 1143 kobjects
>         >>>                    9136 GB used, 5710 GB / 14846 GB avail
>         >>>                        1005 active+clean
>         >>>                          16 incomplete
>         >>>                           3 down+incomplete
>         >>>
>         >>>         Unfortunately "7 requests blocked" means no virtual
>         >>>         machine can boot, because Ceph has stopped I/O.
>         >>>
>         >>>         I can accept losing some data, but not ALL data!
>         >>>         Can you help me please?
>         >>>         Thanks,
>         >>>         Mario
>         >>>
>         >
>         >
>         >
>         >
> 
>         --
>         Tomasz Kuzemko
>         tomasz.kuzemko@xxxxxxxxxxxx
> 
> 

-- 
Tomasz Kuzemko
tomasz.kuzemko@xxxxxxxxxxxx


_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
