Re: Disk Down Emergency

Thank you all for your time and support.

I don't see any backfilling in the logs, and the numbers of "active+degraded", "active+remapped", and "active+clean" PGs have been the same for some time now. The only activity I see is scrubbing.
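
For reference, these are roughly the commands I keep re-running to check whether anything is moving (the degraded counts and any recovery I/O should change between runs if recovery were active):

$ ceph -s
$ ceph health detail
$ ceph pg dump_stuck unclean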

Wido, I cannot do anything with the data on osd.0: although the failed disk appears to be mounted, I cannot see anything on it and I get an "Input/output error".
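
In case it helps, this is roughly what I plan to run on store1 to confirm the drive itself is dead (assuming /dev/sdb is still the device backing ceph-0, as in the df output quoted below):

$ dmesg | tail -n 50
$ smartctl -a /dev/sdb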

So I guess the right action for now is to remove the OSD by issuing "ceph osd crush remove osd.0" as Sean suggested, correct?
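
For completeness, this is my understanding of the full removal sequence on Firefly; please correct me if any step is wrong:

$ ceph osd crush remove osd.0
$ ceph auth del osd.0
$ ceph osd rm 0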

G.


On 16 November 2017 at 14:46, Caspar Smit <casparsmit@xxxxxxxxxxx> wrote:


2017-11-16 14:43 GMT+01:00 Wido den Hollander <wido@xxxxxxxx>:

>
> > On 16 November 2017 at 14:40, Georgios Dimitrakakis <giorgis@xxxxxxxxxxxx> wrote:
> >
> >
> > @Sean Redmond: No, I don't have any unfound objects. I only have "stuck
> >  unclean" with "active+degraded" status.
> >  @Caspar Smit: The cluster is scrubbing ...
> >
> > @All: My concern is that only one copy is left of the data that was on
> >  the failed disk.
> >
>
> Let the Ceph recovery do its work. Don't do anything manually now.
>
>
@Wido, I think his cluster might have stopped recovering because of
non-optimal tunables on Firefly.
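
If he wants to verify that, the current tunables can be inspected without changing anything (if I remember the command correctly):

$ ceph osd crush show-tunables

But, as I note further down, switching the tunables to optimal triggers a massive data movement, so that is not something to do right now.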


Ah, darn. Yes, that was a long time ago. It could very well be the case.

He could try to remove osd.0 from the CRUSHMap and let recovery progress.

I would, however, advise him not to fiddle with the data on osd.0: do not
try to copy the data somewhere else or try to fix the OSD.

Wido


> > If I just remove osd.0 from the CRUSH map, does that copy all of its data
> > from the only available copy to the remaining unaffected disks, which will
> > consequently end in having two copies on two different hosts again?
> >
>
> Do NOT copy the data from osd.0 to another OSD. Let the Ceph recovery
> handle this.
>
> It is already marked as out and within 24 hours or so recovery will have
> finished.
>
> But a few things:
>
> - Firefly 0.80.9 is old
> - Never, never, never run with size=2
>
> Not trying to scare you, but it's a reality.
>
> Now let Ceph handle the rebalance and wait.
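>
> Once the cluster is healthy again, raising the pool to size=3 is worth
> considering; a rough sketch (the pool name "rbd" is only an example here,
> check with "ceph osd lspools", and note that raising size will itself
> trigger a lot of data movement):
>
> $ ceph osd pool get rbd size
> $ ceph osd pool set rbd size 3
> $ ceph osd pool set rbd min_size 2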
>
> Wido
>
> >  Best,
> >
> >  G.
> >
> >
> > > 2017-11-16 14:05 GMT+01:00 Georgios Dimitrakakis:
> > >
> > >> Dear cephers,
> > >>
> > >> I have an emergency on a rather small ceph cluster.
> > >>
> > >> My cluster consists of 2 OSD nodes with 10 x 4TB disks each and 3
> > >> monitor nodes.
> > >>
> > >> The version of ceph running is Firefly v.0.80.9
> > >> (b5a67f0e1d15385bc0d60a6da6e7fc810bde6047)
> > >>
> > >> The cluster was originally built with "Replicated size=2" and "Min
> > >> size=1" with the attached crush map, which, in my understanding,
> > >> replicates data across hosts.
> > >>
> > >> The emergency comes from the violation of the golden rule: "Never
> > >> use 2 replicas on a production cluster"
> > >>
> > >> Unfortunately the customers never really understood the risk well,
> > >> and now that one disk is down I am in the middle and must do
> > >> everything in my power not to lose any data, thus I am requesting
> > >> your assistance.
> > >>
> > >> Here is the output of
> > >>
> > >> $ ceph osd tree
> > >> # id    weight  type name       up/down reweight
> > >> -1      72.6    root default
> > >> -2      36.3            host store1
> > >> 0       3.63                    osd.0   down    0       ---> DISK DOWN
> > >> 1       3.63                    osd.1   up      1
> > >> 2       3.63                    osd.2   up      1
> > >> 3       3.63                    osd.3   up      1
> > >> 4       3.63                    osd.4   up      1
> > >> 5       3.63                    osd.5   up      1
> > >> 6       3.63                    osd.6   up      1
> > >> 7       3.63                    osd.7   up      1
> > >> 8       3.63                    osd.8   up      1
> > >> 9       3.63                    osd.9   up      1
> > >> -3      36.3            host store2
> > >> 10      3.63                    osd.10  up      1
> > >> 11      3.63                    osd.11  up      1
> > >> 12      3.63                    osd.12  up      1
> > >> 13      3.63                    osd.13  up      1
> > >> 14      3.63                    osd.14  up      1
> > >> 15      3.63                    osd.15  up      1
> > >> 16      3.63                    osd.16  up      1
> > >> 17      3.63                    osd.17  up      1
> > >> 18      3.63                    osd.18  up      1
> > >> 19      3.63                    osd.19  up      1
> > >>
> > >> and here is the status of the cluster
> > >>
> > >> # ceph health
> > >> HEALTH_WARN 497 pgs degraded; 549 pgs stuck unclean; recovery
> > >> 51916/2552684 objects degraded (2.034%)
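> > >>
> > >> If the per-PG detail is useful I can also post the output of:
> > >>
> > >> # ceph health detail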
> > >>
> > >> Although osd.0 is shown as mounted, it cannot be started (probably a
> > >> failed disk or controller problem).
> > >>
> > >> # df -h
> > >> Filesystem      Size  Used Avail Use% Mounted on
> > >> /dev/sda3       251G  4.1G  235G   2% /
> > >> tmpfs            24G     0   24G   0% /dev/shm
> > >> /dev/sda1       239M  100M  127M  44% /boot
> > >> /dev/sdj1       3.7T  223G  3.5T   6% /var/lib/ceph/osd/ceph-8
> > >> /dev/sdh1       3.7T  205G  3.5T   6% /var/lib/ceph/osd/ceph-6
> > >> /dev/sdg1       3.7T  199G  3.5T   6% /var/lib/ceph/osd/ceph-5
> > >> /dev/sde1       3.7T  180G  3.5T   5% /var/lib/ceph/osd/ceph-3
> > >> /dev/sdi1       3.7T  187G  3.5T   6% /var/lib/ceph/osd/ceph-7
> > >> /dev/sdf1       3.7T  193G  3.5T   6% /var/lib/ceph/osd/ceph-4
> > >> /dev/sdd1       3.7T  212G  3.5T   6% /var/lib/ceph/osd/ceph-2
> > >> /dev/sdk1       3.7T  210G  3.5T   6% /var/lib/ceph/osd/ceph-9
> > >> /dev/sdb1       3.7T  164G  3.5T   5% /var/lib/ceph/osd/ceph-0   ---> This is the problematic OSD
> > >> /dev/sdc1       3.7T  183G  3.5T   5% /var/lib/ceph/osd/ceph-1
> > >>
> > >> # service ceph start osd.0
> > >> find: `/var/lib/ceph/osd/ceph-0': Input/output error
> > >> /etc/init.d/ceph: osd.0 not found (/etc/ceph/ceph.conf defines
> > >> mon.store1 osd.6 osd.9 osd.1 osd.4 osd.3 osd.2 osd.8 osd.5 osd.7
> > >> mds.store1 mon.store3, /var/lib/ceph defines mon.store1 osd.6 osd.9
> > >> osd.1 osd.4 osd.3 osd.2 osd.8 osd.5 osd.7 mds.store1)
> > >>
> > >> I have found this:
> > >>
> > >> http://ceph.com/geen-categorie/admin-guide-replacing-a-failed-disk-in-a-ceph-cluster/ [1]
> > >>
> > >> and I am looking for your guidance in order to properly perform all
> > >> the actions so as not to lose any data and to keep the data of the
> > >> remaining copy.
> > >
> > > What guidance are you looking for besides the steps to replace a
> > > failed disk (which you already found)?
> > > If I look at your situation, there is nothing down in terms of
> > > availability of PGs, just a failed drive which needs to be replaced.
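> > >
> > > For the physical replacement itself, the rough sequence on Firefly, once
> > > the old OSD entry has been removed from the cluster, would be something
> > > like this (the device name is only an example; double check it against
> > > the guide you already found):
> > >
> > > # ceph-disk prepare /dev/sdb
> > > # ceph-disk activate /dev/sdb1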
> > >
> > > Is the cluster still recovering? It should reach HEALTH_OK again
> > > after rebalancing the cluster when an OSD goes down.
> > >
> > > If it stopped recovering, it might have to do with the Ceph tunables,
> > > which are not set to optimal by default on Firefly, and that prevents
> > > further rebalancing.
> > > WARNING: Don't just set tunables to optimal, because it will trigger a
> > > massive rebalance!
> > >
> > > Perhaps the second golden rule is to never run a Ceph production
> > > cluster without knowing (and testing) how to replace a failed drive.
> > > (I'm not trying to be harsh here.)
> > >
> > > Kind regards,
> > > Caspar
> > >
> > >
> > >> Best regards,
> > >>
> > >> G.
> > >> _______________________________________________
> > >> ceph-users mailing list
> > >> ceph-users@xxxxxxxxxxxxxx [2]
> > >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com [3]
> > >
> > >
> > >
> > > Links:
> > > ------
> > > [1] http://ceph.com/geen-categorie/admin-guide-replacing-a-failed-disk-in-a-ceph-cluster/
> > > [2] mailto:ceph-users@xxxxxxxxxxxxxx
> > > [3] http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> > > [4] mailto:giorgis@xxxxxxxxxxxx

--
Dr. Dimitrakakis Georgios

Networks and Systems Administrator

Archimedes Center for Modeling, Analysis & Computation (ACMAC)
School of Sciences and Engineering
University of Crete
P.O. Box 2208
710 - 03 Heraklion
Crete, Greece

Tel: +30 2810 393717
Fax: +30 2810 393660

E-mail: giorgis@xxxxxxxxxxxx
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


