> On 16 November 2017 at 14:46, Caspar Smit <casparsmit@xxxxxxxxxxx> wrote:
>
> 2017-11-16 14:43 GMT+01:00 Wido den Hollander <wido@xxxxxxxx>:
>
>> On 16 November 2017 at 14:40, Georgios Dimitrakakis <giorgis@xxxxxxxxxxxx> wrote:
>>
>>> @Sean Redmond: No, I don't have any unfound objects. I only have
>>> "stuck unclean" with "active+degraded" status.
>>> @Caspar Smit: The cluster is scrubbing ...
>>>
>>> @All: My concern is that only one copy is left of the data that was on
>>> the failed disk.
>>
>> Let the Ceph recovery do its work. Don't do anything manually now.
>
> @Wido, I think his cluster might have stopped recovering because of
> non-optimal tunables in firefly.

Ah, darn. Yes, that's been a long time ago. Could very well be the case.

He could try to remove osd.0 from the CRUSH map and let recovery progress.

I would, however, advise him not to fiddle with the data on osd.0. Do not
try to copy the data somewhere else and try to fix the OSD.

Wido
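For reference, removing a dead OSD from the CRUSH map the way Wido suggests
normally comes down to the commands below. This is only a minimal sketch of
the usual procedure for a failed OSD (here osd.0), to be run only once you
are confident the surviving copies are healthy:

    # osd.0 is already marked out; repeating this is harmless
    ceph osd out 0
    # remove it from the CRUSH map so CRUSH stops mapping PGs to it
    ceph osd crush remove osd.0
    # delete its authentication key
    ceph auth del osd.0
    # finally remove the OSD id from the cluster
    ceph osd rm 0

Note that removing the OSD from the CRUSH map lowers the host's CRUSH
weight, so expect some additional data movement on top of the recovery that
is already running.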
>>> If I just remove the OSD.0 from the crush map, does that copy all its
>>> data from the only available copy to the rest of the unaffected disks,
>>> which will consequently end in having again two copies on two different
>>> hosts?
>>
>> Do NOT copy the data from osd.0 to another OSD. Let the Ceph recovery
>> handle this.
>>
>> It is already marked as out and within 24 hours or so recovery will have
>> finished.
>>
>> But a few things:
>>
>> - Firefly 0.80.9 is old
>> - Never, never, never run with size=2
>>
>> Not trying to scare you, but it's a reality.
>>
>> Now let Ceph handle the rebalance and wait.
>>
>> Wido
>>
>>> Best,
>>>
>>> G.
>>>
>>>> 2017-11-16 14:05 GMT+01:00 Georgios Dimitrakakis <giorgis@xxxxxxxxxxxx>:
>>>>
>>>>> Dear cephers,
>>>>>
>>>>> I have an emergency on a rather small ceph cluster.
>>>>>
>>>>> My cluster consists of 2 OSD nodes with 10 disks x 4TB each and 3
>>>>> monitor nodes.
>>>>>
>>>>> The version of ceph running is Firefly v0.80.9
>>>>> (b5a67f0e1d15385bc0d60a6da6e7fc810bde6047)
>>>>>
>>>>> The cluster was originally built with "Replicated size=2" and
>>>>> "Min size=1" with the attached crush map, which in my understanding
>>>>> replicates data across hosts.
>>>>>
>>>>> The emergency comes from the violation of the golden rule: "Never use
>>>>> 2 replicas on a production cluster"
>>>>>
>>>>> Unfortunately the customers never really understood the risk well,
>>>>> and now that one disk is down I am in the middle of it and must do
>>>>> everything in my power not to lose any data, thus I am requesting
>>>>> your assistance.
>>>>>
>>>>> Here is the output of
>>>>>
>>>>> $ ceph osd tree
>>>>> # id  weight  type name          up/down  reweight
>>>>> -1    72.6    root default
>>>>> -2    36.3      host store1
>>>>> 0     3.63        osd.0          down     0        ---> DISK DOWN
>>>>> 1     3.63        osd.1          up       1
>>>>> 2     3.63        osd.2          up       1
>>>>> 3     3.63        osd.3          up       1
>>>>> 4     3.63        osd.4          up       1
>>>>> 5     3.63        osd.5          up       1
>>>>> 6     3.63        osd.6          up       1
>>>>> 7     3.63        osd.7          up       1
>>>>> 8     3.63        osd.8          up       1
>>>>> 9     3.63        osd.9          up       1
>>>>> -3    36.3      host store2
>>>>> 10    3.63        osd.10         up       1
>>>>> 11    3.63        osd.11         up       1
>>>>> 12    3.63        osd.12         up       1
>>>>> 13    3.63        osd.13         up       1
>>>>> 14    3.63        osd.14         up       1
>>>>> 15    3.63        osd.15         up       1
>>>>> 16    3.63        osd.16         up       1
>>>>> 17    3.63        osd.17         up       1
>>>>> 18    3.63        osd.18         up       1
>>>>> 19    3.63        osd.19         up       1
>>>>>
>>>>> and here is the status of the cluster
>>>>>
>>>>> # ceph health
>>>>> HEALTH_WARN 497 pgs degraded; 549 pgs stuck unclean; recovery
>>>>> 51916/2552684 objects degraded (2.034%)
>>>>>
>>>>> Although OSD.0 is shown as mounted, it cannot be started (probably a
>>>>> failed disk controller problem)
>>>>>
>>>>> # df -h
>>>>> Filesystem  Size  Used  Avail  Use%  Mounted on
>>>>> /dev/sda3   251G  4.1G  235G     2%  /
>>>>> tmpfs        24G     0   24G     0%  /dev/shm
>>>>> /dev/sda1   239M  100M  127M    44%  /boot
>>>>> /dev/sdj1   3.7T  223G  3.5T     6%  /var/lib/ceph/osd/ceph-8
>>>>> /dev/sdh1   3.7T  205G  3.5T     6%  /var/lib/ceph/osd/ceph-6
>>>>> /dev/sdg1   3.7T  199G  3.5T     6%  /var/lib/ceph/osd/ceph-5
>>>>> /dev/sde1   3.7T  180G  3.5T     5%  /var/lib/ceph/osd/ceph-3
>>>>> /dev/sdi1   3.7T  187G  3.5T     6%  /var/lib/ceph/osd/ceph-7
>>>>> /dev/sdf1   3.7T  193G  3.5T     6%  /var/lib/ceph/osd/ceph-4
>>>>> /dev/sdd1   3.7T  212G  3.5T     6%  /var/lib/ceph/osd/ceph-2
>>>>> /dev/sdk1   3.7T  210G  3.5T     6%  /var/lib/ceph/osd/ceph-9
>>>>> /dev/sdb1   3.7T  164G  3.5T     5%  /var/lib/ceph/osd/ceph-0  ---> This is the problematic OSD
>>>>> /dev/sdc1   3.7T  183G  3.5T     5%  /var/lib/ceph/osd/ceph-1
>>>>>
>>>>> # service ceph start osd.0
>>>>> find: `/var/lib/ceph/osd/ceph-0': Input/output error
>>>>> /etc/init.d/ceph: osd.0 not found (/etc/ceph/ceph.conf defines
>>>>> mon.store1 osd.6 osd.9 osd.1 osd.4 osd.3 osd.2 osd.8 osd.5 osd.7
>>>>> mds.store1 mon.store3, /var/lib/ceph defines mon.store1 osd.6 osd.9
>>>>> osd.1 osd.4 osd.3 osd.2 osd.8 osd.5 osd.7 mds.store1)
>>>>>
>>>>> I have found this:
>>>>>
>>>>> http://ceph.com/geen-categorie/admin-guide-replacing-a-failed-disk-in-a-ceph-cluster/
>>>>>
>>>>> and I am looking for your guidance in order to properly perform all
>>>>> the actions, so that I do not lose any data and keep the remaining
>>>>> second copy intact.
>>>>
>>>> What guidance are you looking for besides the steps to replace a
>>>> failed disk (which you already found)?
>>>> If I look at your situation, there is nothing down in terms of
>>>> availability of pgs, just a failed drive which needs to be replaced.
>>>>
>>>> Is the cluster still recovering? It should reach HEALTH_OK again after
>>>> rebalancing the cluster when an OSD goes down.
>>>>
>>>> If it stopped recovering, it might have to do with the ceph tunables,
>>>> which are not set to optimal by default on firefly, and that prevents
>>>> further rebalancing.
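For reference, the tunables profile Caspar mentions can be inspected without
changing anything; a minimal sketch using read-only commands (the exact
output varies between releases):

    # show the CRUSH tunables currently in effect
    ceph osd crush show-tunables
    # the tunables also appear in the full CRUSH map dump
    ceph osd crush dump

Switching profiles (for example with "ceph osd crush tunables optimal") is a
separate, deliberate step, for the reason Caspar warns about next.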
>>>> WARNING: Don't just set tunables to optimal, because it will trigger a
>>>> massive rebalance!
>>>>
>>>> Perhaps the second golden rule is to never run a Ceph production
>>>> cluster without knowing (and testing) how to replace a failed drive.
>>>> (I'm not trying to be harsh here.)
>>>>
>>>> Kind regards,
>>>> Caspar
>>>>
>>>>> Best regards,
>>>>>
>>>>> G.

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
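For completeness, a few read-only commands that help confirm whether
recovery is actually progressing and what the replication settings of each
pool are; a minimal sketch, with "rbd" used purely as an example pool name:

    # overall cluster state and recovery progress
    ceph -s
    ceph health detail
    # list the placement groups that are stuck unclean
    ceph pg dump_stuck unclean
    # replication settings per pool
    ceph osd dump | grep 'replicated size'
    ceph osd pool get rbd size
    ceph osd pool get rbd min_size

Raising a pool from size=2 to size=3 (ceph osd pool set <pool> size 3) only
helps once there is a third host, or an adjusted failure domain, for CRUSH
to place the extra replica on; with the two-host layout described above,
three copies across hosts cannot be satisfied.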