Re: node and its OSDs down...

I'm confused...
A few OSDs went down, and the cluster completed recovery and rebalanced to the HEALTH_OK state.
Now I can see that the down OSDs are still shown as down in the CRUSH map and are not in the "up" or "in" state.
After 5 days or so, they are still in the same state.
How or when will Ceph move the down OSDs to the "out" state? I guess Ceph doesn't do it.
I have now marked the OSDs out (after 5 days in the down state), and recovery and rebalancing are starting again. I'm worried about it.

Thanks
Swami


On Thu, Dec 8, 2016 at 6:40 AM, Brad Hubbard <bhubbard@xxxxxxxxxx> wrote:


On Wed, Dec 7, 2016 at 9:11 PM, M Ranga Swami Reddy <swamireddy@xxxxxxxxx> wrote:
That's right.
But my question was: when an OSD goes down, will all of its data be moved from the downed OSD to other OSDs? Is this correct?

No, only after it is marked out.

"If an OSD is down and the degraded condition persists, Ceph may mark the down OSD as out of the cluster and remap the data from the down OSD to another OSD. The time between being marked down and being marked out is controlled by mon osd down out interval, which is set to 300 seconds by default."
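
To make the timing in the quoted passage concrete, here is a minimal Python sketch of the down/out distinction. It is a toy model, not Ceph code: the `Osd` class and the `mark_down` and `tick` functions are hypothetical names, and the 300-second value is simply the figure quoted above (the setting is configurable, and defaults may differ between Ceph releases).

```python
from dataclasses import dataclass
from typing import Optional

# Value taken from the documentation passage quoted above; treat it as an
# assumption, since the real setting is configurable per cluster.
MON_OSD_DOWN_OUT_INTERVAL = 300  # seconds


@dataclass
class Osd:
    """Toy stand-in for one OSD's state (hypothetical, not a Ceph type)."""
    osd_id: int
    up: bool = True              # daemon is running and reachable
    in_cluster: bool = True      # OSD is eligible to hold data
    down_since: Optional[float] = None


def mark_down(osd: Osd, now: float) -> None:
    """Record that the OSD stopped responding; no data is remapped yet."""
    osd.up = False
    osd.down_since = now


def tick(osd: Osd, now: float) -> bool:
    """Mark a long-down OSD out; return True if remapping would start now."""
    if osd.up or not osd.in_cluster or osd.down_since is None:
        return False
    if now - osd.down_since >= MON_OSD_DOWN_OUT_INTERVAL:
        osd.in_cluster = False   # "out": data is remapped to other OSDs
        return True
    return False


osd = Osd(osd_id=227)
mark_down(osd, now=0.0)
print(tick(osd, now=60.0))    # still within the interval: down but "in"
print(tick(osd, now=300.0))   # interval elapsed: the OSD is marked "out"
```

In this model, data only starts moving at the second `tick` call, which is the point of the answer above: being down alone does not trigger remapping.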
 
Now, if I change the CRUSH map to mark an OSD out, will data be moved across the cluster again?
 

Thanks
Swami

On Wed, Dec 7, 2016 at 2:14 PM, 한승진 <yongiman@xxxxxxxxx> wrote:
Hi 

Because "down" and "out" are different states to a Ceph cluster.

The CRUSH map of Ceph depends on how many OSDs are in the cluster.

The CRUSH map doesn't change when OSDs are down. However, the CRUSH map does change when the OSDs are actually out.
Data locations then change as well, and therefore rebalancing starts.

Thanks
John Haan



On Dec 3, 2016 at 5:27 PM, "M Ranga Swami Reddy" <swamireddy@xxxxxxxxx> wrote:
Sure, I will try "ceph osd crush reweight osd.<id> 0.0" and update the status.

Thanks
Swami

On Fri, Dec 2, 2016 at 8:15 PM, David Turner <david.turner@xxxxxxxxxxxxxxxx> wrote:

If you want to reweight only once when you have a failed disk that is being balanced off of, set the crush weight for that osd to 0.0.  Then when you fully remove the disk from the cluster it will not do any additional backfilling.  Any change to the crush map will likely move data around, even if you're removing an already "removed" osd.
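
The difference between the two removal paths can be sketched with a small simulation. This is a toy model under stated assumptions, not Ceph's CRUSH implementation: it places each PG on a single OSD using a straw2-style weighted draw (choose a host, then an OSD, and retry if the chosen OSD is out). The four hosts with five OSDs each, the function names, and the 1000 PGs are all made up for illustration; only the 0.91 weight comes from this thread.

```python
import hashlib
import math


def draw(pg, attempt, item, weight):
    """Straw2-style draw: ln(u) / weight, with u a hash mapped into (0, 1]."""
    digest = hashlib.sha256(f"{pg}:{attempt}:{item}".encode()).digest()
    u = (int.from_bytes(digest[:8], "big") + 1) / 2**64
    return math.log(u) / weight


def place(pg, hosts, out=frozenset(), max_tries=50):
    """Pick one OSD for a PG: choose a host, then an OSD; retry if it is out."""
    # A host's weight is the sum of the CRUSH weights of its OSDs.
    host_w = {h: sum(osds.values()) for h, osds in hosts.items()}
    live_hosts = {h: w for h, w in host_w.items() if w > 0}
    for attempt in range(max_tries):
        host = max(live_hosts, key=lambda h: draw(pg, attempt, h, live_hosts[h]))
        osds = {o: w for o, w in hosts[host].items() if w > 0}
        osd = max(osds, key=lambda o: draw(pg, attempt, o, osds[o]))
        if osd not in out:
            return osd
    raise RuntimeError("no usable OSD found")


def mapping(hosts, out=frozenset(), pgs=1000):
    """Compute the PG -> OSD placement for every PG."""
    return {pg: place(pg, hosts, out) for pg in range(pgs)}


def moved(before, after):
    """Count PGs whose OSD differs between two placements."""
    return sum(1 for pg in before if before[pg] != after[pg])


# Hypothetical cluster: four hosts, five OSDs each, CRUSH weight 0.91 per OSD.
hosts = {f"node{n}": {f"osd.{n * 5 + i}": 0.91 for i in range(5)} for n in range(4)}
failed = set(hosts["node3"])                  # every OSD on the failed host
healthy = mapping(hosts)

# Path A: OSDs are marked out (host weight unchanged), later removed from CRUSH.
after_out = mapping(hosts, out=failed)
without_host = {h: o for h, o in hosts.items() if h != "node3"}
after_remove = mapping(without_host)

# Path B: CRUSH weight is set to 0 first, then the OSDs are removed.
zeroed = {**hosts, "node3": {o: 0.0 for o in failed}}
after_zero = mapping(zeroed)
after_remove_b = mapping(without_host)

print("A: out ->", moved(healthy, after_out),
      "PGs, then remove ->", moved(after_out, after_remove), "PGs")
print("B: reweight 0 ->", moved(healthy, after_zero),
      "PGs, then remove ->", moved(after_zero, after_remove_b), "PGs")
```

In this model, path A moves data twice (once when the OSDs go out and again when the host weight changes on removal), while path B moves data once and the later removal changes nothing. Real clusters with replication and deeper hierarchies behave in a more complicated way, so the numbers printed here are only illustrative.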


David Turner | Cloud Operations Engineer | StorageCraft Technology Corporation
380 Data Drive Suite 300 | Draper | Utah | 84020
Office: 801.871.2760 | Mobile: 385.224.2943





From: M Ranga Swami Reddy [swamireddy@xxxxxxxxx]
Sent: Thursday, December 01, 2016 11:45 PM
To: David Turner
Cc: ceph-users
Subject: Re: node and its OSDs down...

Hi David - Yep, I did the "ceph osd crush remove osd.<id>", which started the recovery.
My worry is: why is Ceph doing recovery if an OSD is already down and no longer in the cluster? That means Ceph has already copied the down OSDs' objects to other OSDs. Here is the "ceph osd tree" output:
===

227     0.91                            osd.227 down    0

....

250     0.91                            osd.250 down    0

===


So, to avoid the recovery/rebalance, can I set the weight of the OSD (which was in the down state)? But will this weight setting also lead to rebalance activity?


Thanks

Swami



On Thu, Dec 1, 2016 at 8:07 PM, David Turner <david.turner@xxxxxxxxxxxxxxxx> wrote:

I assume you also did ceph osd crush remove osd.<id>.  When you removed the OSD that was down/out and had already been balanced off of, you changed the weight of the host that it was on, which triggers additional backfilling to rebalance data according to the new CRUSH map.
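
The host-weight effect described above can be illustrated with simple arithmetic. This is a sketch that uses the figures from this thread (20 OSDs at CRUSH weight 0.91 on the failed node) plus a hypothetical cluster of four equally sized hosts; the thread does not state the real cluster size, and `host_shares` is a made-up helper name.

```python
# The CRUSH weight of a host bucket is the sum of the weights of its OSDs.
OSD_WEIGHT = 0.91        # per-OSD weight shown in the "ceph osd tree" output
OSDS_PER_HOST = 20       # the failed node held 20 OSDs
HOST_COUNT = 4           # assumption: the thread does not give the cluster size


def host_shares(host_weights):
    """Return each host's fraction of the total CRUSH weight."""
    total = sum(host_weights.values())
    return {host: weight / total for host, weight in host_weights.items()}


before = {f"node{n}": OSD_WEIGHT * OSDS_PER_HOST for n in range(HOST_COUNT)}
# "ceph osd crush remove" takes the OSDs out of the host bucket, so the
# failed host's weight drops to zero even though its OSDs were already out.
after = {**before, "node3": 0.0}

shares_before = host_shares(before)
shares_after = host_shares(after)
for host in before:
    print(host, f"{shares_before[host]:.2%} -> {shares_after[host]:.2%}")
```

Marking an OSD out leaves the host's CRUSH weight untouched, whereas removing it from the CRUSH map does not. Under these assumptions each surviving host's share of the total weight rises from a quarter to a third, which is why placements are recalculated and backfilling starts again.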





From: ceph-users [ceph-users-bounces@xxxxxxxxxx.com] on behalf of M Ranga Swami Reddy [swamireddy@xxxxxxxxx]
Sent: Thursday, December 01, 2016 3:03 AM
To: ceph-users
Subject: node and its OSDs down...

Hello,
One of my Ceph nodes, with 20 OSDs, went down. After a couple of hours, ceph health was back in the OK state.

I then tried to remove those OSDs, which were in the down state, from the Ceph cluster
using "ceph osd remove osd.<id>".
The Ceph cluster then started rebalancing, which is strange because those OSDs had been down for a long time and the health was OK.
My question: why did recovery or rebalancing start when I removed an OSD that was already down?

Thanks
Swami




_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com







--
Cheers,
Brad
