Re: Incomplete PGs. Ceph Consultant Wanted


 



Ohhh, so multiple OSD failure domains on a single SAN node?  I suspected as much.

I've experienced a Ceph cluster built on SanDisk InfiniFlash, which was arguably somewhere between SAN and DAS.  Each of the 4 IF chassis drove 4 OSD nodes via SAS, but it was zoned such that the chassis was the failure domain in the CRUSH tree.
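
To illustrate why the failure domain has to match the shared hardware, here is a rough Python sketch. The node and chassis names are made up, and the check is a simplification of what CRUSH actually does: it only asks whether a set of replicas spans more than one chassis.

```python
# Hypothetical topology: 4 chassis, each backing 4 OSD nodes (names are invented).
node_to_chassis = {
    f"node{c}{n}": f"chassis{c}" for c in range(1, 5) for n in range(1, 5)
}


def failure_domains(placement, node_to_domain):
    """Return the set of failure domains that a list of nodes touches."""
    return {node_to_domain[node] for node in placement}


def survives_one_domain_loss(placement, node_to_domain):
    """True if at least one copy remains after losing any single domain.

    That holds exactly when the copies sit in two or more distinct domains.
    """
    return len(failure_domains(placement, node_to_domain)) >= 2


# Three replicas on three nodes that all hang off the same chassis:
same_chassis = ["node11", "node12", "node13"]
# Three replicas spread across three chassis:
spread = ["node11", "node21", "node31"]

print(survives_one_domain_loss(same_chassis, node_to_chassis))
print(survives_one_domain_loss(spread, node_to_chassis))
```

Three copies on three different nodes look safe at the host level, but if those nodes share one chassis (or one SAN), a single incident can take out all of them.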

> On Jun 17, 2024, at 16:52, David C. <david.casier@xxxxxxxx> wrote:
> 
> In Pablo's unfortunate incident, it was because of a SAN incident, so it's possible that Replica 3 didn't save him.
> In this scenario, the architecture is more the origin of the incident than the number of replicas.


> It seems to me that replica 3 exists, by default, since firefly => make replica 2, this is intentional.


The default EC profile, though, is 2,1 (k=2, m=1), and that makes it too easy for someone to assume, understandably, that the default is suitable for production.  I have an action item to update docs and code to default to, say, 2,2 so that it still works on smaller configurations like sandboxes but is less dangerous.
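
For a sense of the tradeoff, here is a small Python sketch of the arithmetic for a k,m profile. The class and property names are made up for illustration, and it only covers shard counts: how many shards can be lost before data is unrecoverable, not whether the pool stays active for I/O.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ECProfile:
    """Illustrative model of an erasure-code profile (not a Ceph API)."""
    k: int  # data shards
    m: int  # coding (parity) shards

    @property
    def shards(self) -> int:
        # Total shards per object; also the number of distinct
        # failure domains needed to place one shard in each.
        return self.k + self.m

    @property
    def raw_per_usable(self) -> float:
        # Raw capacity consumed per unit of usable capacity.
        return self.shards / self.k

    @property
    def losses_survivable(self) -> int:
        # Shards that can be lost while the data can still be rebuilt.
        return self.m


for profile in (ECProfile(2, 1), ECProfile(2, 2)):
    print(
        f"{profile.k},{profile.m}: "
        f"{profile.shards} failure domains, "
        f"{profile.raw_per_usable:.1f}x raw, "
        f"survives {profile.losses_survivable} lost shard(s)"
    )
```

On these numbers 2,2 costs the same raw space as replica 2 and needs one more failure domain than 2,1, but it can lose two shards instead of one.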


> However, I'd rather see a full flash Replica 2 platform with solid backups than Replica 3 without backups (well obviously, Replica 3, or E/C + backup are much better).
> 

Tangent, but yeah, RAID or replication != backups.  SolidFire was RF2 flash; their assertion was that resilvering was fast enough that it was safe.  With Ceph we know there's more to it than that, but I'm not sure if they had special provisions to address the sequences of events that can cause problems with Ceph RF2.  They did have limited scalability, though.



