>> I wonder: when an OSD comes back from a power loss, all the data
>> gets scrubbed, and there are 2 other copies.
>> PLP is important mostly for block storage; Ceph should easily recover
>> from that situation.
>> That's why I don't understand why I should pay more for PLP and other
>> protections.
>
> I'm no expert (or power user) at all, but my reasoning is: if something
> power-related can take down one of my servers, it can just as easily
> take down *all* my Ceph servers at once.
>
> And that could just as easily render all three copies inaccessible. Or
> even two.

I’ve been through a protracted outage (not power-related) that involved
widespread OSD flapping. Although we lost no OSDs in the end, somehow a
single RADOS object, in an RBD head, ended up lost. Very much a corner
case, but if we’d been using 2R it would have been gruesome.

On another occasion I saw a power inductor / PSU failure take down power
in an entire DC row. Fortunately we were using redundant PSUs on
different circuits. One node went down nonetheless: the PSU on its
surviving power feed had a pre-existing fault that went unnoticed because
PSUs weren’t monitored. As with active/passive network bonds, this showed
the importance of monitoring for and fixing latent faults, so you don’t
discover them at exactly the wrong time.

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx