On 01/29/2018 06:15 PM, David Turner wrote:
+1 for Gregory's response. With FileStore, if you lost a journal SSD
and followed the steps you outlined, you would be leaving yourself open
to corrupted data. Any write that was ack'd by the journal but not yet
flushed to the data disk would be lost, while the cluster would still
assume it was there. With a failed journal SSD on FileStore, you should
have removed all affected OSDs before re-adding them with a new journal
device. The same is true of BlueStore.
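For reference, a minimal sketch of that remove-and-re-add sequence on a
Luminous-era cluster (the OSD id and device paths here are hypothetical;
adapt them to your deployment):

    # for each OSD whose journal lived on the failed SSD
    ceph osd out 12                              # let data drain off the OSD
    systemctl stop ceph-osd@12
    ceph osd purge 12 --yes-i-really-mean-it     # remove it from CRUSH, auth, and the OSD map
    # re-create it against the replacement journal device
    ceph-volume lvm create --filestore --data /dev/sdb --journal /dev/nvme0n1p1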
Long story short: If you value your data, never attempt a manual fix of
OSDs that were giving trouble.
Never attempt an XFS repair with FileStore, nor try anything that
involves fiddling with the bits on the disk.
Wipe the OSD, re-add it, and let backfilling handle it for you (a short
sketch follows below).
I've just seen too many cases where people corrupted their cluster by
trying to be smart.

Wido
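As a sketch of that wipe-and-re-add path (device name hypothetical,
assuming a Luminous-era ceph-volume):

    ceph-volume lvm zap /dev/sdb                       # wipe the old OSD's disk
    ceph-volume lvm create --bluestore --data /dev/sdb # re-create the OSD from scratch
    ceph -s                                            # watch backfilling repopulate it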
Where BlueStore differs from FileStore is the case where your journal
SSD stops accepting writes but can still be read (or any time you can
still read from the SSD and are swapping it out). With FileStore you
would be able to flush the journal and create new journals on a new SSD
for the OSDs. This is not possible with BlueStore, as you cannot move
the WAL or RocksDB portions of a BlueStore OSD after creation. If you
started with your RocksDB and WAL on an SSD, you could not later decide
to add an NVMe device and move the WAL to it without removing and
re-creating the OSDs with the new configuration.
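For comparison, the FileStore journal migration being referred to looks
roughly like this (the OSD id and the new journal partition are
hypothetical):

    systemctl stop ceph-osd@12
    ceph-osd -i 12 --flush-journal          # flush journaled writes out to the data disk
    ln -sf /dev/disk/by-partuuid/<new-part-uuid> /var/lib/ceph/osd/ceph-12/journal
    ceph-osd -i 12 --mkjournal              # initialize the journal on the new SSD
    systemctl start ceph-osd@12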
On Mon, Jan 29, 2018 at 10:58 AM Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:
On Mon, Jan 29, 2018 at 9:37 AM Vladimir Prokofev <v@xxxxxxxxxxx> wrote:
Hello.
In short: what are the consequences of losing an external WAL/DB
device (assuming it's an SSD) in BlueStore?
In comparison with FileStore: we used to have an external SSD
journaling multiple HDD OSDs. A hardware failure of such a device
would not be that big of a deal, as we could quickly run xfs_repair
on the data partition and initialize a new journal. You don't have to
redeploy the OSDs, just provide them with a new journal device,
remount the XFS filesystem, and restart the osd process so it can
quickly update its state. Healthy state can be restored in a matter
of minutes. That was with FileStore.
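A rough sketch of the procedure described above (OSD id and devices
hypothetical; note that the replies in this thread warn against exactly
this):

    systemctl stop ceph-osd@12
    xfs_repair /dev/sdb1                    # repair the OSD's XFS data partition
    mount /dev/sdb1 /var/lib/ceph/osd/ceph-12
    ln -sf /dev/disk/by-partuuid/<new-part-uuid> /var/lib/ceph/osd/ceph-12/journal
    ceph-osd -i 12 --mkjournal              # create a fresh, empty journal on the new SSD
    systemctl start ceph-osd@12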
Now what's the situation with BlueStore? What will happen in different
scenarios, like having only the WAL on an external device, or only the
DB, or both WAL+DB?
I kind of assume that losing the DB means losing the OSD, and it has
to be redeployed?
I'll let the BlueStore guys speak to this more directly, but I believe
you lose the OSD.
However, let's be clear: this is not really a different situation than
with FileStore. You *can*, with FileStore, fix the XFS filesystem and
persuade the OSD to start up again by giving it a new journal. But
this is a *lie* to the OSD about the state of its data and is very
likely to introduce data loss or inconsistencies. You shouldn't do it
unless the OSD hosts the only copy of a PG in your cluster.
-Greg
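One way to check that last condition before destroying an OSD, assuming
Luminous or later (the OSD id is hypothetical):

    ceph osd safe-to-destroy osd.12     # reports whether any PG would lose its last copy
    ceph pg dump_stuck degraded         # list PGs that are currently degraded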
What about WAL? Any specific commands to restore it, similar to
xfs_repair?
I didn't find any docs regarding this matter, but maybe I'm searching
badly, so a link to such a doc would be great.
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com