Re: Ceph Luminous RocksDB vs WalDB?

I'm going to hope that Igor is correct since I have a PR for DeepSea to change 
this exact behavior.

With respect to ceph-deploy, if you specify --block-wal, your OSD will have a 
block.wal symlink.  Likewise, --block-db will give you a block.db symlink.
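
A quick way to sanity-check which layout you actually got (assuming the 
default /var/lib/ceph paths) is to look at the symlinks in the OSD data 
directory:

  ls -l /var/lib/ceph/osd/ceph-*/block.db /var/lib/ceph/osd/ceph-*/block.wal

An OSD created with only --block-db should show just the block.db symlink 
pointing at the fast partition.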

If you have both on the command line, you will get both.  That does work, but 
it also means twice as many partitions to manage on a shared device.  We have 
been doing Bluestore this way in DeepSea since we started supporting 
Bluestore.
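
To make "twice the partitions" concrete: with that scheme each OSD gets two 
slices carved out of the shared NVMe, something like the following (sgdisk 
syntax; the sizes and names here are only illustrative):

  sgdisk -n 1:0:+10G -c 1:osd1-wal /dev/nvme0n1
  sgdisk -n 2:0:+25G -c 2:osd1-db  /dev/nvme0n1

Multiply that by the number of OSDs per node and the partition table gets 
crowded quickly.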

Recently, I learned that this is not necessary when both are on the same 
device.  The WAL for a Bluestore OSD will use the DB device when its size is 
set to 0.  I will soon verify whether that also holds when the block.wal 
symlink is absent.
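
If I'm reading it right, the knob involved is bluestore_block_wal_size (treat 
that as my assumption, not gospel): leaving it at 0, e.g.

  [osd]
  bluestore_block_wal_size = 0

should make the WAL land on the DB device, and you can check what a running 
OSD actually picked up with something like:

  ceph daemon osd.0 config get bluestore_block_wal_size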

Eric

On Thursday, June 28, 2018 5:07:25 PM EDT Kai Wagner wrote:
> I'm also not 100% sure but I think that the first one is the right way
> to go. The second command only specifies the db partition but no
> dedicated WAL partition. The first one should do the trick.
> 
> On 28.06.2018 22:58, Igor Fedotov wrote:
> > I think the second variant is what you need. But I'm no ceph-deploy
> > guru, so there might be some nuances there...
> > 
> > Anyway the general idea is to have just a single NVME partition (for
> > both WAL and DB) per OSD.
> > 
> > Thanks,
> > 
> > Igor
> > 
> > On 6/27/2018 11:28 PM, Pardhiv Karri wrote:
> >> Thank you Igor for the response.
> >> 
> >> So do I need to use this,
> >> 
> >> ceph-deploy osd create --debug --bluestore --data /dev/sdb
> >> --block-wal /dev/nvme0n1p1 --block-db /dev/nvme0n1p2 cephdatahost1
> >> 
> >> or 
> >> 
> >> ceph-deploy osd create --debug --bluestore --data /dev/sdb --block-db
> >> /dev/nvme0n1p2 cephdatahost1
> >> 
> >> where /dev/sdb is the SSD disk for the OSD,
> >> /dev/nvme0n1p1 is a 10G partition, and
> >> /dev/nvme0n1p2 is a 25G partition.
> >> 
> >> 
> >> Thanks,
> >> Pardhiv K
> >> 
> >> On Wed, Jun 27, 2018 at 9:08 AM Igor Fedotov <ifedotov@xxxxxxx> wrote:
> >>     Hi Pardhiv,
> >>     
> >>     there is no WalDB in Ceph.
> >>     
> >>     It's the WAL (Write Ahead Log), which is the mechanism RocksDB uses
> >>     to ensure write safety. In other words, it's just a RocksDB
> >>     subsystem, though one that can be placed on a separate volume.
> >>     
> >>     In general, for BlueStore/BlueFS one can either allocate separate
> >>     volumes for the WAL and the DB or keep them on the same volume. The
> >>     latter is the more common option.
> >>     
> >>     The separated layout makes sense when you have a tiny but
> >>     super-fast device (for the WAL) and a larger, less performant (but
> >>     still fast) drive for the DB. Not to mention the third one for user
> >>     data....
> >>     
> >>     E.g. HDD (user data) + SSD (DB) + NVMe (WAL) is such a layout.
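> >>     
> >>     As a rough sketch (device names purely as examples), that separated
> >>     layout would be created with something like:
> >>     
> >>     ceph-deploy osd create --data /dev/sdc --block-db /dev/sdb1
> >>     --block-wal /dev/nvme0n1p1 <host>
> >>     
> >>     while the merged layout only needs --data plus --block-db.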
> >>     
> >>     
> >>     So for your case, IMO, it's optimal to have a merged WAL+DB on the
> >>     NVMe and data on the SSD. Hence there is no need for a separate WAL
> >>     volume.
> >>     
> >>     
> >>     Regards,
> >>     
> >>     Igor
> >>     
> >>     On 6/26/2018 10:22 PM, Pardhiv Karri wrote:
> >>>     Hi,
> >>>     
> >>>     I am playing with Ceph Luminous and getting confusing information
> >>>     about the usage of WalDB vs RocksDB.
> >>>     
> >>>     I have a 2TB NVMe drive which I want to use for the WalDB/RocksDB,
> >>>     and I have five 2TB SSDs for OSDs.
> >>>     I am planning to create five 30GB partitions for RocksDB on the
> >>>     NVMe drive. Do I need to create partitions for WalDB on the NVMe
> >>>     drive as well, or does RocksDB do the same work as WalDB while
> >>>     also holding the metadata?
> >>>     
> >>>     So my question is: do I really need to use WalDB along with
> >>>     RocksDB, or is having RocksDB only fine?
> >>>     
> >>>     Thanks,
> >>>     Pardhiv K
> >>>     
> >>>     
> >>>     
> >>>     
> >>>     
> > 


