Re: Bcache, partitions and BlueStore

> On 26 September 2016 at 19:51, Sam Yaple <samuel@xxxxxxxxx> wrote:
> 
> 
> On Mon, Sep 26, 2016 at 5:44 PM, Wido den Hollander <wido@xxxxxxxx> wrote:
> 
> >
> > > On 26 September 2016 at 17:48, Sam Yaple <samuel@xxxxxxxxx> wrote:
> > >
> > >
> > > On Mon, Sep 26, 2016 at 9:31 AM, Wido den Hollander <wido@xxxxxxxx>
> > wrote:
> > >
> > > > Hi,
> > > >
> > > > This has been discussed on the ML before [0], but I would like to bring
> > > > this up again with the outlook towards BlueStore.
> > > >
> > > > Bcache [1] allows for block device level caching in Linux. This can be
> > > > read/write(back) and vastly improves read and write performance to a
> > block
> > > > device.
> > > >
> > > > With the current layout of Ceph with FileStore you can already use
> > bcache,
> > > > but not with ceph-disk.
> > > >
> > > > The reason is that bcache currently does not support creating
> > partitions
> > > > on those devices. There are patches [2] out there, but they are not
> > > > upstream.
> > > >
> > > > I haven't tested it yet, but it looks like BlueStore can still benefit
> > > > quite a bit from Bcache, and it would be a lot easier if the patches [2]
> > > > were merged upstream.
> > > >
> > > > This way you would have:
> > > >
> > > > - bcache0p1: XFS/EXT4 OSD metadata
> > > > - bcache0p2: RocksDB
> > > > - bcache0p3: RocksDB WAL
> > > > - bcache0p4: BlueStore DATA
> > > >
> > > > With bcache you could create multiple bcache devices by creating
> > > > partitions on the backing disk and creating bcache devices for all of
> > them,
> > > > but that's a lot of work and not easy to automate with ceph-disk.
> > > >
> > > > So what I'm trying to find is the best route to get this upstream in
> > the
> > > > Linux kernel. That way next year when BlueStore becomes the default in
> > L
> > > > (luminous) users can use bcache underneath BlueStore easily.
> > > >
> > > > Does anybody know the proper route we need to take to get this fixed
> > > > upstream? Does anyone have contacts with the bcache developers?
> > > >
> > >
> > > Kent is pretty heavy into developing bcachefs at the moment. But you can
> > > hit him up on IRC at OFTC #bcache. I've talked to him about this before
> > > and he is 100% willing to accept any patch that solves this issue in the
> > > standard way the kernel typically allocates major/minors for disks. The blog
> > > post you listed from me does _not_ solve this in an upstream way, though
> > > the final result is pretty accurate from my understanding.
> > >
> >
> > No, I understood that the blog indeed doesn't solve that.
> >
> > > I will look into a better way to patch this upstream since there is
> > > renewed interest in this.
> > >
> >
> > That would be great! My kernel knowledge is too limited to look into this,
> > but if you could help with this it would be nice.
> >
> > If this hits the kernel somewhere in Nov/Dec we should be good for a
> > kernel release somewhere together with L for Ceph.
> >
> > > Also, check out bcachefs if you like bcache. It's up and coming, but it is
> > > pretty sweet. My goal is to use bcachefs with bluestore in the future.
> > >
> >
> > bcachefs with bluestore? The OSD doesn't require a filesystem with
> > BlueStore, just a raw block device :)
> >
> Well, there are parts of the OSD that still use a file system that can
> benefit from the caching (RocksDB and WAL). This is what I meant. There is a
> tiering system with bcachefs which currently only supports 2 tiers, but
> will eventually allow for 15 tiers, so you could have a small and fast PCI
> caching tier, followed by SSD, followed by spinning disk. You could control
> what data can exist on which tier (and potentially with
> writeback/writethrough). Lots of room for configurations to improve
> performance.
> 

Interesting! Although RocksDB and its WAL can also be a partition, which would be bcache again.
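
For reference, the workaround mentioned earlier in the thread (partitioning the backing disk and creating one bcache device per partition) could be sketched roughly as below. This is a dry-run sketch only: it prints commands instead of running them. The device names (/dev/sdb, /dev/nvme0n1p1), the helper name plan_bcache_per_partition and the sd-style partition naming (/dev/sdb1 to /dev/sdb4) are assumptions for illustration. It also assumes the four partitions already exist on the backing disk.

```python
# Dry-run sketch: build the make-bcache commands, do not execute them.
# All device names below are hypothetical examples.

# One partition per role, matching the layout listed earlier in the thread.
ROLES = [
    "OSD metadata (XFS/EXT4)",
    "RocksDB",
    "RocksDB WAL",
    "BlueStore data",
]


def plan_bcache_per_partition(backing_disk, cache_dev=None):
    """Return shell commands that create one bcache device per partition.

    Assumes sd-style naming (disk + partition number, e.g. /dev/sdb1).
    """
    cmds = []
    if cache_dev:
        # Format the caching device once; it can serve all backing devices.
        cmds.append(f"make-bcache -C {cache_dev}")
    for number, role in enumerate(ROLES, start=1):
        # Format each partition as its own bcache backing device.
        cmds.append(f"make-bcache -B {backing_disk}{number}  # {role}")
    return cmds


if __name__ == "__main__":
    for cmd in plan_bcache_per_partition("/dev/sdb", "/dev/nvme0n1p1"):
        print(cmd)
```

Each resulting bcache device would still have to be attached to the cache set (by writing the cache set UUID to /sys/block/bcacheN/bcache/attach), which is the part that is tedious to automate with ceph-disk.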

However, I sent a message to the linux-bcache mailing list [0]; I hope we can get a proper patch into the kernel soon.

Any input, help or suggestions there would be nice!

Wido

[0]: https://marc.info/?l=linux-bcache&m=147507062812270

> SamYaple
> 
> 
> > Wido
> >
> > >
> > > >
> > > > Thanks!
> > > >
> > > > Wido
> > > >
> > > > [0]: http://www.spinics.net/lists/ceph-devel/msg29550.html
> > > > [1]: https://bcache.evilpiepirate.org/
> > > > [2]: https://yaple.net/2016/03/31/bcache-partitions-and-dkms/
> > > >
> > >
> > >
> > > SamYaple
> >
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


