Hi,

Quoting Willi Schiegel (willi.schiegel@xxxxxxxxxxxxxx):
> Hello All,
>
> I have a HW RAID based 240 TB data pool with about 200 million files for
> users in a scientific institution. Data sizes range from tiny parameter
> files for scientific calculations and experiments to huge images of brain
> scans. There are group directories, home directories, Windows roaming
> profile directories organized in ZFS pools on Solaris operating systems,
> exported via NFS and Samba to Linux, macOS, and Windows clients.
>
> I would like to switch to CephFS because of the flexibility and
> expandability but I cannot find any recommendations for which storage
> backend would be suitable for all the functionality we have.
>
> Since I like the features of ZFS like immediate snapshots of very large
> data pools, quotas for each file system within hierarchical data trees and
> dynamic expandability by simply adding new disks or disk images without
> manual resizing would it be a good idea to create RBD images, map them onto
> the file servers and create zpools on the mapped images? I know that ZFS
> best works with raw disks but maybe a RBD image is close enough to a raw
> disk?

Some of these features also exist within Ceph. Ceph rebalances your data
across the cluster immediately, as opposed to ZFS (which only writes new
data to new disks). You can make snapshots of file systems and you can set
quotas, but with different caveats than with ZFS [1]. You would have to set
up Ceph with Samba (the vfs_ceph module) for your Windows / macOS clients
[2], which is not (yet) HA (or you would build something yourself with
Samba CTDB, the Ceph object locking support in CTDB and vfs_ceph). If you
need to support NFS you can do so in an HA fashion with nfs-ganesha
(Nautilus has fixed most caveats). You would need one or more MDS servers,
and whether it works out for you really depends on the workload (and the
number of clients). It might require tuning of the MDSs. It's one of the
more difficult interfaces of Ceph to comprehend.

ZFS on RBD would work (IIRC I have seen a presentation at Cephalocon by a
user combining ZFS and RBD), and I have certainly done it myself. As far as
ZFS is concerned it's just another disk (it depends on whether you map it
with rbd-nbd, krbd or as a virtual disk, but all three should work).

> Or would CephFS be the way to go? Can there be multiple CephFS pools for
> the group data folders and for the user's home directory folders, for
> example, or do I have to have everything in one single file space?

You can have multiple data pools and use xattrs to direct directories to
the correct pool. You can also use namespaces (Ceph namespaces, not Linux
ones) to "prefix" all objects put into CephFS. That's primarily there so
you can set permissions per namespace instead of on a whole pool, which
lets you restrict which objects a user is allowed to access. You can also
restrict the permission to create CephFS snapshots (from Nautilus onwards)
on a per-user basis.
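To make both options a bit more concrete, below is roughly what they look
like on the command line. These are untested sketches, not a recipe: the
pool, image, file system, directory and client names (rbd/zfs-backing,
tank, cephfs, cephfs_homes, /mnt/cephfs, client.willi) are made up, so
adjust them to your environment.

ZFS on top of an RBD image (mapped with krbd here):

    # create an RBD image, map it, and build a zpool on the mapped device
    rbd create rbd/zfs-backing --size 10T
    rbd map rbd/zfs-backing            # prints the device, e.g. /dev/rbd0
    zpool create tank /dev/rbd0

CephFS with an extra data pool, per-directory layouts, quotas and
namespaces, all set via xattrs and client caps:

    # add a second data pool to the file system
    ceph osd pool create cephfs_homes 64
    ceph fs add_data_pool cephfs cephfs_homes

    # pin a directory (and everything created below it) to that pool
    setfattr -n ceph.dir.layout.pool -v cephfs_homes /mnt/cephfs/home

    # put one user's objects in their own Ceph namespace within the pool
    setfattr -n ceph.dir.layout.pool_namespace -v willi /mnt/cephfs/home/willi

    # 1 TiB quota on that directory
    setfattr -n ceph.quota.max_bytes -v 1099511627776 /mnt/cephfs/home/willi

    # restrict the client to its path and namespace; the extra 's' on the
    # MDS cap is what allows this client to create snapshots
    ceph auth caps client.willi \
        mds 'allow rws path=/home/willi' \
        mon 'allow r' \
        osd 'allow rw pool=cephfs_homes namespace=willi'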
It would probably require a whole bunch of servers to get the sizing right
for Ceph, as opposed to your ZFS setup, for which you probably have one or
more nodes with a bunch of disks. But it would definitely scale better and
provide higher availability when done right, even if you would only use
Ceph for "block" devices. It would also require quite an investment in time
to get used to the quirks of running Ceph in production for large CephFS
workloads.

Gr. Stefan

[1]: https://docs.ceph.com/docs/master/cephfs/quota/

> Thank you very much.
>
> Best regards
> Willi

-- 
| BIT BV  https://www.bit.nl/        Kamer van Koophandel 09090351
| GPG: 0xD14839C6                    +31 318 648 688 / info@xxxxxx