Hi Manuel,
I was the one who did Red Hat's IO500 CephFS submission. Feel free to
ask any questions you like. Generally speaking, I could achieve 3GB/s
pretty easily per kernel client and up to about 8GB/s per client with
libcephfs directly (up to the aggregate cluster limits, assuming enough
concurrency).

Metadata is trickier. The fastest option is to spread files across
directories that you manually pin round-robin to MDSes, though you can
do somewhat well with ephemeral pinning too as a more automatic option.
If you have lots of clients dealing with lots of files in a single
directory, you fall back to dynamic subtree partitioning, which tends
to be quite a bit slower (though at least some of that is due to
journaling overhead on the auth MDS). That's especially true if you
have a significant number of active/active MDS servers (say 10-20+).
We tended to do consistently well on the "easy" IO500 tests and
struggled more with the "hard" tests.

Otherwise most of the standard Ceph caveats apply: replication eats
into write performance, scrub/deep scrub can impact performance,
choosing the right NVMe drive with power-loss protection and low
overhead is important, etc.
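For reference, the manual round-robin pinning mentioned above is just
an extended attribute on each directory. A minimal sketch (the mount
point, directory layout, and MDS count are hypothetical, adjust for
your cluster):

```shell
# Hypothetical CephFS mount point and number of active MDS ranks.
MOUNT=/mnt/cephfs/projects
MAX_MDS=4

# Pin each top-level directory to an MDS rank, round-robin.
rank=0
for dir in "$MOUNT"/*/; do
    setfattr -n ceph.dir.pin -v "$rank" "$dir"
    rank=$(( (rank + 1) % MAX_MDS ))
done

# The more automatic option: ephemeral distributed pinning spreads the
# immediate children of a directory across ranks for you.
setfattr -n ceph.dir.pin.distributed -v 1 "$MOUNT"
```

Setting ceph.dir.pin back to -1 removes a manual pin.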
Probably the most important questions you should be asking yourself
are how you intend to use the storage, what you need out of it, and
what you have to do to get there. Ceph has a lot of advantages
regarding replication, self-healing, and consistency, and it's quite
fast for some workloads given those advantages. Some workloads,
though (say, unaligned small writes from hundreds of clients to random
files in a single directory), could potentially be pretty slow.
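As one example of dealing with the scrub caveat, (deep) scrubbing can
be confined to off-peak hours via the OSD scrub window options; the
hours below are arbitrary, and this assumes the centralized config
store (Mimic or later):

```shell
# Only allow scrubs between 22:00 and 06:00 local time on the OSDs.
ceph config set osd osd_scrub_begin_hour 22
ceph config set osd osd_scrub_end_hour 6
```

By default both are 0, which allows scrubbing at any time of day.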
Mark
On 7/21/21 8:54 AM, Manuel Holtgrewe wrote:
Dear all,
we are looking towards setting up an all-NVMe CephFS instance in our
high-performance compute system. Does anyone have experience to share
with an HPC setup, or with an NVMe setup mounted by dozens of nodes or
more?

I've followed the impressive work done at CERN on YouTube, but
otherwise there appear to be only a few places using CephFS this way.
There are a few CephFS-as-enterprise-storage vendors that sporadically
advertise CephFS for HPC, but it does not appear to be a strategic
main target for them.
I'd be happy to read about your experience/opinion on CephFS for HPC.
Best wishes,
Manuel
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx