Re: how to speed up hundreds of millions small files read base on cephfs?

Hello,

I don't know whether these optimizations apply to HDD clusters, but here are a few we made on our all-NVMe Ceph cluster (used for ML/AI workloads). Ceph is deployed on Kubernetes worker nodes.

54 nodes, 1 NVMe drive per node, 2 partitions per NVMe drive (4 would be better for NVMe, I guess).

Increasing the number of threads per OSD (the default is 1, afaik) and the memory allocated to the MDS daemons improved IOPS by about 3x:

mds_cache_memory_limit = 137438953472
osd_disk_threads = 4
osd_memory_cache_min = 4294967296
osd_op_num_threads_per_shard = 5
osd_op_queue_cut_off = high
osd_op_threads = 4
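
For readers who want to try the same values, here is a minimal sketch that turns the list above into `ceph config set` commands and converts the two byte values into GiB. It is only an illustration: the message does not say how the settings were applied, the mapping from option prefix to daemon type (`mds`, `osd`) is an assumption, and some of these option names may not be recognized by every Ceph release (`ceph config help <option>` will tell you).

```python
# Settings copied from the message above. Byte values stay as plain integers.
SETTINGS = {
    "mds_cache_memory_limit": 137438953472,
    "osd_disk_threads": 4,
    "osd_memory_cache_min": 4294967296,
    "osd_op_num_threads_per_shard": 5,
    "osd_op_queue_cut_off": "high",
    "osd_op_threads": 4,
}

GIB = 1024 ** 3


def daemon_section(option):
    # Assumption: the option prefix ("mds", "osd") names the daemon type.
    return option.split("_", 1)[0]


def to_gib(value_in_bytes):
    # Convert a byte count to GiB for easier reading.
    return value_in_bytes / GIB


def config_commands(settings):
    # Build one "ceph config set <who> <option> <value>" line per setting.
    return [
        f"ceph config set {daemon_section(name)} {name} {value}"
        for name, value in settings.items()
    ]


if __name__ == "__main__":
    for command in config_commands(SETTINGS):
        print(command)
    print("mds_cache_memory_limit in GiB:", to_gib(SETTINGS["mds_cache_memory_limit"]))
    print("osd_memory_cache_min in GiB:", to_gib(SETTINGS["osd_memory_cache_min"]))
```

In other words, the MDS cache limit above is 128 GiB and the OSD cache minimum is 4 GiB, so the MDS nodes need a correspondingly large amount of RAM.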

Adding more MDSs, going from 2 active + 1 standby to 7 active + 1 standby, improved IOPS by about 8x.
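
As a rough sketch of what the change to 7 active MDSs involves: the number of active ranks is controlled per file system with `ceph fs set <fs_name> max_mds <n>`, and enough MDS daemons must be running to cover the active ranks plus the standbys. The file system name `cephfs` below is a hypothetical placeholder, and on a Kubernetes deployment the operator may manage this setting instead.

```python
def max_mds_command(fs_name, active):
    # Command that sets the number of active MDS ranks for one file system.
    return f"ceph fs set {fs_name} max_mds {active}"


def daemons_needed(active, standby):
    # Total MDS daemons that must be running: active ranks plus standbys.
    return active + standby


if __name__ == "__main__":
    # "cephfs" is a placeholder name, not taken from the message.
    print(max_mds_command("cephfs", 7))
    print("MDS daemons needed:", daemons_needed(7, 1))
```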

For the new MDSs, we chose CPUs with fewer cores but a higher clock frequency.

We now have between 200K and 300K IOPS available on the cluster.

One more optimization we could make is to run more than one MDS daemon per node, each with less memory.

Anyway, even with all that, we still sometimes get "slow ops", "slow request", and "Behind on trimming" warnings, but less often than before.

Best regards,

Yoann

On 01.09.22 at 10:58, zxcs wrote:
Hi, experts,

We are using CephFS (15.2.*) with the kernel mount in our production environment. These days, when we do massive reads from the cluster (multiple processes), ceph health always reports slow ops for some OSDs (built on 8 TB HDDs, with SSDs as the DB cache).

Our cluster serves more read requests than write requests.

The health log looks like this:
100 slow ops, oldest one blocked for 114 sec, [osd.* ...] has slow ops (SLOW_OPS)

My question: are there any best practices for handling hundreds of millions of small files (100 KB to 300 KB each, 10000+ files per directory, and more than 5000 directories)? Is there any config we can tune or any patch we can apply to speed up reads (which matter more to us than writes)? And is there any other file system we could try? (We are also not sure CephFS is the best choice for storing such a huge number of small files.)

Experts, please shed some light here! We really need your help!

Any suggestions are welcome! Thanks in advance!~

Thanks,
zx

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
