Thanks a lot for the insightful comments. Replies inline below.

From: Christian Wuerdig
Date: 2021-10-22 02:13
To: huxiaoyu@xxxxxxxxxxxx
CC: ceph-users
Subject: Re: Open discussing: Designing 50GB/s CephFS or S3 ceph cluster

What is the expected file/object size distribution and count?

Let us suppose the file/object size is 1MB or even higher.

Is it write-once or modify-often data?

Write once and read many; very few modifications.

What's your overall required storage capacity?

Exabyte-level storage capacity.

18 OSDs per WAL/DB drive seems a lot - recommended is ~6-8. With 12TB OSDs the recommended WAL/DB size is 120-480GB (1-4%) per OSD to avoid spillover - if you go RGW then you may want to aim more towards 4%, since RGW can use quite a bit of OMAP data (especially when you store many small objects). Not sure about CephFS. So you may want to look at 4x NVMe, and probably 3.2TB instead of 1.6TB.

Does Nautilus 14.2.22 support flexible WAL/DB sizes? I remember that previously the only effectively used sizes were 3, 30, and 300GB.

Rule of thumb is 1 thread per HDD OSD - so if you want to give yourself some extra wiggle room a 7402 might be better, especially since EC is a bit heavier on CPU.

Agree, the 7402 is a better choice.

Running EC 8+3 with failure domain host means you should have at least 12 nodes, which means you'd need to push 4GB/sec/node. That seems theoretically possible but is quite close to the network interface capacity, and whether you could actually push 4GB/sec into a node in this configuration I don't know. But overall 12 nodes seems like the minimum. With 12 nodes you have a raw storage capacity of around 5PB - assuming you don't run your cluster more than 80% full, EC 8+3 means a maximum of about 3PB of usable data capacity (again assuming your objects are large enough not to cause significant space amplification wrt. the BlueStore minimum block size). You will probably run more nodes than that, so if you don't need the actual capacity then consider going replicated instead, which generally performs better than EC.

Agree, I will need more nodes.

On Fri, 22 Oct 2021 at 05:24, huxiaoyu@xxxxxxxxxxxx <huxiaoyu@xxxxxxxxxxxx> wrote:

Dear Cephers,

I am thinking of designing a CephFS or S3 cluster, with a target of achieving a minimum of 50GB/s (write) bandwidth. For each node, I prefer a 4U 36x 3.5" Supermicro server with 36x 12TB 7200 RPM HDDs, 2x Intel P4610 1.6TB NVMe SSDs as DB/WAL, a single-socket AMD 7302 CPU, and 256GB DDR4 memory. Each node comes with 2x 25Gb networking, mode-4 bonded. 8+3 EC will be used.

My questions are the following:

1 How many nodes should be deployed in order to achieve a minimum of 50GB/s, if possible, with the above hardware setting?
2 How many CephFS MDSes are required (suppose 1MB request size), and how many clients are needed to reach a total of 50GB/s?
3 From the perspective of getting the maximum bandwidth, which one should I choose, CephFS or Ceph S3?

Any comments, suggestions, or improvement tips are warmly welcome.

best regards,

Samuel

huxiaoyu@xxxxxxxxxxxx
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
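
For reference, here is a quick back-of-envelope check of the figures discussed above, written as a small Python sketch. All of the inputs (node count, OSD count and size, EC profile, fill limit, NIC speed, WAL/DB layout) are the assumptions stated in this thread, not measurements, and the parity-overhead line is only a rough (k+m)/k estimate of backend writes:

#!/usr/bin/env python3
# Back-of-envelope check of the sizing figures discussed in this thread.
# Assumptions from the discussion: 12 nodes, 36x 12TB HDD OSDs per node,
# EC 8+3 with failure domain "host", cluster kept below ~80% full,
# 2x 25GbE bonded per node, 4x 3.2TB NVMe per node for WAL/DB.

EC_K, EC_M = 8, 3                 # erasure-code data and parity chunks
NODES = 12                        # minimum node count suggested for EC 8+3 on hosts
OSDS_PER_NODE = 36
OSD_SIZE_TB = 12
MAX_FILL = 0.80                   # don't run the cluster more than ~80% full
TARGET_GBPS = 50                  # desired aggregate write bandwidth, GB/s

# Capacity: ~5 PB raw, ~3 PB usable, as stated above.
raw_pb = NODES * OSDS_PER_NODE * OSD_SIZE_TB / 1000
usable_pb = raw_pb * MAX_FILL * EC_K / (EC_K + EC_M)

# Throughput: ~4 GB/s of client writes per node; counting parity the backend
# writes roughly (k+m)/k times that, against ~6.25 GB/s of bonded NIC capacity.
client_gbps_per_node = TARGET_GBPS / NODES
backend_gbps_per_node = client_gbps_per_node * (EC_K + EC_M) / EC_K
nic_gbps_per_node = 2 * 25 / 8

# WAL/DB: 4x 3.2TB NVMe shared by 36 OSDs gives ~355 GB per OSD, i.e. roughly
# 3% of a 12TB OSD, which sits inside the 1-4% recommendation quoted above.
waldb_gb_per_osd = 4 * 3.2 * 1000 / OSDS_PER_NODE

print(f"raw ~{raw_pb:.1f} PB, usable ~{usable_pb:.1f} PB")
print(f"per node: ~{client_gbps_per_node:.1f} GB/s client, "
      f"~{backend_gbps_per_node:.1f} GB/s incl. parity, NIC ~{nic_gbps_per_node:.2f} GB/s")
print(f"WAL/DB per OSD: ~{waldb_gb_per_osd:.0f} GB "
      f"(~{waldb_gb_per_osd / (OSD_SIZE_TB * 10):.1f}% of OSD)")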