Oh wow, a lot to read piled up in one night :) First things first: I want to thank you all for your insights and for the really valuable knowledge I pulled from this mail thread.

Regarding flash only:
We use flash-only clusters for our RBD clusters. This is very nice, and most of the maintenance is "install updates and reboot". In the two years I have worked for this company I have seen one disk die (rotating) out of around 700 disks in total across eleven clusters (very different in size; two of them hold the vast majority of the disks, one is the main S3 cluster). I would love to use flash only for our S3 as well, but it is expensive. I think most of our "maintenance cost" is the actual RGW and figuring out how to use it correctly, rather than disk management.

Why we use 8TB disks:
I don't really know. We have some older 4TB disks in there (some got re-added when we very quickly ran into disk limitations; we just needed space right NOW and those were lying around), plus some 16TB disks. We found out that the 16TB disks make some of the hosts really heavy (we bought and added them in the "need much space ASAP" situation and didn't have time to place them optimally in the cluster) and they are expensive, so we are now going with 8TB. We are also redistributing disks in the cluster, as there are 2RU chassis with 7 disks (currently 4TB and 16TB) and 4RU chassis with >20 disks (with 4, 8 and 16TB disks). Also, CRUSH seems to have a hard time distributing PGs around the cluster when the disk sizes are too scattered: a lot of OSDs are lingering around 79.9%-80% used disk space, a lot sit at the other end at 65%-70%, and very few are actually in the 73% range (which is the overall cluster utilization).

DWPD anyone?
I don't really care about this, because our S3 cluster is more a data dump than an actual object storage. 90% of the data is exported RBD snapshots (we implemented our managed backup center and needed some place to store the data, and S3 was really nice because it makes things very easy on the client's end and can be used from everywhere). It looks like the cluster gets a lot of writes (at least 3/4 of the traffic is writes to the cluster), but it is only 200-500 MB/s. I can't imagine that I am hitting any DWPD threshold.

Why I came up with this question:
We just have a ton of 2TB SSDs lying around (I don't know the exact number, but I think it is around 30) because we replace small disks with larger disks instead of just buying a new chassis with disks, and I wanted to put these disks to good use in our S3 cluster. Reading about all the problems that might arise from DB/WAL SSDs makes me think I should not just use them. Instead: just add more 8TB disks to the 4RU chassis, and when those are out of slots, add another 8RU chassis with 8TB disks. Basically I am just cleaning up old technical debt and emergency decisions, and I want to do it now in the most optimal way with the resources we have. :)

Cheers
 Boris
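
P.S. To put a rough number on the DWPD point, a back-of-envelope sketch (the replication factor and the drive count here are just assumptions for illustration, not our real values):

  500 MB/s sustained writes            ~= 43 TB of client writes per day
  x3 (assuming 3x replication)         ~= 130 TB/day actually hitting the OSDs
  over, say, 100 x 8 TB drives (800 TB raw)  ~= 0.16 drive writes per day

Even under those pessimistic assumptions that stays well below the ~1 DWPD that read-intensive SSDs are typically rated for, which is why I don't see endurance as the limiting factor here.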