Hi,

On Mon, 21 Jun 2010, Christopher McLean wrote:
> Hi,
>
> Going to cut to the chase on this. I'm pretty sure that this is not the
> correct place to ask, but the IRC channel is quiet and we can find no
> other sources of information - sorry in advance!
>
> We've been hunting for some stats on Ceph, notably disk utilisation,
> e.g. given 3 servers providing 100 GB each to the cluster, how much
> usable space would be available after management/redundancy overheads?
> We can't find anything of relevance online. We need the info for a
> comparative study into the costs of deploying Ceph.
>
> Any help/pointers/references would be of great help!

Generally speaking, you need to account for 2x replication, btrfs
overhead, cosd overhead (the contents of the $osd_data/current/meta
directory), and the pseudorandom distribution.

The last one is the trickiest. Because data is placed based on a hash,
there will be some natural variance in osd utilization, which will
depend on the total number of objects and osds. There is a facility in
CRUSH to adjust the distribution to correct for that natural variance,
but Ceph isn't using it yet. I would probably allow for 10% utilization
variance for a smallish cluster, and maybe another 10% on top of that
to be safe. Something along the lines of total_disk * .8 /
replication_level? You generally shouldn't fill any file system beyond
80% or 90% anyway and expect it to perform well.

sage
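A minimal sketch of that back-of-envelope math, just to make the
arithmetic concrete (the function name and the 0.8 fill factor are
illustrative assumptions, not anything Ceph reports):

```python
# Back-of-envelope estimate of usable Ceph cluster capacity.
# The 0.8 fill factor budgets ~20% for utilization variance from
# pseudorandom placement plus file system headroom; replication_level=2
# matches the 2x replication mentioned above. All values are rough
# assumptions for planning, not measured overheads.

def usable_capacity(total_disk_gb, replication_level=2, fill_factor=0.8):
    """Estimate usable space as total_disk * fill_factor / replication_level."""
    return total_disk_gb * fill_factor / replication_level

# Example from the original question: 3 servers x 100 GB each.
print(usable_capacity(300))  # 300 * 0.8 / 2 = 120.0 GB
```

So for the 3 x 100 GB example, a conservative planning number would be
on the order of 120 GB usable.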