I'm in the process of adding more resources to an existing cluster.
I'll have 38 hosts, with 2 HDDs each, for an EC pool; all of the hosts are in the same rack. I plan on adding a cache pool in front of it. Is it worth it? The data is S3, mostly writes, and objects usually range from 200 kB up to several MB or GB. All the other pools will be replicated and will sit on separate SSD-based storage.
This cluster is currently on Hammer, so I was looking at using LRC. Is it worth using LRC over standard jerasure? What would be good values for k and m? I was thinking of k=12, m=4, l=4, as I have more than enough hosts for these values, but what if I lose more than one host? Will LRC still be able to recover using the "adjacent" group?
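
To make those numbers concrete, here is a rough sketch of the chunk arithmetic. It assumes the LRC plugin's simple k/m/l form, where one extra local parity chunk is added for every l chunks (so k+m must be a multiple of l), and it assumes one chunk per host. The function names are made up for illustration, and the "single_loss_reads" figures are the usual minimums, not measurements. It says nothing about the multi-host failure case.

```python
def jerasure_layout(k, m):
    """Chunk counts for a plain k+m code such as jerasure."""
    total = k + m
    return {
        "total_chunks": total,
        "local_parity": 0,
        "overhead": total / k,      # raw bytes stored per byte of data
        "single_loss_reads": k,     # minimum chunks read to rebuild one lost chunk
    }


def lrc_layout(k, m, l):
    """Chunk counts for LRC, assuming one local parity chunk per l chunks."""
    if (k + m) % l != 0:
        raise ValueError("k + m must be a multiple of l")
    local_parity = (k + m) // l
    total = k + m + local_parity
    return {
        "total_chunks": total,
        "local_parity": local_parity,
        "overhead": total / k,
        "single_loss_reads": l,     # rebuild from the local group only
    }


if __name__ == "__main__":
    hosts = 38  # from the setup described above
    layouts = (
        ("jerasure k=12 m=4", jerasure_layout(12, 4)),
        ("lrc k=12 m=4 l=4", lrc_layout(12, 4, 4)),
    )
    for name, layout in layouts:
        fits = layout["total_chunks"] <= hosts
        print(name, layout, "one chunk per host fits:", fits)
```

Under these assumptions k=12, m=4, l=4 means 20 chunks per object instead of 16, so the price for cheaper single-host recovery is roughly 1.67x raw usage instead of 1.33x.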
And what about performance? From Somnath's email, it seemed that the larger k and m are, the worse it performs...
What are the usual values you all use?
PS: I still haven't seen Mark Nelson's performance presentation...