On 6 November 2013 14:08, Andrey Korolyov <andrey@xxxxxxx> wrote:
>> We are looking at building high density nodes for small scale 'starter'
>> deployments for our customers (maybe 4 or 5 nodes). High density in this
>> case could mean a 2u chassis with 2x external 45 disk JBOD containers
>> attached. That's 90 3TB disks/OSDs to be managed by a single node. That's
>> about 243TB of potential usable space, and so (assuming up to 75% fillage)
>> maybe 182TB of potential data 'loss' in the event of a node failure. On an
>> uncongested, unused, 10Gbps network, my back-of-a-beer-mat calculations say
>> that would take about 45 hours to get the cluster back into an undegraded
>> state - that is, the requisite number of copies of all objects.
>
> For such a large number of disks you should consider that cache
> amortization will not take place even if you are using 1GB
> controller(s) - only a tiered cache can be an option. Also, recovery will
> take much more time even if you allow room for client I/O in the
> calculations, because raw disks have very limited IOPS capacity, and
> recovery will either take much longer than such a first-glance estimate
> suggests or affect regular operations. For S3/Swift that may be
> acceptable, but for VM images it is not.
Sure, but my argument was that you are never likely to actually let that entire recovery operation complete - you're going to replace the hardware, plug the disks back in, and let them catch up by log replay/backfill. That's assuming, of course, that you don't ever actually expect to lose all the data on 90 disks in one go...
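For what it's worth, here's the beer-mat arithmetic written down as a quick Python sketch. The inputs are just my assumptions (3TB disks, a ~90% raw-to-usable factor to land near your 243TB figure, 75% fillage, a dedicated 10Gbps link), and the 'efficiency' factors at the end are pure guesswork - they're only there to show how quickly the wire-speed number stops being the limit once seek-bound spindles and competing client I/O get involved, which I take to be Andrey's point:

# Rough recovery-time estimate for losing a 90-disk node.
# All inputs are assumptions for illustration, not measurements.
DISKS = 90
DISK_TB = 3.0
USABLE_FRACTION = 0.9      # raw -> usable, chosen to land near the 243TB figure
FILL = 0.75                # assumed fillage
LINK_GBPS = 10.0           # uncongested, otherwise-unused recovery link

raw_tb = DISKS * DISK_TB                  # 270 TB raw
usable_tb = raw_tb * USABLE_FRACTION      # ~243 TB usable
data_tb = usable_tb * FILL                # ~182 TB to re-replicate

link_gb_per_s = LINK_GBPS / 8             # 10 Gbps ~= 1.25 GB/s

# Best case is recovery at full wire speed; the lower 'efficiency' guesses
# stand in for spindles that are busy seeking and serving client I/O
# at the same time.
for efficiency in (1.0, 0.5, 0.25):
    hours = data_tb * 1000 / (link_gb_per_s * efficiency) / 3600
    print(f"{data_tb:.0f} TB at {efficiency:.0%} of wire speed: ~{hours:.0f} h")

Even the 100% line is optimistic, which is really just another way of saying the 45-hour figure is a floor rather than an estimate.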
By tiered caching, do you mean using something like flashcache or bcache?