On 5/21/15, 5:04 AM, "Blair Bethwaite" <blair.bethwaite@xxxxxxxxx> wrote:

>Hi Warren,
>
>On 20 May 2015 at 23:23, Wang, Warren <Warren_Wang@xxxxxxxxxxxxxxxxx> wrote:
>> We've contemplated doing something like that, but we also realized that
>> it would result in manual work in Ceph every time we lose a drive or
>> server, and a pretty bad experience for the customer when we have to
>> do maintenance.
>
>Yeah I guess you have to delete and recreate the pool, but is that
>really so bad?

Or trash the associated volumes. Plus the perceived failure rate from the client's perspective would be high, especially when we have to do things like reboots.

>> We also kicked around the idea of leveraging the notion of a Hadoop rack
>> to define a set of instances which are Cinder volume backed, and the rest
>> be ephemeral drives (not Ceph backed ephemeral). Using 100% ephemeral
>> isn't out of the question either, but we have seen a few instances where
>> all the instances in a region were quickly terminated.
>
>What's the implication here - the HDFS instances were terminated and
>that would have caused Hadoop data-loss had they been ephemeral?

Yeah. With three-way replication HDFS can of course tolerate losing up to two of the three replicas, but losing 100% of the instances would mean permanent data loss.

I see the Intel folks are tackling this from the object-backed approach:
https://wiki.ceph.com/Planning/Blueprints/Infernalis/rgw%3A_Hadoop_FileSystem_Interface_for_a_RADOS_Gateway_Caching_Tier

Probably should have chatted with them about that. I totally forgot.
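
PS for the archives, since the delete/recreate point came up: the Ceph side really is just a couple of commands. A minimal sketch; the pool name and PG counts below are made up:

    # delete the old pool (the name is given twice as a safety check)
    ceph osd pool delete hdfs-data hdfs-data --yes-i-really-really-mean-it

    # recreate it with the same placement group counts (pg_num pgp_num)
    ceph osd pool create hdfs-data 1024 1024

The commands are the easy part. The manual work is doing this per tenant every time a drive or server dies, plus recreating the associated volumes and sitting through the client-visible downtime.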
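
PS on the Hadoop rack idea, in case anyone wants to try it: HDFS rack awareness hangs off a topology script, configured via net.topology.script.file.name in core-site.xml. Hadoop invokes the script with a batch of datanode addresses and expects one rack path printed per address. A rough sketch only - the subnet and rack names are invented, and it assumes datanodes are identified by IP:

    #!/bin/bash
    # Print a "rack" for each datanode address Hadoop passes as an argument.
    # Cinder-volume-backed instances live on a hypothetical 10.0.1.0/24
    # subnet; everything else is ephemeral-disk backed.
    for host in "$@"; do
      case "$host" in
        10.0.1.*) echo "/cinder/rack1" ;;
        *)        echo "/ephemeral/rack1" ;;
      esac
    done

With the default block placement policy HDFS spreads replicas across at least two racks, so every block should keep at least one replica on the Cinder-backed "rack" even if all the ephemeral instances disappear at once.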
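
PS on the replica math: the tolerance comes from the HDFS replication factor, which defaults to 3 (dfs.replication in hdfs-site.xml), so any single block survives losing two of its three replicas:

    <!-- hdfs-site.xml: default per-file replication factor -->
    <property>
      <name>dfs.replication</name>
      <value>3</value>
    </property>

Replication inside the cluster doesn't help when 100% of the instances in a region are terminated at once, though.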