Thanks Christian,

I'm using a pool with size 3 and min_size 1. I can see the cluster serving I/O in a degraded state after the OSD is marked down, but the problem we have is in the interval between the OSD failure event and the moment that OSD is marked down.

In that interval (which can take up to 10 minutes), all I/O operations directed to that OSD are blocked, so all the virtual machines using the RBDs provided by the cluster hang until the failed OSD is finally marked down.

Is this the expected behavior of the cluster during a failure? Is it possible to shorten that interval so that I/O operations don't stay blocked for so long?

Thanks,

On 11/04/2016 07:25 PM, Christian Wuerdig wrote:
--
Fernando Cid O.
Ingeniero de Operaciones
AltaVoz S.A.
http://www.altavoz.net
Viña del Mar, Valparaiso: 2 Poniente 355 of 53, +56 32 276 8060
Santiago: San Pío X 2460, oficina 304, Providencia, +56 2 2585 4264