I am not sure if there is a hard and fast rule you are after, but pretty much anything that would cause Ceph transactions to be blocked (flapping OSD, network loss, hung host) has the potential to block RBD IO, which would make your iSCSI LUNs unresponsive for that period. For the most part, though, once that condition clears things keep working, so it's not like a hang where you need to reboot to clear it.
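When it does happen, the blocked IO generally shows up as slow request warnings in the cluster health output. As a rough illustration only (the exact wording of the warning varies between Ceph releases, and the numbers below are made up):

    $ ceph health detail
    HEALTH_WARN 30 requests are blocked > 32 sec; 3 osds have slow requests
    $ ceph -w    # watch cluster events live while the condition is in progress

While those warnings are present, any RBD image whose IO is stuck behind them (and therefore any iSCSI LUN exported from it) will look hung to the initiator.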
Some situations we have hit with our setup:

- Failed OSDs (dead disks) – no issues.
- Cluster rebalancing – OK if throttled back to keep service times down (see the throttling sketch after this list).
- Network packet loss (bad fibre) – painful; broken communication everywhere, and it caused a krbd hang that needed a reboot.
- RBD snapshot deletion – disk latency through the roof, cluster unresponsive for minutes at a time; won't do that again.
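For the throttling mentioned above, this is roughly the sort of thing we mean. The option names are real Ceph OSD settings but the values are illustrative, so check them against your release before relying on them:

    # Limit how aggressively recovery/backfill competes with client IO.
    # injectargs changes the running OSDs; persist the same values in ceph.conf
    # if you want them to survive restarts.
    ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1'
    ceph tell osd.* injectargs '--osd-recovery-op-priority 1'

    # Slow down snapshot trimming so deleting a large RBD snapshot
    # doesn't swamp the disks with trim work.
    ceph tell osd.* injectargs '--osd-snap-trim-sleep 0.1'

The trade-off is that recovery and snapshot cleanup take longer, but client service times stay sane while they run.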
From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of Brady Deetz

I apologize if this is a duplicate of something recent, but I'm not finding much. Does the issue still exist where dropping an OSD results in a LUN's I/O hanging? I'm attempting to determine if I have to move off of VMWare in order to safely use Ceph as my VM storage.