Did you try an "umount -l" (lazy umount)? It should just disconnect the
filesystem. As I've experienced with other network filesystems like NFS or
Gluster, you may always run into difficulties with any of them, so "-l"
helps me. Not sure about Ceph, though. (A rough sketch of the sequence is
appended at the end of this message.)

On Friday 23 July 2010, Sébastien Paolacci wrote:
> Hello Sage,
>
> I would like to emphasize that this issue is somewhat annoying, even for
> experimentation purposes: I fully expect my test server to misbehave,
> crash, burn or whatever, but a client-side impact deep enough to need a
> (hard) reboot to resolve a hung ceph really prevents me from testing
> with real-life payloads.
>
> I understand that it's not an easy point, but a lot of my colleagues are
> not really willing to sacrifice even their dev workstations to play
> during their spare time... sad world ;)
>
> Sebastien
>
> On Wed, 16 Jun 2010, Peter Niemayer wrote:
> > Hi,
> >
> > Trying to "umount" a formerly mounted ceph filesystem that has become
> > unavailable (osd crashed, then mds/mon were shut down using
> > /etc/init.d/ceph stop) results in "umount" hanging forever in "D" state.
> >
> > Strangely, "umount -f" started from another terminal reports the ceph
> > filesystem as not being mounted anymore, which is consistent with what
> > the mount table says.
> >
> > The kernel keeps emitting the following messages from time to time:
> > > Jun 16 17:25:29 gitega kernel: ceph: tid 211912 timed out on osd0, will reset osd
> > > Jun 16 17:25:35 gitega kernel: ceph: mon0 10.166.166.1:6789 connection failed
> > > Jun 16 17:26:15 gitega last message repeated 4 times
> >
> > I would have expected the "umount" to terminate at least after some
> > generous timeout.
> >
> > Ceph should probably support something like the "soft,intr" options of
> > NFS, because if the only supported way of mounting is one where a
> > client is more or less stuck-until-reboot when the service fails, many
> > potential test configurations involving Ceph are way too dangerous to
> > try...
>
> Yeah, being able to force it to shut down when the servers are
> unresponsive is definitely the intent. 'umount -f' should work. It
> sounds like the problem is related to the initial 'umount' (which
> doesn't time out) followed by 'umount -f'.
>
> I'm hesitant to add a blanket umount timeout, as that could prevent
> proper writeout of cached data/metadata in some cases. So I think the
> goal should be that if a normal umount hangs for some reason, you should
> be able to intervene and add the 'force' if things don't go well.
>
> sage
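
For what it's worth, here is a minimal sketch of the unmount sequence
discussed in this thread. The mount point /mnt/ceph is only an assumed
example, and whether the lazy or the forced variant actually releases a
stuck CephFS client depends on the kernel-client behaviour described above:

    # Assumed example mount point; substitute your own.
    MNT=/mnt/ceph

    # A plain umount can hang forever in "D" state if the osd/mds/mon
    # daemons have gone away (terminal 1):
    umount "$MNT"

    # From another terminal, either detach the mount lazily, so it
    # disappears from the namespace now and is cleaned up once it is
    # no longer busy:
    umount -l "$MNT"

    # ... or force the unmount, which is intended for unreachable
    # network filesystems:
    umount -f "$MNT"

Both "-l" and "-f" are standard util-linux umount options; nothing in the
sketch is Ceph-specific.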