Re: System reboot hangs when umounting filesystem on rbd

Sage Weil <sage@xxxxxxxxxxx> · Sun, 8 Sep 2013 21:01:56 -0700 (PDT)

On Mon, 9 Sep 2013, Da Chun Ng wrote:
> CentOS 6.4
> Kernel 3.10.6
> Ceph 0.61.8
> 
> My ceph cluster is deployed on three nodes.
> One rbd image was created, mapped to one of the three nodes, formatted with
> ext4, and mounted.
> When rebooting this node, it hung unmounting the file system on the rbd.
> 
> My guess about the root cause:
> When the system is shutting down, the services are stopped first, then the
> mounted file systems are unmounted. For the file system on rbd, if there are
> dirty pages, a flush will happen, but the ceph services have already been
> shut down, so it will hang.
> 
> Am I right? How to work around this?

Sounds right.  You need to get the rbd volumes unmounted before shutting 
down the ceph services.  The rbdmap file may do the trick if you set the 
sysvinit priority right.
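
As a rough illustration of the "unmount rbd first" step, here is a minimal
Python sketch of a helper that a shutdown script could run before the ceph
services stop. It is not part of ceph or of rbdmap; the function names
rbd_mount_points and unmount_all_rbd are made up for this example, and it
assumes mapped images show up as /dev/rbd* devices in /proc/mounts.

```python
import subprocess


def rbd_mount_points(mounts_text):
    """Return mount points backed by /dev/rbd* devices, deepest path first."""
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts format: device mountpoint fstype options dump pass
        # (spaces in paths are escaped as \040; not decoded in this sketch)
        if len(fields) >= 2 and fields[0].startswith("/dev/rbd"):
            points.append(fields[1])
    # Nested mounts must be unmounted before their parents.
    return sorted(points, key=lambda p: p.count("/"), reverse=True)


def unmount_all_rbd(mounts_path="/proc/mounts"):
    """Unmount every rbd-backed file system; raises if any umount fails."""
    with open(mounts_path) as f:
        text = f.read()
    for point in rbd_mount_points(text):
        subprocess.check_call(["umount", point])
```

Calling unmount_all_rbd() from an init script only helps if that script is
ordered to stop before the ceph init script does, which is the priority
question mentioned above.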

This is all going to be very fragile, however:

 - running kernel client and ceph servers on the same node is vulnerable 
to writeback deadlock (as it is with cifs, nfs, and other network file 
systems)

 - even if you shut down the services on the current node in the right 
order, that doesn't mean the other nodes in the cluster will too; you 
really need all clients to shut down, and then all servers.
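
To make the cluster-wide ordering concrete, here is a small Python sketch
that turns a description of the nodes into a two-phase plan: unmount on
every client first, then stop the servers. The role names "client" and
"server", the phase labels, and the shutdown_order function are assumptions
made for this example, not anything ceph provides.

```python
def shutdown_order(nodes):
    """Build a two-phase shutdown plan.

    nodes maps a host name to the set of roles it has
    ("client" for a mounted rbd, "server" for ceph daemons).
    A host with both roles appears in both phases.
    """
    clients = sorted(h for h, roles in nodes.items() if "client" in roles)
    servers = sorted(h for h, roles in nodes.items() if "server" in roles)
    # Phase 1 must finish on every client before phase 2 starts anywhere.
    return [("unmount_clients", clients), ("stop_servers", servers)]


# Example: three server nodes, one of which also mounts an rbd image.
plan = shutdown_order({
    "node1": {"client", "server"},
    "node2": {"server"},
    "node3": {"server"},
})
```

The plan only describes the order; something outside each node still has
to enforce that no server stops until every client has unmounted.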

sage
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com