Re: Issue with server reboot / shutdown

On Wed, Nov 12, 2014 at 10:03:36AM +0100, Florian Knorn wrote:
> Dear Niels,
> 
> Many thanks for your quick reply.
> 
> Indeed, after running “service glusterfs-server stop” (which is
> essentially one of the first executed on shutdown), if I use “lsof” on
> the brick directory I get many hits for “glusterfs”, and the
> umountiscsi script (which essentially runs “umount -a -O _netdev”) in
> fact does not manage to unmount that directory, and subsequently the
> overall filesystem unmount fails.
> 
> So how could I get these remaining glusterfs processes stopped? Why
> are they still there?

The glusterfsd processes should get stopped by the glusterfs-server
init-script. I do not know if Debian has its own script, or if it takes
the one from the upstream Gluster repository.
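
To check what is still running after "service glusterfs-server stop",
something along these lines may help. This is only a minimal sketch: the
function names filter_gluster and list_gluster_procs are made up for
illustration, and it assumes a procps-style "ps".

```shell
#!/bin/sh
# Sketch: list Gluster processes that survived "service glusterfs-server stop".
# The function names are hypothetical helpers, not part of Gluster.

# filter_gluster: read "PID COMMAND" lines on stdin and keep only those
# whose command name is exactly glusterd, glusterfs or glusterfsd.
filter_gluster() {
    awk '$2 ~ /^gluster(d|fs|fsd)$/ { print $1, $2 }'
}

# list_gluster_procs: print PID and name of any remaining Gluster processes.
list_gluster_procs() {
    ps -eo pid=,comm= | filter_gluster
}
```

If list_gluster_procs still prints glusterfsd entries after the service
has been stopped, those are the brick processes that keep the brick
directory busy, which would match what lsof shows.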

I've documented issues with stopping/restarting the brick processes a
while back in a blog post:
- http://blog.nixpanic.net/2013/12/gluster-and-not-restarting-brick.html

Maybe this helps in understanding some of the common problems, and gives
you an idea how this can be made to work on Debian.
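
As a rough sketch of the shutdown order from my earlier mail (quoted
below): stop the Gluster processes, unmount the bricks, stop iSCSI, stop
the network. The service names "open-iscsi" and "networking" are
assumptions about a Debian system, the function names are made up, and
/path/to/brick is a placeholder. Setting DRY_RUN=1 only prints the
commands, so the order can be checked without touching anything.

```shell
#!/bin/sh
# Sketch of a shutdown sequence, not a tested init script.
# Assumed (not confirmed in this thread): service names open-iscsi and
# networking. The helper daemons (self-heal, quota, ...) are not handled.

# run CMD...: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# shutdown_sequence BRICK: stop Gluster, then unmount BRICK, then stop
# iSCSI and finally the network.
shutdown_sequence() {
    brick="$1"
    run service glusterfs-server stop   # management daemon
    run pkill -x glusterfsd             # brick processes (sends SIGTERM)
    run umount "$brick"                 # needs the brick processes gone
    run service open-iscsi stop         # assumed service name
    run service networking stop         # assumed service name
}
```

For example, "DRY_RUN=1 shutdown_sequence /path/to/brick" prints the five
commands in order. When running it for real, the brick processes may need
a moment to exit after the SIGTERM before the umount can succeed.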

I'll add the gluster-users@ list back on CC. Other Debian users would be
interested in this too, I expect.

Niels

> 
> On Wed, Nov 12, 2014 at 9:56 AM, Niels de Vos <ndevos@xxxxxxxxxx> wrote:
> > On Wed, Nov 12, 2014 at 09:07:46AM +0100, Florian Knorn wrote:
> >> Hi,
> >>
> >> I have an issue where I can’t reboot or shutdown my server with
> >> gluster running. The setup:
> >>
> >> Debian 7.7, Gluster - 3.5.2, 1 volume with 1 brick mounted via iSCSI,
> >> using multipath-tools.
> >>
> >> On shutdown / reboot, the system hangs at “Unmounting local
> >> filesystems”, see this screenshot:
> >>
> >> https://www.dropbox.com/s/7g22330382mvkm6/shutdown_issue.jpg?dl=0
> >>
> >> Testing around, I noticed that if I issue “service glusterfs-server
> >> stop; umount /path/to/brick” it says the path is still in use.
> >> However, if I use “gluster volume stop THEVOLUME; service
> >> glusterfs-server stop; umount /path/to/brick” then it works.
> >>
> >> Similarly, IF prior to shutdown / reboot I manually stop the volume
> >> first, then it all goes through.
> >>
> >> So it seems to me that even though it is stopped, the gluster server
> >> still has some files on the brick open, which prevents the unmount,
> >> and locks up the system on reboot.
> >
> > I think that "service glusterfs-server stop" does not stop the brick
> > processes (glusterfsd). These are stopped when doing a "gluster volume
> > stop ...", but that should not be needed for a reboot/shutdown. Stopping
> > the volume will also take down the brick processes on the other storage
> > servers.
> >
> > I do not know which service scripts Debian uses, but there should be an
> > option that gracefully kills the brick processes on shutdown.
> >
> >> Any pointers? Or has this to do with multipath, because I believe the
> >> issue started after using that?
> >
> > It could also be related to the fact that you store the bricks on an
> > iSCSI disk. It is common for distributions to wait until unmounting has
> > finished. There may be some processes that are flushing their data, and
> > you do not want them to abort writing out their data. However, if the
> > iscsi service or the network has been stopped already, it might not be
> > possible to have the processes write out their data. It is tricky to get
> > the shutdown procedure right, but something in this order should work:
> >
> >    1. stop glusterd, self-heal, quota, .. and glusterfsd processes
> >    2. unmount bricks
> >    3. stop iscsi
> >    4. stop network
> >
> > Maybe with these details you can identify where things go wrong? Please
> > keep us informed about the results you get.
> >
> > Thanks,
> > Niels
