Re: GF_PARENT_DOWN on SIGKILL

http://review.gluster.org/14980 has all the context for why I sent out this mail. Basically, the tests were failing because umount races with the version-update xattrop. While I fixed the test to handle that race, Xavi was wondering why the GF_PARENT_DOWN event never arrived. I found that cleanup_and_exit() does not send this event; it only calls fini(). Does anyone know why this is so?
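To make the question concrete, here is a toy model of the two shutdown orderings. It is not GlusterFS code: `ToyXlator`, `shutdown()` and the `PARENT_DOWN` constant are hypothetical names made up for illustration, and the assumption that a translator completes its delayed work when it sees the parent-down event is taken from the ec example mentioned in this thread.

```python
PARENT_DOWN = "PARENT_DOWN"  # hypothetical stand-in for the GF_PARENT_DOWN event


class ToyXlator:
    """Toy translator holding delayed work; not the real xlator structure."""

    def __init__(self, name, delayed_ops):
        self.name = name
        self.delayed_ops = list(delayed_ops)
        self.completed = []
        self.lost = []

    def notify(self, event):
        # On parent-down, finish any delayed operations while it is still safe.
        if event == PARENT_DOWN:
            self.completed.extend(self.delayed_ops)
            self.delayed_ops.clear()

    def fini(self):
        # Teardown: whatever is still delayed at this point is dropped.
        self.lost.extend(self.delayed_ops)
        self.delayed_ops.clear()


def shutdown(xlators, send_parent_down):
    """Model of a SIGTERM exit path, with or without the parent-down step."""
    if send_parent_down:
        for xl in xlators:
            xl.notify(PARENT_DOWN)
    for xl in xlators:
        xl.fini()


if __name__ == "__main__":
    for flag in (False, True):
        ec = ToyXlator("ec", ["version-update xattrop"])
        shutdown([ec], send_parent_down=flag)
        print(flag, "completed:", ec.completed, "lost:", ec.lost)
```

In this model the fini()-only path drops the delayed operation, while sending the event first lets the translator complete it. Whether the real exit path can safely do the same is exactly the open question.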

On Fri, Jul 22, 2016 at 6:37 PM, Pranith Kumar Karampuri <pkarampu@xxxxxxxxxx> wrote:
It only calls fini(); apart from that it does not do much.

On Fri, Jul 22, 2016 at 6:36 PM, Pranith Kumar Karampuri <pkarampu@xxxxxxxxxx> wrote:
Gah! Sorry, sorry, I meant to write SIGTERM in that mail, not SIGKILL. So Xavi and I were wondering why cleanup_and_exit() does not send the GF_PARENT_DOWN event.

On Fri, Jul 22, 2016 at 6:24 PM, Jeff Darcy <jdarcy@xxxxxxxxxx> wrote:
> Does anyone know why GF_PARENT_DOWN is not triggered on SIGKILL? It will give
> a chance for xlators to do any cleanup they need to do. For example ec can
> complete the delayed xattrops.

Nothing is triggered on SIGKILL.  SIGKILL is explicitly defined to terminate a
process *immediately*.  Among other things, this means it can not be ignored or
caught, to preclude handlers doing something that might delay termination.

http://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_04
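A quick way to see the difference between the two signals is to ask the OS to install a handler for each. This is a minimal sketch, assuming a POSIX system and that it runs in the main thread; `can_install_handler` is a helper name invented here, not anything from GlusterFS.

```python
import signal


def can_install_handler(signum):
    """Return True if the OS lets this process install a handler for signum."""

    def handler(received_signum, frame):
        pass  # placeholder; a real handler would start cleanup here

    try:
        previous = signal.signal(signum, handler)
    except (OSError, ValueError):
        # The kernel rejects handlers for SIGKILL and SIGSTOP.
        return False
    # Put back whatever was installed before, if Python knows what it was.
    if previous is not None:
        signal.signal(signum, previous)
    return True


if __name__ == "__main__":
    for name in ("SIGTERM", "SIGKILL"):
        signum = getattr(signal, name)
        print(name, "catchable:", can_install_handler(signum))
```

SIGTERM accepts a handler, which is why an exit path such as cleanup_and_exit() can run at all; SIGKILL does not, so no userspace code gets a chance to run.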

Since at least 4.2BSD and SVr2 (the first version of UNIX that I worked on)
there have even been distinct kernel code paths to ensure special handling of
SIGKILL.  There's nothing we can do about SIGKILL except be prepared to deal
with it the same way we'd deal with the entire machine crashing.

If you mean why is there nothing we can do on a *server* in response to
SIGKILL on a *client*, that's a slightly more interesting question.  It's
possible that the unique nature of SIGKILL puts connections into a
different state than either system failure (on the more abrupt side) or
clean shutdown (less abrupt).  If so, we probably need to take a look at
the socket/RPC code or perhaps even protocol/server to see why these
connections are not being cleaned up and shut down in a timely fashion.
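As a baseline for that investigation, the sketch below shows what the kernel itself does with a killed process's socket. It assumes a POSIX system and uses a local socket pair rather than TCP, so it says nothing about the GlusterFS socket/RPC layer or about how a remote server learns of the close; `peer_sees_eof_after_sigkill` is a name invented for this example.

```python
import os
import signal
import socket


def peer_sees_eof_after_sigkill():
    """Kill a child holding one end of a socket pair; check the other end sees EOF."""
    parent_end, child_end = socket.socketpair()
    pid = os.fork()
    if pid == 0:
        # Child plays the "client": keeps its socket open and never exits on its own.
        parent_end.close()
        while True:
            signal.pause()

    # Parent plays the "server": drop its copy of the child's end, then kill the child.
    child_end.close()
    os.kill(pid, signal.SIGKILL)
    _, status = os.waitpid(pid, 0)
    killed = os.WIFSIGNALED(status) and os.WTERMSIG(status) == signal.SIGKILL

    # The child ran no cleanup code, yet the kernel closed its descriptors,
    # so a read on the surviving end returns end-of-file (empty bytes).
    parent_end.settimeout(5.0)
    data = parent_end.recv(1)
    parent_end.close()
    return killed and data == b""


if __name__ == "__main__":
    print("peer saw EOF:", peer_sees_eof_after_sigkill())
```

If the local case behaves this way but connections from a killed client still linger on the server, that would point toward the layers above the kernel, as suggested above.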



--
Pranith

_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-devel