Re: Fork and RDMA operations

On 08/12/2016 12:27 PM, Sage Weil wrote:
It seems like it would be simpler to push the fork before any important
operations.  (And BTW with systemd and upstart we don't fork anyway; it's
just there for sysvinit.)  The preforker thing is there to make it easy to
fork early, but keep the parent waiting around so that you can do more
initialization, print errors, and terminate with an error code if something
(post-fork) goes wrong.  In theory, there's no reason why we couldn't make
this almost the very first thing the daemon does so that *all* work is
done in the child...

sage


I've recently run into issues related to fork as well (see my "memory leaks related to CephContext and global_init_daemonize()" email). Trying to manage resources across a fork is difficult and error-prone, so changing how we daemonize could eliminate a whole class of these bugs. And as Haomai points out, we're going to great lengths to make things work in the current model.

I'm a big fan of Sage's theory that all work could be done in the child process, and I'm willing to take on the project if we can reach a consensus on the design.

Whether or not we're interested in long-term support for SysV, the ability to daemonize is useful for our development workflow (vstart.sh in particular). To fill this role, some basic requirements for the parent process are:
* don't exit until initialization is finished
* return an error code if initialization failed

As Sage pointed out, this is exactly what Preforker is doing for ceph-mon. So we can start by changing the other daemons to use that instead of global_init_daemonize().
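
To make the two parent-side requirements concrete, here is a rough sketch of the fork-early pattern. It is not Ceph's Preforker; fork_then_init(), Role and ForkResult are hypothetical names made up for illustration, and a real daemon would also need setsid(), stdio redirection and so on. The child runs the initialization and reports its status over a pipe, and the parent blocks on that pipe before deciding its exit code.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdlib>
#include <functional>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical names for illustration only; this is not Ceph's Preforker.
enum class Role { Parent, Child };

struct ForkResult {
  Role role;   // which process is returning
  int status;  // init status: 0 on success, nonzero on failure
};

// Fork first, run `init` in the child, and keep the parent waiting until
// the child has reported whether initialization succeeded.
ForkResult fork_then_init(const std::function<int()>& init) {
  assert(init);  // an empty std::function would throw when called

  int fds[2];
  if (::pipe(fds) != 0) {
    return {Role::Parent, errno};
  }

  pid_t pid = ::fork();
  if (pid < 0) {
    int err = errno;
    ::close(fds[0]);
    ::close(fds[1]);
    return {Role::Parent, err};
  }

  if (pid == 0) {
    // Child: all real initialization happens here, after the fork.
    ::close(fds[0]);
    int status = init();
    ssize_t n = ::write(fds[1], &status, sizeof(status));
    ::close(fds[1]);
    if (status != 0 || n != static_cast<ssize_t>(sizeof(status))) {
      ::_exit(EXIT_FAILURE);
    }
    return {Role::Child, 0};  // caller goes on to run the daemon
  }

  // Parent: don't exit until initialization is finished.
  ::close(fds[1]);
  int status = 1;
  ssize_t n;
  do {
    n = ::read(fds[0], &status, sizeof(status));
  } while (n < 0 && errno == EINTR);
  ::close(fds[0]);

  if (n != static_cast<ssize_t>(sizeof(status))) {
    status = 1;  // child died before it could report anything
  }
  if (status != 0) {
    ::waitpid(pid, nullptr, 0);  // child exits on failure, so reap it
  }
  return {Role::Parent, status};
}
```

In main(), the parent would simply return the reported status as its exit code, while the child carries on into the daemon's main loop. Since nothing but the pipe exists before fork(), there are no threads or other resources to manage across it.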

The next step is to prevent global_init()/common_init()/CephContext from doing any work in the parent process (esp. spawning threads for Log, CephContextServiceThread, and AdminSocket). Decoupling the config parsing from CephContext initialization seems like a natural way to accomplish that. So the parent would create and initialize the md_config_t object, then after fork, the child would pass that as an argument when creating the CephContext.
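
A minimal sketch of that split, to show the shape I have in mind. Config and Context below are hypothetical stand-ins for md_config_t and CephContext, not their real interfaces, and parse_config() is an invented helper that only understands "--key=value". The point is that parsing is plain data manipulation with no threads, so it is safe in the parent, while the object that spawns threads is only constructed from the finished config (in the child, after fork, since fork() only carries the calling thread into the child).

```cpp
#include <cassert>
#include <map>
#include <string>
#include <thread>
#include <utility>

// Hypothetical stand-in for md_config_t: plain data, no threads.
struct Config {
  std::map<std::string, std::string> values;
};

// Parse "--key=value" arguments. Spawns nothing, so it can run in the
// parent before fork().
Config parse_config(int argc, const char* const* argv) {
  assert(argv != nullptr);
  Config conf;
  for (int i = 1; i < argc; ++i) {
    std::string arg = argv[i];
    auto eq = arg.find('=');
    if (arg.rfind("--", 0) == 0 && eq != std::string::npos) {
      conf.values[arg.substr(2, eq - 2)] = arg.substr(eq + 1);
    }
  }
  return conf;
}

// Hypothetical stand-in for CephContext: takes an already-parsed config
// and only then starts its background thread. Meant to be constructed in
// the child, after fork().
class Context {
 public:
  explicit Context(Config conf)
      : conf_(std::move(conf)),
        service_([] { /* placeholder for Log, AdminSocket, etc. */ }) {}

  ~Context() { service_.join(); }

  Context(const Context&) = delete;
  Context& operator=(const Context&) = delete;

  const Config& conf() const { return conf_; }

 private:
  Config conf_;
  std::thread service_;  // declared last so conf_ is ready before it starts
};
```

The parent would call parse_config() and report any bad options right away; the child then does something like Context cct(std::move(conf)) as its first step after fork.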

How does that sound for a start?

Casey