On Thu, May 2, 2019 at 9:26 PM Sage Weil <sage@xxxxxxxxxxxx> wrote:
>
> On Tue, 30 Apr 2019, Roman Penyaev wrote:
> > On 2019-04-19 16:31, Liu, Changcheng wrote:
> > > Hi Roman,
> > >   I found why ceph/msg/async/rdma/iwarp(x722) doesn't work on the
> > > ceph master branch.
> > >   The problem is triggered by the commit below:
> > >
> > > https://github.com/ceph/ceph/pull/20172/commits/fdde016301ae329f76c621337c384ac60aa0d210
> > >
> > >   Below is the basic program model extracted from
> > > ceph/msg/async/rdma/iwarp to show how the problem is triggered:
> >
> > Hi Changcheng,
> >
> > Indeed fork() also changes credentials (see copy_creds() in the kernel
> > for details), like setuid() does, so there are two known places in ceph
> > after which uverbs calls return -EACCES:
> >
> >   o setuid() (see global_init())
> >   o daemon() (see global_init_daemonize())
> >
> > My question is: why do you daemonize your ceph services instead of
> > relying on systemd, which does fork() on its own and runs each service
> > with the '-f' flag, which means "do not daemonize"?  So I would not
> > daemonize services, and this can be a simple solution.
>
> The daemonize behavior predates systemd (and upstart).  At this point it
> is only there for legacy reasons and to avoid breaking things for the
> Devuan sysvinit hold-outs.  (And vstart still daemonizes.)  We could
> probably get away with ripping it out...

In crimson-osd, we are also struggling with the daemonize feature.
Perhaps we could have a command like http://www.libslack.org/daemon/
that reads the settings and daemonizes on behalf of the
ceph-{osd,mgr,mon,mds} and radosgw daemons when necessary, without
breaking sysvinit systems and vstart.sh.

>
> sage
>
> >
> > With setuid() it is not that easy.  The most straightforward way is to
> > move mc_bootstrap.get_monmap_and_config() after the setuid() call.  At
> > the bottom of the email there is a small patch which can fix the problem
> > (I hope it does not introduce anything new).
> > Would be great if you can check it.
> >
> > --
> > Roman
> >
> >
> > diff --git a/src/global/global_init.cc b/src/global/global_init.cc
> > index eb8bbfd1a4db..de647be768bd 100644
> > --- a/src/global/global_init.cc
> > +++ b/src/global/global_init.cc
> > @@ -147,18 +147,6 @@ void global_pre_init(
> >      cct->_log->start();
> >    }
> >
> > -  if (!conf->no_mon_config) {
> > -    // make sure our mini-session gets legacy values
> > -    conf.apply_changes(nullptr);
> > -
> > -    MonClient mc_bootstrap(g_ceph_context);
> > -    if (mc_bootstrap.get_monmap_and_config() < 0) {
> > -      cct->_log->flush();
> > -      cerr << "failed to fetch mon config (--no-mon-config to skip)"
> > -           << std::endl;
> > -      _exit(1);
> > -    }
> > -  }
> >    if (!cct->_log->is_started()) {
> >      cct->_log->start();
> >    }
> > @@ -313,6 +301,28 @@ global_init(const std::map<std::string,std::string> *defaults,
> >    }
> >  #endif
> >
> > +  //
> > +  // Utterly important to run first network connection after setuid().
> > +  // In case of rdma transport uverbs kernel module starts returning
> > +  // -EACCESS on each operation if credentials has been changed, see
> > +  // callers of ib_safe_file_access() for details.
> > +  //
> > +  // fork() syscall also matters, so daemonization won't work in case
> > +  // of rdma.
> > +  //
> > +  if (!g_conf()->no_mon_config) {
> > +    // make sure our mini-session gets legacy values
> > +    g_conf().apply_changes(nullptr);
> > +
> > +    MonClient mc_bootstrap(g_ceph_context);
> > +    if (mc_bootstrap.get_monmap_and_config() < 0) {
> > +      g_ceph_context->_log->flush();
> > +      cerr << "failed to fetch mon config (--no-mon-config to skip)"
> > +           << std::endl;
> > +      _exit(1);
> > +    }
> > +  }
> > +
> >
> >
> >

-- 
Regards
Kefu Chai
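
To make the external-launcher idea a bit more concrete, here is a rough sketch of what such a helper might look like. It is only an illustration, not existing Ceph code: the names foreground_argv() and spawn_detached() are hypothetical, and the only thing taken from the thread is that '-f' means "do not daemonize". The helper does the fork()/setsid() dance itself and then exec()s the daemon in foreground mode, so the daemon never forks after it has opened any device.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: return a copy of args that asks the daemon to stay in
// the foreground. "-f" is inserted right after the program name unless it is
// already present.
std::vector<std::string> foreground_argv(std::vector<std::string> args) {
  if (args.empty()) return args;
  const std::string flag = "-f";
  if (std::find(args.begin() + 1, args.end(), flag) == args.end()) {
    args.insert(args.begin() + 1, flag);
  }
  return args;
}

// Hypothetical launcher: detach with the classic double fork, then exec the
// daemon in foreground mode. Returns 0 once the intermediate child has exited
// cleanly and -1 on failure. An exec failure in the grandchild is not
// reported back to the caller in this sketch.
int spawn_detached(const std::vector<std::string>& args) {
  if (args.empty()) return -1;

  // Build argv before fork() so the children do not need to allocate.
  std::vector<std::string> full = foreground_argv(args);
  std::vector<char*> argv;
  for (auto& s : full) argv.push_back(&s[0]);
  argv.push_back(nullptr);

  pid_t pid = fork();
  if (pid < 0) return -1;
  if (pid == 0) {
    // Intermediate child: start a new session, then fork again so the
    // daemon is not a session leader and cannot reacquire a terminal.
    if (setsid() < 0) _exit(1);
    pid_t grandchild = fork();
    if (grandchild < 0) _exit(1);
    if (grandchild > 0) _exit(0);

    // Grandchild: point stdio at /dev/null and become the daemon.
    int fd = open("/dev/null", O_RDWR);
    if (fd >= 0) {
      dup2(fd, STDIN_FILENO);
      dup2(fd, STDOUT_FILENO);
      dup2(fd, STDERR_FILENO);
      if (fd > STDERR_FILENO) close(fd);
    }
    execvp(argv[0], argv.data());
    _exit(127);  // only reached if exec failed
  }

  // Parent: reap the intermediate child.
  int status = 0;
  if (waitpid(pid, &status, 0) < 0) return -1;
  return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

A sysvinit script or vstart.sh could then call something like spawn_detached({"ceph-osd", "-i", "0"}) instead of relying on the daemon's own daemonize path. This only addresses the fork() half of the problem; the setuid() ordering would still need a change along the lines of Roman's patch.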