Re: Async Messenger RDMA IB ib_uverbs_write return EACCES

On Thu, May 2, 2019 at 9:26 PM Sage Weil <sage@xxxxxxxxxxxx> wrote:
>
> On Tue, 30 Apr 2019, Roman Penyaev wrote:
> > On 2019-04-19 16:31, Liu, Changcheng wrote:
> > > Hi Roman,
> > > >   I found out why ceph/msg/async/rdma/iwarp (x722) doesn't work on
> > > > the ceph master branch.
> > > >   The problem is triggered by the commit below:
> > >
> > > https://github.com/ceph/ceph/pull/20172/commits/fdde016301ae329f76c621337c384ac60aa0d210
> > >
> > >   Below is the basic program model extracted from
> > > ceph/msg/async/rdma/iwarp to show how the problem is triggered:
> >
> > Hi Changcheng,
> >
> > > Indeed, fork() also changes credentials (see copy_creds() in the kernel
> > > for details), just as setuid() does, so there are two known places in
> > > ceph after which uverbs calls return -EACCES:
> >
> >   o setuid() (see global_init())
> >   o daemon() (see global_init_daemonize())
> >
> > > My question is: why do you daemonize your ceph services instead of
> > > relying on systemd, which does the fork() on its own and runs each
> > > service with the '-f' flag (meaning "do not daemonize")?  Not
> > > daemonizing the services at all would be a simple solution.
>
> The daemonize behavior predates systemd (and upstart).  At this point it
> is only there for legacy reasons and to avoid breaking things for the
> Devuan sysvinit hold-outs.  (And vstart still daemonizes.)  We could
> probably get away with ripping it out...

In crimson-osd, we are also struggling with the daemonize feature.
Perhaps we could have a command like http://www.libslack.org/daemon/
that reads the settings and daemonizes on behalf of the
ceph-{osd,mgr,mon,mds} and radosgw daemons when necessary, without
breaking sysvinit systems and vstart.sh.
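
A minimal sketch of what such a wrapper could look like, assuming the
target daemon accepts '-f' to stay in the foreground (as mentioned
earlier in this thread). The names foreground_argv() and
daemonize_and_exec() are hypothetical and not part of Ceph or libslack;
settings parsing, pid files and stdio redirection are left out.

```cpp
#include <string>
#include <vector>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: copy the command line and make sure the target
// is told to stay in the foreground ('-f'), since the wrapper has
// already detached on its behalf.
std::vector<std::string> foreground_argv(const std::vector<std::string>& cmd) {
  std::vector<std::string> out(cmd);
  bool has_f = false;
  for (size_t i = 1; i < out.size(); ++i) {
    if (out[i] == "-f") {
      has_f = true;
    }
  }
  if (!has_f) {
    out.push_back("-f");
  }
  return out;
}

// Hypothetical helper: fork, start a new session in the child, then
// exec the target.  The parent returns 0 right after a successful
// fork; the child only returns from here by calling _exit() on failure.
int daemonize_and_exec(const std::vector<std::string>& cmd) {
  if (cmd.empty()) {
    return -1;
  }
  pid_t pid = fork();
  if (pid < 0) {
    return -1;            // fork failed
  }
  if (pid > 0) {
    return 0;             // parent: nothing more to do
  }
  if (setsid() < 0) {     // child: detach from the controlling terminal
    _exit(127);
  }
  std::vector<std::string> args = foreground_argv(cmd);
  std::vector<char*> argv;
  for (auto& a : args) {
    argv.push_back(const_cast<char*>(a.c_str()));
  }
  argv.push_back(nullptr);
  execvp(argv[0], argv.data());
  _exit(127);             // only reached if exec failed
}
```

The idea is that all forking happens in the wrapper, before the daemon
opens any device, so the daemon itself never has to fork.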

>
> sage
>
> >
> > > With setuid() it is not that easy.  The most straightforward way is to
> > > move the mc_bootstrap.get_monmap_and_config() call after the setuid()
> > > call.  At the bottom of this email there is a small patch which can fix
> > > the problem (and, I hope, does not introduce anything new).  It would be
> > > great if you could check it.
> >
> > --
> > Roman
> >
> >
> > diff --git a/src/global/global_init.cc b/src/global/global_init.cc
> > index eb8bbfd1a4db..de647be768bd 100644
> > --- a/src/global/global_init.cc
> > +++ b/src/global/global_init.cc
> > @@ -147,18 +147,6 @@ void global_pre_init(
> >      cct->_log->start();
> >    }
> >
> > -  if (!conf->no_mon_config) {
> > -    // make sure our mini-session gets legacy values
> > -    conf.apply_changes(nullptr);
> > -
> > -    MonClient mc_bootstrap(g_ceph_context);
> > -    if (mc_bootstrap.get_monmap_and_config() < 0) {
> > -      cct->_log->flush();
> > -      cerr << "failed to fetch mon config (--no-mon-config to skip)"
> > -          << std::endl;
> > -      _exit(1);
> > -    }
> > -  }
> >    if (!cct->_log->is_started()) {
> >      cct->_log->start();
> >    }
> > @@ -313,6 +301,28 @@ global_init(const std::map<std::string,std::string>
> > *defaults,
> >    }
> >  #endif
> >
> > +  //
> > +  // Utterly important to run first network connection after setuid().
> > > +  // In case of rdma transport, the uverbs kernel module starts returning
> > > +  // -EACCES on each operation if credentials have been changed, see
> > > +  // callers of ib_safe_file_access() for details.
> > +  //
> > +  // fork() syscall also matters, so daemonization won't work in case
> > +  // of rdma.
> > +  //
> > +  if (!g_conf()->no_mon_config) {
> > +    // make sure our mini-session gets legacy values
> > +    g_conf().apply_changes(nullptr);
> > +
> > +    MonClient mc_bootstrap(g_ceph_context);
> > +    if (mc_bootstrap.get_monmap_and_config() < 0) {
> > +      g_ceph_context->_log->flush();
> > +      cerr << "failed to fetch mon config (--no-mon-config to skip)"
> > +          << std::endl;
> > +      _exit(1);
> > +    }
> > +  }
> > +
> >
> >
> >
> >



-- 
Regards
Kefu Chai


