Re: Unique bug in ceph start script

On Thu, 25 Apr 2013, Andreas Friedrich wrote:
> On Wed, Apr 24, 2013 at 11:04:56 -0700, Sage Weil wrote:
> 
> Hello Sage,
> 
> ...
> > > If the start script exits with or without error, the remote config 
> > > file will be removed from <remote-host>:/tmp - but only for the last 
> > > host (the former trap calls will be overwritten by the last trap
> > > call)! This makes no sense to me.
> > 
> > Bah.. I was thinking they would stack.  How about
> > 
> > diff --git a/src/init-ceph.in b/src/init-ceph.in
> > index 61c10e1..195bf76 100644
> > --- a/src/init-ceph.in
> > +++ b/src/init-ceph.in
> > @@ -218,8 +218,11 @@ for name in $what; do
> >      else
> >         unique=`dd if=/dev/urandom bs=16 count=1 2>/dev/null | md5sum | awk '{print $1}'`
> >         scp -q $conf $host:/tmp/ceph.conf.$unique
> > -       trap "ssh $host rm /tmp/ceph.conf.$unique" EXIT
> >         cur_conf="/tmp/ceph.conf.$unique"
> > +
> > +       # clean up on exit
> > +       remote_configs="$host:/tmp/ceph.conf.$unique $remote_configs"
> > +       trap 'for f in '$remote_configs' ; do ssh ${f%:*} rm ${f#*:} ; done' EXIT
> >      fi
> >      cmd="$cmd -c $cur_conf"
> 
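
For reference, a minimal sketch of the same cleanup idea, assuming the
handler is a shell function rather than a string rebuilt on every loop
iteration. A later trap for the same signal replaces the earlier one, so
the sketch installs a single EXIT trap and lets the function read the
accumulated list when the script exits. The names register_remote_config,
cleanup_remote_configs and the SSH override variable are hypothetical and
are not part of init-ceph.in.

```shell
#!/bin/sh
# Sketch only: SSH, register_remote_config and cleanup_remote_configs are
# hypothetical names introduced for illustration.
SSH=${SSH:-ssh}        # overridable, e.g. SSH=echo for a dry run
remote_configs=""      # space-separated list of host:/path entries

register_remote_config() {
    # $1 = host, $2 = path of the temporary config on that host
    remote_configs="$1:$2 $remote_configs"
}

cleanup_remote_configs() {
    # Split each entry at the first colon into host and path,
    # then remove the temporary file on that host.
    for f in $remote_configs; do
        $SSH "${f%%:*}" rm -f "${f#*:}"
    done
}

# Installed once; the list is evaluated at exit time, so every host
# registered during the loop is cleaned up, not only the last one.
trap cleanup_remote_configs EXIT
```

Inside the loop, the script would call something like
register_remote_config "$host" "/tmp/ceph.conf.$unique" right after the
scp. Running with SSH=echo prints the commands instead of executing them.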
> yes, this looks fine but I have a question:
> 
> I don't know the background for copying the config files to
> <remote-host>:/tmp. Why don't all the daemons use their local config
> /etc/ceph/ceph.conf?
> 
> Maybe you want to ensure that the remote daemons will use the same
> config as exists on the host from which the cluster was started. But
> if the configs on the remote hosts differ from the config of the
> "cluster-start-host", then we still might have a problem when a remote
> host is booting: In this case '/etc/init.d/ceph start' is called and
> the remote host will use its local (different) config.
> 
> I think a clear strategy would be:
> - All daemons always use their local config (no copying to /tmp).
> - If a cluster-start is initiated (/etc/init.d/ceph -a start) the
>   script first checks if the configs on all cluster hosts are equal
>   (maybe we need a content check here - not a literal check).
> - If the configs are not equal, the cluster-start command aborts with
>   an error message.
> 
> What do you think about that?

That makes a lot more sense, although I would suggest a warning instead of 
an error.  For cuttlefish, I'm hesitant to change behavior in the 11th 
hour, but we should make this change going forward.
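
A rough sketch of what such a check could look like, assuming a content
comparison that ignores whole-line comments, blank lines and
leading/trailing whitespace, and that warns rather than aborts. The
helper names (normalize_conf, conf_digest, check_remote_conf) and the SSH
override variable are hypothetical; nothing like this exists in the init
script today.

```shell
#!/bin/sh
# Sketch only: all function names and the SSH variable are hypothetical.
SSH=${SSH:-ssh}

normalize_conf() {
    # Drop whole-line comments (# or ;), trim leading/trailing whitespace,
    # and drop empty lines so cosmetic differences are ignored.
    sed -e '/^[[:space:]]*[#;]/d' \
        -e 's/^[[:space:]]*//' \
        -e 's/[[:space:]]*$//' \
        -e '/^$/d'
}

conf_digest() {
    # Digest of the normalized config read from stdin.
    normalize_conf | md5sum | awk '{print $1}'
}

check_remote_conf() {
    # $1 = host, $2 = config path (assumed identical on both sides).
    # Prints a warning and returns 1 if the contents differ.
    host=$1
    conf=$2
    local_sum=$(conf_digest < "$conf")
    remote_sum=$($SSH "$host" cat "$conf" | conf_digest)
    if [ "$local_sum" != "$remote_sum" ]; then
        echo "WARNING: $conf on $host differs from the local copy" >&2
        return 1
    fi
    return 0
}
```

With -a, the script could call check_remote_conf "$host" "$conf" once per
remote host before starting any daemons. Spacing inside a line (for
example "fsid = 1" versus "fsid=1") still counts as a difference in this
sketch.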

Care to send a patch?  :)

sage