On Mon, 28 Jul 2014, Joao Eduardo Luis wrote:
> On 07/28/2014 02:07 PM, Robert Fantini wrote:
> > Is the '15 minutes or so' something that can be configured at run time?
> 
> Someone who knows this better than I do should probably chime in, but from
> a quick look throughout the code it seems to be 'client_mount_interval',
> which by default is 300 seconds (5 minutes) instead of 15 minutes.
> 
> As with all (or most?) options, this can be adjusted at run time via
> injectargs (via 'ceph tell') or 'config set' (via the admin socket).
> 
> Please bear in mind that just because you can adjust it doesn't mean that
> you should.  Keeping existing connections alive should not be a problem,
> but given I haven't given much thought to it there's a chance that I'm
> missing something.

I think connected clients will continue to function much longer than
client_mount_interval... it should be as long as auth_service_ticket_ttl
(default is 1h), or somewhere between 1x and 2x that interval, when cephx
is in use.

The real limitation is that if the mons lose quorum you can't have new
clients authenticate, and there won't be any cluster state changes (e.g.,
an OSD can't go down or come up).  A few other random operations will also
fail (snap creation, 'df', etc.).

sage

> -Joao
> 
> > On Mon, Jul 28, 2014 at 8:44 AM, Joao Eduardo Luis
> > <joao.luis at inktank.com> wrote:
> > 
> >     (CC'ing ceph-users)
> > 
> >     On 07/28/2014 12:34 PM, Marc wrote:
> > 
> >         Hi,
> > 
> >             This said, if out of 3 monitors you have 2 monitors down,
> >             your cluster will cease functioning (no admin commands, no
> >             writes or reads served).
> > 
> >         This is not entirely true. (At least) RBDs will continue being
> >         fully functional even if the mon quorum is lost. This only
> >         applies to RBDs that are already mounted (qemu) at the time of
> >         quorum loss though.
> > 
> >         Meaning: (K)VMs running off of Ceph will remain fully
> >         functional even if the mon quorum is lost (assuming you haven't
> >         lost too many OSDs at the same time).
> > 
> >     True.  Clients will maintain the connections they have to OSDs for
> >     about 15 minutes or so, at which point timeouts will go off and all
> >     work will be halted.  New clients won't be able to do this though,
> >     as they have to grab maps from the monitors prior to connecting to
> >     OSDs, and the monitor will not serve those requests if quorum is
> >     not in place.
> > 
> >     -Joao
> > 
> >         On 28/07/2014 12:22, Joao Eduardo Luis wrote:
> > 
> >             On 07/28/2014 08:49 AM, Christian Balzer wrote:
> > 
> >                 Hello,
> > 
> >                 On Sun, 27 Jul 2014 18:20:43 -0400 Robert Fantini wrote:
> > 
> >                     Hello Christian,
> > 
> >                     Let me supply more info and answer some questions.
> > 
> >                     * Our main concern is high availability, not speed.
> >                     Our storage requirements are not huge.  However we
> >                     want good keyboard response 99.99% of the time.  We
> >                     mostly do data entry and reporting: 20-25 users
> >                     doing mostly order and invoice processing and email.
> > 
> >                     * DRBD has been very reliable, but I am the SPOF.
> >                     Meaning that when split brain occurs [every 18-24
> >                     months] it is me or no one who knows what to do.
> >                     Try to explain how to deal with split brain in
> >                     advance....  For the future, ceph looks like it
> >                     will be easier to maintain.
> > 
> >                 The DRBD people would of course tell you to configure
> >                 things in a way that a split brain can't happen.
> >                 ^o^
> > 
> >                 Note that given the right circumstances (too many OSDs
> >                 down, MONs down) Ceph can wind up in a similar state.
> > 
> >             I am not sure what you mean by ceph winding up in a similar
> >             state.  If you mean 'split brain' in the usual sense of the
> >             term, it does not occur in Ceph.  If it does, you have
> >             surely found a bug and you should let us know with lots of
> >             CAPS.
> > 
> >             What you can incur, though, if you have too many monitors
> >             down, is cluster downtime.  The monitors will ensure you
> >             need a strict majority of monitors up in order to operate
> >             the cluster, and will not serve requests if said majority
> >             is not in place.  The monitors will only serve requests
> >             when there's a formed 'quorum', and a quorum is only formed
> >             by (N/2)+1 monitors, N being the total number of monitors
> >             in the cluster (via the monitor map -- monmap).
> > 
> >             This said, if out of 3 monitors you have 2 monitors down,
> >             your cluster will cease functioning (no admin commands, no
> >             writes or reads served).  As there is no configuration in
> >             which you can have two strict majorities, and thus no two
> >             partitions of the cluster are able to function at the same
> >             time, you do not incur split brain.
> > 
> >             If you are a creative admin, however, you may be able to
> >             force split brain by modifying monmaps.  In the end you'd
> >             obviously end up with two distinct monitor clusters, but if
> >             you so happened not to inform the clients about this
> >             there's a fair chance that it would cause havoc with
> >             unforeseen effects.  Then again, this would be the
> >             operator's fault, not Ceph's -- especially because
> >             rewriting monitor maps is not trivial enough for someone to
> >             mistakenly do something like this.
> > 
> >             -Joao
> 
> -- 
> Joao Eduardo Luis
> Software Engineer | http://inktank.com | http://ceph.com
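
For reference, a minimal sketch of the run-time adjustment and the quorum
check discussed above.  The monitor name "mon.a" and the value 600 are just
placeholders; the admin-socket commands must be run on the host where that
daemon's socket lives, and, as Joao cautions, being able to change
client_mount_interval does not mean you should.

    # Look up the current values via a running daemon's admin socket.
    ceph daemon mon.a config show | grep client_mount_interval    # default: 300 s
    ceph daemon mon.a config show | grep auth_service_ticket_ttl  # default: 3600 s

    # Adjust at run time through the admin socket ...
    ceph daemon mon.a config set client_mount_interval 600

    # ... or with injectargs via 'ceph tell'.  Neither change survives a
    # daemon restart; persist it in ceph.conf if it proves worthwhile.
    ceph tell mon.a injectargs '--client_mount_interval 600'

    # Check that the monitors still form a quorum: with N monitors in the
    # monmap, a strict majority of (N/2)+1 must be up to serve requests.
    ceph quorum_status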