Re: Failing heartbeats when no backfill is running

Hello again,

All links are at least 10/50 Mbit/s upstream/downstream, most are 40/100 Mbit/s, and some VMs at hosting companies run at 1/1 Gbit/s. My 39 OSDs on 17 hosts in 11 locations (5 of them currently connected via consumer internet links) form a nearly full mesh of WireGuard VPN links, routed by bird with OSPF. Speed is not great, as you can imagine, but sufficient for me.
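
For context, every mesh link is a plain WireGuard interface with a reduced MTU so the tunnel fits through the underlying links, roughly like the sketch below. Everything except the MTU of 1300 is a placeholder, and there is one [Peer] section per mesh neighbour:

    [Interface]
    Address = fd00:88::1/128            # placeholder tunnel address
    PrivateKey = <redacted>
    ListenPort = 51820
    MTU = 1300

    [Peer]
    PublicKey = <peer public key>
    Endpoint = peer.example.org:51820   # placeholder endpoint
    AllowedIPs = fd00:88::2/128
    PersistentKeepalive = 25            # keeps NAT mappings alive on the consumer links

bird then just speaks OSPF over all the wg* interfaces as point-to-point links.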

Some hosts are x86, some are ARMv7 on ODROID HC-1 (SAMSUNG smartphone SoC). Could this mix of architectures be a problem?

My goal is to share a filesystem with my friends and to provide backup space on RBD images. This seems possible, but it is really annoying when OSDs are randomly marked down.

If this were a network issue, I would expect all OSDs on the affected host to be marked down, but only one OSD on that host gets marked down. If I log in to that host and restart the OSD, the same OSD will likely be marked down again within 10-30 minutes. And this only happens when there is *no* backfill or recovery running, which is the opposite of what I would expect: network issues and packet drops should be more likely on a saturated line than on an idle one.

Are there any (more) config keys for OSD ping timeouts in Luminous? I would be very happy to get some more ideas!
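
For reference, these are the heartbeat-related options I have found so far. The values below are just what I am experimenting with, not recommendations, and the defaults are from memory, so please correct me if they are different in Luminous:

    [osd]
        # seconds without a heartbeat reply before peers report this OSD down (default 20)
        osd heartbeat grace = 60
        # seconds between heartbeat pings to peers (default 6)
        osd heartbeat interval = 6

    [mon]
        # how many distinct OSDs must report a peer down before the mon marks it down (default 2)
        mon osd min down reporters = 3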

Thank you all

Lorenz


On 16.08.19 at 17:01, Robert LeBlanc wrote:
Personally, I would not try to create a Ceph cluster across consumer internet links; their upload speed is usually so slow, and Ceph is so chatty, that it would make for a horrible experience. If you are looking for a backup solution, I would look at some sort of n-way rsync setup, or btrfs/zfs volumes that send/receive to each other. I really don't think Ceph is a good fit.
----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1


On Thu, Aug 15, 2019 at 12:37 AM Lorenz Kiefner <root+cephusers@xxxxxxxxxxxx> wrote:

Oh no, it's not that bad. It's

$ ping -s 65000 dest.inati.on

on a VPN connection that has an MTU of 1300 via IPv6. So I suspect that I only get an answer when all 51 fragments are returned intact. It's clear that big packets split into many fragments are affected by packet loss far more than 64-byte pings.
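
(That compounding effect is easy to underestimate: if each fragment is lost independently with probability p, a ping that needs all 51 fragments back only succeeds with probability (1-p)^51. Taking a per-fragment loss of 0.5% as a purely illustrative figure, 0.995^51 ≈ 0.77, so roughly 23% of the 64k pings fail, while a 5k ping split into 4-5 fragments loses only around 2%, which matches the ratio between my 64k and 5k results quite well.)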

I just repeated this ping test (at 9 o'clock in the morning) and got hardly any drops (less than 1%), even at the 64k size. So it really depends on the time of day. It seems some ISPs drop packets, especially in the evening...

A few minutes ago I restarted all the down-marked OSDs, but they are getting marked down again... It seems Ceph is tolerant of packet loss (it surely affects performance, but that is irrelevant for me).


Could erasure coded pools pose some problems?


Thank you all for every hint!

Lorenz


On 15.08.19 at 08:51, Janne Johansson wrote:
On Wed, Aug 14, 2019 at 17:46, Lorenz Kiefner <root+cephusers@xxxxxxxxxxxx> wrote:
Is Ceph sensitive to packet loss? On some VPN links I have up to 20%
packet loss on 64k packets but less than 3% on 5k packets in the evenings.

20% seems crazy high; there must be something really wrong there.

At 20%, you would get tons of timeouts waiting on all those lost frames,
then resends of (at least!) those 20% extra, which in turn would lead to 20% of those
resends getting lost, all while the main streams of data try to move forward whenever some
older packets do get through. This is a really bad situation to design for.
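
(A rough illustration of the resend cascade: with an independent loss rate p, every packet needs 1/(1-p) transmissions on average, so at p = 0.2 that is "only" 25% extra traffic; the real damage comes from the retransmission timeouts and the shrinking TCP congestion window that every one of those losses triggers.)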

I think you should look for a link solution that doesn't drop that many packets, instead of changing
the software you try to run over that link; everything else you run over it will notice the loss too and behave badly in some way or other.

Heck, 20% is like taking a math schoolbook, removing all instances of "3" and "8", and seeing if kids can learn to count from it. 8-/
 
--
May the most significant bit of your life be positive.
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
