Hi all,
We have a 5-node Ceph cluster (Luminous 12.2.1) installed via ceph-ansible.
All servers have 16 x 1.5TB SSD disks.
3 of these servers also act as MON+MGR.
We don't have separate cluster and public networks; each node has 4 NICs
bonded together (40G) that carry both cluster and public traffic (we know
this isn't ideal and are planning to change it).
Last week we added another node to the cluster (another 16 x 1.5TB SSDs),
using the latest stable ceph-ansible release.
After OSD activation the cluster started rebalancing and the problems began:
1. The cluster entered HEALTH_ERR state.
2. 67 PGs were stuck in activating+remapped.
3. There were a lot of blocked/slow requests.
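(Happy to attach the output of the following if it helps; osd.<id> is just a placeholder for one of the affected OSDs:)

   ceph health detail
   ceph pg dump_stuck inactive unclean
   ceph daemon osd.<id> dump_ops_in_flight   # run on the node hosting that OSD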
This cluster serves OpenStack volumes; almost all OpenStack instances hit
100% disk utilization and hung, and eventually cinder-volume crashed.
After restarting several OSDs the problem went away and the cluster
returned to HEALTH_OK.
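(By "restarting" I mean a plain systemd restart of each OSD daemon, one at a time, something like the following; osd.<id> is a placeholder:)

   systemctl restart ceph-osd@<id>
   ceph -s    # wait for peering to settle before the next restart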
Our configuration already has:
osd max backfills = 1
osd max scrubs = 1
osd recovery max active = 1
osd recovery op priority = 1
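(If it's useful, we can double-check the values the running OSDs actually picked up via the admin socket; osd.<id> is a placeholder:)

   ceph daemon osd.<id> config show | grep -E 'osd_max_backfills|osd_max_scrubs|osd_recovery_max_active|osd_recovery_op_priority'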
In addition, we see a lot of bad mappings, for example:
bad mapping rule 0 x 52 num_rep 8 result [32,5,78,25,96,59,80]
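(In case it helps reproduce this, the messages look like what a crushtool test run prints; the file name below is just a placeholder:)

   ceph osd getcrushmap -o crushmap.bin
   crushtool -i crushmap.bin --test --show-bad-mappings --rule 0 --num-rep 8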
What could be the cause, and what can we do to avoid this situation? We
need to add another 9 OSD servers and can't afford downtime.
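For the next round we were thinking of something along these lines to keep data movement under control while the new OSDs come in (just a sketch using the standard cluster flags, not a tested procedure; please tell us if this is the wrong approach):

   ceph osd set noout
   ceph osd set norebalance
   # ... deploy the new node's OSDs with ceph-ansible ...
   ceph osd unset noout
   ceph osd unset norebalance   # backfill then proceeds, throttled by the settings above

Does that sound reasonable, or is there a better way to stage such an expansion?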
Any help would be appreciated. Thank you very much
Our ceph configuration:
[mgr]
mgr_modules = dashboard zabbix
[global]
cluster network = *removed for security reasons*
fsid = *removed for security reasons*
mon host = *removed for security reasons*
mon initial members = *removed for security reasons*
mon osd down out interval = 900
osd pool default size = 3
public network = *removed for security reasons*
[client.libvirt]
admin socket = /var/run/ceph/$cluster-$type.$id.$pid.$cctid.asok  # must be writable by QEMU and allowed by SELinux or AppArmor
log file = /var/log/ceph/qemu-guest-$pid.log  # must be writable by QEMU and allowed by SELinux or AppArmor
[osd]
osd backfill scan max = 16
osd backfill scan min = 4
osd bluestore cache size = 104857600  # reduced due to the 12.2.1 BlueStore memory leak bug
osd max backfills = 1
osd max scrubs = 1
osd recovery max active = 1
osd recovery max single start = 1
osd recovery op priority = 1
osd recovery threads = 1
--
Tzachi Strul
Storage DevOps // Kenshoo