[Jewel] upgrade 10.2.3 => 10.2.5 KO : first OSD server freezes every two days :)

Hello,

So, I could use some advice: a week ago (on 19 Feb), I upgraded my stable Ceph Jewel cluster from 10.2.3 to 10.2.5 (yes, that was maybe a bad idea).

I never had any problem with Ceph 10.2.3 since the previous upgrade, on 23 September.

Since the upgrade to 10.2.5, every 2 days the first OSD server totally freezes. The load goes above 500 and comes back down after a few minutes… I lose all the OSDs on this server (12/36) during the issue.
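
To rule out a kernel-level stall on that node, I also plan to check the kernel log around the incident window, something like this (the times below are just this morning's window):

dmesg -T | egrep -i "blocked for more than|hung_task|soft lockup|oom"
journalctl --since "2017-03-02 07:00:00" --until "2017-03-02 07:15:00" -p warning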

It’s very strange. So, some information:

Infrastructure:

3 x OSD servers with 12 OSD disks each and an SSD journal, plus 3 MON servers and 3 Ceph clients (RBD).

A dedicated 10G network for clients and a dedicated 10G network for the OSDs.

So 36 OSDs in total. Each server has 16 CPU cores (2 x E5-2630v3) and 32 GB of RAM. No problem with resources.

Performance is good for 36 x 4 TB NL-SAS disks plus 1 write-intensive SSD per OSD server.

Issue:

This morning (last issue was 2 days ago):

See screenshot: http://www.performance-conseil-informatique.net/wp-content/uploads/2017/03/screenshot_LOAD-1.png

As you can see, there is very little IO (just 2 clients, sometimes writing 150 MB/s for a few minutes). It’s a big NAS for cold data.

During the issue there was no IO, which is strange. Same for the other occurrences.

See screenshot: http://www.performance-conseil-informatique.net/wp-content/uploads/2017/03/screenshot_OSD_IO.png

Before the issue: no activity. You can see all the OSD reads, OSD writes, journal (SSD) activity and IO wait.

07:07 => 07:09: 2 minutes with 12/36 OSDs totally lost. They come back afterwards, but I need to fix this.

During the issue, scrubbing was stopped as well and the nightly trim had finished… no IO.

No other cron jobs on the server, nothing. All servers have the same configuration.
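
(To double-check this, here is roughly how I verify that nothing is scheduled around 07:00, fstrim timer included:)

systemctl list-timers --all
ls /etc/cron.d /etc/cron.daily
crontab -l -u root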

LOGS:

A lot of these:

ceph-osd.3.log:2017-03-02 07:09:32.061754 7f6d501e4700 -1 osd.3 14557 heartbeat_check: no reply from 0x7f6dadb48c10 osd.19 since back 2017-03-02 07:07:53.286880 front 2017-03-02 07:07:53.286880 (cutoff 2017-03-02 07:09:12.061690)
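
To see whether the heartbeats always fail against the same peer OSDs (same host, same network path), I count them roughly like this, assuming the log line format above:

grep -h 'heartbeat_check: no reply' /var/log/ceph/ceph-osd.*.log \
  | grep -o 'from [^ ]* osd\.[0-9]*' | awk '{print $3}' | sort | uniq -c | sort -rn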

Sometimes:

common/HeartbeatMap.cc: 86: FAILED assert(0 == "hit suicide timeout")

 ceph version 10.2.5 (c461ee19ecbc0c5c330aca20f7392c9a00730367)

 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x85) [0x7fc38a5e9425]

 2: (ceph::HeartbeatMap::_check(ceph::heartbeat_handle_d const*, char const*, long)+0x2e1) [0x7fc38a528de1]

 3: (ceph::HeartbeatMap::is_healthy()+0xde) [0x7fc38a52963e]

 4: (ceph::HeartbeatMap::check_touch_file()+0x2c) [0x7fc38a529e1c]

 5: (CephContextServiceThread::entry()+0x15b) [0x7fc38a6011ab]

 6: (()+0x7dc5) [0x7fc388304dc5]

 7: (clone()+0x6d) [0x7fc38698f73d]

 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
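
If this turns out to be only a timing problem (threads stuck a bit too long rather than a real hang), I suppose I could temporarily raise the heartbeat grace and the internal thread suicide timeouts on the OSDs as a workaround (just the options I would look at first, not a fix):

[osd]
# peer heartbeat grace (default is 20 seconds)
osd_heartbeat_grace = 60
# internal thread watchdog behind the "hit suicide timeout" assert
osd_op_thread_suicide_timeout = 300
filestore_op_thread_suicide_timeout = 300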


Questions: Why does only the first OSD server freeze? The 3 servers are strictly identical. What is freezing the server and increasing the load…?

Already 4 freezes since the upgrade. Today I will raise the log level and restart everything to get more logs.
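
Alternatively, I believe the debug level can also be injected at runtime without a full restart, and reverted the same way, something like:

ceph tell osd.* injectargs '--debug_osd 10 --debug_ms 1'
# back to the defaults once an occurrence has been captured
ceph tell osd.* injectargs '--debug_osd 0/5 --debug_ms 0/5'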

Any idea how to troubleshoot this? (I already use sar statistics to try to find something…).
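
With sar I am mainly looking at the run queue and the per-disk latencies around the freeze window, e.g. (the sa file path is the CentOS 7 default for day 02):

sar -q -f /var/log/sa/sa02 -s 07:00:00 -e 07:15:00      # load average / run queue
sar -d -p -f /var/log/sa/sa02 -s 07:00:00 -e 07:15:00   # per-device tps, await, %util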

Maybe some change related to the heartbeat?
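
To check whether any heartbeat-related defaults changed between 10.2.3 and 10.2.5, I can at least dump the running values from one OSD's admin socket and compare (assuming the default admin socket setup):

ceph daemon osd.3 config show | egrep 'heartbeat|suicide_timeout'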

Should I think about downgrading to 10.2.3? Or upgrading to Kraken?

Thanks for your help,

Regards,

Other things:

rpm -qa|grep ceph

libcephfs1-10.2.5-0.el7.x86_64

ceph-common-10.2.5-0.el7.x86_64

ceph-mon-10.2.5-0.el7.x86_64

ceph-release-1-1.el7.noarch

ceph-10.2.5-0.el7.x86_64

ceph-radosgw-10.2.5-0.el7.x86_64

ceph-selinux-10.2.5-0.el7.x86_64

ceph-mds-10.2.5-0.el7.x86_64

python-cephfs-10.2.5-0.el7.x86_64

ceph-base-10.2.5-0.el7.x86_64

ceph-osd-10.2.5-0.el7.x86_64

uname -a

Linux ceph-osd-03 3.10.0-514.6.2.el7.x86_64 #1 SMP Thu Feb 23 03:04:39 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

Ceph conf :

[global]

fsid = d26f269b-852f-4181-821d-756f213ae155

mon_initial_members = ceph-mon-01, ceph-mon-02, ceph-mon-03

mon_host = 192.168.43.147,192.168.43.148,192.168.43.149

auth_cluster_required = cephx

auth_service_required = cephx

auth_client_required = cephx

max_open_files = 131072

public_network = 192.168.43.0/24

cluster_network = 192.168.44.0/24

osd_journal_size = 13000

osd_pool_default_size = 2 # Write an object n times.

osd_pool_default_min_size = 2 # Allow writing n copies in a degraded state.

osd_pool_default_pg_num = 512

osd_pool_default_pgp_num = 512

osd_crush_chooseleaf_type = 8

cephx_cluster_require_signatures = true

cephx_service_require_signatures = false

mon_pg_warn_max_object_skew = 0

mon_pg_warn_max_per_osd = 0

[mon]

[osd]

osd_max_backfills = 1

osd_recovery_priority = 3

osd_recovery_max_active = 3

osd_recovery_max_start = 3

filestore merge threshold = 40

filestore split multiple = 8

filestore xattr use omap = true

osd op threads = 8

osd disk threads = 4

osd op num threads per shard = 3

osd op num shards = 10

osd map cache size = 1024

osd_enable_op_tracker = false

osd_scrub_begin_hour = 20

osd_scrub_end_hour = 6

[client]

rbd_cache = true

rbd cache size = 67108864

rbd cache max dirty = 50331648

rbd cache target dirty = 33554432

rbd cache max dirty age = 2

rbd cache writethrough until flush = true

rbd readahead trigger requests = 10 # number of sequential requests necessary to trigger readahead.

rbd readahead max bytes = 524288 # maximum size of a readahead request, in bytes.

rbd readahead disable after bytes = 52428800

--
Performance Conseil Informatique
Pascal Pucci
Consultant Infrastructure
pascal.pucci@xxxxxxxxxxxxxxx
Mobile: 06 51 47 84 98
Office: 02 85 52 41 81
http://www.performance-conseil-informatique.net
News:
As promised, in 2017 we are transforming! At your side, we transform your IT infrastructure while keeping the PCI fundamentals: Conti...
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
