[Jewel] upgrade 10.2.3 => 10.2.5 KO : first OSD server freezes every two days :)

Hello,

So, I could use some advice: a week ago (on 19 Feb), I upgraded my stable Ceph Jewel cluster from 10.2.3 to 10.2.5 (yes, that was maybe a bad idea).

I never had any problem with Ceph 10.2.3 since the previous upgrade, on 23 September.

Since the upgrade to 10.2.5, every 2 days the first OSD server totally freezes. The load goes above 500 and comes back down after a few minutes… I lose all the OSDs on this server (12/36) during the issue.
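
To rule out a kernel-level stall on that node, I also plan to check the kernel log around the incident window, something like this (the times below are just this morning's window):

dmesg -T | egrep -i "blocked for more than|hung_task|soft lockup|oom"
journalctl --since "2017-03-02 07:00:00" --until "2017-03-02 07:15:00" -p warning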

It’s very strange. So, some information:

Infrastructure:

3 x OSD servers with 12 OSD disks each and an SSD journal, plus 3 MON servers and 3 Ceph clients (RBD).

A dedicated 10G network for clients and a dedicated 10G network for the OSDs.

So 36 OSDs in total. Each server has 16 CPU cores (2 x E5-2630v3) and 32 GB of RAM. No problem with resources.

Performance is good for 36 x 4 TB NL-SAS disks plus 1 write-intensive SSD per OSD server.

Issue:

This morning (last issue was 2 days ago):

See screenshot: http://www.performance-conseil-informatique.net/wp-content/uploads/2017/03/screenshot_LOAD-1.png

As you can see, there is very little IO (just 2 clients, sometimes writing 150 MB/s for a few minutes). It’s a big NAS for cold data.

During the issue there was no IO, which is strange. Same for the other occurrences.

See screenshot: http://www.performance-conseil-informatique.net/wp-content/uploads/2017/03/screenshot_OSD_IO.png

Before the issue: no activity. You can see all the OSD reads, OSD writes, journal (SSD) activity and IO wait.

07:07 => 07:09: 2 minutes with 12/36 OSDs totally lost. They come back afterwards, but I need to fix this.

During the issue, scrubbing was stopped as well and the nightly trim had finished… no IO.

No other cron jobs on the server, nothing. All servers have the same configuration.
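
(To double-check this, here is roughly how I verify that nothing is scheduled around 07:00, fstrim timer included:)

systemctl list-timers --all
ls /etc/cron.d /etc/cron.daily
crontab -l -u root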

LOGS:

A lot of these:

ceph-osd.3.log:2017-03-02 07:09:32.061754 7f6d501e4700 -1 osd.3 14557 heartbeat_check: no reply from 0x7f6dadb48c10 osd.19 since back 2017-03-02 07:07:53.286880 front 2017-03-02 07:07:53.286880 (cutoff 2017-03-02 07:09:12.061690)
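
To see whether the heartbeats always fail against the same peer OSDs (same host, same network path), I count them roughly like this, assuming the log line format above:

grep -h 'heartbeat_check: no reply' /var/log/ceph/ceph-osd.*.log \
  | grep -o 'from [^ ]* osd\.[0-9]*' | awk '{print $3}' | sort | uniq -c | sort -rn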

Sometimes:

common/HeartbeatMap.cc: 86: FAILED assert(0 == "hit suicide timeout")

 ceph version 10.2.5 (c461ee19ecbc0c5c330aca20f7392c9a00730367)

 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x85) [0x7fc38a5e9425]

 2: (ceph::HeartbeatMap::_check(ceph::heartbeat_handle_d const*, char const*, long)+0x2e1) [0x7fc38a528de1]

 3: (ceph::HeartbeatMap::is_healthy()+0xde) [0x7fc38a52963e]

 4: (ceph::HeartbeatMap::check_touch_file()+0x2c) [0x7fc38a529e1c]

 5: (CephContextServiceThread::entry()+0x15b) [0x7fc38a6011ab]

 6: (()+0x7dc5) [0x7fc388304dc5]

 7: (clone()+0x6d) [0x7fc38698f73d]

 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
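
If this turns out to be only a timing problem (threads stuck a bit too long rather than a real hang), I suppose I could temporarily raise the heartbeat grace and the internal thread suicide timeouts on the OSDs as a workaround (just the options I would look at first, not a fix):

[osd]
# peer heartbeat grace (default is 20 seconds)
osd_heartbeat_grace = 60
# internal thread watchdog behind the "hit suicide timeout" assert
osd_op_thread_suicide_timeout = 300
filestore_op_thread_suicide_timeout = 300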


Questions: Why does only the first OSD server freeze? The 3 servers are strictly identical. What is freezing the server and increasing the load…?

Already 4 freezes since the upgrade. Today I will raise the log level and restart everything to get more logs.
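
Alternatively, I believe the debug level can also be injected at runtime without a full restart, and reverted the same way, something like:

ceph tell osd.* injectargs '--debug_osd 10 --debug_ms 1'
# back to the defaults once an occurrence has been captured
ceph tell osd.* injectargs '--debug_osd 0/5 --debug_ms 0/5'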

Any idea how to troubleshoot this? (I already use sar statistics to try to find something…).
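
With sar I am mainly looking at the run queue and the per-disk latencies around the freeze window, e.g. (the sa file path is the CentOS 7 default for day 02):

sar -q -f /var/log/sa/sa02 -s 07:00:00 -e 07:15:00      # load average / run queue
sar -d -p -f /var/log/sa/sa02 -s 07:00:00 -e 07:15:00   # per-device tps, await, %util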

Maybe some change related to the heartbeat?
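
To check whether any heartbeat-related defaults changed between 10.2.3 and 10.2.5, I can at least dump the running values from one OSD's admin socket and compare (assuming the default admin socket setup):

ceph daemon osd.3 config show | egrep 'heartbeat|suicide_timeout'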

Should I think about downgrading to 10.2.3? Or upgrading to Kraken?

Thanks for your help,

Regards,

Other things:

rpm -qa|grep ceph

libcephfs1-10.2.5-0.el7.x86_64

ceph-common-10.2.5-0.el7.x86_64

ceph-mon-10.2.5-0.el7.x86_64

ceph-release-1-1.el7.noarch

ceph-10.2.5-0.el7.x86_64

ceph-radosgw-10.2.5-0.el7.x86_64

ceph-selinux-10.2.5-0.el7.x86_64

ceph-mds-10.2.5-0.el7.x86_64

python-cephfs-10.2.5-0.el7.x86_64

ceph-base-10.2.5-0.el7.x86_64

ceph-osd-10.2.5-0.el7.x86_64

uname -a

Linux ceph-osd-03 3.10.0-514.6.2.el7.x86_64 #1 SMP Thu Feb 23 03:04:39 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

Ceph conf :

[global]

fsid = d26f269b-852f-4181-821d-756f213ae155

mon_initial_members = ceph-mon-01, ceph-mon-02, ceph-mon-03

mon_host = 192.168.43.147,192.168.43.148,192.168.43.149

auth_cluster_required = cephx

auth_service_required = cephx

auth_client_required = cephx

max_open_files = 131072

public_network = 192.168.43.0/24

cluster_network = 192.168.44.0/24

osd_journal_size = 13000

osd_pool_default_size = 2 # Write an object n times.

osd_pool_default_min_size = 2 # Allow writing n copies in a degraded state.

osd_pool_default_pg_num = 512

osd_pool_default_pgp_num = 512

osd_crush_chooseleaf_type = 8

cephx_cluster_require_signatures = true

cephx_service_require_signatures = false

mon_pg_warn_max_object_skew = 0

mon_pg_warn_max_per_osd = 0

[mon]

[osd]

osd_max_backfills = 1

osd_recovery_priority = 3

osd_recovery_max_active = 3

osd_recovery_max_start = 3

filestore merge threshold = 40

filestore split multiple = 8

filestore xattr use omap = true

osd op threads = 8

osd disk threads = 4

osd op num threads per shard = 3

osd op num shards = 10

osd map cache size = 1024

osd_enable_op_tracker = false

osd_scrub_begin_hour = 20

osd_scrub_end_hour = 6

[client]

rbd_cache = true

rbd cache size = 67108864

rbd cache max dirty = 50331648

rbd cache target dirty = 33554432

rbd cache max dirty age = 2

rbd cache writethrough until flush = true

rbd readahead trigger requests = 10 # number of sequential requests necessary to trigger readahead.

rbd readahead max bytes = 524288 # maximum size of a readahead request, in bytes.

rbd readahead disable after bytes = 52428800

--
Performance Conseil Informatique
Pascal Pucci
Consultant Infrastructure
pascal.pucci@xxxxxxxxxxxxxxx
Mobile: 06 51 47 84 98
Office: 02 85 52 41 81
http://www.performance-conseil-informatique.net
News:
As promised, in 2017 we are transforming! At your side, we transform your IT infrastructure while keeping the PCI fundamentals: Conti...
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
