Hi all,

I am seeing very slow IO on my cluster whenever I stop a node for maintenance. I know the cluster can handle the load with only 2 of my 4 nodes. I set the noout flag with "ceph osd set noout" before taking the node down (the full sequence I run is sketched just below, after the cluster description). From my understanding, stopping a single Ceph node should not be a big issue, BUT when the node goes down:

* Some PGs move to active+degraded, which I presume is normal.
* The load on my VMs increases massively, and so does the IO wait.
* Some VMs become unusable.

What is the best way to do maintenance on a node without interrupting service?

My cluster:

* 4 nodes & 3 monitors
* 9 OSDs per node = 36 OSDs
* 3 SSDs on each node for the journals (1 SSD for 3 OSDs)
* 2 pools: images (202 PGs) & volumes (822 PGs), 1024 PGs in total (we still have to increase this to 2048; the commands I plan to use are also sketched below)
* Average 2000 IOPS
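For reference, here is the exact sequence I currently use around a maintenance window. The "service ceph" commands are the sysvinit ones used on my nodes; they will differ on upstart/systemd:

  # from an admin node: keep stopped OSDs from being marked "out",
  # so no re-replication starts while the node is down
  ceph osd set noout

  # on the node being serviced: stop all local Ceph daemons
  service ceph stop

  # ... do the maintenance / reboot ...

  # on the node: bring the daemons back
  service ceph start

  # from an admin node, once all 36 OSDs are up and in again
  ceph osd unset noout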
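And for completeness, the PG increase mentioned above is still pending. We have not settled on how the new total of 2048 will be split between the two pools, so the target values below are only placeholders (pgp_num has to be raised after pg_num for the data to actually rebalance):

  ceph osd pool set volumes pg_num <new_pg_num>
  ceph osd pool set volumes pgp_num <new_pg_num>
  ceph osd pool set images pg_num <new_pg_num>
  ceph osd pool set images pgp_num <new_pg_num>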
My ceph.conf:

[global]
fsid = a4f460da-ec6e-459a-afb7-cddd565d131d
mon_initial_members = gontran,donald,lugia
mon_host = 172.28.0.8,172.28.0.9,172.28.0.7
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
filestore_xattr_use_omap = true
osd_pool_default_size = 3        # Write an object 3 times.
osd_pool_default_min_size = 1    # Allow writing one copy in a degraded state.
osd_pool_default_pg_num = 200
osd_pool_default_pgp_num = 200
osd_recovery_op_priority = 2
osd_max_backfills = 1
osd_recovery_max_active = 1
# Disable debug to increase IOPS
debug lockdep = 0/0
debug context = 0/0
debug crush = 0/0
debug buffer = 0/0
debug timer = 0/0
debug journaler = 0/0
debug osd = 0/0
debug optracker = 0/0
debug objclass = 0/0
debug filestore = 0/0
debug journal = 0/0
debug ms = 0/0
debug monc = 0/0
debug tp = 0/0
debug auth = 0/0
debug finisher = 0/0
debug heartbeatmap = 0/0
debug perfcounter = 0/0
debug asok = 0/0
debug throttle = 0/0

[mon]
mon host = gontran,donald,lugia
mon addr = 172.28.0.8:6789,172.28.0.9:6789,172.28.0.7:6789
mon warn on legacy crush tunables = false
debug lockdep = 0/0
debug context = 0/0
debug crush = 0/0
debug buffer = 0/0
debug timer = 0/0
debug journaler = 0/0
debug osd = 0/0
debug optracker = 0/0
debug objclass = 0/0
debug filestore = 0/0
debug journal = 0/0
debug ms = 0/0
debug monc = 0/0
debug tp = 0/0
debug auth = 0/0
debug finisher = 0/0
debug heartbeatmap = 0/0
debug perfcounter = 0/0
debug asok = 0/0
debug throttle = 0/0

[osd]
osd_mkfs_type = xfs
osd_filestore_xattr_use_omap = true
osd_journal_size = 10000
debug lockdep = 0/0
debug context = 0/0
debug crush = 0/0
debug buffer = 0/0
debug timer = 0/0
debug journaler = 0/0
debug osd = 0/0
debug optracker = 0/0
debug objclass = 0/0
debug filestore = 0/0
debug journal = 0/0
debug ms = 0/0
debug monc = 0/0
debug tp = 0/0
debug auth = 0/0
debug finisher = 0/0
debug heartbeatmap = 0/0
debug perfcounter = 0/0
debug asok = 0/0
debug throttle = 0/0

[osd.0]
host = gontran
devs = /dev/sda3
journal = /dev/sdj

[osd.1]
host = gontran
devs = /dev/sdb3
journal = /dev/sdj

[osd.2]
host = gontran
devs = /dev/sdc3
journal = /dev/sdj

[osd.3]
host = gontran
devs = /dev/sdd3
journal = /dev/sdk

[osd.4]
host = gontran
devs = /dev/sde3
journal = /dev/sdk

[osd.5]
host = gontran
devs = /dev/sdf3
journal = /dev/sdk

[osd.6]
host = gontran
devs = /dev/sdg3
journal = /dev/sdl

[osd.7]
host = gontran
devs = /dev/sdh3
journal = /dev/sdl

[osd.8]
host = gontran
devs = /dev/sdi3
journal = /dev/sdl

[osd.9]
host = donald
devs = /dev/sda4
journal = /dev/sdj

[osd.10]
host = donald
devs = /dev/sdb4
journal = /dev/sdj

[osd.11]
host = donald
devs = /dev/sdc4
journal = /dev/sdj

[osd.12]
host = donald
devs = /dev/sdd4
journal = /dev/sdk

[osd.13]
host = donald
devs = /dev/sde4
journal = /dev/sdk

[osd.14]
host = donald
devs = /dev/sdf4
journal = /dev/sdk

[osd.15]
host = donald
devs = /dev/sdg4
journal = /dev/sdl

[osd.16]
host = donald
devs = /dev/sdh4
journal = /dev/sdl

[osd.17]
host = donald
devs = /dev/sdi4
journal = /dev/sdl

[osd.18]
host = popop
devs = /dev/sda4
journal = /dev/sdj

[osd.19]
host = popop
devs = /dev/sdb4
journal = /dev/sdj

[osd.20]
host = popop
devs = /dev/sdc4
journal = /dev/sdj

[osd.21]
host = popop
devs = /dev/sdd4
journal = /dev/sdk

[osd.22]
host = popop
devs = /dev/sde4
journal = /dev/sdk

[osd.23]
host = popop
devs = /dev/sdf4
journal = /dev/sdk

[osd.24]
host = popop
devs = /dev/sdg4
journal = /dev/sdl

[osd.25]
host = popop
devs = /dev/sdh4
journal = /dev/sdl

[osd.26]
host = popop
devs = /dev/sdi4
journal = /dev/sdl

[osd.27]
host = daisy
devs = /dev/sda4
journal = /dev/sdj

[osd.28]
host = daisy
devs = /dev/sdb4
journal = /dev/sdj

[osd.29]
host = daisy
devs = /dev/sdc4
journal = /dev/sdj

[osd.30]
host = daisy
devs = /dev/sdd4
journal = /dev/sdk

[osd.31]
host = daisy
devs = /dev/sde4
journal = /dev/sdk

[osd.32]
host = daisy
devs = /dev/sdf4
journal = /dev/sdk

[osd.33]
host = daisy
devs = /dev/sdg4
journal = /dev/sdl

[osd.34]
host = daisy
devs = /dev/sdh4
journal = /dev/sdl

[osd.35]
host = daisy
devs = /dev/sdi4
journal = /dev/sdl

ceph -s:

[root@gontran.ovh:~] # ceph -s
    cluster a4f460da-ec6e-459a-afb7-cddd565d131d
     health HEALTH_WARN noscrub,nodeep-scrub flag(s) set
     monmap e13: 3 mons at {donald=172.28.0.9:6789/0,gontran=172.28.0.8:6789/0,lugia=172.28.0.7:6789/0}, election epoch 56040, quorum 0,1,2 lugia,gontran,donald
     osdmap e11058: 36 osds: 36 up, 36 in
            flags noscrub,nodeep-scrub
      pgmap v11009904: 1024 pgs, 2 pools, 5467 GB data, 1331 kobjects
            16387 GB used, 114 TB / 130 TB avail
                1024 active+clean
  client io 1712 kB/s wr, 231 op/s

Thank you!