Hi Sébastien,
Thanks for your reply. Yes, the undersized PGs and the recovery in progress are because we added a new OSD after getting the "2 OSDs near full" warning. Yes, data is rebalancing onto the newly added OSD.
[root@intcfs-osd6 ~]# ceph osd df
ID WEIGHT REWEIGHT SIZE USE AVAIL %USE VAR PGS
0 3.29749 1.00000 3376G 2875G 501G 85.15 1.26 165
1 3.26869 1.00000 3347G 1923G 1423G 57.46 0.85 152
2 3.27339 1.00000 3351G 1980G 1371G 59.08 0.88 161
3 3.24089 1.00000 3318G 2130G 1187G 64.21 0.95 168
4 3.24089 1.00000 3318G 2997G 320G 90.34 1.34 176
5 3.32669 1.00000 3406G 2466G 939G 72.42 1.07 165
6 3.27800 1.00000 3356G 1463G 1893G 43.60 0.65 166
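To make the imbalance in this table easier to see, here is a minimal Python sketch that parses the `ceph osd df` rows pasted above and lists the OSDs at or above a near-full threshold. The 85% threshold is an assumption (it corresponds to the default `mon_osd_nearfull_ratio` of 0.85), and the helper names are illustrative only, not part of any Ceph tooling.

```python
# Rows copied from the `ceph osd df` output above (header included).
OSD_DF = """\
ID WEIGHT REWEIGHT SIZE USE AVAIL %USE VAR PGS
0 3.29749 1.00000 3376G 2875G 501G 85.15 1.26 165
1 3.26869 1.00000 3347G 1923G 1423G 57.46 0.85 152
2 3.27339 1.00000 3351G 1980G 1371G 59.08 0.88 161
3 3.24089 1.00000 3318G 2130G 1187G 64.21 0.95 168
4 3.24089 1.00000 3318G 2997G 320G 90.34 1.34 176
5 3.32669 1.00000 3406G 2466G 939G 72.42 1.07 165
6 3.27800 1.00000 3356G 1463G 1893G 43.60 0.65 166
"""


def parse_osd_df(text):
    """Turn the whitespace-separated table into a list of dicts keyed by column name."""
    lines = text.strip().splitlines()
    header = lines[0].split()
    rows = []
    for line in lines[1:]:
        fields = line.split()
        # Skip summary lines such as TOTAL or MIN/MAX VAR if they are present.
        if len(fields) != len(header) or not fields[0].isdigit():
            continue
        rows.append(dict(zip(header, fields)))
    return rows


def nearfull_osds(rows, threshold_pct=85.0):
    """Return the IDs of OSDs whose %USE is at or above the (assumed) threshold."""
    return [int(r["ID"]) for r in rows if float(r["%USE"]) >= threshold_pct]


def size_in_g(value):
    """Convert a value such as '3376G' to an integer number of G units."""
    return int(value.rstrip("G"))


rows = parse_osd_df(OSD_DF)
used = sum(size_in_g(r["USE"]) for r in rows)
size = sum(size_in_g(r["SIZE"]) for r in rows)

print("near-full OSDs:", nearfull_osds(rows))  # prints: near-full OSDs: [0, 4]
print(f"cluster utilisation: {100.0 * used / size:.2f}%")
```

With these numbers only osd.0 and osd.4 cross the threshold, while the new osd.6 is still the emptiest, which is consistent with backfill still being in progress.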
ceph osd crush rule dump
[
    {
        "rule_id": 0,
        "rule_name": "replicated_ruleset",
        "ruleset": 0,
        "type": 1,
        "min_size": 1,
        "max_size": 10,
        "steps": [
            {
                "op": "take",
                "item": -1,
                "item_name": "default"
            },
            {
                "op": "chooseleaf_firstn",
                "num": 0,
                "type": "host"
            },
            {
                "op": "emit"
            }
        ]
    }
]
ceph version 10.2.2 and ceph version 10.2.9
ceph osd pool ls detail
pool 0 'rbd' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 1 flags hashpspool stripe_width 0
pool 3 'downloads_data' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 250 pgp_num 250 last_change 39 flags hashpspool crash_replay_interval 45 stripe_width 0
pool 4 'downloads_metadata' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 250 pgp_num 250 last_change 36 flags hashpspool stripe_width 0
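As a rough cross-check of the pool definitions above, the following Python sketch extracts `size` and `pg_num` from each line, sums the PG replicas the cluster has to place, and flags pools whose `pg_num` is not a power of two (the Ceph documentation recommends power-of-two values for more even distribution). The OSD count of 7 is taken from the `ceph osd df` table; the regular expression and variable names are assumptions made for this illustration.

```python
import re

# Lines copied from the `ceph osd pool ls detail` output above.
POOL_LINES = """\
pool 0 'rbd' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 1 flags hashpspool stripe_width 0
pool 3 'downloads_data' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 250 pgp_num 250 last_change 39 flags hashpspool crash_replay_interval 45 stripe_width 0
pool 4 'downloads_metadata' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 250 pgp_num 250 last_change 36 flags hashpspool stripe_width 0
"""

NUM_OSDS = 7  # osd.0 .. osd.6, as listed by `ceph osd df`

POOL_RE = re.compile(r"pool (\d+) '([^']+)' replicated size (\d+) .*?\bpg_num (\d+)")


def is_power_of_two(n):
    """True for 1, 2, 4, 8, ...; False otherwise."""
    return n > 0 and (n & (n - 1)) == 0


pools = []
for line in POOL_LINES.splitlines():
    match = POOL_RE.search(line)
    if match:
        pool_id, name, size, pg_num = match.groups()
        pools.append({"name": name, "size": int(size), "pg_num": int(pg_num)})

# Every PG is stored `size` times, so this is the number of PG copies to place.
total_replicas = sum(p["size"] * p["pg_num"] for p in pools)
non_pow2 = [p["name"] for p in pools if not is_power_of_two(p["pg_num"])]

print("PG replicas to place:", total_replicas)  # prints: PG replicas to place: 1128
print(f"average PG replicas per OSD: {total_replicas / NUM_OSDS:.1f}")
print("pools with non-power-of-two pg_num:", non_pow2)
```

The two `downloads_*` pools use `pg_num 250`, which is not a power of two; that may contribute to the uneven usage between OSDs, although this sketch alone cannot confirm the cause.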
---- On Sun, 12 Nov 2017 15:04:02 +0530 Sébastien VIGNERON <sebastien.vigneron@xxxxxxxxx> wrote ----
Hi,

Can you share:
- your placement rules: ceph osd crush rule dump
- your CEPH version: ceph versions
- your pools definitions: ceph osd pool ls detail

With these we can determine if your PGs are stuck because of a misconfiguration or something else.

You seem to have some undersized PGs and a recovery in progress. Have your OSDs shown any rebalancing of your data? Does your OSDs' usage percentage change over time? (changes in "ceph osd df")

Cordialement / Best regards,
Sébastien VIGNERON
CRIANN,
Ingénieur / Engineer
Technopôle du Madrillet
745, avenue de l'Université
76800 Saint-Etienne du Rouvray - France
tél. +33 2 32 91 42 91
fax. +33 2 32 91 42 92
support: support@xxxxxxxxx

On 12 Nov 2017, at 10:04, gjprabu <gjprabu@xxxxxxxxxxxx> wrote:

Hi Team,

We have a Ceph setup with 6 OSDs and we got an alert that 2 OSDs are near full. We also faced slow access to Ceph from the clients. So I added a 7th OSD, but 2 OSDs (osd.0 and osd.4) are still showing near full. I have restarted the Ceph service on osd.0 and osd.4.
Kindly check the ceph osd status below and please provide us with a solution.

# ceph health detail
HEALTH_WARN 46 pgs backfill_wait; 1 pgs backfilling; 32 pgs degraded; 50 pgs stuck unclean; 32 pgs undersized; recovery 1098780/40253637 objects degraded (2.730%); recovery 3401433/40253637 objects misplaced (8.450%); 2 near full osd(s); mds0: Client integ-hm3 failing to respond to cache pressure; mds0: Client integ-hm8 failing to respond to cache pressure; mds0: Client integ-hm2 failing to respond to cache pressure; mds0: Client integ-hm9 failing to respond to cache pressure; mds0: Client integ-hm5 failing to respond to cache pressure; mds0: Client integ-hm9-bkp failing to respond to cache pressure; mds0: Client me-build1-bkp failing to respond to cache pressure
pg 3.f6 is stuck unclean for 511223.069161, current state active+undersized+degraded+remapped+wait_backfill, last acting [2]
pg 4.f6 is stuck unclean for 511232.770419, current state active+undersized+degraded+remapped+wait_backfill, last acting [2]
pg 3.ec is stuck unclean for 510902.815668, current state active+undersized+degraded+remapped+wait_backfill, last acting [2]
pg 3.eb is stuck unclean for 511285.576487, current state active+remapped+wait_backfill, last acting [3,0]
pg 4.17 is stuck unclean for 511235.326709, current state active+undersized+degraded+remapped+wait_backfill, last acting [1]
pg 4.2f is stuck unclean for 511232.356371, current state active+undersized+degraded+remapped+wait_backfill, last acting [2]
pg 4.3d is stuck unclean for 511300.446982, current state active+remapped, last acting [3,0]
pg 4.93 is stuck unclean for 511295.539229, current state active+undersized+degraded+remapped+wait_backfill, last acting [3]
pg 3.47 is stuck unclean for 511288.104965, current state active+remapped+wait_backfill, last acting [3,0]
pg 4.d5 is stuck unclean for 510916.509825, current state active+undersized+degraded+remapped+wait_backfill, last acting [2]
pg 3.31 is stuck unclean for 511221.542878, current state active+remapped+wait_backfill, last acting [0,3]
pg 3.62 is stuck unclean for 511221.551662, current state active+undersized+degraded+remapped+wait_backfill, last acting [4]
pg 4.4d is stuck unclean for 511232.279602, current state active+undersized+degraded+remapped+wait_backfill, last acting [2]
pg 4.48 is stuck unclean for 510911.095367, current state active+remapped+wait_backfill, last acting [5,4]
pg 3.4f is stuck unclean for 511226.712285, current state active+undersized+degraded+remapped+wait_backfill, last acting [1]
pg 3.78 is stuck unclean for 511221.531199, current state active+undersized+degraded+remapped+wait_backfill, last acting [2]
pg 3.24 is stuck unclean for 510903.483324, current state active+remapped+backfilling, last acting [1,2]
pg 4.8c is stuck unclean for 511231.668693, current state active+undersized+degraded+remapped+wait_backfill, last acting [1]
pg 3.b4 is stuck unclean for 511222.612012, current state active+undersized+degraded+remapped+wait_backfill, last acting [0]
pg 4.41 is stuck unclean for 511287.031264, current state active+remapped+wait_backfill, last acting [3,2]
pg 3.d1 is stuck unclean for 510903.797329, current state active+remapped+wait_backfill, last acting [0,3]
pg 3.7f is stuck unclean for 511222.929722, current state active+undersized+degraded+remapped+wait_backfill, last acting [1]
pg 4.af is stuck unclean for 511262.494659, current state active+undersized+degraded+remapped, last acting [0]
pg 3.66 is stuck unclean for 510903.296711, current state active+remapped+wait_backfill, last acting [3,0]
pg 3.76 is stuck unclean for 511224.615144, current state active+undersized+degraded+remapped+wait_backfill, last acting [3]
pg 4.57 is stuck unclean for 511234.514343, current state active+remapped, last acting [0,4]
pg 3.69 is stuck unclean for 511224.672085, current state active+undersized+degraded+remapped+wait_backfill, last acting [4]
pg 3.9a is stuck unclean for 510967.300000, current state active+remapped+wait_backfill, last acting [3,2]
pg 4.50 is stuck unclean for 510903.825565, current state active+undersized+degraded+remapped+wait_backfill, last acting [1]
pg 4.53 is stuck unclean for 510921.975268, current state active+undersized+degraded+remapped+wait_backfill, last acting [2]
pg 3.e7 is stuck unclean for 511221.530592, current state active+undersized+degraded+remapped+wait_backfill, last acting [2]
pg 4.6a is stuck unclean for 510911.284877, current state active+undersized+degraded+remapped+wait_backfill, last acting [0]
pg 4.16 is stuck unclean for 511232.702762, current state active+undersized+degraded+remapped+wait_backfill, last acting [1]
pg 3.2c is stuck unclean for 511222.443893, current state active+remapped+wait_backfill, last acting [2,3]
pg 4.89 is stuck unclean for 511228.846614, current state active+undersized+degraded+remapped+wait_backfill, last acting [4]
pg 4.39 is stuck unclean for 511239.544231, current state active+remapped+wait_backfill, last acting [3,2]
pg 4.ce is stuck unclean for 511232.294586, current state active+undersized+degraded+remapped+wait_backfill, last acting [1]
pg 3.91 is stuck unclean for 511232.341380, current state active+undersized+degraded+remapped+wait_backfill, last acting [2]
pg 3.96 is stuck unclean for 510904.043900, current state active+undersized+degraded+remapped+wait_backfill, last acting [2]
pg 4.c0 is stuck unclean for 510904.253281, current state active+undersized+degraded+remapped+wait_backfill, last acting [2]
pg 4.9c is stuck unclean for 511237.612850, current state active+undersized+degraded+remapped+wait_backfill, last acting [1]
pg 3.ab is stuck unclean for 510960.756324, current state active+remapped+wait_backfill, last acting [3,2]
pg 4.aa is stuck unclean for 511229.307559, current state active+remapped+wait_backfill, last acting [0,3]
pg 3.ad is stuck unclean for 510903.764157, current state active+remapped+wait_backfill, last acting [0,3]
pg 3.b5 is stuck unclean for 511226.560774, current state active+undersized+degraded+remapped+wait_backfill, last acting [3]
pg 4.58 is stuck unclean for 510919.273667, current state active+undersized+degraded+remapped+wait_backfill, last acting [1]
pg 4.b9 is stuck unclean for 511232.760066, current state active+remapped+wait_backfill, last acting [5,4]
pg 3.be is stuck unclean for 511224.422931, current state active+remapped+wait_backfill, last acting [0,4]
pg 4.d4 is stuck unclean for 510962.810416, current state active+undersized+degraded+remapped+wait_backfill, last acting [3]
pg 4.da is stuck unclean for 511259.506962, current state active+undersized+degraded+remapped+wait_backfill, last acting [2]
pg 4.8c is active+undersized+degraded+remapped+wait_backfill, acting [1]
pg 3.7f is active+undersized+degraded+remapped+wait_backfill, acting [1]
pg 3.78 is active+undersized+degraded+remapped+wait_backfill, acting [2]
pg 3.76 is active+undersized+degraded+remapped+wait_backfill, acting [3]
pg 4.6a is active+undersized+degraded+remapped+wait_backfill, acting [0]
pg 3.69 is active+undersized+degraded+remapped+wait_backfill, acting [4]
pg 3.66 is active+remapped+wait_backfill, acting [3,0]
pg 3.62 is active+undersized+degraded+remapped+wait_backfill, acting [4]
pg 4.58 is active+undersized+degraded+remapped+wait_backfill, acting [1]
pg 4.50 is active+undersized+degraded+remapped+wait_backfill, acting [1]
pg 4.53 is active+undersized+degraded+remapped+wait_backfill, acting [2]
pg 3.4f is active+undersized+degraded+remapped+wait_backfill, acting [1]
pg 4.48 is active+remapped+wait_backfill, acting [5,4]
pg 4.4d is active+undersized+degraded+remapped+wait_backfill, acting [2]
pg 3.47 is active+remapped+wait_backfill, acting [3,0]
pg 4.41 is active+remapped+wait_backfill, acting [3,2]
pg 3.31 is active+remapped+wait_backfill, acting [0,3]
pg 4.2f is active+undersized+degraded+remapped+wait_backfill, acting [2]
pg 3.24 is active+remapped+backfilling, acting [1,2]
pg 4.17 is active+undersized+degraded+remapped+wait_backfill, acting [1]
pg 4.16 is active+undersized+degraded+remapped+wait_backfill, acting [1]
pg 3.2c is active+remapped+wait_backfill, acting [2,3]
pg 4.39 is active+remapped+wait_backfill, acting [3,2]
pg 4.89 is active+undersized+degraded+remapped+wait_backfill, acting [4]
pg 3.91 is active+undersized+degraded+remapped+wait_backfill, acting [2]
pg 4.93 is active+undersized+degraded+remapped+wait_backfill, acting [3]
pg 3.96 is active+undersized+degraded+remapped+wait_backfill, acting [2]
pg 3.9a is active+remapped+wait_backfill, acting [3,2]
pg 4.9c is active+undersized+degraded+remapped+wait_backfill, acting [1]
pg 4.af is active+undersized+degraded+remapped, acting [0]
pg 3.ab is active+remapped+wait_backfill, acting [3,2]
pg 4.aa is active+remapped+wait_backfill, acting [0,3]
pg 3.ad is active+remapped+wait_backfill, acting [0,3]
pg 3.b4 is active+undersized+degraded+remapped+wait_backfill, acting [0]
pg 3.b5 is active+undersized+degraded+remapped+wait_backfill, acting [3]
pg 4.b9 is active+remapped+wait_backfill, acting [5,4]
pg 3.be is active+remapped+wait_backfill, acting [0,4]
pg 4.c0 is active+undersized+degraded+remapped+wait_backfill, acting [2]
pg 4.ce is active+undersized+degraded+remapped+wait_backfill, acting [1]
pg 3.d1 is active+remapped+wait_backfill, acting [0,3]
pg 4.d5 is active+undersized+degraded+remapped+wait_backfill, acting [2]
pg 4.d4 is active+undersized+degraded+remapped+wait_backfill, acting [3]
pg 4.da is active+undersized+degraded+remapped+wait_backfill, acting [2]
pg 3.e7 is active+undersized+degraded+remapped+wait_backfill, acting [2]
pg 3.eb is active+remapped+wait_backfill, acting [3,0]
pg 3.ec is active+undersized+degraded+remapped+wait_backfill, acting [2]
pg 4.f6 is active+undersized+degraded+remapped+wait_backfill, acting [2]
pg 3.f6 is active+undersized+degraded+remapped+wait_backfill, acting [2]
recovery 1098780/40253637 objects degraded (2.730%)
recovery 3401433/40253637 objects misplaced (8.450%)
osd.0 is near full at 85%
osd.4 is near full at 90%
mds0: Client integ-hm3 failing to respond to cache pressure(client_id: 733998)
mds0: Client integ-hm8 failing to respond to cache pressure(client_id: 843866)
mds0: Client integ-hm2 failing to respond to cache pressure(client_id: 844939)
mds0: Client integ-hm9 failing to respond to cache pressure(client_id: 845065)
mds0: Client integ-hm5 failing to respond to cache pressure(client_id: 845068)
mds0: Client integ-hm9-bkp failing to respond to cache pressure(client_id: 895898)
mds0: Client me-build1-bkp failing to respond to cache pressure(client_id: 888666)

hm ~]# ceph osd tree
ID WEIGHT TYPE NAME UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 22.92604 root default
-2 3.29749 host intcfs-osd1
0 3.29749 osd.0 up 1.00000 1.00000
-3 3.26869 host intcfs-osd2
1 3.26869 osd.1 up 1.00000 1.00000
-4 3.27339 host intcfs-osd3
2 3.27339 osd.2 up 1.00000 1.00000
-5 3.24089 host intcfs-osd4
3 3.24089 osd.3 up 1.00000 1.00000
-6 3.24089 host intcfs-osd5
4 3.24089 osd.4 up 1.00000 1.00000
-7 3.32669 host intcfs-osd6
5 3.32669 osd.5 up 1.00000 1.00000
-8 3.27800 host intcfs-osd7
6 3.27800 osd.6 up 1.00000 1.00000

hm5 ~]# ceph osd df
ID WEIGHT REWEIGHT SIZE USE AVAIL %USE VAR PGS
0 3.29749 1.00000 3376G 2874G 502G 85.13 1.26 165
1 3.26869 1.00000 3347G 1922G 1424G 57.44 0.85 152
2 3.27339 1.00000 3351G 2009G 1342G 59.95 0.89 162
3 3.24089 1.00000 3318G 2130G 1188G 64.19 0.95 168
4 3.24089 1.00000 3318G 2996G 321G 90.30 1.34 176
5 3.32669 1.00000 3406G 2465G 940G 72.39 1.07 165
6 3.27800 1.00000 3356G 1435G 1921G 42.76 0.63 166
TOTAL 23476G 15834G 7641G 67.45
MIN/MAX VAR: 0.63/1.34 STDDEV: 15.29

Regards
Prabu GJ
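The per-PG lines in the `ceph health detail` output quoted above can be summarised programmatically. The Python sketch below counts how often each state flag occurs and lists the PGs whose acting set contains a single OSD, meaning only one copy is currently being served for a `size 2` pool. Only four lines from the listing are embedded as a sample, so the counts it prints describe the sample and not the whole cluster; the regular expression and names are assumptions for illustration.

```python
import re
from collections import Counter

# A small sample of lines copied from the quoted `ceph health detail` output.
SAMPLE = """\
pg 4.8c is active+undersized+degraded+remapped+wait_backfill, acting [1]
pg 3.66 is active+remapped+wait_backfill, acting [3,0]
pg 3.24 is active+remapped+backfilling, acting [1,2]
pg 4.af is active+undersized+degraded+remapped, acting [0]
"""

# Matches the short "pg X is <state>, acting [..]" form, not the "stuck unclean" form.
PG_RE = re.compile(r"^pg (\S+) is ([a-z_+]+), acting \[([\d,]+)\]$")


def summarise(text):
    """Return (state flag counts, PG ids that have exactly one acting OSD)."""
    flags = Counter()
    single_copy = []
    for line in text.splitlines():
        match = PG_RE.match(line.strip())
        if not match:
            continue
        pgid, state, acting = match.groups()
        flags.update(state.split("+"))
        if len(acting.split(",")) == 1:
            single_copy.append(pgid)
    return flags, single_copy


flags, single_copy = summarise(SAMPLE)
print(dict(flags))
print("PGs with one acting OSD:", single_copy)  # prints: PGs with one acting OSD: ['4.8c', '4.af']
```

Feeding the full listing into `summarise` instead of the sample would show how many PGs are running on a single replica while backfill is pending.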
_______________________________________________ ceph-users mailing list ceph-users@xxxxxxxxxxxxxx http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com