Re: dealing with the full osd / help reweight


Thanks! I've set the parameters to the lower values and now the recovery process doesn't disrupt the rados gateway!

Regards,
J

On 03/26/2016 04:09 AM, lin zhou wrote:
Yeah, I think the main reason is the pg_num and pgp_num settings of some key pools.
This site will tell you the correct values: http://ceph.com/pgcalc/


Before you adjust pg_num and pgp_num, if this is a production environment, you should first set the throttling options as Christian Balzer suggested:
---
osd_max_backfills = 1
osd_backfill_scan_min = 4
osd_backfill_scan_max = 32
osd_recovery_max_active = 1
osd_recovery_threads = 1
osd_recovery_op_priority = 1
---
You can use 'ceph tell osd.* injectargs --osd-max-backfills 1' to change such a setting at runtime.
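
For example, a rough sketch of applying all of the values above to every OSD at runtime (the flag names correspond to the options listed above; injectargs changes are lost on restart, so also add the values under [osd] in ceph.conf if you want them to persist):

# throttle recovery and backfill on all OSDs at runtime
ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1 --osd-recovery-op-priority 1'
ceph tell osd.* injectargs '--osd-backfill-scan-min 4 --osd-backfill-scan-max 32'
# osd_recovery_threads resizes a thread pool and may only take effect after an
# OSD restart, so set it in ceph.conf rather than via injectargs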

The recommended process is to increase in steps: 8 -> 16 -> 32 -> 64 -> 128 -> 256 -> ... The more you increase pg_num at once, the more data will rebalance and the more time it will take, so be careful.
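
As a sketch, if pgcalc tells you that e.g. the .rgw.buckets pool (pg_num 64 in your listing) needs more PGs, the stepwise increase could look like this, waiting for the cluster to return to HEALTH_OK between steps:

ceph osd pool set .rgw.buckets pg_num 128
ceph osd pool set .rgw.buckets pgp_num 128
ceph -s    # wait for rebalancing to finish before the next step
ceph osd pool set .rgw.buckets pg_num 256
ceph osd pool set .rgw.buckets pgp_num 256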


--
hnuzhoulin2@xxxxxxxxx

On 2016-03-24 20:57:17, Jacek Jarosiewicz (jjarosiewicz@xxxxxxxxxxxxx) wrote:
[root@cf01 ceph]# ceph osd pool ls detail
pool 0 'vms' replicated size 2 min_size 1 crush_ruleset 0 object_hash
rjenkins pg_num 64 pgp_num 64 last_change 67 flags hashpspool stripe_width 0
pool 1 '.rgw.root' replicated size 2 min_size 1 crush_ruleset 0
object_hash rjenkins pg_num 8 pgp_num 8 last_change 117 flags hashpspool
stripe_width 0
pool 2 '.rgw.control' replicated size 2 min_size 1 crush_ruleset 0
object_hash rjenkins pg_num 8 pgp_num 8 last_change 118 flags hashpspool
stripe_width 0
pool 3 '.rgw.gc' replicated size 2 min_size 1 crush_ruleset 0
object_hash rjenkins pg_num 8 pgp_num 8 last_change 119 flags hashpspool
stripe_width 0
pool 4 '.rgw.buckets_cache' replicated size 2 min_size 1 crush_ruleset 0
object_hash rjenkins pg_num 8 pgp_num 8 last_change 121 flags hashpspool
stripe_width 0
pool 5 '.rgw.buckets.index' replicated size 2 min_size 1 crush_ruleset 0
object_hash rjenkins pg_num 8 pgp_num 8 last_change 122 flags hashpspool
stripe_width 0
pool 6 '.rgw.buckets.extra' replicated size 2 min_size 1 crush_ruleset 0
object_hash rjenkins pg_num 8 pgp_num 8 last_change 123 flags hashpspool
stripe_width 0
pool 7 '.log' replicated size 2 min_size 1 crush_ruleset 0 object_hash
rjenkins pg_num 8 pgp_num 8 last_change 124 flags hashpspool stripe_width 0
pool 8 '.intent-log' replicated size 2 min_size 1 crush_ruleset 0
object_hash rjenkins pg_num 8 pgp_num 8 last_change 125 flags hashpspool
stripe_width 0
pool 9 '.usage' replicated size 2 min_size 1 crush_ruleset 0 object_hash
rjenkins pg_num 8 pgp_num 8 last_change 126 flags hashpspool stripe_width 0
pool 10 '.users' replicated size 2 min_size 1 crush_ruleset 0
object_hash rjenkins pg_num 8 pgp_num 8 last_change 127 flags hashpspool
stripe_width 0
pool 11 '.users.email' replicated size 2 min_size 1 crush_ruleset 0
object_hash rjenkins pg_num 8 pgp_num 8 last_change 128 flags hashpspool
stripe_width 0
pool 12 '.users.swift' replicated size 2 min_size 1 crush_ruleset 0
object_hash rjenkins pg_num 8 pgp_num 8 last_change 129 flags hashpspool
stripe_width 0
pool 13 '.users.uid' replicated size 2 min_size 1 crush_ruleset 0
object_hash rjenkins pg_num 8 pgp_num 8 last_change 130 flags hashpspool
stripe_width 0
pool 14 '.rgw.buckets' replicated size 2 min_size 1 crush_ruleset 0
object_hash rjenkins pg_num 64 pgp_num 64 last_change 5717 flags
hashpspool stripe_width 0
pool 15 '.rgw' replicated size 2 min_size 1 crush_ruleset 0 object_hash
rjenkins pg_num 8 pgp_num 8 last_change 135 owner 18446744073709551615
flags hashpspool stripe_width 0
pool 17 'one' replicated size 3 min_size 2 crush_ruleset 0 object_hash
rjenkins pg_num 64 pgp_num 64 last_change 611 flags hashpspool
stripe_width 0
removed_snaps [1~e,13~1]

I've tried reweight-by-utilization, but after some data shifting the
cluster came up with a near-full OSD again...

Am I correct in assuming that a lower weight of an OSD means 'use that OSD
less'?

J

On 03/24/2016 01:43 PM, koukou73gr wrote:
What is your pool size? 304 PGs sounds awfully small for 20 OSDs.
More PGs will help distribute the data across OSDs more evenly.
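
As a sketch, one way to see how uneven the current distribution is before touching pg_num ('ceph osd df' assumes Hammer or later):

ceph osd df tree          # per-OSD utilization, variance and PG counts
ceph osd pool ls detail   # per-pool replica size, pg_num and pgp_num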

But with a full or near-full OSD on hand, increasing PGs is a no-no.
If you search the list archive, I believe there was a thread a month or
so ago which provided a walkthrough of sorts for dealing with uneven
distribution and a full OSD.

-K.


On 03/24/2016 01:54 PM, Jacek Jarosiewicz wrote:
Disk usage on the full OSD is shown below. What are the *_TEMP directories
for? How can I tell which PG directories are safe to remove?

[root@cf04 current]# du -hs *
156G 0.14_head
156G 0.21_head
155G 0.32_head
157G 0.3a_head
155G 0.e_head
156G 0.f_head
40K 10.2_head
4.0K 11.3_head
218G 14.13_head
218G 14.15_head
218G 14.1b_head
219G 14.1f_head
9.1G 14.29_head
219G 14.2a_head
75G 14.2d_head
125G 14.2e_head
113G 14.32_head
163G 14.33_head
218G 14.35_head
151G 14.39_head
218G 14.3b_head
103G 14.3d_head
217G 14.3f_head
219G 14.a_head
773M 17.0_head
814M 17.10_head
4.0K 17.10_TEMP
747M 17.19_head
4.0K 17.19_TEMP
669M 17.1b_head
659M 17.1c_head
638M 17.1f_head
681M 17.30_head
4.0K 17.30_TEMP
721M 17.34_head
695M 17.3d_head
726M 17.3e_head
734M 17.3f_head
4.0K 17.3f_TEMP
670M 17.d_head
597M 17.e_head
4.0K 17.e_TEMP
4.0K 1.7_head
34M 5.1_head
34M 5.6_head
4.0K 9.6_head
4.0K commit_op_seq
30M meta
0 nosnap
614M omap



On 03/24/2016 10:11 AM, Jacek Jarosiewicz wrote:
Hi!

I have a problem with the osds getting full on our cluster.









--
Jacek Jarosiewicz
IT Systems Administrator

----------------------------------------------------------------------------------------
SUPERMEDIA Sp. z o.o., registered office in Warsaw
ul. Senatorska 13/15, 00-075 Warszawa
District Court for the Capital City of Warsaw, 12th Commercial Division of the National Court Register,
KRS no. 0000029537; share capital PLN 44,556,000.00
NIP (tax ID): 957-05-49-503
Mailing address: ul. Jubilerska 10, 04-190 Warszawa

----------------------------------------------------------------------------------------
SUPERMEDIA -> http://www.supermedia.pl
internet access - hosting - colocation - data links - telephony
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



