Hi,

I'm wondering why slow requests are being reported mainly at the point where the request has been put into the queue for processing by its PG ("currently queued_for_pg", see http://docs.ceph.com/docs/master/rados/troubleshooting/troubleshooting-osd/#debugging-slow-request).
Could it be due to too low pg_num/pgp_num values?

It looks like the slow requests mainly target default.rgw.buckets.data (pool id 20), volumes (pool id 3) and default.rgw.buckets.index (pool id 14):
2018-01-31 12:06:55.899557 osd.59 osd.59 10.212.32.22:6806/4413 38 : cluster [WRN] slow request 30.125793 seconds old, received at 2018-01-31 12:06:25.773675: osd_op(client.857003.0:126171692 3.a4fec1ad 3.a4fec1ad (undecoded) ack+ondisk+write+known_if_redirected e5722) currently queued_for_pg
By the way, how can I get more human-friendly client information from a log entry like the one above?
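For reference, the most detail I've found so far comes from the OSD admin socket (standard OSD admin socket commands, run on the host carrying the OSD; osd.59 is taken from the log line above, and <object-name> is a placeholder):

```shell
# Recently completed slow ops, with per-event timing breakdowns --
# usually easier to read than the one-line cluster log entry:
ceph daemon osd.59 dump_historic_ops

# Ops currently being processed (e.g. ones sitting in queued_for_pg):
ceph daemon osd.59 dump_ops_in_flight

# If an object name is known, map it to its PG and acting OSD set:
ceph osd map volumes <object-name>
```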
Current pg_num/pgp_num:
pool 3 'volumes' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 1024 pgp_num 1024 last_change 4502 flags hashpspool stripe_width 0 application rbd removed_snaps [1~3]
pool 14 'default.rgw.buckets.index' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 4502 flags hashpspool stripe_width 0 application rgw
pool 20 'default.rgw.buckets.data' erasure size 9 min_size 6 crush_rule 1 object_hash rjenkins pg_num 1024 pgp_num 1024 last_change 4502 flags hashpspool stripe_width 4224 application rgw
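As a sanity check I've been using the common rule of thumb of roughly 100 PGs per OSD across all pools (sketch only; the OSD count below is a placeholder, not our actual number):

```shell
# Rule-of-thumb PG budget (assumption: ~100 PGs per OSD; OSDS=120 is a
# placeholder -- substitute the real OSD count):
OSDS=120
TARGET_PGS_PER_OSD=100
REPLICAS=3
# Approximate total PG count the cluster can comfortably carry,
# to be divided among pools in proportion to their expected data:
echo $(( OSDS * TARGET_PGS_PER_OSD / REPLICAS ))
```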
Usage (excerpt, the three pools in question):

GLOBAL:
    SIZE     AVAIL     RAW USED     %RAW USED     OBJECTS
    385T      144T         241T         62.54      31023k
POOLS:
    NAME                        ID   QUOTA OBJECTS   QUOTA BYTES   USED     %USED   MAX AVAIL   OBJECTS    DIRTY    READ    WRITE    RAW USED
    volumes                     3    N/A             N/A           40351G   70.91   16557G      10352314   10109k   2130M   2520M    118T
    default.rgw.buckets.index   14   N/A             N/A           0        0       16557G      205        205      160M    27945k   0
    default.rgw.buckets.data    20   N/A             N/A           79190G   70.51   33115G      20865953   20376k   122M    113M     116T
# ceph osd pool ls detail
pool 0 'rbd' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 4502 flags hashpspool stripe_width 0 application rbd
pool 1 'vms' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 1024 pgp_num 1024 last_change 4502 flags hashpspool stripe_width 0 application rbd
pool 2 'images' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 512 pgp_num 512 last_change 5175 flags hashpspool stripe_width 0 application rbd
removed_snaps [1~7,14~2]
pool 3 'volumes' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 1024 pgp_num 1024 last_change 4502 flags hashpspool stripe_width 0 application rbd
removed_snaps [1~3]
pool 4 '.rgw.root' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 4502 flags hashpspool stripe_width 0 application rgw
pool 5 'default.rgw.control' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 4502 flags hashpspool stripe_width 0 application rgw
pool 6 'default.rgw.data.root' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 4502 flags hashpspool stripe_width 0 application rgw
pool 7 'default.rgw.gc' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 4502 flags hashpspool stripe_width 0 application rgw
pool 8 'default.rgw.log' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 4502 flags hashpspool stripe_width 0 application rgw
pool 9 'default.rgw.users.uid' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 4502 flags hashpspool stripe_width 0 application rgw
pool 10 'default.rgw.usage' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 4502 flags hashpspool stripe_width 0 application rgw
pool 11 'default.rgw.users.email' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 4502 owner 18446744073709551615 flags hashpspool stripe_width 0 application rgw
pool 12 'default.rgw.users.keys' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 4502 owner 18446744073709551615 flags hashpspool stripe_width 0 application rgw
pool 13 'default.rgw.users.swift' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 4502 flags hashpspool stripe_width 0 application rgw
pool 14 'default.rgw.buckets.index' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 4502 flags hashpspool stripe_width 0 application rgw
pool 15 'default.rgw.buckets.data.old' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 4502 flags hashpspool stripe_width 0 application rgw
pool 16 'default.rgw.buckets.non-ec' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 4502 flags hashpspool stripe_width 0 application rgw
pool 17 'default.rgw.buckets.extra' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 4502 flags hashpspool stripe_width 0 application rgw
pool 18 '.rgw.buckets.extra' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 4502 flags hashpspool stripe_width 0 application rgw
pool 20 'default.rgw.buckets.data' erasure size 9 min_size 6 crush_rule 1 object_hash rjenkins pg_num 1024 pgp_num 1024 last_change 4502 flags hashpspool stripe_width 4224 application rgw
pool 21 'benchmark_replicated' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 1024 pgp_num 1024 last_change 4550 flags hashpspool stripe_width 0 application rbd
removed_snaps [1~3]
pool 22 'benchmark_erasure_coded' erasure size 9 min_size 7 crush_rule 1 object_hash rjenkins pg_num 32 pgp_num 32 last_change 4552 flags hashpspool stripe_width 24576 application rbd
removed_snaps [1~3]
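If the 8-PG pools (the bucket index in particular) turn out to be the bottleneck, I assume the fix would look like the following (64 is only an illustration, not a recommendation):

```shell
# Hypothetical example: raise pg_num first, then pgp_num to match.
# Values are placeholders; note that splitting PGs triggers data movement.
ceph osd pool set default.rgw.buckets.index pg_num 64
ceph osd pool set default.rgw.buckets.index pgp_num 64
```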
# ceph df detail
GLOBAL:
    SIZE     AVAIL     RAW USED     %RAW USED     OBJECTS
    385T      144T         241T         62.54      31023k
POOLS:
    NAME                           ID   QUOTA OBJECTS   QUOTA BYTES   USED     %USED   MAX AVAIL   OBJECTS    DIRTY    READ     WRITE    RAW USED
    rbd                            0    N/A             N/A           0        0       16557G      0          0        1        134k     0
    vms                            1    N/A             N/A           0        0       16557G      0          0        0        0        0
    images                         2    N/A             N/A           7659M    0.05    16557G      1022       1022     51247    5668     22977M
    volumes                        3    N/A             N/A           40351G   70.91   16557G      10352314   10109k   2130M    2520M    118T
    .rgw.root                      4    N/A             N/A           1588     0       16557G      4          4        90       4        4764
    default.rgw.control            5    N/A             N/A           0        0       16557G      8          8        0        0        0
    default.rgw.data.root          6    N/A             N/A           93943    0       16557G      336        336      239k     6393     275k
    default.rgw.gc                 7    N/A             N/A           0        0       16557G      32         32       1773M    5281k    0
    default.rgw.log                8    N/A             N/A           0        0       16557G      185        185      22404k   14936k   0
    default.rgw.users.uid          9    N/A             N/A           3815     0       16557G      15         15       187k     53303    11445
    default.rgw.usage              10   N/A             N/A           0        0       16557G      7          7        278k     556k     0
    default.rgw.users.email        11   N/A             N/A           58       0       16557G      3          3        0        3        174
    default.rgw.users.keys         12   N/A             N/A           177      0       16557G      10         10       262      22       531
    default.rgw.users.swift        13   N/A             N/A           40       0       16557G      3          3        0        3        120
    default.rgw.buckets.index      14   N/A             N/A           0        0       16557G      205        205      160M     27945k   0
    default.rgw.buckets.data.old   15   N/A             N/A           668G     3.88    16557G      180867     176k     707k     2318k    2004G
    default.rgw.buckets.non-ec     16   N/A             N/A           0        0       16557G      114        114      17960    12024    0
    default.rgw.buckets.extra      17   N/A             N/A           0        0       16557G      0          0        0        0        0
    .rgw.buckets.extra             18   N/A             N/A           0        0       16557G      0          0        0        0        0
    default.rgw.buckets.data       20   N/A             N/A           79190G   70.51   33115G      20865953   20376k   122M     113M     116T
    benchmark_replicated           21   N/A             N/A           1415G    7.88    16557G      363800     355k     1338k    1251k    4247G
    benchmark_erasure_coded        22   N/A             N/A           11057M   0.03    33115G      2761       2761     398      5520     16586M
Thanks
Jakub
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com