Re: poor data distribution

Hi,
Stats for pool 3 (.rgw.buckets), per-PG object counts (the sed strips the
last three digits, so the second column is roughly thousands of objects):

awk '$1 == "3" {print $2}' pg_pool_obj_size_up.txt | sed -e 's/...$//' | sort | uniq -c
    183 12
   6166 13
   1843 6

About 25% of the PGs are two times smaller than the rest.
I think this can be the reason for the strange data distribution on the OSDs.
Can I do something about it?
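For what it's worth, two size classes are what Ceph's ceph_stable_mod mapping produces when pg_num is not a power of two: hash values at or above pg_num (but below the next power of two) wrap onto lower PG ids, so those PGs carry a double share of the hash space. A small sketch of that arithmetic (my reading of the mapping, not an actual ceph command):

```shell
#!/bin/sh
# Sketch: for a given pg_num, estimate how many PGs receive a double
# share of the hash space under ceph_stable_mod. This is an illustration
# of the mapping, not a ceph CLI command.
pg_num=4800                   # .rgw.buckets before the increase

p2=1                          # next power of two >= pg_num
while [ "$p2" -lt "$pg_num" ]; do p2=$((p2 * 2)); done

double=$((p2 - pg_num))       # PGs that absorb a wrapped hash range
single=$((pg_num - double))   # PGs that keep a single range

echo "pg_num=$pg_num: $double double-share PGs, $single single-share PGs"
```

For pg_num 4800 this gives 3392 double-share and 1408 single-share PGs. If half-size PGs are still visible after raising pg_num, it may also be worth confirming that pgp_num was raised to the same value, since placement only rebalances once pgp_num follows pg_num.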

--
Regards
Dominik



2014-02-01 Dominik Mostowiec <dominikmostowiec@xxxxxxxxx>:
> Hi,
> Changing pg_num for .rgw.buckets to a power of 2 and 'crush tunables
> optimal' didn't help :(
>
> Graph: http://dysk.onet.pl/link/BZ968
>
> What can I do about this?
>
> Something is broken, because before the pg_num increase the cluster
> reported 10T of data; now it is:
>  18751 GB data, 34612 GB used, 20497 GB / 55110 GB avail;
>
> --
> Regards
> Dominik
>
> 2014-01-30 Sage Weil <sage@xxxxxxxxxxx>:
>> On Thu, 30 Jan 2014, Dominik Mostowiec wrote:
>>> Hi,
>>> Thanks for your response.
>>>
>>> > - with ~6,5k objects,  size ~1,4G
>>> > - with ~13k objects, size ~2,8G
>>> are in the biggest pool, 5 '.rgw.buckets'.
>>>
>>> > This is because pg_num is not a power of 2
>>> Is this for all PGs (summed over all pools), or for pool 5 '.rgw.buckets',
>>> where I have almost all the data?
>>
>> Each pool's pg_num is ideally a power of 2.
>>
>>>
>>> > Did you try, ceph osd crush tunables optimal
>>> No, I'll try it after changing pg_num to the correct value.
>>>
>>> --
>>> Regards
>>> Dominik
>>>
>>>
>>> 2014-01-30 Sage Weil <sage@xxxxxxxxxxx>:
>>> > On Thu, 30 Jan 2014, Dominik Mostowiec wrote:
>>> >> Hi,
>>> >> I found something else.
>>> >> 'ceph pg dump' shows PGs:
>>> >> - with zero or near zero objects count
>>> >
>>> > These are probably for a different pool than the big ones, right?  The
>>> > PG id is basically $pool.$shard.
>>> >
>>> >> - with ~6,5k objects,  size ~1,4G
>>> >> - with ~13k objects, size ~2,8G
>>> >> Can this be a reason for the wrong data distribution on the OSDs?
>>> >
>>> > This is because pg_num is not a power of 2.  Generally that won't result
>>> > in an imbalance as drastic as yours, though.  If you do adjust pg_num,
>>> > bump it up to a power of 2.
>>> >
>>> > Did you try
>>> >
>>> >  ceph osd crush tunables optimal
>>> >
>>> > ?
>>> > sage
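Putting the suggestions in this thread together, the commands would look roughly like this (a sketch: 8192 is the next power of two above 4800, `show-tunables` is only there to inspect the current profile, and both steps trigger heavy data movement on a full cluster; note that old kernel clients may not support the optimal tunables):

```shell
# Inspect the current CRUSH tunables
ceph osd crush show-tunables

# Switch to the optimal profile (triggers data movement)
ceph osd crush tunables optimal

# Raise pg_num to the next power of two, then pgp_num to match;
# PGs split on pg_num, but placement only rebalances on pgp_num
ceph osd pool set .rgw.buckets pg_num 8192
ceph osd pool set .rgw.buckets pgp_num 8192
```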
>>> >
>>> >
>>> >>
>>> >> ---
>>> >> Regards
>>> >> Dominik
>>> >>
>>> >>
>>> >> 2014-01-30 Dominik Mostowiec <dominikmostowiec@xxxxxxxxx>:
>>> >> > Hi,
>>> >> > I have a problem with data distribution.
>>> >> > Smallest disk usage is 40% vs. highest 82%.
>>> >> > Total PGs: 6504.
>>> >> > Almost all data is in the '.rgw.buckets' pool with pg_num 4800.
>>> >> > Is the best way to a better data distribution to increase pg_num in this pool?
>>> >> > Is there another way? (e.g. crush tunables, or something like that...)
>>> >> >
>>> >> > Config: 9 hosts X 22 OSD = 198 OSDs
>>> >> > replica = 3
>>> >> >
>>> >> > ceph -v
>>> >> > ceph version 0.67.5 (a60ac9194718083a4b6a225fc17cad6096c69bd1)
>>> >> >
>>> >> > ceph osd dump:
>>> >> > pool 0 'data' rep size 3 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 28459 owner 0 crash_replay_interval 45
>>> >> > pool 1 'metadata' rep size 3 min_size 1 crush_ruleset 1 object_hash rjenkins pg_num 64 pgp_num 64 last_change 28460 owner 0
>>> >> > pool 2 'rbd' rep size 3 min_size 1 crush_ruleset 2 object_hash rjenkins pg_num 64 pgp_num 64 last_change 28461 owner 0
>>> >> > pool 3 '.rgw.buckets' rep size 3 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 4800 pgp_num 4800 last_change 28462 owner 0
>>> >> > pool 4 '.log' rep size 3 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 1440 pgp_num 1440 last_change 28463 owner 0
>>> >> > pool 5 '.rgw' rep size 3 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 28464 owner 0
>>> >> > pool 6 '.users.uid' rep size 3 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 28465 owner 0
>>> >> > pool 7 '.users' rep size 3 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 28466 owner 0
>>> >> > pool 8 '.usage' rep size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 28467 owner 18446744073709551615
>>> >> > pool 9 '.intent-log' rep size 3 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 28468 owner 18446744073709551615
>>> >> > pool 10 '.rgw.control' rep size 3 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 33485 owner 18446744073709551615
>>> >> > pool 11 '.rgw.gc' rep size 3 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 33487 owner 18446744073709551615
>>> >> > pool 12 '.rgw.root' rep size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 44540 owner 0
>>> >> > pool 13 '' rep size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 46912 owner 0
>>> >> >
>>> >> > rados df:
>>> >> > pool name     category           KB   objects  clones  degraded  unfound          rd        rd KB          wr        wr KB
>>> >> >               -                   0         0       0         0        0           0            0           0            0
>>> >> > .intent-log   -                   0         0       0         0        0           1            0     1893949      1893697
>>> >> > .log          -               69222      6587       0         0        0           0            0   172911881    172884792
>>> >> > .rgw          -              114449    813564       0         0        0    51094886     39834914     3606364      1047115
>>> >> > .rgw.buckets  -         10371229682  54367877       0         0        0  2610759046  54967451976  1930820947  24031291524
>>> >> > .... (remaining pools are empty)
>>> >> >
>>> >> > --
>>> >> > Regards
>>> >> > Dominik
>>> >>
>>> >>
>>> >>
>>> >> --
>>> >> Regards
>>> >> Dominik
>>> >>
>>> >>
>>>
>>>
>>>
>>> --
>>> Regards
>>> Dominik
>>>
>>>
>
>
>
> --
> Regards
> Dominik



-- 
Regards
Dominik
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



