Re: Troubleshooting an erasure coded pool with a cache tier

It's all about the disk accesses. What's the slow part when you dump historic and in-progress ops?
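One way to answer that question is to pull the op dumps from an OSD's admin socket and rank the entries by duration. The following is only a minimal sketch: it assumes `ceph daemon osd.<id> dump_historic_ops` is run on the host that carries the OSD, that the JSON lists ops under an `Ops` (or `ops`) key, and that each entry has `duration` and `description` fields. The helper names `fetch_ops` and `slowest_ops` are hypothetical, and the field layout may differ between releases.

```python
import json
import subprocess


def fetch_ops(osd_id, command="dump_historic_ops"):
    """Run `ceph daemon osd.<id> <command>` locally and parse its JSON output.

    Must run on the host where the OSD's admin socket lives.
    """
    out = subprocess.check_output(["ceph", "daemon", f"osd.{osd_id}", command])
    return json.loads(out)


def slowest_ops(dump, limit=5):
    """Return (duration_seconds, description) pairs, slowest first."""
    # Assumption: ops are listed under "Ops" or "ops" depending on the release.
    ops = dump.get("Ops") or dump.get("ops") or []
    # Durations may be emitted as strings, so convert explicitly.
    timed = [(float(op["duration"]), op.get("description", "")) for op in ops]
    return sorted(timed, key=lambda pair: pair[0], reverse=True)[:limit]


if __name__ == "__main__":
    # Hypothetical usage: inspect OSD 0 on the local host.
    for duration, description in slowest_ops(fetch_ops(0)):
        print(f"{duration:8.3f}s  {description}")
```

The per-op event list in each entry (when present) is what shows which stage the time was spent in; the ranking above only narrows down which ops to read.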
On Sat, Nov 8, 2014 at 2:30 PM Loic Dachary <loic@xxxxxxxxxxx> wrote:
Hi Greg,

On 08/11/2014 20:19, Gregory Farnum wrote:
> When acting as a cache pool it needs to go do a lookup on the base pool for every object it hasn't encountered before. I assume that's why it's slower.
> (The penalty should not be nearly as high as you're seeing here, but based on the low numbers I imagine you're running everything on an overloaded laptop or something.)

It's running on a small cluster that is busy, but not to the point where I would expect such a difference:

# dsh --concurrent-shell --show-machine-names --remoteshellopt=-p2222 -m g1 -m g2 -m g3 -m n7 -m stri dstat -c 10 3
g1: ----total-cpu-usage----
g1: usr sys idl wai hiq siq
g1:   6   1  88   6   0   0
g2: ----total-cpu-usage----
g2: usr sys idl wai hiq siq
g2:   4   1  88   7   0   0
n7: ----total-cpu-usage----
n7: usr sys idl wai hiq siq
n7:  18   3  58  20   0   1
stri: ----total-cpu-usage----
stri: usr sys idl wai hiq siq
stri:   6   1  86   6   0   0
g3: ----total-cpu-usage----
g3: usr sys idl wai hiq siq
g3:  37   2  55   5   0   1
g1:   2   0  93   4   0   0
g2:   2   0  92   6   0   0
n7:  13   2  65  20   0   1
stri:   4   1  92   3   0   0
g3:  32   2  62   4   0   1
g1:   3   0  94   3   0   0
g2:   3   1  94   3   0   0
n7:  13   3  61  22   0   1
stri:   4   1  90   5   0   0
g3:  31   2  61   4   0   1
g1:   3   0  89   7   0   0
g2:   2   1  89   8   0   0
n7:  20   3  50  25   0   1
stri:   6   1  87   5   0   0
g3:  57   2  36   3   0   1
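Since the question is about disk accesses, the `wai` (I/O wait) column is the one to watch in the output above. As a rough sketch, the per-host mean can be computed from the dsh-prefixed dstat lines like this. The function name `mean_iowait` is hypothetical, and the sample embeds only the g1 and n7 rows copied from the transcript. Note that the first dstat row per host is typically an average since boot rather than a 10-second sample.

```python
from collections import defaultdict

# Rows for g1 and n7 copied from the dstat transcript above.
SAMPLE = """\
g1: ----total-cpu-usage----
g1: usr sys idl wai hiq siq
g1:   6   1  88   6   0   0
n7: ----total-cpu-usage----
n7: usr sys idl wai hiq siq
n7:  18   3  58  20   0   1
g1:   2   0  93   4   0   0
n7:  13   2  65  20   0   1
g1:   3   0  94   3   0   0
n7:  13   3  61  22   0   1
g1:   3   0  89   7   0   0
n7:  20   3  50  25   0   1
"""


def mean_iowait(text):
    """Return {host: mean 'wai' percentage} from dsh-prefixed `dstat -c` output."""
    samples = defaultdict(list)
    for line in text.splitlines():
        host, sep, rest = line.partition(":")
        fields = rest.split()
        # Data rows have six integer columns: usr sys idl wai hiq siq.
        # Header and banner rows fail this check and are skipped.
        if not sep or len(fields) != 6 or not all(f.isdigit() for f in fields):
            continue
        samples[host.strip()].append(int(fields[3]))
    return {host: sum(values) / len(values) for host, values in samples.items()}


if __name__ == "__main__":
    for host, wai in sorted(mean_iowait(SAMPLE).items()):
        print(f"{host}: {wai:.2f}% iowait")
```

On these rows n7 spends roughly four times as long waiting on I/O as g1, which makes it the first host worth checking.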

# ceph tell osd.\* version
osd.0: { "version": "ceph version 0.80.6 (f93610a4421cb670b08e974c6550ee715ac528ae)"}
osd.1: { "version": "ceph version 0.80.6 (f93610a4421cb670b08e974c6550ee715ac528ae)"}
osd.2: { "version": "ceph version 0.80.6 (f93610a4421cb670b08e974c6550ee715ac528ae)"}
osd.3: { "version": "ceph version 0.80.6 (f93610a4421cb670b08e974c6550ee715ac528ae)"}
osd.4: { "version": "ceph version 0.80.6 (f93610a4421cb670b08e974c6550ee715ac528ae)"}
osd.5: { "version": "ceph version 0.80.6 (f93610a4421cb670b08e974c6550ee715ac528ae)"}
osd.6: { "version": "ceph version 0.80.6 (f93610a4421cb670b08e974c6550ee715ac528ae)"}
osd.7: { "version": "ceph version 0.80.6 (f93610a4421cb670b08e974c6550ee715ac528ae)"}
osd.8: { "version": "ceph version 0.80.6 (f93610a4421cb670b08e974c6550ee715ac528ae)"}
osd.9: { "version": "ceph version 0.80.6 (f93610a4421cb670b08e974c6550ee715ac528ae)"}
osd.10: { "version": "ceph version 0.80.6 (f93610a4421cb670b08e974c6550ee715ac528ae)"}
osd.11: { "version": "ceph version 0.80.6 (f93610a4421cb670b08e974c6550ee715ac528ae)"}
osd.12: { "version": "ceph version 0.80.6 (f93610a4421cb670b08e974c6550ee715ac528ae)"}
osd.13: { "version": "ceph version 0.80.6 (f93610a4421cb670b08e974c6550ee715ac528ae)"}
osd.14: { "version": "ceph version 0.80.6 (f93610a4421cb670b08e974c6550ee715ac528ae)"}
osd.15: { "version": "ceph version 0.80.6 (f93610a4421cb670b08e974c6550ee715ac528ae)"}

Cheers

> -Greg
> On Sat, Nov 8, 2014 at 11:14 AM Loic Dachary <loic@xxxxxxxxxxx> wrote:
>
>     Hi,
>
>     This is a first attempt; it is entirely possible that the solution is simple or RTFM ;-)
>
>     Here is the problem observed:
>
>     rados --pool ec4p1 bench 120 write # the erasure coded pool
>     Total time run:         147.207804
>     Total writes made:      458
>     Write size:             4194304
>     Bandwidth (MB/sec):     12.445
>
>     rados --pool disks bench 120 write # same crush ruleset as the cache tier
>     Total time run:         126.312601
>     Total writes made:      1092
>     Write size:             4194304
>     Bandwidth (MB/sec):     34.581
>
>     There must be something wrong in how the cache tier is set up: one would expect the same write speed, since the total size written (a few GB) is lower than the size of the cache pool. Instead, the write speed through the cache tier is consistently at least twice as slow (12.445 * 2 < 34.581).
>
>     root@g1:~# ceph osd dump | grep disks
>     pool 58 'disks' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 128 pgp_num 128 last_change 15110 lfor 12228 flags hashpspool stripe_width 0
>     root@g1:~# ceph osd dump | grep ec4
>     pool 74 'ec4p1' erasure size 5 min_size 4 crush_ruleset 2 object_hash rjenkins pg_num 32 pgp_num 32 last_change 15604 lfor 15604 flags hashpspool tiers 75 read_tier 75 write_tier 75 stripe_width 4096
>     pool 75 'ec4p1c' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 12 pgp_num 12 last_change 15613 flags hashpspool,incomplete_clones tier_of 74 cache_mode writeback target_bytes 1000000000 target_objects 1000000000 hit_set bloom{false_positive_probability: 0.05, target_size: 0, seed: 0} 3600s x1 stripe_width 0
>
>     root@g1:~# ceph df
>     GLOBAL:
>         SIZE       AVAIL      RAW USED     %RAW USED
>         26955G     18850G        6735G         24.99
>     POOLS:
>         NAME            ID     USED       %USED     MAX AVAIL     OBJECTS
>     ..
>         disks           58      1823G      6.76         5305G      471080
>     ..
>         ec4p1           74       589G      2.19        12732G      153623
>         ec4p1c          75     57501k         0         5305G         491
>
>
>     Cheers
>     --
>     Loïc Dachary, Artisan Logiciel Libre
>
>     _________________________________________________
>     ceph-users mailing list
>     ceph-users@xxxxxxxxxxxxxx
>     http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
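The two bandwidth figures in the quoted benchmark can be cross-checked from the totals that `rados bench` printed. This is a small sketch, assuming the reported MB/sec is the number of writes times the write size, in units of 1024 * 1024 bytes, divided by the total run time. The function name `bench_bandwidth_mb_s` is hypothetical.

```python
def bench_bandwidth_mb_s(writes, write_size_bytes, seconds):
    """Bandwidth in MB/s, with 1 MB taken as 1024 * 1024 bytes (assumption)."""
    return writes * write_size_bytes / (1024 * 1024) / seconds


# Totals copied from the two quoted runs.
ec_tiered = bench_bandwidth_mb_s(458, 4194304, 147.207804)   # pool ec4p1
replicated = bench_bandwidth_mb_s(1092, 4194304, 126.312601)  # pool disks

if __name__ == "__main__":
    print(f"ec4p1 (via cache tier): {ec_tiered:.3f} MB/s")
    print(f"disks (replicated):     {replicated:.3f} MB/s")
    print(f"ratio:                  {replicated / ec_tiered:.2f}x")
```

Under that assumption both reported values are reproduced, and the replicated pool comes out about 2.78 times faster than the erasure coded pool behind its cache tier.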

--
Loïc Dachary, Artisan Logiciel Libre

