Re: read performance, separate client CRUSH maps or limit osd read access from each client

Martin, thank you for the tip.
Googling "ceph crush rule examples" doesn't turn up much on rules, just static placement of buckets.
All of this seems to be about placing data, not about giving a client in a specific datacenter the proper OSD to read from.

Maybe something is wrong with the placement groups?

I added datacenter dc1 dc2 dc3
Current replicated_rule is
rule replicated_rule {
    id 0
    type replicated
    min_size 1
    max_size 10
    step take default
    step chooseleaf firstn 0 type host
    step emit
}

# buckets
host ceph1 {
    id -3        # do not change unnecessarily
    id -2 class ssd    # do not change unnecessarily
    # weight 1.000
    alg straw2
    hash 0    # rjenkins1
    item osd.0 weight 1.000
}
datacenter dc1 {
    id -9        # do not change unnecessarily
    id -4 class ssd    # do not change unnecessarily
    # weight 1.000
    alg straw2
    hash 0    # rjenkins1
    item ceph1 weight 1.000
}
host ceph2 {
    id -5        # do not change unnecessarily
    id -6 class ssd    # do not change unnecessarily
    # weight 1.000
    alg straw2
    hash 0    # rjenkins1
    item osd.1 weight 1.000
}
datacenter dc2 {
    id -10        # do not change unnecessarily
    id -8 class ssd    # do not change unnecessarily
    # weight 1.000
    alg straw2
    hash 0    # rjenkins1
    item ceph2 weight 1.000
}
host ceph3 {
    id -7        # do not change unnecessarily
    id -12 class ssd    # do not change unnecessarily
    # weight 1.000
    alg straw2
    hash 0    # rjenkins1
    item osd.2 weight 1.000
}
datacenter dc3 {
    id -11        # do not change unnecessarily
    id -13 class ssd    # do not change unnecessarily
    # weight 1.000
    alg straw2
    hash 0    # rjenkins1
    item ceph3 weight 1.000
}
root default {
    id -1        # do not change unnecessarily
    id -14 class ssd    # do not change unnecessarily
    # weight 3.000
    alg straw2
    hash 0    # rjenkins1
    item dc1 weight 1.000
    item dc2 weight 1.000
    item dc3 weight 1.000
}
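
For what it's worth, a building-local read could be expressed by giving each datacenter its own rule that picks the local host first (that becomes the primary, which clients read from) and the replicas from the remote datacenters. A sketch for dc1, using the buckets above; the rule name and id are made up, and dc2/dc3 would each get an analogous rule with the `take` steps rotated:

```
rule dc1_primary {
    id 1
    type replicated
    min_size 1
    max_size 10
    # first replica (the primary) from the local datacenter
    step take dc1
    step chooseleaf firstn 1 type host
    step emit
    # remaining replicas from the remote datacenters
    step take dc2
    step chooseleaf firstn 1 type host
    step emit
    step take dc3
    step chooseleaf firstn 1 type host
    step emit
}
```

Each building would then need its own pool (size 3) pointed at its local rule, and clients in that building would use that pool.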


#ceph pg dump
dumped all
version 29433
stamp 2018-11-09 11:23:44.510872
last_osdmap_epoch 0
last_pg_scan 0
PG_STAT OBJECTS MISSING_ON_PRIMARY DEGRADED MISPLACED UNFOUND BYTES LOG DISK_LOG STATE STATE_STAMP VERSION REPORTED UP UP_PRIMARY ACTING ACTING_PRIMARY LAST_SCRUB SCRUB_STAMP LAST_DEEP_SCRUB DEEP_SCRUB_STAMP SNAPTRIMQ_LEN
1.5f 0 0 0 0 0 0 0 0 active+clean 2018-11-09 04:35:32.320607 0'0 544:1317 [0,2,1] 0 [0,2,1] 0 0'0 2018-11-09 04:35:32.320561 0'0 2018-11-04 11:55:54.756115 0
2.5c 143 0 143 0 0 19490267 461 461 active+undersized+degraded 2018-11-08 19:02:03.873218 508'461 544:2100 [2,1] 2 [2,1] 2 290'380 2018-11-07 18:58:43.043719 64'120 2018-11-05 14:21:49.256324 0
.....
sum 15239 0 2053 2659 0 2157615019 58286 58286
OSD_STAT USED AVAIL TOTAL HB_PEERS PG_SUM PRIMARY_PG_SUM
2 3.7 GiB 28 GiB 32 GiB [0,1] 200 73
1 3.7 GiB 28 GiB 32 GiB [0,2] 200 58
0 3.7 GiB 28 GiB 32 GiB [1,2] 173 69
sum 11 GiB 85 GiB 96 GiB

#ceph pg map 2.5c
osdmap e545 pg 2.5c (2.5c) -> up [2,1] acting [2,1]

#ceph pg map 1.5f
osdmap e547 pg 1.5f (1.5f) -> up [0,2,1] acting [0,2,1]
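
If that direction works, the workflow to add such rules is to decompile the CRUSH map, edit it, and inject it back, then point a pool at the new rule. A sketch (the pool name `rbd-dc1` and rule name `dc1_primary` are placeholder examples):

```shell
# decompile the current CRUSH map to text
ceph osd getcrushmap -o crushmap.bin
crushtool -d crushmap.bin -o crushmap.txt
# ... edit crushmap.txt to add the per-datacenter rules ...
# recompile and inject it back into the cluster
crushtool -c crushmap.txt -o crushmap-new.bin
ceph osd setcrushmap -i crushmap-new.bin

# point a building-local pool at its rule
ceph osd pool set rbd-dc1 crush_rule dc1_primary
```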

On Fri, Nov 9, 2018 at 2:21 AM Martin Verges <martin.verges@xxxxxxxx> wrote:
Hello Vlad,

Ceph clients connect to the primary OSD of each PG. If you create a
crush rule for building1 and one for building2, each taking an OSD from
its own building as the first replica, reads from the pool will always
stay in the local building (while the cluster is healthy), and only
write requests get replicated to the other building.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.verges@xxxxxxxx
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


2018-11-09 4:54 GMT+01:00 Vlad Kopylov <vladkopy@xxxxxxxxx>:
> I am trying to test replicated ceph with servers in different buildings, and
> I have a read problem.
> Reads from one building go to osd in another building and vice versa, making
> reads slower than writes! Making read as slow as the slowest node.
>
> Is there a way to
> - disable parallel read (so it reads only from the same osd node where mon
> is);
> - or give each client read restriction per osd?
> - or maybe strictly specify read osd on mount;
> - or have node read delay cap (for example if node time out is larger than 2
> ms then do not use such node for read as other replicas are available).
> - or ability to place Clients on the Crush map - so it understands that osd
> in - for example osd in the same data-center as client has preference, and
> pull data from it/them.
>
> Mounting with kernel client latest mimic.
>
> Thank you!
>
> Vlad
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
