Re: Cache Tiering Question

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

OK, I've set this up and now all I/O is locked up. I reduced
target_max_bytes because one OSD was reporting 97% usage. There was
some I/O for a few seconds while things flushed, but client I/O is
still blocked. Does anyone have any thoughts?

ceph osd crush rule create-simple ssd-tier ssd host firstn
ceph osd pool create ssd-pool 128 replicated ssd-tier
ceph osd tier add rbd ssd-pool
ceph osd tier cache-mode ssd-pool writeback
ceph osd tier set-overlay rbd ssd-pool
ceph osd pool set ssd-pool hit_set_type bloom
ceph osd pool set ssd-pool hit_set_count 6
ceph osd pool set ssd-pool hit_set_period 600
ceph osd pool set ssd-pool min_read_recency_for_promote 6
ceph osd pool set ssd-pool cache_target_dirty_ratio 0.4
ceph osd pool set ssd-pool cache_target_full_ratio 0.8
ceph osd pool set ssd-pool target_max_bytes 795642691584
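
For reference, a minimal sketch of the arithmetic behind these settings. It assumes (as stated further down in this thread) that the dirty and full ratios are applied to target_max_bytes and not to the raw capacity of the SSD OSDs. The function name is illustrative and not part of Ceph.

```python
from fractions import Fraction

GIB = 1024 ** 3
TARGET_MAX_BYTES = 795642691584  # value passed to `ceph osd pool set` above


def cache_thresholds(target_max_bytes, dirty_ratio, full_ratio):
    """Return (flush_start, evict_start) in bytes.

    Assumption: both ratios are fractions of target_max_bytes.
    Ratios are given as strings so Fraction parses them exactly.
    """
    flush_start = int(Fraction(dirty_ratio) * target_max_bytes)
    evict_start = int(Fraction(full_ratio) * target_max_bytes)
    return flush_start, evict_start


flush_start, evict_start = cache_thresholds(TARGET_MAX_BYTES, "0.4", "0.8")
print(f"target_max_bytes: {TARGET_MAX_BYTES / GIB:.1f} GiB")
print(f"flushing starts near {flush_start / GIB:.1f} GiB of dirty data")
print(f"eviction starts near {evict_start / GIB:.1f} GiB of cached data")
```

Under that assumption the 741 GiB target gives roughly 296 GiB for the dirty threshold and 593 GiB for the full threshold.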

ceph version 0.94.3-252-g629b631 (629b631488f044150422371ac77dfc005f3de1bc)

# ceph status
    cluster 050309fd-723e-42aa-9624-3b3e033ab359
     health HEALTH_OK
     monmap e1: 1 mons at {nodez=192.168.55.15:6789/0}
            election epoch 2, quorum 0 nodez
     osdmap e1333: 18 osds: 18 up, 18 in
      pgmap v87157: 384 pgs, 2 pools, 3326 GB data, 1368 kobjects
            3010 GB used, 20262 GB / 24518 GB avail
                 384 active+clean

# ceph osd df
ID WEIGHT  REWEIGHT SIZE   USE    AVAIL  %USE  VAR
18 0.20000  1.00000   208G   135G 64764M 64.63 4.77
 5 0.20999  1.00000   210G   181G 18392M 86.36 6.38
19 0.21999  1.00000   208G   161G 37941M 77.17 5.70
10 0.18999  1.00000   210G   167G 32712M 79.70 5.89
 7 0.20999  1.00000   210G   181G 18405M 86.35 6.38
20 0.20000  1.00000   208G   119G 80247M 57.39 4.24
22 0.20000  1.00000   208G 87596M   112G 40.95 3.02
 8 0.20999  1.00000   210G   170G 29422M 81.23 6.00
23 0.20999  1.00000   208G   151G 47404M 72.75 5.37
 1 0.20999  1.00000   210G   105G 96245M 50.17 3.71
 6 0.20999  1.00000   210G   131G 69937M 62.40 4.61
21 0.20000  1.00000   208G   192G  5667M 92.26 6.81
 0 3.64000  1.00000  3667G   231G  3249G  6.32 0.47
 9 3.57999  1.00000  3667G   262G  3219G  7.15 0.53
 2 3.64000  1.00000  3667G   273G  3207G  7.47 0.55
 3 3.64000  1.00000  3667G   256G  3224G  6.99 0.52
 4 3.64000  1.00000  3667G   239G  3241G  6.54 0.48
24 3.57999  1.00000  3667G   272G  3208G  7.42 0.55
              TOTAL 24518G  3320G 19952G 13.54
MIN/MAX VAR: 0.47/6.81  STDDEV: 48.64
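
The pool-level thresholds say nothing about individual OSDs, and the output above shows that the SSD OSDs are far from evenly filled. A small sketch that summarises the %USE column for the twelve SSD OSDs (rows copied from the output above); the parsing helper is illustrative only.

```python
# Rows for the SSD OSDs, copied verbatim from the `ceph osd df` output.
SSD_ROWS = """\
18 0.20000  1.00000   208G   135G 64764M 64.63 4.77
 5 0.20999  1.00000   210G   181G 18392M 86.36 6.38
19 0.21999  1.00000   208G   161G 37941M 77.17 5.70
10 0.18999  1.00000   210G   167G 32712M 79.70 5.89
 7 0.20999  1.00000   210G   181G 18405M 86.35 6.38
20 0.20000  1.00000   208G   119G 80247M 57.39 4.24
22 0.20000  1.00000   208G 87596M   112G 40.95 3.02
 8 0.20999  1.00000   210G   170G 29422M 81.23 6.00
23 0.20999  1.00000   208G   151G 47404M 72.75 5.37
 1 0.20999  1.00000   210G   105G 96245M 50.17 3.71
 6 0.20999  1.00000   210G   131G 69937M 62.40 4.61
21 0.20000  1.00000   208G   192G  5667M 92.26 6.81
"""


def parse_percent_used(text):
    """Map OSD id -> %USE (columns: ID WEIGHT REWEIGHT SIZE USE AVAIL %USE VAR)."""
    usage = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 8:
            usage[int(fields[0])] = float(fields[6])
    return usage


usage = parse_percent_used(SSD_ROWS)
mean_use = sum(usage.values()) / len(usage)
fullest = max(usage, key=usage.get)
emptiest = min(usage, key=usage.get)
print(f"mean %USE: {mean_use:.2f}")
print(f"fullest:  osd.{fullest} at {usage[fullest]:.2f}%")
print(f"emptiest: osd.{emptiest} at {usage[emptiest]:.2f}%")
```

On this data the fullest SSD OSD (osd.21) is about 21 points above the mean, which suggests a single OSD can get close to full well before the pool as a whole reaches target_max_bytes.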

After dropping target_max_bytes to 644470580183:
# ceph df
GLOBAL:
    SIZE       AVAIL      RAW USED     %RAW USED
    24518G     20241G        3031G         12.36
POOLS:
    NAME         ID     USED      %USED     MAX AVAIL     OBJECTS
    rbd          0      2856G     11.65         6379G     1158862
    ssd-pool     3       470G      1.92          162G      242140
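
The %USED column here is consistent with the suggestion quoted further down in this thread that the percentage is taken against the total cluster size and not against MAX AVAIL. A quick check using only the figures printed above; the function name is made up for illustration.

```python
CLUSTER_SIZE_G = 24518  # GLOBAL SIZE from `ceph df`


def percent_of_cluster(pool_used_g, cluster_size_g):
    """Pool USED as a percentage of the whole cluster, rounded like `ceph df`."""
    return round(100.0 * pool_used_g / cluster_size_g, 2)


rbd_pct = percent_of_cluster(2856, CLUSTER_SIZE_G)
ssd_pct = percent_of_cluster(470, CLUSTER_SIZE_G)
print(f"rbd:      {rbd_pct}")
print(f"ssd-pool: {ssd_pct}")
```

Both results match the %USED values reported for rbd and ssd-pool.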
# ceph osd df
ID WEIGHT  REWEIGHT SIZE   USE     AVAIL   %USE  VAR
18 0.20000  1.00000   208G    116G  83987M 55.64 4.50
 5 0.20999  1.00000   210G    151G  49392M 71.95 5.82
19 0.21999  1.00000   208G    134G  65792M 64.15 5.19
10 0.18999  1.00000   210G    138G  61961M 66.11 5.35
 7 0.20999  1.00000   210G    149G  50672M 71.36 5.77
20 0.20000  1.00000   208G 101842M 101167M 47.61 3.85
22 0.20000  1.00000   208G  72511M    127G 33.90 2.74
 8 0.20999  1.00000   210G    145G  55381M 69.17 5.59
23 0.20999  1.00000   208G    127G  72305M 61.11 4.94
 1 0.20999  1.00000   210G  95656M    105G 44.46 3.60
 6 0.20999  1.00000   210G    109G  92154M 52.07 4.21
21 0.20000  1.00000   208G    158G  40521M 75.97 6.14
 0 3.64000  1.00000  3667G    231G   3249G  6.32 0.51
 9 3.57999  1.00000  3667G    262G   3219G  7.15 0.58
 2 3.64000  1.00000  3667G    273G   3207G  7.47 0.60
 3 3.64000  1.00000  3667G    256G   3224G  6.99 0.57
 4 3.64000  1.00000  3667G    239G   3241G  6.54 0.53
24 3.57999  1.00000  3667G    272G   3208G  7.42 0.60
              TOTAL 24518G   3031G  20241G 12.36
MIN/MAX VAR: 0.51/6.14  STDDEV: 39.87
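
A rough comparison of the cache pool's usage with the reduced target. This is only a sketch under two assumptions that the thread does not confirm: that the "G" in `ceph df` means GiB, and that the pool's USED figure is what gets compared against target_max_bytes.

```python
GIB = 1024 ** 3
NEW_TARGET_MAX_BYTES = 644470580183  # reduced value mentioned above
FULL_RATIO = 0.8                     # cache_target_full_ratio
SSD_POOL_USED_GIB = 470              # ssd-pool USED from `ceph df`


def fill_fraction(pool_used_gib, target_max_bytes):
    """Fraction of target_max_bytes occupied by the cache pool."""
    return pool_used_gib * GIB / target_max_bytes


frac = fill_fraction(SSD_POOL_USED_GIB, NEW_TARGET_MAX_BYTES)
print(f"cache pool is at {frac:.1%} of target_max_bytes")
if frac >= FULL_RATIO:
    print("at or above cache_target_full_ratio")
else:
    print("below cache_target_full_ratio")
```

With these numbers the pool sits at about 78% of the new target, just under the full ratio of 0.8, which would be consistent with eviction having run until usage dropped below that threshold.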

# ceph osd tree
ID  WEIGHT   TYPE NAME           UP/DOWN REWEIGHT PRIMARY-AFFINITY
 -9  2.46991 root ssd
 -8  0.40999     host nodew-ssd
 18  0.20000         osd.18           up  1.00000          1.00000
  5  0.20999         osd.5            up  1.00000          1.00000
-10  0.40997     host nodev-ssd
 19  0.21999         osd.19           up  1.00000          1.00000
 10  0.18999         osd.10           up  1.00000          1.00000
-11  0.40999     host nodezz-ssd
  7  0.20999         osd.7            up  1.00000          1.00000
 20  0.20000         osd.20           up  1.00000          1.00000
-12  0.40999     host nodey-ssd
 22  0.20000         osd.22           up  1.00000          1.00000
  8  0.20999         osd.8            up  1.00000          1.00000
-13  0.41998     host nodex-ssd
 23  0.20999         osd.23           up  1.00000          1.00000
  1  0.20999         osd.1            up  1.00000          1.00000
-14  0.40999     host nodez-ssd
  6  0.20999         osd.6            up  1.00000          1.00000
 21  0.20000         osd.21           up  1.00000          1.00000
 -1 21.71997 root default
 -2  3.64000     host nodez
  0  3.64000         osd.0            up  1.00000          1.00000
 -3  3.57999     host nodew
  9  3.57999         osd.9            up  1.00000          1.00000
 -4  3.64000     host nodex
  2  3.64000         osd.2            up  1.00000          1.00000
 -5  3.64000     host nodey
  3  3.64000         osd.3            up  1.00000          1.00000
 -6  3.64000     host nodezz
  4  3.64000         osd.4            up  1.00000          1.00000
 -7  3.57999     host nodev
 24  3.57999         osd.24           up  1.00000          1.00000

# ceph osd crush rule dump
[
    {
        "rule_id": 0,
        "rule_name": "replicated_ruleset",
        "ruleset": 0,
        "type": 1,
        "min_size": 1,
        "max_size": 10,
        "steps": [
            {
                "op": "take",
                "item": -1,
                "item_name": "default"
            },
            {
                "op": "chooseleaf_firstn",
                "num": 0,
                "type": "host"
            },
            {
                "op": "emit"
            }
        ]
    },
    {
        "rule_id": 1,
        "rule_name": "ssd-tier",
        "ruleset": 1,
        "type": 1,
        "min_size": 1,
        "max_size": 10,
        "steps": [
            {
                "op": "take",
                "item": -9,
                "item_name": "ssd"
            },
            {
                "op": "chooseleaf_firstn",
                "num": 0,
                "type": "host"
            },
            {
                "op": "emit"
            }
        ]
    }
]
-----BEGIN PGP SIGNATURE-----
Version: Mailvelope v1.2.0
Comment: https://www.mailvelope.com

wsFcBAEBCAAQBQJWIR8JCRDmVDuy+mK58QAAhyAP/3LYWWxCtDUABwzW/rov
5NCHpKgVRkEAUTGIRESFp9egbhr2loaC1pjfkp911Shg6My/C3N6Y9q9MLdq
zy7zGSB/GL5XjvS0TurEjBihtDpMF2SbBk5NkrzgVc1fiOuA8UEZl8J2wBtF
R81UOluZVULzvmMjbH4uWfD1UovJl30LlAz/MocDJsDDejjfnsM3PXn8NSaE
4AyNkj8tXj8yMZIzxZV25O8NWZXq0JnuOwND+YxT9VxG8k1o3gqg7747j/Uz
0A9/fJ4IkMJdNGyMCVPgoTJy87CjeSfDf0MmK3S5bXtLfKKZTKYv0m/+B8PY
KzZcuVTavBhFSLWiT3L2U1OOyPz5AEu2ezE2Y6ElFePc+g38eO/I7kuTSixV
+0yZL1tO6vEYZLnwWTWgYFmmrOA5yTBvssGpjpZVPe7swkJG97kvqe/bh2/W
OqQ5PEnhn5Gx3vIDHJwvI/PT4MXZk2VU9cpPMPs7PeIQBPZYPi0/WcfT8m+g
oclkznsM+BSLMiTT8yBc7/T1kLFQXS42jVXEFAKYnJj8LIk0aMc54Gu25g0w
PM6+IFROsMQlGdybbWCPXIXsZ94JjJOBbA3jSP7XkesNvNC9fqlRDJwxBS7h
2F4cUwpZRJZGSAJzIRbbFdDZOftoUjtIiv+GAH1z54o+lq/sR+WNo1ALTB8k
uNQ8
=z47G
-----END PGP SIGNATURE-----
----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1


On Thu, Oct 15, 2015 at 5:49 PM, Christian Balzer <chibi@xxxxxxx> wrote:
>
> Hello,
>
> Having run into this myself two days ago (setting relative sizing values
> doesn't flush things when expected) I'd say that the documentation is
> highly misleading when it comes to the relative settings.
>
> And unclear when it comes to the size/object settings.
>
> Guess this section needs at least one nice red paragraph and some further
> explanations.
>
> Christian
>
> On Thu, 15 Oct 2015 17:33:30 -0600 Robert LeBlanc wrote:
>
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA256
>>
>> One more question. Is max_{bytes,objects} before or after replication
>> factor?
>> ----------------
>> Robert LeBlanc
>> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
>>
>>
>> On Thu, Oct 15, 2015 at 4:42 PM, LOPEZ Jean-Charles  wrote:
>> > Hi Robert,
>> >
>> > yes they do.
>> >
>> > Pools don’t have a size when you create them; they only have a number
>> > of PGs assigned. That is why the cache tiering mechanism needs both an
>> > absolute maximum and the ratios: the max values and the dirty and full
>> > ratios must be set explicitly to match your configuration.
>> >
>> > Note that you can define target_max_bytes and target_max_objects at
>> > the same time. Whichever of the two limits is breached first, after
>> > applying your ratio settings, will trigger flushing and/or eviction.
>> > The ratios you choose apply to both values.
>> >
>> > Cheers
>> > JC
>> >
>> >> On 15 Oct 2015, at 15:02, Robert LeBlanc  wrote:
>> >>
>> >> -----BEGIN PGP SIGNED MESSAGE-----
>> >> Hash: SHA256
>> >>
>> >> hmmm...
>> >>
>> >> http://docs.ceph.com/docs/master/rados/operations/cache-tiering/#relative-sizing
>> >>
>> >> makes it sound like it should be based on the size of the pool and
>> >> that you don't have to set anything like max bytes/objects. Can you
>> >> confirm that cache_target_{dirty,dirty_high,full}_ratio works as a
>> >> ratio of target_max_bytes set?
>> >> ----------------
>> >> Robert LeBlanc
>> >> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
>> >>
>> >>
>> >> On Thu, Oct 15, 2015 at 3:32 PM, Nick Fisk  wrote:
>> >>>
>> >>>
>> >>>
>> >>>
>> >>>> -----Original Message-----
>> >>>> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On
>> >>>> Behalf Of Robert LeBlanc
>> >>>> Sent: 15 October 2015 22:06
>> >>>> To: ceph-users@xxxxxxxxxxxxxx
>> >>>> Subject:  Cache Tiering Question
>> >>>>
>> >>>> -----BEGIN PGP SIGNED MESSAGE-----
>> >>>> Hash: SHA256
>> >>>>
>> >>>> ceph df (ceph version 0.94.3-252-g629b631
>> >>>> (629b631488f044150422371ac77dfc005f3de1bc)) is showing some odd
>> >>>> results:
>> >>>>
>> >>>> root@nodez:~# ceph df
>> >>>> GLOBAL:
>> >>>>    SIZE       AVAIL      RAW USED     %RAW USED
>> >>>>    24518G     21670G        1602G          6.53
>> >>>> POOLS:
>> >>>>    NAME         ID     USED      %USED     MAX AVAIL     OBJECTS
>> >>>>    rbd          0      2723G     11.11         6380G     1115793
>> >>>>    ssd-pool     2          0         0          732G           1
>> >>>>
>> >>>> The rbd pool is showing 11.11% used, but if you calculate the
>> >>>> numbers there it is 2723/6380=42.68%.
>> >>>
>> >>> I have a feeling that the percentage is based on the amount used
>> >>> out of the total cluster size, i.e. 2723/24518.
>> >>>
>> >>>>
>> >>>> Will this cause problems with the relative cache tier settings? Do
>> >>>> I need to set the percentage based on what Ceph is reporting here?
>> >>>
>> >>> The flushing/eviction thresholds are based on the target_max_bytes
>> >>> number that you set, they have nothing to do with the underlying
>> >>> pool size. It's up to you to come up with a sane number for this
>> >>> variable.
>> >>>
>> >>>>
>> >>>> Thanks,
>> >>>> ----------------
>> >>>> Robert LeBlanc
>> >>>> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
>> >>>> -----BEGIN PGP SIGNATURE-----
>> >>>> Version: Mailvelope v1.2.0
>> >>>> Comment: https://www.mailvelope.com
>> >>>>
>> >>>> wsFcBAEBCAAQBQJWIBVGCRDmVDuy+mK58QAAXEYQAKm5IBGn81Hlb9az4
>> >>>> 52x
>> >>>> hSH6onk7mJE7L2s5FnoJv2sNW4azhDEVKGQBE9vvhIVBhhtKtnqdzu3ytk6E
>> >>>> EUFuPBzUWLJyG3wQtp3QC0PdYzlGkS7bowdpZqk9PdaYZYgEdqG/cLEl/eAx
>> >>>> LGIUXmr6vIuNhnntGIIYeUAiWXA7b5qzOKbef6OlOp7Mz6Euel9S8ycZlSAR
>> >>>> eBQ5hdLSFoFai5ldyV+/hmqLnujOfanRFC8pIYr41aKe7wBOPOargLGQdka3
>> >>>> jswmcf+0hV7QqZSOjJijDYvOgRuHBFK6cdyP9SRKxWxG7uH+yDOvya0TqOob
>> >>>> 1yDomYC1zD2uzG9+L5Iv6at8fuBF5xFKPqax9N4WQj3Oj9fBwioQVBocNxHc
>> >>>> MIlQnvnLeq6OLtdfPoPignTAHIH2RrvAmdwYkSCuopjUSTkmBsyBLIiiz/KI
>> >>>> P4mSXAxZb0UF4pbCDgdYG6qUEywR/enGsT1lnmNLx4vY8W/yz9xQ3o3JnIpD
>> >>>> pWyo9zJ8Ugnwvihbo7xKe+EZOeJL0YF4BiyAprH5pKFdQcAWcV98zWHnLBxd
>> >>>> EFHyN9fHsVdw0UsxIUBZFfM1u4S7fchgVeFfiTSdGqd/dWHQCHKJPNBSJnae
>> >>>> aPKTyvg77N6zTn04VGspfenR+svGbkAtUfO2HJ1Kkd4/wZ9GIzsS1ovPZFsM
>> >>>> jJe4
>> >>>> =YSyj
>> >>>> -----END PGP SIGNATURE-----
>> >>>> _______________________________________________
>> >>>> ceph-users mailing list
>> >>>> ceph-users@xxxxxxxxxxxxxx
>> >>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> >>>
>> >>>
>> >>>
>> >>>
>> >>
>> >> -----BEGIN PGP SIGNATURE-----
>> >> Version: Mailvelope v1.2.0
>> >> Comment: https://www.mailvelope.com
>> >>
>> >> wsFcBAEBCAAQBQJWICJwCRDmVDuy+mK58QAAyTUQALkwOnB++bXto+cM0iSZ
>> >> B3nZgvl9FKZnujb0MUIiS29a+Y2nnBpAGgHbF4Y9ngnDQYNZ0yf1DD2wYad2
>> >> rll6pYeWRRYSmaBCBfdPlqbbVw8WpjdXLR9FtLFfUR2V+Ghf4U83F8iKiWn1
>> >> +6DqouHMA/auHjEr49w+Ue0kpKSfItH/9LkVjYQBKp6E7tyOSsrzcM1milKR
>> >> lwsIOewiKvsg4neDLqkdqaO6+bYuaDJmgN+hEqzl7lxbzt5pJbzfknpiAewm
>> >> GTw8C2AUbzcYqIhzqWcY9Jiy6ZZkYAPDODsJpkc/Pubnq73jlkllB4JaQpJy
>> >> 2964DynNn8jBAI9JJpLyldtKPEofmkumzZ6tPXgLDuo2VuV+hp/wVadZKy2k
>> >> PDhms1dpeLFM8NsgOToSpO6Ej1l1857C5+cy3EeTlKqgs6z1QbTwNvUeeCpk
>> >> /ORObJQCa7teNEM1c33oEJ3V1LOx7SfsEn1A6PVaaUegmMEEa6Cb8Va2RYl8
>> >> 5fhXqIcsU9KWHDmq8+MZ9x67etAucXKJmPQpIzJD6M9WtsWsDupsuJ1MgCKB
>> >> pxhqjwujuaZWfF+W3HEuOOP7OcXbj2U3RO1V3HOr9N0cLFTf+vuefIzOtgs1
>> >> qdBPrxIUNznfYXarclFuJzCWPzKpDTdKbLwYUcbh9hKayRpll3DGOW7qUX3u
>> >> eNXR
>> >> =cI+5
>> >> -----END PGP SIGNATURE-----
>> >
>>
>> -----BEGIN PGP SIGNATURE-----
>> Version: Mailvelope v1.2.0
>> Comment: https://www.mailvelope.com
>>
>> wsFcBAEBCAAQBQJWIDfDCRDmVDuy+mK58QAA8qkQAIBtEorNvkAwVojMmOcW
>> /zEGPw9Hg0OgvoR7gv4DWSKO4y8raek3oL7BNE5WNrpkRkpKfjGe6OLLtTr+
>> 9b7K19Cv3oRQHVUG2S+rnwDzsg/4ORL90TZZSh729ThjE823g9PDpB1ThsdD
>> DApHvU4OoLEYVepCkxzZx4a8UztyaBnDl8/LCNK7Rzg30UWsiR9kRW4bru5F
>> igcFHslBmUSH0trbG0kxA9mrmnWq2m7i0QNVS1nUDJ7crDwqnJrnf17NG7NV
>> SQKKsAcuM2lmmAPkLIMy4J1oiBb8JXiCc27Bj+dtBG9Iqh8HdYvvmVd6O8Jv
>> bVgMUN7mmGGpuIs040Q3Fn4wSrhtGc5iUpzM5eJnemnrPi5ymE8WayHX6aak
>> qA5vfM8WLNKMmPBORqg2DB/1co6OkvHOLAk+ZAUYUo88I+dVp7BIXadaZMhS
>> GKbTPfpZgDdn0bHbn4Dyma1a1JVarpQXCaLq4ayvfY7DQuoFVi2eOImxvc+Q
>> gFSmmdegK0uto3aTnySR1fRl1Yk9grd+LSwJgmsew4t2AHjAbAYgG1idnvJt
>> t5e6Aj4NnNK3f085gkoundV1rrp37lu3Ot82gMq7xyxNmlT/FsAmOFSEelJP
>> U26AQHlgDM7oV95IQMnKOtdziIq7NFdspuVuN+umf7JpnuYLbROSREG3dIrq
>> qdxB
>> =de2k
>> -----END PGP SIGNATURE-----
>
>
> --
> Christian Balzer        Network/Systems Engineer
> chibi@xxxxxxx           Global OnLine Japan/Fusion Communications
> http://www.gol.com/
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



