I started another fio test to one of the same RBDs (leaving the hung
ones still hung) and it is working OK, but the hung ones are still
just hung.
----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1


On Fri, Oct 16, 2015 at 10:00 AM, Robert LeBlanc wrote:
> OK, I've set this up and now all I/O is locked up. I've reduced
> target_max_bytes because one OSD was reporting 97% usage, there was
> some I/O for a few seconds as things flushed, but client I/O is still
> blocked. Anyone have some thoughts?
>
> ceph osd crush rule create-simple ssd-tier ssd host firstn
> ceph osd pool create ssd-pool 128 replicated ssd-tier
> ceph osd tier add rbd ssd-pool
> ceph osd tier cache-mode ssd-pool writeback
> ceph osd tier set-overlay rbd ssd-pool
> ceph osd pool set ssd-pool hit_set_type bloom
> ceph osd pool set ssd-pool hit_set_count 6
> ceph osd pool set ssd-pool hit_set_period 600
> ceph osd pool set ssd-pool min_read_recency_for_promote 6
> ceph osd pool set ssd-pool cache_target_dirty_ratio 0.4
> ceph osd pool set ssd-pool cache_target_full_ratio 0.8
> ceph osd pool set ssd-pool target_max_bytes 795642691584
>
> ceph version 0.94.3-252-g629b631 (629b631488f044150422371ac77dfc005f3de1bc)
>
> # ceph status
>     cluster 050309fd-723e-42aa-9624-3b3e033ab359
>      health HEALTH_OK
>      monmap e1: 1 mons at {nodez=192.168.55.15:6789/0}
>             election epoch 2, quorum 0 nodez
>      osdmap e1333: 18 osds: 18 up, 18 in
>       pgmap v87157: 384 pgs, 2 pools, 3326 GB data, 1368 kobjects
>             3010 GB used, 20262 GB / 24518 GB avail
>                  384 active+clean
>
> # ceph osd df
> ID  WEIGHT REWEIGHT   SIZE     USE   AVAIL  %USE  VAR
> 18 0.20000  1.00000   208G    135G  64764M 64.63 4.77
>  5 0.20999  1.00000   210G    181G  18392M 86.36 6.38
> 19 0.21999  1.00000   208G    161G  37941M 77.17 5.70
> 10 0.18999  1.00000   210G    167G  32712M 79.70 5.89
>  7 0.20999  1.00000   210G    181G  18405M 86.35 6.38
> 20 0.20000  1.00000   208G    119G  80247M 57.39 4.24
> 22 0.20000  1.00000   208G  87596M    112G 40.95 3.02
>  8 0.20999  1.00000   210G    170G  29422M 81.23 6.00
> 23 0.20999  1.00000   208G    151G  47404M 72.75 5.37
>  1 0.20999  1.00000   210G    105G  96245M 50.17 3.71
>  6 0.20999  1.00000   210G    131G  69937M 62.40 4.61
> 21 0.20000  1.00000   208G    192G   5667M 92.26 6.81
>  0 3.64000  1.00000  3667G    231G   3249G  6.32 0.47
>  9 3.57999  1.00000  3667G    262G   3219G  7.15 0.53
>  2 3.64000  1.00000  3667G    273G   3207G  7.47 0.55
>  3 3.64000  1.00000  3667G    256G   3224G  6.99 0.52
>  4 3.64000  1.00000  3667G    239G   3241G  6.54 0.48
> 24 3.57999  1.00000  3667G    272G   3208G  7.42 0.55
>               TOTAL 24518G   3320G  19952G 13.54
> MIN/MAX VAR: 0.47/6.81  STDDEV: 48.64
>
> After dropping target_max_bytes to 644470580183:
> # ceph df
> GLOBAL:
>     SIZE       AVAIL      RAW USED     %RAW USED
>     24518G     20241G        3031G         12.36
> POOLS:
>     NAME         ID     USED      %USED     MAX AVAIL     OBJECTS
>     rbd          0      2856G     11.65         6379G     1158862
>     ssd-pool     3       470G      1.92          162G      242140
> # ceph osd df
> ID  WEIGHT REWEIGHT   SIZE     USE   AVAIL  %USE  VAR
> 18 0.20000  1.00000   208G    116G  83987M 55.64 4.50
>  5 0.20999  1.00000   210G    151G  49392M 71.95 5.82
> 19 0.21999  1.00000   208G    134G  65792M 64.15 5.19
> 10 0.18999  1.00000   210G    138G  61961M 66.11 5.35
>  7 0.20999  1.00000   210G    149G  50672M 71.36 5.77
> 20 0.20000  1.00000   208G 101842M 101167M 47.61 3.85
> 22 0.20000  1.00000   208G  72511M    127G 33.90 2.74
>  8 0.20999  1.00000   210G    145G  55381M 69.17 5.59
> 23 0.20999  1.00000   208G    127G  72305M 61.11 4.94
>  1 0.20999  1.00000   210G  95656M    105G 44.46 3.60
>  6 0.20999  1.00000   210G    109G  92154M 52.07 4.21
> 21 0.20000  1.00000   208G    158G  40521M 75.97 6.14
>  0 3.64000  1.00000  3667G    231G   3249G  6.32 0.51
>  9 3.57999  1.00000  3667G    262G   3219G  7.15 0.58
>  2 3.64000  1.00000  3667G    273G   3207G  7.47 0.60
>  3 3.64000  1.00000  3667G    256G   3224G  6.99 0.57
>  4 3.64000  1.00000  3667G    239G   3241G  6.54 0.53
> 24 3.57999  1.00000  3667G    272G   3208G  7.42 0.60
>               TOTAL 24518G   3031G  20241G 12.36
> MIN/MAX VAR: 0.51/6.14  STDDEV: 39.87
>
> # ceph osd tree
> ID  WEIGHT   TYPE NAME             UP/DOWN REWEIGHT PRIMARY-AFFINITY
>  -9  2.46991 root ssd
>  -8  0.40999     host nodew-ssd
>  18  0.20000         osd.18             up  1.00000          1.00000
>   5  0.20999         osd.5              up  1.00000          1.00000
> -10  0.40997     host nodev-ssd
>  19  0.21999         osd.19             up  1.00000          1.00000
>  10  0.18999         osd.10             up  1.00000          1.00000
> -11  0.40999     host nodezz-ssd
>   7  0.20999         osd.7              up  1.00000          1.00000
>  20  0.20000         osd.20             up  1.00000          1.00000
> -12  0.40999     host nodey-ssd
>  22  0.20000         osd.22             up  1.00000          1.00000
>   8  0.20999         osd.8              up  1.00000          1.00000
> -13  0.41998     host nodex-ssd
>  23  0.20999         osd.23             up  1.00000          1.00000
>   1  0.20999         osd.1              up  1.00000          1.00000
> -14  0.40999     host nodez-ssd
>   6  0.20999         osd.6              up  1.00000          1.00000
>  21  0.20000         osd.21             up  1.00000          1.00000
>  -1 21.71997 root default
>  -2  3.64000     host nodez
>   0  3.64000         osd.0              up  1.00000          1.00000
>  -3  3.57999     host nodew
>   9  3.57999         osd.9              up  1.00000          1.00000
>  -4  3.64000     host nodex
>   2  3.64000         osd.2              up  1.00000          1.00000
>  -5  3.64000     host nodey
>   3  3.64000         osd.3              up  1.00000          1.00000
>  -6  3.64000     host nodezz
>   4  3.64000         osd.4              up  1.00000          1.00000
>  -7  3.57999     host nodev
>  24  3.57999         osd.24             up  1.00000          1.00000
>
> # ceph osd crush rule dump
> [
>     {
>         "rule_id": 0,
>         "rule_name": "replicated_ruleset",
>         "ruleset": 0,
>         "type": 1,
>         "min_size": 1,
>         "max_size": 10,
>         "steps": [
>             {
>                 "op": "take",
>                 "item": -1,
>                 "item_name": "default"
>             },
>             {
>                 "op": "chooseleaf_firstn",
>                 "num": 0,
>                 "type": "host"
>             },
>             {
>                 "op": "emit"
>             }
>         ]
>     },
>     {
>         "rule_id": 1,
>         "rule_name": "ssd-tier",
>         "ruleset": 1,
>         "type": 1,
>         "min_size": 1,
>         "max_size": 10,
>         "steps": [
>             {
>                 "op": "take",
>                 "item": -9,
>                 "item_name": "ssd"
>             },
>             {
>                 "op": "chooseleaf_firstn",
>                 "num": 0,
>                 "type": "host"
>             },
>             {
>                 "op": "emit"
>             }
>         ]
>     }
> ]
> ----------------
> Robert LeBlanc
> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
>
>
> On Thu, Oct 15, 2015 at 5:49 PM, Christian Balzer wrote:
>>
>> Hello,
>>
>> Having run into this myself two days ago (setting relative sizing values
>> doesn't flush things when expected) I'd say that the documentation is
>> highly misleading when it comes to the relative settings.
>>
>> And unclear when it comes to the size/object settings.
>>
>> Guess this section needs at least one nice red paragraph and some further
>> explanations.
>>
>> Christian
>>
>> On Thu, 15 Oct 2015 17:33:30 -0600 Robert LeBlanc wrote:
>>
>>> One more question. Is max_{bytes,objects} before or after replication
>>> factor?
>>> ----------------
>>> Robert LeBlanc
>>> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
>>>
>>>
>>> On Thu, Oct 15, 2015 at 4:42 PM, LOPEZ Jean-Charles wrote:
>>> > Hi Robert,
>>> >
>>> > yes they do.
>>> >
>>> > Pools don’t have a size when you create them hence the couple
>>> > value/ratio that is to be defined for cache tiering mechanism. Pool
>>> > only have a number of PGs assigned. So setting the max values and the
>>> > ratios for dirty and full must be set explicitly to match your
>>> > configuration.
>>> >
>>> > Note that you can at the same time define max_bytes and max_objects.
>>> > The first of the 2 values that breaches using your ratio settings will
>>> > trigger eviction and/or flushing. The ratios you choose apply to both
>>> > values.
>>> >
>>> > Cheers
>>> > JC
>>> >
>>> >> On 15 Oct 2015, at 15:02, Robert LeBlanc wrote:
>>> >>
>>> >> hmmm...
>>> >>
>>> >> http://docs.ceph.com/docs/master/rados/operations/cache-tiering/#relative-sizing
>>> >>
>>> >> makes it sound like it should be based on the size of the pool and
>>> >> that you don't have to set anything like max bytes/objects. Can you
>>> >> confirm that cache_target_{dirty,dirty_high,full}_ratio works as a
>>> >> ratio of target_max_bytes set?
>>> >> ----------------
>>> >> Robert LeBlanc
>>> >> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
>>> >>
>>> >>
>>> >> On Thu, Oct 15, 2015 at 3:32 PM, Nick Fisk wrote:
>>> >>>
>>> >>>> -----Original Message-----
>>> >>>> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On
>>> >>>> Behalf Of Robert LeBlanc
>>> >>>> Sent: 15 October 2015 22:06
>>> >>>> To: ceph-users@xxxxxxxxxxxxxx
>>> >>>> Subject: Cache Tiering Question
>>> >>>>
>>> >>>> ceph df (ceph version 0.94.3-252-g629b631
>>> >>>> (629b631488f044150422371ac77dfc005f3de1bc)) is showing some odd
>>> >>>> results:
>>> >>>>
>>> >>>> root@nodez:~# ceph df
>>> >>>> GLOBAL:
>>> >>>>     SIZE       AVAIL      RAW USED     %RAW USED
>>> >>>>     24518G     21670G        1602G          6.53
>>> >>>> POOLS:
>>> >>>>     NAME         ID     USED      %USED     MAX AVAIL     OBJECTS
>>> >>>>     rbd          0      2723G     11.11         6380G     1115793
>>> >>>>     ssd-pool     2          0         0          732G           1
>>> >>>>
>>> >>>> The rbd pool is showing 11.11% used, but if you calculate the
>>> >>>> numbers there it is 2723/6380=42.68%.
>>> >>>
>>> >>> I have a feeling that the percentage is based on the amount used of
>>> >>> the total cluster size. Ie 2723/24518
>>> >>>
>>> >>>>
>>> >>>> Will this cause problems with the relative cache tier settings? Do
>>> >>>> I need to set the percentage based on what Ceph is reporting here?
>>> >>>
>>> >>> The flushing/eviction thresholds are based on the target_max_bytes
>>> >>> number that you set, they have nothing to do with the underlying
>>> >>> pool size. It's up to you to come up with a sane number for this
>>> >>> variable.
>>> >>>
>>> >>>>
>>> >>>> Thanks,
>>> >>>> ----------------
>>> >>>> Robert LeBlanc
>>> >>>> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
>>> >>>> _______________________________________________
>>> >>>> ceph-users mailing list
>>> >>>> ceph-users@xxxxxxxxxxxxxx
>>> >>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>>
>> --
>> Christian Balzer        Network/Systems Engineer
>> chibi@xxxxxxx           Global OnLine Japan/Fusion Communications
>> http://www.gol.com/
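
For readers following the numbers in this thread, here is a minimal sketch of the arithmetic only. It assumes, as Nick and JC describe above, that the dirty and full ratios are applied to target_max_bytes rather than to the pool's real capacity, and that %USED in "ceph df" is USED divided by the global SIZE. The function names are made up for illustration; this is not Ceph code and does not model the tiering agent itself. It also says nothing about whether target_max_bytes counts bytes before or after replication, which was asked above but not answered.

```python
def cache_thresholds(target_max_bytes, dirty_ratio, full_ratio):
    """Return (flush_at, evict_at) in bytes.

    Assumption from the thread: both ratios are fractions of
    target_max_bytes, not of the underlying pool size.
    """
    flush_at = int(target_max_bytes * dirty_ratio)  # flushing of dirty objects
    evict_at = int(target_max_bytes * full_ratio)   # eviction of clean objects
    return flush_at, evict_at


def percent_used(used, size):
    """Percentage of `size` taken by `used`, rounded to two decimals."""
    return round(100.0 * used / size, 2)


if __name__ == "__main__":
    # Settings quoted in the thread: ratios 0.4 / 0.8, first with the
    # original target_max_bytes, then with the reduced value.
    for tmb in (795642691584, 644470580183):
        flush_at, evict_at = cache_thresholds(tmb, 0.4, 0.8)
        print(tmb, flush_at, evict_at)

    # The two candidate readings of %USED for the rbd pool (values in GB).
    print(percent_used(2723, 24518))  # against the global SIZE
    print(percent_used(2723, 6380))   # against the pool's MAX AVAIL
```

With the original setting, the flush point works out to roughly 318 GB and the evict point to roughly 637 GB. The first percent_used call reproduces the 11.11 that "ceph df" reported, which is consistent with Nick's reading that the figure is relative to the whole cluster.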