Re: Troubleshooting hanging storage backend whenever there is any cluster change

I rebooted a Ceph host and logged `ceph status` and `ceph health detail`
every 5 seconds. During the reboot the cluster reported 'PG_AVAILABILITY
Reduced data availability: pgs peering', and at the same time some VMs
hung as described before.

See the log here: https://pastebin.com/wxUKzhgB
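
For reference, a loop along these lines is enough to capture that every 5
seconds (the timestamp format and log file name below are only
illustrative):

$ while sleep 5; do date -u '+[%F %T]'; ceph status; ceph health detail; done >> ceph-watch.log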

PG_AVAILABILITY is noted at timestamps [2018-10-12 12:16:15.403394] and
[2018-10-12 12:17:40.072655].

The Ceph docs say the following about PG_AVAILABILITY:

Data availability is reduced, meaning that the cluster is unable to
service potential read or write requests for some data in the cluster.
Specifically, one or more PGs is in a state that does not allow IO
requests to be serviced. Problematic PG states include peering, stale,
incomplete, and the lack of active (if those conditions do not clear
quickly).

Do you know why those PGs are stuck peering and how I might troubleshoot
this any further?
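
Would the output of something like the following, captured while it is
happening, help narrow this down? (The PG ID in the last command is a
placeholder.)

$ ceph pg dump_stuck inactive    # PGs stuck in a non-active state
$ ceph osd blocked-by            # OSDs that are blocking peering
$ ceph pg <pgid> query           # detailed peering state of one PG
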
On 11.10.2018 at 22:27, David Turner wrote:
> You should definitely stop using `size 3 min_size 1` on your pools.  Go
> back to the default `min_size 2`.  I'm a little confused why you have 3
> different CRUSH rules.  They're all identical.  You only need different
> CRUSH rules if you're using Erasure Coding or targeting a different set
> of OSDs like SSD vs HDD OSDs for different pools.
> 
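
Noted. For reference, I take it min_size is set back per pool, along the
lines of:

$ ceph osd pool set cephstor1 min_size 2
$ ceph osd pool set cephfs_cephstor1_data min_size 2
$ ceph osd pool set cephfs_cephstor1_metadata min_size 2
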
> All of that said, I don't see anything in those rules that would
> indicate why you're having problems with accessing your data when a node
> is being restarted.  The `ceph status` and `ceph health detail` outputs
> will be helpful while it's happening.
> 
> On Thu, Oct 11, 2018 at 3:02 PM Nils Fahldieck - Profihost AG
> <n.fahldieck@xxxxxxxxxxxx> wrote:
> 
>     Thanks for your reply. I'll capture a `ceph status` the next time I
>     encounter a non-working RBD. Here's the other output you asked for:
> 
>     $ ceph osd crush rule dump
>     [
>         {
>             "rule_id": 0,
>             "rule_name": "data",
>             "ruleset": 0,
>             "type": 1,
>             "min_size": 1,
>             "max_size": 10,
>             "steps": [
>                 {
>                     "op": "take",
>                     "item": -10000,
>                     "item_name": "root"
>                 },
>                 {
>                     "op": "chooseleaf_firstn",
>                     "num": 0,
>                     "type": "host"
>                 },
>                 {
>                     "op": "emit"
>                 }
>             ]
>         },
>         {
>             "rule_id": 1,
>             "rule_name": "metadata",
>             "ruleset": 1,
>             "type": 1,
>             "min_size": 1,
>             "max_size": 10,
>             "steps": [
>                 {
>                     "op": "take",
>                     "item": -10000,
>                     "item_name": "root"
>                 },
>                 {
>                     "op": "chooseleaf_firstn",
>                     "num": 0,
>                     "type": "host"
>                 },
>                 {
>                     "op": "emit"
>                 }
>             ]
>         },
>         {
>             "rule_id": 2,
>             "rule_name": "rbd",
>             "ruleset": 2,
>             "type": 1,
>             "min_size": 1,
>             "max_size": 10,
>             "steps": [
>                 {
>                     "op": "take",
>                     "item": -10000,
>                     "item_name": "root"
>                 },
>                 {
>                     "op": "chooseleaf_firstn",
>                     "num": 0,
>                     "type": "host"
>                 },
>                 {
>                     "op": "emit"
>                 }
>             ]
>         }
>     ]
> 
>     $ ceph osd pool ls detail
>     pool 5 'cephstor1' replicated size 3 min_size 1 crush_rule 0 object_hash
>     rjenkins pg_num 4096 pgp_num 4096 last_change 1217074 flags hashpspool
>     min_read_recency_for_promote 1 min_write_recency_for_promote 1
>     stripe_width 0 application rbd
>             removed_snaps
>     [1~9,b~1,d~7d1e8,7d1f6~3d05f,ba256~4bd9,bee30~357,bf188~5531,c46ba~85b3,ccc6e~b599,d820b~1,d820d~1,d820f~1,d8211~1,d8214~1,d8216~1,d8219~2,d821d~1,d821f~1,d8221~1,d8223~1,d8226~2,d8229~1,d822b~2,d822e~2,d8231~3,d8236~1,d8238~2,d823b~1,d823d~3,d8241~1,d8243~1,d8245~1,d8247~3,d824d~1,d824f~1,d8251~1,d8253~1,d8255~2,d8258~1,d825c~1,d825e~2,d8262~1,d8264~1,d8266~1,d8268~2,d826e~2,d8272~1,d8274~1,d8276~8,d8280~1,d8282~1,d8284~1,d8286~1,d8288~1,d828a~1,d828c~1,d828e~1,d8290~1,d8292~1,d8294~3,d8298~1,d829a~2,d829d~1,d82a0~4,d82a6~1,d82a8~2,d82ac~1,d82ae~1,d82b0~1,d82b2~1,d82b5~1,d82b7~1,d82b9~1,d82bb~1,d82bd~1,d82bf~1,d82c1~1,d82c3~2,d82c6~2,d82c9~1,d82cb~1,d82ce~1,d82d0~2,d82d3~1,d82d6~4,d82db~1,d82de~1,d82e0~1,d82e2~1,d82e4~1,d82e6~1,d82e8~1,d82ea~1,d82ed~1,d82ef~1,d82f1~1,d82f3~2,d82f7~2,d82fb~2,d82ff~1,d8301~1,d8303~1,d8305~1,d8307~1,d8309~1,d830b~1,d830e~1,d8311~2,d8314~3,d8318~1,d831a~1,d831c~1,d831f~3,d8323~2,d8329~1,d832b~2,d832f~1,d8331~1,d8333~1,d8335~1,d8338~6,d833f~1,d8341~1,d8343~1,d8345~2,d8349~2,d834c~1,d834e~1,d8350~1,d8352~1,d8354~1,d8356~4,d835b~1,d835d~2,d8360~1,d8362~3,d8366~3,d836b~3,d8370~1,d8372~1,d8374~1,d8376~3,d837a~1,d837c~1,d837e~2,d8381~1,d8383~1,d8385~1,d8387~3,d838b~2,d838e~4,d8393~1,d8396~1,d8398~2,d839b~1,d839d~2,d83a0~2,d83a3~1,d83a5~2,d83a9~2,d83ad~1,d83b0~2,d83b4~2,d83b8~1,d83ba~a,d83c5~1,d83c7~1,d83ca~1,d83cc~1,d83ce~1,d83d0~1,d83d2~6,d83d9~3,d83df~1,d83e1~2,d83e5~1,d83e8~1,d83eb~4,d83f0~1,d83f2~1,d83f4~3,d83f8~3,d83fd~2,d8402~1,d8405~1,d8407~1,d840a~2,d840f~1,d8411~1,d8413~3,d8417~3,d841c~4,d8422~4,d8428~2,d842b~1,d842e~1,d8430~1,d8432~5,d843a~1,d843c~3,d8440~5,d8447~1,d844a~1,d844d~1,d844f~1,d8452~1,d8455~1,d8457~1,d8459~2,d845d~2,d8460~1,d8462~3,d8467~1,d8469~1,d846b~2,d846e~2,d8471~4,d8476~6,d847d~3,d8482~1,d8484~1,d8486~2,d8489~2,d848c~1,d848e~1,d8491~4,d8499~1,d849c~3,d84a0~1,d84a2~1,d84a4~3,d84aa~2,d84ad~2,d84b1~4,d84b6~1,d84b8~1,d84ba~1,d84bc~1,d84be~1,d84c0~5,d84c7~4,d84ce~1,d84d0~1,d84d2~2,d84d6~2,d84db~1,d84dd~2,d84e2~2,d84e6~1,d84e9~1,d84eb~4,d84f0~4]
>     pool 6 'cephfs_cephstor1_data' replicated size 3 min_size 1 crush_rule 0
>     object_hash rjenkins pg_num 128 pgp_num 128 last_change 1214952 flags
>     hashpspool stripe_width 0 application cephfs
>     pool 7 'cephfs_cephstor1_metadata' replicated size 3 min_size 1
>     crush_rule 0 object_hash rjenkins pg_num 128 pgp_num 128 last_change
>     1214952 flags hashpspool stripe_width 0 application cephfs
> 
>     On 11.10.2018 at 20:47, David Turner wrote:
>     > My first guess is to ask what your crush rules are.  `ceph osd crush
>     > rule dump` along with `ceph osd pool ls detail` would be helpful.
>     > Also, a `ceph status` output from a time when the VM RBDs aren't
>     > working might explain something.
>     >
>     > On Thu, Oct 11, 2018 at 1:12 PM Nils Fahldieck - Profihost AG
>     > <n.fahldieck@xxxxxxxxxxxx> wrote:
>     >
>     >     Hi everyone,
>     >
>     >     for some time we have been experiencing service outages in our
>     >     Ceph cluster whenever there is any change to the HEALTH status,
>     >     e.g. swapping storage devices, adding storage devices, rebooting
>     >     Ceph hosts, during backfills etc.
>     >
>     >     Just now I had a situation where several VMs hung after I rebooted
>     >     one Ceph host. We have 3 replicas for each PG, 3 mons, 3 mgrs,
>     >     3 MDS and 71 OSDs spread over 9 hosts.
>     >
>     >     We use Ceph as a storage backend for our Proxmox VE (PVE)
>     >     environment. The outages show up as blocked virtual file systems
>     >     in the virtual machines running in our PVE cluster.
>     >
>     >     It feels similar to stuck and inactive PGs to me. Honestly,
>     >     though, I'm not really sure how to debug this problem or which
>     >     log files to examine.
>     >
>     >     OS: Debian 9
>     >     Kernel: 4.12 based upon SLE15-SP1
>     >
>     >     # ceph version
>     >     ceph version 12.2.8-133-gded2f6836f
>     >     (ded2f6836f6331a58f5c817fca7bfcd6c58795aa) luminous (stable)
>     >
>     >     Can someone guide me? I'm more than happy to provide more
>     >     information as needed.
>     >
>     >     Thanks in advance
>     >     Nils
>     >     _______________________________________________
>     >     ceph-users mailing list
>     >     ceph-users@xxxxxxxxxxxxxx
>     >     http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>     >
> 
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



