Re: Troubleshooting hanging storage backend whenever there is any cluster change

You should definitely stop using `size 3 min_size 1` on your pools.  Go back to the default `min_size 2`.  I'm a little confused why you have 3 different CRUSH rules, since they're all identical.  You only need different CRUSH rules if you're using Erasure Coding or targeting different sets of OSDs (e.g. SSDs vs. HDDs) for different pools.
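
A minimal sketch of that change, assuming the pool names from the
`ceph osd pool ls detail` output below (adjust if yours differ):

    # raise min_size back to the default of 2 on each replicated pool
    ceph osd pool set cephstor1 min_size 2
    ceph osd pool set cephfs_cephstor1_data min_size 2
    ceph osd pool set cephfs_cephstor1_metadata min_size 2

Since all three pools already reference crush_rule 0 ("data"), the
"metadata" and "rbd" rules look unused; once you've confirmed that,
they could be dropped with `ceph osd crush rule rm metadata` and
`ceph osd crush rule rm rbd`.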

All of that said, I don't see anything in those rules that would explain why you're having problems accessing your data while a node is being restarted.  The `ceph status` and `ceph health detail` outputs captured while it's happening will be helpful.
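
If it's hard to catch by hand, a rough sketch like this (run from any
node with admin access to the cluster; the log path and interval are
just examples) will record the state repeatedly so you can pull out
the window where the RBDs hang:

    # append cluster state to a log every 10 seconds while reproducing the issue
    while true; do
        date >> /tmp/ceph-state.log
        ceph status >> /tmp/ceph-state.log
        ceph health detail >> /tmp/ceph-state.log
        sleep 10
    done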

On Thu, Oct 11, 2018 at 3:02 PM Nils Fahldieck - Profihost AG <n.fahldieck@xxxxxxxxxxxx> wrote:
Thanks for your reply. I'll capture a `ceph status` the next time I
encounter an RBD that isn't working. Here's the other output you asked for:

$ ceph osd crush rule dump
[
    {
        "rule_id": 0,
        "rule_name": "data",
        "ruleset": 0,
        "type": 1,
        "min_size": 1,
        "max_size": 10,
        "steps": [
            {
                "op": "take",
                "item": -10000,
                "item_name": "root"
            },
            {
                "op": "chooseleaf_firstn",
                "num": 0,
                "type": "host"
            },
            {
                "op": "emit"
            }
        ]
    },
    {
        "rule_id": 1,
        "rule_name": "metadata",
        "ruleset": 1,
        "type": 1,
        "min_size": 1,
        "max_size": 10,
        "steps": [
            {
                "op": "take",
                "item": -10000,
                "item_name": "root"
            },
            {
                "op": "chooseleaf_firstn",
                "num": 0,
                "type": "host"
            },
            {
                "op": "emit"
            }
        ]
    },
    {
        "rule_id": 2,
        "rule_name": "rbd",
        "ruleset": 2,
        "type": 1,
        "min_size": 1,
        "max_size": 10,
        "steps": [
            {
                "op": "take",
                "item": -10000,
                "item_name": "root"
            },
            {
                "op": "chooseleaf_firstn",
                "num": 0,
                "type": "host"
            },
            {
                "op": "emit"
            }
        ]
    }
]

$ ceph osd pool ls detail
pool 5 'cephstor1' replicated size 3 min_size 1 crush_rule 0 object_hash
rjenkins pg_num 4096 pgp_num 4096 last_change 1217074 flags hashpspool
min_read_recency_for_promote 1 min_write_recency_for_promote 1
stripe_width 0 application rbd
        removed_snaps
[1~9,b~1,d~7d1e8,7d1f6~3d05f,ba256~4bd9,bee30~357,bf188~5531,c46ba~85b3,ccc6e~b599,d820b~1,d820d~1,d820f~1,d8211~1,d8214~1,d8216~1,d8219~2,d821d~1,d821f~1,d8221~1,d8223~1,d8226~2,d8229~1,d822b~2,d822e~2,d8231~3,d8236~1,d8238~2,d823b~1,d823d~3,d8241~1,d8243~1,d8245~1,d8247~3,d824d~1,d824f~1,d8251~1,d8253~1,d8255~2,d8258~1,d825c~1,d825e~2,d8262~1,d8264~1,d8266~1,d8268~2,d826e~2,d8272~1,d8274~1,d8276~8,d8280~1,d8282~1,d8284~1,d8286~1,d8288~1,d828a~1,d828c~1,d828e~1,d8290~1,d8292~1,d8294~3,d8298~1,d829a~2,d829d~1,d82a0~4,d82a6~1,d82a8~2,d82ac~1,d82ae~1,d82b0~1,d82b2~1,d82b5~1,d82b7~1,d82b9~1,d82bb~1,d82bd~1,d82bf~1,d82c1~1,d82c3~2,d82c6~2,d82c9~1,d82cb~1,d82ce~1,d82d0~2,d82d3~1,d82d6~4,d82db~1,d82de~1,d82e0~1,d82e2~1,d82e4~1,d82e6~1,d82e8~1,d82ea~1,d82ed~1,d82ef~1,d82f1~1,d82f3~2,d82f7~2,d82fb~2,d82ff~1,d8301~1,d8303~1,d8305~1,d8307~1,d8309~1,d830b~1,d830e~1,d8311~2,d8314~3,d8318~1,d831a~1,d831c~1,d831f~3,d8323~2,d8329~1,d832b~2,d832f~1,d8331~1,d8333~1,d8335~1,d8338~6,d833f~1,d8341~1,d8343~1,d8345~2,d8349~2,d834c~1,d834e~1,d8350~1,d8352~1,d8354~1,d8356~4,d835b~1,d835d~2,d8360~1,d8362~3,d8366~3,d836b~3,d8370~1,d8372~1,d8374~1,d8376~3,d837a~1,d837c~1,d837e~2,d8381~1,d8383~1,d8385~1,d8387~3,d838b~2,d838e~4,d8393~1,d8396~1,d8398~2,d839b~1,d839d~2,d83a0~2,d83a3~1,d83a5~2,d83a9~2,d83ad~1,d83b0~2,d83b4~2,d83b8~1,d83ba~a,d83c5~1,d83c7~1,d83ca~1,d83cc~1,d83ce~1,d83d0~1,d83d2~6,d83d9~3,d83df~1,d83e1~2,d83e5~1,d83e8~1,d83eb~4,d83f0~1,d83f2~1,d83f4~3,d83f8~3,d83fd~2,d8402~1,d8405~1,d8407~1,d840a~2,d840f~1,d8411~1,d8413~3,d8417~3,d841c~4,d8422~4,d8428~2,d842b~1,d842e~1,d8430~1,d8432~5,d843a~1,d843c~3,d8440~5,d8447~1,d844a~1,d844d~1,d844f~1,d8452~1,d8455~1,d8457~1,d8459~2,d845d~2,d8460~1,d8462~3,d8467~1,d8469~1,d846b~2,d846e~2,d8471~4,d8476~6,d847d~3,d8482~1,d8484~1,d8486~2,d8489~2,d848c~1,d848e~1,d8491~4,d8499~1,d849c~3,d84a0~1,d84a2~1,d84a4~3,d84aa~2,d84ad~2,d84b1~4,d84b6~1,d84b8~1,d84ba~1,d84bc~1,d84be~1,d84c0~5,d84c7~4,d84ce~1,d84d0~1,d84d2~2,d84d6~2,d84db~1,d84dd~2,d84e2~2,d84e6~1,d84e9~1,d84eb~4,d84f0~4]
pool 6 'cephfs_cephstor1_data' replicated size 3 min_size 1 crush_rule 0
object_hash rjenkins pg_num 128 pgp_num 128 last_change 1214952 flags
hashpspool stripe_width 0 application cephfs
pool 7 'cephfs_cephstor1_metadata' replicated size 3 min_size 1
crush_rule 0 object_hash rjenkins pg_num 128 pgp_num 128 last_change
1214952 flags hashpspool stripe_width 0 application cephfs

On 11.10.2018 at 20:47, David Turner wrote:
> My first guess is to ask what your crush rules are.  `ceph osd crush
> rule dump` along with `ceph osd pool ls detail` would be helpful.  Also,
> a `ceph status` output from a time when the VM RBDs aren't working
> might explain something.
>
> On Thu, Oct 11, 2018 at 1:12 PM Nils Fahldieck - Profihost AG
> <n.fahldieck@xxxxxxxxxxxx> wrote:
>
>     Hi everyone,
>
>     For some time we have been experiencing service outages in our Ceph
>     cluster whenever there is any change to the HEALTH status, e.g. when
>     swapping storage devices, adding storage devices, rebooting Ceph
>     hosts or during backfills.
>
>     Just now I had a situation where several VMs hung after I rebooted
>     one Ceph host. We have 3 replicas for each PG, 3 mons, 3 mgrs, 3 MDS
>     and 71 OSDs spread over 9 hosts.
>
>     We use Ceph as a storage backend for our Proxmox VE (PVE) environment.
>     The outages show up as blocked file systems inside the virtual
>     machines running in our PVE cluster.
>
>     It feels similar to stuck and inactive PGs to me. Honestly, though,
>     I'm not really sure how to debug this problem or which log files to
>     examine.
>
>     OS: Debian 9
>     Kernel: 4.12 based upon SLE15-SP1
>
>     # ceph version
>     ceph version 12.2.8-133-gded2f6836f
>     (ded2f6836f6331a58f5c817fca7bfcd6c58795aa) luminous (stable)
>
>     Can someone guide me? I'm more than happy to provide more information
>     as needed.
>
>     Thanks in advance
>     Nils
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
