Re: mons fail as soon as I attempt to mount

Hi Jeremy. Since you say the mons fail, could you share the logs of the failing mons? It is hard to diagnose with so little information.
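
If the cluster was deployed with cephadm, something along these lines should capture a mon's log around a failed mount attempt. This is only a sketch: mon.cn01 and the fsid are taken from the ceph -s output quoted below, so substitute whichever mon is actually failing.

cephadm logs --name mon.cn01 | tail -n 500
# or straight from journald, using the cluster fsid:
journalctl -u ceph-bfa2ad58-c049-11eb-9098-3c8cf8ed728d@mon.cn01 --since "1 hour ago"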

From: Jeremy Hansen <jeremy@xxxxxxxxxx>
Sent: November 16, 2021 19:27
To: ceph-users <ceph-users@xxxxxxx>
Subject: Re: mons fail as soon as I attempt to mount

No ideas on this? It has me stumped. I’m going to revert my kernel upgrades if there isn’t a more logical explanation.

Thanks

> On Monday, Nov 15, 2021 at 2:33 AM, Jeremy Hansen <jeremy@xxxxxxxxxx> wrote:
> This post references a kernel issue:
>
> https://lists.ceph.io/hyperkitty/list/ceph-users@xxxxxxx/thread/6JDAPD5IR46JI6R6YGWQORDJTZ5Z2FIU/
>
> I recently updated my ceph nodes to 5.15.1. Could this be my issue?
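>
> If it is the same kernel-client regression, one thing I might try before reverting (just a guess on my part; ms_mode is a mount option the kernel client gained in 5.11 for msgr2) is forcing the v2 messenger on the mount:
>
> mount -t ceph :/ /mnt/ceph -o name=btc,ms_mode=prefer-crc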
>
> -jeremy
>
>
>
>
> > On Sunday, Nov 14, 2021 at 4:37 PM, Jeremy Hansen <jeremy@xxxxxxxxxx> wrote:
> > I’m trying to mount a cephfs volume from a new machine. For some reason, it looks like all the mons fail as soon as I attempt the mount:
> >
> > [root@btc04 ~]# mount -t ceph :/ /mnt/ceph -o name=btc
> > mount error: no mds server is up or the cluster is laggy
> > [root@btc04 ~]# rpm -qa | grep ceph
> > python3-cephfs-16.2.4-0.el8.x86_64
> > ceph-common-16.2.4-0.el8.x86_64
> > cephadm-16.2.4-0.el8.noarch
> > python3-ceph-argparse-16.2.4-0.el8.x86_64
> > libcephfs2-16.2.4-0.el8.x86_64
> > python3-ceph-common-16.2.4-0.el8.x86_64
> >
> >
> > [ 51.105212] libceph: loaded (mon/osd proto 15/24)
> > [ 51.145564] ceph: loaded (mds proto 32)
> > [ 51.164266] libceph: mon3 (1)192.168.30.14:6789 session established
> > [ 70.199453] libceph: mon3 (1)192.168.30.14:6789 socket closed (con state OPEN)
> > [ 70.199464] libceph: mon3 (1)192.168.30.14:6789 session lost, hunting for new mon
> > [ 70.204400] libceph: mon0 (1)192.168.30.11:6789 session established
> > [ 70.771652] libceph: mon0 (1)192.168.30.11:6789 socket closed (con state OPEN)
> > [ 70.771670] libceph: mon0 (1)192.168.30.11:6789 session lost, hunting for new mon
> > [ 70.774588] libceph: mon4 (1)192.168.30.15:6789 session established
> > [ 71.234037] libceph: mon4 (1)192.168.30.15:6789 socket closed (con state OPEN)
> > [ 71.234055] libceph: mon4 (1)192.168.30.15:6789 session lost, hunting for new mon
> > [ 77.904722] libceph: mon3 (1)192.168.30.14:6789 socket closed (con state V1_BANNER)
> > [ 78.160614] libceph: mon3 (1)192.168.30.14:6789 socket closed (con state V1_BANNER)
> > [ 78.664602] libceph: mon3 (1)192.168.30.14:6789 socket closed (con state V1_BANNER)
> > [ 79.824787] libceph: mon3 (1)192.168.30.14:6789 socket closed (con state V1_BANNER)
> > [ 81.808526] libceph: mon3 (1)192.168.30.14:6789 socket closed (con state V1_BANNER)
> > [ 85.840430] libceph: mon3 (1)192.168.30.14:6789 socket closed (con state V1_BANNER)
> >
> >
> > Not really sure why…
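> >
> > One guess: the "con state V1_BANNER" closes above mean the connection is dropped while still in the v1 banner handshake, so it may be worth confirming which addresses the mons advertise (v1 on 6789 vs. v2 on 3300):
> >
> > [ceph: root@cn01 /]# ceph mon dump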
> >
> > [ceph: root@cn01 /]# ceph osd pool get cephfs.btc.data all
> > size: 4
> > min_size: 2
> > pg_num: 32
> > pgp_num: 32
> > crush_rule: replicated_rule
> > hashpspool: true
> > nodelete: false
> > nopgchange: false
> > nosizechange: false
> > write_fadvise_dontneed: false
> > noscrub: false
> > nodeep-scrub: false
> > use_gmt_hitset: 1
> > fast_read: 0
> > pg_autoscale_mode: on
> > [ceph: root@cn01 /]# ceph osd pool get cephfs.btc.meta all
> > size: 4
> > min_size: 2
> > pg_num: 32
> > pgp_num: 32
> > crush_rule: replicated_rule
> > hashpspool: true
> > nodelete: false
> > nopgchange: false
> > nosizechange: false
> > write_fadvise_dontneed: false
> > noscrub: false
> > nodeep-scrub: false
> > use_gmt_hitset: 1
> > fast_read: 0
> > recovery_priority: 5
> > pg_autoscale_mode: on
> > pg_num_min: 16
> > pg_autoscale_bias: 4
> >
> >
> > The cluster becomes unhealthy but then clears shortly after the client times out.
> >
> > [ceph: root@cn01 /]# ceph -s
> >   cluster:
> >     id:     bfa2ad58-c049-11eb-9098-3c8cf8ed728d
> >     health: HEALTH_OK
> >
> >   services:
> >     mon: 5 daemons, quorum cn05,cn02,cn03,cn04,cn01 (age 6m)
> >     mgr: cn05.vpuwau(active, since 6d), standbys: cn02.arszct
> >     mds: 2/2 daemons up, 4 standby
> >     osd: 35 osds: 35 up (since 2d), 35 in (since 6d)
> >
> >   data:
> >     volumes: 2/2 healthy
> >     pools:   6 pools, 289 pgs
> >     objects: 6.73M objects, 4.3 TiB
> >     usage:   17 TiB used, 108 TiB / 126 TiB avail
> >     pgs:     289 active+clean
> >
> >   io:
> >     client: 0 B/s rd, 105 KiB/s wr, 2 op/s rd, 13 op/s wr
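> >
> > To check whether the mons actually crash during that unhealthy window, rather than just dropping the client session, the crash module may have recorded something (a long shot on my part):
> >
> > [ceph: root@cn01 /]# ceph crash ls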
> >
> >
> >
> > -jeremy
> >
> >
> >

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



