Re: Issues after a shutdown

Yeah, assuming you can ping with a lower MTU, check the MTU on your switches.
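
A quick way to narrow that down is to sweep the don't-fragment ping payload size between two of the hosts (8972 is the largest ICMP payload that fits in a 9000-byte frame once the 20-byte IP and 8-byte ICMP headers are added; 1472 is the equivalent for a 1500-byte MTU). A minimal sketch, reusing the destination IP from the thread:

for size in 8972 4000 1472; do
  if ping -c 2 -W 1 -M do -s "$size" 192.168.30.14 >/dev/null 2>&1; then
    echo "payload $size: passes with DF set"
  else
    echo "payload $size: dropped"
  fi
done

If 1472 passes but anything larger is dropped, some hop in the new rack (switch port, uplink, or bond) is still at a 1500-byte MTU.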

On Mon, 25 Jul 2022, 23:05 Jeremy Hansen, <farnsworth.mcfadden@xxxxxxxxx>
wrote:

> That results in packet loss:
>
> [root@cn01 ~]# ping -M do -s 8972 192.168.30.14
> PING 192.168.30.14 (192.168.30.14) 8972(9000) bytes of data.
> ^C
> --- 192.168.30.14 ping statistics ---
> 3 packets transmitted, 0 received, 100% packet loss, time 2062ms
>
> That's very weird...  but this gives me something to figure out.  Hmmm.
> Thank you.
>
> On Mon, Jul 25, 2022 at 3:01 PM Sean Redmond <sean.redmond1@xxxxxxxxx>
> wrote:
>
>> Looks good; just confirm it with a large ping with the don't-fragment flag
>> set between each host.
>>
>> ping -M do -s 8972 [destination IP]
>>
>>
>> On Mon, 25 Jul 2022, 22:56 Jeremy Hansen, <farnsworth.mcfadden@xxxxxxxxx>
>> wrote:
>>
>>> MTU is the same across all hosts:
>>>
>>> --------- cn01.ceph.la1.clx.corp---------
>>> enp2s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 9000
>>>         inet 192.168.30.11  netmask 255.255.255.0  broadcast 192.168.30.255
>>>         inet6 fe80::3e8c:f8ff:feed:728d  prefixlen 64  scopeid 0x20<link>
>>>         ether 3c:8c:f8:ed:72:8d  txqueuelen 1000  (Ethernet)
>>>         RX packets 3163785  bytes 2136258888 (1.9 GiB)
>>>         RX errors 0  dropped 0  overruns 0  frame 0
>>>         TX packets 6890933  bytes 40233267272 (37.4 GiB)
>>>         TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
>>>
>>> --------- cn02.ceph.la1.clx.corp---------
>>> enp2s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 9000
>>>         inet 192.168.30.12  netmask 255.255.255.0  broadcast 192.168.30.255
>>>         inet6 fe80::3e8c:f8ff:feed:ff0c  prefixlen 64  scopeid 0x20<link>
>>>         ether 3c:8c:f8:ed:ff:0c  txqueuelen 1000  (Ethernet)
>>>         RX packets 3976256  bytes 2761764486 (2.5 GiB)
>>>         RX errors 0  dropped 0  overruns 0  frame 0
>>>         TX packets 9270324  bytes 56984933585 (53.0 GiB)
>>>         TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
>>>
>>> --------- cn03.ceph.la1.clx.corp---------
>>> enp2s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 9000
>>>         inet 192.168.30.13  netmask 255.255.255.0  broadcast 192.168.30.255
>>>         inet6 fe80::3e8c:f8ff:feed:feba  prefixlen 64  scopeid 0x20<link>
>>>         ether 3c:8c:f8:ed:fe:ba  txqueuelen 1000  (Ethernet)
>>>         RX packets 13081847  bytes 93614795356 (87.1 GiB)
>>>         RX errors 0  dropped 0  overruns 0  frame 0
>>>         TX packets 4001854  bytes 2536322435 (2.3 GiB)
>>>         TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
>>>
>>> --------- cn04.ceph.la1.clx.corp---------
>>> enp2s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 9000
>>>         inet 192.168.30.14  netmask 255.255.255.0  broadcast 192.168.30.255
>>>         inet6 fe80::3e8c:f8ff:feed:6f89  prefixlen 64  scopeid 0x20<link>
>>>         ether 3c:8c:f8:ed:6f:89  txqueuelen 1000  (Ethernet)
>>>         RX packets 60018  bytes 5622542 (5.3 MiB)
>>>         RX errors 0  dropped 0  overruns 0  frame 0
>>>         TX packets 59889  bytes 17463794 (16.6 MiB)
>>>         TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
>>>
>>> --------- cn05.ceph.la1.clx.corp---------
>>> enp2s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 9000
>>>         inet 192.168.30.15  netmask 255.255.255.0  broadcast 192.168.30.255
>>>         inet6 fe80::3e8c:f8ff:feed:7245  prefixlen 64  scopeid 0x20<link>
>>>         ether 3c:8c:f8:ed:72:45  txqueuelen 1000  (Ethernet)
>>>         RX packets 69163  bytes 8085511 (7.7 MiB)
>>>         RX errors 0  dropped 0  overruns 0  frame 0
>>>         TX packets 73539  bytes 17069869 (16.2 MiB)
>>>         TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
>>>
>>> --------- cn06.ceph.la1.clx.corp---------
>>> enp2s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 9000
>>>         inet 192.168.30.16  netmask 255.255.255.0  broadcast 192.168.30.255
>>>         inet6 fe80::3e8c:f8ff:feed:feab  prefixlen 64  scopeid 0x20<link>
>>>         ether 3c:8c:f8:ed:fe:ab  txqueuelen 1000  (Ethernet)
>>>         RX packets 23570  bytes 2251531 (2.1 MiB)
>>>         RX errors 0  dropped 0  overruns 0  frame 0
>>>         TX packets 22268  bytes 16186794 (15.4 MiB)
>>>         TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
>>>
>>> 10G.
>>>
>>> On Mon, Jul 25, 2022 at 2:51 PM Sean Redmond <sean.redmond1@xxxxxxxxx>
>>> wrote:
>>>
>>>> Is the MTU in the new rack set correctly?
>>>>
>>>> On Mon, 25 Jul 2022, 11:30 Jeremy Hansen, <
>>>> farnsworth.mcfadden@xxxxxxxxx> wrote:
>>>>
>>>>> I transitioned some servers to a new rack and now I'm having major issues with Ceph upon bringing things back up.
>>>>>
>>>>> I believe the issue may be related to the Ceph nodes coming back up with different IPs before the VLANs were set.  That's just a guess, because I can't think of any other reason this would happen.
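
If that guess is right, the cluster may still have stale addresses recorded for the moved nodes. A few read-only commands (standard ceph CLI; osd.34 below is just an example id taken from later in this thread) would show what the monitors and OSDs actually registered with:

ceph mon dump                                          # address each monitor is registered with
ceph osd dump | grep '^osd\.'                          # public/cluster address recorded per OSD
ceph osd metadata 34 | grep -E '"(front|back)_addr"'   # addresses osd.34 booted with
ceph config get mon public_network                     # network the cluster expects daemons on

If any of those still show the pre-VLAN addresses, that would explain the mons and OSDs failing to rejoin.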
>>>>>
>>>>> Current state:
>>>>>
>>>>> Every 2.0s: ceph -s                  cn01.ceph.la1.clx.corp: Mon Jul 25 10:13:05 2022
>>>>>
>>>>>   cluster:
>>>>>     id:     bfa2ad58-c049-11eb-9098-3c8cf8ed728d
>>>>>     health: HEALTH_WARN
>>>>>             1 filesystem is degraded
>>>>>             2 MDSs report slow metadata IOs
>>>>>             2/5 mons down, quorum cn02,cn03,cn01
>>>>>             9 osds down
>>>>>             3 hosts (17 osds) down
>>>>>             Reduced data availability: 97 pgs inactive, 9 pgs down
>>>>>             Degraded data redundancy: 13860144/30824413 objects degraded (44.965%), 411 pgs degraded, 482 pgs undersized
>>>>>
>>>>>   services:
>>>>>     mon: 5 daemons, quorum cn02,cn03,cn01 (age 62m), out of quorum: cn05, cn04
>>>>>     mgr: cn02.arszct(active, since 5m)
>>>>>     mds: 2/2 daemons up, 2 standby
>>>>>     osd: 35 osds: 15 up (since 62m), 24 in (since 58m); 222 remapped pgs
>>>>>
>>>>>   data:
>>>>>     volumes: 1/2 healthy, 1 recovering
>>>>>     pools:   8 pools, 545 pgs
>>>>>     objects: 7.71M objects, 6.7 TiB
>>>>>     usage:   15 TiB used, 39 TiB / 54 TiB avail
>>>>>     pgs:     0.367% pgs unknown
>>>>>              17.431% pgs not active
>>>>>              13860144/30824413 objects degraded (44.965%)
>>>>>              1137693/30824413 objects misplaced (3.691%)
>>>>>              280 active+undersized+degraded
>>>>>              67  undersized+degraded+remapped+backfilling+peered
>>>>>              57  active+undersized+remapped
>>>>>              45  active+clean+remapped
>>>>>              44  active+undersized+degraded+remapped+backfilling
>>>>>              18  undersized+degraded+peered
>>>>>              10  active+undersized
>>>>>              9   down
>>>>>              7   active+clean
>>>>>              3   active+undersized+remapped+backfilling
>>>>>              2   active+undersized+degraded+remapped+backfill_wait
>>>>>              2   unknown
>>>>>              1   undersized+peered
>>>>>
>>>>>   io:
>>>>>     client:   170 B/s rd, 0 op/s rd, 0 op/s wr
>>>>>     recovery: 168 MiB/s, 158 keys/s, 166 objects/s
>>>>>
>>>>> I have to disable and re-enable the dashboard just to use it.  It seems to get bogged down after a few moments.
>>>>>
>>>>> Ceph has marked the three servers that were moved to the new rack as "down", but if I do a cephadm host-check, they all seem to pass:
>>>>>
>>>>> ************************ ceph  ************************
>>>>> --------- cn01.ceph.---------
>>>>> podman (/usr/bin/podman) version 4.0.2 is present
>>>>> systemctl is present
>>>>> lvcreate is present
>>>>> Unit chronyd.service is enabled and running
>>>>> Host looks OK
>>>>> --------- cn02.ceph.---------
>>>>> podman (/usr/bin/podman) version 4.0.2 is present
>>>>> systemctl is present
>>>>> lvcreate is present
>>>>> Unit chronyd.service is enabled and running
>>>>> Host looks OK
>>>>> --------- cn03.ceph.---------
>>>>> podman (/usr/bin/podman) version 4.0.2 is present
>>>>> systemctl is present
>>>>> lvcreate is present
>>>>> Unit chronyd.service is enabled and running
>>>>> Host looks OK
>>>>> --------- cn04.ceph.---------
>>>>> podman (/usr/bin/podman) version 4.0.2 is present
>>>>> systemctl is present
>>>>> lvcreate is present
>>>>> Unit chronyd.service is enabled and running
>>>>> Host looks OK
>>>>> --------- cn05.ceph.---------
>>>>> podman|docker (/usr/bin/podman) is present
>>>>> systemctl is present
>>>>> lvcreate is present
>>>>> Unit chronyd.service is enabled and running
>>>>> Host looks OK
>>>>> --------- cn06.ceph.---------
>>>>> podman (/usr/bin/podman) version 4.0.2 is present
>>>>> systemctl is present
>>>>> lvcreate is present
>>>>> Unit chronyd.service is enabled and running
>>>>> Host looks OK
>>>>>
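
Worth noting: the cephadm host check only validates local prerequisites (podman, systemctl, lvcreate, chronyd), as the output above shows; it says nothing about whether the daemons on that host can reach the rest of the cluster. To compare it against Ceph's own view, something like the following (standard ceph CLI commands) could be used:

ceph health detail        # lists exactly which OSDs/hosts are down and why
ceph osd tree down        # the down OSDs grouped by host
ceph orch host ls         # cephadm's view of each managed host
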
>>>>> It seems to be recovering with what it has left, but a large number of OSDs are down.  When trying to restart one of the downed OSDs, I see a huge dump.
>>>>>
>>>>> Jul 25 03:19:38 cn06.ceph ceph-bfa2ad58-c049-11eb-9098-3c8cf8ed728d-osd-34[9516]: debug 2022-07-25T10:19:38.532+0000 7fce14a6c080  0 osd.34 30689 done with init, starting boot process
>>>>> Jul 25 03:19:38 cn06.ceph ceph-bfa2ad58-c049-11eb-9098-3c8cf8ed728d-osd-34[9516]: debug 2022-07-25T10:19:38.532+0000 7fce14a6c080  1 osd.34 30689 start_boot
>>>>> Jul 25 03:20:10 cn06.ceph ceph-bfa2ad58-c049-11eb-9098-3c8cf8ed728d-osd-34[9516]: debug 2022-07-25T10:20:10.655+0000 7fcdfd12d700  1 osd.34 30689 start_boot
>>>>> Jul 25 03:20:41 cn06.ceph ceph-bfa2ad58-c049-11eb-9098-3c8cf8ed728d-osd-34[9516]: debug 2022-07-25T10:20:41.159+0000 7fcdfd12d700  1 osd.34 30689 start_boot
>>>>> Jul 25 03:21:11 cn06.ceph ceph-bfa2ad58-c049-11eb-9098-3c8cf8ed728d-osd-34[9516]: debug 2022-07-25T10:21:11.662+0000 7fcdfd12d700  1 osd.34 30689 start_boot
>>>>>
>>>>> At this point it just keeps printing start_boot, but the dashboard has it marked as "in" but "down".
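
An OSD that loops on start_boot is usually still waiting to complete its handshake with the monitors or its peers, which would fit an MTU/path problem. A rough reachability check from the affected host might look like the following (3300 and 6789 are the standard msgr2/msgr1 monitor ports; the mon IP is illustrative, and on a cephadm deployment the "ceph daemon" call would need to run inside "cephadm shell"):

nc -vz 192.168.30.11 3300                # msgr2 monitor port reachable?
nc -vz 192.168.30.11 6789                # legacy msgr1 monitor port reachable?
ping -c 3 -M do -s 8972 192.168.30.11    # does a jumbo frame survive the path to the mon host?
ceph daemon osd.34 status                # the OSD's own view of its state, via its admin socket

If small packets get through but the jumbo ping fails, the OSD can open connections yet stall on larger messages such as OSD maps, which would match the endless start_boot.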
>>>>>
>>>>> On the three hosts that moved, a bunch of OSDs were marked "out" and "down", and some "in" but "down".
>>>>>
>>>>> Not sure where to go next.  I'm going to let the recovery continue and hope that my 4x replication on these pools saves me.
>>>>>
>>>>> Not sure where to go from here.  Any help is very much appreciated.  This Ceph cluster holds all of our Cloudstack images...  it would be terrible to lose this data.
>>>>>
>>>>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


