Re: Messenger v2 and IPv6-only still seems to prefer IPv4 (OSDs stuck in booting state)

Yeah, annoyingly `ms_bind_ipv4` is set to true by default, so if you just
set `ms_bind_ipv6` without turning off IPv4 you end up in dual-stack mode.
I've created a PR that at least adds a warning _and_ properly documents
this behaviour: https://github.com/ceph/ceph/pull/36536
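Until then, for anyone who lands here: as Wido found below, a true
IPv6-only, msgr v2-only setup boils down to three settings in ceph.conf:

ms_bind_ipv4 = false
ms_bind_ipv6 = true
ms_bind_msgr1 = false

plus your public_addr/cluster_addr as usual.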

Hopefully it'll land at some point :)

Matt

On Fri, Sep 4, 2020 at 12:12 AM Wido den Hollander <wido@xxxxxxxx> wrote:

> Hi,
>
> Last night I spent a couple of hours debugging an issue where OSDs
> were marked as 'up', but PGs stayed in the 'peering' state.
>
> Looking through the admin socket I saw these OSDs were in the 'booting'
> state.
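>
> (For reference, the state can be read from the admin socket on the
> node hosting the OSD, e.g.:
>
> ceph daemon osd.3 status
>
> A healthy OSD reports "state": "active" there instead of "booting".)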
>
> Looking at the OSDMap I saw this:
>
> osd.3 up   in  weight 1 up_from 26 up_thru 700 down_at 0
> last_clean_interval [0,0)
> [v2:[2a05:xx:700:2::7]:6816/7923,v1:[2a05:xx:700:2::7]:6817/7923,v2:
> 0.0.0.0:6818/7923,v1:0.0.0.0:6819/7923]
> [v2:[2a05:xx:700:2::7]:6820/7923,v1:[2a05:xx:700:2::7]:6821/7923,v2:
> 0.0.0.0:6822/7923,v1:0.0.0.0:6823/7923]
> exists,up 786d3e9d-047f-4b09-b368-db9e8dc0805d
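>
> (This is the per-OSD line from 'ceph osd dump'; note the v2:0.0.0.0
> and v1:0.0.0.0 IPv4 entries advertised alongside the IPv6 addresses.)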
>
> In ceph.conf this was set:
>
> ms_bind_ipv6 = true
> public_addr = 2a05:xx:700:2::6
>
> On true IPv6-only nodes this works fine, but on nodes where IPv4 is
> also present it can (and will?) cause problems.
>
> I did not use tcpdump/wireshark to investigate, but it seems the OSDs
> tried to contact each other using the 0.0.0.0 IPv4 address.
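>
> (I did not verify it this way, but something like
>
> tcpdump -ni any 'ip and portrange 6800-7300'
>
> on one of the nodes should show it: any IPv4 traffic in the default
> Ceph OSD port range would confirm the fallback.)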
>
> After adding these settings the problems were resolved:
>
> ms_bind_msgr1 = false
> ms_bind_ipv4 = false
>
> This also disables msgr v1, which we didn't need here as the cluster
> and all clients run Octopus.
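>
> (On clusters using the central config database, the equivalent should
> be:
>
> ceph config set global ms_bind_msgr1 false
> ceph config set global ms_bind_ipv4 false
>
> followed by a restart of the daemons, as the bind options only take
> effect at startup.)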
>
> The OSDMap now showed:
>
> osd.3 up   in  weight 1 up_from 704 up_thru 712 down_at 702
> last_clean_interval [26,701) v2:[2a05:xx:700:2::7]:6804/791503
> v2:[2a05:xx:700:2::7]:6805/791503 exists,up
> 786d3e9d-047f-4b09-b368-db9e8dc0805d
>
> The OSDs came back right away, PGs peered and the problems were
> resolved.
>
> Wido
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


