Re: Fwd: Re: Issues with Ceph network redundancy using L2 MC-LAG

Serkan Çoban <cobanserkan@xxxxxxxxx> · Wed, 16 Jun 2021 11:19:25 +0300

You cannot do much if the link is flapping or the cable is bad.
Maybe you can write some rules to shut the port down on the switch if
the error packet ratio goes up.
I also remember that switches have some configuration options for dealing with link flapping.
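
As a rough illustration of the error-ratio idea, here is a minimal Python sketch for the host side (it is not the switch rule itself). It assumes the standard Linux counters under /sys/class/net/<iface>/statistics/. The function names, the 0.1% threshold, and the treatment of rx_packets as counting good frames only are assumptions for illustration; counter semantics can vary by driver.

```python
from pathlib import Path

COUNTERS = ("rx_errors", "rx_packets")


def read_counter(iface, name, root="/sys/class/net"):
    """Read one kernel interface counter such as rx_errors or rx_packets."""
    return int(Path(root, iface, "statistics", name).read_text())


def sample(iface, root="/sys/class/net"):
    """Take one snapshot of the counters used for the ratio."""
    return {name: read_counter(iface, name, root) for name in COUNTERS}


def error_ratio(prev, curr):
    """Fraction of received frames that were errors between two snapshots.

    Assumes rx_packets counts good frames only, so the total is the sum
    of both deltas (driver-dependent assumption).
    """
    errors = curr["rx_errors"] - prev["rx_errors"]
    packets = curr["rx_packets"] - prev["rx_packets"]
    total = errors + packets
    return errors / total if total > 0 else 0.0


def should_disable(prev, curr, threshold=0.001):
    """True when the error ratio exceeds the (hypothetical) threshold."""
    return error_ratio(prev, curr) > threshold
```

In practice you would take two snapshots some seconds apart with sample() on the bond member in question, and alert or take the link down when should_disable() returns True.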

On Wed, Jun 16, 2021 at 10:57 AM huxiaoyu@xxxxxxxxxxxx
<huxiaoyu@xxxxxxxxxxxx> wrote:
>
> Is it true that MC-LAG and 802.3ad work in active-active mode by default?
>
> What else should I take care of to ensure fault tolerance when one path is bad?
>
> best regards,
>
> samuel
>
>
>
> huxiaoyu@xxxxxxxxxxxx
>
> From: Joe Comeau
> Date: 2021-06-15 23:44
> To: ceph-users@xxxxxxx
> Subject:  Fwd: Re: Issues with Ceph network redundancy using L2 MC-LAG
> We also run with Dell VLT switches (40 Gb).
> Everything is active/active, so there are multiple paths, as Andrew
> describes in his config.
> Our config allows us to:
>    bring down one of the switches for upgrades
>    bring down an iSCSI gateway for patching
> all while at least one path stays up and servicing.
> Thanks Joe
>
>
> >>> Andrew Walker-Brown <andrew_jbrown@xxxxxxxxxxx> 6/15/2021 10:26 AM
> >>>
> With an unstable link/port you could see the issues you describe.  Ping
> doesn’t have a high enough packet rate to guarantee that a packet is in
> transit at the exact moment the port fails temporarily.  Iperf, on the
> other hand, could certainly show the issue: its higher packet rate makes
> it more likely to have packets in flight at the time of a link failure,
> and the resulting packet loss and retries give poor throughput.
>
> Depending on what you want to happen, there are a number of tuning
> options both on the switches and in Linux.  If you want the LAG to go
> down when any link fails, then you should be able to configure this on
> the switches and/or in Linux (minimum number of links = 2 if you have 2
> links in the LAG).
>
> You can also tune the link monitoring, i.e. how frequently the links are
> checked (e.g. miimon).  Bringing this value down from the default of
> 100ms may allow you to detect a link failure more quickly.  But you then
> run the risk of detecting a transient failure that wouldn’t have
> caused any issues, and of the LAG becoming more unstable.
>
> Flapping/unstable links are the worst kind of situation.  Ideally you’d
> pick that up quickly from monitoring/alerts and either fix immediately
> or take the link down until you can fix it.
>
> I run 2x10G from my hosts into separate switches (Dell S series – VLT
> between switches).  Pulling a single interface has no impact on Ceph;
> any packet loss is tiny and we’re not exceeding 10G bandwidth per host.
>
> If you’re running 1G links and the LAG is already busy, a link failure
> could be causing slow writes to the host purely through congestion,
> which then starts to impact the wider cluster because of how Ceph
> works.
>
> Just caveating the above with - I’m relatively new to Ceph myself....
>
> From: huxiaoyu@xxxxxxxxxxxx
> Sent: 15 June 2021 17:52
> To: Serkan Çoban <cobanserkan@xxxxxxxxx>
> Cc: ceph-users <ceph-users@xxxxxxx>
> Subject:  Re: Issues with Ceph network redundancy using L2 MC-LAG
>
> When I pull out the cable, the bond works properly.
>
> Does it mean that the port is somehow flapping? Ping can still work,
> but the iperf test yields very low results.
>
>
>
>
>
> huxiaoyu@xxxxxxxxxxxx
>
> From: Serkan Çoban
> Date: 2021-06-15 18:47
> To: huxiaoyu@xxxxxxxxxxxx
> CC: ceph-users
> Subject: Re:  Issues with Ceph network redundancy using L2 MC-LAG
> Do you observe the same behaviour when you pull a cable?
> Maybe a flapping port might cause this kind of behaviour; other than
> that you shouldn't see any network disconnects.
> Are you sure about the LACP configuration? What is the output of 'cat
> /proc/net/bonding/bond0'?
>
> On Tue, Jun 15, 2021 at 7:19 PM huxiaoyu@xxxxxxxxxxxx
> <huxiaoyu@xxxxxxxxxxxx> wrote:
> >
> > Dear Cephers,
> >
> > I have encountered the following networking issue several times, and I
> > wonder whether there is a solution for networking HA.
> >
> > We build Ceph using an L2 multi-chassis link aggregation group (MC-LAG)
> > to provide switch redundancy. On each host, we use 802.3ad (LACP)
> > mode for NIC redundancy. However, we have observed several times that
> > when a single network port fails, whether because of the cable or the
> > SFP+ optical module, the Ceph cluster is badly affected by networking,
> > although in theory it should be able to tolerate the failure.
> >
> > Did I miss something important here? And how can one really achieve
> > networking HA in a Ceph cluster?
> >
> > best regards,
> >
> > Samuel
> >
> >
> >
> >
> > huxiaoyu@xxxxxxxxxxxx
> > _______________________________________________
> > ceph-users mailing list -- ceph-users@xxxxxxx
>
> > To unsubscribe send an email to ceph-users-leave@xxxxxxx