On 1/27/20 1:27 PM, Allan W. Nielsen wrote:
> Hi Jürgen,
>
> On 27.01.2020 12:29, Jürgen Lambrecht wrote:
>> On 1/26/20 4:59 PM, Andrew Lunn wrote:
>>> Given the design of the protocol, if the hardware decides the OS etc
>>> is dead, it should stop sending MRP_TEST frames and unblock the ports.
>>> It then becomes a 'dumb switch', and for a short time there will be a
>>> broadcast storm. Hopefully one of the other nodes will then take over
>>> the role and block a port.

How to fall back can probably be a configuration option in the hardware.

>
>> In my experience a closed loop should never happen. It can make
>> software crash and give other problems. Another node should first
>> take over before unblocking the ring ports. (If this is possible - I
>> am only half following this discussion)
>>
>> What is your opinion?
> Having loops in the network is never a good thing - but to be honest, I
> think it is more important that we ensure the design can survive and
> recover from loops.

Indeed

>
> With the current design, it will be really hard to avoid loops when the
> network boots. MRP will actually start with the ports blocked, but they
> will be unblocked in the period from when the bridge is created and
> until MRP is enabled. If we want to change this (which I'm not too keen
> on), then we need to be able to block the ports while the bridge is
> down.

Our ring network is part of a bigger network. Loops are really not allowed.

>
> And even if we do this, then we can not guarantee to avoid loops. Let's
> assume we have a small ring with just 2 nodes: an MRM and an MRC. Let's
> assume the MRM boots first. It will unblock both ports as the ring is
> open. Now the MRC boots, closes the ring, and creates a loop.
> It will take some time (milliseconds) before the MRM notices this and
> blocks one of the ports.

In my view a bring-up and tear-down module is needed. I don't know if it should be part of MRP or not. Probably not, so it would be something on top of the mrp daemon.

>
> But while we are at this topic, we need to add some functionality to
> the user-space application such that it can set the priority of the MRP
> frames. We will get that fixed.

Indeed! In my old design I had to give the MRP frames high priority, else the loop was wrongly closed at high network load. I guess you mean the priority in the VLAN header? I seem to remember someone mentioning that the bridge code is VLAN-agnostic.

>
>> (FYI: I made that mistake once doing a proof-of-concept ring design:
>> during testing, when a "broken" Ethernet cable was "fixed" I had for a
>> short time a loop, and then it happened often that that port of the
>> (Marvell 88E6063) switch was blocked. (To unblock, only solution was
>> to bring that port down and up again, and then all "lost" packets came
>> out in a burst.) That problem was caused by flow control (with pause
>> frames), and disabling flow control fixed it, but flow-control is
>> default on as far as I know.)
> I see. It could be fun to see if what we have proposed so far will work
> with such a switch.

Depending on the projects, I could work on it later this year (or only next year, or not at all).

Kind regards,
Jürgen

>
> /Allan
>