> -----Original Message----- > From: Ariel Elior <aelior@xxxxxxxxxxx> > Sent: Thursday, October 28, 2021 2:02 PM > To: Jakub Kicinski <kuba@xxxxxxxxxx> > Cc: Manish Chopra <manishc@xxxxxxxxxxx>; Greg KH > <gregkh@xxxxxxxxxxxxxxxxxxx>; netdev@xxxxxxxxxxxxxxx; > stable@xxxxxxxxxxxxxxx; Sudarsana Reddy Kalluru <skalluru@xxxxxxxxxxx>; > malin1024@xxxxxxxxx; Shai Malin <smalin@xxxxxxxxxxx>; Omkar Kulkarni > <okulkarni@xxxxxxxxxxx>; Nilesh Javali <njavali@xxxxxxxxxxx>; GR-everest- > linux-l2@xxxxxxxxxxx; Andrew Lunn <andrew@xxxxxxx> > Subject: RE: [EXT] Re: [PATCH net-next 1/2] bnx2x: Utilize firmware 7.13.20.0 > > > From: Jakub Kicinski <kuba@xxxxxxxxxx> > > Sent: Wednesday, October 27, 2021 5:04 PM > > To: Ariel Elior <aelior@xxxxxxxxxxx> > > Cc: Manish Chopra <manishc@xxxxxxxxxxx>; Greg KH > > <gregkh@xxxxxxxxxxxxxxxxxxx>; netdev@xxxxxxxxxxxxxxx; > > stable@xxxxxxxxxxxxxxx; Sudarsana Reddy Kalluru > > <skalluru@xxxxxxxxxxx>; malin1024@xxxxxxxxx; Shai Malin > > <smalin@xxxxxxxxxxx>; Omkar Kulkarni <okulkarni@xxxxxxxxxxx>; Nilesh > > Javali <njavali@xxxxxxxxxxx>; GR-everest- linux-l2@xxxxxxxxxxx; Andrew > > Lunn <andrew@xxxxxxx> > > Subject: Re: [EXT] Re: [PATCH net-next 1/2] bnx2x: Utilize firmware > > 7.13.20.0 > > > > On Wed, 27 Oct 2021 05:17:43 +0000 Ariel Elior wrote: > > > You may recall we had a discussion on this during our last FW upgrade too. > > > > "During our last FW upgrade" is pretty misleading here. The discussion > > seems to have been after user reported that you broke their systems: > > > > https://urldefense.proofpoint.com/v2/url?u=https- > > 3A__lore.kernel.org_netdev_ffbcf99c-2D8274-2Deca1-2D5166- > > 2Defc0828ca05b- > > 40molgen.mpg.de_&d=DwICAg&c=nKjWec2b6R0mOyPaz7xtfQ&r=cWBgNIF > > UifZRx2xhypdcaYrfIsMGt93NxP1r8GQtC0s&m=7SlExiNTnqs5yykfN6rmIWZvB > > wQtSHCEW1ZmdHhe3S4&s=5OfVYQjTBOqb9GeN7GmGw70U_7MZLYz9YDyl > > o1iqdtQ&e= > > > > Now you want to make your users' lives even more miserable by pushing > > your changes into stable. > > > > > Please note this is not FW which resides in flash, which may or may > > > not be updated during the life cycle of a specific board deployment, > > > but rather an initialization sequence recipe which happens to > > > contain FW content (as well > > as > > > many other register and memory initializations) which is activated > > > when > > driver > > > loads. We do have Flash based FW as well, with which we are fully > > backwards and > > > forwards compatible. There is no method to build the init sequence > > > in a backwards compatible mode for these devices - it would > > > basically mean duplicating most of the device interaction logic > > > (control plane and data > > plane). > > > To support these products we need to be able to update this from > > > time to > > time. > > > > And the driver can't support two versions of init sequence because...? > Well. Generally speaking on the architecture of these devices, the init sequence > includes programing the PRAM of the processors on the fastpath, so two > different init sequences would mean two different FW versions, and supporting > two different data planes. It would also mean drastic changes in control plane as > well, for the same reason: the fastpath FW expects completely different > structures and messaging from one version to the next. For these two versions, > however, changes are admittedly not that big. > > > > > > Please note these devices are EOLing, and therefore this may well be > > > the > > last > > > update to this FW. > > > > Solid argument. > At this time we don't have any plans on further replacing the FW, so hopefully > we won't find ourselves here again. Another important point: the major fix we > are pushing here is a breakage in the SR-IOV virtual function compatibility > implementation in the FW (introduced in the previous FW version). If we would > maintain the ability to support that older FW in the PF driver, we would also be > maintaining the presence of the breakage. In other words you may be better off > with the driver failing to load with a clear message of "need new FW file" rather > than having it load, but then having all your virtual functions which are passed > through to virtual machines with distro kernel fail in weird and unpredictable > ways. For this reason we also ask to expedite the acceptance of this change, as > it is desirable to limit the exposure of this problem. Back to the EOLing point: the > set of people still familiar with this > >13 years old architecture is severely reduced, so would rather not to try and > invent new ways of doing things. > > > > > > The only theoretical way we can think of getting around this if we > > > had to is adding the entire thing as a huge header file and have the > > > driver compile with it. This would detach the dependency on the FW > > > file being present on disk, but has big disadvantages of making the > > > compiled driver huge, and bloating the kernel with redundant headers > > > filled with what is essentially a binary blob. We do make sure to > > > add the FW files to the FW sub tree in advance of modifying the > > > drivers to use them. > > > > All the patch is doing is changing some offsets. Why can't you just > > make the offset the driver uses dependent on the FW version? > > > > Would be great if the engineer who wrote the code could answer that. > My original answer was more for the general design/architecture of these > products, and a general design of supporting multiple FW versions on them. > Specifically for this FW change, as relatively little has changed, we > *could* maintain two sets of offsets based on the FW version. This might impact > driver performance, however, as some of the offsets are used in data plane (e.g. > updating the L2 ring producers), although I suppose we could also work around > that by preparing a new "generic" array of offsets, populate it at FW load time > and use that. However, as I stated above, I don't think it is desirable in this case, > as the previous version contains some nasty behavior, and it may take us some > considerable time to build that design, allowing the problematic FW to spread to > further Linux distributions. > > Thanks, > Ariel Hello Jakub et al, Just following up based on the comments put by Ariel a week back. The earlier firmware has caused some important regression w.r.t SR-IOV compatibility, so it's critical to have these new FW patches to be accepted sooner (thinking of the impact on various Linux distributions/kernels where that bug/regression will be carried over with earlier firmware), as Ariel pointed out the complexities, in general making the FW backwards compatible on these devices architecture meaning supporting different data/control path (which is not good from performance perspective), However these two particular versions are not changing that much (from data/control path perspective) so we could have made them backward compatible for these two particular versions but given the time criticality, regression/bug introduced by the earlier FW, bnx2x devices being almost EOL, this would be our last FW submission hopefully so we don't want to re-invent something which has been continued for many years now for these bnx2* devices. PS: this series was not meant for stable (I have Cced stable mistakenly), please let me know if I can send v2 with stable removed from recipients. Thanks, Manish