On Tue, Mar 22, 2022 at 14:21, Hans Schultz <schultz.hans@xxxxxxxxx> wrote:
> On Tue, Mar 22, 2022 at 13:08, Vladimir Oltean <olteanv@xxxxxxxxx> wrote:
>> On Tue, Mar 22, 2022 at 12:01:13PM +0100, Hans Schultz wrote:
>>> On Fri, Mar 18, 2022 at 15:19, Vladimir Oltean <olteanv@xxxxxxxxx> wrote:
>>> > On Fri, Mar 18, 2022 at 02:10:26PM +0100, Hans Schultz wrote:
>>> >> In the offloaded case there is no difference between static and dynamic
>>> >> flags, which I see as a general issue. (The resulting ATU entry is static
>>> >> in either case.)
>>> >
>>> > It _is_ a problem. We had the same problem with the is_local bit.
>>> > Independently of this series, you can add the dynamic bit to struct
>>> > switchdev_notifier_fdb_info and make drivers reject it.
>>> >
>>> >> These FDB entries are removed when link goes down (soft or hard). The
>>> >> zero DPV entries that the new code introduces age out after 5 minutes,
>>> >> while the locked flagged FDB entries are removed by link down (thus the
>>> >> FDB and the ATU are not in sync in this case).
>>> >
>>> > Ok, so don't let them disappear from hardware, refresh them from the
>>> > driver, since user space and the bridge driver expect that they are
>>> > still there.
>>>
>>> I have now tested with two extra unmanaged switches (each connected to a
>>> separate port on our managed switch), and when migrating from one port to
>>> another, there are member violations, but as the initial entry ages out,
>>> a new miss violation occurs and the new port adds the locked entry. In
>>> this case I only see one locked entry, either on the initial port or
>>> later on the port the host migrated to (via switch).
>>>
>>> If I refresh the ATU entries indefinitely, then this migration will for
>>> sure not work, and with the member violation suppressed, it will be
>>> silent about it.
>>
>> Manual says that migrations should trigger miss violations if configured
>> adequately, is this not the case?
>>
> Yes, but that depends on the ATU entries ageing out. As it is now, it works.
>
>>> So I don't think it is a good idea to refresh the ATU entries
>>> indefinitely.
>>>
>>> Another issue I see, is that there is a deadlock or similar issue when
>>> receiving violations and running 'bridge fdb show' (it seemed that
>>> member violations also caused this, but not sure yet...), as the unit
>>> freezes, not to return...

I have now verified that it is only on miss violations that the problem
occurs, so it seems that there is a deadlock (with 'bridge fdb show')
somehow with the nl lock that the handling of ATU miss violations
acquires.

>>
>> Have you enabled lockdep, debug atomic sleep, detect hung tasks, things
>> like that?
>
> No, I haven't looked deeper into it yet. Maybe I was hoping someone had
> an idea... but I guess it cannot be a netlink deadlock?
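
For reference, the debug facilities asked about above (lockdep, debug
atomic sleep, hung task detection) correspond roughly to the kernel
config fragment below. This is only a sketch: it assumes a kernel tree
where these symbols are available on the target architecture, and it
has not been tried on the board discussed in this thread.

```
# Prerequisite for most of the kernel debugging options below
CONFIG_DEBUG_KERNEL=y
# Lockdep: checks lock ordering at runtime and reports possible deadlocks
CONFIG_PROVE_LOCKING=y
# Warn when code that may sleep is called from atomic context
CONFIG_DEBUG_ATOMIC_SLEEP=y
# Report tasks stuck in uninterruptible sleep for too long
CONFIG_DETECT_HUNG_TASK=y
```

With these enabled, a real lock-ordering problem between the ATU miss
violation handling and 'bridge fdb show' would be expected to show up
as a lockdep splat or a hung task report in the kernel log instead of
a silent freeze.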