On Wed, May 27, 2015 at 9:59 AM, Scott Feldman <sfeldma@xxxxxxxxx> wrote: > On Wed, May 27, 2015 at 12:05 AM, Nikolay Aleksandrov > <nikolay@xxxxxxxxxxxxxxxxxxx> wrote: >> On Tue, May 26, 2015 at 7:28 PM, Stephen Hemminger >> <stephen@xxxxxxxxxxxxxxxxxx> wrote: >>> On Thu, 21 May 2015 03:42:57 -0700 >>> Nikolay Aleksandrov <nikolay@xxxxxxxxxxxxxxxxxxx> wrote: >>> >>>> From: Wilson Kok <wkok@xxxxxxxxxxxxxxxxxxx> >>>> >>>> Check in fdb_add_entry() if the source port should learn, similar >>>> check is used in br_fdb_update. >>>> Note that new fdb entries which are added manually or >>>> as local ones are still permitted. >>>> This patch has been tested by running traffic via a bridge port and >>>> switching the port's state, also by manually adding/removing entries >>>> from the bridge's fdb. >>>> >>>> Signed-off-by: Wilson Kok <wkok@xxxxxxxxxxxxxxxxxxx> >>>> Signed-off-by: Nikolay Aleksandrov <nikolay@xxxxxxxxxxxxxxxxxxx> >>> >>> What is the problem this is trying to solve? >>> >>> I think user should be allowed to manually add any entry >>> even if learning. >> >> Hi Stephen, >> I have been thinking about the use case and have discussed it >> internally with colleagues and the patch >> author, the main problem is when there's an external software that >> adds dynamic entries (learning) and >> it could experience a race condition, here's a possible situation: >> * external software learns a mac from hw, sends an add to kernel >> * right before that, port goes blocking (or down) and kernel flushes >> mac, sends notification about the state change and mac flush >> * right after that, kernel gets the previous add from external software, it's >> allowed to add, and then sends an add notification >> * mean while, external software processes the link block/down and mac flush, >> followed by the mac add from kernel. At this point, external software can't >> really know whether it's a user adding the mac intentionally or it's >> a race. >> >> This issue can't really be avoided in user-space. >> As I've noted local and static entries are still allowed, and iproute2 >> bridge utility always >> marks the entries as static (NUD_NOARP), this only affects external >> dynamic entries which >> are usually sent by something that does the learning externally. >> I'll keep digging to see if there's another way to go about this since >> I'd like to give the user >> full freedom. Personally I don't have strong feeling for this patch >> and if it's not preferred then >> I'll post a revert. > > So there is already a switchdev API to add/del an externally learned > FDB entry which holds rtnl_lock and avoids these races. I would > suggest using that and revert this patch. > > See call_switchdev_notifiers(SWITCHDEV_FDB_ADD|SWITCHDEV_FDB_DEL) and > the handler in br.c:br_switchdev_event(). > > -scott Hmm, I'm new to the switchdev API and am possibly missing something, but how do you suggest to use it here ? How can we differentiate between user-added entry and an externally learned one ? Do you mean to use (for example) the NTF_EXT_LEARNED flag when adding an entry from user-space so the API can get called in br_fdb_add ? Minor note: br_fdb_add (ndo_fdb_add) is already called with rtnl held, so the API will have to be used directly. Thanks, Nik