On Sat, 4 Feb 2017 18:05:05 +0100
Nikolay Aleksandrov <nikolay@xxxxxxxxxxxxxxxxxxx> wrote:

> Hi all,
> This is the first set which begins to deal with the bad bridge cache
> access patterns. The first patch rearranges the bridge and port structs
> a little so the frequently (and closely) accessed members are in the same
> cache line. The second patch then moves the garbage collection to a
> workqueue trying to improve system responsiveness under load (many fdbs)
> and more importantly removes the need to check if the matched entry is
> expired in __br_fdb_get which was a major source of false-sharing.
> The third patch is a preparation for the final one which
> If properly configured, i.e. ports bound to CPUs (thus updating "updated"
> locally) then the bridge's HitM goes from 100% to 0%, but even without
> binding we get a win because previously every lookup that iterated over
> the hash chain caused false-sharing due to the first cache line being
> used for both mac/vid and used/updated fields.
>
> Some results from tests I've run:
> (note that these were run in good conditions for the baseline, everything
> ran on a single NUMA node and there were only 3 fdbs)
>
> 1. baseline
> 100% Load HitM on the fdbs (between everyone who has done lookups and hit
>                             one of the 3 hash chains of the communicating
>                             src/dst fdbs)
> Overall 5.06% Load HitM for the bridge, first place in the list
>
> 2. patched & ports bound to CPUs
> 0% Local load HitM, bridge is not even in the c2c report list
> Also there's 3% consistent improvement in netperf tests.

What tool are you using to measure this?
>
> Thanks,
>  Nik
>
> Nikolay Aleksandrov (4):
>   bridge: modify bridge and port to have often accessed fields in one
>     cache line
>   bridge: move to workqueue gc
>   bridge: move write-heavy fdb members in their own cache line
>   bridge: fdb: write to used and updated at most once per jiffy
>
>  net/bridge/br_device.c    |  1 +
>  net/bridge/br_fdb.c       | 34 +++++++++++++++++----------
>  net/bridge/br_if.c        |  2 +-
>  net/bridge/br_input.c     |  3 ++-
>  net/bridge/br_ioctl.c     |  2 +-
>  net/bridge/br_netlink.c   |  2 +-
>  net/bridge/br_private.h   | 57 +++++++++++++++++++++++------------------------
>  net/bridge/br_stp.c       |  2 +-
>  net/bridge/br_stp_if.c    |  4 ++--
>  net/bridge/br_stp_timer.c |  2 --
>  net/bridge/br_sysfs_br.c  |  2 +-
>  11 files changed, 59 insertions(+), 52 deletions(-)

Looks good, thanks. I wonder whether this impacts smaller workloads.

Reviewed-by: Stephen Hemminger <stephen@xxxxxxxxxxxxxxxxxx>
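
For anyone following along, here is a rough userspace sketch of the two ideas in patches 3 and 4 as I read the cover letter: keep the read-mostly lookup key (mac/vid) away from the write-heavy used/updated fields, and skip the store when the timestamp has not changed. This is not the code from the series. The struct fdb_entry layout, the fdb_touch() helper and the 64-byte CACHE_LINE value are all assumptions made up for illustration.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64 /* assumed cache line size; the real value is CPU-specific */

/* Hypothetical entry, loosely modeled on a forwarding database entry. */
struct fdb_entry {
	/* Read-mostly lookup key: every hash chain walk reads these. */
	uint8_t mac[6];
	uint16_t vid;
	void *dst;

	/* Write-heavy timestamps start on a cache line of their own. */
	alignas(CACHE_LINE) unsigned long updated;
	unsigned long used;
};

/* The key and the timestamps must not share the first cache line. */
static_assert(offsetof(struct fdb_entry, updated) >= CACHE_LINE,
	      "timestamps share a cache line with the lookup key");
static_assert(offsetof(struct fdb_entry, updated) % CACHE_LINE == 0,
	      "timestamps are not cache line aligned");

/*
 * Store the timestamp only when it differs from the current value, so
 * repeated hits within one tick do not dirty the line again.
 * Returns 1 if a store happened, 0 otherwise.
 */
int fdb_touch(unsigned long *field, unsigned long now)
{
	if (*field == now)
		return 0;
	*field = now;
	return 1;
}
```

In this sketch a forwarding path would call fdb_touch(&e->updated, now) with the current tick count. Only the first packet in a given tick writes to the line, later ones just read it, which is the kind of access that should stay out of a HitM report.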