Eric Dumazet <edumazet@xxxxxxxxxx> wrote: > On Wed, Oct 11, 2017 at 10:48 AM, Florian Westphal <fw@xxxxxxxxx> wrote: > > Eric Dumazet <edumazet@xxxxxxxxxx> wrote: > >> On Wed, Oct 11, 2017 at 7:26 AM, Florian Westphal <fw@xxxxxxxxx> wrote: > >> > xt_replace_table relies on table replacement counter retrieval (which > >> > uses xt_recseq to synchronize pcpu counters). > >> > > >> > This is fine, however with large rule set get_counters() can take > >> > a very long time -- it needs to synchronize all counters because > >> > it has to assume concurrent modifications can occur. > >> > > >> > Make xt_replace_table synchronize by itself by waiting until all cpus > >> > had an even seqcount. > >> > > >> > This allows a followup patch to copy the counters of the old ruleset > >> > without any synchonization after xt_replace_table has completed. > >> > > >> > Cc: Dan Williams <dcbw@xxxxxxxxxx> > >> > Cc: Eric Dumazet <edumazet@xxxxxxxxxx> > >> > Signed-off-by: Florian Westphal <fw@xxxxxxxxx> > >> > --- > >> > v3: check for 'seq is uneven' OR 'has changed' since > >> > last check. Its fine if seq is uneven iff its a different > >> > sequence number than the initial one. > >> > > >> > v2: fix Erics email address > >> > net/netfilter/x_tables.c | 18 +++++++++++++++--- > >> > 1 file changed, 15 insertions(+), 3 deletions(-) > >> > > >> > > >> > >> Reviewed-by: Eric Dumazet <edumazet@xxxxxxxxxx> > >> > >> But it seems we need an extra smp_wmb() after > >> > >> smp_wmb(); > >> table->private = newinfo; > >> + smp_wmb(); > >> > >> Otherwise we have no guarantee other cpus actually see the new ->private value. > > > > Seems to be unrelated to this change, so I will submit > > a separate patch for nf.git that adds this. > > This is related to this change, please read the comment before the > local_bh_enable9) > > /* > * Even though table entries have now been swapped, other CPU's > * may still be using the old entries. This is okay, because > * resynchronization happens because of the locking done > * during the get_counters() routine. > */ Hmm, but get_counters() does not issue a wmb, and the 'new' code added here essentially is the same as get_counters(), except that we only read seqcount until we saw a change (and not for each counter in the rule set). What am I missing? -- To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html