Re: [PATCH RESEND v3] netfilter: xtables: add quota support to nfacct

Mathieu Poirier <mathieu.poirier@xxxxxxxxxx> · Fri, 14 Feb 2014 08:20:44 -0700

On 12 February 2014 15:16, Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx> wrote:
> On Wed, Feb 12, 2014 at 09:58:28AM -0700, Mathieu Poirier wrote:
> [...]
>> > I can see that packet and byte counters are not in sync, but I don't
>> > see why that should be a problem for your quota extension. Your
>> > accounting is based on packet *OR* byte counters.
>>
>> If we have a filter reporting the amount of packets and bytes that
>> have gone through a chain the least we can do is make them coherent.
>> Otherwise the numbers simply can't be trusted.  Fixing that problem is
>> what I've done in the 4th version of my patchset sent on Monday.
>>
>> > Moreover, if you inspect the result of the counter update via
>> > atomic_inc_return / atomic_add_return, you can ensure that packets
>> > running out of the quota limit will always match.
>>
>> Thanks for Florian Westphal's hint on 'cmpxchg' I came up with a match
>> algorithm that requires no spinlock and doesn't mandate that packet
>> count be synchronised with bytes.  Before sending it out I'd like us
>> to reach a consensus on the above - I think the current accounting
>> method is broken and ought to be fixed.
>
> I really think we should skip that spinlock with atomic64. That
> doesn't make sense to me.

I'm not fond of it either.

>
>> If the current situation is fine with you I'll simply send another
>> version that works without counts being synchronised - but I really
>> think we can do better.
>
> Given the interferences that you described, I think you can use this
> approach to ensure that only one event is delivered:
>
>         if (old < info->quota && new >= info->quota &&
>             test_and_set_bit(0, info->quota_reached))
>                 deliver_event(...);

I had something similar but I used atomic_cmpxch() instead:

         if (val >= info->quota)
                 if (atomic_cmpxchg(&info->priv->notified, 0, 1) == 0)
                        nfnl_quota_event(info->nfacct);

Note that I've embedded 'notified' in a private field since userspace
doesn't need to know what we do in the kernel.  I've also used an
atomic value... Get back to me with your preference.

>
> On top of that, there's another problem that needs to be solved now
> that we have a quota_reached field in the info area. This needs a
> "database of quota objects" that are identified by a name, otherwise
> iptables rule updates will trigger a new (unnecessary event) event
> every time *any* new rule is added.
>
> If you want to test what I mean, try this:
>
> iptables -I INPUT -m quota --quota 1000 -j LOG --log-prefix "quota: "
>
> then keep an eye on /var/log/messages, ping somewhere to generate
> traffic. You should see log messages for each packet until the quota
> is reached.  After a couple of ICMP packets, you will see no more logs
> since the quota has been reached.
>
> Now, keep monitoring /var/log/messages and add any rule, eg. iptables
> -I INPUT. The quota matches again and you'll start seeing more log
> messages since every rule updates "resets" the internal state of the
> match. This is a known problem. To solve this, you have to use a
> specific quota object with a refcnt that is identified by a name. See
> xt_hashlimit.c for instance. Note that the quota_reached can't be
> added to the existing nfacct object since it's a valid configuration
> to have two rules using with different quota values that refer to the
> same nfacct object (think of a scenario in which you want to throttle
> to N Kbytes/s after X bytes has been consumed, then throttle to a
> lower M Kbytes/s after X+Y bytes has been consumed).

Thanks for pointing this out - I was able to reproduce the problem
with the quota filter.  I'll look at the effect on nfacct and will get
back to you if I have questions.

>
> Thus, the iptables command for this should look like:
>
> iptables -I INPUT -p icmp \
>         -m nfacct --mode packets --quota 100 --quota-name icmp-quota-100
>
> In _mt_check you have to look up for "icmp-quota-100" in the list of
> quota objects to attach it to the rule. If it doesn't exist you create
> it. In the _destroy path, you have to put the refcnt, if it reaches
> zero, it will be released. It may sound tricky, but this will work
> fine in practise.
>
> Please, take the time to digest this and let us know if you have any
> question. Thank you.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html