Re: [PATCH iptables] iptables: support insisting that the lock is held

Aaron Conole <aconole@xxxxxxxxxx> · Wed, 03 May 2017 10:51:08 -0400

Aaron Conole <aconole@xxxxxxxxxx> writes:

> Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx> writes:
>
>> On Thu, Apr 20, 2017 at 06:23:33PM +0900, Lorenzo Colitti wrote:
>>> Currently, iptables programs will exit with an error if the
>>> iptables lock cannot be acquired, but will silently continue if
>>> the lock cannot be opened at all.
>>
>> This sounds to me like a wrong design decision was made when
>> introducing this userspace lock.
>
> I wouldn't say it that way.  I looked at this a while ago, and one thing
> to keep in mind is that if the particular prefix path in the filesystem
> (for instance /run) isn't available, then this will cause iptables to
> fail.  I'm not sure how many systems do provide /run - at the time it
> might have been more common.

Another issue is container systems.  Until recently, Kubernetes didn't
provide /run at all, and not all container orchestration tools will
provide this filesystem.

>>> This can cause unexpected failures (with unhelpful error messages)
>>> in the presence of concurrent updates.
>>> 
>>> This patch adds a compile-time option that causes iptables to
>>> refuse to do anything if the lock cannot be acquired. It is a
>>> compile-time option instead of a command-line flag because:
>>> 
>>> 1. In order to reliably avoid concurrent modification, all
>>>    invocations of iptables commands must follow this behaviour.
>>> 2. Whether or not the lock can be opened is typically not
>>>    a run-time condition but is likely to be a configuration
>>>    error.
>>>
>>> Tested by deleting xtables.lock and observing that all commands
>>> failed if iptables was compiled with --enable-strict-locking, but
>>> succeeded otherwise.
>>> 
>>> By default, --enable-strict-locking is disabled for backwards
>>> compatibility reasons. It can be enabled by default in a future
>>> change if desired.
>>
>> I would like to skip this compile time switch, if the existing
>> behaviour is broken, we should just fix it. What is the scenario that
>> can indeed have an impact in terms of backward compatibility breakage?
>> Does it really make sense to keep a buggy behaviour around?
>
> I'm not sure about a change to the behavior, but I agree that a compile
> time switch is probably not the way to go.

I've thought about it.  I think a better change that makes sense in the
presence of concurrent updates would be to use the wait argument as a
total time, and apply it to both the userspace lock, AND an EAGAIN from
the kernel space side.  So if the kernel space says 'locked try again',
and the user passed a -w option, we can simply keep retrying until the
wait time expires.  Does that make sense and solve your issue, Lorenzo?
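
A minimal sketch of that idea, not actual iptables code: one deadline is
computed from the wait value and shared by both phases, so time spent
waiting for the userspace lock reduces the time left for retrying a busy
kernel commit. Everything named here (retry_until, locked_commit,
busy_then_ok, struct busy_counter) is hypothetical, and the budget is
taken in milliseconds purely for illustration. The two callbacks stand in
for "take the xtables lock" and "commit to the kernel", and are assumed
to return -1 with errno set to EAGAIN when busy.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <time.h>

/* Hypothetical callback: 0 on success, -1 with errno set on failure.
 * EAGAIN / EWOULDBLOCK is treated as "busy, try again". */
typedef int (*try_fn)(void *ctx);

/* Monotonic clock in milliseconds, immune to wall-clock jumps. */
static long long now_ms(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long long)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/* Call op until it succeeds, fails with a non-retryable error, or the
 * deadline passes. Returns 0 on success, -1 otherwise (errno is set). */
static int retry_until(try_fn op, void *ctx, long long deadline_ms,
                       long interval_ms)
{
    assert(op != NULL);
    for (;;) {
        if (op(ctx) == 0)
            return 0;
        int err = errno;
        if (err != EAGAIN && err != EWOULDBLOCK)
            return -1;              /* hard failure: do not retry */
        if (now_ms() >= deadline_ms) {
            errno = EAGAIN;         /* budget exhausted while still busy */
            return -1;
        }
        struct timespec pause = { interval_ms / 1000,
                                  (interval_ms % 1000) * 1000000L };
        nanosleep(&pause, NULL);
    }
}

/* One wait budget covers both the lock and the commit. */
static int locked_commit(try_fn take_lock, void *lock_ctx,
                         try_fn commit, void *commit_ctx,
                         long wait_ms, long interval_ms)
{
    long long deadline = now_ms() + wait_ms;

    if (retry_until(take_lock, lock_ctx, deadline, interval_ms) != 0)
        return -1;
    return retry_until(commit, commit_ctx, deadline, interval_ms);
}

/* Simulated operation for experimenting with the loop: reports EAGAIN
 * `busy` times, then succeeds. `calls` counts the attempts made. */
struct busy_counter { int busy; int calls; };

static int busy_then_ok(void *ctx)
{
    struct busy_counter *c = ctx;
    c->calls++;
    if (c->busy > 0) {
        c->busy--;
        errno = EAGAIN;
        return -1;
    }
    return 0;
}
```

With the simulated operation, a lock that is busy twice and a commit that
is busy three times both go through within a generous budget, while a
lock that never frees up makes the whole call fail once the budget
expires, without the commit ever being attempted.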