Re: [PATCH] netfilter: conntrack: fix wrong ct->timeout value

Tzung-Bi Shih <tzungbi@xxxxxxxxxx> · Fri, 14 Apr 2023 11:52:02 +0800

On Thu, Apr 13, 2023 at 11:04:24AM +0200, Pablo Neira Ayuso wrote:
> On Thu, Apr 13, 2023 at 11:23:11AM +0800, Tzung-Bi Shih wrote:
> > nf_ct_expires() got called by userspace program.  And the return
> > value (which means the remaining timeout) will be the parameter for
> > the next ctnetlink_change_timeout().
> 
> Unconfirmed conntrack is owned by the packet that refers to it, it is
> not yet in the hashes. I don't see how concurrent access to the
> timeout might occur.
> 
> Or are you referring to a different scenario that triggers the partial
> state?

I think it isn't a concurrent access.  We observed that userspace program
reads the remaining timeout and sets it back for unconfirmed conntrack.

Conceptually, it looks like:
timeout = nf_ct_expires(...);  (1)
ctnetlink_change_timeout(...timeout);

At (1), `nfct_time_stamp` is wrongly involved in the calculation[4] because
the conntrack is unconfirmed.  The `ct->timeout` is an internal but not a
timestamp.

As a result, ctnetlink_change_timeout() sets the wrong value to `ct->timeout`.

[4]: https://elixir.bootlin.com/linux/v6.3-rc6/source/include/net/netfilter/nf_conntrack.h#L296

> > As you can see in [4], if this happens on an unconfirmed conntrack, the
> > `nfct_time_stamp` would be wrongly invoved in the calculation again.
> > That's why we take care of all `ct->timeout` accesses in v1.

Again, that's why v1 separates all `ct->timeout` accesses.

If the conntrack is confirmed:
- `ct->timeout` is a timestamp.
- `nfct_time_stamp` should be involved when getting/setting `ct->timeout`.

If the conntrack is unconfirmed:
- `ct->timeout` is an interval.
- `nfct_time_stamp` shouldn't be involved.

Only separate `ct->timeout` access in __nf_ct_set_timeout() is obviously
insufficient.  I would suggest either take care of all `ct->timeout`
accesses as v1 or at least change both __nf_ct_set_timeout() and
nf_ct_expires().

In the latter case, it looks fine in our environment.  However, I'm not
sure if any hidden cases we haven't seen.