Hello, On Mon, 16 Oct 2023, xiaolinkui wrote: > From: Linkui Xiao <xiaolinkui@xxxxxxxxxx> > > There is a race condition which can be demonstrated by the following > script: > > ipset create hash_ip1 hash:net family inet hashsize 1024 maxelem 1048576 > ipset add hash_ip1 172.20.0.0/16 > ipset add hash_ip1 192.168.0.0/16 > iptables -A INPUT -m set --match-set hash_ip1 src -j ACCEPT > while [ 1 ] > do > ipset create hash_ip2 hash:net family inet hashsize 1024 maxelem 1048576 > ipset add hash_ip2 172.20.0.0/16 > ipset swap hash_ip1 hash_ip2 > ipset destroy hash_ip2 > sleep 0.05 > done I have been running the script above for more than two hours and nothing happened. What else is needed to trigger the bug? (I have been continuously sending ping to the tested host.) > Swap will exchange the values of ref so destroy will see ref = 0 instead of > ref = 1. So after running this script for a period of time, the following > race situations may occur: > CPU0: CPU1: > ipt_do_table > ->set_match_v4 > ->ip_set_test > ipset swap hash_ip1 hash_ip2 > ipset destroy hash_ip2 > ->hash_net4_kadt > > CPU0 found ipset through the index, and at this time, hash_ip2 has been > destroyed by CPU1 through name search. So CPU0 will crash when accessing > set->data in the function hash_net4_kadt. > > With this fix in place swap will not succeed because ip_set_test still has > ref_swapping on the set. > > Both destroy and swap will error out if ref_swapping != 0 on the set. > > Signed-off-by: Linkui Xiao <xiaolinkui@xxxxxxxxxx> > --- > net/netfilter/ipset/ip_set_core.c | 6 +++++- > 1 file changed, 5 insertions(+), 1 deletion(-) > > diff --git a/net/netfilter/ipset/ip_set_core.c b/net/netfilter/ipset/ip_set_core.c > index e5d25df5c64c..d6bd37010bfb 100644 > --- a/net/netfilter/ipset/ip_set_core.c > +++ b/net/netfilter/ipset/ip_set_core.c > @@ -741,11 +741,13 @@ ip_set_test(ip_set_id_t index, const struct sk_buff *skb, > int ret = 0; > > BUG_ON(!set); > + > + __ip_set_get_swapping(set); > pr_debug("set %s, index %u\n", set->name, index); > > if (opt->dim < set->type->dimension || > !(opt->family == set->family || set->family == NFPROTO_UNSPEC)) > - return 0; > + goto out; > > ret = set->variant->kadt(set, skb, par, IPSET_TEST, opt); > > @@ -764,6 +766,8 @@ ip_set_test(ip_set_id_t index, const struct sk_buff *skb, > ret = -ret; > } > > +out: > + __ip_set_put_swapping(set); > /* Convert error codes to nomatch */ > return (ret < 0 ? 0 : ret); > } > -- > 2.17.1 The patch above alas is also not acceptable: it adds locking to a lockless area and thus slows down the execution unnecessarily. If there's a bug, then that must be handled in ip_set_swap() itself, like for example adding a quiescent time and waiting for the ongoing users of the swapped set to finish their job. You can make it slow there, because swapping is not performance critical. Best regards, Jozsef -- E-mail : kadlec@xxxxxxxxxxxxxxxxx, kadlecsik.jozsef@xxxxxxxxx PGP key : https://wigner.hu/~kadlec/pgp_public_key.txt Address : Wigner Research Centre for Physics H-1525 Budapest 114, POB. 49, Hungary