RE: [PATCH bpf 1/2] bpf: Fix too early release of tcx_entry

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Daniel Borkmann wrote:
> Pedro Pinto and later independently also Hyunwoo Kim and Wongi Lee reported
> an issue that the tcx_entry can be released too early leading to a use
> after free (UAF) when an active old-style ingress or clsact qdisc with a
> shared tc block is later replaced by another ingress or clsact instance.
> 
> Essentially, the sequence to trigger the UAF (one example) can be as follows:
> 
>   1. A network namespace is created
>   2. An ingress qdisc is created. This allocates a tcx_entry, and
>      &tcx_entry->miniq is stored in the qdisc's miniqp->p_miniq. At the
>      same time, a tcf block with index 1 is created.
>   3. chain0 is attached to the tcf block. chain0 must be connected to
>      the block linked to the ingress qdisc to later reach the function
>      tcf_chain0_head_change_cb_del() which triggers the UAF.
>   4. Create and graft a clsact qdisc. This causes the ingress qdisc
>      created in step 1 to be removed, thus freeing the previously linked
>      tcx_entry:
> 
>      rtnetlink_rcv_msg()
>        => tc_modify_qdisc()
>          => qdisc_create()
>            => clsact_init() [a]
>          => qdisc_graft()
>            => qdisc_destroy()
>              => __qdisc_destroy()
>                => ingress_destroy() [b]
>                  => tcx_entry_free()
>                    => kfree_rcu() // tcx_entry freed
> 
>   5. Finally, the network namespace is closed. This registers the
>      cleanup_net worker, and during the process of releasing the
>      remaining clsact qdisc, it accesses the tcx_entry that was
>      already freed in step 4, causing the UAF to occur:
> 
>      cleanup_net()
>        => ops_exit_list()
>          => default_device_exit_batch()
>            => unregister_netdevice_many()
>              => unregister_netdevice_many_notify()
>                => dev_shutdown()
>                  => qdisc_put()
>                    => clsact_destroy() [c]
>                      => tcf_block_put_ext()
>                        => tcf_chain0_head_change_cb_del()
>                          => tcf_chain_head_change_item()
>                            => clsact_chain_head_change()
>                              => mini_qdisc_pair_swap() // UAF
> 
> There are also other variants, the gist is to add an ingress (or clsact)
> qdisc with a specific shared block, then to replace that qdisc, waiting
> for the tcx_entry kfree_rcu() to be executed and subsequently accessing
> the current active qdisc's miniq one way or another.
> 
> The correct fix is to turn the miniq_active boolean into a counter. What
> can be observed, at step 2 above, the counter transitions from 0->1, at
> step [a] from 1->2 (in order for the miniq object to remain active during
> the replacement), then in [b] from 2->1 and finally [c] 1->0 with the
> eventual release. The reference counter in general ranges from [0,2] and
> it does not need to be atomic since all access to the counter is protected
> by the rtnl mutex. With this in place, there is no longer a UAF happening
> and the tcx_entry is freed at the correct time.
> 
> Fixes: e420bed02507 ("bpf: Add fd-based tcx multi-prog infra with link support")
> Reported-by: Pedro Pinto <xten@xxxxxxx>
> Co-developed-by: Pedro Pinto <xten@xxxxxxx>
> Signed-off-by: Pedro Pinto <xten@xxxxxxx>
> Signed-off-by: Daniel Borkmann <daniel@xxxxxxxxxxxxx>
> Cc: Hyunwoo Kim <v4bel@xxxxxxxxx>
> Cc: Wongi Lee <qwerty@xxxxxxxxx>
> Cc: Martin KaFai Lau <martin.lau@xxxxxxxxxx>
> ---

Acked-by: John Fastabend <john.fastabend@xxxxxxxxx>




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux