On 10/25/2024 9:32 AM, Hou Tao wrote:
> From: Hou Tao <houtao1@xxxxxxxxxx>
>
> On 32-bit hosts (e.g., arm32), when a bpf program passes a u64 to
> bpf_iter_bits_new(), bpf_iter_bits_new() will use bits_copy to store the
> content of the u64. However, bits_copy is only 4 bytes, leading to stack
> corruption.
> SNIP
>
> Fix it by changing the type of both bits and bit_count from unsigned
> long to u64. However, the change is not enough. The main reason is that
> bpf_iter_bits_next() uses find_next_bit() to find the next bit and the
> pointer passed to find_next_bit() is an unsigned long pointer instead
> of a u64 pointer. For 32-bit little-endian host, it is fine but it is
> not the case for 32-bit big-endian host. Because under 32-bit big-endian
> host, the first iterated unsigned long will be the bits 32-63 of the u64
> instead of the expected bits 0-31. Therefore, in addition to changing
> the type, swap the two unsigned longs within the u64 for 32-bit
> big-endian host.
>
> Signed-off-by: Hou Tao <houtao1@xxxxxxxxxx>
> ---
>  kernel/bpf/helpers.c | 33 ++++++++++++++++++++++++++++++---
>  1 file changed, 30 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> index daec74820dbe..824718349958 100644
> --- a/kernel/bpf/helpers.c
> +++ b/kernel/bpf/helpers.c
> @@ -2855,13 +2855,36 @@ struct bpf_iter_bits {
>
>  struct bpf_iter_bits_kern {
>  	union {
> -		unsigned long *bits;
> -		unsigned long bits_copy;
> +		__u64 *bits;
> +		__u64 bits_copy;
>  	};
>  	int nr_bits;
>  	int bit;
>  } __aligned(8);
>
> +/* On 64-bit hosts, unsigned long and u64 have the same size, so passing
> + * a u64 pointer and an unsigned long pointer to find_next_bit() will
> + * return the same result, as both point to the same 8-byte area.
> + *
> + * For 32-bit little-endian hosts, using a u64 pointer or unsigned long
> + * pointer also makes no difference. This is because the first iterated
> + * unsigned long is composed of bits 0-31 of the u64 and the second unsigned
> + * long is composed of bits 32-63 of the u64.
> + *
> + * However, for 32-bit big-endian hosts, this is not the case. The first
> + * iterated unsigned long will be bits 32-63 of the u64, so swap these two
> + * ulong values within the u64.
> + */
> +static void swap_ulong_in_u64(u64 *bits, unsigned int nr)
> +{
> +#if !defined(CONFIG_64BIT) && defined(__BIG_ENDIAN)
> +	unsigned int i;
> +
> +	for (i = 0; i < nr; i++)
> +		bits[i] = (bits[i] >> 32) | ((u64)(u32)bits[i] << 32);
> +#endif
> +}
> +

I just found the bitmap_from_arr64() API in lib/bitmap. However, the API
assumes that the memory for dst and src does not overlap, so unfortunately
we cannot use it here. Judging by the implementation of
bitmap_from_arr64(), I think it would be better to use
"BITS_PER_LONG == 32" instead of "!defined(CONFIG_64BIT)" in
swap_ulong_in_u64().

>  /**
>   * bpf_iter_bits_new() - Initialize a new bits iterator for a given memory area
>   * @it: The new bpf_iter_bits to be created
> @@ -2906,6 +2929,8 @@ bpf_iter_bits_new(struct bpf_iter_bits *it, const u64 *unsafe_ptr__ign, u32 nr_w
>  		if (err)
>  			return -EFAULT;
>
> +		swap_ulong_in_u64(&kit->bits_copy, nr_words);
> +
>  		kit->nr_bits = nr_bits;
>  		return 0;
>  	}
> @@ -2924,6 +2949,8 @@ bpf_iter_bits_new(struct bpf_iter_bits *it, const u64 *unsafe_ptr__ign, u32 nr_w
>  		return err;
>  	}
>
> +	swap_ulong_in_u64(kit->bits, nr_words);
> +
>  	kit->nr_bits = nr_bits;
>  	return 0;
>  }