On Tue, Mar 4, 2025 at 7:15 PM Kumar Kartikeya Dwivedi <memxor@xxxxxxxxx> wrote: > > > which is doing: > > __unqual_typeof(*(p)) VAL; > > (__unqual_typeof(*(p)))READ_ONCE(*__ptr); > > > > and llvm will insert cast_kern() there, > > Yes, I do see a r1 = addr_space_cast(r2, 0x0, 0x1). > r2 is node->next loaded from arena pointer 'node'. > > But I can't understand why that's a problem. > > If I do > for (;;) { > next = READ_ONCE(node->next); > if (next) > break; > cond_break_label(...); > } > > instead of the macro, everything works ok. because the above doesn't have addr space casts. > But that's because LLVM didn't insert a cast, and the verifier sees > next as a scalar. > So if next is 0x100000000000, it will see 0x100000000000. > With cast_kern it only sees 0. right. > It will probably be casted once we try to write to next->locked later on. not quite. In a typical program llvm will emit bare minimum cast_user, because all pointers are full 64-bit valid user space addresses all the time. The cast_kern() is needed for read/write through the pointer if it's not a kernel pointer yet. See list_add_head() in bpf_arena_list.h that has a bunch of explicit cast_kern/user (with llvm there will be a fraction of them), but they illustrate the idea: cast_user(first); cast_kern(n); // before writing into 'n' it has to be 'kern' WRITE_ONCE(n->next, first); // first has to be full 64-bit cast_kern(first); // ignore this one :) it's my mistake. should be after 'if' if (first) { tmp = &n->next; cast_user(tmp); WRITE_ONCE(first->pprev, tmp); } cast_user(n); WRITE_ONCE(h->first, n); > I would gather there's a lot of other cases where someone dereferences > before doing some pointer equality comparison. > In that case we might end up in the same situation. > ptr = load_from_arena; > x = ptr->xyz; > if (ptr == ptr2) { ... } There shouldn't be any issues here. The 'load from arena' will return full 64-bit and they should be stored as full 64-bit in memory. ptr->xyz (assuming xyz is another pointer) will read full 64-bit too. > The extra cast_kern is certainly causing this to surface, but I am not > sure whether it's something to fix in the macro. I think it's a macro issue due to casting addr space off. > > so if (VAL) always sees upper 32-bit as zero. > > > > So I suspect it's not a zero page issue. > > > > When I bpf_printk the node address of the qnode of CPU 0, it is > 0x100000000000 i.e. user_vm_start. This is the pointer that's misdetected. > So it appears to be on the first page. yes and looks the addr passed into printk is correct full 64-bit as it should be. So this part: return &qnodes[cpu + 1][idx].mcs; is fine. It's full 64-bit. &((struct arena_qnode __arena *)base + idx)->mcs; is also ok. There are no addr space casts there. But the macro is problematic.