On Wed, 16 Oct 2024 20:13:25 +0200
Sven Schnelle <svens@xxxxxxxxxxxxx> wrote:

> > From what you said in v15:
> >
> >> I haven't yet fully understood why this logic is needed, but the
> >> WARN_ON_ONCE triggers on s390. I'm assuming this fails because fp always
> >> has the upper bits of the address set on x86 (and likely others). As an
> >> example, in my test setup, fp is 0x8feec218 on s390, while it is
> >> 0xffff888100add118 in x86-kvm.
> >
> > Since we only need to save 4 bits for size, we could have what it is
> > replacing always be zero or always be f, depending on the arch. The
> > question then is, is s390's 4 MSBs always zero?
>
> s390 has separate address spaces for kernel and userspace - so kernel
> addresses could be anywhere. I don't know think the range should be
> limited artifically because of some optimizations.

Note, this is information saved in the shadow stack. When the first
callback is attached to the fgraph tracer, all tasks will get a shadow
stack. It currently defaults to PAGE_SIZE. When a function is entered and
one of the callbacks wants to trace that function, information is saved on
the shadow stack, including the original return address, as the old return
address is replaced with the address of a trampoline that jumps to the
return side callback on function exit.

We allow up to 16 callbacks to be attached to the tracer. Each callback
that wants to trace the return side will have some information saved on
this shadow stack. Note that all information in the shadow stack is saved
in units of the natural word size (8 bytes on 64 bit machines, which gives
512 storage slots on a 4096 byte shadow stack). Each entry callback can
also reserve space on this shadow stack (in word aligned segments) for
information that can be retrieved by the function exit callback.

The information we are storing is saved on this shadow stack. The
optimization being done here is to not waste a full 8 bytes (1 slot on the
shadow stack) for just 4 bits.
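
To make the trade-off concrete, here is a minimal sketch in C of folding a
4 bit size into the top bits of a 64 bit fp so that both fit in one slot.
It is not the fgraph code. It assumes the architecture guarantees that the
top 4 bits of a kernel fp are always f (as in the x86 example quoted
above), and every name in it (pack_fp_size(), unpack_fp(),
fp_is_packable(), the macros) is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Slot arithmetic from the mail: 4096 byte stack, 8 byte words. */
#define SHADOW_STACK_SIZE   4096
#define SHADOW_STACK_SLOTS  (SHADOW_STACK_SIZE / sizeof(uint64_t))

/* Hypothetical layout: the size lives in the 4 most significant bits. */
#define SIZE_BITS   4
#define SIZE_SHIFT  (64 - SIZE_BITS)
#define SIZE_MASK   ((uint64_t)0xf << SIZE_SHIFT)

/* Assumption: this arch always has these bits set in a kernel fp. */
#define FP_MSB_FILL SIZE_MASK

/* An fp can share a slot only if its top bits match the assumed fill. */
static inline int fp_is_packable(uint64_t fp)
{
	return (fp & SIZE_MASK) == FP_MSB_FILL;
}

/* Overwrite the top 4 bits of fp with the size. */
static inline uint64_t pack_fp_size(uint64_t fp, unsigned int size)
{
	assert(size <= 0xf);
	return (fp & ~SIZE_MASK) | ((uint64_t)size << SIZE_SHIFT);
}

/* Recover the size from the top 4 bits of the slot. */
static inline unsigned int unpack_size(uint64_t slot)
{
	return (unsigned int)(slot >> SIZE_SHIFT);
}

/* Recover fp by putting the assumed fill bits back. */
static inline uint64_t unpack_fp(uint64_t slot)
{
	return (slot & ~SIZE_MASK) | FP_MSB_FILL;
}
```

With the addresses quoted above, 0xffff888100add118 is packable under this
assumption and round-trips through one slot, while 0x8feec218 is not,
because its top 4 bits are zero. An architecture whose kernel addresses can
be anywhere has no fixed fill value to restore, which is the case where a
second slot is needed.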

If we need to save the full fp, then there's no choice but to use a full
slot to also save the 4 bits. If other architectures can do tricks to
combine the size and fp, they should, to save the slots.

I also plan on changing the size of the shadow stack. We could even do
that per architecture. Thus, if we need to use two slots on the shadow
stack to save the fp and size, then we could make the shadow stack bigger
for those architectures that need that.

-- Steve