* Josh Poimboeuf <jpoimboe@xxxxxxxxxx> wrote:

> Anyway, I used some linker magic to temporarily move the unwinder code to the
> end of .text, so that unwinder changes don't add unexpected side effects to the
> microbenchmark behavior. Now I'm getting more consistent results: the packed
> struct is measuring ~2% slower. The slight slowdown might just be explained by
> the fact that GCC generates some extra instructions for extracting the fields
> out of the packed struct.

Yeah, 16-bit field accesses are more complex than accesses to a zero-extended
32-bit field, even on x86, which has a fair amount of 16-bit legacy.

> In the meantime, I found a ~10% speedup by making the "fast lookup table" block
> size a power-of-two (256) to get rid of the need for a slow 'div' instruction.
>
> I think I'm done performance tweaking for now. I'll keep the packed struct, and
> add the code for the 'div' removal, and hope to submit v3 soon.

Sounds good to me!

A ~2% slowdown for ~30% RAM savings, for a debug data structure that is about as
large as a typical kernel's total .text, is a decent trade-off.

Thanks,

	Ingo
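
For readers following along, here is a minimal sketch of the two trade-offs discussed above. It is not the code from the patch set. The struct layouts, field names, and helper names (entry_wide, entry_packed, block_index_div, block_index_shift, LOOKUP_BLOCK_ORDER) are all hypothetical, and the sketch assumes the earlier block size was not a compile-time power of two, so that the index computation needed a real 'div'. Only the block size of 256 comes from the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical "wide" entry: every field is a full 32-bit word. */
struct entry_wide {
	int32_t  sp_offset;
	int32_t  bp_offset;
	uint32_t flags;
};

/*
 * Hypothetical packed entry: 16-bit fields, no padding.
 * __attribute__((packed)) is a GCC/Clang extension.
 */
struct entry_packed {
	int16_t  sp_offset;
	int16_t  bp_offset;
	uint16_t flags;
} __attribute__((packed));

/*
 * Reading a 16-bit signed field needs a sign-extending load before it
 * can be used in address arithmetic; this is the kind of extra work
 * the packed layout adds on each access.
 */
static inline long packed_sp_offset(const struct entry_packed *e)
{
	return e->sp_offset;
}

/* Block size from the thread: 256 bytes, i.e. 1 << 8. */
#define LOOKUP_BLOCK_ORDER	8
#define LOOKUP_BLOCK_SIZE	(1UL << LOOKUP_BLOCK_ORDER)

_Static_assert((LOOKUP_BLOCK_SIZE & (LOOKUP_BLOCK_SIZE - 1)) == 0,
	       "block size must be a power of two");

/* General form: with a runtime block_size this compiles to a division. */
static inline unsigned long block_index_div(unsigned long addr,
					    unsigned long text_start,
					    unsigned long block_size)
{
	assert(addr >= text_start);
	return (addr - text_start) / block_size;
}

/* Power-of-two form: the division becomes a single right shift. */
static inline unsigned long block_index_shift(unsigned long addr,
					      unsigned long text_start)
{
	assert(addr >= text_start);
	return (addr - text_start) >> LOOKUP_BLOCK_ORDER;
}
```

Both index helpers return the same block number when block_size is 256; the shift variant just avoids the divide. The packed layout halves the size of this illustrative entry, which is not meant to reproduce the ~30% figure quoted above.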