On Tue, 1 Aug 2023 19:22:01 -0700
Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> wrote:

> On Tue, Aug 1, 2023 at 5:44 PM Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
> >
> > On Tue, 1 Aug 2023 20:40:54 -0400
> > Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
> >
> > > Maybe we can add a ftrace_partial_regs(fregs) that returns a
> > > partially filled pt_regs, and the caller that uses this obviously knows
> > > its partial (as it's in the name). But this doesn't quite help out arm64
> > > because unlike x86, struct ftrace_regs does not contain an address
> > > compatibility with pt_regs fields. It would need to do a copy.
> > >
> > >  ftrace_partial_regs(fregs, &regs) ?
> >
> > Well, both would be pointers so you wouldn't need the "&", but it was
> > to stress that it would be copying one to the other.
> >
> >   void ftrace_partial_regs(const struct ftrace_regs *fregs, struct pt_regs *regs);
>
> Copy works, but why did you pick a different layout?

I think it is to minimize stack consumption. pt_regs on arm64 will
consume 42*u64 = 336 bytes; on the other hand, ftrace_regs will use
14*unsigned long = 112 bytes. And most of the registers in pt_regs are
usually not accessed. (As you may know, RISC processors usually have many
registers - and x86 will too if we use APX in the kernel. So pt_regs is big.)

> Why not to use pt_regs ? if save of flags is slow, just skip that part
> and whatever else that is slow. You don't even need to zero out
> unsaved fields. Just ask the caller to zero out pt_regs before hand.
> Most users have per-cpu pt_regs that is being reused.
> So there will be one zero-out in the beginning and every partial
> save of regs will be fast.
> Then there won't be any need for copy-converter from ftrace_regs to pt_regs.
> Maybe too much churn at this point. copy is fine.

If there is no nested call, yeah, per-cpu pt_regs will work.

Thank you,

-- 
Masami Hiramatsu (Google) <mhiramat@xxxxxxxxxx>
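
For illustration only, the copy discussed above could be sketched in
user space roughly as follows. This is not kernel code: the struct
names, field names, and layouts (mock_pt_regs, mock_ftrace_regs,
mock_ftrace_partial_regs) are assumptions chosen only to match the
42-word and 14-word sizes quoted in this thread, and the helper returns
the destination pointer purely for convenience.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for arm64 pt_regs: 42 x 64-bit words. */
struct mock_pt_regs {
	uint64_t regs[31];	/* x0..x30 */
	uint64_t sp;
	uint64_t pc;
	uint64_t pstate;
	uint64_t other[8];	/* remaining bookkeeping fields (assumed) */
};

/* Hypothetical stand-in for arm64 ftrace_regs: 14 x 64-bit words. */
struct mock_ftrace_regs {
	uint64_t regs[9];	/* x0..x8: arguments and indirect result */
	uint64_t direct_tramp;
	uint64_t fp;
	uint64_t lr;
	uint64_t sp;
	uint64_t pc;
};

/* Compile-time check of the sizes quoted in the thread. */
_Static_assert(sizeof(struct mock_pt_regs) == 42 * sizeof(uint64_t),
	       "mock pt_regs should be 336 bytes");
_Static_assert(sizeof(struct mock_ftrace_regs) == 14 * sizeof(uint64_t),
	       "mock ftrace_regs should be 112 bytes");

/*
 * Copy only the registers that the small structure actually holds.
 * Fields of *regs that have no counterpart are left untouched, so the
 * caller is expected to have zeroed *regs beforehand.
 */
static struct mock_pt_regs *
mock_ftrace_partial_regs(const struct mock_ftrace_regs *fregs,
			 struct mock_pt_regs *regs)
{
	assert(fregs != NULL && regs != NULL);

	memcpy(regs->regs, fregs->regs, sizeof(fregs->regs));
	regs->regs[29] = fregs->fp;	/* x29 is the frame pointer */
	regs->regs[30] = fregs->lr;	/* x30 is the link register */
	regs->sp = fregs->sp;
	regs->pc = fregs->pc;
	return regs;
}
```

A caller would zero its mock_pt_regs once (memset), then call
mock_ftrace_partial_regs(&fregs, &regs) on each hit. As noted above,
reusing one per-cpu buffer this way only works if the call cannot nest.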