On Thu, Apr 21, 2016 at 12:39:42PM -0700, Andy Lutomirski wrote:
> On Wed, Apr 20, 2016 at 12:05 PM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> > On Wed, Apr 20, 2016 at 08:40:23AM -0700, Andy Lutomirski wrote:

> >> >> Peter, I got lost in the code that calls this.  Are regs coming from
> >> >> the overflow interrupt's regs, current_pt_regs(), or
> >> >> perf_get_regs_user?
> >> >
> >> > So get_perf_callchain() will get regs from:
> >> >
> >> >  - interrupt/NMI regs
> >> >  - perf_arch_fetch_caller_regs()
> >> >
> >> > And when user && !user_mode(), we'll use:
> >> >
> >> >  - task_pt_regs() (which arguably should maybe be perf_get_regs_user())
> >>
> >> Could you point me to this bit of the code?
> >
> > kernel/events/callchain.c:198
>
> But that only applies to the callchain code, right?

Yes, which is what I thought you were after..

> AFAICS the PEBS
> code is invoked through the x86_pmu NMI handler and always gets the
> IRQ regs.  Except for this case:
>
> static inline void intel_pmu_drain_pebs_buffer(void)
> {
>         struct pt_regs regs;
>
>         x86_pmu.drain_pebs(&regs);
> }
>
> which seems a bit confused.

Yes, so that only gets used with 'large' PEBS, which requires no other
flags than PEBS_FREERUNNING_FLAGS, which precludes the regs set from
being used.

Could definitely use a comment.

> I don't suppose we could arrange to pass something consistent into the
> PEBS handlers...
>
> Or is the PEBS code being called from the callchain code somehow?

No. I think we were/are slightly talking past one another.

> >> One call to perf_get_user_regs per interrupt shouldn't be too bad --
> >> certainly much better than one per PEBS record.  One call to get user
> >> ABI per overflow would be even less bad, but at that point, folding it
> >> in to the PEBS code wouldn't be so bad either.
> >
> > Right; although note that the whole fixup_ip() thing requires a single
> > record per interrupt (for we need the LBR state for each record in order
> > to rewind).
>
> So do earlier PEBS events not get rewound?  Or do we just program the
> thing to only ever give us one event at a time?

The latter; we program PEBS such that it can hold but a single record
and thereby ensure we get an interrupt for each record.

> > The problem here is that the overflow stuff is designed for a single
> > 'event' per interrupt, so passing it data for multiple events is
> > somewhat icky.
>
> It also seems that there's a certain amount of confusion as to exactly
> what "regs" means in various contexts.  Or at least I'm confused by
> it.

Yes, there's too much regs.

Typically 'regs' is the 'interrupt'/'event' regs, that is, the register
set at eventing time. For sampling hardware PMUs this is NMI/IRQ like
things, for software events this ends up being
perf_arch_fetch_caller_regs().

Then there's PERF_SAMPLE_REGS_USER|PERF_SAMPLE_STACK_USER, which, for
each event with it set, uses perf_get_regs_user() to dump the thing into
our ringbuffer as part of the event record.

And then there's the callchain code, which first unwinds kernel space if
the 'interrupt'/'event' reg set points into the kernel, and then uses
task_pt_regs() (which I think we agree should be perf_get_regs_user())
to obtain the user regs to continue with the user stack unwind.

Finally there's PERF_SAMPLE_REGS_INTR, which dumps whatever
'interrupt'/'event' regs we get into the ringbuffer sample record.

Did that help? Or did I confuse you moar?
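
As a rough illustration of the cases listed above, here is a toy
user-space model in C. It is a sketch only, not kernel code: struct
toy_regs, enum regs_source, enum toy_sample_kind, plan_callchain() and
sample_regs_source() are all made-up names, and the "event hit user
space" branch is my reading of the "user && !user_mode()" condition
rather than something spelled out in this thread.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct pt_regs; NOT the kernel's layout. */
struct toy_regs {
	unsigned long ip;
	bool user;	/* snapshot was taken while running in user space */
};

/* Hypothetical labels for where a consumer gets its registers from. */
enum regs_source {
	SRC_NONE,		/* nothing to do for this part */
	SRC_EVENT_REGS,		/* the 'interrupt'/'event' regs */
	SRC_TASK_USER_REGS,	/* task_pt_regs() (arguably perf_get_regs_user()) */
	SRC_PERF_GET_REGS_USER,	/* perf_get_regs_user() */
};

/* Hypothetical sample kinds, mirroring the two register-dumping flags. */
enum toy_sample_kind {
	TOY_SAMPLE_REGS_USER,	/* models PERF_SAMPLE_REGS_USER */
	TOY_SAMPLE_REGS_INTR,	/* models PERF_SAMPLE_REGS_INTR */
};

struct callchain_plan {
	enum regs_source kernel_part;
	enum regs_source user_part;
};

/*
 * Callchain flow as described above: unwind the kernel from the event
 * regs if they point into the kernel, then continue the user unwind
 * from the task's user regs.
 */
struct callchain_plan plan_callchain(const struct toy_regs *event_regs,
				     bool want_kernel, bool want_user)
{
	struct callchain_plan plan = { SRC_NONE, SRC_NONE };

	if (event_regs == NULL)
		return plan;

	if (!event_regs->user) {
		if (want_kernel)
			plan.kernel_part = SRC_EVENT_REGS;
		if (want_user)
			plan.user_part = SRC_TASK_USER_REGS;
	} else if (want_user) {
		/* Assumption: event regs are already user regs here. */
		plan.user_part = SRC_EVENT_REGS;
	}

	return plan;
}

/* Which register set each sample kind dumps into the ring buffer. */
enum regs_source sample_regs_source(enum toy_sample_kind kind)
{
	switch (kind) {
	case TOY_SAMPLE_REGS_USER:
		return SRC_PERF_GET_REGS_USER;
	case TOY_SAMPLE_REGS_INTR:
		return SRC_EVENT_REGS;
	}
	return SRC_NONE;
}
```

In this model an event that lands in the kernel with both kernel and
user callchains requested yields SRC_EVENT_REGS for the kernel part and
SRC_TASK_USER_REGS for the user part, which is the one spot where the
thread suggests perf_get_regs_user() would be the more consistent
choice.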