Re: [PATCH bpf-next 07/13] uprobes/x86: Add support to emulate nop5 instruction

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Dec 13, 2024 at 11:45:36AM +0100, Peter Zijlstra wrote:
> On Wed, Dec 11, 2024 at 02:33:56PM +0100, Jiri Olsa wrote:
> > Adding support to emulate nop5 as the original uprobe instruction.
> > 
> > This speeds up uprobes on top of nop5 instructions:
> > (results from benchs/run_bench_uprobes.sh)
> > 
> > current:
> > 
> >      uprobe-nop     :    3.252 ± 0.019M/s
> >      uprobe-push    :    3.097 ± 0.002M/s
> >      uprobe-ret     :    1.116 ± 0.001M/s
> >  --> uprobe-nop5    :    1.115 ± 0.001M/s
> >      uretprobe-nop  :    1.731 ± 0.016M/s
> >      uretprobe-push :    1.673 ± 0.023M/s
> >      uretprobe-ret  :    0.843 ± 0.009M/s
> >  --> uretprobe-nop5 :    1.124 ± 0.001M/s
> > 
> > after the change:
> > 
> >      uprobe-nop     :    3.281 ± 0.003M/s
> >      uprobe-push    :    3.085 ± 0.003M/s
> >      uprobe-ret     :    1.130 ± 0.000M/s
> >  --> uprobe-nop5    :    3.276 ± 0.007M/s
> >      uretprobe-nop  :    1.716 ± 0.016M/s
> >      uretprobe-push :    1.651 ± 0.017M/s
> >      uretprobe-ret  :    0.846 ± 0.006M/s
> >  --> uretprobe-nop5 :    3.279 ± 0.002M/s
> > 
> > Strangely I can see uretprobe-nop5 is now much faster compared to
> > uretprobe-nop, while perf profiles for both are almost identical.
> > I'm still checking on that.
> > 
> > Signed-off-by: Jiri Olsa <jolsa@xxxxxxxxxx>
> > ---
> >  arch/x86/kernel/uprobes.c | 7 +++++++
> >  1 file changed, 7 insertions(+)
> > 
> > diff --git a/arch/x86/kernel/uprobes.c b/arch/x86/kernel/uprobes.c
> > index 23e4f2821cff..cdea97f8cd39 100644
> > --- a/arch/x86/kernel/uprobes.c
> > +++ b/arch/x86/kernel/uprobes.c
> > @@ -909,6 +909,11 @@ static const struct uprobe_xol_ops push_xol_ops = {
> >  	.emulate  = push_emulate_op,
> >  };
> >  
> > +static int is_nop5_insn(uprobe_opcode_t *insn)
> > +{
> > +	return !memcmp(insn, x86_nops[5], 5);
> > +}
> > +
> >  /* Returns -ENOSYS if branch_xol_ops doesn't handle this insn */
> >  static int branch_setup_xol_ops(struct arch_uprobe *auprobe, struct insn *insn)
> >  {
> > @@ -928,6 +933,8 @@ static int branch_setup_xol_ops(struct arch_uprobe *auprobe, struct insn *insn)
> >  		break;
> >  
> >  	case 0x0f:
> > +		if (is_nop5_insn((uprobe_opcode_t *) &auprobe->insn))
> > +			goto setup;
> 
> This isn't right, this is not x86_64 specific code, and there's a bunch
> of 32bit 5 byte nops that do not start with 0f.
> 
> Also, since you already have the insn decoded, I would suggest you
> simply check OPCODE2(insn) == 0x1f /* NOPL */ and length == 5.

ah right.. ok will change, thanks

jirka

> 
> >  		if (insn->opcode.nbytes != 2)
> >  			return -ENOSYS;
> >  		/*
> > -- 
> > 2.47.0
> > 




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux