Re: [PATCH v4 1/4] LoongArch: vDSO: Wire up getrandom() vDSO implementation

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, 2024-08-27 at 15:49 +0200, Jason A. Donenfeld wrote:
> On Tue, Aug 27, 2024 at 09:20:14PM +0800, Xi Ruoyao wrote:
> > +	register long ret asm("a0");
> > +	register long int nr asm("a7") = __NR_getrandom;
> 
> The first line is `long` and the second line is `long int` here. Just
> call them both `long` like usual?

I'll change it.
> 
> >  struct loongarch_vdso_data {
> >  	struct vdso_pcpu_data pdata[NR_CPUS];
> > +#ifdef CONFIG_VDSO_GETRANDOM
> > +	struct vdso_rng_data rng_data;
> > +#endif
> 
> If VSO_GETRANDOM is selected unconditionally for the arch, why the
> ifdef
> here?
> 
> > +obj-vdso-$(CONFIG_VDSO_GETRANDOM) += vgetrandom.o vgetrandom-
> > chacha.o
> 
> Likewise, same question here.

I'll remove the ifdef and just add them into obj-vdso-y.

> > +	/* copy[3] = "expa" */
> > +	li.w		copy3, 0x6b206574
> 
> Might want to mention why you're doing this.
> 
>     /* copy[3] = "expa", because it was clobbered by the i index. */

I'll add it.

> Or something like that.
> 
> But on the topic of those constants,
> 
> > +       li.w            copy0, 0x61707865
> > +       li.w            copy1, 0x3320646e
> > +       li.w            copy2, 0x79622d32
> 
> What if you avoid doing this,
> 
> > +
> > +       ld.w            cnt_lo, counter, 0
> > +       ld.w            cnt_hi, counter, 4
> > +
> > +.Lblock:
> > +       /* state[0,1,2,3] = "expand 32-byte k" */
> > +       move            state0, copy0
> > +       move            state1, copy1
> > +       move            state2, copy2
> 
> Use li.w here with the integer literals,

li.w is expanded to two instructions (lu12i.w + addi.w) by the
assembler.

> > +	/* copy[3] = "expa" */
> > +	li.w		copy3, 0x6b206574
> 
> Skip this,
> 
> > +       add.w           state0, state0, copy0
> > +       add.w           state1, state1, copy1
> > +       add.w           state2, state2, copy2
> > +       add.w           state3, state3, copy3
> 
> And then use addi.w here with the integer literals instead?

LoongArch addi.w can only handle 12-bit signed immediate values (such a
limitation is very common in RISC machines).  On my processor I can
avoid using a register to materialize the constant with addu16i.d +
addu12i.w + addi.w.  But there would be 3 instructions, and addu12i.w is
a part of the Loongson Binary Translation extension which is not
available on some processors.  Also LBT isn't intended for general use,
so most LBT instructions have a lower throughput than the basic
instructions.

> I don't know anything about loongarch, so just guessing.

> BTW, can you confirm that this passes the test in test_vdso_chacha?

Yes, it has passed the test.

-- 
Xi Ruoyao <xry111@xxxxxxxxxxx>
School of Aerospace Science and Technology, Xidian University





[Index of Archives]     [Kernel]     [Gnu Classpath]     [Gnu Crypto]     [DM Crypt]     [Netfilter]     [Bugtraq]
  Powered by Linux