On Tue, 2024-08-27 at 15:49 +0200, Jason A. Donenfeld wrote: > On Tue, Aug 27, 2024 at 09:20:14PM +0800, Xi Ruoyao wrote: > > + register long ret asm("a0"); > > + register long int nr asm("a7") = __NR_getrandom; > > The first line is `long` and the second line is `long int` here. Just > call them both `long` like usual? I'll change it. > > > struct loongarch_vdso_data { > > struct vdso_pcpu_data pdata[NR_CPUS]; > > +#ifdef CONFIG_VDSO_GETRANDOM > > + struct vdso_rng_data rng_data; > > +#endif > > If VSO_GETRANDOM is selected unconditionally for the arch, why the > ifdef > here? > > > +obj-vdso-$(CONFIG_VDSO_GETRANDOM) += vgetrandom.o vgetrandom- > > chacha.o > > Likewise, same question here. I'll remove the ifdef and just add them into obj-vdso-y. > > + /* copy[3] = "expa" */ > > + li.w copy3, 0x6b206574 > > Might want to mention why you're doing this. > > /* copy[3] = "expa", because it was clobbered by the i index. */ I'll add it. > Or something like that. > > But on the topic of those constants, > > > + li.w copy0, 0x61707865 > > + li.w copy1, 0x3320646e > > + li.w copy2, 0x79622d32 > > What if you avoid doing this, > > > + > > + ld.w cnt_lo, counter, 0 > > + ld.w cnt_hi, counter, 4 > > + > > +.Lblock: > > + /* state[0,1,2,3] = "expand 32-byte k" */ > > + move state0, copy0 > > + move state1, copy1 > > + move state2, copy2 > > Use li.w here with the integer literals, li.w is expanded to two instructions (lu12i.w + addi.w) by the assembler. > > + /* copy[3] = "expa" */ > > + li.w copy3, 0x6b206574 > > Skip this, > > > + add.w state0, state0, copy0 > > + add.w state1, state1, copy1 > > + add.w state2, state2, copy2 > > + add.w state3, state3, copy3 > > And then use addi.w here with the integer literals instead? LoongArch addi.w can only handle 12-bit signed immediate values (such a limitation is very common in RISC machines). On my processor I can avoid using a register to materialize the constant with addu16i.d + addu12i.w + addi.w. But there would be 3 instructions, and addu12i.w is a part of the Loongson Binary Translation extension which is not available on some processors. Also LBT isn't intended for general use, so most LBT instructions have a lower throughput than the basic instructions. > I don't know anything about loongarch, so just guessing. > BTW, can you confirm that this passes the test in test_vdso_chacha? Yes, it has passed the test. -- Xi Ruoyao <xry111@xxxxxxxxxxx> School of Aerospace Science and Technology, Xidian University