Re: [PATCH bpf] arm32, bpf: reimplement sign-extension mov instruction

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Apr 19, 2024 at 06:28:32PM +0000, Puranjay Mohan wrote:
> The current implementation of the mov instruction with sign extension
> has the following problems:
> 
>   1. It clobbers the source register if it is not stacked because it
>      sign extends the source and then moves it to the destination.
>   2. If the dst_reg is stacked, the current code doesn't write the value
>      back in case of 64-bit mov.
>   3. There is room for improvement by emitting fewer instructions.
> 
> The steps for fixing this and the instructions emitted by the JIT are
> explained below with examples in all combinations:
> 
> Case A: offset == 32:
> =====================
>   Case A.1: src and dst are stacked registers:
>   --------------------------------------------
>     1. Load src_lo into tmp_lo
>     2. Store tmp_lo into dst_lo
>     3. Sign extend tmp_lo into tmp_hi
>     4. Store tmp_hi to dst_hi
> 
>     Example: r3 = (s32)r3
> 	r3 is a stacked register
> 
> 	ldr     r6, [r11, #-16]	// Load r3_lo into tmp_lo
> 	// str to dst_lo is not emitted because src_lo == dst_lo
> 	asr     r7, r6, #31	// Sign extend tmp_lo into tmp_hi
> 	str     r7, [r11, #-12] // Store tmp_hi into r3_hi
> 
>   Case A.2: src is stacked but dst is not:
>   ----------------------------------------
>     1. Load src_lo into dst_lo
>     2. Sign extend dst_lo into dst_hi
> 
>     Example: r6 = (s32)r3
> 	r6 maps to {ARM_R5, ARM_R4} and r3 is stacked
> 
> 	ldr     r4, [r11, #-16] // Load r3_lo into r6_lo
> 	asr     r5, r4, #31	// Sign extend r6_lo into r6_hi
> 
>   Case A.3: src is not stacked but dst is stacked:
>   ------------------------------------------------
>     1. Store src_lo into dst_lo
>     2. Sign extend src_lo into tmp_hi
>     3. Store tmp_hi to dst_hi
> 
>     Example: r3 = (s32)r6
> 	r3 is stacked and r6 maps to {ARM_R5, ARM_R4}
> 
> 	str     r4, [r11, #-16] // Store r6_lo to r3_lo
> 	asr     r7, r4, #31	// Sign extend r6_lo into tmp_hi
> 	str     r7, [r11, #-12]	// Store tmp_hi to dest_hi
> 
>   Case A.4: Both src and dst are not stacked:
>   -------------------------------------------
>     1. Mov src_lo into dst_lo
>     2. Sign extend src_lo into dst_hi
> 
>     Example: (bf) r6 = (s32)r6
> 	r6 maps to {ARM_R5, ARM_R4}
> 
> 	// Mov not emitted because dst == src
> 	asr     r5, r4, #31 // Sign extend r6_lo into r6_hi
> 
> Case B: offset != 32:
> =====================
>   Case B.1: src and dst are stacked registers:
>   --------------------------------------------
>     1. Load src_lo into tmp_lo
>     2. Sign extend tmp_lo according to offset.
>     3. Store tmp_lo into dst_lo
>     4. Sign extend tmp_lo into tmp_hi
>     5. Store tmp_hi to dst_hi
> 
>     Example: r9 = (s8)r3
> 	r9 and r3 are both stacked registers
> 
> 	ldr     r6, [r11, #-16] // Load r3_lo into tmp_lo
> 	lsl     r6, r6, #24	// Sign extend tmp_lo
> 	asr     r6, r6, #24	// ..
> 	str     r6, [r11, #-56] // Store tmp_lo to r9_lo
> 	asr     r7, r6, #31	// Sign extend tmp_lo to tmp_hi
> 	str     r7, [r11, #-52] // Store tmp_hi to r9_hi
> 
>   Case B.2: src is stacked but dst is not:
>   ----------------------------------------
>     1. Load src_lo into dst_lo
>     2. Sign extend dst_lo according to offset.
>     3. Sign extend tmp_lo into dst_hi
> 
>     Example: r6 = (s8)r3
> 	r6 maps to {ARM_R5, ARM_R4} and r3 is stacked
> 
> 	ldr     r4, [r11, #-16] // Load r3_lo to r6_lo
> 	lsl     r4, r4, #24	// Sign extend r6_lo
> 	asr     r4, r4, #24	// ..
> 	asr     r5, r4, #31	// Sign extend r6_lo into r6_hi
> 
>   Case B.3: src is not stacked but dst is stacked:
>   ------------------------------------------------
>     1. Sign extend src_lo into tmp_lo according to offset.
>     2. Store tmp_lo into dst_lo.
>     3. Sign extend src_lo into tmp_hi.
>     4. Store tmp_hi to dst_hi.
> 
>     Example: r3 = (s8)r1
> 	r3 is stacked and r1 maps to {ARM_R3, ARM_R2}
> 
> 	lsl     r6, r2, #24 	// Sign extend r1_lo to tmp_lo
> 	asr     r6, r6, #24	// ..
> 	str     r6, [r11, #-16] // Store tmp_lo to r3_lo
> 	asr     r7, r6, #31	// Sign extend tmp_lo to tmp_hi
> 	str     r7, [r11, #-12] // Store tmp_hi to r3_hi
> 
>   Case B.4: Both src and dst are not stacked:
>   -------------------------------------------
>     1. Sign extend src_lo into dst_lo according to offset.
>     2. Sign extend dst_lo into dst_hi.
> 
>     Example: r6 = (s8)r1
> 	r6 maps to {ARM_R5, ARM_R4} and r1 maps to {ARM_R3, ARM_R2}
> 
> 	lsl     r4, r2, #24	// Sign extend r1_lo to r6_lo
> 	asr     r4, r4, #24	// ..
> 	asr     r5, r4, #31	// Sign extend r6_lo to r6_hi
> 
> Fixes: fc832653fa0d ("arm32, bpf: add support for sign-extension mov instruction")
> Reported-by: syzbot+186522670e6722692d86@xxxxxxxxxxxxxxxxxxxxxxxxx
> Closes: https://lore.kernel.org/all/000000000000e9a8d80615163f2a@xxxxxxxxxx/
> Signed-off-by: Puranjay Mohan <puranjay@xxxxxxxxxx>

This looks really great, thanks for putting in the extra effort!

Reviewed-by: Russell King (Oracle) <rmk+kernel@xxxxxxxxxxxxxxx>

Thanks!

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux