On Fri, Nov 10, 2023 at 04:48:02PM +0900, Ryusuke Konishi wrote: > On Fri, Nov 10, 2023 at 6:06 AM Matthew Wilcox (Oracle) wrote: > > +++ b/fs/buffer.c > > @@ -199,7 +199,7 @@ __find_get_block_slow(struct block_device *bdev, sector_t block) > > int all_mapped = 1; > > static DEFINE_RATELIMIT_STATE(last_warned, HZ, 1); > > > > - index = block >> (PAGE_SHIFT - bd_inode->i_blkbits); > > + index = ((loff_t)block << bd_inode->i_blkbits) / PAGE_SIZE; > > Multiple 64-bit divisions are used in this patch, but why not use two > stage shifts as shown below if there is no left shift overflow and the > sign bit will not be set ? > > index = ((loff_t)block << bd_inode->i_blkbits) >> PAGE_SHIFT; Here's what the compiler turns that into: 3223: 49 8b 86 80 00 00 00 mov 0x80(%r14),%rax 322a: 4c 89 ee mov %r13,%rsi 322d: ba 01 00 00 00 mov $0x1,%edx 3232: 0f b6 88 c2 00 00 00 movzbl 0xc2(%rax),%ecx ^ this is where we load i_blkbits into ecx 3239: 48 89 45 d0 mov %rax,-0x30(%rbp) 323d: 4c 8b 60 30 mov 0x30(%rax),%r12 3241: 48 d3 e6 shl %cl,%rsi ^ shift left by %cl (the bottom byte of ecx) 3244: 31 c9 xor %ecx,%ecx 3246: 48 c1 ee 0c shr $0xc,%rsi ^ shift right by 12 324a: 4c 89 e7 mov %r12,%rdi 324d: e8 00 00 00 00 call 3252 <__find_get_block+0x82> 324e: R_X86_64_PLT32 __filemap_get_folio-0x4