On Fri, Oct 11, 2019 at 04:20:50PM -0400, Josef Bacik wrote: > I spent the last few weeks running down a weird regression in btrfs we > were seeing in production. It turned out to be introduced by > 62b37622718c, which took the following > > loff_t isize = i_size_read(inode); > > actual_end = min_t(u64, isize, end + 1); > > and turned it into > > actual_end = min_t(u64, i_size_read(inode), end + 1); > > The problem here is that the compiler is optimizing out the temporary > variables used in __cmp_once, so the resulting assembly looks like this > > 498 actual_end = min_t(u64, i_size_read(inode), end + 1); > 0xffffffff814b08c1 <+145>: 48 8b 44 24 28 mov 0x28(%rsp),%rax > 0xffffffff814b08c6 <+150>: 48 39 45 50 cmp %rax,0x50(%rbp) > 0xffffffff814b08ca <+154>: 48 89 c6 mov %rax,%rsi > 0xffffffff814b08cd <+157>: 48 0f 46 75 50 cmovbe 0x50(%rbp),%rsi > > as you can see we read the value of the inode to compare, and then we > read it a second time to assign it. > > This code is simply an optimization, so there's no locking to keep > i_size from changing, however we really need min_t to actually return > the minimum value for these two values, which it is failing to do. > > We've reverted that patch for now to fix the problem, but it's only a > matter of time before the compiler becomes smart enough to optimize out > the loff_t isize intermediate variable as well. > > Instead we want to make it explicit that i_size_read() should only read > the value once. This will keep this class of problem from happening in > the future, regardless of what the compiler chooses to do. With this > change we get the following assembly generated for this code > > 491 actual_end = min_t(u64, i_size_read(inode), end + 1); > 0xffffffff8148f625 <+149>: 48 8b 44 24 20 mov 0x20(%rsp),%rax > > ./include/linux/compiler.h: > 199 __READ_ONCE_SIZE; > 0xffffffff8148f62a <+154>: 4c 8b 75 50 mov 0x50(%rbp),%r14 > > fs/btrfs/inode.c: > 491 actual_end = min_t(u64, i_size_read(inode), end + 1); > 0xffffffff8148f62e <+158>: 49 39 c6 cmp %rax,%r14 > 0xffffffff8148f631 <+161>: 4c 0f 47 f0 cmova %rax,%r14 > > and this works out properly, we only read the value once and so we won't > trip over this problem again. > > Signed-off-by: Josef Bacik <josef@xxxxxxxxxxxxxx> Al, Will you pick this up, or do you want me to send it along? Thanks, Josef