On Wed, Jun 21, 2023 at 05:25:27PM +0800, Wu Guanghao wrote: > We encountered a segfault while testing the mkfs.xfs + iscsi. > > (gdb) bt > #0 libxfs_log_sb (tp=0xaaaafaea0630) at xfs_sb.c:810 > #1 0x0000aaaaca991468 in __xfs_trans_commit (tp=<optimized out>, tp@entry=0xaaaafaea0630, regrant=regrant@entry=true) at trans.c:995 > #2 0x0000aaaaca991790 in libxfs_trans_roll (tpp=tpp@entry=0xfffffe1f3018) at trans.c:103 > #3 0x0000aaaaca9bcde8 in xfs_dialloc_roll (agibp=0xaaaafaea2fa0, tpp=0xfffffe1f31c8) at xfs_ialloc.c:1561 > #4 xfs_dialloc_try_ag (ok_alloc=true, new_ino=<synthetic pointer>, parent=0, pag=0xaaaafaea0210, tpp=0xfffffe1f31c8) at xfs_ialloc.c:1698 > #5 xfs_dialloc (tpp=tpp@entry=0xfffffe1f31c8, parent=0, mode=mode@entry=16877, new_ino=new_ino@entry=0xfffffe1f3128) at xfs_ialloc.c:1776 > #6 0x0000aaaaca9925b0 in libxfs_dir_ialloc (tpp=tpp@entry=0xfffffe1f31c8, dp=dp@entry=0x0, mode=mode@entry=16877, nlink=nlink@entry=1, rdev=rdev@entry=0, cr=cr@entry=0xfffffe1f31d0, > fsx=fsx@entry=0xfffffe1f36a4, ipp=ipp@entry=0xfffffe1f31c0) at util.c:525 > #7 0x0000aaaaca988fac in parseproto (mp=0xfffffe1f36c8, pip=0x0, fsxp=0xfffffe1f36a4, pp=0xfffffe1f3370, name=0x0) at proto.c:552 > #8 0x0000aaaaca9867a4 in main (argc=<optimized out>, argv=<optimized out>) at xfs_mkfs.c:4217 > > (gdb) p bp > $1 = 0x0 > > ``` > void > xfs_log_sb( > struct xfs_trans *tp) > { > // iscsi offline > ... > // failed to read sb, bp = NULL > struct xfs_buf *bp = xfs_trans_getsb(tp); > ... > } > ``` > > When writing data to sb, if the device is abnormal at this time, > the bp may be empty. Using it without checking will result in > a segfault. xfs_trans_getsb() is not supposed to fail. In the kernel code (which this is a copy of) it can't fail because the superblock buffer is always pinned in memory at mount time and so is *never read from the storage* after mount. Hence something similar needs to be in userspace with libxfs_getsb() so that the superblock is only read when setting up the initial mount state in libxfs.... > diff --git a/libxfs/xfs_attr_leaf.c b/libxfs/xfs_attr_leaf.c > index 6cac2531..73079df1 100644 > --- a/libxfs/xfs_attr_leaf.c > +++ b/libxfs/xfs_attr_leaf.c > @@ -668,7 +668,7 @@ xfs_sbversion_add_attr2( > spin_lock(&mp->m_sb_lock); > xfs_add_attr2(mp); > spin_unlock(&mp->m_sb_lock); > - xfs_log_sb(tp); > + ASSERT(!xfs_log_sb(tp)); FWIW, that's never a valid conversion nor a valid way to handle something that can fail. That turns the code into code that is only executed on debug builds, and it will panic the debug build rather than handle the error..... -Dave. -- Dave Chinner david@xxxxxxxxxxxxx