On Mon, Mar 27, 2017 at 01:20:30PM -0700, Darrick J. Wong wrote:
> [move to new list]
>
> On Thu, Apr 21, 2016 at 09:06:56PM +0800, Eryu Guan wrote:
> > There's a race condition in the [get|put]_invtrecord() routines, because
> > a lseek() followed by a read()/write() is not atomic, the file offset
> > might be changed before read()/write().
> >
> > xfs/302 catches this failure as:
> >   xfsdump: drive 1: INV : Unknown version 0 - Expected version 1
> >   xfsdump: inv_core.c:66: get_counters: Assertion `((invt_counter_t *)(*cntpp))->ic_vernum == (inv_version_t) 1' failed.
> >
> > And it can be reproduced by running multi-stream dump in a tight loop
> >   mount /dev/<dev> /mnt/xfs
> >   mkdir /mnt/xfs/dumpdir
> >   # populate dumpdir here
> >   while xfsdump -M l1 -M l2 -f d1 -f d2 -L ses /mnt/xfs -s dumpdir; do
> >     :
> >   done
> >
> > Fix it by replacing the "lseek(); read()/write()" sequence by
> > pread()/pwrite(), which make the seek and I/O an atomic operation.
> >
> > Also convert and remove all *_SEEKCUR routines to "SEEK_SET" variants,
> > because they depend on the maintenance of current file offset, but
> > pread()/pwrite() don't change file offset.
> >
> > Signed-off-by: Eryu Guan <eguan@xxxxxxxxxx>
> > ---
> >
> > Tested via the reproducer and xfstests "-g dump" run, with both v4 and v5 XFS.
> >
> > I'm not sure if this is the right fix, perhaps what should be fixed is the
> > "INVLOCK()", which is now implemented by flock(2), and doesn't work in
> > multi-thread env, if what it's meant to protect is concurrent accesses from
> > different threads, not processes.
> >
> > If so, it seems to me that making INVLOCK() a pthread rw lock could fix the
> > race condition as well. But the INVLOCK calls are almost everywhere, I didn't
> > find a simple way to try it.
>
> I wonder, did this ever make any progress?  Offhand it looks ok, but then
> I'm no xfsdump expert.
No, you're the first one to comment on this patch :)

>
> (Yes, our QA is bugging me about xfs/302 failures too...)

JFYI, xfs/059 and xfs/301 also fail due to this bug, just that xfs/059
failure rarely happens.

Thanks,
Eryu
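
For readers following the thread, the fix described in the quoted patch can be illustrated with a minimal sketch. This is not the xfsdump code: get_record(), put_record() and roundtrip_demo() are hypothetical names chosen for illustration, and the loop that retries short reads and writes is an assumption, not something the patch description states. The sketch only shows that pread()/pwrite() take the offset as an argument, so nothing can move the file position between the "seek" and the I/O.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not the xfsdump routine): read exactly len bytes
 * starting at offset off. pread() does not use or modify the file
 * offset, so concurrent callers sharing fd cannot disturb each other.
 */
static int get_record(int fd, void *buf, size_t len, off_t off)
{
	char *p = buf;

	assert(buf != NULL);
	while (len > 0) {
		ssize_t n = pread(fd, p, len, off);

		if (n <= 0)
			return -1;	/* I/O error or unexpected EOF */
		p += n;
		off += n;
		len -= (size_t)n;
	}
	return 0;
}

/* Hypothetical helper: write exactly len bytes at offset off. */
static int put_record(int fd, const void *buf, size_t len, off_t off)
{
	const char *p = buf;

	assert(buf != NULL);
	while (len > 0) {
		ssize_t n = pwrite(fd, p, len, off);

		if (n <= 0)
			return -1;	/* I/O error or no progress */
		p += n;
		off += n;
		len -= (size_t)n;
	}
	return 0;
}

/*
 * Write two 4-byte records to a temporary file, check that the file
 * offset is still 0 afterwards, then read the second record back.
 * Returns 0 on success, -1 on any failure.
 */
static int roundtrip_demo(void)
{
	char path[] = "/tmp/invdemoXXXXXX";
	char buf[4];
	int ret = -1;
	int fd = mkstemp(path);

	if (fd < 0)
		return -1;
	unlink(path);

	if (put_record(fd, "AAAA", 4, 0) == 0 &&
	    put_record(fd, "BBBB", 4, 4) == 0 &&
	    lseek(fd, 0, SEEK_CUR) == 0 &&
	    get_record(fd, buf, sizeof(buf), 4) == 0 &&
	    memcmp(buf, "BBBB", sizeof(buf)) == 0)
		ret = 0;

	close(fd);
	return ret;
}
```

With "lseek(); read()", a second thread sharing the same fd could call lseek() between the two calls and the read would land on the wrong record. With pread() there is no shared offset to disturb, which is also why helpers that rely on the current offset (the *_SEEKCUR style) no longer make sense after such a conversion.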