[RFC PATCH v2 0/3] xfs: more unlinked inode list optimization v2

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi forks,

Sorry for long delay these days due to my work ineffective, this is RFC
v2 of this patchset addressing previous constructive comments...

This RFC patchset mainly addresses the thoughts [*] and [**] from Dave's
original patchset,
https://lore.kernel.org/r/20200623095015.1934171-1-david@xxxxxxxxxxxxx

In short, it focues on the following ideas mentioned by Dave:
 - use bucket 0 instead of multiple buckets since in-memory double
   linked list finally works;

 - avoid taking AGI buffer and unnecessary AGI update if possible, so
   1) add a new lock and keep proper locking order to avoid deadlock;
   2) insert a new unlinked inode from the tail instead of head;

In addition, it's worth noticing 3 things:
 - xfs_iunlink_remove() should support old multiple buckets in order
   to keep old inode unlinked list (old image) working when recovering.

 - (but) OTOH, the old kernel recovery _shouldn't_ work with new image
   since the bucket_index from old xfs_iunlink_remove() is generated
   by the old formula (rather than keep in xfs_inode), which is now
   fixed as 0. So this feature is not forward compatible without some
   extra backport patches;

 - a tail xfs_inode pointer is also added in the perag, which keeps 
   track of the tail of bucket 0 since it's mainly used for xfs_iunlink().

 - the old kernel recovery shouldn't work with new image since
     its bucket_index from old kernel is 

The git tree is also available at
git://git.kernel.org/pub/scm/linux/kernel/git/xiang/linux.git tags/xfs/iunlink_opt_v2

Gitweb:
https://git.kernel.org/pub/scm/linux/kernel/git/xiang/linux.git/log/?h=xfs/iunlink_opt_v2

Some limited test for debugging only is done (mainly fsstress and manual
power-cut tests) and it's worth noticing that when I tested fsstress for
much longer time, it showed the following kernel message twice and it
haven't been reproduced again for about 3 days, and I have no idea what's
wrong over the current logic, so it could be better to send out the patchset
first and go on reproducing with extra debugging prints added...

[ 7884.930106] XFS (sda): bad inode magic/vsn daddr 111232 #0 (magic=5050)
[ 7884.930988] XFS (sda): Metadata corruption detected at xfs_buf_ioend+0x11a/0x190, xfs_inode block 0x1b280 xfs_inode_buf_verify
[ 7884.932418] XFS (sda): Unmount and run xfs_repair
[ 7884.933154] XFS (sda): First 128 bytes of corrupted metadata buffer:
[ 7884.934253] 00000000: 50 50 50 50 50 50 50 50 50 50 50 50 50 50 50 50  PPPPPPPPPPPPPPPP
[ 7884.935495] 00000010: 50 50 50 50 50 50 50 50 50 50 50 50 50 50 50 50  PPPPPPPPPPPPPPPP
[ 7884.936752] 00000020: 50 50 50 50 50 50 50 50 50 50 50 50 50 50 50 50  PPPPPPPPPPPPPPPP
[ 7884.937761] 00000030: 50 50 50 50 50 50 50 50 50 50 50 50 50 50 50 50  PPPPPPPPPPPPPPPP
[ 7884.939336] 00000040: 50 50 50 50 50 50 50 50 50 50 50 50 50 50 50 50  PPPPPPPPPPPPPPPP
[ 7884.940914] 00000050: 50 50 50 50 50 50 50 50 50 50 50 50 50 50 50 50  PPPPPPPPPPPPPPPP
[ 7884.942382] 00000060: 50 50 50 50 50 50 50 50 50 50 50 50 50 50 50 50  PPPPPPPPPPPPPPPP
[ 7884.943513] 00000070: 50 50 50 50 50 50 50 50 50 50 50 50 50 50 50 50  PPPPPPPPPPPPPPPP
[ 7884.945006] XFS (sda): metadata I/O error in "xfs_imap_to_bp+0x61/0xd0" at daddr 0x1b280 len 32 error 117
[ 7884.952066] XFS (sda): xfs_do_force_shutdown(0x1) called from line 312 of file fs/xfs/xfs_trans_buf.c. Return address = ffffffff983ea7e0
[ 7884.952494] sda: writeback error on inode 20695, offset 770048, sector 1970864
[ 7884.954353] XFS (sda): I/O Error Detected. Shutting down filesystem
[ 7884.956566] sda: writeback error on inode 117946, offset 2347008, sector 612424
[ 7884.956802] XFS (sda): Please unmount the filesystem and rectify the problem(s)
[ 7884.958943] XFS (sda): xfs_inactive_ifree: xfs_trans_commit returned error -117
fsstress: check_cwd failure
[ 7884.963475] XFS (sda): xfs_imap_lookup: xfs_ialloc_read_agi() returned error -5, agno 0

Comments and directions are welcomed. :)

Thanks,
Gao Xiang


Changes since v1:
 - show some number of reduced AGI buffer modification (Darrick);
 - add comments about pag_unlinked_mutex, pag_unlinked_tail (Darrick);
 - avoid touching xfs_iunlink_update_bucket() logic (Dave);
 - spilt "xfs: don't access AGI on unlinked inodes if it can" into 2 patches (Dave);
 - add xfs_iunlink_insert_lock/xfs_iunlink_remove_lock/xfs_iunlink_unlock to
   make the code easy to follow (Dave).

Gao Xiang (3):
  xfs: arrange all unlinked inodes into one list
  xfs: introduce perag iunlink lock
  xfs: insert unlinked inodes from tail

 fs/xfs/xfs_inode.c       | 202 ++++++++++++++++++++++++++++-----------
 fs/xfs/xfs_log_recover.c |   5 +
 fs/xfs/xfs_mount.c       |   2 +
 fs/xfs/xfs_mount.h       |   6 ++
 4 files changed, 159 insertions(+), 56 deletions(-)

-- 
2.18.1




[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux