Hi all,

Here's a v5 RFC of the automatic item relogging experiment. Firstly,
note that this is still a POC and experimental code with various quirks.
Some are documented in the code, others might not be (such as abusing
the AIL lock, etc.). The primary purpose of this series is still to
express and review a fundamental design. Based on discussion on the last
version, there is specific focus towards addressing log reservation and
pre-item locking deadlock vectors. While the code is still quite hacky,
I believe this design addresses both of those fundamental issues.
Further details on the design and approach are documented in the
individual commit logs.

In addition, the final few patches introduce buffer relogging capability
and test infrastructure, which currently have no use case other than to
demonstrate development flexibility and the ability to support arbitrary
log items in the future, if ever desired. If this approach is taken
forward, the current use cases are still centered around intent items
such as the quotaoff use case and the extent freeing use case defined by
online repair of free space trees.

On somewhat of a tangent, another intent-oriented use case idea crossed
my mind recently related to the long-standing writeback stale data
exposure problem (i.e. if we crash after a delalloc extent is converted
but before writeback fully completes on the extent). The obvious
approach of using unwritten extents has been rebuffed due to performance
concerns over extent conversion. I wonder whether, if we had the ability
to log a "writeback pending" intent at some reasonable level of
granularity (i.e. something between a block and an extent), we could use
that to allow log recovery to zero (or convert) such extents in the
event of a crash. This is a whole separate design discussion, however,
as it involves tracking outstanding writeback, etc.
In this context it simply serves as a prospective use case for
relogging, as such intents would otherwise risk similar log subsystem
deadlocks as the quotaoff use case.

Thoughts, reviews, flames appreciated.

Brian

rfcv5:
- More fleshed out design to prevent log reservation deadlock and
  locking problems.
- Split out core patches between pre-reservation management, relog item
  state management and relog mechanism.
- Added experimental buffer relogging capability.
rfcv4: https://lore.kernel.org/linux-xfs/20191205175037.52529-1-bfoster@xxxxxxxxxx/
- AIL based approach.
rfcv3: https://lore.kernel.org/linux-xfs/20191125185523.47556-1-bfoster@xxxxxxxxxx/
- CIL based approach.
rfcv2: https://lore.kernel.org/linux-xfs/20191122181927.32870-1-bfoster@xxxxxxxxxx/
- Different approach based on workqueue and transaction rolling.
rfc: https://lore.kernel.org/linux-xfs/20191024172850.7698-1-bfoster@xxxxxxxxxx/

Brian Foster (9):
  xfs: set t_task at wait time instead of alloc time
  xfs: introduce ->tr_relog transaction
  xfs: automatic relogging reservation management
  xfs: automatic relogging item management
  xfs: automatic log item relog mechanism
  xfs: automatically relog the quotaoff start intent
  xfs: buffer relogging support prototype
  xfs: create an error tag for random relog reservation
  xfs: relog random buffers based on errortag

 fs/xfs/libxfs/xfs_errortag.h   |   4 +-
 fs/xfs/libxfs/xfs_shared.h     |   1 +
 fs/xfs/libxfs/xfs_trans_resv.c |  24 +++-
 fs/xfs/libxfs/xfs_trans_resv.h |   1 +
 fs/xfs/xfs_buf_item.c          |   5 +
 fs/xfs/xfs_dquot_item.c        |   7 ++
 fs/xfs/xfs_error.c             |   3 +
 fs/xfs/xfs_log.c               |   2 +-
 fs/xfs/xfs_qm_syscalls.c       |  12 +-
 fs/xfs/xfs_trace.h             |   3 +
 fs/xfs/xfs_trans.c             |  79 +++++++++++-
 fs/xfs/xfs_trans.h             |  13 +-
 fs/xfs/xfs_trans_ail.c         | 216 ++++++++++++++++++++++++++++++++-
 fs/xfs/xfs_trans_buf.c         |  35 ++++++
 fs/xfs/xfs_trans_priv.h        |   6 +
 15 files changed, 399 insertions(+), 12 deletions(-)

--
2.21.1