On Wed, Jun 29, 2022 at 02:42:43PM -0700, Darrick J. Wong wrote: > On Thu, Jun 30, 2022 at 07:34:37AM +1000, Dave Chinner wrote: > > On Wed, Jun 29, 2022 at 02:16:03PM -0700, Darrick J. Wong wrote: > > > On Mon, Jun 27, 2022 at 10:43:35AM +1000, Dave Chinner wrote: > > > > diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c > > > > index 82cf0189c0db..0acb31093d9f 100644 > > > > --- a/fs/xfs/xfs_trans.c > > > > +++ b/fs/xfs/xfs_trans.c > > > > @@ -844,6 +844,90 @@ xfs_trans_committed_bulk( > > > > spin_unlock(&ailp->ail_lock); > > > > } > > > > > > > > +/* > > > > + * Sort transaction items prior to running precommit operations. This will > > > > + * attempt to order the items such that they will always be locked in the same > > > > + * order. Items that have no sort function are moved to the end of the list > > > > + * and so are locked last (XXX: need to check the logic matches the comment). > > > > + * > > > > + * This may need refinement as different types of objects add sort functions. > > > > + * > > > > + * Function is more complex than it needs to be because we are comparing 64 bit > > > > + * values and the function only returns 32 bit values. > > > > + */ > > > > +static int > > > > +xfs_trans_precommit_sort( > > > > + void *unused_arg, > > > > + const struct list_head *a, > > > > + const struct list_head *b) > > > > +{ > > > > + struct xfs_log_item *lia = container_of(a, > > > > + struct xfs_log_item, li_trans); > > > > + struct xfs_log_item *lib = container_of(b, > > > > + struct xfs_log_item, li_trans); > > > > + int64_t diff; > > > > + > > > > + /* > > > > + * If both items are non-sortable, leave them alone. If only one is > > > > + * sortable, move the non-sortable item towards the end of the list. > > > > + */ > > > > + if (!lia->li_ops->iop_sort && !lib->li_ops->iop_sort) > > > > + return 0; > > > > + if (!lia->li_ops->iop_sort) > > > > + return 1; > > > > + if (!lib->li_ops->iop_sort) > > > > + return -1; > > > > + > > > > + diff = lia->li_ops->iop_sort(lia) - lib->li_ops->iop_sort(lib); > > > > > > I'm kinda surprised the iop_sort method doesn't take both log item > > > pointers, like most sorting-comparator functions? But I'll see, maybe > > > you're doing something clever wrt ordering of log items of differing > > > types, and hence the ->iop_sort implementations are required to return > > > some absolute priority or something. > > > > Nope, we have to order item locking based on an unchanging > > characteristic of the object. log items can come and go, we want to > > lock items in consistent ascending order, so it has to be based on > > some kind of physical characteristic, like inode number, block > > address, etc. > > > > e.g. If all objects are ordered by the physical location, we naturally > > get a lock order that can be applied sanely across differing object > > types e.g. AG headers will naturally sort and lock before buffers > > in the AG itself. e.g. inode cluster buffers for unlinked list > > manipulations will always get locked after the AGI.... > > <nod> So if (say) we were going to add dquots to this scheme, we'd > probably want to shift all the iop_sort functions to return (say) the > xfs_daddr_t of the associated item? No, I don't want to tie it specifically to physical address. ip->i_ino is not a daddr, but it would order just fine against one. > (Practically speaking, I don't know that I'd want to tie things down to > the disk address quite this soon, and since it's all incore code anyway > I don't think the precise type of the return values matter.) Indeed, I don't even want to tie this specifically to a 64 bit value; it's intended that objects will return a sort key, and as we add more object types we'll have to think harder about the specific key values we use. Cheers, Dave. -- Dave Chinner david@xxxxxxxxxxxxx