Re: [PATCH RFC 2/4] xfs: defer agfl block frees when dfops is available

On Fri, Dec 08, 2017 at 09:41:26AM +1100, Dave Chinner wrote:
> On Thu, Dec 07, 2017 at 01:58:08PM -0500, Brian Foster wrote:
> > The AGFL fixup code executes before every block allocation/free and
> > rectifies the AGFL based on the current, dynamic allocation
> > requirements of the fs. The AGFL must hold a minimum number of
> > blocks to satisfy a worst case split of the free space btrees caused
> > by the impending allocation operation. The AGFL is also updated to
> > maintain the implicit requirement for a minimum number of free slots
> > to satisfy a worst case join of the free space btrees.
> > 
> > Since the AGFL caches individual blocks, AGFL reduction typically
> > involves multiple, single block frees. We've had reports of
> > transaction overrun problems during certain workloads that boil down
> > to AGFL reduction freeing multiple blocks and consuming more space
> > in the log than was reserved for the transaction.
> > 
> > Since the objective of freeing AGFL blocks is to ensure that free AGFL
> > slots are available for the upcoming allocation, one way to
> > address this problem is to release surplus blocks from the AGFL
> > immediately but defer the free of those blocks (similar to how
> > file-mapped blocks are unmapped from the file in one transaction and
> > freed via a deferred operation) until the transaction is rolled.
> > This turns AGFL reduction into an operation with predictable log
> > reservation consumption.
> > 
> > Add the capability to defer AGFL block frees when a deferred ops
> > list is handed to the AGFL fixup code. Deferring AGFL frees is a
> > conditional behavior based on whether the caller has populated the
> > new dfops field of the xfs_alloc_arg structure. A bit of
> > customization is required to handle deferred completion processing
> > because AGFL blocks are accounted against a separate reservation
> > pool and AGFL blocks are not inserted into the extent busy list when freed
> > (they are inserted when used and released back to the AGFL). Reuse
> > the majority of the existing deferred extent free infrastructure and
> > customize it appropriately to handle AGFL blocks.
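
To make the log reservation argument above concrete, here is a toy
userspace model, not XFS code. Every name and number in it is
hypothetical, and it assumes one block free per rolled transaction purely
for simplicity. It only shows why the peak log usage of a single
transaction stops depending on the number of surplus AGFL blocks once the
frees are deferred.

```c
#include <assert.h>

/* Hypothetical numbers, chosen only for illustration. */
#define TOY_LOG_RES_UNITS	4	/* log space reserved per transaction */
#define TOY_FREE_COST_UNITS	2	/* log space consumed by one block free */

/*
 * Immediate model: every surplus AGFL block is freed in the same
 * transaction, so log usage grows with the number of blocks.
 */
int peak_log_use_immediate(int surplus_blocks)
{
	assert(surplus_blocks >= 0);
	return surplus_blocks * TOY_FREE_COST_UNITS;
}

/*
 * Deferred model: each free runs in its own rolled transaction (an
 * assumption of this sketch), and each roll starts from a fresh
 * reservation. Returns the largest usage seen in any one transaction.
 */
int peak_log_use_deferred(int surplus_blocks)
{
	int peak = 0;

	assert(surplus_blocks >= 0);
	for (int i = 0; i < surplus_blocks; i++) {
		int used = TOY_FREE_COST_UNITS;	/* fresh transaction */

		if (used > peak)
			peak = used;
	}
	return peak;
}
```

With three surplus blocks the immediate model consumes 6 units against a
reservation of 4, while the deferred model never exceeds 2 units in any
single transaction.
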
> 
> Ok, so it uses the EFI/EFD to make sure that the block freeing is
> logged and replayed. So my question is:
> 
> > +/*
> > + * AGFL blocks are accounted differently in the reserve pools and are not
> > + * inserted into the busy extent list.
> > + */
> > +STATIC int
> > +xfs_agfl_free_finish_item(
> > +	struct xfs_trans		*tp,
> > +	struct xfs_defer_ops		*dop,
> > +	struct list_head		*item,
> > +	void				*done_item,
> > +	void				**state)
> > +{
> 
> How does this function get called by log recovery when processing
> the EFI as there is no flag in the EFI that says this was an AGFL
> block?
> 

It doesn't...

> That said, I haven't traced through whether this matters or not,
> but I suspect it does because freelist frees use XFS_AG_RESV_AGFL
> and that avoids accounting the free to the superblock counters
> because the block is already accounted as free space....
> 

I don't think it does matter. I actually tested log recovery precisely
for this question, to see whether the traditional EFI recovery path
would disrupt accounting or anything and I didn't reproduce any problems
(well, except for that rmap record cleanup failure thing).

However, I do still need to trace through and understand why that is, to
know for sure that there aren't any problems lurking here (and if not, I
should probably document it), but I suspect the reason is that the
differences between how AGFL and regular blocks are handled here only
affect in-core state of the AG reservation pools. These are all
reinitialized from zero on a subsequent mount based on the on-disk state
(... but good point, and I will try to confirm that before posting a
non-RFC variant).
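
As a toy illustration of that suspicion (userspace C, not XFS code; the
struct, the functions and the mount-time policy are all made up), assume
the two free paths update the on-disk state identically and differ only
in an in-memory pool counter. If mount rebuilds that counter purely from
on-disk state, the path taken before the crash can't be observed
afterwards. Whether the real code matches this assumption is exactly what
still needs to be traced.

```c
#include <assert.h>

/* Hypothetical stand-in for per-AG state; not a real XFS structure. */
struct toy_ag {
	unsigned int ondisk_free;	/* persisted, same on both paths */
	unsigned int incore_pool;	/* in-memory only, path dependent */
};

/* "Regular" free path: also adjusts the in-core pool counter. */
void toy_free_regular(struct toy_ag *ag)
{
	ag->ondisk_free++;
	ag->incore_pool++;
}

/* "AGFL" free path: leaves the in-core pool counter untouched. */
void toy_free_agfl(struct toy_ag *ag)
{
	ag->ondisk_free++;
}

/*
 * Mount: discard whatever was in memory and rebuild the pool from the
 * on-disk state alone. The divide-by-four policy is arbitrary.
 */
void toy_mount(struct toy_ag *ag)
{
	ag->incore_pool = 0;
	ag->incore_pool = ag->ondisk_free / 4;
}
```

Freeing one block through each path on two otherwise identical toy AGs
leaves their in-core counters different until toy_mount() runs, after
which they match again.
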

Brian

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@xxxxxxxxxxxxx
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html