Re: [PATCH 00/39 v5] xfs: CIL and log optimisations

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Jun 03, 2021 at 03:22:01PM +1000, Dave Chinner wrote:
> Hi folks,
> 
> This is an update of the consolidated log scalability patchset I've been working
> on. Version 4 was posted here:
> 
> https://lore.kernel.org/linux-xfs/20210519121317.585244-1-david@xxxxxxxxxxxxx/
> 
> This version contains the changes Darrick requested during review. The only
> patch remaining without at least one RVB tag is patch 30.
> 
> Performance improvements are largely documented in the change logs of the
> individual patches. Headline numbers are an increase in transaction rate from
> 700k commits/s to 1.7M commits/s, and a reduction in fua/flush operations by
> 2-3 orders of magnitude on metadata heavy workloads that don't use fsync.
> 
> Summary of series:
> 
> Patches		Modifications
> -------		-------------
> 1-7:		log write FUA/FLUSH optimisations
> 8:		bug fix
> 9-11:		Async CIL pushes
> 12-25:		xlog_write() rework
> 26-39:		CIL commit scalability

>From this latest posting, I see that the first nine patches all have
multiple reviews.  Some of patches 10-19 have review tags split between
Brian and Christoph, but neither have added them all the way through.
I think I'm the only one who has supplied RVB tags for all forty.

So my question is: at what point would you like me to pull the segments
of this patchset into upstream?  "The maintainer reviewed everything" is
of course our usual standard, but this touches a /lot/ of core logging
code, and logging isn't one of my stronger areas of familiarity.

I think 1-8 look fine for 5.14.  Do you want me to wait for Brian and/or
Christoph (or really, any third pair of eyes) to finish working their
way through 9-11 and 12-25 before merging them?

--D

> 
> Change log is below.
> 
> Cheers,
> 
> Dave.
> 
> Version 5:
> - fix typo in comment (patch 4)
> - fix typo in commit msg (patch 7)
> - removed unnecessary update of num_iovecs (patch 20)
> - fixed whitespace (patch 29)
> - removed unused space_used/curr_res from CIL pcp structure (patch 29)
> - added space_used to CIL pcp structure (patch 30)
> - add comment to xlog_cil_pcp_aggregate() (patch 30)
> - added space_reserved to CIL pcp structure (patch 31)
> - removed unnecessary commentary about CIL ordering issues from commit msg in
>   patch 33 as patch 35 addresses these issues..
> - fix whitespace in xlog_cil_free_logvec() (patch 35)
> - fix weird sentence in docco (patch 39)
> - propagate changes through patches to fix conflicts due to dependent changes.
>   (various patches)
> 
> 
> Version 4:
> - rebase on 5.13-rc2+
> - fixed completion logic for async cache flush
> - trimmed superflous comments about not requiring REQ_PREFLUSH for iclog IO
>   anymore.
> - ensure that setting/clearing XLOG_ICL_NEED_FLUSH is atomic (i.e. only modified
>   while holding the icloglock)
> - ensure callers only add iclog flush/fua flags appropriately before releasing
>   the iclog so that multiple independent writes to an iclog doesn't clear flags
>   other writes into the iclog depend on.
> - buffer log item dirty tracking patches merged so removed from series
> - replaced XFS_LSN_CMP() checks with direct lsn1 == lsn2 comparisons to simplify
>   the code
> - changed "push_async" to "push_commit_stable" to indicate that the push caller
>   wants the entire CIL checkpoint and commit record to be on stable storage when
>   it completes.
> - updated comment to indicate that iclog sync state is set according to the
>   caller's desire for a stable checkpoint to be performed.
> - Added comment explaining why the CIL workqueue is limited to 4 concurrent
>   works per filesystem.
> - debug overhead reduction patches merged so removed from series
> - cleaned up a couple of typedef uses.
> - updated pahole output for checkpoint header in commit message
> - Added BUILD_BUG_ON() to check the size of unmount records.
> - got rid of XFS_VOLUME define.
> - got rid of XLOG_TIC_LEN_MAX define.
> - cleaned up extra blank lines.
> - fixed double initialisation of lv in xlog_write_single().
> - removed the unnecessary change for reserved iclog space in
>   xlog_state_get_iclog_space().
> - no need to check for XLOG_CONTINUE_TRANS in xlog_write_partial() as it will
>   always be set.
> - added a patch for removing the optype parameter from xlog_write()
> - removed unused nvecs from struct xfs_cil_ctx
> - fixed whitespace damage in xlog_cil_pcp_dead()
> - added missing cpu dead accounting transfer in xlog_cil_pcp_dead().
> - factored out CIL push percpu structure aggregation into
>   xlog_cil_pcp_aggregate()
> - added the ctx->ticket->t_unit_res update back into the code even though it is
>   largely unnecessary.
> - cleaned up the pcp, cilpcp, pcptr mess in xlog_cil_pcp_alloc() and elsewhere
>   to use variable names consistently.
> - simplified the CIL sort comparison functions to a single comparison operation
> - fixed percpu CIL item list sort order where items in the same transaction
>   (order id) were reversed, leading to intents being replayed in the wrong
>   order.
> - split out log vector chain conversion to list_head into separate patch
> - Updated documentation with all the fixes and suggestions made.
> 
> 
> Version 3:
> - rebase onto 5.12-rc1+
> - aggregate many small dependent patchsets in one large one.
> - simplify xlog_wait_on_iclog_lsn() back to just a call to xlog_wait_on_iclog()
> - remove xfs_blkdev_issue_flush() instead of moving and renaming it.
> - pass bio to xfs_flush_bdev_async() so it doesn't need allocation.
> - skip cache flush in xfs_flush_bdev_async() if the underlying queue does not
>   require it.
> - fixed whitespace in xfs_flush_bdev_async()
> - remove the implicit external log's data device cache flush code and replace it
>   with an explicit flush in the unmount record write so that it works the same
>   as the new CIL checkpoint cache pre-flush mechanism. This mechanism now
>   guarantees metadata vs journal ordering for both internal and external logs.
> - updated various commit messages
> - fixed incorrect/unintended changes to xfs_log_force() behaviour
> - typedef uint64_t xfs_csn_t; and conversion.
> - removed stray trace_printk()s that were used for debugging.
> - fixed minor formatting details.
> - uninlined xlog_prepare_iovec()
> - fixed up "lv chain vector and size calculation" commit message to reflect we
>   are only calculating and passin gin the vector byte count.
> - reworked the loop in xlog_write_single() based on Christoph's suggestion. Much
>   cleaner!
> - added patch to pass log ticket down to xlog_sync() so that it accounts the
>   roundoff to the log ticket rather than directly modifying grant heads. Grant
>   heads are hot, so every little bit helps.
> - added patch to update delayed logging design doc with background material on
>   how transactions and log space accounting works in XFS.
> 
> Version 2:
> - fix ticket reservation roundoff to include 2 roundoffs
> - removed stale copied comment from roundoff initialisation.
> - clarified "separation" to mean "separation for ordering purposes" in commit
>   message.
> - added comment that newly activated, clean, empty iclogs have a LSN of 0 so are
>   captured by the "iclog lsn < start_lsn" case that avoids needing to wait
>   before releasing the commit iclog to be written.
> - added async cache flush infrastructure
> - convert CIL checkpoint push work it issue an unconditional metadata device
>   cache flush rather than asking the first iclog write to issue it via
>   REQ_PREFLUSH.
> - cleaned up xlog_write() to remove a redundant parameter and prepare the logic
>   for setting flags on the iclog based on the type of operational data is being
>   written to the log.
> - added XLOG_ICL_NEED_FUA flag to complement the NEED_FLUSH flag, allowing
>   callers to issue explicit flushes and clear the NEED_FLUSH flag before the
>   iclog is written without dropping the REQ_FUA requirement in /dev/null...
> - added CIL commit-in-start-iclog optimisation that clears the NEED_FLUSH flag
>   to avoid an unnecessary cache flush when issuing the iclog.
> - fixed typo in CIL throttle bugfix comment.
> - fixed trailing whitespace in commit message.
> 
> 



[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux