Re: [PATCH v7 00/19] xfs: Delayed Ready Attrs

On Sun, Feb 23, 2020 at 09:55:48AM +0200, Amir Goldstein wrote:
> On Sun, Feb 23, 2020 at 4:06 AM Allison Collins
> <allison.henderson@xxxxxxxxxx> wrote:
> >
> > Hi all,
> >
> > This set is a subset of a larger series for delayed attributes, which is
> > itself a subset of an even larger series, parent pointers. Delayed attributes
> > allow attribute operations (set and remove) to be logged and committed
> > in the same way that other delayed operations do. This allows more
> > complex operations (like parent pointers) to be broken up into multiple
> > smaller transactions. To do this, the existing attr operations must be
> > modified to operate as either a delayed operation or an inline operation
> > since older filesystems will not be able to use the new log entries.
> 
> High level question, before I dive into the series:
> 
> Which other "delayed operations" already exist?

See Chandan's answer :P

> I think delayed operations were added by Darrick to handle the growth of
> transaction size due to reflink. Right? So I assume the existing delayed
> operations deal with block accounting.

No, they are intended to allow atomic, recoverable multi-transaction
operations. They grew out of this:

https://xfs.org/index.php/Improving_Metadata_Performance_By_Reducing_Journal_Overhead#Atomic_Multi-Transaction_Operations

which was essentially a generalisation of the EFI/EFD intent
logging that has existed in XFS for 20 years.

Essentially, it is a mechanism for chaining intent operations to
ensure that recovery will restart the operation at the point the
system failed so that once the operation is started (i.e. first
intent is logged to the journal) the entire operation is always
completed regardless of whether the system crashes or not.
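
As a rough illustration only, the chaining can be pictured with the
toy model below. It is a Python sketch, not XFS code: the Journal
class, the function names and the "intent"/"done" record layout are
all made up for this example, and it assumes each step commits
atomically together with its log records.

```python
# Toy model of intent chaining. Not XFS code; all names are hypothetical.

class Journal:
    """Append-only list of ("intent" | "done", op_id, step) records."""

    def __init__(self):
        self.records = []

    def log(self, kind, op_id, step):
        self.records.append((kind, op_id, step))


def pending_step(journal, op_id):
    """Return the step that has an intent but no matching done, else None."""
    intents = {s for k, o, s in journal.records if k == "intent" and o == op_id}
    done = {s for k, o, s in journal.records if k == "done" and o == op_id}
    pending = intents - done
    return min(pending) if pending else None


def run_from(journal, op_id, steps, state, first, crash_before=None):
    """Run steps[first:] one 'transaction' at a time.

    Each transaction applies one step, retires that step's intent and logs
    the intent for the next step. The toy treats those three actions as
    atomic, so a simulated crash can only land between transactions.
    """
    for i in range(first, len(steps)):
        if i == crash_before:
            return False                      # simulated crash
        steps[i](state)
        journal.log("done", op_id, i)
        if i + 1 < len(steps):
            journal.log("intent", op_id, i + 1)
    return True


def start(journal, op_id, steps, state, crash_before=None):
    """Log the first intent, then run the chain."""
    journal.log("intent", op_id, 0)
    return run_from(journal, op_id, steps, state, 0, crash_before)


def recover(journal, op_id, steps, state):
    """Restart an interrupted operation at the step named by the open intent."""
    step = pending_step(journal, op_id)
    if step is None:
        return True
    return run_from(journal, op_id, steps, state, step)


# Example: a two-step operation that crashes before its second step.
steps = [lambda s: s.append("directory entry"),
         lambda s: s.append("parent pointer attr")]
journal, state = Journal(), []
start(journal, "link-1", steps, state, crash_before=1)
recover(journal, "link-1", steps, state)
print(state)
```

In the example, the crash leaves an open intent for step 1 in the
journal, so recover() resumes there and the state ends up holding both
changes. An operation whose first intent never reached the journal
would simply not have happened.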

> When speaking of parent pointers, without having looked into the details yet,
> it seems the delayed operations we would want to log are operations that deal
> with namespace changes, i.e.: link,unlink,rename.
> The information needed to be logged for these ops is minimal.

Not really. The parent pointers are held in attributes, so parent
pointers are effectively adding an attribute creation to every inode
allocation and an attribute modification to every directory
modification. And, well, when an inode has 100 million hard links,
it's going to have 100 million parent pointer attributes. Modifying
a link is then a major operation, and Chandan has done a great job
in analysing the attr btree to see if there are scalability issues
that will be exposed by this sort of attribute usage....

> Why do we need a general infrastructure for delayed attr operations?

These have to be done atomically with the create/unlink/rename/etc,
and including attribute modification in those transaction
reservations blows the size of them out massively (especially
rename!). By converting these operations to use deferred operations
to add the parent pointer to the inode, we no longer need to
increase the log reservation for the operations (because the attr
reservation is usually smaller than the directory reservation), and
it is guaranteed to be atomic with the directory modification. i.e.
parent pointers never get out of sync, even when the system crashes.
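
The reservation argument can be put as simple arithmetic. The sketch
below is a simplification in Python with invented byte counts (they
are not real XFS reservation sizes, and the function names are
hypothetical). It assumes a single combined transaction has to
reserve for every step at once, whereas a chain of transactions only
needs enough for its largest step, and it ignores any overhead for
the intent records themselves.

```python
# Hypothetical reservation sizes in bytes. These are not real XFS numbers.
DIR_RES = 300_000    # assumed reservation for the directory modification
ATTR_RES = 200_000   # assumed reservation for the attribute modification


def combined_reservation(step_sizes):
    """One transaction covering every step must reserve for all of them."""
    return sum(step_sizes)


def chained_reservation(step_sizes):
    """One transaction per step: the largest step sets the requirement."""
    return max(step_sizes)


sizes = [DIR_RES, ATTR_RES]
print(combined_reservation(sizes), chained_reservation(sizes))
```

With these made-up numbers the combined form needs 500000 bytes while
the chained form needs 300000, which is just the directory
reservation on its own.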

Hence having attributes modified as a series of individual
operations chained together into an atomic whole via intents is a
pre-requisite for updating attributes atomically within directory
modification operations.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx


