Re: Extreme fragmentation ho!

On Wed, Dec 30, 2020 at 05:28:36PM +1100, Chris Dunlop wrote:
> On Tue, Dec 29, 2020 at 09:06:22AM +1100, Dave Chinner wrote:
> > On Tue, Dec 22, 2020 at 08:54:53AM +1100, Chris Dunlop wrote:
> > > The file is sitting on XFS on LV on a raid6 comprising 6 x 5400 RPM HDD:
> > 
> > ... probably not that unreasonable for pretty much the slowest
> > storage configuration you can possibly come up with for small,
> > metadata write intensive workloads.
> 
> [ Chris grimaces and glances over at the 8+3 erasure-encoded ceph rbd
> sitting like a pitch drop experiment in the corner. ]

I would have thought that should be able to do more than the 20 IOPS
the raid6 above will do on random 4kB writes.... :)

> Speaking of slow storage and metadata write intensive workloads, what's the
> reason reflinks with a realtime device isn't supported? That was one
> approach I wanted to try, to get the metadata ops running on a small fast
> storage with the bulk data sitting on big slow bulk storage. But:
> 
> # mkfs.xfs -m reflink=1 -d rtinherit=1 -r rtdev=/dev/fast /dev/slow
> reflink not supported with realtime devices

Yup, the realtime device is a pure data device, so all its metadata
is held externally to the device (i.e. it is held in the "data
device", not the RT device). IOWs, it's a completely separate
filesystem implementation within XFS, and so requires independent
functional extensions to support reflink + rmap...

> My naive thought was a reflink was probably "just" a block range referenced
> from multiple places, and probably a refcount somewhere. It seems like it
> should be possible to have the range, references and refcount sitting on the
> fast storage pointing to the actual data blocks on the slow storage.

Yes, it is possible, but the current reflink implementation is based
on allocation group internal structures (rmap is the same), and the
realtime device doesn't have these. Hence there are new metadata
structures that need to be added (refcount btrees rooted in inodes,
not fixed location AG headers) and a bunch of new supporting code to
be written. Darrick has largely done this already; it's just a
problem of review bandwidth and validation time:

https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=realtime-reflink-extsize

(which also includes realtime rmap support, a whole new internal
metadata inode directory to index all the new inode btrees for the
rt device, etc)

It's a pretty large chunk of shiny new code....
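
To make the "block range referenced from multiple places, plus a
refcount" idea concrete, here is a toy sketch in Python. It is an
illustration only, not XFS code: the class and method names
(ToyReflinkStore, write_new, reflink, cow_write, remove) are made up,
and it counts references per block in a flat table, whereas the real
implementation tracks extents in refcount btrees. The point is just
that the file maps and the refcounts are small metadata that could
live apart from the data blocks they describe.

```python
from collections import Counter


class ToyReflinkStore:
    """Toy model only: per-block refcounts, not XFS's extent-based btrees."""

    def __init__(self):
        self.refcount = Counter()  # data block number -> number of references
        self.files = {}            # file name -> list of data block numbers
        self.next_free = 0         # trivial bump allocator for data blocks

    def write_new(self, name, nblocks):
        """Allocate fresh blocks for a new file; each starts with one reference."""
        blocks = list(range(self.next_free, self.next_free + nblocks))
        self.next_free += nblocks
        for b in blocks:
            self.refcount[b] += 1
        self.files[name] = blocks

    def reflink(self, src, dst):
        """Share src's blocks with a new file dst: metadata-only, no data copied."""
        blocks = list(self.files[src])
        for b in blocks:
            self.refcount[b] += 1
        self.files[dst] = blocks

    def cow_write(self, name, index):
        """Return the block to write to, breaking sharing first if needed."""
        old = self.files[name][index]
        if self.refcount[old] == 1:
            return old  # sole owner: safe to overwrite in place
        new = self.next_free
        self.next_free += 1
        self.refcount[old] -= 1
        self.refcount[new] += 1
        self.files[name][index] = new
        return new

    def remove(self, name):
        """Drop a file; return the blocks whose last reference went away."""
        freed = []
        for b in self.files.pop(name):
            self.refcount[b] -= 1
            if self.refcount[b] == 0:
                del self.refcount[b]
                freed.append(b)
        return freed
```

In this model a reflink touches only the two tables, and a data block
is only freed once nothing references it any more.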

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
