On Thu, Oct 19, 2023 at 11:06:42PM -0700, Christoph Hellwig wrote:
> On Thu, Oct 19, 2023 at 01:04:11PM -0700, Darrick J. Wong wrote:
> > Well... the stupid answer is that I augmented generic/176 to try to
> > race buffered and direct reads with cloning a million extents and
> > print out when the racing reads completed.  On an unpatched kernel,
> > the reads don't complete until the reflink does:
> 
> > So as you can see, reads from the reflink source file no longer
> > experience a giant latency spike.  I also wrote an fstest to check
> > this behavior; I'll attach it as a separate reply.
> 
> Nice.  I guess write latency doesn't really matter for this use
> case?

Nope -- they've gotten libvirt to tell qemu to redirect VM disk writes
to a new sidecar file.  Then they reflink the original source file to
the backup file, but they want qemu to be able to service reads from
that original source file while the reflink is ongoing.  When the
backup is done, they commit the sidecar contents back into the
original image.

It would be kinda neat if we had file range locks.  The thread doing
the reflink could shorten the locked range as it makes progress.  If
it could find out that another thread had blocked on part of the file
range, it could even hurry up and clone that part so that neither
reads nor writes would see enormous latency spikes.

Even better, we could actually support concurrent reads and writes to
the page cache as long as the ranges don't overlap.  But that's all
speculative until Dave dumps his old ranged lock patchset on the list.
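
To make that workflow concrete, here's a minimal userspace sketch of
the reflink step, assuming nothing beyond the FICLONE ioctl that
reflink-capable filesystems already expose; the source and backup
paths are whatever the backup job picks, nothing qemu-specific:

/*
 * Clone the contents of a source image into a backup file with the
 * FICLONE ioctl, i.e. the reflink step of the backup workflow above.
 */
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/fs.h>		/* FICLONE */

int main(int argc, char *argv[])
{
	int src_fd, dst_fd;

	if (argc != 3) {
		fprintf(stderr, "usage: %s <source> <backup>\n", argv[0]);
		return 1;
	}

	src_fd = open(argv[1], O_RDONLY);
	if (src_fd < 0) {
		perror(argv[1]);
		return 1;
	}

	dst_fd = open(argv[2], O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (dst_fd < 0) {
		perror(argv[2]);
		return 1;
	}

	/*
	 * Share all of the source file's extents with the backup file.
	 * On an unpatched kernel, this single call is where concurrent
	 * reads of the source image stack up behind the clone.
	 */
	if (ioctl(dst_fd, FICLONE, src_fd) < 0) {
		perror("FICLONE");
		return 1;
	}

	close(dst_fd);
	close(src_fd);
	return 0;
}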
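
As for the range lock idea, here's a toy pthreads model of the
shrinking-lock behavior -- to be clear, every name in it is made up
for illustration and it bears no resemblance to whatever a real kernel
implementation would look like:

/*
 * Toy userspace model of a shrinking range lock: the cloner initially
 * locks the whole file and retires each chunk as it finishes cloning
 * it, so readers of already-cloned ranges wake up early instead of
 * waiting for the entire clone to complete.  Initialize the lock with
 * PTHREAD_MUTEX_INITIALIZER / PTHREAD_COND_INITIALIZER.
 */
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

struct range_lock {
	pthread_mutex_t	lock;
	pthread_cond_t	shrunk;	/* broadcast whenever the range shrinks */
	uint64_t	start;	/* locked byte range: [start, end) */
	uint64_t	end;
	bool		held;
};

/* Reader/writer side: block until [pos, pos + len) is unlocked. */
void range_lock_wait(struct range_lock *rl, uint64_t pos, uint64_t len)
{
	pthread_mutex_lock(&rl->lock);
	while (rl->held && pos < rl->end && pos + len > rl->start)
		pthread_cond_wait(&rl->shrunk, &rl->lock);
	pthread_mutex_unlock(&rl->lock);
}

/*
 * Cloner side: clone @size bytes in @chunk-sized pieces, shortening
 * the locked range after each piece and waking any blocked waiters.
 * @clone_range stands in for the actual extent-cloning work.
 */
void clone_with_shrinking_lock(struct range_lock *rl, uint64_t size,
			       uint64_t chunk,
			       void (*clone_range)(uint64_t pos, uint64_t len))
{
	uint64_t pos;

	pthread_mutex_lock(&rl->lock);
	rl->start = 0;
	rl->end = size;
	rl->held = true;
	pthread_mutex_unlock(&rl->lock);

	for (pos = 0; pos < size; pos += chunk) {
		uint64_t len = chunk < size - pos ? chunk : size - pos;

		clone_range(pos, len);

		pthread_mutex_lock(&rl->lock);
		rl->start = pos + len;	/* everything below this is done */
		if (rl->start >= rl->end)
			rl->held = false;
		pthread_cond_broadcast(&rl->shrunk);
		pthread_mutex_unlock(&rl->lock);
	}
}

A reader would call range_lock_wait() on the byte range it cares about
before touching the file; since non-overlapping ranges never block
each other, this also models the concurrent non-overlapping read/write
case.  The "hurry up and clone the contended part" refinement would
have the cloner look at the waiter list before picking its next chunk,
which this sketch doesn't attempt.

--D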