Re: [LSF/MM TOPIC] Lazy file reflink

On Fri, Feb 1, 2019 at 9:49 AM Amir Goldstein <amir73il@xxxxxxxxx> wrote:
>
> On Thu, Jan 31, 2019 at 11:13 PM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
> >
> > On Tue, Jan 29, 2019 at 08:26:43AM +1100, Dave Chinner wrote:
> > > Really, though, for this use case it's make more sense to have "per
> > > file freeze" semantics. i.e. if you want a consistent backup image
> > > on snapshot capable storage, the process is usually "freeze
> > > filesystem, snapshot fs, unfreeze fs, do backup from snapshot,
> > > remove snapshot". We can already transparently block incoming
> > > writes/modifications on files via the freeze mechanism, so why not
> > > just extend that to per-file granularity so writes to the "very
> > > large read-mostly file" block while it's being backed up....
> > >
> > > Indeed, this would probably only require a simple extension to
> > > FIFREEZE/FITHAW - the parameter is currently ignored, but as defined
> > > by XFS it was a "freeze level". Set this to 0xffffffff and then it
> > > freezes just the fd passed in, not the whole filesystem.
> > > Alternatively, FI_FREEZE_FILE/FI_THAW_FILE is simple to define...
> >
> > This sounds like you want a lease (aka oplock), which we already have
> > implemented.
>
> Yes, it's possibly true.
> I think that it could make sense to skip the reflink optimization for files that
> are open for write in our workloads. I'll need to check with my peers.
>

Getting back to this.
Since the topic got a slot in the LSF agenda, here are my talking points.

First of all, I would like to reframe the subject. "lazy clone" was one
specific use case I had, and the discussion mostly revolved around the
viability of that use case, but I have other use cases.

The core topic would perhaps be better described as "file pre-modification
callback".
We already have several mechanisms of this sort (fsnotify, leases/oplocks),
but they are inadequate for some use cases, namely when the file is
already open for write and has writable maps.

One use case I have is taking a VFS-level snapshot while there are open
files with writable maps.
Another, similar use case is a filesystem change journal, which I presented
last year: https://lwn.net/Articles/755277/

Another use case presented by Miklos is cache coherency between
guest and host in virtio-fs.

I envision something like an fsnotify pre-modification one-shot permission
event, emitted only once when inode data is dirtied after the file's
dirty data has been flushed.
Depending on the use case, it may need to be combined with a file
freeze/thaw API, or simply be emitted immediately after flushing
dirty data if the inode is dirty.

For the cache coherency use case, that would mean the client's
(i.e. guest's) cache remains valid for as long as the host inode stays
non-dirty.
I am not sure this is sufficient to meet virtio-fs requirements, but
I think it is quite similar to the way network filesystem
client-server cache coherency works, only with finer granularity
(break the oplock/lease on dirtying instead of on open).

I would like to discuss possible ways to implement this API
and hear other people's concerns and other possible use cases.

Thanks,
Amir.


