Re: [RFC PATCH v4 4/6] Btrfs: prevent ioctls from interfering with a swap file

Omar Sandoval <osandov@xxxxxxxxxxx> · Tue, 21 Aug 2018 01:46:07 -0700

On Fri, May 25, 2018 at 06:10:43PM +0200, David Sterba wrote:
> On Fri, May 25, 2018 at 09:00:58AM -0700, Omar Sandoval wrote:
> > On Fri, May 25, 2018 at 04:50:55PM +0200, David Sterba wrote:
> > > On Thu, May 24, 2018 at 02:41:28PM -0700, Omar Sandoval wrote:
> > > > From: Omar Sandoval <osandov@xxxxxx>
> > > > 
> > > > When a swap file is active, we must make sure that the extents of the
> > > > file are not moved and that they don't become shared. That means that
> > > > the following are not safe:
> > > > 
> > > > - chattr +c (enable compression)
> > > > - reflink
> > > > - dedupe
> > > > - snapshot
> > > > - defrag
> > > > - balance
> > > > - device remove/replace/resize
> > > > 
> > > > Don't allow those to happen on an active swap file. Balance and device
> > > > remove/replace/resize in particular are disallowed entirely; in the
> > > > future, we can relax this so that relocation skips/errors out only on
> > > > chunks containing an active swap file.
> > > 
> > > Hm, disabling the entire balance is too intrusive. It's clear that the
> > > swapfile causes a lot of trouble when it goes against the dynamic
> > > capabilities of btrfs (relocation and the functionality that builds on
> > > it).
> > > 
> > > Skipping the swapfile extents should be implemented at minimum.
> > 
> > Sure thing, this should definitely be possible. For balance, we can skip
> > them; for resize or delete, it of course has to fail if it encounters
> > swap extents. I'll take a stab at it.
> 
> We can detect if there's an active swap file on the filesystem before
> shrink, delete or replace is started so the user is not surprised if it
> fails in the end, or not start the operations at all and give some hints
> what to do.

I looked at this some more, it's not pretty... Basically, we need to

- Add a counter of active swap extents to struct btrfs_block_group_cache
- At activate time, map the file extent to a block group and increment the counter
- At relocation time, check the counter, and either error out or skip as
  appropriate
- At deactivate time, decrement all of the block group counters. Easier
  said than done because the file could in theory have new extents
  allocated beyond EOF

The last point is the tricky one. The straightforward way to implement
deactivate would be to walk all of the extents of the file in the same
way we did for activate, but the extents may have changed. So, we have
to remember which extents were activated (or get that information from
the generic swap code somehow). This seems fragile.

Does anyone see a better approach? Is it worth the trouble?