Re: [PATCH 0/4] Fiemap, an extent mapping ioctl - round 2

jim owens <jowens@xxxxxx> · Mon, 07 Jul 2008 19:01:24 -0400

Anton Altaparmakov wrote:

> It is completely irrelevant whether the information is still valid
> after the fiemap returns.

So if that is true, any XFS utility that does more than PRINT
the extent map based on doing JUST a fiemap is subject to
erroneous results.

I agree with everyone who says that to do useful work with
the output of fiemap, you need a set of syscall functions
that have this effect:

   mandatory_exclusive_file_lock();
     [optional] fsync(); or force_allocation();
   fiemap();
     [do ugly userspace stuff]
   release_mandatory_exclusive_file_lock();

Without the locking steps, any code that acts on the
fiemap output is just guessing.  If XFS utilities do an
unlocked fiemap, it doesn't matter that they have forced
an atomic fsync; their extent map is no more valid than
in the non-atomic case.  So why bother having it allocate
and sync storage (other than to avoid adding code to
handle unknown extent types)?
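
As a rough illustration of the locked sequence above, here is a
minimal C sketch.  It assumes Linux with the FS_IOC_FIEMAP ioctl
from <linux/fs.h>, and it substitutes an advisory flock(LOCK_EX)
for the mandatory exclusive lock, so it only excludes cooperating
processes.  The names locked_fiemap() and extent_fn are
hypothetical, made up for this example.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/file.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fiemap.h>  /* struct fiemap, struct fiemap_extent */
#include <linux/fs.h>      /* FS_IOC_FIEMAP */

/* Hypothetical callback: the "ugly userspace stuff", run while the
 * lock is still held. */
typedef void (*extent_fn)(const struct fiemap_extent *ext, unsigned int n);

/* Lock, fsync, map, act, unlock.  Returns the number of extents
 * mapped (at most max_extents), or -1 with errno set. */
int locked_fiemap(const char *path, unsigned int max_extents, extent_fn work)
{
    struct fiemap *fm = NULL;
    int ret = -1;
    int saved;

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* ASSUMPTION: advisory lock standing in for a mandatory one.
     * Writers that do not take the same lock are not excluded. */
    if (flock(fd, LOCK_EX) < 0)
        goto out_close;

    /* Header plus room for max_extents extent records. */
    fm = calloc(1, sizeof(*fm) + max_extents * sizeof(struct fiemap_extent));
    if (!fm)
        goto out_unlock;

    /* [optional] flush dirty data so delayed allocations are resolved. */
    if (fsync(fd) < 0)
        goto out_unlock;

    fm->fm_start = 0;
    fm->fm_length = FIEMAP_MAX_OFFSET;   /* map the whole file */
    fm->fm_flags = 0;                    /* plain map, no sync flag */
    fm->fm_extent_count = max_extents;

    if (ioctl(fd, FS_IOC_FIEMAP, fm) < 0)
        goto out_unlock;

    /* Act on the map before the lock is dropped. */
    if (work)
        work(fm->fm_extents, fm->fm_mapped_extents);
    ret = (int)fm->fm_mapped_extents;

out_unlock:
    saved = errno;
    free(fm);
    flock(fd, LOCK_UN);
    errno = saved;
out_close:
    saved = errno;
    close(fd);
    errno = saved;
    return ret;
}
```

The callback runs before the lock is released on purpose: once
locked_fiemap() returns, the extents it saw are only a hint again.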

Dave Chinner wrote:
> On Fri, Jul 04, 2008 at 01:13:25PM +0100, Jamie Lokier wrote:
>> You can only read blocks if the mapping remains stable after returning
>> it, which means the application _must_ ensure no process is modifying
>> the file, and that it's on a filesystem which doesn't arbitrarily move
>> blocks when it feels like it.
>
> Like:
>
> # xfs_freeze -f <mntpt>
> # xfs_bmap -vvp <file>
> # <do something nasty with direct block access>
> # xfs_freeze -u <mntpt>
>
>> You've explained that it does provide a
>> guarantee: the resulting map will be valid for a consistent snapshot
>> of the file at some instant in time during the FIEMAP call.  In other
>> words, with concurrent modifiers, atomic sync+map ensures no delalloc
>> regions (is there anything else?) in the map, while fsync() + map gets
>> close but does not ensure it.
>
> Synchronisation with direct I/O, ensures unwritten extent conversion
> completion with concurrent async direct I/O before mapping, space
> preallocation, etc.

So the sequence above seems to match my locked sequence and
only needs the fsync() instead of counting on fiemap-with-sync.

However, I will point out that the FREEZE-FILESYSTEM commands
I am used to (which I assume is the semantic you mean, since it
takes <mntpt>) do not allow any metadata changes on the storage.
This is because the device snapshot code needs it stable.

So if xfs_bmap and fiemap() are expected to ignore freeze and
change metadata to do allocations, that is semantically incorrect too.
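
For comparison, the freeze-based sequence quoted above could be
driven from C roughly as follows.  This is only a sketch: it
assumes a kernel that provides the generic FIFREEZE and FITHAW
ioctls in <linux/fs.h> and a caller with CAP_SYS_ADMIN.  The names
with_frozen_fs() and frozen_fn are hypothetical.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fs.h>   /* FIFREEZE, FITHAW */

/* Hypothetical callback: the mapping and direct block access that
 * must happen while the filesystem is frozen. */
typedef int (*frozen_fn)(void *arg);

/* Freeze the filesystem mounted at mntpt, run work(arg), then thaw.
 * Returns work's result (0 if work is NULL), or -1 with errno set
 * if the open, freeze or thaw step fails. */
int with_frozen_fs(const char *mntpt, frozen_fn work, void *arg)
{
    int saved;
    int ret;

    int fd = open(mntpt, O_RDONLY);
    if (fd < 0)
        return -1;

    if (ioctl(fd, FIFREEZE, 0) < 0) {
        saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }

    /* While frozen, work() must not need new allocations or any
     * other metadata change on this filesystem, or it will block. */
    ret = work ? work(arg) : 0;
    saved = errno;

    /* Always attempt the thaw, even if work() failed. */
    if (ioctl(fd, FITHAW, 0) < 0) {
        saved = errno;
        ret = -1;
    }

    close(fd);
    errno = saved;
    return ret;
}
```

Note that anything in work() that forces allocation would wait for
the thaw, which is the semantic conflict described above.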

jim