Re: correct use of vmtruncate()?

Zach Brown <zach.brown@xxxxxxxxxx> · Tue, 29 Apr 2008 10:10:59 -0700

> The obvious fix for this is that block_write_begin() and
> friends should be calling ->setattr to do the truncation and hence
> follow normal convention for truncating blocks off an inode.
> However, even that appears to have thorns. e.g. in XFS we hold the
> iolock exclusively when we call block_write_begin(), but it is not
> held in all cases where ->setattr is currently called. Hence calling
> ->setattr from block_write_begin in this failure case will deadlock
> unless we also pass a "nolock" flag as well. XFS already
> supports this (e.g. see the XFS fallocate implementation) but no other
> filesystem does (some probably don't need to).

This paragraph in particular reminds me of an outstanding bug with
O_DIRECT and ext*.  It isn't truncating partial allocations when a dio
fails with ENOSPC.  This was noticed by a user who unmounted and checked
the volume after the failed write: fsck found blocks beyond i_size in the
file that had hit ENOSPC.

So, whether we decide that failed writes should call setattr or
vmtruncate, we should also keep the generic O_DIRECT path in
consideration.  Today it doesn't even try the supposed generic method of
calling vmtruncate().
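
To make the failure mode concrete, here is a toy userspace model of it.
This is not kernel code: struct toy_inode, struct toy_fs, and every
function below are hypothetical names invented for illustration, and the
block-at-a-time allocation is an assumption, not how ext* or the generic
dio path actually allocates.  It only shows how an extending write that
hits ENOSPC partway through can leave blocks allocated past i_size, and
how trimming back to i_size on the error path avoids that.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_BLOCK_SIZE 4096L

/* Toy stand-in for an inode; NOT the kernel's struct inode. */
struct toy_inode {
	long i_size;     /* file size in bytes */
	long nr_blocks;  /* blocks currently allocated to the file */
};

/* Toy free-space pool for the whole "filesystem". */
struct toy_fs {
	long free_blocks;
};

/* Number of blocks needed to hold 'bytes' bytes. */
static long blocks_for(long bytes)
{
	return (bytes + TOY_BLOCK_SIZE - 1) / TOY_BLOCK_SIZE;
}

/* Free every block past i_size, as a truncate back to i_size would. */
static void toy_trim_to_isize(struct toy_fs *fs, struct toy_inode *ino)
{
	long keep;

	assert(fs != NULL && ino != NULL);
	keep = blocks_for(ino->i_size);
	if (ino->nr_blocks > keep) {
		fs->free_blocks += ino->nr_blocks - keep;
		ino->nr_blocks = keep;
	}
}

/*
 * Extending write: allocate one block at a time until the file can hold
 * new_size bytes.  i_size is only updated once every block is in place.
 * If the pool runs dry we return -ENOSPC; with trim_on_error == 0 the
 * partial allocation is left behind (the reported bug), otherwise it is
 * trimmed back to i_size before returning.
 */
static int toy_extending_write(struct toy_fs *fs, struct toy_inode *ino,
			       long new_size, int trim_on_error)
{
	long want = blocks_for(new_size);

	while (ino->nr_blocks < want) {
		if (fs->free_blocks == 0) {
			if (trim_on_error)
				toy_trim_to_isize(fs, ino);
			return -ENOSPC;
		}
		fs->free_blocks--;
		ino->nr_blocks++;
	}
	if (new_size > ino->i_size)
		ino->i_size = new_size;
	return 0;
}
```

With trim_on_error set to 0, a failed write leaves nr_blocks larger than
blocks_for(i_size), which is the state fsck complained about.  Whether
the real cleanup should go through ->setattr or vmtruncate() is exactly
the open question above; the toy trim stands in for either.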

- z

(Though I'm sure XFS' dio code already handles freeing blocks :))