在 2022/2/27 0:38, Amir Goldstein 写道:
On Sat, Feb 26, 2022 at 5:21 PM Chengguang Xu <cgxu519@xxxxxxxxxxxx> wrote:
Calling fsync for parent directory in copy-up to
ensure the change get synced.
It is not clear to me that this change is really needed
What if the reported problem?
I found this issue by eyeball scan when I was looking for
the places which need to mark overlay inode dirty in change.
However, I think there are still some real world cases will be impacted
by this kind of issue,
for example, using docker build to make new docker image and power
failure makes new
image inconsistant.
Besides this can impact performance in some workloads.
The difference between parent copy up and file copy up is that
failing to fsync to copied up data and linking/moving the upper file
into place may result in corrupted data after power failure if temp
file data is not synced.
Failing the fsync the parent dir OTOH may result in revert to
lower file data after power failure.
The thing is, although POSIX gives you no such guarantee, with
ext4/xfs fsync of the upper file itself will guarantee that parents
will be present after power failure (see [1]).
In the new test case (079) which I posted, I've tried xfs as underlying
fs and found the parent of
copy-up file didn't present after power failure. Am I missing something?
This is not true for btrfs, but there are fewer users using overlayfs
over btrfs (at least in the container world).
So while your patch is certainly "correct", for most users its effects
will be only negative - performance penalty without fixing anything.
So I think this change should be opt-in with Kconfig/module/mount option.
I prefer to add this as standard option, but if Miklos and you guys all
think it should be opt-in then
mount option is also acceptable for me.
Thanks,
Chengguang
Unfortunately, there is currently no way to know whether the underlying
filesystem needs the parent dir fsync or not.
I was trying to promote a VFS API for a weaker version of fsync for
the parent dir [2] but that effort did not converge.
Thanks,
Amir.
[1] https://lore.kernel.org/linux-fsdevel/1552418820-18102-1-git-send-email-jaya@xxxxxxxxxxxxx/
[2] https://lore.kernel.org/linux-fsdevel/20190527172655.9287-1-amir73il@xxxxxxxxx/
Signed-off-by: Chengguang Xu <cgxu519@xxxxxxxxxxxx>
---
fs/overlayfs/copy_up.c | 15 +++++++++++++++
1 file changed, 15 insertions(+)
diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index e040970408d4..52ca915f04a3 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -944,6 +944,7 @@ static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
{
int err;
DEFINE_DELAYED_CALL(done);
+ struct file *parent_file = NULL;
struct path parentpath;
struct ovl_copy_up_ctx ctx = {
.parent = parent,
@@ -972,6 +973,12 @@ static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
AT_STATX_SYNC_AS_STAT);
if (err)
return err;
+
+ parent_file = ovl_path_open(&parentpath, O_WRONLY);
+ if (IS_ERR(parent_file)) {
+ err = PTR_ERR(parent_file);
+ return err;
+ }
}
/* maybe truncate regular file. this has no effect on dirs */
@@ -998,6 +1005,14 @@ static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
err = ovl_copy_up_meta_inode_data(&ctx);
ovl_copy_up_end(dentry);
}
+
+ if (!err) {
+ if (parent_file) {
+ vfs_fsync(parent_file, 0);
+ fput(parent_file);
+ }
+ }
+
do_delayed_call(&done);
return err;
--
2.27.0