On Sun, Feb 27, 2022 at 3:25 PM Chengguang Xu <cgxu519@xxxxxxxxxxxx> wrote: > > 在 2022/2/27 13:16, Amir Goldstein 写道: > > On Sun, Feb 27, 2022 at 6:28 AM Chengguang Xu <cgxu519@xxxxxxxxxxxx> wrote: > >> 在 2022/2/27 0:38, Amir Goldstein 写道: > >>> On Sat, Feb 26, 2022 at 5:21 PM Chengguang Xu <cgxu519@xxxxxxxxxxxx> wrote: > >>>> Calling fsync for parent directory in copy-up to > >>>> ensure the change get synced. > >>> It is not clear to me that this change is really needed > >>> What if the reported problem? > >> I found this issue by eyeball scan when I was looking for > >> the places which need to mark overlay inode dirty in change. > >> > >> However, I think there are still some real world cases will be impacted > >> by this kind of issue, > >> for example, using docker build to make new docker image and power > >> failure makes new > >> image inconsistant. > > A very good example where the fsync of parent will be counter productive. > > The efficient way of building a docker image would be: > > 1. Write/copy up all the files > > 2. Writeback all the upper inodes > > 3. syncfs() upper fs > > > > 2 and 3 should happen on overlayfs umount as long as we properly > > marked all the copied up overlayfs inodes dirty. > > > > So the question is not if parent dir needs fsync, but if it needs to be > > dirtied. > > > >> > >>> Besides this can impact performance in some workloads. > >>> > >>> The difference between parent copy up and file copy up is that > >>> failing to fsync to copied up data and linking/moving the upper file > >>> into place may result in corrupted data after power failure if temp > >>> file data is not synced. > >>> > >>> Failing the fsync the parent dir OTOH may result in revert to > >>> lower file data after power failure. > >>> > >>> The thing is, although POSIX gives you no such guarantee, with > >>> ext4/xfs fsync of the upper file itself will guarantee that parents > >>> will be present after power failure (see [1]). > >> In the new test case (079) which I posted, I've tried xfs as underlying > >> fs and found the parent of > >> copy-up file didn't present after power failure. Am I missing something? > >> > > I think you are. > > The test does not reproduce an inconsistency. > > The test reproduced changes that did not persist to storage after > > power failure and that is the expected behavior when a user does > > not fsync after making changes - unless the 'sync' or 'dirsync' mount > > options have been used. > > > > I think your test will become correct if you use -o sync for the overlayfs > > mount and your patch will become correct if you do: > > > > + if (inode_needs_sync(d_inode(dentry)) { > > + parent_file = ovl_path_open(&parentpath, O_DIRECTORY|O_RDONLY); > > Why should parent_file open with O_RDONLY flag? Meanwhile, I think the It's a directory. What is the meaning of opening it for write? fsync doesn't need a writable fd. > fix is not sufifcient for fully supporting 'dirsync' or 'sync' in overlayfs. Right. > Anyway, I think the description of expected behavior above makes sense > so maybe the topic should turn to implement 'sync' or 'dirsync' mount > options or reject specifying those options in overlayfs. If anything, I would reject them, unless you know of users that are going to use this functionality, but maybe you better complete the syncfs work before going into new adventures ;-) Thanks, Amir.