On Wed, Dec 01, 2021 at 06:45:46PM +0800, Kang Chen wrote:
> I found that the 'xfs_update_prealloc_flags' function is called
> during the ‘fallocate’ syscall and the SUID flag is cleared
> when the 'XFS_PREALLOC_INVISIBLE' flag is not set.
> I am a beginner and have some questions about it.
>
> 1. What does XFS_PREALLOC_INVISIBLE mean and
> why should the SUID flag be cleared
> when XFS_PREALLOC_INVISIBLE is not set?
>
> 2. The behavior of XFS in handling the fallocate syscall is
> a bit strange and not quite the same as other file systems,
> such as ext4 and btrfs.
>
> Here is an example:
> foo is a normal file.
> chmod set the SUID and SGID flag.
> The last two parameters of fallocate are irrelevant to this problem.
> After running, ext4 and btrfs set mode o6000, but xfs set mode o2000.
> ```
> int fd = open("foo", 2, 0);
> chmod("foo", o6000);
> fallocate(fd, 3, 6549, 1334);
> fsync(fd);
> ```
>
> Can you give me some help?

The Open Group spec says (for file writes) that "Upon successful
completion, where nbyte is greater than 0, write() shall mark for update
the last data modification and last file status change timestamps of the
file, and if the file is a regular file, the S_ISUID and S_ISGID bits of
the file mode may be cleared."

https://pubs.opengroup.org/onlinepubs/9699919799/functions/write.html

I think XFS (and ocfs2) interpret fallocate calls as a file write, since
punch/zero/collapse/insert directly change the file contents and
extending the length changes what you get if you read() the entire file.
If nothing else, xfs updates the ctime for any fallocate request.

This might be overkill for preallocating into the middle of a file, but
for the rest I think it's necessary. That's the reason I can come up
with for why these two filesystems remove the suid/sgid bits on
fallocate.

--D

> Best wishes.
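
For anyone who wants to compare filesystems directly, the quoted test
can be turned into a small probe. This is only a minimal sketch, not
code from the thread: the helper name suid_sgid_after_fallocate is made
up for illustration, and it assumes Linux with glibc. It creates a
scratch file, sets mode 06000 as in the example above, calls
fallocate(), and reports which of the two bits are still set. What it
returns depends on the filesystem, the kernel version, and the caller's
privileges, so no particular result is assumed here.

```c
#define _GNU_SOURCE             /* glibc needs this to declare fallocate() */
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread): create `path`, set the
 * setuid and setgid bits on it, run fallocate() with the given
 * arguments, and return the surviving S_ISUID|S_ISGID bits.
 * Returns -1 if any step fails. The scratch file is removed afterwards.
 */
int suid_sgid_after_fallocate(const char *path, int falloc_mode,
                              off_t offset, off_t len)
{
    struct stat st;
    int ret = -1;
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);

    if (fd < 0)
        return -1;

    /* Same mode as the quoted example: 06000, with no execute bits. */
    if (fchmod(fd, 06000) == 0 &&
        fallocate(fd, falloc_mode, offset, len) == 0 &&
        fstat(fd, &st) == 0)
        ret = (int)(st.st_mode & (S_ISUID | S_ISGID));

    close(fd);
    unlink(path);
    return ret;
}
```

Calling it with falloc_mode 3, offset 6549 and len 1334 mirrors the
quoted test (3 is FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE); passing 0
requests a plain preallocation. Printing the result in octal on a file
under each mount point gives values that can be compared with the
o6000 and o2000 reported above.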