On Thu, Aug 24, 2017 at 09:59:46AM +1000, Dave Chinner wrote: > On Wed, Aug 23, 2017 at 09:35:43AM -0700, Darrick J. Wong wrote: > > On Wed, Aug 23, 2017 at 09:28:31AM +1000, Dave Chinner wrote: > > > On Tue, Aug 22, 2017 at 05:33:24PM -0500, Eric Sandeen wrote: > > > > On 8/22/17 4:10 PM, Ruoxin Jiang wrote: > > > > > Hello, > > > > > > > > > > We are researchers from Columbia University, New York. As part of our > > > > > current research we have run into some issues using xfs filesystem. > > > > > For example, we encountered a case where the setuid bit of a modified > > > > > executable was not removed as desired. > > > > > > > > Thanks for the reports. > > > > > > > > For case 1: > > > > > > > > > * In xfs, function `xfs_file_aio_write_checks` calls `file_remove_privs` which > > > > > removes special file priviledges when the executable is written to or truncated. > > > > > > > > > > * If an DIO write is not aligned to device logical sector size, the syscall > > > > > fails with EINVAL` before `xfs_file_aio_write_checks` is called. Thus the setuid > > > > > bit will not be removed from the modified executable. > > > > > > > > If the write did not happen, why should the setuid bit be removed? > > > > The write failed and the file's contents were not modified. It seems > > > > like xfs's behavior makes sense; is there a posix specification that > > > > indicates the behavior should be different? > > > > > > > > Case 2 does look problematic (failed mmap write extending the file EOF) > > > > > > mmap should not be able to extend the file at all. It should segv > > > if an attempt to write past EOF occurs. > > .... > > Create a 48k mapping at 0x20000000 to some anonymous memory... with > > PROT_NONE. > > > > r[5] = syscall(__NR_write, r[0], 0x2000b000ul, 0x1ul, 0, 0, 0, 0, 0, 0); > > printf("write()=%ld, errno=%d\n", r[5], errno); > > > > Then try to write 1 byte to r0 (whose pos is 10) from this anonymous > > region, which bails out with EFAULT because the region is PROT_NONE. > > This is effectively pwrite(r[0], <unreadable>, 1, 10); > > My bad - I didn't look at the code for that case, I was just > commenting on Eric's statement about mmap extending the file. > > > Therefore, it's not an mmap write that extends the file length, it's the > > write call that moves out EOF in preparation for the write, only to have > > the buffered write fail because the process doesn't have read access to > > the memory buffer. > > > > Not sure what to do about this corner case, though -- I guess you could > > check that at least the first byte of the write will succeed, and only > > then call xfs_zero_eof. Could still be a TOCTOU race though. > > > > Thoughts? > > The zeroing succeeded, yes? So there's no stale data exposure at > all? Correct. > Given it's such a whacky corner case, do we really care what > happens to the file size on error as long as we don't expose stale > data? I'd just as soon focus on sharper problems, so I'll just throw out the usual "patches welcomed" and see if anyone feels motivated to fix this. :) --D > > Cheers, > > Dave. > -- > Dave Chinner > david@xxxxxxxxxxxxx > -- > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in > the body of a message to majordomo@xxxxxxxxxxxxxxx > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-xfs" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html