On Wed, Apr 27, 2022 at 02:50:34PM +1000, Dave Chinner wrote: > On Tue, Apr 26, 2022 at 08:14:45PM -0700, Darrick J. Wong wrote: > > On Wed, Apr 27, 2022 at 12:22:52PM +1000, Dave Chinner wrote: > > > From: Dave Chinner <dchinner@xxxxxxxxxx> > > > > > > Callers currently have to round out the size of buffers to match the > > > aligment constraints of log iovecs and xlog_write(). They should not > > > need to know this detail, so introduce a new function to calculate > > > the iovec length (for use in ->iop_size implementations). Also > > > modify xlog_finish_iovec() to round up the length to the correct > > > alignment so the callers don't need to do this, either. > > > > > > Convert the only user - inode forks - of this alignment rounding to > > > use the new interface. > > > > Hmm. So currently, we require that the inode fork buffer be rounded up > > to the next 4 bytes, and then I guess the log will copy that into the > > log iovec? IOWs, if we have a 37-byte data fork, we'll allocate a 40 > > byte buffer for the xfs_ifork, and the log will copy all 40 bytes into a > > 40 byte iovec. > > Yes, that's how the current code works. It ends up leaking whatever > was in those 3 bytes into the shadow buffer that we then copy into > the log region. i.e. the existing code "leaks" non-zeroed allocated > memory to the journal. > > > Now it looks like we'd allocate a 37-byte buffer for the xfs_ifork, but > > the log iovec will still be 40 bytes. So ... do we copy 37 bytes out of > > the ifork buffer and zero the last 3 bytes in the iovec? > > Yes, we copy 37 bytes out of the ifork buffer now into the shadow > buffer so we do not overrun the inode fork buffer. > > > Does we leak > > kernel memory in those last 3 bytes? > > We does indeed still leak the remaining 3 bytes as they are not > zeroed. > > > Or do we copy 40 bytes and > > overrun? > > No, we definitely don't do that - KASAN gets very unhappy when you > do that... > > > It sorta looks like (at least for the local format case) xlog_copy_iovec > > will copy 37 bytes and leave the last 3 bytes of the iovec in whatever > > state it was in previously. Is that zeroed? Because it then looks like > > xlog_finish_iovec will round that 37 up to 40. > > The shadow buffer is only partially zeroed - the part that makes io > the header and iovec pointer array is zeroed, but the region that > the journal data is written to is not zeroed. > > > (FWIW I'm just checking for kernel memory exposure vectors here.) > > Yup, I hadn't even considered that aspect of the code because we > aren't actually leaking anything to userspace. If an unprivileged > user can read 3 bytes of uninitialised data out of the journal we've > got much, much bigger security problems to deal with. > > It should be trivial to fix, though. I'll do the initial fix as a > standalone patch, though, and then roll it into this one because the > problem has been around for a long while and fixing this patch > doesn't produce an easily backportable fix... <nod> I agree that it's a very minor disclosure vulnerability (certainly less severe than ALLOCSP) since you'd need CAP_SYS_RAWIO to exploit it. But certainly worth patching before someone discovers that a former pagecache page with your credit card numbers on it got recycled into a log vector page. Thanks for doing the fix. :) --D > Cheers, > > Dave. > -- > Dave Chinner > david@xxxxxxxxxxxxx