On Tue, Jan 28, 2020 at 11:54:17AM -0500, Jeff Moyer wrote: > "Darrick J. Wong" <darrick.wong@xxxxxxxxxx> writes: > > > Do we need to check that ft->lsectorsize <= XFS_MAX_SECTORSIZE too? > > We can. What would be the correct response to such a situation? Fail. The device is outside of the bounds of the validated on-disk format at that point. Historically speaking, the on-disk format only supports up to 32kB sector sizes and the limitation is a journal format limitation. That is, the journal implementation is based around 32kB buffers and journal writes requiring atomic sector writes. Even though v2 logs can support up to 256kB buffers, the format is still based around 32kB "segments" within the larger buffer. IIRC, this limitation is due to the built-in sector-based "whole buffer written" validation mechanism built into the log headers. The v1 format has an array that can store 64 individual sector indexes, so for 512 byte sectors the max log buffer size is 32kB. As long as that first sector of the log buffer is written atomically, it doesn't matter what the sector size actually is. However, we cannot do 32kB writes to a 64kB physical sector device, and hence the v1 format icannot support log records larger than 32kB, so the format is limited to 32kB sector sizes. The v2 log format does increase the size of the log buffer, but it does so by adding extension header sectors that contain the index mapping for each 32kB sector. It works under the same principle as the v1 log - as long as the first sector write is atomic, we can validate the extension headers are intact and so validate all 32kB segments of the larger buffer. The v2 log format also allows for stripe unit padding of writes, such that it will always write entire stripe units even when the log buffer is not full. Hence it supports writing aligned sizes larger than 32kB, even up to 256kB. However, it still relies on atomic sector writes for validation. When we moved to the v5 on-disk format, we also turned on CRC checking of the log records. THis means we are no longer dependent on atomic sector writes to detect torn log writes - if the first 512 byte sector is not written atomically, we can detect that in recovery on v5 filesystems. That means we don't really care what the sector size used by the log is anymore. We can do aligned log writes up to 256kB, and we can detect torn single sector writes. Hence I suspect XFS_MAX_SECTORSIZE is no longer relevant to v5 filesystems and we may be able to increase it to 64kB without any issues. We'd need to do quite a bit more code spelunking than I've just done, not to mention validation testing, before we can do such a thing, though.... > > (Not that I really expect disks with 64k LBA units...) > > It looks like you can tell loop to do that, but I wasn't able to make > that happen in practice. I'm not quite sure why, though. IIRC, it depends on what IO alignment capability the underlying filesystem provides via it's sector/block sizes. Cheers, Dave. -- Dave Chinner david@xxxxxxxxxxxxx