On Aug 23, 2017, at 2:13 PM, Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> On Wed, Aug 23, 2017 at 12:53 PM, Doug Nazar <nazard@xxxxxxxx> wrote:
>>
>> It's compiling now, but I think it's already set to MAX_LFS_FILESIZE.
>>
>> [ 169.095127] ppos=80180006000, s_maxbytes=7ffffffffff, magic=0x62646576,
>> type=bdev
>
> Oh, right you are - I'm much too used to 64-bit, where
> MAX_LFS_FILESIZE is basically infinite, and was just assuming that it
> was something like the UFS bug we had not that long ago that was due
> to the 32-bit limit.
>
> But yes, on 32-bit, we are limited by the 32-bit index into the page
> cache, and we limit the index to 31 bits too, so we have
> (PAGE_SIZE << 31) -1, which is that 7ffffffffff.
>
> And that also explains why people haven't seen it. You do need
>
>  (a) 32-bit environment
>
>  (b) a disk larger than that 8TB in size
>
> The *hard* limit for the page cache on a 32-bit environment should
> actually be (PAGE_SIZE << 32)-PAGE_SIZE (that final PAGE_SIZE
> subtraction is to make sure we don't generate that page cache with
> index -1), so having a disk that is 16TB or larger is not going to
> work, but your disk is right in that 8TB-16TB hole that used to work
> and was broken by that check.
>
> Anyway, that makes me feel better. I should have looked at your disk
> size more, now I at least understand why nobody noticed before.
>
> So just throw away my patch. That's wrong, and garbage.
>
> The *right* patch is likely just this instead:
>
> -#define MAX_LFS_FILESIZE (((loff_t)PAGE_SIZE << (BITS_PER_LONG-1))-1)
> +#define MAX_LFS_FILESIZE (((loff_t)PAGE_SIZE << BITS_PER_LONG)-PAGE_SIZE)
>
> which should make MAX_LFS_FILESIZE be 0xffffffff000 and your disk size
> should be ok.

Doug,
I noticed while checking for other implications of changing MAX_LFS_FILESIZE
that fs/jfs/super.c is also working around this limit. If you are going to
submit a patch for this, it also makes sense to fix jfs_fill_super() to
use MAX_LFS_FILESIZE instead of JFS rolling its own, something like:

	/* logical blocks are represented by 40 bits in pxd_t, etc.
	 * and page cache is indexed by long. */
	sb->s_maxbytes = min(((u64)sb->s_blocksize) << 40, MAX_LFS_FILESIZE);

It also looks like ocfs2_max_file_offset() is trying to avoid overflowing
the old 31-bit limit, and isn't using MAX_LFS_FILESIZE directly, so it will
now be wrong. It looks like it could use "bitshift = 32; trim = bytes;",
but Joel or Mark should confirm.

Finally, there is a check in fs/super.c::mount_fs() that verifies that
s_maxbytes is not set too large, but this has been present since 2.6.32
and should probably be removed at this point, or changed to a BUG_ON()
(see commit 42cb56ae2ab for details).

Cheers, Andreas
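
For anyone wanting to check the arithmetic quoted above, here is a minimal userspace sketch, not kernel code. It assumes a 4096-byte page and a 32-bit long (the SKETCH_* constants stand in for the kernel's PAGE_SIZE and BITS_PER_LONG), and every function name in it is hypothetical. offset_allowed() is only a simplified stand-in for the kind of s_maxbytes comparison that rejected the ppos from Doug's log; the real kernel check is not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumptions for this sketch: 4 KiB pages on a 32-bit kernel. */
#define SKETCH_PAGE_SIZE     UINT64_C(4096)
#define SKETCH_BITS_PER_LONG 32

/* Old definition: page index limited to 31 bits. */
static uint64_t old_max_lfs_filesize(void)
{
	return (SKETCH_PAGE_SIZE << (SKETCH_BITS_PER_LONG - 1)) - 1;
}

/*
 * Proposed definition: full 32-bit page index, minus one page so that
 * no page ever ends up with index -1.
 */
static uint64_t new_max_lfs_filesize(void)
{
	return (SKETCH_PAGE_SIZE << SKETCH_BITS_PER_LONG) - SKETCH_PAGE_SIZE;
}

/* Simplified stand-in for a "position must not exceed s_maxbytes" check. */
static bool offset_allowed(uint64_t pos, uint64_t maxbytes)
{
	return pos <= maxbytes;
}

/*
 * Shape of the suggested JFS limit: 40-bit block numbers, capped by the
 * page cache limit passed in as lfs_limit.
 */
static uint64_t jfs_style_maxbytes(uint64_t blocksize, uint64_t lfs_limit)
{
	uint64_t by_blocks = blocksize << 40;

	return by_blocks < lfs_limit ? by_blocks : lfs_limit;
}

/* Checks the values quoted in the thread against the sketch. */
static void sketch_self_check(void)
{
	const uint64_t ppos = UINT64_C(0x80180006000); /* from Doug's log */

	assert(old_max_lfs_filesize() == UINT64_C(0x7ffffffffff));
	assert(new_max_lfs_filesize() == UINT64_C(0xffffffff000));
	assert(!offset_allowed(ppos, old_max_lfs_filesize()));
	assert(offset_allowed(ppos, new_max_lfs_filesize()));
}
```

Under those assumptions the logged ppos sits just above the old 8TB limit and well below the new one, which is the 8TB-16TB hole described above. With a 4096-byte block size, jfs_style_maxbytes() is bounded by the page cache limit rather than by the 40-bit block number.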