Re: Kernels v4.9+ cause short reads of block devices

On Aug 23, 2017, at 2:13 PM, Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> 
> On Wed, Aug 23, 2017 at 12:53 PM, Doug Nazar <nazard@xxxxxxxx> wrote:
>> 
>> It's compiling now, but I think it's already set to MAX_LFS_FILESIZE.
>> 
>> [  169.095127] ppos=80180006000, s_maxbytes=7ffffffffff, magic=0x62646576,
>> type=bdev
> 
> Oh, right you are - I'm much too used to 64-bit, where
> MAX_LFS_FILESIZE is basically infinite, and was just assuming that it
> was something like the UFS bug we had not that long ago that was due
> to the 32-bit limit.
> 
> But yes, on 32-bit, we are limited by the 32-bit index into the page
> cache, and we limit the index to 31 bits too, so we have (PAGE_SIZE <<
> 31) -1, which is that 7ffffffffff.
> 
> And that also explains why people haven't seen it. You do need
> 
> (a) 32-bit environment
> 
> (b) a disk larger than that 8TB in size
> 
> The *hard* limit for the page cache on a 32-bit environment should
> actually be (PAGE_SIZE << 32)-PAGE_SIZE (that final PAGE_SIZE
> subtraction is to make sure we don't generate that page cache with
> index -1), so having a disk that is 16TB or larger is not going to
> work, but your disk is right in that 8TB-16TB hole that used to work
> and was broken by that check.
> 
> Anyway, that makes me feel better. I should have looked at your disk
> size more, now I at least understand why nobody noticed before.
> 
> So just throw away my patch. That's wrong, and garbage.
> 
> The *right* patch is likely just this instead:
> 
>  -#define MAX_LFS_FILESIZE       (((loff_t)PAGE_SIZE << (BITS_PER_LONG-1))-1)
>  +#define MAX_LFS_FILESIZE       (((loff_t)PAGE_SIZE << BITS_PER_LONG)-PAGE_SIZE)
> 
> which should make MAX_LFS_FILESIZE be 0xffffffff000 and your disk size
> should be ok.
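
For anyone who wants to check the numbers quoted above, here is a minimal
userspace sketch of the arithmetic.  It assumes 4 KiB pages and a 32-bit
long, and it assumes the check that broke things is a plain
"pos >= s_maxbytes" comparison; the function and constant names are made
up for illustration and are not kernel API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumptions for this sketch: 4 KiB pages, 32-bit long. */
enum { SKETCH_PAGE_SHIFT = 12, SKETCH_BITS_PER_LONG = 32 };

/* Old limit: (PAGE_SIZE << (BITS_PER_LONG-1)) - 1 */
static inline int64_t old_max_lfs_filesize(void)
{
	int64_t page_size = (int64_t)1 << SKETCH_PAGE_SHIFT;

	return (page_size << (SKETCH_BITS_PER_LONG - 1)) - 1;
}

/* Proposed limit: (PAGE_SIZE << BITS_PER_LONG) - PAGE_SIZE */
static inline int64_t new_max_lfs_filesize(void)
{
	int64_t page_size = (int64_t)1 << SKETCH_PAGE_SHIFT;

	return (page_size << SKETCH_BITS_PER_LONG) - page_size;
}

/* Assumed form of the check: a read at pos is allowed if pos < limit. */
static inline bool pos_within_limit(int64_t pos, int64_t limit)
{
	return pos < limit;
}
```

With these assumptions the old limit comes out as 0x7ffffffffff and the
proposed one as 0xffffffff000.  The reported ppos of 0x80180006000 is past
the old limit but inside the new one, which matches the 8TB-16TB hole
described above.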

Doug,
I noticed while checking for other implications of changing MAX_LFS_FILESIZE
that fs/jfs/super.c is also working around this limit.  If you are going
to submit a patch for this, it also makes sense to fix jfs_fill_super() to
use MAX_LFS_FILESIZE instead of JFS rolling its own, something like:

	/* logical blocks are represented by 40 bits in pxd_t, etc.
	 * and page cache is indexed by long. */
	sb->s_maxbytes = min(((u64)sb->s_blocksize) << 40,
			     MAX_LFS_FILESIZE);
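
To illustrate which of the two limits ends up winning, here is a rough
userspace sketch of the same clamp.  The helper name and its parameters
are hypothetical, and the caller passes in the page cache limit, so
nothing here depends on a kernel header.

```c
#include <stdint.h>

/*
 * Userspace sketch only; the name is hypothetical, not kernel API.
 * blocksize:        filesystem block size in bytes
 * max_lfs_filesize: page cache limit for the architecture
 */
static inline int64_t jfs_style_maxbytes(uint32_t blocksize,
					 int64_t max_lfs_filesize)
{
	/* 40-bit logical block numbers: at most blocksize << 40 bytes */
	int64_t format_limit = (int64_t)blocksize << 40;

	return format_limit < max_lfs_filesize ? format_limit
					       : max_lfs_filesize;
}
```

With 4096-byte blocks the on-disk limit is 2^52 bytes.  On 32-bit with the
proposed MAX_LFS_FILESIZE of 0xffffffff000 the page cache limit is the
smaller one, while with a 64-bit limit of 0x7fffffffffffffff (assumed
here) the 40-bit block number is what caps the file size.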

It also looks like ocfs2_max_file_offset() is trying to avoid overflowing
the old 31-bit limit, and isn't using MAX_LFS_FILESIZE directly, so it will
now be wrong.  It looks like it could use "bitshift = 32; trim = bytes;",
but Joel or Mark should confirm.

Finally, there is a check in fs/super.c::mount_fs() that is verifying
s_maxbytes is not set too large, but this has been present since 2.6.32
and should probably be removed at this point, or changed to a BUG_ON()
(see commit 42cb56ae2ab for details).

Cheers, Andreas






