On Mon, 28 Sep 2015 09:26:45 +1000 Dave Chinner <david@xxxxxxxxxxxxx> wrote: > On Sun, Sep 27, 2015 at 10:59:33AM -0700, Hugh Dickins wrote: > > On Sun, 27 Sep 2015, angelo wrote: > > > On 27/09/2015 03:36, Hugh Dickins wrote: > > > > Let's Cc linux-fsdevel, who will be more knowledgable. > > > > > > > > On Sun, 27 Sep 2015, angelo wrote: > > > > > > > > > Hi all, > > > > > > > > > > running xfstests, generic 308 on whatever 32bit arch is possible > > > > > to observe cpu to hang near 100% on unlink. > > > > I have since tried to repeat your result, but generic/308 on 32-bit just > > skipped the test for me. I didn't investigate why: it's quite possible > > that I had a leftover 64-bit executable in the path that it tried to use, > > but didn't show the relevant error message. > > > > I did verify your result with a standalone test; and that proves that > > nobody has actually been using such files in practice before you, > > since unmounting the xfs filesystem would hang in the same way if > > they didn't unlink them. > > It used to work - this is a regression. Just because nobody has > reported it recently simply means nobody has run xfstests on 32 bit > storage recently. There are 32 bit systems out there that expect > this to work, and we've broken it. > > The regression was introduced in 3.11 by this commit: > > commit 5a7203947a1d9b6f3a00a39fda08c2466489555f > Author: Lukas Czerner <lczerner@xxxxxxxxxx> > Date: Mon May 27 23:32:35 2013 -0400 > > mm: teach truncate_inode_pages_range() to handle non page aligned ranges > > This commit changes truncate_inode_pages_range() so it can handle non > page aligned regions of the truncate. Currently we can hit BUG_ON when > the end of the range is not page aligned, but we can handle unaligned > start of the range. > > Being able to handle non page aligned regions of the page can help file > system punch_hole implementations and save some work, because once we're > holding the page we might as well deal with it right away. > > In previous commits we've changed ->invalidatepage() prototype to accept > 'length' argument to be able to specify range to invalidate. No we can > use that new ability in truncate_inode_pages_range(). > > Signed-off-by: Lukas Czerner <lczerner@xxxxxxxxxx> > Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> > Cc: Hugh Dickins <hughd@xxxxxxxxxx> > Signed-off-by: Theodore Ts'o <tytso@xxxxxxx> > > > > > > > The test removes a sparse file of length 16tera where only the last > > > > > 4096 bytes block is mapped. > > > > > At line 265 of truncate.c there is a > > > > > if (index >= end) > > > > > break; > > > > > But if index is, as in this case, a 4294967295, it match -1 used as > > > > > eof. Hence the cpu loops 100% just after. > > > > > > > > > That's odd. I've not checked your patch, because I think the problem > > > > would go beyond truncate, and the root cause lie elsewhere. > > > > > > > > My understanding is that the 32-bit > > > > #define MAX_LFS_FILESIZE (((loff_t)PAGE_CACHE_SIZE << (BITS_PER_LONG-1))-1) > > > > makes a page->index of -1 (or any "negative") impossible to reach. > > We've supported > 8TB files on 32 bit XFS file systems since > since mid 2003: > > http://oss.sgi.com/cgi-bin/gitweb.cgi?p=archive/xfs-import.git;a=commitdiff;h=d13d78f6b83eefbd90a6cac5c9fbe42560c6511e > > And it's been documented as such for a long time, too: > > http://xfs.org/docs/xfsdocs-xml-dev/XFS_User_Guide/tmp/en-US/html/ch02s04.html > > (that was written, IIRC, back in 2007). > > i.e. whatever the definition says about MAX_LFS_FILESIZE being an > 8TB limit on 32 bit is stale and has been for a very long time. > > > A surprise to me, and I expect to others, that 32-bit xfs is not > > respecting MAX_LFS_FILESIZE: going its own way with 0xfff ffffffff > > instead of 0x7ff ffffffff (on a PAGE_CACHE_SIZE 4096 system). > > > > MAX_LFS_FILESIZE has been defined that way ever since v2.5.4: > > this is probably just an oversight from when xfs was later added > > into the Linux tree. > > We supported >8 TB file offsets on 32 bit systems on 2.4 kernels > with XFS, so it sounds like it was wrong even when it was first > committed. Of course, XFS wasn't merged until 2.5.36, so I guess > nobody realised... ;) > > > > But if s_maxbytes doesn't have to be greater than MAX_LFS_FILESIZE, > > > i agree the issue should be fixed in layers above. > > > > There is a "filesystems should never set s_maxbytes larger than > > MAX_LFS_FILESIZE" comment in fs/super.c, but unfortunately its > > warning is written with just 64-bit in mind (testing for negative). > > Yup, introduced here: > > commit 42cb56ae2ab67390da34906b27bedc3f2ff1393b > Author: Jeff Layton <jlayton@xxxxxxxxxx> > Date: Fri Sep 18 13:05:53 2009 -0700 > > vfs: change sb->s_maxbytes to a loff_t > > sb->s_maxbytes is supposed to indicate the maximum size of a file that can > exist on the filesystem. It's declared as an unsigned long long. > > And yes, that will never fire on a 32bit filesystem, because loff_t > is a "long long" type.... > Hmm...should we change that to something like this instead? WARN(((unsigned long long)sb->s_maxbytes > (unsigned long long)MAX_LFS_FILESIZE, "%s set sb->s_maxbytes to too large a value (0x%llx)\n", type->name, sb->s_maxbytes); -- Jeff Layton <jlayton@xxxxxxxxxxxxxxx> -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html