On 2/21/13 12:07 AM, Theodore Ts'o wrote:
> On Mon, Feb 18, 2013 at 03:41:11PM -0600, Eric Sandeen wrote:
>> Can't remember how I stumbled on this testcase, but mounting
>> an ext3 filesystem with "-t ext4" and then resizing leads to trouble.
>>
>> With -o nodelalloc, the newly added space isn't seen by the allocator
>> and we get ENOSPC for the extending writes in the script below.
>>
>> Without -o nodelalloc, the writes worked but I got an umount hang.
>>
>> Without -t ext4 (but letting ext4.ko handle the ext3 mount) it seems
>> to work fine.
>>
>> Haven't looked into it much at all yet but wanted to put it out
>> there for posterity.
>
> At least one of the problems is that ext4_alloc_blocks() is buggy if
> it is asked to allocate one or more indirect blocks, and then it
> doesn't have room to allocate any direct blocks. In that case,
> ext4_alloc_blocks() does not return ENOSPC, and so ext4_alloc_branch()
> doesn't fail. But since the number of direct blocks allocated is
> zero, ext4_splice_branch() will not actually initialize the indirect
> block, and then we end up looping forever and calling
> ext4_mballoc_alloc() --- demonstrating that one of the best definitions
> of insanity is doing the same thing over and over again and expecting
> a different result:
>
> flush-254:32-2913 [001] .... 1073.028245: ext4_mballoc_alloc: dev 254,32 inode 21824 orig 0/4031/64@4981 goal 0/4029/2048@4096 result 0/0/0@0 blks 0 grps 0 cr 3 flags 0x0c20 tail 0 broken 0
> flush-254:32-2913 [001] .... 1073.050655: ext4_mballoc_alloc: dev 254,32 inode 21824 orig 0/4031/64@4981 goal 0/4029/2048@4096 result 0/0/0@0 blks 0 grps 0 cr 3 flags 0x0c20 tail 0 broken 0
> flush-254:32-2913 [001] .... 1073.073034: ext4_mballoc_alloc: dev 254,32 inode 21824 orig 0/4031/64@4981 goal 0/4029/2048@4096 result 0/0/0@0 blks 0 grps 0 cr 3 flags 0x0c20 tail 0 broken 0
> flush-254:32-2913 [001] .... 1073.112163: ext4_mballoc_alloc: dev 254,32 inode 21824 orig 0/4031/64@4981 goal 0/4029/2048@4096 result 0/0/0@0 blks 0 grps 0 cr 3 flags 0x0c20 tail 0 broken 0
>
> I suspect the right way to deal with this is to nuke
> ext4_alloc_blocks() from orbit, and change ext4_alloc_branch() to
> allocate the indirect and direct blocks directly, calling
> ext4_new_meta_block() and ext4_mb_new_blocks() directly. What we have
> right now is pretty gross....
>
> The other problem is why resizing isn't adding the blocks so that they
> are visible to the allocator. Since we are using the same code path
> for ext3 and ext4 file systems, I have a sneaking suspicion that we're
> not actually making all of the newly allocated blocks for ext4 file
> systems available too, but it's something like we're not making the
> first block in each flex_bg group available (and that happens to be all
> of the newly grown blocks for ext3 file systems).
>
> As near as I can tell this isn't a regression, but since this is a
> pretty serious bug, it's something we should try to fix during the
> 3.8 development cycle.

I think you're correct that it's not a regression.

Now I remember that the same basic bug came up w/ a RHEL customer
who was doing this "mkfs ext3; mount -t ext4 -o nodelalloc" business.
Which really isn't tested or supported, but it's still quite the odd
corner case.

(Being RHEL, though, it wasn't clear that the same bug persisted
upstream, but it appears that it does. The bug was more noticeable on
the older RHEL kernel because it spewed the ext4 allocation context
errors as well).

Thanks,
-Eric

> - Ted
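
For anyone reading this in the archive, the failure mode Ted describes can be sketched with a toy userspace model. This is not the kernel code and not the fix Ted proposes (he suggests removing ext4_alloc_blocks() altogether); struct toy_pool, toy_take(), toy_alloc_buggy() and toy_alloc_checked() are hypothetical names made up for illustration. The sketch only shows why "indirect blocks allocated, zero direct blocks, return success" leaves the caller unable to make progress, and what reporting ENOSPC in that case would look like.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical free-space pool; stands in for the real block allocator. */
struct toy_pool {
    int free_blocks;
};

/* Take up to `want` blocks from the pool; returns how many were granted. */
int toy_take(struct toy_pool *p, int want)
{
    assert(want >= 0);
    int got = want < p->free_blocks ? want : p->free_blocks;
    p->free_blocks -= got;
    return got;
}

/*
 * Buggy shape: indirect blocks are allocated first, and ending up with
 * zero direct blocks afterwards is still reported as success.
 */
int toy_alloc_buggy(struct toy_pool *p, int indirect, int direct,
                    int *direct_got)
{
    if (p->free_blocks < indirect)
        return -ENOSPC;
    toy_take(p, indirect);
    *direct_got = toy_take(p, direct);
    return 0; /* caller cannot tell that no data block was obtained */
}

/*
 * Checked shape: zero direct blocks is treated as ENOSPC, and the
 * indirect blocks taken above are handed back to the pool.
 */
int toy_alloc_checked(struct toy_pool *p, int indirect, int direct,
                      int *direct_got)
{
    if (p->free_blocks < indirect)
        return -ENOSPC;
    toy_take(p, indirect);
    *direct_got = toy_take(p, direct);
    if (direct > 0 && *direct_got == 0) {
        p->free_blocks += indirect; /* undo the indirect allocation */
        return -ENOSPC;
    }
    return 0;
}
```

With a pool holding a single free block and a request for one indirect plus four direct blocks, the buggy variant returns 0 with zero direct blocks and an empty pool, so a caller that retries gets the same answer every time. The checked variant returns -ENOSPC and leaves the pool as it found it.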