On Mon, Aug 22, 2011 at 11:27:05AM +1000, Dave Chinner wrote: > On Sun, Aug 21, 2011 at 03:59:34PM -0700, Robert Jacobs wrote: > > > On Mittwoch, 17. August 2011 Robert Jacobs wrote: > > > > I've attached the outputs of xfs_check and three different runs of > > > > xfs_repair (with backtrace, when available). > > > > > > Can you provide a XFS metadump? That should help devs to reproduce the > > > error. > > > > I've uploaded it to <http://eamp.org/metadumpf.lzma>. It's 9MB but I > > have somewhat limited upload quota. I noticed that some filenames > > weren't obscured so I obscured them by hand -- if this is a problem, > > please tell me. > > The metadump image you have provided does not trigger problems on > either 3.1.5 on a current xfs_repair build out of the xfsprogs-dev > repository on my machines. Can you restore the image and run > xfs_repair on the image and confirm it still crashes? I stand corrected - it doesn't crash on x86-64, but does crash on i686. I get several different random crashes, so it looks like some kind of heap corruption is occurring. Running under valgrind shows that the inode extent blkmaps are getting out of bound reads and writes. I think the problem is this: typedef struct bmap_ext { xfs_dfiloff_t startoff; xfs_dfsbno_t startblock; xfs_dfilblks_t blockcount; } bmap_ext_t; That's a 32 byte structure. Then there is this: /* * Block map. */ typedef struct blkmap { int naexts; int nexts; bmap_ext_t exts[1]; } blkmap_t; #define BLKMAP_SIZE(n) \ (offsetof(blkmap_t, exts) + (sizeof(bmap_ext_t) * (n))) So the allocation here is (8 + (32 * n)) bytes. In the case of the corruption, I am seeing this number of extents being asked for: realloc from 11 to 536870913 fork data so n = 536870913. therefore the allocation size is: size = 8 + (32 * 536870913) = 17,179,869,224 = 0x400000028 Which, when we truncate to 32 bits is 0x28 or 40 bytes. So, access to the second extent will overflow the allocated array and we corrupt memory. The patch below will result in the following output with that metadump image when it encounters this corruption: Number of extents requested in blkmap_alloc (536870913) overflows 32 bits. If this is not a corruption, then you will need a 64 bit system to repair this filesystem. It will then continue and shouldn't crash. As it is, I'd recommend upgrading to x86-64 anyway - XFS on i686 doesn't receive a lot of developer testing... Cheers, Dave. -- Dave Chinner david@xxxxxxxxxxxxx xfs-repair: fix some valgrind reported errors on i686 From: Dave Chinner <dchinner@xxxxxxxxxx> Fix a potential prefetch read problem due to the first loop execution of pf_batch_read potentially not initialising the fsbno varaible: ==10177== Thread 6: ==10177== Conditional jump or move depends on uninitialised value(s) ==10177== at 0x8079CAB: pf_batch_read (prefetch.c:408) ==10177== by 0x6A2996D: clone (clone.S:130) ==10177== Fix a bunch of invalid read/write errors due to excessive blkmap allocations when inode forks are corrupted. These show up some time after making a blkmap allocation for 536870913 extents on i686, which is followed some time later by a crash caused bymemory corruption. This blkmap allocation size overflows 32 bits in such a way that it results in a 32 byte allocation and so access to the second extent results in access beyond the allocated memory and corrupts random memory. ==5419== Invalid write of size 4 ==5419== at 0x80507DA: blkmap_set_ext (bmap.c:260) ==5419== by 0x8055CF4: process_bmbt_reclist_int (dinode.c:712) ==5419== by 0x8056206: process_bmbt_reclist (dinode.c:813) ==5419== by 0x80579DA: process_exinode (dinode.c:1324) ==5419== by 0x8059B77: process_dinode_int (dinode.c:2036) ==5419== by 0x805ABE6: process_dinode (dinode.c:2823) ==5419== by 0x8052493: process_inode_chunk.isra.4 (dino_chunks.c:777) ==5419== by 0x8054012: process_aginodes (dino_chunks.c:1024) ==5419== by 0xFFF: ??? ==5419== Address 0x944cfb8 is 0 bytes after a block of size 32 alloc'd ==5419== at 0x48E1102: realloc (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so) ==5419== by 0x80501F3: blkmap_alloc (bmap.c:56) ==5419== by 0x80599F5: process_dinode_int (dinode.c:2027) ==5419== by 0x805ABE6: process_dinode (dinode.c:2823) ==5419== by 0x8052493: process_inode_chunk.isra.4 (dino_chunks.c:777) ==5419== by 0x8054012: process_aginodes (dino_chunks.c:1024) ==5419== by 0xFFF: ??? Add overflow detection code into the blkmap allocation code to avoid this problem, and also free large allocations once they are finished with to avoid pinning large amounts of memory due to the occasional large extent list in a filesystem. Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx> --- repair/bmap.c | 37 ++++++++++++++++++++++++++++++++++++- repair/prefetch.c | 2 +- 2 files changed, 37 insertions(+), 2 deletions(-) diff --git a/repair/bmap.c b/repair/bmap.c index 79b9f79..1127a87 100644 --- a/repair/bmap.c +++ b/repair/bmap.c @@ -47,6 +47,17 @@ blkmap_alloc( if (nex < 1) nex = 1; +#if (BITS_PER_LONG != 64) + if (nex > (INT_MAX / sizeof(bmap_ext_t) - 1)) { + do_warn( + _("Number of extents requested in blkmap_alloc (%u) overflows 32 bits.\n" + "If this is not a corruption, then will need a 64 bit system\n" + "to repair this filesystem.\n"), + nex); + return NULL; + } +#endif + key = whichfork ? ablkmap_key : dblkmap_key; blkmap = pthread_getspecific(key); if (!blkmap || blkmap->naexts < nex) { @@ -66,12 +77,27 @@ blkmap_alloc( /* * Free a block map. + * + * If the map is a large, uncommon size (say for hundreds of thousands of + * extents) then free it to release the memory. This prevents us from pinning + * large tracts of memory due to corrupted fork values or one-off fragmented + * files. Otherwise we have nothing to do but keep the memory around for the + * next inode */ void blkmap_free( blkmap_t *blkmap) { - /* nothing to do! - keep the memory around for the next inode */ + /* consider more than 100k extents rare */ + if (blkmap->naexts < 100 * 1024) + return; + + if (blkmap == pthread_getspecific(dblkmap_key)) + pthread_setspecific(dblkmap_key, NULL); + else + pthread_setspecific(ablkmap_key, NULL); + + free(blkmap); } /* @@ -218,6 +244,15 @@ blkmap_grow( } blkmap->naexts += 4; +#if (BITS_PER_LONG != 64) + if (blkmap->naexts > (INT_MAX / sizeof(bmap_ext_t) - 1)) { + do_error( + _("Number of extents requested in blkmap_grow (%u) overflows 32 bits.\n" + "You need a 64 bit system to repair this filesystem.\n"), + blkmap->naexts); + return NULL; + } +#endif blkmap = realloc(blkmap, BLKMAP_SIZE(blkmap->naexts)); if (blkmap == NULL) do_error(_("realloc failed in blkmap_grow\n")); diff --git a/repair/prefetch.c b/repair/prefetch.c index d2fdf90..da074a8 100644 --- a/repair/prefetch.c +++ b/repair/prefetch.c @@ -397,7 +397,7 @@ pf_batch_read( int len, size; int i; int inode_bufs; - unsigned long fsbno; + unsigned long fsbno = 0; unsigned long max_fsbno; char *pbuf; _______________________________________________ xfs mailing list xfs@xxxxxxxxxxx http://oss.sgi.com/mailman/listinfo/xfs