e2fsck hanging

I'm trying to run e2fsck on a ~6TB filesystem which is about 90% full. We use this filesystem for backup to disk, so it contains a number of hard links (link counts up to 90).

strace shows:

write(1, "Pass 2: Checking ", 17)       = 17
write(1, "directory", 9)                = 9
write(1, " structure\n", 11)            = 11
mmap(NULL, 91574272, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2b4299dbd000
mmap(NULL, 91574272, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2b429f512000
mmap(NULL, 506724352, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2b42a4c67000
mmap(NULL, 596029440, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
brk(0x23e56000)                         = 0x5eb000
mmap(NULL, 596164608, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
mmap(NULL, 2097152, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = 0x2b430a09e000
munmap(0x2b430a09e000, 401408)          = 0
munmap(0x2b430a200000, 647168)          = 0
mprotect(0x2b430a100000, 135168, PROT_READ|PROT_WRITE) = 0
mmap(NULL, 596029440, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
lseek(3, 6303744, SEEK_SET)             = 6303744
read(3, "\2\0\0\0\f\0\1\2.\0\0\0\2\0\0\0\f\0\2\2..\0\0\v\0\0\0 \24"..., 4096) = 4096
lseek(3, 6307840, SEEK_SET)             = 6307840
read(3, "\v\0\0\0\f\0\1\2.\0\0\0\2\0\0\0\364\17\2\2..\0\0\0\0\0"..., 4096) = 4096
lseek(3, 6311936, SEEK_SET)             = 6311936
read(3, "\0\0\0\0\0\20\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
lseek(3, 6316032, SEEK_SET)             = 6316032
read(3, "\0\0\0\0\0\20\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
lseek(3, 6320128, SEEK_SET)             = 6320128
read(3, "\0\0\0\0\0\20\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
lseek(3, 41709568, SEEK_SET)            = 41709568
read(3, "\323\0\0\0\f\0\1\2.\0\0\0\226\2\252+\f\0\2\2..\0\0\324"..., 4096) = 4096
lseek(3, 41713664, SEEK_SET)            = 41713664
read(3, "\324\0\0\0\f\0\1\2.\0\0\0\323\0\0\0\f\0\2\2..\0\0\214 \300"..., 4096) = 4096
lseek(3, 41717760, SEEK_SET)            = 41717760
read(3, "\325\0\0\0\f\0\1\2.\0\0\0\226\2\252+\f\0\2\2..\0\0\326"..., 4096) = 4096

And, that's it.  No more output.
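
For a sense of scale, the mmap() sizes in the trace can be converted to MiB. This is only a quick arithmetic sketch: the byte counts are copied from the strace output above, the 2 MB PROT_NONE mapping is left out, and the helper name to_mib is made up for illustration.

```python
# Sizes in bytes, copied from the anonymous mmap() calls in the strace output.
# The 2097152-byte PROT_NONE mapping is deliberately excluded.
succeeded = [91574272, 91574272, 506724352]
failed = [596029440, 596164608, 596029440]

MIB = 1024 * 1024


def to_mib(nbytes):
    """Convert a byte count to mebibytes (hypothetical helper)."""
    return nbytes / MIB


total_ok = sum(succeeded)
print(f"mapped successfully: {to_mib(total_ok):.1f} MiB")
for size in failed:
    print(f"failed request:      {to_mib(size):.1f} MiB")
```

Each failed request is a single allocation of a bit under 570 MiB, requested after roughly 658 MiB of anonymous mappings had already succeeded.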

A backtrace from gdb shows:

(gdb) bt
#0  0x0000000000418aa5 in get_icount_el (icount=0x5cf170,
ino=732562070, create=1) at icount.c:251
#1  0x0000000000418dd7 in ext2fs_icount_increment (icount=0x5cf170,
ino=732562070, ret=0x7fffffa79a96)
     at icount.c:339
#2  0x000000000040a3cf in check_dir_block (fs=0x5af560,
db=0x2b7070cc6064, priv_data=0x7fffffa79c90) at pass2.c:1021
#3  0x0000000000416c69 in ext2fs_dblist_iterate (dblist=0x5c3f20,
func=0x409980 <check_dir_block>,
     priv_data=0x7fffffa79c90) at dblist.c:234
#4  0x0000000000408d9d in e2fsck_pass2 (ctx=0x5ae700) at pass2.c:149
#5  0x0000000000403102 in e2fsck_run (ctx=0x5ae700) at e2fsck.c:193
#6  0x0000000000401e50 in main (argc=Variable "argc" is not available.
) at unix.c:1075


It's stuck inside the while loop in get_icount_el() (line 251).

I've added more memory to the server (up to 6 GB now), and am re-running e2fsck. Additionally, I raised /proc/sys/vm/max_map_count to 20,000,000 (a number I just pulled out of the air). It takes 6 or 7 hours to get to the point where it locks up, so I don't know yet whether this is going to help. I figured that while it's running I would post here to see if anyone has any additional insights.
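
To confirm that the new settings are actually in effect while the check runs, something like the following could be used. It is a minimal sketch that assumes a Linux /proc layout; parse_meminfo and read_text_or_none are helper names invented for this example, not part of any existing tool.

```python
from pathlib import Path


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field name: integer value}.

    Most fields are reported in kB; a few (e.g. HugePages_Total) are counts.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def read_text_or_none(path):
    """Return the file's contents, or None if it cannot be read."""
    try:
        return Path(path).read_text()
    except OSError:
        return None


def report():
    """Print max_map_count plus total RAM and swap, where available."""
    max_map = read_text_or_none("/proc/sys/vm/max_map_count")
    print("vm.max_map_count:", max_map.strip() if max_map else "unavailable")

    meminfo = read_text_or_none("/proc/meminfo")
    fields = parse_meminfo(meminfo) if meminfo else {}
    for key in ("MemTotal", "SwapTotal"):
        value = fields.get(key)
        print(f"{key}:", f"{value} kB" if value is not None else "unavailable")


report()
```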

Thanks!

Brian Davidson
George Mason University

_______________________________________________
Ext3-users mailing list
Ext3-users@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/ext3-users
