Kernel panic when populating cluster

I'm populating my 6-node cluster, running glusterfs 3.0.4, by copying
files into /mnt/glusterfs, the gluster-mounted filesystem. One machine
went down with a kernel panic last week, but I wasn't able to see the
error (it's a remote server), so we just restarted it and carried on.
Today I was again running 4 instances, all writing to the
/mnt/glusterfs directory, and saw the following error in the logs.
I've stopped and restarted my processes, this time running just two
of them, and I'm not seeing the error. Obviously it now takes much
longer to populate the cluster. Could I have overloaded it by having
four shell scripts copying files into the mount? What does this error
mean, and is my method the proper way to populate a 6-node cluster
with 50TB capacity?
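
To make the setup concrete, here is a minimal sketch of a parallel copy
with a cap on the number of concurrent writers, so the load on the mount
can be dropped from four workers to two without restarting separate
scripts. It is not the actual scripts used here: the function name
copy_tree_bounded, the source path in the usage comment, and the default
worker count are all assumptions for illustration.

```python
import shutil
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def copy_tree_bounded(src_root, dst_root, max_workers=2):
    """Copy every regular file under src_root into dst_root,
    running at most max_workers copies at the same time."""
    src_root = Path(src_root)
    dst_root = Path(dst_root)

    # Collect the file list up front so the work can be handed to the pool.
    files = [p for p in src_root.rglob("*") if p.is_file()]

    def copy_one(src):
        # Recreate the relative directory layout under the destination.
        dst = dst_root / src.relative_to(src_root)
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)
        return dst

    # The pool size is the only knob: it bounds how many files are
    # being written to the destination at any moment.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(copy_one, files))


# Hypothetical usage (the source path is a placeholder):
# copy_tree_bounded("/data/source", "/mnt/glusterfs", max_workers=2)
```

Whether two, four, or more workers is a safe level depends on the
cluster; the sketch only makes that number easy to change.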

Thanks

P

==> /var/log/syslog <==
Jul 20 23:49:54 clustr-01 kernel: [794473.515204] INFO: task cp:6706 blocked for more than 120 seconds.
Jul 20 23:49:54 clustr-01 kernel: [794473.515235] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul 20 23:49:54 clustr-01 kernel: [794473.515292] cp            D ffff880143d955c0     0  6706      1 0x00000004
Jul 20 23:49:54 clustr-01 kernel: [794473.515296]  ffff88013ce55bd0 0000000000000046 ffff88005d4afbc8 ffff88005d4afbc4
Jul 20 23:49:54 clustr-01 kernel: [794473.515300]  000000000000000e 0000000000000096 000000000000f8a0 ffff88005d4affd8
Jul 20 23:49:54 clustr-01 kernel: [794473.515303]  00000000000155c0 00000000000155c0 ffff88023e35e2e0 ffff88023e35e5d8
Jul 20 23:49:54 clustr-01 kernel: [794473.515306] Call Trace:
Jul 20 23:49:54 clustr-01 kernel: [794473.515315] [<ffffffffa0213a99>] ? fuse_request_send+0x196/0x249 [fuse]
Jul 20 23:49:54 clustr-01 kernel: [794473.515319] [<ffffffff81064a56>] ? autoremove_wake_function+0x0/0x2e
Jul 20 23:49:54 clustr-01 kernel: [794473.515324] [<ffffffffa0218086>] ? fuse_flush+0xca/0xfe [fuse]
Jul 20 23:49:54 clustr-01 kernel: [794473.515328] [<ffffffff810eb90e>] ? filp_close+0x37/0x62
Jul 20 23:49:54 clustr-01 kernel: [794473.515332] [<ffffffff8104f710>] ? put_files_struct+0x64/0xc1
Jul 20 23:49:54 clustr-01 kernel: [794473.515335] [<ffffffff81050fb2>] ? do_exit+0x225/0x6b5
Jul 20 23:49:54 clustr-01 kernel: [794473.515337] [<ffffffff810514b8>] ? do_group_exit+0x76/0x9d
Jul 20 23:49:54 clustr-01 kernel: [794473.515341] [<ffffffff8105dc50>] ? get_signal_to_deliver+0x310/0x33c
Jul 20 23:49:54 clustr-01 kernel: [794473.515353] [<ffffffff8101002f>] ? do_notify_resume+0x87/0x73f
Jul 20 23:49:54 clustr-01 kernel: [794473.515357] [<ffffffff810cb774>] ? handle_mm_fault+0x2f7/0x7a5
Jul 20 23:49:54 clustr-01 kernel: [794473.515361] [<ffffffff810eddd6>] ? vfs_read+0xa6/0xff
Jul 20 23:49:54 clustr-01 kernel: [794473.515363] [<ffffffff81010e0e>] ? int_signal+0x12/0x17

-- 
http://philcryer.com

