Apache hung tasks still occur with glusterfs 3.2.1

jlondon at lawinfo.com (Justice London) · Mon, 13 Jun 2011 06:58:07 -0700

Disable io-cache and up the threads to 64 and your problems should
disappear. They did for me when I made both of these changes.

Justice London

From: gluster-users-bounces at gluster.org
[mailto:gluster-users-bounces at gluster.org] On Behalf Of Jiri Lunacek
Sent: Monday, June 13, 2011 1:49 AM
To: gluster-users at gluster.org
Subject: Apache hung tasks still occur with glusterfs 3.2.1

Hi all.

We have been having problems with hung tasks of apache reading from
glusterfs 2-replica volume ever since upgrading to 3.2.0. The problems were
identical to those described here:
http://gluster.org/pipermail/gluster-users/2011-May/007697.html

Yesterday we updated to 3.2.1.

A good thing is that the hung tasks stopped appearing when gluster is in
"intact" operation, i.e. when there are no modifications to the gluster
configs at all.

Today we modified some other volume exported by the same cluster (but not
sharing anything with the volume used by the apache process). And, once
again, two requests of apache reading from glusterfs volume are stuck.

Any help with this issue would be very appreciated as right now we have to
nightly-reboot the machine as the processes re stuck in iowait ->
unkillable.

I really do not want to go through the downgrade to 3.1.4 since it seems
from the mailing list that it may not go exactly smooth. We are exporting
millions of files and any large operation on the exported filesystem takes
days.

I am attaching tech info on the problem.

client:

Centos 5.6

2.6.18-238.9.1.el5

fuse-2.7.4-8.el5

glusterfs-fuse-3.2.1-1

glusterfs-core-3.2.1-1

servers:

Centos 5.6

2.6.18-194.32.1.el5

fuse-2.7.4-8.el5

glusterfs-fuse-3.2.1-1

glusterfs-core-3.2.1-1

dmesg:

INFO: task httpd:1246 blocked for more than 120 seconds.

"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

httpd         D ffff81000101d7a0     0  1246   2394          1247  1191
(NOTLB)

 ffff81013ee7dc38 0000000000000082 0000000000000092 ffff81013ee7dcd8

 ffff81013ee7dd04 000000000000000a ffff810144d0f7e0 ffff81019fc28100

 0000308f8b444727 00000000000014ee ffff810144d0f9c8 000000038006e608

Call Trace:

 [<ffffffff8006ec4e>] do_gettimeofday+0x40/0x90

 [<ffffffff80028c5a>] sync_page+0x0/0x43

 [<ffffffff800637ca>] io_schedule+0x3f/0x67

 [<ffffffff80028c98>] sync_page+0x3e/0x43

 [<ffffffff8006390e>] __wait_on_bit_lock+0x36/0x66

 [<ffffffff8003ff27>] __lock_page+0x5e/0x64

 [<ffffffff800a2921>] wake_bit_function+0x0/0x23

 [<ffffffff8003fd85>] pagevec_lookup+0x17/0x1e

 [<ffffffff800cc666>] invalidate_inode_pages2_range+0x73/0x1bd

 [<ffffffff8004fc94>] finish_wait+0x32/0x5d

 [<ffffffff884b9798>] :fuse:wait_answer_interruptible+0xb6/0xbd

 [<ffffffff800a28f3>] autoremove_wake_function+0x0/0x2e

 [<ffffffff8009a485>] recalc_sigpending+0xe/0x25

 [<ffffffff8001decc>] sigprocmask+0xb7/0xdb

 [<ffffffff884bd456>] :fuse:fuse_finish_open+0x36/0x62

 [<ffffffff884bda11>] :fuse:fuse_open_common+0x147/0x158

 [<ffffffff884bda22>] :fuse:fuse_open+0x0/0x7

 [<ffffffff8001eb99>] __dentry_open+0xd9/0x1dc

 [<ffffffff8002766e>] do_filp_open+0x2a/0x38

 [<ffffffff8001a061>] do_sys_open+0x44/0xbe

 [<ffffffff8005d28d>] tracesys+0xd5/0xe0

INFO: task httpd:1837 blocked for more than 120 seconds.

"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

httpd         D ffff810001004420     0  1837   2394          1856  1289
(NOTLB)

 ffff81013c6f9c38 0000000000000086 ffff81013c6f9bf8 00000000fffffffe

 ffff810170ce7000 000000000000000a ffff81019c0ae7a0 ffffffff80311b60

 0000308c0f83d792 0000000000000ec4 ffff81019c0ae988 000000008006e608

Call Trace:

 [<ffffffff8006ec4e>] do_gettimeofday+0x40/0x90

 [<ffffffff80028c5a>] sync_page+0x0/0x43

 [<ffffffff800637ca>] io_schedule+0x3f/0x67

 [<ffffffff80028c98>] sync_page+0x3e/0x43

 [<ffffffff8006390e>] __wait_on_bit_lock+0x36/0x66

 [<ffffffff8003ff27>] __lock_page+0x5e/0x64

 [<ffffffff800a2921>] wake_bit_function+0x0/0x23

 [<ffffffff8003fd85>] pagevec_lookup+0x17/0x1e

 [<ffffffff800cc666>] invalidate_inode_pages2_range+0x73/0x1bd

 [<ffffffff8004fc94>] finish_wait+0x32/0x5d

 [<ffffffff884b9798>] :fuse:wait_answer_interruptible+0xb6/0xbd

 [<ffffffff800a28f3>] autoremove_wake_function+0x0/0x2e

 [<ffffffff8009a485>] recalc_sigpending+0xe/0x25

 [<ffffffff8001decc>] sigprocmask+0xb7/0xdb

 [<ffffffff884bd456>] :fuse:fuse_finish_open+0x36/0x62

 [<ffffffff884bda11>] :fuse:fuse_open_common+0x147/0x158

 [<ffffffff884bda22>] :fuse:fuse_open+0x0/0x7

 [<ffffffff8001eb99>] __dentry_open+0xd9/0x1dc

 [<ffffffff8002766e>] do_filp_open+0x2a/0x38

 [<ffffffff8001a061>] do_sys_open+0x44/0xbe

 [<ffffffff8005d28d>] tracesys+0xd5/0xe0

INFO: task httpd:383 blocked for more than 120 seconds.

"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

httpd         D ffff81019fa21100     0   383   2394           534
(NOTLB)

 ffff81013e497c08 0000000000000082 ffff810183eb8910 ffffffff884b9219

 ffff81019e41c600 0000000000000009 ffff81019b1e2100 ffff81019fa21100

 0000308c0e2c2bfb 0000000000016477 ffff81019b1e22e8 000000038006e608

Call Trace:

 [<ffffffff884b9219>] :fuse:flush_bg_queue+0x2b/0x48

 [<ffffffff8006ec4e>] do_gettimeofday+0x40/0x90

 [<ffffffff8005a9f9>] getnstimeofday+0x10/0x28

 [<ffffffff80028c5a>] sync_page+0x0/0x43

 [<ffffffff800637ca>] io_schedule+0x3f/0x67

 [<ffffffff80028c98>] sync_page+0x3e/0x43

 [<ffffffff8006390e>] __wait_on_bit_lock+0x36/0x66

 [<ffffffff8003ff27>] __lock_page+0x5e/0x64

 [<ffffffff800a2921>] wake_bit_function+0x0/0x23

 [<ffffffff8000c48d>] do_generic_mapping_read+0x1df/0x359

 [<ffffffff8000d279>] file_read_actor+0x0/0x159

 [<ffffffff8000c753>] __generic_file_aio_read+0x14c/0x198

 [<ffffffff800c8c45>] generic_file_read+0xac/0xc5

 [<ffffffff800a28f3>] autoremove_wake_function+0x0/0x2e

 [<ffffffff80130778>] selinux_file_permission+0x9f/0xb4

 [<ffffffff8000b78d>] vfs_read+0xcb/0x171

 [<ffffffff80011d34>] sys_read+0x45/0x6e

 [<ffffffff8005d28d>] tracesys+0xd5/0xe0

possibly relevant log messages:

[2011-06-13 09:09:14.985576] W [socket.c:1494:__socket_proto_state_machine]
0-glusterfs: reading from socket failed. Error (Transport endpoint is not
connected), peer (81.0.225.122:24007)

[2011-06-13 09:09:25.741055] I [glusterfsd-mgmt.c:637:mgmt_getspec_cbk] 0-:
No change in volfile, continuing

[2011-06-13 09:59:00.644130] I [afr-common.c:639:afr_lookup_self_heal_check]
0-isifa1-replicate-0: size differs for /data/foto/thumbs/3140/31409780.jpg 

[2011-06-13 09:59:00.644269] I [afr-common.c:801:afr_lookup_done]
0-isifa1-replicate-0: background  meta-data data self-heal triggered. path:
/data/foto/thumbs/3140/31409780.jpg

[2011-06-13 09:59:00.822524] W [dict.c:437:dict_ref]
(-->/opt/glusterfs/3.2.1/lib64/glusterfs/3.2.1/xlator/protocol/client.so(cli
ent3_1_fstat_cbk+0x2cb) [0x2aaaab1afa0b]
(-->/opt/glusterfs/3.2.1/lib64/glusterfs/3.2.1/xlator/cluster/replicate.so(a
fr_sh_data_fstat_cbk+0x17d) [0x2aaaab3e4c9d]
(-->/opt/glusterfs/3.2.1/lib64/glusterfs/3.2.1/xlator/cluster/replicate.so(a
fr_sh_data_fix+0x1fc) [0x2aaaab3e493c]))) 0-dict: dict is NULL

[2011-06-13 09:59:00.824323] I [afr-common.c:639:afr_lookup_self_heal_check]
0-isifa1-replicate-0: size differs for /data/foto/thumbs/3140/31409781.jpg 

[2011-06-13 09:59:00.824356] I [afr-common.c:801:afr_lookup_done]
0-isifa1-replicate-0: background  meta-data data self-heal triggered. path:
/data/foto/thumbs/3140/31409781.jpg

[2011-06-13 09:59:00.826494] I
[afr-self-heal-common.c:1557:afr_self_heal_completion_cbk]
0-isifa1-replicate-0: background  meta-data data self-heal completed on
/data/foto/thumbs/3140/31409780.jpg

[2011-06-13 09:59:00.830902] W [dict.c:437:dict_ref]
(-->/opt/glusterfs/3.2.1/lib64/glusterfs/3.2.1/xlator/protocol/client.so(cli
ent3_1_fstat_cbk+0x2cb) [0x2aaaab1afa0b]
(-->/opt/glusterfs/3.2.1/lib64/glusterfs/3.2.1/xlator/cluster/replicate.so(a
fr_sh_data_fstat_cbk+0x17d) [0x2aaaab3e4c9d]
(-->/opt/glusterfs/3.2.1/lib64/glusterfs/3.2.1/xlator/cluster/replicate.so(a
fr_sh_data_fix+0x1fc) [0x2aaaab3e493c]))) 0-dict: dict is NULL

gluster-server volume config:

volume isifa1-posix

    type storage/posix

    option directory /mnt/data/isifa1

end-volume

volume isifa1-access-control

    type features/access-control

    subvolumes isifa1-posix

end-volume

volume isifa1-locks

    type features/locks

    subvolumes isifa1-access-control

end-volume

volume isifa1-io-threads

    type performance/io-threads

    subvolumes isifa1-locks

end-volume

volume isifa1-marker

    type features/marker

    option volume-uuid 39d5c509-ad39-4b24-a272-c33e212cf912

    option timestamp-file /etc/glusterd/vols/isifa1/marker.tstamp

    option xtime off

    option quota off

    subvolumes isifa1-io-threads

end-volume

volume /mnt/data/isifa1

    type debug/io-stats

    option latency-measurement off

    option count-fop-hits off

    subvolumes isifa1-marker

end-volume

volume isifa1-server

    type protocol/server

    option transport-type tcp

    option auth.addr./mnt/data/isifa1.allow
81.0.225.120,81.0.225.121,81.0.225.90,81.0.225.117,81.0.225.118,82.208.17.11
3

    subvolumes /mnt/data/isifa1

end-volume

gluster-fuse config:

volume isifa1-client-0

    type protocol/client

    option remote-host isifa-data1

    option remote-subvolume /mnt/data/isifa1

    option transport-type tcp

end-volume

volume isifa1-client-1

    type protocol/client

    option remote-host isifa-data2

    option remote-subvolume /mnt/data/isifa1

    option transport-type tcp

end-volume

volume isifa1-replicate-0

    type cluster/replicate

    subvolumes isifa1-client-0 isifa1-client-1

end-volume

volume isifa1-write-behind

    type performance/write-behind

    subvolumes isifa1-replicate-0

end-volume

volume isifa1-read-ahead

    type performance/read-ahead

    subvolumes isifa1-write-behind

end-volume

volume isifa1-io-cache

    type performance/io-cache

    subvolumes isifa1-read-ahead

end-volume

volume isifa1-quick-read

    type performance/quick-read

    subvolumes isifa1-io-cache

end-volume

volume isifa1-stat-prefetch

    type performance/stat-prefetch

    subvolumes isifa1-quick-read

end-volume

volume isifa1

    type debug/io-stats

    option latency-measurement off

    option count-fop-hits off

    subvolumes isifa1-stat-prefetch

end-volume

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://gluster.org/pipermail/gluster-users/attachments/20110613/22c59b7d/attachment-0001.htm>