Another update: setting io-cache off, stat-prefetch off and io-thread-count 64 DOES NOT SOLVE the problem. It only narrows down the conditions under which the problem occurs. I ran the same benchmark with these settings and ended up with the same result: 55 hung processes in sync_page ioctl. I am attaching a dump from this test. (A sketch of these settings and of the hung-task check is appended after the quoted thread below.)

Jiri

Begin forwarded message:

> From: Jiri Lunacek <jiri.lunacek at hosting90.cz>
> Date: 14 June 2011 15:32:22 GMT+02:00
> To: gluster-users at gluster.org
> Subject: Fwd: Apache hung tasks still occur with glusterfs 3.2.1
>
> I am back with the results.
>
> Glusterdump is attached to this e-mail.
>
> I found a way of reproducing this issue; I just ran:
>
> for i in {1..1000}; do echo "http://server.address.com/file/hosted/from/gluster.jpg"; done | xargs -L1 -P20 wget -O/dev/null
>
> i.e. many concurrent requests for a single file served by Apache.
>
> With io-cache enabled this caused over 70 apache processes to hang in sync_page ioctl.
>
> I have put the server back on the work-around parameters. I hope the dump will help you guys isolate the bug.
>
> Jiri
>
> Begin forwarded message:
>
>> From: Jiri Lunacek <jiri.lunacek at hosting90.cz>
>> Date: 14 June 2011 14:59:56 GMT+02:00
>> To: gluster-users at gluster.org
>> Subject: Re: Apache hung tasks still occur with glusterfs 3.2.1
>>
>> Hi.
>>
>>> Hello.
>>>
>>> Do you have any feedback yet?
>>> Was it successful? (disabled io-cache, disabled stat-prefetch, increased io-thread-count to 64)
>>>
>>
>> For now it seems that the work-around has worked. We have not encountered any hung processes on the server since the change (io-cache disabled, stat-prefetch disabled, io-thread-count=64).
>>
>> The only "bad" effect is the expected one: pages (mainly listings of several hundred images per page) take a little longer to load. Of course this is caused by the files no longer being cached.
>>
>>> Is/was your problem similar to this one?
>>> http://bugs.gluster.com/show_bug.cgi?id=3011
>>
>> The symptoms were the same. The processes were hung on ioctl, and /proc/<pid>/wchan for the affected PIDs showed "sync_page".
>>
>> I'll experiment a bit more today: I'll set the volume back to the original parameters and wait for a hung process so I can get you the information (/tmp/glusterdump.pid).
>>
>> I'll report back later.
>>
>> Jiri
>>
>>>
>>> On 13.06.2011 at 19:14, Jiri Lunacek wrote:
>>>> Thanks for the tip. I disabled io-cache and stat-prefetch, increased io-thread-count to 64 and
>>>> rebooted the server to clear the hung apache processes. We'll see tomorrow.
>>>>
>>>> On 13.6.2011, at 15:58, Justice London wrote:
>>>>
>>>>> Disable io-cache and up the threads to 64 and your problems should disappear. They did for me when
>>>>> I made both of these changes.
>>>>> Justice London
>>>>> *From:* gluster-users-bounces at gluster.org [mailto:gluster-users-bounces at gluster.org] *On Behalf Of* Jiri Lunacek
>>>>> *Sent:* Monday, June 13, 2011 1:49 AM
>>>>> *To:* gluster-users at gluster.org
>>>>> *Subject:* Apache hung tasks still occur with glusterfs 3.2.1
>>>>>
>>>>> Hi all.
>>>>> We have been having problems with hung apache tasks reading from a glusterfs 2-replica volume
>>>>> ever since upgrading to 3.2.0. The problems were identical to those described here:
>>>>> http://gluster.org/pipermail/gluster-users/2011-May/007697.html
>>>>> Yesterday we updated to 3.2.1.
>>>>> A good thing is that the hung tasks stopped appearing while gluster is in "intact" operation, i.e.
>>>>> when there are no modifications to the gluster configs at all.
>>>>> Today we modified some other volume exported by the same cluster (but not sharing anything with
>>>>> the volume used by the apache processes). And, once again, two apache requests reading from the
>>>>> glusterfs volume are stuck.
>>>>> Any help with this issue would be much appreciated, as right now we have to reboot the machine
>>>>> nightly because the stuck processes sit in iowait and are therefore unkillable.
>>>>> I really do not want to go through a downgrade to 3.1.4, since the mailing list suggests it may
>>>>> not go entirely smoothly. We are exporting millions of files and any large operation on the
>>>>> exported filesystem takes days.
>>>>> I am attaching tech info on the problem.
>>>>>
>>>>> client:
>>>>> Centos 5.6
>>>>> 2.6.18-238.9.1.el5
>>>>> fuse-2.7.4-8.el5
>>>>> glusterfs-fuse-3.2.1-1
>>>>> glusterfs-core-3.2.1-1
>>>>>
>>>>> servers:
>>>>> Centos 5.6
>>>>> 2.6.18-194.32.1.el5
>>>>> fuse-2.7.4-8.el5
>>>>> glusterfs-fuse-3.2.1-1
>>>>> glusterfs-core-3.2.1-1
>>>>>
>>>>> dmesg:
>>>>> INFO: task httpd:1246 blocked for more than 120 seconds.
>>>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>>>> httpd D ffff81000101d7a0 0 1246 2394 1247 1191 (NOTLB)
>>>>> ffff81013ee7dc38 0000000000000082 0000000000000092 ffff81013ee7dcd8
>>>>> ffff81013ee7dd04 000000000000000a ffff810144d0f7e0 ffff81019fc28100
>>>>> 0000308f8b444727 00000000000014ee ffff810144d0f9c8 000000038006e608
>>>>> Call Trace:
>>>>> [<ffffffff8006ec4e>] do_gettimeofday+0x40/0x90
>>>>> [<ffffffff80028c5a>] sync_page+0x0/0x43
>>>>> [<ffffffff800637ca>] io_schedule+0x3f/0x67
>>>>> [<ffffffff80028c98>] sync_page+0x3e/0x43
>>>>> [<ffffffff8006390e>] __wait_on_bit_lock+0x36/0x66
>>>>> [<ffffffff8003ff27>] __lock_page+0x5e/0x64
>>>>> [<ffffffff800a2921>] wake_bit_function+0x0/0x23
>>>>> [<ffffffff8003fd85>] pagevec_lookup+0x17/0x1e
>>>>> [<ffffffff800cc666>] invalidate_inode_pages2_range+0x73/0x1bd
>>>>> [<ffffffff8004fc94>] finish_wait+0x32/0x5d
>>>>> [<ffffffff884b9798>] :fuse:wait_answer_interruptible+0xb6/0xbd
>>>>> [<ffffffff800a28f3>] autoremove_wake_function+0x0/0x2e
>>>>> [<ffffffff8009a485>] recalc_sigpending+0xe/0x25
>>>>> [<ffffffff8001decc>] sigprocmask+0xb7/0xdb
>>>>> [<ffffffff884bd456>] :fuse:fuse_finish_open+0x36/0x62
>>>>> [<ffffffff884bda11>] :fuse:fuse_open_common+0x147/0x158
>>>>> [<ffffffff884bda22>] :fuse:fuse_open+0x0/0x7
>>>>> [<ffffffff8001eb99>] __dentry_open+0xd9/0x1dc
>>>>> [<ffffffff8002766e>] do_filp_open+0x2a/0x38
>>>>> [<ffffffff8001a061>] do_sys_open+0x44/0xbe
>>>>> [<ffffffff8005d28d>] tracesys+0xd5/0xe0
>>>>> INFO: task httpd:1837 blocked for more than 120 seconds.
>>>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>>>> httpd D ffff810001004420 0 1837 2394 1856 1289 (NOTLB)
>>>>> ffff81013c6f9c38 0000000000000086 ffff81013c6f9bf8 00000000fffffffe
>>>>> ffff810170ce7000 000000000000000a ffff81019c0ae7a0 ffffffff80311b60
>>>>> 0000308c0f83d792 0000000000000ec4 ffff81019c0ae988 000000008006e608
>>>>> Call Trace:
>>>>> [<ffffffff8006ec4e>] do_gettimeofday+0x40/0x90
>>>>> [<ffffffff80028c5a>] sync_page+0x0/0x43
>>>>> [<ffffffff800637ca>] io_schedule+0x3f/0x67
>>>>> [<ffffffff80028c98>] sync_page+0x3e/0x43
>>>>> [<ffffffff8006390e>] __wait_on_bit_lock+0x36/0x66
>>>>> [<ffffffff8003ff27>] __lock_page+0x5e/0x64
>>>>> [<ffffffff800a2921>] wake_bit_function+0x0/0x23
>>>>> [<ffffffff8003fd85>] pagevec_lookup+0x17/0x1e
>>>>> [<ffffffff800cc666>] invalidate_inode_pages2_range+0x73/0x1bd
>>>>> [<ffffffff8004fc94>] finish_wait+0x32/0x5d
>>>>> [<ffffffff884b9798>] :fuse:wait_answer_interruptible+0xb6/0xbd
>>>>> [<ffffffff800a28f3>] autoremove_wake_function+0x0/0x2e
>>>>> [<ffffffff8009a485>] recalc_sigpending+0xe/0x25
>>>>> [<ffffffff8001decc>] sigprocmask+0xb7/0xdb
>>>>> [<ffffffff884bd456>] :fuse:fuse_finish_open+0x36/0x62
>>>>> [<ffffffff884bda11>] :fuse:fuse_open_common+0x147/0x158
>>>>> [<ffffffff884bda22>] :fuse:fuse_open+0x0/0x7
>>>>> [<ffffffff8001eb99>] __dentry_open+0xd9/0x1dc
>>>>> [<ffffffff8002766e>] do_filp_open+0x2a/0x38
>>>>> [<ffffffff8001a061>] do_sys_open+0x44/0xbe
>>>>> [<ffffffff8005d28d>] tracesys+0xd5/0xe0
>>>>> INFO: task httpd:383 blocked for more than 120 seconds.
>>>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>>>> httpd D ffff81019fa21100 0 383 2394 534 (NOTLB)
>>>>> ffff81013e497c08 0000000000000082 ffff810183eb8910 ffffffff884b9219
>>>>> ffff81019e41c600 0000000000000009 ffff81019b1e2100 ffff81019fa21100
>>>>> 0000308c0e2c2bfb 0000000000016477 ffff81019b1e22e8 000000038006e608
>>>>> Call Trace:
>>>>> [<ffffffff884b9219>] :fuse:flush_bg_queue+0x2b/0x48
>>>>> [<ffffffff8006ec4e>] do_gettimeofday+0x40/0x90
>>>>> [<ffffffff8005a9f9>] getnstimeofday+0x10/0x28
>>>>> [<ffffffff80028c5a>] sync_page+0x0/0x43
>>>>> [<ffffffff800637ca>] io_schedule+0x3f/0x67
>>>>> [<ffffffff80028c98>] sync_page+0x3e/0x43
>>>>> [<ffffffff8006390e>] __wait_on_bit_lock+0x36/0x66
>>>>> [<ffffffff8003ff27>] __lock_page+0x5e/0x64
>>>>> [<ffffffff800a2921>] wake_bit_function+0x0/0x23
>>>>> [<ffffffff8000c48d>] do_generic_mapping_read+0x1df/0x359
>>>>> [<ffffffff8000d279>] file_read_actor+0x0/0x159
>>>>> [<ffffffff8000c753>] __generic_file_aio_read+0x14c/0x198
>>>>> [<ffffffff800c8c45>] generic_file_read+0xac/0xc5
>>>>> [<ffffffff800a28f3>] autoremove_wake_function+0x0/0x2e
>>>>> [<ffffffff80130778>] selinux_file_permission+0x9f/0xb4
>>>>> [<ffffffff8000b78d>] vfs_read+0xcb/0x171
>>>>> [<ffffffff80011d34>] sys_read+0x45/0x6e
>>>>> [<ffffffff8005d28d>] tracesys+0xd5/0xe0
>>>>>
>>>>> possibly relevant log messages:
>>>>> [2011-06-13 09:09:14.985576] W [socket.c:1494:__socket_proto_state_machine] 0-glusterfs: reading
>>>>> from socket failed. Error (Transport endpoint is not connected), peer (81.0.225.122:24007)
>>>>> [2011-06-13 09:09:25.741055] I [glusterfsd-mgmt.c:637:mgmt_getspec_cbk] 0-: No change in volfile,
>>>>> continuing
>>>>> [2011-06-13 09:59:00.644130] I [afr-common.c:639:afr_lookup_self_heal_check] 0-isifa1-replicate-0:
>>>>> size differs for /data/foto/thumbs/3140/31409780.jpg
>>>>> [2011-06-13 09:59:00.644269] I [afr-common.c:801:afr_lookup_done] 0-isifa1-replicate-0: background
>>>>> meta-data data self-heal triggered.
>>>>> path: /data/foto/thumbs/3140/31409780.jpg
>>>>> [2011-06-13 09:59:00.822524] W [dict.c:437:dict_ref]
>>>>> (-->/opt/glusterfs/3.2.1/lib64/glusterfs/3.2.1/xlator/protocol/client.so(client3_1_fstat_cbk+0x2cb) [0x2aaaab1afa0b]
>>>>> (-->/opt/glusterfs/3.2.1/lib64/glusterfs/3.2.1/xlator/cluster/replicate.so(afr_sh_data_fstat_cbk+0x17d) [0x2aaaab3e4c9d]
>>>>> (-->/opt/glusterfs/3.2.1/lib64/glusterfs/3.2.1/xlator/cluster/replicate.so(afr_sh_data_fix+0x1fc) [0x2aaaab3e493c]))) 0-dict: dict is NULL
>>>>> [2011-06-13 09:59:00.824323] I [afr-common.c:639:afr_lookup_self_heal_check] 0-isifa1-replicate-0:
>>>>> size differs for /data/foto/thumbs/3140/31409781.jpg
>>>>> [2011-06-13 09:59:00.824356] I [afr-common.c:801:afr_lookup_done] 0-isifa1-replicate-0: background
>>>>> meta-data data self-heal triggered. path: /data/foto/thumbs/3140/31409781.jpg
>>>>> [2011-06-13 09:59:00.826494] I [afr-self-heal-common.c:1557:afr_self_heal_completion_cbk]
>>>>> 0-isifa1-replicate-0: background meta-data data self-heal completed on
>>>>> /data/foto/thumbs/3140/31409780.jpg
>>>>> [2011-06-13 09:59:00.830902] W [dict.c:437:dict_ref]
>>>>> (-->/opt/glusterfs/3.2.1/lib64/glusterfs/3.2.1/xlator/protocol/client.so(client3_1_fstat_cbk+0x2cb) [0x2aaaab1afa0b]
>>>>> (-->/opt/glusterfs/3.2.1/lib64/glusterfs/3.2.1/xlator/cluster/replicate.so(afr_sh_data_fstat_cbk+0x17d) [0x2aaaab3e4c9d]
>>>>> (-->/opt/glusterfs/3.2.1/lib64/glusterfs/3.2.1/xlator/cluster/replicate.so(afr_sh_data_fix+0x1fc) [0x2aaaab3e493c]))) 0-dict: dict is NULL
>>>>>
>>>>> gluster-server volume config:
>>>>> volume isifa1-posix
>>>>>     type storage/posix
>>>>>     option directory /mnt/data/isifa1
>>>>> end-volume
>>>>> volume isifa1-access-control
>>>>>     type features/access-control
>>>>>     subvolumes isifa1-posix
>>>>> end-volume
>>>>> volume isifa1-locks
>>>>>     type features/locks
>>>>>     subvolumes isifa1-access-control
>>>>> end-volume
>>>>> volume isifa1-io-threads
>>>>>     type performance/io-threads
>>>>>     subvolumes isifa1-locks
>>>>> end-volume
>>>>> volume isifa1-marker
>>>>>     type features/marker
>>>>>     option volume-uuid 39d5c509-ad39-4b24-a272-c33e212cf912
>>>>>     option timestamp-file /etc/glusterd/vols/isifa1/marker.tstamp
>>>>>     option xtime off
>>>>>     option quota off
>>>>>     subvolumes isifa1-io-threads
>>>>> end-volume
>>>>> volume /mnt/data/isifa1
>>>>>     type debug/io-stats
>>>>>     option latency-measurement off
>>>>>     option count-fop-hits off
>>>>>     subvolumes isifa1-marker
>>>>> end-volume
>>>>> volume isifa1-server
>>>>>     type protocol/server
>>>>>     option transport-type tcp
>>>>>     option auth.addr./mnt/data/isifa1.allow 81.0.225.120,81.0.225.121,81.0.225.90,81.0.225.117,81.0.225.118,82.208.17.113
>>>>>     subvolumes /mnt/data/isifa1
>>>>> end-volume
>>>>>
>>>>> gluster-fuse config:
>>>>> volume isifa1-client-0
>>>>>     type protocol/client
>>>>>     option remote-host isifa-data1
>>>>>     option remote-subvolume /mnt/data/isifa1
>>>>>     option transport-type tcp
>>>>> end-volume
>>>>> volume isifa1-client-1
>>>>>     type protocol/client
>>>>>     option remote-host isifa-data2
>>>>>     option remote-subvolume /mnt/data/isifa1
>>>>>     option transport-type tcp
>>>>> end-volume
>>>>> volume isifa1-replicate-0
>>>>>     type cluster/replicate
>>>>>     subvolumes isifa1-client-0 isifa1-client-1
>>>>> end-volume
>>>>> volume isifa1-write-behind
>>>>>     type performance/write-behind
>>>>>     subvolumes isifa1-replicate-0
>>>>> end-volume
>>>>> volume isifa1-read-ahead
>>>>>     type performance/read-ahead
>>>>>     subvolumes isifa1-write-behind
>>>>> end-volume
>>>>> volume isifa1-io-cache
>>>>>     type performance/io-cache
>>>>>     subvolumes isifa1-read-ahead
>>>>> end-volume
>>>>> volume isifa1-quick-read
>>>>>     type performance/quick-read
>>>>>     subvolumes isifa1-io-cache
>>>>> end-volume
>>>>> volume isifa1-stat-prefetch
>>>>>     type performance/stat-prefetch
>>>>>     subvolumes isifa1-quick-read
>>>>> end-volume
>>>>> volume isifa1
>>>>>     type debug/io-stats
>>>>>     option latency-measurement off
>>>>>     option count-fop-hits off
>>>>>     subvolumes isifa1-stat-prefetch
>>>>> end-volume
>>>>
>>>
>>> --
>>>
>>> Mag. Christopher Anderlik
>>> Head of Technology (Leiter Technik)
>>>
>>> ________________________________________________________________________________
>>>
>>> Xidras GmbH
>>> Stockern 47
>>> 3744 Stockern
>>> Austria
>>>
>>> Tel: 0043 2983 201 30 5 01
>>> Fax: 0043 2983 201 30 5 01 9
>>> Email: christopher.anderlik at xidras.com
>>> Web: http://www.xidras.com
>>>
>>> FN 317036 f | Landesgericht Krems | ATU64485024
>>>
>>> ________________________________________________________________________________
>>>
>>> CONFIDENTIAL!
>>> This email contains confidential information and is intended for the authorised
>>> recipient only. If you are not an authorised recipient, please return the email
>>> to us and then delete it from your computer and mail-server. You may neither
>>> use nor edit any such emails including attachments, nor make them accessible
>>> to third parties in any manner whatsoever.
>>> Thank you for your cooperation.
>>>
>>> ________________________________________________________________________________
>>
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users

-------------- next part --------------
A non-text attachment was scrubbed...
Name: glusterdump.2155.gz
Type: application/x-gzip
Size: 145947 bytes
Desc: not available
URL: <http://gluster.org/pipermail/gluster-users/attachments/20110614/51fca046/attachment-0002.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: glusterdump.2155.gz
Type: application/x-gzip
Size: 611754 bytes
Desc: not available
URL: <http://gluster.org/pipermail/gluster-users/attachments/20110614/51fca046/attachment-0003.bin>
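
[Editor's note] For reference, a minimal sketch of how the work-around described at the top of this message (io-cache off, stat-prefetch off, io-thread-count 64) would typically be applied with the gluster CLI. The volume name isifa1 is taken from the volfiles quoted above; the option keys are the usual 3.x-era names and are an assumption, not confirmed by the thread, so treat this as illustrative rather than the exact commands that were run:

    # Work-around sketch (assumed volume name: isifa1; option keys assumed,
    # check "gluster volume set help" on your release)
    gluster volume set isifa1 performance.io-cache off
    gluster volume set isifa1 performance.stat-prefetch off
    gluster volume set isifa1 performance.io-thread-count 64

    # Confirm the options took effect
    gluster volume info isifa1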
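[Editor's note] Likewise, a sketch of the reproduction and of the hung-task check described in the thread. The URL is the placeholder from the original test, and the worker process name httpd matches the dmesg output above; both are assumptions about the actual setup:

    #!/bin/bash
    # Hammer one file served from the GlusterFS mount: 1000 requests total,
    # 20 in parallel (the reproduction used in the thread).
    for i in {1..1000}; do
        echo "http://server.address.com/file/hosted/from/gluster.jpg"
    done | xargs -L1 -P20 wget -O /dev/null

    # Afterwards, count Apache workers whose kernel wait channel is sync_page,
    # i.e. the hung-task signature reported in this thread.
    for pid in $(pgrep httpd); do
        if [ "$(cat /proc/$pid/wchan 2>/dev/null)" = "sync_page" ]; then
            echo "httpd pid $pid stuck in sync_page"
        fi
    done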