Re: glusterfs v1.3.8 client segfaulting in io-cache

Amar, quick question. I've switched to read-ahead, but I really wish I could use io-cache. How likely do you think it is that changing the io-cache page-size from 128KB to 1MB (the same as the stripe block-size, based on your advice) would fix the crash?
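
For reference, here's roughly the change I have in mind, as an untested sketch (same ioc volume as in the config I posted earlier in this thread, with only the page-size line changed):

volume ioc
type performance/io-cache
option priority *.psiblast:3,*.seq:2,*:1
option force-revalidate-timeout 5
option cache-size 1200MB
# raised from 128KB to match stripe's 1MB block-size
option page-size 1MB
subvolumes stripe0
end-volume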


Dan Parsons


On May 6, 2008, at 12:43 PM, Amar S. Tumballi wrote:

Try giving the same block-size (and page-size) in both stripe and io-cache, just to check. But for now you can fall back to read-ahead.
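
A minimal read-ahead stand-in might look like the sketch below (the option values are guesses to tune, not tested recommendations; the fuse volume's subvolumes line would then point at readahead instead of ioc):

volume readahead
type performance/read-ahead
# page-size chosen to match the stripe block-size; page-count
# (pages pre-fetched per file) is a guess, tune both as needed
option page-size 1MB
option page-count 4
subvolumes stripe0
end-volume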

Regards,
Amar

On Tue, May 6, 2008 at 12:38 PM, Dan Parsons <dparsons@xxxxxxxx> wrote:

Ah, so it's not something I'm doing wrong? Do you think changing
cache-size back to 32MB will prevent the problem from happening?
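
For clarity, the change I mean is a single line in the ioc volume from my config, shrinking the cache from its current 1200MB:

# in volume ioc, replacing the current "option cache-size 1200MB"
option cache-size 32MB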

Perhaps I should switch to read-ahead until there's a fix?


Dan Parsons



On May 6, 2008, at 12:37 PM, Amar S. Tumballi wrote:

Thanks for the bug report. We will get back to you about it in another 2-3 days, most likely with a fix :)

Regards,
Amar

On Tue, May 6, 2008 at 10:14 AM, Dan Parsons <dparsons@xxxxxxxx> wrote:

Oh, one more useful bit of information: I see lines like the one below a lot in the glusterfs log files. What do they mean?

2008-05-05 21:20:11 W [fuse-bridge.c:402:fuse_entry_cbk] glusterfs-fuse: 18054459: (34) /bio/data/fast-hmmsearch-all/tmpDCex3b_fast-hmmsearch-all_job/result.tigrfam.TIGR02736.hmmhits => 610503040 Rehashing because st_nlink less than dentry maps

Dan Parsons



On May 6, 2008, at 10:13 AM, Dan Parsons wrote:

I'm experiencing a glusterfs client crash, signal 11, in the io-cache xlator. This is on our bioinformatics cluster; the crash happened on 2 out of 33 machines. I've verified the hardware stability of the machines.

Running v1.3.8, built May 5th, 2008 from the latest downloadable version.

Here is the crash message:

[0xffffe420]
/usr/local/lib/glusterfs/1.3.8/xlator/performance/io-cache.so(ioc_page_wakeup+0x67)[0xb76c5f67]
/usr/local/lib/glusterfs/1.3.8/xlator/performance/io-cache.so(ioc_inode_wakeup+0xb2)[0xb76c6902]
/usr/local/lib/glusterfs/1.3.8/xlator/performance/io-cache.so(ioc_cache_validate_cbk+0xae)[0xb76c1e5e]
/usr/local/lib/glusterfs/1.3.8/xlator/cluster/stripe.so(stripe_stack_unwind_buf_cbk+0x98)[0xb76cd038]
/usr/local/lib/glusterfs/1.3.8/xlator/protocol/client.so(client_fstat_cbk+0xcc)[0xb76dd13c]
/usr/local/lib/glusterfs/1.3.8/xlator/protocol/client.so(notify+0xa97)[0xb76db117]
/usr/local/lib/libglusterfs.so.0(transport_notify+0x38)[0xb7efe978]
/usr/local/lib/libglusterfs.so.0(sys_epoll_iteration+0xd6)[0xb7eff906]
/usr/local/lib/libglusterfs.so.0(poll_iteration+0x98)[0xb7efeb28]
[glusterfs](main+0x85e)[0x804a14e]
/lib/libc.so.6(__libc_start_main+0xdc)[0x7b1dec]
[glusterfs][0x8049391]

And here is my config file. The only thing I can think of is that maybe my cache-size is too big; I want a lot of cache, since we have big files and the boxes have the RAM. Anyway, the config is below. If you see any problems with it, please let me know. There are no errors on the glusterfsd servers, except for an EOF from the machines where the glusterfs client segfaulted.

volume fuse
type mount/fuse
option direct-io-mode 1
option entry-timeout 1
option attr-timeout 1
option mount-point /glusterfs
subvolumes ioc
end-volume

volume ioc
type performance/io-cache
option priority *.psiblast:3,*.seq:2,*:1
option force-revalidate-timeout 5
option cache-size 1200MB
option page-size 128KB
subvolumes stripe0
end-volume

volume stripe0
type cluster/stripe
option alu.disk-usage.exit-threshold 100MB
option alu.disk-usage.entry-threshold 2GB
option alu.write-usage.exit-threshold 4%
option alu.write-usage.entry-threshold 20%
option alu.read-usage.exit-threshold 4%
option alu.read-usage.entry-threshold 20%
option alu.order read-usage:write-usage:disk-usage
option scheduler alu
option block-size *:1MB
subvolumes distfs01 distfs02 distfs03 distfs04
end-volume

volume distfs04
type protocol/client
option remote-subvolume brick
option remote-host 10.8.101.54
option transport-type tcp/client
end-volume

volume distfs03
type protocol/client
option remote-subvolume brick
option remote-host 10.8.101.53
option transport-type tcp/client
end-volume

volume distfs02
type protocol/client
option remote-subvolume brick
option remote-host 10.8.101.52
option transport-type tcp/client
end-volume

volume distfs01
type protocol/client
option remote-subvolume brick
option remote-host 10.8.101.51
option transport-type tcp/client
end-volume


Dan Parsons











--
Amar Tumballi
Gluster/GlusterFS Hacker
[bulde on #gluster/irc.gnu.org]
http://www.zresearch.com - Commoditizing Super Storage!




