Re: glusterfs v1.3.8 client segfaulting in io-cache

Mostly, try giving the same block-size (in stripe) and page-size (in
io-cache), just to check whether that is the trigger. For now, you can fall
back to read-ahead.
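
For example, something like this against your config quoted below. This is
just a sketch, and I am writing the read-ahead option names from memory, so
please verify them against the 1.3.x translator docs:

# io-cache with its page-size matched to stripe's 1MB block-size
# (keep your other io-cache options as they are):
volume ioc
 type performance/io-cache
 option cache-size 1200MB
 option page-size 1MB        # was 128KB; now matches the stripe block-size
 subvolumes stripe0
end-volume

# or, as a temporary fallback, replace io-cache with read-ahead:
volume ra
 type performance/read-ahead
 option page-size 1MB        # assumed option name, as in io-cache
 option page-count 2         # assumed option name; pages to read ahead
 subvolumes stripe0
end-volume

If you try the fallback, point the fuse volume's subvolumes line at ra
instead of ioc.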

Regards,
Amar

On Tue, May 6, 2008 at 12:38 PM, Dan Parsons <dparsons@xxxxxxxx> wrote:

> Ah, so it's not something I'm doing wrong? Do you think changing
> cache-size back to 32MB will prevent the problem from happening?
>
> Perhaps I should switch to read-ahead until a fix is available?
>
>
> Dan Parsons
>
>
>
> On May 6, 2008, at 12:37 PM, Amar S. Tumballi wrote:
>
> > Thanks for the bug report. We will get back to you about it in another
> > 2-3 days, most likely with a fix :)
> >
> > Regards,
> > Amar
> >
> > On Tue, May 6, 2008 at 10:14 AM, Dan Parsons <dparsons@xxxxxxxx> wrote:
> > Oh, one more useful bit of information: I see lines like the one below
> > a lot in glusterfs log files. What do they mean?
> >
> > 2008-05-05 21:20:11 W [fuse-bridge.c:402:fuse_entry_cbk] glusterfs-fuse:
> > 18054459: (34)
> > /bio/data/fast-hmmsearch-all/tmpDCex3b_fast-hmmsearch-all_job/result.tigrfam.TIGR02736.hmmhits
> > => 610503040 Rehashing because st_nlink less than dentry maps
> >
> > Dan Parsons
> >
> >
> >
> > On May 6, 2008, at 10:13 AM, Dan Parsons wrote:
> >
> > I'm experiencing a glusterfs client crash (signal 11) in the io-cache
> > xlator. This is on our bioinformatics cluster; the crash happened on 2
> > out of 33 machines. I've verified the hardware stability of the
> > machines.
> >
> > Running v1.3.8, built May 5th, 2008, from the latest downloadable
> > version.
> >
> > Here is the crash message:
> >
> > [0xffffe420]
> >
> > /usr/local/lib/glusterfs/1.3.8/xlator/performance/io-cache.so(ioc_page_wakeup+0x67)[0xb76c5f67]
> >
> > /usr/local/lib/glusterfs/1.3.8/xlator/performance/io-cache.so(ioc_inode_wakeup+0xb2)[0xb76c6902]
> >
> > /usr/local/lib/glusterfs/1.3.8/xlator/performance/io-cache.so(ioc_cache_validate_cbk+0xae)[0xb76c1e5e]
> >
> > /usr/local/lib/glusterfs/1.3.8/xlator/cluster/stripe.so(stripe_stack_unwind_buf_cbk+0x98)[0xb76cd038]
> >
> > /usr/local/lib/glusterfs/1.3.8/xlator/protocol/client.so(client_fstat_cbk+0xcc)[0xb76dd13c]
> >
> > /usr/local/lib/glusterfs/1.3.8/xlator/protocol/client.so(notify+0xa97)[0xb76db117]
> > /usr/local/lib/libglusterfs.so.0(transport_notify+0x38)[0xb7efe978]
> > /usr/local/lib/libglusterfs.so.0(sys_epoll_iteration+0xd6)[0xb7eff906]
> > /usr/local/lib/libglusterfs.so.0(poll_iteration+0x98)[0xb7efeb28]
> > [glusterfs](main+0x85e)[0x804a14e]
> > /lib/libc.so.6(__libc_start_main+0xdc)[0x7b1dec]
> > [glusterfs][0x8049391]
> >
> > And here is my config file. The only thing I can think of is that maybe
> > my cache-size is too big. I want a lot of cache: we have big files, and
> > the boxes have the RAM. Anyway, the config is below. If you see any
> > problems with it, please let me know. There are no errors on the
> > glusterfsd servers, except for an EOF from the machines where the
> > glusterfs client segfaulted.
> >
> > volume fuse
> >  type mount/fuse
> >  option direct-io-mode 1
> >  option entry-timeout 1
> >  option attr-timeout 1
> >  option mount-point /glusterfs
> >  subvolumes ioc
> > end-volume
> >
> > volume ioc
> >  type performance/io-cache
> >  option priority *.psiblast:3,*.seq:2,*:1
> >  option force-revalidate-timeout 5
> >  option cache-size 1200MB
> >  option page-size 128KB
> >  subvolumes stripe0
> > end-volume
> >
> > volume stripe0
> >  type cluster/stripe
> >  option alu.disk-usage.exit-threshold 100MB
> >  option alu.disk-usage.entry-threshold 2GB
> >  option alu.write-usage.exit-threshold 4%
> >  option alu.write-usage.entry-threshold 20%
> >  option alu.read-usage.exit-threshold 4%
> >  option alu.read-usage.entry-threshold 20%
> >  option alu.order read-usage:write-usage:disk-usage
> >  option scheduler alu
> >  option block-size *:1MB
> >  subvolumes distfs01 distfs02 distfs03 distfs04
> > end-volume
> >
> > volume distfs04
> >  type protocol/client
> >  option remote-subvolume brick
> >  option remote-host 10.8.101.54
> >  option transport-type tcp/client
> > end-volume
> >
> > volume distfs03
> >  type protocol/client
> >  option remote-subvolume brick
> >  option remote-host 10.8.101.53
> >  option transport-type tcp/client
> > end-volume
> >
> > volume distfs02
> >  type protocol/client
> >  option remote-subvolume brick
> >  option remote-host 10.8.101.52
> >  option transport-type tcp/client
> > end-volume
> >
> > volume distfs01
> >  type protocol/client
> >  option remote-subvolume brick
> >  option remote-host 10.8.101.51
> >  option transport-type tcp/client
> > end-volume
> >
> >
> > Dan Parsons
> >
> >
> >
> >
>
>
>


-- 
Amar Tumballi
Gluster/GlusterFS Hacker
[bulde on #gluster/irc.gnu.org]
http://www.zresearch.com - Commoditizing Super Storage!

