Amar, any update on this issue?
Dan Parsons
On May 6, 2008, at 11:21 PM, Dan Parsons wrote:
Amar, quick question. I've switched to read-ahead but really wish I could use io-cache. How likely do you think it is that changing block-size from 128KB to 1MB (the same as what stripe uses, based on your advice) would fix the crash issue?
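For reference, the interim read-ahead volume I swapped in for io-cache looks roughly like this (a sketch rather than my exact file; the page-count value is just a guess at a sane default):

volume ra
  type performance/read-ahead
  # size of each read-ahead page
  option page-size 1MB
  # number of pages to read ahead; guessed value, tune as needed
  option page-count 4
  subvolumes stripe0
end-volume

with the fuse volume's subvolumes line pointed at ra instead of ioc.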
Dan Parsons
On May 6, 2008, at 12:43 PM, Amar S. Tumballi wrote:
Try giving the same block-size (in stripe) and page-size (in io-cache), just to check whether that avoids the crash. For now, you can fall back to read-ahead.
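For example, something like this, keeping 1MB on both sides (a sketch based on your posted config; only the two options shown change, the rest stays as-is):

volume ioc
  type performance/io-cache
  # match the io-cache page-size to the stripe block-size
  option page-size 1MB
  ...
end-volume

volume stripe0
  type cluster/stripe
  # already 1MB in your config
  option block-size *:1MB
  ...
end-volume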
Regards,
Amar
On Tue, May 6, 2008 at 12:38 PM, Dan Parsons <dparsons@xxxxxxxx>
wrote:
Ah, so it's not something I'm doing wrong? Do you think changing cache-size back to 32MB will prevent the problem from happening? Perhaps I should switch to read-ahead until there's a fix?
Dan Parsons
On May 6, 2008, at 12:37 PM, Amar S. Tumballi wrote:
Thanks for the bug report. We will get back to you about it in another 2-3 days, most likely with a fix :)
Regards,
Amar
On Tue, May 6, 2008 at 10:14 AM, Dan Parsons <dparsons@xxxxxxxx>
wrote:
Oh, one more useful bit of information: I see lines like the one below a lot in the glusterfs log files. What do they mean?
2008-05-05 21:20:11 W [fuse-bridge.c:402:fuse_entry_cbk] glusterfs-fuse: 18054459: (34) /bio/data/fast-hmmsearch-all/tmpDCex3b_fast-hmmsearch-all_job/result.tigrfam.TIGR02736.hmmhits => 610503040 Rehashing because st_nlink less than dentry maps
Dan Parsons
On May 6, 2008, at 10:13 AM, Dan Parsons wrote:
I'm experiencing a glusterfs client crash (signal 11) under the io-cache xlator. This is on our bioinformatics cluster; the crash happened on 2 out of 33 machines, and I've verified the hardware stability of the machines. Running v1.3.8, built May 5th, 2008 from the latest downloadable version.
Here is the crash message:
[0xffffe420]
/usr/local/lib/glusterfs/1.3.8/xlator/performance/io-cache.so(ioc_page_wakeup+0x67)[0xb76c5f67]
/usr/local/lib/glusterfs/1.3.8/xlator/performance/io-cache.so(ioc_inode_wakeup+0xb2)[0xb76c6902]
/usr/local/lib/glusterfs/1.3.8/xlator/performance/io-cache.so(ioc_cache_validate_cbk+0xae)[0xb76c1e5e]
/usr/local/lib/glusterfs/1.3.8/xlator/cluster/stripe.so(stripe_stack_unwind_buf_cbk+0x98)[0xb76cd038]
/usr/local/lib/glusterfs/1.3.8/xlator/protocol/client.so(client_fstat_cbk+0xcc)[0xb76dd13c]
/usr/local/lib/glusterfs/1.3.8/xlator/protocol/client.so(notify+0xa97)[0xb76db117]
/usr/local/lib/libglusterfs.so.0(transport_notify+0x38)[0xb7efe978]
/usr/local/lib/libglusterfs.so.0(sys_epoll_iteration+0xd6)[0xb7eff906]
/usr/local/lib/libglusterfs.so.0(poll_iteration+0x98)[0xb7efeb28]
[glusterfs](main+0x85e)[0x804a14e]
/lib/libc.so.6(__libc_start_main+0xdc)[0x7b1dec]
[glusterfs][0x8049391]
And here is my config file. The only thing I can think of is that maybe my cache-size is too big; I want a lot of cache, we have big files, and the boxes have the RAM. Anyway, the config is below. If you see any problems with it, please let me know. There are no errors on the glusterfsd servers, except for an EOF from the machines where the glusterfs client segfaulted.
volume fuse
  type mount/fuse
  option direct-io-mode 1
  option entry-timeout 1
  option attr-timeout 1
  option mount-point /glusterfs
  subvolumes ioc
end-volume

volume ioc
  type performance/io-cache
  option priority *.psiblast:3,*.seq:2,*:1
  option force-revalidate-timeout 5
  option cache-size 1200MB
  option page-size 128KB
  subvolumes stripe0
end-volume

volume stripe0
  type cluster/stripe
  option alu.disk-usage.exit-threshold 100MB
  option alu.disk-usage.entry-threshold 2GB
  option alu.write-usage.exit-threshold 4%
  option alu.write-usage.entry-threshold 20%
  option alu.read-usage.exit-threshold 4%
  option alu.read-usage.entry-threshold 20%
  option alu.order read-usage:write-usage:disk-usage
  option scheduler alu
  option block-size *:1MB
  subvolumes distfs01 distfs02 distfs03 distfs04
end-volume

volume distfs04
  type protocol/client
  option remote-subvolume brick
  option remote-host 10.8.101.54
  option transport-type tcp/client
end-volume

volume distfs03
  type protocol/client
  option remote-subvolume brick
  option remote-host 10.8.101.53
  option transport-type tcp/client
end-volume

volume distfs02
  type protocol/client
  option remote-subvolume brick
  option remote-host 10.8.101.52
  option transport-type tcp/client
end-volume

volume distfs01
  type protocol/client
  option remote-subvolume brick
  option remote-host 10.8.101.51
  option transport-type tcp/client
end-volume
Dan Parsons
--
Amar Tumballi
Gluster/GlusterFS Hacker
[bulde on #gluster/irc.gnu.org]
http://www.zresearch.com - Commoditizing Super Storage!
_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxx
http://lists.nongnu.org/mailman/listinfo/gluster-devel