Re: Segfault in read_ahead in 1.3.0_pre4

Is your find test definitely without stat-prefetch? Not finding a file or directory that was clearly present was one type of glitch I encountered in my tests, and it was occurring only with stat-prefetch.

Thanks,

Brent

On Thu, 24 May 2007, Harris Landgarten wrote:

Brent,

I have now switched all clients and servers to 2.4-Mainline. The read-ahead crash is fixed. The spurious disconnect errors remain, but I suspect they may be tied to large reads. I cannot get through a large mailbox reindex without a disconnect in the middle; the disconnect is now recovered from, but it leaves 300-1000 unindexed files. I turned off stat-prefetch because my application didn't need it, but I didn't notice any errors with it on.

One thing I am noticing. When I run:

 find /mnt/gluster/folder -type f -exec md5sum {} \;

on a tree with over 50,000 files, it does not run to completion. After about 30,000 files or so it fails with "cannot find file or folder" errors, which do not seem to be tied to any errors in the logs. find without the -exec, and du, both run correctly on the same tree. Do you have a way of duplicating this test?
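
One way I might try to narrow it down (just a sketch; the sed pattern is a guess at the exact md5sum error wording) is to log the failing paths and immediately stat them again, to see whether the entries are really gone or the lookups are only failing transiently:

 find /mnt/gluster/folder -type f -exec md5sum {} \; >md5.out 2>md5.err
 # pull the failing paths out of md5sum's error messages and retry a plain stat
 sed -n 's/^md5sum: \(.*\): No such file or directory$/\1/p' md5.err |
 while read f; do
     stat "$f" >/dev/null 2>&1 || echo "still missing: $f"
 done

If the retried stats succeed, that would point to transient lookup failures rather than files actually disappearing.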

Harris

----- Original Message -----
From: "Brent A Nelson" <brent@xxxxxxxxxxxx>
To: "Anand Avati" <avati@xxxxxxxxxxxxx>
Cc: "Harris Landgarten" <harrisl@xxxxxxxxxxxxx>, gluster-devel@xxxxxxxxxx
Sent: Thursday, May 24, 2007 11:30:16 AM (GMT-0500) America/New_York
Subject: Re: Segfault in read_ahead in 1.3.0_pre4

I was getting the same behavior in my testing.  After I reported it, the
readahead crash was quickly patched, but the random disconnect is still
very much a mystery...

I noticed that you are using stat-prefetch; have you encountered any
issues? I was finding that du's on complex directories could return
abnormal results and/or errors, so it seemed that heavy metadata queries
were occasionally glitchy.  Without stat-prefetch, it's been fine.  If
you've been having good luck with it, maybe I should try again.
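
If I do try again, a simple check (just a sketch) would be to run the same metadata-heavy commands with the statprefetch volume in the client spec and then with it removed (remounting in between), and compare the output:

 # with stat-prefetch loaded
 du -s /mnt/gluster/folder
 ls -lR /mnt/gluster/folder | md5sum
 # after removing the statprefetch volume from the spec and remounting
 du -s /mnt/gluster/folder
 ls -lR /mnt/gluster/folder | md5sum

Differing sizes or listings between the two runs would match the glitches I was seeing.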

Thanks,

Brent

On Thu, 24 May 2007, Anand Avati wrote:

Harris,
this bug was fixed a few days back, and the fix is available in the latest
checkout of the glusterfs--mainline--2.4 repository.

thanks,
avati

2007/5/24, Harris Landgarten <harrisl@xxxxxxxxxxxxx>:
I am running glusterfs in a very basic configuration on Amazon EC2
instances. I have a 2 brick cluster and 2 clients. One of the clients is
running Zimbra and I am using the cluster as secondary storage for the mail
store. I have repeatedly tried to reindex a mailbox with 31000 items. Most
of the email is on the cluster. The entire process takes about 2 hours.
Partway through I get at least one TCP disconnect, which seems random. With
read_ahead enabled on the client, the disconnect results in a segfault and
the mount point disappears. When I disabled read_ahead on the client, the
disconnect was recovered from, and the process completed. This is the
backtrace from the read_ahead segfault:

[May 23 20:00:37] [CRITICAL/client-protocol.c:218/call_bail()]
client/protocol:bailing transport
[May 23 20:00:37] [DEBUG/tcp.c:123/cont_hand()] tcp:forcing poll/read/write
to break on blocked socket (if any)
[May 23 20:00:37] [ERROR/common-utils.c:55/full_rw()] libglusterfs:full_rw:
0 bytes r/w instead of 113 (errno=115)
[May 23 20:00:37] [DEBUG/protocol.c:244/gf_block_unserialize_transport()]
libglusterfs/protocol:gf_block_unserialize_transport: full_read of header
failed
[May 23 20:00:37] [DEBUG/client-protocol.c:2605/client_protocol_cleanup()]
protocol/client:cleaning up state in transport object 0x8078a08
[May 23 20:00:37] [CRITICAL/common-utils.c:215/gf_print_trace()]
debug-backtrace:Got signal (11), printing backtrace
[May 23 20:00:37] [CRITICAL/common-utils.c:217/gf_print_trace()]
debug-backtrace:/usr/lib/libglusterfs.so.0(gf_print_trace+0x2d)
[0xb7f2584d]
[May 23 20:00:37] [CRITICAL/common-utils.c:217/gf_print_trace()]
debug-backtrace:[0xbfffe420]
[May 23 20:00:37] [CRITICAL/common-utils.c:217/gf_print_trace()]
debug-backtrace:/usr/lib/glusterfs/1.3.0-pre4/xlator/performance/read-ahead.so(ra_page_error+0x47)
[0xb755e587]
[May 23 20:00:37] [CRITICAL/common-utils.c:217/gf_print_trace()]
debug-backtrace:/usr/lib/glusterfs/1.3.0-pre4/xlator/performance/read-ahead.so
[0xb755ecf0]
[May 23 20:00:37] [CRITICAL/common-utils.c:217/gf_print_trace()]
debug-backtrace:/usr/lib/glusterfs/1.3.0-pre4/xlator/performance/write-behind.so
[0xb7561809]
[May 23 20:00:37] [CRITICAL/common-utils.c:217/gf_print_trace()]
debug-backtrace:/usr/lib/glusterfs/1.3.0-pre4/xlator/cluster/unify.so
[0xb7564919]
[May 23 20:00:37] [CRITICAL/common-utils.c:217/gf_print_trace()]
debug-backtrace:/usr/lib/glusterfs/1.3.0-pre4/xlator/protocol/client.so
[0xb756d17b]
[May 23 20:00:37] [CRITICAL/common-utils.c:217/gf_print_trace()]
debug-backtrace:/usr/lib/glusterfs/1.3.0-pre4/xlator/protocol/client.so
[0xb75717a5]
[May 23 20:00:37] [CRITICAL/common-utils.c:217/gf_print_trace()]
debug-backtrace:/usr/lib/libglusterfs.so.0(transport_notify+0x1d)
[0xb7f26d2d]
[May 23 20:00:37] [CRITICAL/common-utils.c:217/gf_print_trace()]
debug-backtrace:/usr/lib/libglusterfs.so.0(sys_epoll_iteration+0xe7)
[0xb7f279d7]
[May 23 20:00:37] [CRITICAL/common-utils.c:217/gf_print_trace()]
debug-backtrace:/usr/lib/libglusterfs.so.0(poll_iteration+0x1d)
[0xb7f26ddd]
[May 23 20:00:37] [CRITICAL/common-utils.c:217/gf_print_trace()]
debug-backtrace:glusterfs [0x804a15e]
[May 23 20:00:37] [CRITICAL/common-utils.c:217/gf_print_trace()]
debug-backtrace:/lib/libc.so.6(__libc_start_main+0xdc) [0xb7dca8cc]
[May 23 20:00:37] [CRITICAL/common-utils.c:217/gf_print_trace()]
debug-backtrace:glusterfs [0x8049e71]
Segmentation fault (core dumped)

This is a sample of the debug log with read_ahead turned off

[May 24 05:35:05] [CRITICAL/client-protocol.c:218/call_bail()]
client/protocol:bailing transport
[May 24 05:35:05] [DEBUG/tcp.c:123/cont_hand()] tcp:forcing poll/read/write
to break on blocked socket (if any)
[May 24 05:35:05] [ERROR/common-utils.c:55/full_rw()] libglusterfs:full_rw:
0 bytes r/w instead of 113 (errno=115)
[May 24 05:35:05] [DEBUG/protocol.c:244/gf_block_unserialize_transport()]
libglusterfs/protocol:gf_block_unserialize_transport: full_read of header
failed
[May 24 05:35:05] [DEBUG/client-protocol.c:2605/client_protocol_cleanup()]
protocol/client:cleaning up state in transport object 0x80783d0
[May 24 05:35:05] [CRITICAL/tcp.c:81/tcp_disconnect()]
transport/tcp:client1: connection to server disconnected
[May 24 05:35:05] [DEBUG/tcp-client.c:180/tcp_connect()] transport: tcp:
:try_connect: socket fd = 4
[May 24 05:35:05] [DEBUG/tcp-client.c:202/tcp_connect()] transport: tcp:
:try_connect: finalized on port `1022'
[May 24 05:35:05] [DEBUG/tcp-client.c:226/tcp_connect()]
tcp/client:try_connect: defaulting remote-port to 6996
[May 24 05:35:05] [DEBUG/tcp-client.c:262/tcp_connect()] tcp/client:connect
on 4 in progress (non-blocking)
[May 24 05:35:05] [DEBUG/tcp-client.c:301/tcp_connect()]
tcp/client:connection on 4 still in progress - try later
[May 24 05:35:05] [ERROR/client-protocol.c:204/client_protocol_xfer()]
protocol/client:transport_submit failed
[May 24 05:35:05] [DEBUG/client-protocol.c:2605/client_protocol_cleanup()]
protocol/client:cleaning up state in transport object 0x80783d0
[May 24 05:35:11] [DEBUG/tcp-client.c:310/tcp_connect()]
tcp/client:connection on 4 success, attempting to handshake
[May 24 05:35:11] [DEBUG/tcp-client.c:54/do_handshake()]
transport/tcp-client:dictionary length = 50
[May 24 07:20:10] [DEBUG/stat-prefetch.c:58/stat_prefetch_cache_flush()]
stat-prefetch:flush on: /
[May 24 07:20:20] [DEBUG/stat-prefetch.c:58/stat_prefetch_cache_flush()]
stat-prefetch:flush on: /backups/sessions
[May 24 07:57:12] [CRITICAL/client-protocol.c:218/call_bail()]
client/protocol:bailing transport
[May 24 07:57:12] [DEBUG/tcp.c:123/cont_hand()] tcp:forcing poll/read/write
to break on blocked socket (if any)
[May 24 07:57:12] [ERROR/common-utils.c:55/full_rw()] libglusterfs:full_rw:
0 bytes r/w instead of 113 (errno=115)
[May 24 07:57:12] [DEBUG/protocol.c:244/gf_block_unserialize_transport()]
libglusterfs/protocol:gf_block_unserialize_transport: full_read of header
failed
[May 24 07:57:12] [DEBUG/client-protocol.c:2605/client_protocol_cleanup()]
protocol/client:cleaning up state in transport object 0x80783d0
[May 24 07:57:12] [CRITICAL/tcp.c:81/tcp_disconnect()]
transport/tcp:client1: connection to server disconnected
[May 24 07:57:12] [DEBUG/tcp-client.c:180/tcp_connect()] transport: tcp:
:try_connect: socket fd = 4
[May 24 07:57:12] [DEBUG/tcp-client.c:202/tcp_connect()] transport: tcp:
:try_connect: finalized on port `1023'
[May 24 07:57:12] [DEBUG/tcp-client.c:226/tcp_connect()]
tcp/client:try_connect: defaulting remote-port to 6996
[May 24 07:57:12] [DEBUG/tcp-client.c:262/tcp_connect()] tcp/client:connect
on 4 in progress (non-blocking)
[May 24 07:57:12] [DEBUG/tcp-client.c:301/tcp_connect()]
tcp/client:connection on 4 still in progress - try later
[May 24 07:57:12] [ERROR/client-protocol.c:204/client_protocol_xfer()]
protocol/client:transport_submit failed
[May 24 07:57:12] [DEBUG/client-protocol.c:2605/client_protocol_cleanup()]
protocol/client:cleaning up state in transport object 0x80783d0
[May 24 07:57:12] [DEBUG/tcp-client.c:310/tcp_connect()]
tcp/client:connection on 4 success, attempting to handshake
[May 24 07:57:12] [DEBUG/tcp-client.c:54/do_handshake()]
transport/tcp-client:dictionary length = 50

This is the client config with read_ahead

### Add client feature and attach to remote subvolume
volume client1
  type protocol/client
  option transport-type tcp/client     # for TCP/IP transport
# option ibv-send-work-request-size  131072
# option ibv-send-work-request-count 64
# option ibv-recv-work-request-size  131072
# option ibv-recv-work-request-count 64
# option transport-type ib-sdp/client  # for Infiniband transport
# option transport-type ib-verbs/client # for ib-verbs transport
  option remote-host xx.xxx.xx.xxx     # IP address of the remote brick
# option remote-port 6996              # default server port is 6996

# option transport-timeout 30          # seconds to wait for a reply
                                       # from server for each request
  option remote-subvolume brick        # name of the remote volume
end-volume

### Add client feature and attach to remote subvolume
volume client2
  type protocol/client
  option transport-type tcp/client     # for TCP/IP transport
# option ibv-send-work-request-size  131072
# option ibv-send-work-request-count 64
# option ibv-recv-work-request-size  131072
# option ibv-recv-work-request-count 64
# option transport-type ib-sdp/client  # for Infiniband transport
# option transport-type ib-verbs/client # for ib-verbs transport
  option remote-host yy.yyy.yy.yyy     # IP address of the remote brick
# option remote-port 6996              # default server port is 6996

# option transport-timeout 30          # seconds to wait for a reply
                                       # from server for each request
  option remote-subvolume brick        # name of the remote volume
end-volume

volume bricks
  type cluster/unify
  subvolumes client1 client2
  option scheduler alu
  option alu.limits.min-free-disk 4GB
  option alu.limits.max-open-files 10000

  option alu.order disk-usage:read-usage:write-usage:open-files-usage
  option alu.disk-usage.entry-threshold 2GB
  option alu.disk-usage.exit-threshold 10GB
  option alu.open-files-usage.entry-threshold 1024
  option alu.open-files-usage.exit-threshold 32
  option alu.stat-refresh.interval 10sec
end-volume

### Add writeback feature
volume writeback
  type performance/write-behind
  option aggregate-size 131072 # unit in bytes
  subvolumes bricks
end-volume

### Add readahead feature
volume readahead
  type performance/read-ahead
  option page-size 65536     # unit in bytes
  option page-count 16       # cache per file  = (page-count x page-size)
  subvolumes writeback
end-volume
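# With the values above, the read-ahead cache per open file works out to
# 16 pages x 65536 bytes = 1 MB.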

### Add stat-prefetch feature
### If you are not concerned about the performance of interactive commands
### like "ls -l", you wouldn't need this translator.
volume statprefetch
  type performance/stat-prefetch
  option cache-seconds 2   # timeout for stat cache
  subvolumes readahead
end-volume

This is the brick config:

### Export volume "brick" with the contents of "/home/export" directory.
volume brick
  type storage/posix                   # POSIX FS translator
  option directory /mnt/export        # Export this directory
end-volume

volume iothreads
  type performance/io-threads
  option thread-count 8
  subvolumes brick
end-volume

### Add network serving capability to above brick.
volume server
  type protocol/server
  option transport-type tcp/server     # For TCP/IP transport
# option ibv-send-work-request-size  131072
# option ibv-send-work-request-count 64
# option ibv-recv-work-request-size  131072
# option ibv-recv-work-request-count 64
# option transport-type ib-sdp/server  # For Infiniband transport
# option transport-type ib-verbs/server # For ib-verbs transport
# option bind-address 192.168.1.10     # Default is to listen on all interfaces
# option listen-port 6996              # Default is 6996
# option client-volume-filename /etc/glusterfs/glusterfs-client.vol
  subvolumes iothreads
# NOTE: Access to any volume through protocol/server is denied by default.
# You need to explicitly grant access through the "auth" option.
  option auth.ip.brick.allow * # Allow access to "brick" volume
end-volume

--
Anand V. Avati


_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxx
http://lists.nongnu.org/mailman/listinfo/gluster-devel