Re: Brick-Xlators crashes after Set-RO and Read

Hello Vijay,

How can I create such a core file? Or will it be created automatically if a gluster process crashes?
Maybe you can give me a hint, and I will try to get a backtrace.
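
In case it helps, this is roughly what I would try on the brick nodes (just a sketch based on my understanding; paths and the core location depend on kernel.core_pattern and on how glusterd is started):

# allow core dumps before the brick processes are (re)started
ulimit -c unlimited
sysctl -w kernel.core_pattern=/var/tmp/core.%e.%p
# if glusterd runs under systemd, LimitCORE=infinity in its unit file may be needed instead of ulimit
# after the next crash, load the core into gdb and dump the stacks of all threads
gdb /usr/sbin/glusterfsd /var/tmp/core.glusterfsd.<PID>
(gdb) thread apply all bt full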

Unfortunately this bug is not easy to reproduce because it only appears sporadically.

Regards
David Spisla

On Mon, May 6, 2019 at 7:48 PM Vijay Bellur <vbellur@xxxxxxxxxx> wrote:
Thank you for the report, David. Do you have core files available on any of the servers? If so, would it be possible for you to provide a backtrace?

Regards,
Vijay

On Mon, May 6, 2019 at 3:09 AM David Spisla <spisla80@xxxxxxxxx> wrote:
Hello folks,

we have a client application (running on Win10) which performs some FOPs on a Gluster volume that is accessed via SMB.

Scenario 1 is a READ operation which reads all files successively and checks whether each file's data was copied correctly. While doing this, all brick processes crash, and the following crash report appears in every brick log:
CTX_ID:a0359502-2c76-4fee-8cb9-365679dc690e-GRAPH_ID:0-PID:32934-HOST:XX-XXXXX-XX-XX-PC_NAME:shortterm-client-2-RECON_NO:-0, gfid: 00000000-0000-0000-0000-000000000001, req(uid:2000,gid:2000,perm:1,ngrps:1), ctx(uid:0,gid:0,in-groups:0,perm:700,updated-fop:LOOKUP, acl:-) [Permission denied]
pending frames:
frame : type(0) op(27)
frame : type(0) op(40)
patchset: git://git.gluster.org/glusterfs.git
signal received: 11
time of crash: 
2019-04-16 08:32:21
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 5.5
/usr/lib64/libglusterfs.so.0(+0x2764c)[0x7f9a5bd4d64c]
/usr/lib64/libglusterfs.so.0(gf_print_trace+0x306)[0x7f9a5bd57d26]
/lib64/libc.so.6(+0x361a0)[0x7f9a5af141a0]
/usr/lib64/glusterfs/5.5/xlator/features/worm.so(+0xb910)[0x7f9a4ef0e910]
/usr/lib64/glusterfs/5.5/xlator/features/worm.so(+0x8118)[0x7f9a4ef0b118]
/usr/lib64/glusterfs/5.5/xlator/features/locks.so(+0x128d6)[0x7f9a4f1278d6]
/usr/lib64/glusterfs/5.5/xlator/features/access-control.so(+0x575b)[0x7f9a4f35975b]
/usr/lib64/glusterfs/5.5/xlator/features/locks.so(+0xb3b3)[0x7f9a4f1203b3]
/usr/lib64/glusterfs/5.5/xlator/features/worm.so(+0x85b2)[0x7f9a4ef0b5b2]
/usr/lib64/libglusterfs.so.0(default_lookup+0xbc)[0x7f9a5bdd7b6c]
/usr/lib64/libglusterfs.so.0(default_lookup+0xbc)[0x7f9a5bdd7b6c]
/usr/lib64/glusterfs/5.5/xlator/features/upcall.so(+0xf548)[0x7f9a4e8cf548]
/usr/lib64/libglusterfs.so.0(default_lookup_resume+0x1e2)[0x7f9a5bdefc22]
/usr/lib64/libglusterfs.so.0(call_resume+0x75)[0x7f9a5bd733a5]
/usr/lib64/glusterfs/5.5/xlator/performance/io-threads.so(+0x6088)[0x7f9a4e6b7088]
/lib64/libpthread.so.0(+0x7569)[0x7f9a5b29f569]
/lib64/libc.so.6(clone+0x3f)[0x7f9a5afd69af]
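
For reference, the read workload is roughly equivalent to the following sketch (hypothetical paths, run against an SMB mount of the volume; the real client runs on Win10):

# read every file completely and record a checksum for later comparison
for f in /mnt/shortterm/*; do
    md5sum "$f" >> /tmp/read-check.log
done
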
Scenario 2: The application just sets Read-Only on each file successively. After the 70th file was set, all bricks crash, and again one can read this crash report in every brick log:
[2019-05-02 07:43:39.953591] I [MSGID: 139001] [posix-acl.c:263:posix_acl_log_permit_denied] 0-longterm-access-control: client: CTX_ID:21aa9c75-3a5f-41f9-925b-48e4c80bd24a-GRAPH_ID:0-PID:16325-HOST:XXX-X-X-XXX-PC_NAME:longterm-client-0-RECON_NO:-0, gfid: 00000000-0000-0000-0000-000000000001, req(uid:2000,gid:2000,perm:1,ngrps:1), ctx(uid:0,gid:0,in-groups:0,perm:700,updated-fop:LOOKUP, acl:-) [Permission denied]
pending frames:
frame : type(0) op(27)
patchset: git://git.gluster.org/glusterfs.git
signal received: 11
time of crash:
2019-05-02 07:43:39
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 5.5
/usr/lib64/libglusterfs.so.0(+0x2764c)[0x7fbb3f0b364c]
/usr/lib64/libglusterfs.so.0(gf_print_trace+0x306)[0x7fbb3f0bdd26]
/lib64/libc.so.6(+0x361e0)[0x7fbb3e27a1e0]
/usr/lib64/glusterfs/5.5/xlator/features/worm.so(+0xb910)[0x7fbb32257910]
/usr/lib64/glusterfs/5.5/xlator/features/worm.so(+0x8118)[0x7fbb32254118]
/usr/lib64/glusterfs/5.5/xlator/features/locks.so(+0x128d6)[0x7fbb324708d6]
/usr/lib64/glusterfs/5.5/xlator/features/access-control.so(+0x575b)[0x7fbb326a275b]
/usr/lib64/glusterfs/5.5/xlator/features/locks.so(+0xb3b3)[0x7fbb324693b3]
/usr/lib64/glusterfs/5.5/xlator/features/worm.so(+0x85b2)[0x7fbb322545b2]
/usr/lib64/libglusterfs.so.0(default_lookup+0xbc)[0x7fbb3f13db6c]
/usr/lib64/libglusterfs.so.0(default_lookup+0xbc)[0x7fbb3f13db6c]
/usr/lib64/glusterfs/5.5/xlator/features/upcall.so(+0xf548)[0x7fbb31c18548]
/usr/lib64/libglusterfs.so.0(default_lookup_resume+0x1e2)[0x7fbb3f155c22]
/usr/lib64/libglusterfs.so.0(call_resume+0x75)[0x7fbb3f0d93a5]
/usr/lib64/glusterfs/5.5/xlator/performance/io-threads.so(+0x6088)[0x7fbb31a00088]
/lib64/libpthread.so.0(+0x7569)[0x7fbb3e605569]
/lib64/libc.so.6(clone+0x3f)[0x7fbb3e33c9ef]
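
The Set-RO step itself is nothing special; it is roughly equivalent to this sketch (hypothetical paths, run against an SMB mount; on the Windows side it is presumably the equivalent of attrib +R):

# set each file read-only, one after another
for f in /mnt/longterm/*; do
    chmod a-w "$f"
done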


This happens on a 3-node Gluster v5.5 cluster on two different volumes, but both volumes have the same settings:
Volume Name: shortterm
Type: Replicate
Volume ID: 5307e5c5-e8a1-493a-a846-342fb0195dee
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: fs-xxxxx-c1-n1:/gluster/brick4/glusterbrick
Brick2: fs-xxxxx-c1-n2:/gluster/brick4/glusterbrick
Brick3: fs-xxxxx-c1-n3:/gluster/brick4/glusterbrick
Options Reconfigured:
storage.reserve: 1
performance.client-io-threads: off
nfs.disable: on
transport.address-family: inet
user.smb: disable
features.read-only: off
features.worm: off
features.worm-file-level: on
features.retention-mode: enterprise
features.default-retention-period: 120
network.ping-timeout: 10
features.cache-invalidation: on
features.cache-invalidation-timeout: 600
performance.nl-cache: on
performance.nl-cache-timeout: 600
client.event-threads: 32
server.event-threads: 32
cluster.lookup-optimize: on
performance.stat-prefetch: on
performance.cache-invalidation: on
performance.md-cache-timeout: 600
performance.cache-samba-metadata: on
performance.cache-ima-xattrs: on
performance.io-thread-count: 64
cluster.use-compound-fops: on
performance.cache-size: 512MB
performance.cache-refresh-timeout: 10
performance.read-ahead: off
performance.write-behind-window-size: 4MB
performance.write-behind: on
storage.build-pgfid: on
features.utime: on
storage.ctime: on
cluster.quorum-type: fixed
cluster.quorum-count: 2
features.bitrot: on
features.scrub: Active
features.scrub-freq: daily
cluster.enable-shared-storage: enable


Why can this happen to all brick processes? I don't understand the crash report. The FOPs are nothing special, and after restarting the brick processes everything works fine and our application succeeds.
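
If it would help to narrow this down, one thing we could try (just an idea, not tested yet) is to temporarily disable the file-level WORM feature, since worm.so shows up in the backtrace, and then re-run both jobs:

# temporarily switch off the suspected xlator on a test volume, then repeat the READ and Set-RO jobs
gluster volume set shortterm features.worm-file-level off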

Regards
David Spisla



_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-users
