Re: Sporadic Bus error on mmap() on FUSE mount

Niels de Vos <ndevos@xxxxxxxxxx> · Tue, 18 Jul 2017 14:05:40 +0200

On Tue, Jul 18, 2017 at 01:55:17PM +0200, Jan Wrona wrote:
> On 18.7.2017 12:17, Niels de Vos wrote:
> > On Tue, Jul 18, 2017 at 10:48:45AM +0200, Jan Wrona wrote:
> > > Hi,
> > > 
> > > I need to use rrdtool on top of a Gluster FUSE mount. rrdtool uses
> > > memory-mapped file I/O extensively (I know I can recompile rrdtool with
> > > mmap() disabled, but that is just a workaround). I have three FUSE mount
> > > points on three different servers. On one of them the command "rrdtool
> > > create test.rrd --start 920804400 DS:speed:COUNTER:600:U:U
> > > RRA:AVERAGE:0.5:1:24" works fine; on the other two servers the command is
> > > killed and a Bus error is reported. With every Bus error, the following two
> > > lines appear in the mount log:
> > > [2017-07-18 08:30:22.470770] E [MSGID: 108008]
> > > [afr-transaction.c:2629:afr_write_txn_refresh_done] 0-flow-replicate-0:
> > > Failing FALLOCATE on gfid 6a675cdd-2ea1-473f-8765-2a4c935a22ad: split-brain
> > > observed. [Input/output error]
> > > [2017-07-18 08:30:22.470843] W [fuse-bridge.c:1291:fuse_err_cbk]
> > > 0-glusterfs-fuse: 56589: FALLOCATE() ERR => -1 (Input/output error)
> > > 
> > > I'm not sure about the current state of mmap() on FUSE and Gluster, but it's
> > > strange that it works only on certain mounts of the same volume.
> > This can happen when a mmap()'d region is not backed by written data,
> > for example when trying to read/write a part of the mmap()'d region that
> > lies after the end-of-file. I've seen issues like this before (long ago),
> > and that got fixed in the write-behind xlator.
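
To illustrate the end-of-file point above, here is a minimal sketch of the "size the file first, then map it" pattern. It assumes, based only on the FALLOCATE entries in the mount log, that the application preallocates the file before mapping it; the function names allocate_and_map and map_without_allocating are hypothetical and are not taken from rrdtool or GlusterFS. In C, touching mapped pages past end-of-file raises SIGBUS at access time, whereas Python's mmap module rejects such a mapping up front with ValueError, which makes the failure mode easy to show safely.

```python
import mmap
import os
import tempfile


def allocate_and_map(path, size):
    """Hypothetical helper: size the file first, then map it."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        if hasattr(os, "posix_fallocate"):
            # Raises OSError (for example EIO) if the filesystem rejects it.
            os.posix_fallocate(fd, 0, size)
        elif os.fstat(fd).st_size < size:
            # Fallback for platforms without posix_fallocate: extend only.
            os.ftruncate(fd, size)
        # Refuse to map a file that is shorter than the requested mapping.
        if os.fstat(fd).st_size < size:
            raise OSError("file is shorter than the requested mapping")
        return mmap.mmap(fd, size)
    finally:
        # The mapping keeps its own duplicate of the descriptor.
        os.close(fd)


def map_without_allocating(path, size):
    """Hypothetical helper: map a new, empty file without sizing it."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # On Unix, Python raises ValueError because size exceeds the file size.
        return mmap.mmap(fd, size)
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        good = allocate_and_map(os.path.join(tmp, "sized.rrd"), 4096)
        good[0:3] = b"RRD"  # safe: the page is backed by the file
        good.close()
        try:
            map_without_allocating(os.path.join(tmp, "empty.rrd"), 4096)
        except ValueError as exc:
            print("refused to map past end-of-file:", exc)
```

If the allocation step fails, as the FALLOCATE errors in the log suggest it may have here, the safe reaction is to stop before mapping rather than to continue with a file that is too short.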
> > 
> > Could you disable the performance.write-behind option for the volume and
> > try to reproduce the problem? If the issue is in write-behind, disabling
> > it should prevent the issue.
> > 
> > If this helps, please file a bug with strace of the application and
> > tcpdump that contains the GlusterFS traffic from start to end when the
> > problem is observed.
> 
> I've disabled performance.write-behind, unmounted, stopped and started
> the volume, then mounted again, but it had no effect. After that I
> successively disabled and enabled options and xlators, and found that the
> problem is related to the cluster.nufa option. When the NUFA translator is
> disabled, rrdtool works fine on all mounts. When it is enabled again, the
> problem shows up again.

Thanks for testing. NUFA is not something that is used a lot, and I
think it only has benefits for very few workloads. I don't think we can
recommend using NUFA.

In any case, this seems to be a bug in the NUFA xlator. Please file a
bug for it nevertheless, and point to this discussion in the mailing
list archives.

  http://lists.gluster.org/pipermail/gluster-users/ (find the URL there)

Thanks,
Niels

> 
> > 
> >    https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS&component=write-behind
> > 
> > HTH,
> > Niels
> > 
> > 
> > > version: glusterfs 3.10.3
> > > 
> > > [root@dc1]# gluster volume info flow
> > > Volume Name: flow
> > > Type: Distributed-Replicate
> > > Volume ID: dc6a9ea0-97ec-471f-b763-1d395ece73e1
> > > Status: Started
> > > Snapshot Count: 0
> > > Number of Bricks: 3 x 2 = 6
> > > Transport-type: tcp
> > > Bricks:
> > > Brick1: dc1.liberouter.org:/data/glusterfs/flow/brick1/safety_dir
> > > Brick2: dc2.liberouter.org:/data/glusterfs/flow/brick2/safety_dir
> > > Brick3: dc2.liberouter.org:/data/glusterfs/flow/brick1/safety_dir
> > > Brick4: dc3.liberouter.org:/data/glusterfs/flow/brick2/safety_dir
> > > Brick5: dc3.liberouter.org:/data/glusterfs/flow/brick1/safety_dir
> > > Brick6: dc1.liberouter.org:/data/glusterfs/flow/brick2/safety_dir
> > > Options Reconfigured:
> > > performance.parallel-readdir: on
> > > performance.client-io-threads: on
> > > cluster.nufa: enable
> > > network.ping-timeout: 10
> > > transport.address-family: inet
> > > nfs.disable: true
> > > 
> > > [root@dc1]# gluster volume status flow
> > > Status of volume: flow
> > > Gluster process                             TCP Port  RDMA Port Online  Pid
> > > ------------------------------------------------------------------------------
> > > Brick dc1.liberouter.org:/data/glusterfs/fl
> > > ow/brick1/safety_dir                        49155     0 Y       26441
> > > Brick dc2.liberouter.org:/data/glusterfs/fl
> > > ow/brick2/safety_dir                        49155     0 Y       26110
> > > Brick dc2.liberouter.org:/data/glusterfs/fl
> > > ow/brick1/safety_dir                        49156     0 Y       26129
> > > Brick dc3.liberouter.org:/data/glusterfs/fl
> > > ow/brick2/safety_dir                        49152     0 Y       8703
> > > Brick dc3.liberouter.org:/data/glusterfs/fl
> > > ow/brick1/safety_dir                        49153     0 Y       8722
> > > Brick dc1.liberouter.org:/data/glusterfs/fl
> > > ow/brick2/safety_dir                        49156     0 Y       26460
> > > Self-heal Daemon on localhost               N/A       N/A Y       26493
> > > Self-heal Daemon on dc2.liberouter.org      N/A       N/A Y       26151
> > > Self-heal Daemon on dc3.liberouter.org      N/A       N/A Y       8744
> > > 
> > > Task Status of Volume flow
> > > ------------------------------------------------------------------------------
> > > There are no active volume tasks
> > > 
> > > _______________________________________________
> > > Gluster-users mailing list
> > > Gluster-users@xxxxxxxxxxx
> > > http://lists.gluster.org/mailman/listinfo/gluster-users
> 
> 