Re: NFS service dying

Giuseppe Ragusa <giuseppe.ragusa@xxxxxxxxxxx> · Tue, 17 Jan 2017 00:16:14 +0100

On Mon, Jan 16, 2017, at 11:18, Niels de Vos wrote:
> On Fri, Jan 13, 2017 at 06:43:33PM +0100, Giuseppe Ragusa wrote:
> > On Fri, Jan 13, 2017, at 12:39, Niels de Vos wrote:
> > > On Wed, Jan 11, 2017 at 11:58:29AM -0700, Paul Allen wrote:
> > > > I'm running into an issue where the gluster nfs service keeps dying on a
> > > > new cluster I have set up recently. We've been using Gluster on several
> > > > other clusters now for about a year or so and I have never seen this
> > > > issue before, nor have I been able to find anything remotely similar to
> > > > it while searching on-line. I initially was using the latest version in
> > > > the Gluster Debian repository for Jessie, 3.9.0-1, and then I tried
> > > > using the next one down, 3.8.7-1. Both behave the same for me.
> > > > 
> > > > What I was seeing was after a while the nfs service on the NAS server
> > > > would suddenly die after a number of processes had run on the app server
> > > > I had connected to the new NAS servers for testing (we're upgrading the
> > > > NAS servers for this cluster to newer hardware and expanded storage, the
> > > > current production NAS servers are using nfs-kernel-server with no type
> > > > of clustering of the data). I checked the logs but all it showed me was
> > > > something that looked like a stack trace in the nfs.log and the
> > > > glustershd.log showed the nfs service disconnecting. I turned on
> > > > debugging but it didn't give me a whole lot more, and certainly nothing
> > > > that helps me identify the source of my issue. It is pretty consistent
> > > > in dying shortly after I mount the file system on the servers and start
> > > > testing, usually within 15-30 minutes. But if I have nothing using the
> > > > file system, mounted or no, the service stays running for days. I tried
> > > > mounting it using the gluster client, and it works fine, but I can't use
> > > > that due to the performance penalty, it slows the websites down by a few
> > > > seconds at a minimum.
> > > 
> > > This seems to be related to the NLM protocol that Gluster/NFS provides.
> > > Earlier this week one of our Red Hat quality engineers also reported
> > > this (or a very similar) problem.
> > > 
> > > https://bugzilla.redhat.com/show_bug.cgi?id=1411344
> > > 
> > > At the moment I suspect that this is related to re-connects of some
> > > kind, but I have not been able to identify the cause sufficiently to be
> > > sure. This definitely is a coding problem in Gluster/NFS, but the more I
> > > look at the NLM implementation, the more potential issues I see with it.
> > 
> > Should we assume that, with the complete removal of Gluster/NFS
> > already on the horizon, debugging and fixing NLM (even if only for the
> > more common, reported crash cases) would be an extremely low-priority
> > task? ;-)
> 
> Crashes are expected to be fixed, if they happen in 'normal'
> circumstances. It is not an extremely low priority, but neither is it
> very high. I'm looking into it, but am also working on other higher
> priority things. Maybe I have a fix by the end of the week, depending on
> whatever distracts me from working on it.

Many thanks for all your efforts!
I will eagerly await any patch and upgrade to the version that contains it (even if that means upgrading from 3.7.x to 3.8.x). Alternatively, I could recompile local rpms as soon as a tested patch is available in Gerrit and upgrade to those at the first available maintenance window.

If it can help, I had already added my logs/info to https://bugzilla.redhat.com/show_bug.cgi?id=1381970 (it could ultimately be a different issue, of course, but it is easily reproducible anyway) and I could help collect further details/logs if you need them.
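
For anyone collecting similar details, here is a minimal sketch (my own, not part of Gluster) that pulls the module path and offset out of backtrace lines like the ones in the nfs.log quoted further down and builds addr2line commands for the frames that have no symbol name. It assumes the glibc-style "module(symbol+offset)[address]" format seen in that log and that debug symbols for the listed libraries are installed; the function names are made up for illustration.

```python
import re

# Matches backtrace lines such as
#   /path/lib.so(symbol+0xac)[0x7f89...]   or   /path/lib.so(+0x3a352)[0x7f89...]
FRAME_RE = re.compile(
    r"^(?P<module>/\S+?)\((?P<symbol>[^+()]*)\+(?P<offset>0x[0-9a-fA-F]+)\)"
    r"\[(?P<addr>0x[0-9a-fA-F]+)\]$"
)


def parse_frames(log_text):
    """Return (module, symbol, offset) for every backtrace line in log_text."""
    frames = []
    for line in log_text.splitlines():
        # Drop mail quoting ("> > ") and surrounding whitespace.
        line = line.lstrip("> ").strip()
        match = FRAME_RE.match(line)
        if match:
            frames.append(
                (match.group("module"), match.group("symbol"), match.group("offset"))
            )
    return frames


def addr2line_commands(frames):
    """Build addr2line invocations for frames that carry no symbol name."""
    return [
        f"addr2line -f -e {module} {offset}"
        for module, symbol, offset in frames
        if not symbol
    ]


if __name__ == "__main__":
    import sys

    # Usage: python frames.py < nfs.log
    for command in addr2line_commands(parse_frames(sys.stdin.read())):
        print(command)
```

The printed commands only resolve to function names and source lines if the matching debug packages are present on the machine where they are run.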

> > Would it be possible for someone to simply check whether the crashes
> > happen also on the 3.7.9-12 codebase used in latest RHGS 3.1.3?
> > My cluster is already at 3.7.12 feature level (and using it), so I
> > suppose I could not easily downgrade.
> > Since Red Hat QA found a similar problem while testing the 3.8.4-10
> > codebase in RHGS 3.2.0, we could trace the problem back to post-3.7.9
> > developments, if RHGS 3.1.3 is immune.
> 
> I expect to see this problem in older versions too. There has been very
> little change to the NLM code in Gluster. It is possible that the Linux
> kernel was updated and the NFS-client behaves a little differently,
> exposing this problem just now, or the testing has changed...

I thought that it could not be present in earlier/current RHGS versions *and* still escape QA and real-world use, but your suggestion could be the explanation. It may also matter that most use up to now has probably been on RHEL 6, so the difference in the kernel NFS client compared with RHEL 7 could be even more marked.

> > > If the workload does not require locking operations, you may be able to
> > > work around the problem by mounting with "-o nolock". Depending on the
> > > application, this can be safe or cause data corruption...
> > 
> > If I'm not mistaken, typical NFS uses such as YUM repositories and
> > home directories would be barred (I think that SQLite needs locking
> > and both createrepo and firefox use SQLite, right?).
> 
> Only if access to the same content is happening from different
> NFS-clients. The case of createrepo is normally safe, it generates new
> files and renames them over the older ones.
> 
> SQLite needs locking if multiple processes read/write the file. By
> default SQLite uses in-memory (SHM) locks and does not try to use file
> locks at all. At least that was the behaviour months (or years?) back.
> This causes troubles for any SQLite database stored on a network
> filesystem and accessed from different clients.

OK, I understand. I had assumed so because in the past I had a misconfigured Gluster volume on which NFS locking failed altogether, and createrepo crashed every time while creating the SQLite db (with no contention whatsoever).
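
To check whether lock requests on a given mount fail outright (as they did in that misconfigured case), something like the following could be used. It is only a sketch under my own assumptions: the function name is hypothetical, and it simply takes and releases a POSIX lock on a scratch file in the directory it is given.

```python
import errno
import fcntl
import os
import tempfile


def probe_posix_lock(directory):
    """Take and release an exclusive POSIX lock on a scratch file.

    Returns (True, None) on success or (False, error_text) if the lock
    request is refused.
    """
    fd, path = tempfile.mkstemp(prefix="lockprobe-", dir=directory)
    try:
        try:
            # Non-blocking exclusive lock on the whole file.
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError as exc:
            name = errno.errorcode.get(exc.errno, str(exc.errno))
            return False, f"{name}: {exc.strerror}"
        fcntl.lockf(fd, fcntl.LOCK_UN)
        return True, None
    finally:
        # Always clean up the scratch file, whatever the outcome.
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    import sys

    # Usage: python lockprobe.py /path/to/mountpoint
    print(probe_posix_lock(sys.argv[1]))
```

Note that a success here does not prove that server-side locking works: with "-o nolock" the lock is granted locally by the client, so it only excludes other processes on the same machine.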

> > > Another alternative is to use NFS-Ganesha instead of Gluster/NFS.
> > > Ganesha is more mature than Gluster/NFS and is more actively developed.
> > > Gluster/NFS is being deprecated in favour of NFS-Ganesha.
> > 
> > Pure storage infrastructure uses should be migratable, I suppose, but
> > extended testing and a suitable maintenance window (a rolling live
> > migration from Gluster/NFS is not feasible, if I understood Ganesha
> > right) would be needed anyway, I suppose.
> 
> Yes, migrations from one software component to another will normally
> require testing and a maintenance window for making the change. This
> case is not different from that.

Well, proper IT practices should always be followed, sure, but I was subtly hinting at the lack of specific documented practices (even for the "standard" case) for the switch from Gluster-NFS to NFS-Ganesha :-)

> > More "extreme" uses (such as mine, unfortunately: hyperconverged
> > Gluster+oVirt setup coexisting with full CIFS/NFS sharing services)
> > have not been documented/considered for Ganesha, according to my own
> > research on the case (but please correct me if I'm wrong).
> 
> Hyperconverged (how it mostly is used) means that you are running
> Gluster+oVirt on the same servers. In this case, locking with NFS will
> already have problems. It is only possible for an NFS-server *or*
> NFS-client to register the locking service (NLM protocol) at the
> portmapper (rpcbind). When an NFS mount is done, feature checking for
> NLM happens as well. If one of the two (client or server) does not have
> a functional NLM, locking will (silently) be disabled.
> 
> You can use locks only when you use the native GlusterFS protocol on the
> same servers as clients. That means FUSE mounts or access through
> libgfapi.

Sorry for not being clearer: there are no NFS server/client issues in the virtualization part of my setup.
I am using a hyperconverged Gluster+oVirt setup where:

*) all the oVirt-related Gluster volumes (replicated-distributed in replica 3 with arbiter and sharding) are accessed by means of FUSE mounts (since oVirt does not support gfapi access as of oVirt version 3.6.x) except for one (the ISO domain, which is defined as an NFS storage domain but has never had any problem nonetheless)

*) separate Gluster volumes for non-oVirt-related data (replicated-distributed in replica 3 with arbiter without sharding) are accessed (both by oVirt-based virtual machines and physical servers) by means of NFS; these are the only volumes experiencing the problems reported here
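
Regarding the NLM registration at rpcbind mentioned above, a rough way to see what is registered on a host is to look for the lock manager (RPC program 100021, shown as nlockmgr) in the output of "rpcinfo -p". The sketch below parses that output; the function names are my own, and it cannot tell whether the registration belongs to the kernel lock manager or to Gluster/NFS.

```python
import subprocess

NLM_PROGRAM = 100021  # ONC RPC program number of the NLM lock manager (nlockmgr)


def nlm_registrations(rpcinfo_output):
    """Return (version, proto, port) for each nlockmgr entry in `rpcinfo -p` output."""
    entries = []
    for line in rpcinfo_output.splitlines():
        fields = line.split()
        # Expected columns: program vers proto port [service]; skip the header row.
        if len(fields) < 4 or not fields[0].isdigit():
            continue
        if int(fields[0]) == NLM_PROGRAM:
            entries.append((int(fields[1]), fields[2], int(fields[3])))
    return entries


def query_host(host):
    """Run `rpcinfo -p host` and parse the result (needs the rpcinfo tool installed)."""
    result = subprocess.run(
        ["rpcinfo", "-p", host], capture_output=True, text=True, check=True
    )
    return nlm_registrations(result.stdout)


if __name__ == "__main__":
    import sys

    # Usage: python nlmcheck.py <hostname>
    print(query_host(sys.argv[1]))
```

An empty list means that no NLM service is registered at that host's portmapper, which is the situation in which locking would be silently disabled at mount time.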

> > Since I already envisioned such an outcome, I recently posted a
> > request for info/directions on such a migration in my particular case:
> > 
> > http://lists.gluster.org/pipermail/gluster-users/2017-January/029650.html
> > 
> > Can anyone from the developers camp kindly comment on the above points? :-)
> 
> I'll poke some of them and see that someone replies to it.

Many many thanks for your assistance.

Best regards,
Giuseppe

> Cheers,
> Niels
> 
> 
> > 
> > Many many thanks in advance.
> > 
> > Best regards,
> > Giuseppe
> > 
> > > HTH,
> > > Niels
> > > 
> > > 
> > > > 
> > > > Here is the output from the logs one of the times it died:
> > > > 
> > > > glustershd.log:
> > > > 
> > > > [2017-01-10 19:06:20.265918] W [socket.c:588:__socket_rwv] 0-nfs: readv
> > > > on /var/run/gluster/a921bec34928e8380280358a30865cee.socket failed (No
> > > > data available)
> > > > [2017-01-10 19:06:20.265964] I [MSGID: 106006]
> > > > [glusterd-svc-mgmt.c:327:glusterd_svc_common_rpc_notify] 0-management:
> > > > nfs has disconnected from glusterd.
> > > > 
> > > > 
> > > > nfs.log:
> > > > 
> > > > [2017-01-10 19:06:20.135430] D [name.c:168:client_fill_address_family]
> > > > 0-NLM-client: address-family not specified, marking it as unspec for
> > > > getaddrinfo to resolve from (remote-host: 10.20.5.13)
> > > > [2017-01-10 19:06:20.135531] D [MSGID: 0]
> > > > [common-utils.c:335:gf_resolve_ip6] 0-resolver: returning ip-10.20.5.13
> > > > (port-48963) for hostname: 10.20.5.13 and port: 48963
> > > > [2017-01-10 19:06:20.136569] D [logging.c:1764:gf_log_flush_extra_msgs]
> > > > 0-logging-infra: Log buffer size reduced. About to flush 5 extra log
> > > > messages
> > > > [2017-01-10 19:06:20.136630] D [logging.c:1767:gf_log_flush_extra_msgs]
> > > > 0-logging-infra: Just flushed 5 extra log messages
> > > > pending frames:
> > > > frame : type(0) op(0)
> > > > patchset: git://git.gluster.com/glusterfs.git
> > > > signal received: 11
> > > > time of crash:
> > > > 2017-01-10 19:06:20
> > > > configuration details:
> > > > argp 1
> > > > backtrace 1
> > > > dlfcn 1
> > > > libpthread 1
> > > > llistxattr 1
> > > > setfsid 1
> > > > spinlock 1
> > > > epoll.h 1
> > > > xattr.h 1
> > > > st_atim.tv_nsec 1
> > > > package-string: glusterfs 3.9.0
> > > > /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xac)[0x7f891f0846ac]
> > > > /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(gf_print_trace+0x324)[0x7f891f08dcc4]
> > > > /lib/x86_64-linux-gnu/libc.so.6(+0x350e0)[0x7f891db870e0]
> > > > /lib/x86_64-linux-gnu/libc.so.6(+0x91d8a)[0x7f891dbe3d8a]
> > > > /usr/lib/x86_64-linux-gnu/glusterfs/3.9.0/xlator/nfs/server.so(+0x3a352)[0x7f8918682352]
> > > > /usr/lib/x86_64-linux-gnu/glusterfs/3.9.0/xlator/nfs/server.so(+0x3cc15)[0x7f8918684c15]
> > > > /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x2aa)[0x7f891ee4e4da]
> > > > /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_transport_notify+0x23)[0x7f891ee4a7e3]
> > > > /usr/lib/x86_64-linux-gnu/glusterfs/3.9.0/rpc-transport/socket.so(+0x4b33)[0x7f8919eadb33]
> > > > /usr/lib/x86_64-linux-gnu/glusterfs/3.9.0/rpc-transport/socket.so(+0x8f07)[0x7f8919eb1f07]
> > > > /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(+0x7e836)[0x7f891f0d9836]
> > > > /lib/x86_64-linux-gnu/libpthread.so.0(+0x80a4)[0x7f891e3010a4]
> > > > /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f891dc3a62d]
> > > > 
> > > > 
> > > > The IP showing in the nfs.log is actually for a web server I was also
> > > > testing with, not the app server, but it doesn't appear to me that would
> > > > be the cause for the nfs service dying. I'm at a loss as to what is
> > > > going on, and I need to try and get this fixed pretty quickly here, I
> > > > was hoping to have this in production last Friday. If anyone has any
> > > > ideas I'd be very grateful.
> > > > 
> > > > -- 
> > > > 
> > > > Paul Allen
> > > > 
> > > > Inetz System Administrator
> > > > 
> > > > 
> > > > _______________________________________________
> > > > Gluster-users mailing list
> > > > Gluster-users@xxxxxxxxxxx
> > > > http://www.gluster.org/mailman/listinfo/gluster-users