Re: GlusterFS 3.3.0 and NFS client failures

FYI - this appears to be resolved.

  I'm now at 22 days of uptime on the machine I was fixing/debugging; in the same period I managed to hang another machine twice.

The issue appears to be a known bug in the kernel's client-side NFS code (and is *not* a GlusterFS NFS server issue):
  https://bugs.launchpad.net/ubuntu/+source/linux/+bug/585657

I took the target Ubuntu Maverick host from Ubuntu kernel 2.6.35-22 to version 2.6.35-32 and it seems to have fixed the problem (during the same period, no changes were made to the Gluster/Saturn NFS server installation).
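
For anyone wanting to check their own clients, here is a minimal sketch that compares an Ubuntu-style kernel release string against the build that behaved for me. The function names are made up for illustration, and treating 2.6.35-32 as the threshold is an assumption: I only tested -22 (bad) and -32 (seemingly good), nothing in between, and the comparison is only meaningful within Ubuntu's 2.6.35 series.

```python
import re

def kernel_tuple(release: str) -> tuple:
    """Turn an Ubuntu-style release such as '2.6.35-22-generic' into (2, 6, 35, 22)."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)-(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return tuple(int(part) for part in match.groups())

# Assumption: 2.6.35-32 is the first build observed to be stable in this thread;
# builds between -22 and -32 were never tested.
KNOWN_GOOD = kernel_tuple("2.6.35-32")

def at_least_known_good(release: str) -> bool:
    """True if the release sorts at or above the assumed known-good build."""
    return kernel_tuple(release) >= KNOWN_GOOD

# The two releases mentioned in this thread (as `uname -r` would report them).
for release in ("2.6.35-22-generic", "2.6.35-32-generic"):
    print(release, at_least_known_good(release))
```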

Obviously the same NFS client bug exists in my fully patched Linux Mint machine .. pity .. and PITA.


Thanks again for your help.



----- Original Message -----
>From: "Ian Latter" <ian.latter@xxxxxxxxxxxxxxxx>
>To: "Anand Avati" <anand.avati@xxxxxxxxx>
>Subject:  Re: GlusterFS 3.3.0 and NFS client failures
>Date: Sun, 24 Mar 2013 12:16:45 +1000
>
> 
> Interestingly, I never considered this as an "NFS" problem.
> 
>   1) Eg; http://www.novell.com/support/kb/doc.php?id=7008148
>       (note that none of my machines are SLES).
> 
>   2) One NFS client in my environment has never hung on the 
>       Gluster/NFS share, and it only does play-back (read).
> 
> 
> This has opened up some possibilities.  Thanks for the idea bouncing.
> 
> 
> 
> ----- Original Message -----
> >From: "Ian Latter" <ian.latter@xxxxxxxxxxxxxxxx>
> >To: "Anand Avati" <anand.avati@xxxxxxxxx>
> >Subject:  Re: GlusterFS 3.3.0 and NFS client failures
> >Date: Sun, 24 Mar 2013 11:35:52 +1000
> >
> > 
> > Thanks for the extra queries;
> > 
> > > 1) when the system was "hung", was the client still flushing data to the
> > > nfs server? 
> > 
> > I have no more detail outside of what has been supplied.  I.e. the other two 
> > applications I've seen fail have been starting a guest from a Gluster NFS 
> > share from VMWare Player (2nd most common, has happened four or five 
> > times) and "cp" from bash (3rd most common, has happened twice) - each 
> > on different hosts (NFS clients).
> > 
> > > Any network activity? 
> > 
> > On the myth backend server - No NFS network activity occurs after the
> > event, but I can still SSH into it (and everything is fine in that session, you
> > just can't perform any NFS iops or you'll hang that session on the blocked
> > io).
> > 
> > On the Gluster NFS server (Saturn) there are no issues at all - other 
> > devices on the network continue to access NFS shares from that box.
> > And no action is required on the Gluster NFS server for the failed host 
> > to reconnect and operate normally again.
> > 
> > > The backtrace below shows that the system was just waiting for a long
> > > time waiting for a write to complete.
> > 
> > Note that this state never recovers.  Sometimes it's days before I find that 
> > the backend server has hung (as in, it has been in a hung state for days - 
> > like when it failed the day after I left for an overseas trip).  It doesn't reset; 
> > all future IOPS on that share hang.
> > 
> > So one diagnostic option is for me to write up a script that just creates and 
> > destroys a shed-load of data on a Gluster NFS share with an strace 
> > running over it .. but your comment suggests that if a "cp" hangs on write 
> > with output like the below then you'll be unimpressed.  What about a 
> > network trace instead?
> > 
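A minimal sketch of the create-and-destroy test described above, assuming the share is already mounted somewhere writable. The function name, file prefix, sizes and the example path are all hypothetical; each file is fsync'd before deletion so the data is actually pushed to the server rather than sitting in the page cache.

```python
import os
import tempfile

def churn(directory: str, rounds: int = 10, size_bytes: int = 1 << 20) -> int:
    """Write, fsync and delete `rounds` files in `directory`; return total bytes written."""
    payload = os.urandom(size_bytes)
    written = 0
    for _ in range(rounds):
        fd, path = tempfile.mkstemp(prefix="nfs-churn-", dir=directory)
        try:
            with os.fdopen(fd, "wb") as handle:
                handle.write(payload)
                handle.flush()
                # Force the data out to the server before the file is removed.
                os.fsync(handle.fileno())
            written += size_bytes
        finally:
            os.unlink(path)
    return written
```

Usage would be something like `churn("/mnt/test_share", rounds=1000)`, where the path is a placeholder for a test mount. If the client hangs in the way described in this thread, the call simply never returns, so logging progress from the caller is the only way to learn how much data went through before the stall.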
> > 
> > > 2) anything in the gluster nfs logs?
> > 
> > There are no other logs on Saturn, other than syslog and the console 
> > (dmesg), and I'm not seeing any Gluster or NFS entries there.  Is there 
> > a way to get Gluster to log to syslog rather than on-disk files?  I only 
> > have a couple of Megabytes of disk space available.
> > 
> > 
> > > 3) is it possible DHCP assigned a different IP while renewing lease?
> > 
> > No.  As a backend service it is issued a static DHCP assignment based 
> > on its MAC address, and in the final case (13 days uptime) I removed 
> > DHCP requesting completely by hard coding the IPv4 address in 
> > Ubuntu's /etc/network* scripts on the Myth Backend host.  I.e. it fails 
> > without a DHCP process involved.
> > 
> > 
> > The feeling that I get is that this is a rare issue - in which case I'm looking
> > for something domestic - like a hardware fault or something bespoke in 
> > my configuration.  The problem reminds me of the error we found in one of 
> > the Gluster modules that would kill replication for some files > 2GB but 
> > not all, so one of my other thoughts was that it may be something related 
> > to this (Saturn) being a 32bit platform.  However I have another site 
> > running Saturn that doesn't have this problem, so I'm reluctant to believe
> > that it's the kernel-to-gluster stack either.  But seeing the failures in three
> > NFS clients makes it look like it's a Gluster/Saturn side issue.
> > 
> > Let me try a packet capture and we'll see if there's anything odd in there.
> > 
> > The next failure is due in a week or so.  If I get a chance I'll also write up
> > a test script to see if I can force a failure (it might also give a view on the 
> > volume of data required to trigger an event, and if it works it will give me
> > the ability to separate testing from my production kit).
> > 
> > 
> > 
> > Thanks,
> > 
> > 
> > 
> > 
> > ----- Original Message -----
> > >From: "Anand Avati" <anand.avati@xxxxxxxxx>
> > >To: "Ian Latter" <ian.latter@xxxxxxxxxxxxxxxx>
> > >Subject:  Re: GlusterFS 3.3.0 and NFS client failures
> > >Date: Sat, 23 Mar 2013 16:40:38 -0700
> > >
> > > Do you have any more details, like -
> > > 
> > > 1) when the system was "hung", was the client still flushing data to the
> > > nfs server? Any network activity? The backtrace below shows that the system
> > > was just waiting for a long time waiting for a write to complete.
> > > 
> > > 2) anything in the gluster nfs logs?
> > > 
> > > 3) is it possible DHCP assigned a different IP while renewing lease?
> > > 
> > > Avati
> > > 
> > > On Fri, Mar 22, 2013 at 3:04 AM, Ian Latter <ian.latter@xxxxxxxxxxxxxxxx> wrote:
> > > 
> > > > Hello,
> > > >
> > > >
> > > > >   This is a problem that I've been chipping at on and off for a while and
> > > > > it's finally cost me one recording too many - I just want to get it cured -
> > > > > any help would be greatly appreciated.
> > > >
> > > >   I'm using the kernel NFS client on a number of Linux machines (four, I
> > > > believe), to map back to two Gluster 3.3.0 shares.
> > > >
> > > >   I have seen Linux Mint and Ubuntu machines of various generations and
> > > > configurations (one is 64bit) hang intermittently on either one of the two
> > > > > Gluster shares on "access" (I can't say if it's writing or not - the below
> > > > > log is for a write).  But by far the most common failure example is my MythTV
> > > > > Backend server.  It has 5 tuners pulling down up to a gigabyte per hour
> > > > > each directly to an NFS share from Gluster 3.3.0 with two local 3TB
> > > > drives in a "distribute" volume.  It also re-parses each recording for Ad
> > > > filtering, so the share gets a good thrashing.  The myth backend box would
> > > > fail (hang the system) once each 2-4 days.
> > > >
> > > > The backend server was also updating its NIC via DHCP.  I have been using
> > > > an MTU of 1460 and each DHCP event would thus result in this note in syslog;
> > > >  [  12.248640] r8169: WARNING! Changing of MTU on this NIC may lead to
> > > > frame reception errors!
> > > >
> > > > > I changed the DHCP MTU to 1500 and didn't see an improvement.  So, the
> > > > last change I made was a hard coded address and default MTU (of 1500).
> > > > The most recent trial saw a 13 day run time which is well outside the norm,
> > > > but it still borked (one test only - may have been lucky).
> > > >
> > > > >> syslog burp;
> > > > > [1204800.908075] INFO: task mythbackend:21353 blocked for more than 120 seconds.
> > > > > [1204800.908084] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > > > > [1204800.908091] mythbackend   D f6af9d28     0 21353      1 0x00000000
> > > > > [1204800.908107]  f6af9d38 00000086 00000002 f6af9d28 f6a4e580 c05d89e0 c08c3700 c08c3700
> > > > > [1204800.908123]  bd3a4320 0004479c c08c3700 c08c3700 bd3a0e4e 0004479c 00000000 c08c3700
> > > > > [1204800.908138]  c08c3700 f6a4e580 00000001 f6a4e580 c2488700 f6af9d80 f6af9d48 c05c6e51
> > > > [1204800.908152] Call Trace:
> > > > [1204800.908170]  [<c05c6e51>] io_schedule+0x61/0xa0
> > > > [1204800.908180]  [<c01d9c4d>] sync_page+0x3d/0x50
> > > > [1204800.908190]  [<c05c761d>] __wait_on_bit+0x4d/0x70
> > > > [1204800.908197]  [<c01d9c10>] ? sync_page+0x0/0x50
> > > > [1204800.908211]  [<c01d9e71>] wait_on_page_bit+0x91/0xa0
> > > > [1204800.908221]  [<c0165e60>] ? wake_bit_function+0x0/0x50
> > > > [1204800.908229]  [<c01da1f4>] filemap_fdatawait_range+0xd4/0x150
> > > > [1204800.908239]  [<c01da3c7>] filemap_write_and_wait_range+0x77/0x80
> > > > [1204800.908248]  [<c023aad4>] vfs_fsync_range+0x54/0x80
> > > > [1204800.908257]  [<c023ab5e>] generic_write_sync+0x5e/0x80
> > > > [1204800.908265]  [<c01dbda1>] generic_file_aio_write+0xa1/0xc0
> > > > [1204800.908292]  [<fb0bc94f>] nfs_file_write+0x9f/0x200 [nfs]
> > > > [1204800.908303]  [<c0218454>] do_sync_write+0xa4/0xe0
> > > > [1204800.908314]  [<c032e626>] ? apparmor_file_permission+0x16/0x20
> > > > [1204800.908324]  [<c0302a74>] ? security_file_permission+0x14/0x20
> > > > [1204800.908333]  [<c02185d2>] ? rw_verify_area+0x62/0xd0
> > > > [1204800.908342]  [<c02186e2>] vfs_write+0xa2/0x190
> > > > [1204800.908350]  [<c02183b0>] ? do_sync_write+0x0/0xe0
> > > > [1204800.908359]  [<c0218fa2>] sys_write+0x42/0x70
> > > > [1204800.908367]  [<c05c90a4>] syscall_call+0x7/0xb
> > > >
> > > > This might suggest a hardware fault on the Myth Backend host (like the
> > > > NIC) but I don't believe that to be the case because I've seen the same
> > > > issue on other clients.  I suspect that they are much more rare because
> > > > the data volume on those clients pales in comparison to the Myth Backend
> > > > > process (virtual guests, etc - light work - months between failures,
> > > > > doesn't feel time related).
> > > >
> > > > The only cure is a hard reset (of the host with the NFS client) as any FS
> > > > operation on that share hangs - including df, ls, sync and umount - so the
> > > > system fails to shutdown.
> > > >
> > > > The kernel on the Myth Backend host isn't new ..
> > > >
> > > > >> uname -a;
> > > > > Linux jupiter 2.6.35-22-generic #33-Ubuntu SMP Sun Sep 19 20:34:50 UTC 2010 i686 GNU/Linux
> > > >
> > > > Is there a known good/bad version for the kernel/NFS client?  Am I under
> > > > that bar?
> > > >
> > > >
> > > > > The GlusterFS NFS server is an embedded platform (Saturn) that has been
> > > > running for 74 days;
> > > >
> > > > >> uptime output;
> > > > 08:39:07 up 74 days, 22:16,  load average: 0.87, 0.94, 0.94
> > > >
> > > > It is a much more modern platform;
> > > >
> > > > >> uname -a;
> > > > Linux (none) 3.2.14 #1 SMP Tue Apr 10 12:46:47 EST 2012 i686 GNU/Linux
> > > >
> > > > It has had one error in all of that time;
> > > > >> dmesg output;
> > > > Pid: 4845, comm: glusterfsd Not tainted 3.2.14 #1
> > > > Call Trace:
> > > >  [<c10512d0>] __rcu_pending+0x64/0x294
> > > >  [<c1051640>] rcu_check_callbacks+0x87/0x98
> > > >  [<c1034521>] update_process_times+0x2d/0x58
> > > >  [<c1047bdf>] tick_periodic+0x63/0x65
> > > >  [<c1047c2d>] tick_handle_periodic+0x17/0x5e
> > > >  [<c1015ae9>] smp_apic_timer_interrupt+0x67/0x7a
> > > >  [<c1b2a691>] apic_timer_interrupt+0x31/0x40
> > > >
> > > > .. this occurred months ago.
> > > >
> > > > Unfortunately due to its embedded nature, there are no logs coming from
> > > > this platform, only a looped buffer for syslog (and gluster doesn't seem to
> > > > syslog).  In previous discussions here (months ago) you'll see where I was
> > > > working to disable/remove logging from GlusterFS so that I could keep it
> > > > alive in an embedded environment - this is the current run configuration.
> > > >
> > > > The Myth Backend host only mounts one of the two NFS shares, but I've seen
> > > > > the fault on the hosts that only mount the other - so I'm reluctant to
> > > > > believe that it's a hardware failure at the Drive level on the Saturn /
> > > > > Gluster server.
> > > >
> > > > The /etc/fstab entry for this share, on the Myth Backend host, is;
> > > >
> > > > >   saturn:/recordings /var/lib/mythtv/saturn_recordings nfs nfsvers=3,rw,rsize=8192,wsize=8192,hard,intr,sync,dirsync,noac,noatime,nodev,nosuid 0  0
> > > >
> > > > When I softened this to async with soft failures (a config taken straight
> > > > from the Gluster site/FAQ) it crashed out in a much shorter time-frame
> > > > (less than a day, one test only - may have been unlucky);
> > > >
> > > > >   saturn:/recordings /var/lib/mythtv/saturn_recordings nfs defaults,_netdev,nfsvers=3,proto=tcp 0  0
> > > >
> > > >
> > > > Other than the high use Myth Backend host I've failed to accurately nail
> > > > > down the trigger for this issue - which is making diagnostics painful (I
> > > > > like my TV too much to do more than reboot the failed box - and heaven forbid
> > > > > the dad that fails to record Peppa Pig!).
> > > >
> > > >
> > > > Any thoughts?  Beyond enabling logs on the Saturn side ...
> > > >
> > > > Is it possible this is a bug that was reaped in later versions of Gluster?
> > > >
> > > > Appreciate being set straight ..
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > Cheers,
> > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Ian Latter
> > > > Late night coder ..
> > > > http://midnightcode.org/
> > > >
> > > > _______________________________________________
> > > > Gluster-devel mailing list
> > > > Gluster-devel@xxxxxxxxxx
> > > > https://lists.nongnu.org/mailman/listinfo/gluster-devel
> > > >
> > > 
> > 
> > 
> > 
> 
> 
> 


--
Ian Latter
Late night coder ..
http://midnightcode.org/


