Re: processes hanging in state D when reading from nfs

Rüdiger Meier <sweet_f_a@...> writes:

> 
> On Tuesday 20 September 2011, J. Bruce Fields wrote:
> > On Sat, Aug 27, 2011 at 09:22:53PM +0200, Rüdiger Meier wrote:
> > > I've got an annoying problem with my nfs4 clients.
> > > Lately I see many processes hanging in state D when reading from
> > > nfs mount. Sometimes they can be killed sometimes not.
> >
> > Is this still happening?
> 
> Yes, although we've managed to avoid the "dangerous" things.
> Sometimes we also have problems like those in the other current thread
> "Writing / Locking problem with NFSv4".
> 

For what it's worth: we have been seeing very similar behavior on our openSUSE
11.3 (x86_64, 2.6.34.10-0.2) systems. One difference is that we are using
NFSv3 for these mounts, not NFSv4.

I was able to get some traces via sysrq, but no network captures: the hangs
are intermittent and we cannot predict when or where they will occur. These
are heavily loaded systems, doing lots of compute and IO.

[3754730.533669] R             D ffffffff810dc3e0     0 22621      1 0x00000004
[3754730.533671]  ffff88165f993cb8 0000000000000086 ffff881037174600 ffffffffa0332bbd
[3754730.533673]  0000000000013e80 0000000000013e80 ffff88165f993fd8 0000000000013e80
[3754730.533675]  ffff88165f993fd8 ffff881e5cd521c0 0000000000013e80 0000000000013e80
[3754730.533676] Call Trace:
[3754730.533678]  [<ffffffff8145004e>] io_schedule+0x6e/0xb0
[3754730.533681]  [<ffffffff810dc418>] sync_page+0x38/0x50
[3754730.533683]  [<ffffffff814505da>] __wait_on_bit_lock+0x4a/0xb0
[3754730.533685]  [<ffffffff810dc3be>] __lock_page+0x5e/0x70
[3754730.533687]  [<ffffffff810dd2f8>] filemap_fault+0x2f8/0x410
[3754730.533690]  [<ffffffff810f7c12>] __do_fault+0x52/0x4f0
[3754730.533692]  [<ffffffff810fbf82>] handle_mm_fault+0x1b2/0xbd0
[3754730.533694]  [<ffffffff81455799>] do_page_fault+0x169/0x3a0
[3754730.533697]  [<ffffffff8145271f>] page_fault+0x1f/0x30
[3754730.533699]  [<00007f79e2486ce0>] 0x7f79e2486ce0

This is pretty representative of the processes in state D.  Does this help, or
are there too many differences from the original report?
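
In case it is useful for catching these hangs as they occur, here is a minimal
sketch of a script that lists tasks currently in state D by reading
/proc/<pid>/stat, and prints /proc/<pid>/stack where the kernel exposes it.
This is not what produced the trace above (that came from sysrq). The function
names are my own, and reading the stack file is assumed to require root and a
kernel built with stack trace support.

```python
import os


def parse_stat(text):
    """Split a /proc/<pid>/stat line into (pid, comm, state)."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so locate the first '(' and the last ')' instead of splitting on spaces.
    lparen = text.index("(")
    rparen = text.rindex(")")
    pid = int(text[:lparen].strip())
    comm = text[lparen + 1:rparen]
    state = text[rparen + 1:].split()[0]
    return pid, comm, state


def read_text(path):
    """Return the file contents, or None if it is missing or unreadable."""
    try:
        with open(path, errors="replace") as f:
            return f.read()
    except OSError:
        return None


def d_state_tasks(proc_root="/proc"):
    """Return a list of (pid, comm, stack) for tasks in state D."""
    tasks = []
    if not os.path.isdir(proc_root):
        return tasks
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        # A task can exit between listdir() and open(); skip it if so.
        stat = read_text(os.path.join(proc_root, entry, "stat"))
        if not stat:
            continue
        try:
            pid, comm, state = parse_stat(stat)
        except (ValueError, IndexError):
            continue
        if state == "D":
            # stack is None when the file is absent or we lack permission.
            stack = read_text(os.path.join(proc_root, entry, "stack"))
            tasks.append((pid, comm, stack))
    return tasks


if __name__ == "__main__":
    for pid, comm, stack in d_state_tasks():
        print(f"{pid} {comm}")
        print(stack or "  (stack not available)")
```

Run from cron or a loop on the affected clients, this would at least record
which tasks are stuck and when, which might help correlate with server-side
logs.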

Thanks

Michael


