On Tue, 2010-05-25 at 14:58 +0200, Lukas Hejtmanek wrote:
> Hi,
>
> On Tue, May 25, 2010 at 08:28:40AM -0400, Trond Myklebust wrote:
> > > Seems like pretty fundamental problem in nfs :-(. Limiting writeback
> > > caches for nfs, so that system has enough memory to perform rpc calls
> > > with the rest might do the trick, but...
> > >
> >
> > It's the same problem that you have for any file or storage system that
> > has initiators in userland. On the storage side, iSCSI in particular has
> > the same problem. On the filesystem side, CIFS, AFS, coda, .... do too.
> > The clustered filesystems can deadlock if the node that is running the
> > DLM runs out of memory...
> >
> > A few years ago there were several people proposing various solutions
> > for allowing these daemons to run in a protected memory environment to
> > avoid deadlocks, but those efforts have since petered out. Perhaps it is
> > time to review the problem?
>
> I saw some patches targeting 2.6.35 that should prevent some deadlocks. They
> seem to be not enough in some cases. rpc.* daemons should be mlocked for sure
> but there is a problem with libkrb that reads files using fread(). fread() uses
> anonymous mmap, under mlockall(MCL_FUTURE) this causes the anonymous map to be
> mapped instantly and it deadlocks.
>
> IBM GPFS also uses userspace daemon, but it seems that the daemon is mlocked
> and it does not open any files and does not create new connections.

Doesn't matter. Just writing to a socket or pipe may trigger a kernel
memory allocation which can result in an attempt to reclaim memory.

Furthermore, there is the issue of what to do when you really are OOM,
and the kernel cannot allocate more memory for you without reclaiming
it.

The schemes I'm talking about typically had special memory pools
preallocated for use by daemons, and would label the daemons using some
equivalent of the PF_MEMALLOC flag to prevent recursion into the
filesystem.
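
To make the point about mlocked daemons concrete, here is a minimal userspace sketch in C. The helper names lock_daemon_memory() and send_through_pipe() are hypothetical and exist only for illustration; they are not taken from any of the daemons discussed. The first helper pins the process's pages with mlockall(), which is roughly what "mlocked" means above. The second writes to a pipe, an operation whose buffers the kernel allocates on the daemon's behalf, so locking the daemon's own pages does not cover it.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: pin all current and future pages of the calling
 * process in RAM. Returns 0 on success, -1 on failure (for example when
 * RLIMIT_MEMLOCK is too low or the process lacks the privilege).
 */
int lock_daemon_memory(void)
{
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
        fprintf(stderr, "mlockall failed: %s\n", strerror(errno));
        return -1;
    }
    return 0;
}

/*
 * Hypothetical helper: push a message through a pipe and read it back.
 * The write() copies data into pipe buffers that the kernel allocates,
 * so this path can still require a kernel memory allocation even when
 * every userspace page of the process is locked.
 * Returns the number of bytes read back, or -1 on error.
 */
ssize_t send_through_pipe(const char *msg, char *out, size_t outlen)
{
    int fds[2];
    size_t len = strlen(msg);
    ssize_t n;

    if (pipe(fds) != 0)
        return -1;
    if (len > outlen)
        len = outlen;

    n = write(fds[1], msg, len);            /* kernel-side buffer needed here */
    if (n > 0)
        n = read(fds[0], out, (size_t)n);   /* drain what was written */

    close(fds[0]);
    close(fds[1]);
    return n;
}
```

A daemon would typically call lock_daemon_memory() once at startup and treat a failure as a configuration problem. Nothing in this sketch reserves kernel memory for the write() path; that is the gap the preallocated pools and the PF_MEMALLOC-style labelling were meant to close.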