On Aug 4, 2010, at 10:45 AM, Jim Rees wrote:
Andy Adamson wrote:
When a user with a Kerberos token of 2048 bytes or larger attempts to
access a filesystem mounted using Kerberized NFS, the NFS server locks
up for 30 seconds, and ultimately the call fails.
Yes, this limitation has been known for a long time. We ran into this
same issue using X.509 certs and spkm3. I imagine PKINIT will also hit
this limitation.
But shouldn't it fail right away instead of locking up for 30 seconds?
It seems to me that it should error out with a log message, rather
than simply trying over and over again.
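
As a rough illustration of the fail-fast behaviour asked about here, the
sketch below rejects an oversized token once, with a log message, instead
of retrying. It is a userspace model only, not the sunrpc code. The names
MAX_TOKEN_LEN, TokenTooLargeError and check_token_size are hypothetical,
and the 2048-byte threshold is simply the figure from the report above.

```python
import logging

# Assumption: the threshold is the 2048-byte figure from the report above;
# the real limit and where it is enforced in the kernel are not shown here.
MAX_TOKEN_LEN = 2048  # hypothetical name

log = logging.getLogger("gss_init_sketch")  # hypothetical logger name


class TokenTooLargeError(Exception):
    """Raised when an init token is at or above the assumed size limit."""


def check_token_size(token: bytes) -> None:
    """Reject an oversized token once, with a log message, and never retry."""
    if len(token) >= MAX_TOKEN_LEN:
        log.warning(
            "GSS init token of %d bytes is not below the %d-byte limit; rejecting",
            len(token),
            MAX_TOKEN_LEN,
        )
        raise TokenTooLargeError(len(token))
```

A caller would run this check before handing the token on, and turn the
exception into an immediate error reply.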
Does the entire server lock up, or just that one rpc?
The concrete manifestation of this is that all of the NFS kernel
processes run continuously. So on a single-processor system, it takes
100% of the CPU for those 30 seconds. On a multiprocessor system (at
least my RHEL system), the NFS kernel processes keep affinity with a
CPU, so it just consumes one processor. I have not tested whether other
NFS requests can be processed during that window on a multiprocessor
system. It does not really "lock up", but rather monopolizes the CPU
with high-priority kernel threads.
Related to this, it was a real pain for me to debug, since setting any
of the rpcdebug flags for the rpc module simply overloaded the logging
subsystem. I had to put an ssleep() in svcauth_gss_handle_init() in
order to get debugging output I could use from rpcdebug.
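
The ssleep() above slows the loop itself. A different way to keep a tight
retry loop from flooding a log is to rate-limit the messages. The sketch
below shows that alternative in userspace Python. It is not what was done
here, and the ThrottledLog class and its parameters are hypothetical.

```python
import time


class ThrottledLog:
    """Pass at most one message per `min_interval` seconds; count the rest."""

    def __init__(self, min_interval, clock=time.monotonic):
        self.min_interval = min_interval
        self.clock = clock          # injectable so the behaviour can be tested
        self.last_emit = None       # time of the last message that got through
        self.suppressed = 0         # messages dropped since then

    def emit(self, message):
        """Print the message if the interval has elapsed; return True if printed."""
        now = self.clock()
        if self.last_emit is not None and now - self.last_emit < self.min_interval:
            self.suppressed += 1
            return False
        if self.suppressed:
            message = f"{message} ({self.suppressed} similar messages suppressed)"
            self.suppressed = 0
        self.last_emit = now
        print(message)
        return True
```

Calling emit() from inside the hot loop then yields roughly one line per
interval, plus a count of what was dropped, rather than one line per
iteration.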
Can a malicious client use this as a DoS?
Yes.
Does it require a valid ticket, or will any ticket >= 2048 bytes do?
I believe that all of the validity checking of the token is done in
the upcall to rpc.svcgssd, not in the sunrpc kernel code. I am a kernel
newbie though, so I am not sure.