> On Apr 6, 2023, at 11:54 AM, Christian Herzog <herzog@xxxxxxxxxxxx> wrote:
>
> Dear Chuck,
>
>>> That was our first idea too, but we haven't found any indication that this
>>> is the case. The xfs file systems seem perfectly fine when all nfsds are in
>>> D state, and we can read from them and write to them. If xfs were to block
>>> nfs IO, this should affect other processes too, right?
>>
>> It's possible that the NFSD threads are waiting on I/O to a particular
>> filesystem block. XFS is not likely to block other activity in this case.
> ok good to know. So far we were under the impression that a file system would
> block as a whole.

XFS tries to operate in parallel as much as it can. Maybe other
filesystems aren't as capable.

If the unresponsive block is part of a superblock or the journal
(ie, shared metadata) I would expect XFS to become unresponsive.
For I/O on blocks containing file data, it is likely to have
more robust behavior.

>> I'm merely suggesting that you should start troubleshooting at the bottom
>> of the stack instead of the top. The wait is far outside the realm of NFSD.
> thanks, point taken. So next time it happens we'll make sure to poke in this
> direction during the few minutes we have for debugging before we get tarred
> and feathered by the users.

I encourage you to discuss debugging tactics with Jens and the
block folks -- you can probably capture a lot of info during
those few minutes if you have some expert guidance.

Good luck!

--
Chuck Lever
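
As one possible way to use those few minutes, here is a minimal sketch (not something prescribed in this thread) that lists tasks in D state whose name starts with "nfsd" and dumps their kernel stacks, which may help show where below NFSD the threads are waiting. It assumes a Linux /proc, root privileges, and a kernel that exposes /proc/<pid>/stack. The helper names parse_stat, read_text and blocked_tasks are hypothetical, chosen only for this illustration.

```python
import os


def parse_stat(text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split around the first '(' and the last ')'.
    lpar = text.index("(")
    rpar = text.rindex(")")
    pid = int(text[:lpar].strip())
    comm = text[lpar + 1:rpar]
    state = text[rpar + 1:].split()[0]
    return pid, comm, state


def read_text(path):
    """Read a small proc file; return '' if it is missing or unreadable."""
    try:
        with open(path) as f:
            return f.read()
    except OSError:
        return ""


def blocked_tasks(proc_root="/proc", name_prefix="nfsd"):
    """List (pid, comm, stack) for tasks in D state matching name_prefix."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-PID entries such as 'self' or 'meminfo'
        stat = read_text(os.path.join(proc_root, entry, "stat"))
        if not stat:
            continue  # task exited between listdir() and the read
        try:
            pid, comm, state = parse_stat(stat)
        except (ValueError, IndexError):
            continue  # malformed or truncated stat contents
        if state == "D" and comm.startswith(name_prefix):
            # Assumption: /proc/<pid>/stack exists and is readable (root only).
            stack = read_text(os.path.join(proc_root, entry, "stack"))
            found.append((pid, comm, stack))
    return sorted(found)


if __name__ == "__main__":
    for pid, comm, stack in blocked_tasks():
        print(f"--- {comm} (pid {pid}) ---")
        print(stack or "(stack unavailable)")
```

Run as root while the hang is in progress and save the output; the stack frames are the kind of detail that can be passed along when asking for guidance on the block layer.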