Re: Processes stuck in D state when accessing XFSv5 filesystem

On Apr 10, 2017, at 11:23 AM, Brian Foster <bfoster@xxxxxxxxxx> wrote:
> 
> Interesting, nothing obvious stands out to me. It looks like you have smbd
> waiting on a pread over fuse and mmon waiting on a stat. Presumably the
> smbd pread corresponds to the blocked servxfs pread and the stat to one
> of the servxfs getxattr calls.
> 
> Any idea where the other servxfs getxattr comes in?

servxfs stores some internal metadata in xattrs, so occasional xattr calls from it are expected.

>> I will get the xfs traces the next time it happens.  It seems to be happening 2-3 times a week, but frustratingly, I can't make it happen on demand. Is there something I should look for in particular in the trace output, or some amount of time to capture it for?
>> 
> 
> Can you elaborate on the resulting behavior? Are these same processes
> always involved? Can you identify whether they are attempting to access
> the same file(s) or not? Also, does the underlying filesystem continue
> to function outside of these processes seemingly all blocked on reads?
> E.g., can you read another file from the XFS fs or is all I/O blocked?
> 
> Because there are multiple layers involved here, presumably with custom
> code in between (e.g., your fuse userspace), this might be easier to
> reason about if you can dig more into what's blocked in the upper layers
> to describe precisely what high level requests are active at the XFS
> level.

I agree that taking the fuse userspace code out of the equation would simplify things.  There are cases that don't go through fuse: for example, a postgres daemon running directly on the XFS filesystem has occasionally hit the same issue (meaning it goes into D state and stays there "forever", or at least for 30 minutes, until the system reboots).

Unfortunately, I'm now having a hard time reproducing the problem.  It used to happen 2-3 times a week, but lately it hasn't.  On the one hand that's great, but on the other it makes this hard to debug.
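
Since the hang can't be triggered on demand, one option is to leave a small watcher around that records which tasks are in D state the next time it happens. The following is only a rough sketch, not something that has been run on the affected box. It assumes the usual Linux /proc layout, the function names (parse_stat_line, find_d_state) are made up for illustration, and reading /proc/<pid>/stack generally needs root and a kernel built with stack trace support. It only looks at thread-group leaders; individual threads would need /proc/<pid>/task/<tid>/stat as well.

```python
import os


def parse_stat_line(line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    lparen = line.index("(")
    rparen = line.rindex(")")
    pid = int(line[:lparen].strip())
    comm = line[lparen + 1:rparen]
    state = line[rparen + 1:].split()[0]
    return pid, comm, state


def find_d_state(proc_root="/proc"):
    """List (pid, comm) for processes currently in uninterruptible sleep."""
    blocked = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat_line(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited in the meantime, or stat was unreadable
        if state == "D":
            blocked.append((pid, comm))
    return blocked


if __name__ == "__main__":
    for pid, comm in find_d_state():
        print(pid, comm)
        try:
            # Kernel stack of the blocked task; usually readable by root only.
            with open(f"/proc/{pid}/stack") as f:
                print(f.read())
        except OSError as e:
            print(f"  (could not read stack: {e})")
```

Run from cron or a loop every minute or so with the output appended to a log, this would at least show whether the same processes are involved each time and where in the kernel they are sitting.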

I will update as soon as I have something new.

David
