After a period of a couple of weeks with one of our test instances having
this problem every other day, they were all nice enough to operate without
an issue for 9 days. It finally reoccurred last night on one of the
machines. It exhibits the same symptoms, and the call traces look as they
did previously. This particular instance is configured with the deadline
scheduler.

I was able to capture the inflight stats you requested:

$ cat /sys/block/xvd[abcde]/inflight
       0        0
       0        0
       0        0
       0        0
       0        0

I’ve had this happen on instances with the deadline scheduler and the noop
scheduler. So far, I have not had it happen on an instance that uses noop
and has the raid filesystem (ext4) mounted with nobarrier. However, the
noop/nobarrier instances have not been running long enough for me to
conclude that this combination works around the problem. Frankly, I’m not
sure I understand the interaction between ext4 barriers and raid0 block
flushes well enough to theorize whether it should or shouldn’t make a
difference.

Does any of this help with identifying the bug? Is there any more
information I can get that would be useful?

I have some systems that need to be moved into production in the next
couple of weeks that are having this problem. Do you have any ideas how I
might be able to work around it?

Thanx for the time,
LES


> On Feb 17, 2017, at 3:40 PM, Les Stroud <les@xxxxxxxxxxxxx> wrote:
>
> It’ll take a day or two for it to happen again. When it does, I’ll pull
> the inflight stats. Anything else I should grab while I’m at it?
>
> Thanx,
> LES
>
>
>> On Feb 17, 2017, at 3:06 PM, Shaohua Li <shli@xxxxxxxxxx> wrote:
>>
>> On Fri, Feb 17, 2017 at 02:05:49PM -0500, Les Stroud wrote:
>>>
>>> I have a problem with processes entering an uninterruptible sleep
>>> state in md_flush_request and never returning. I’m having trouble
>>> identifying the underlying issue, and I’m hoping someone on here may
>>> be able to help.
>>>
>>> The servers in question are running in aws (xen hvm) with kernel
>>> 3.8.13-118.16.2.el6uek.x86_64. The server has two mounts. The first
>>> is vanilla ext4. The second is a software RAID0 array, striped with
>>> 256k chunks, built with md. Its filesystem is ext4.
>>>
>>> The most immediate and obvious symptom of the issue is the kernel
>>> error “kernel: INFO task [some process]: blocked for more than 120
>>> seconds.”. Shortly thereafter, other processes start entering the
>>> same uninterruptible wait state (D). This almost always impacts ssh
>>> logins.
>>>
>>> The problem does not occur when the system is under load, or was
>>> recently under load. It happens when the system is quiet (no cpu,
>>> very little io).
>>
>> This seems to suggest we have a missed blk-plug flush in a light
>> workload. Can you check the output of /sys/block/<disk-name>/inflight
>> for both md and the underlying disks? That will let us know if there
>> is IO pending. Also, it would be great if you could test an upstream
>> kernel.
>>
>> Thanks,
>> Shaohua
>
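
Shaohua asked for the inflight counts of both the md device and the
underlying disks, and the capture above only covers the member disks. A
minimal sketch for grabbing both the next time the hang occurs, assuming
the array is /dev/md0 and the members are xvda through xvde (adjust the
names to match the actual layout):

  # Pending IO per device; the two columns are reads and writes in flight.
  # md0 and xvd[abcde] are assumed device names, not confirmed above.
  for dev in md0 xvda xvdb xvdc xvdd xvde; do
      printf '%-5s ' "$dev"
      cat /sys/block/"$dev"/inflight
  done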
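
On the "anything else I should grab" question, one generally useful
capture while the hang is in progress is a blocked-task dump. This is
standard sysrq usage rather than anything specific to this bug; it needs
root and sysrq enabled:

  # Dump stack traces of all uninterruptible (D-state) tasks to the
  # kernel log, then read them back out:
  echo w > /proc/sysrq-trigger
  dmesg | tail -n 200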
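
For anyone wanting to try the same noop/nobarrier configuration described
above, a sketch of one way to set it; the member device names and the
mount point are assumptions:

  # Switch each member disk of the array to the noop elevator:
  for dev in xvdb xvdc xvdd xvde; do
      echo noop > /sys/block/"$dev"/queue/scheduler
  done
  # Remount the RAID0 filesystem without write barriers. nobarrier stops
  # ext4 from issuing cache flushes, which is presumably why it might
  # sidestep a hang in md_flush_request, at the cost of crash safety.
  mount -o remount,nobarrier /mnt/raid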
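
For reference, a RAID0 array as described in the original report (built
with md, 256k chunks) could be created along these lines; only the RAID
level and chunk size come from the report, while the device names and
member count are assumptions:

  mdadm --create /dev/md0 --level=0 --chunk=256 --raid-devices=4 \
        /dev/xvdb /dev/xvdc /dev/xvdd /dev/xvde
  mkfs.ext4 /dev/md0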