On Thu, Dec 08 2016, Jinpu Wang wrote:

> On Tue, Nov 29, 2016 at 12:15 PM, Jinpu Wang
> <jinpu.wang@xxxxxxxxxxxxxxxx> wrote:
>> On Mon, Nov 28, 2016 at 10:10 AM, Coly Li <colyli@xxxxxxx> wrote:
>>> On 2016/11/28 5:02 PM, Jinpu Wang wrote:
>>>> On Mon, Nov 28, 2016 at 9:54 AM, Coly Li <colyli@xxxxxxx> wrote:
>>>>> On 2016/11/28 4:24 PM, Jinpu Wang wrote:
>>>>>> snip
>>>>>>>>>
>>>>>>>>> every time nr_pending is 1 bigger than (nr_queued + 1), so it seems we
>>>>>>>>> forgot to increase nr_queued somewhere?
>>>>>>>>>
>>>>>>>>> I've noticed (commit ccfc7bf1f09d61) "raid1: include bio_end_io_list in
>>>>>>>>> nr_queued to prevent freeze_array hang". Seems it fixed a similar bug.
>>>>>>>>>
>>>>>>>>> Could you give your suggestion?
>>>>>>>>>
>>>>>>>> Sorry, forgot to mention kernel version is 4.4.28
>
> I continued to debug the bug:
>
> 20161207
> nr_pending = 948,
> nr_waiting = 9,
> nr_queued = 946, // again we need one more to finish wait_event.
> barrier = 0,
> array_frozen = 1,
> on conf->bio_end_io_list we have 91 entries.
> on conf->retry_list we have 855

This is useful. It confirms that nr_queued is correct, and that
nr_pending is consistently 1 higher than expected. This suggests that a
request has been counted in nr_pending but has not yet been submitted,
or has been taken off one of the queues but has not yet been processed.

I notice that in your first email the blocked tasks listed included
raid1d, which is blocked in freeze_array(), and a few others in
make_request() blocked on wait_barrier(). In that case nr_waiting was
100, so there should have been 100 threads blocked in wait_barrier().
Is that correct? I assume you thought it was pointless to list them
all, which seems reasonable.

I ask because I wonder if there might have been one thread in
make_request() which was blocked on something else. There are a couple
of places where make_request() will wait after having successfully
called wait_barrier(). If that happened, it would cause exactly the
symptoms you report.
Could you check all blocked threads carefully, please?

There are other ways that nr_pending and nr_queued can get out of sync,
though I think they would result in nr_pending being less than
nr_queued, not more. If the presence of a bad block in the bad block
log causes a request to be split into two r1bios, and if both of those
end up on one of the queues, then they would be added to nr_queued
twice, but to nr_pending only once. We should fix that.

> > list -H 0xffff8800b96acac0 r1bio.retry_list -s r1bio
> > ffff8800b9791ff8
> struct r1bio {
>   remaining = {
>     counter = 0
>   },
>   behind_remaining = {
>     counter = 0
>   },
>   sector = 18446612141670676480,            // corrupted?
>   start_next_window = 18446612141565972992, // ditto

I don't think this is corruption.

> crash> struct r1conf 0xffff8800b9792000
> struct r1conf {
....
>   retry_list = {
>     next = 0xffff8800afe690c0,
>     prev = 0xffff8800b96acac0
>   },

The pointer you started at was at the end of the list. So this r1bio
structure you are seeing is not an r1bio at all, but memory out of the
middle of the r1conf being interpreted as an r1bio. You can confirm
this by noticing that the retry_list in the r1bio:

> retry_list = {
>   next = 0xffff8800afe690c0,
>   prev = 0xffff8800b96acac0
> },

is exactly the same as the retry_list in the r1conf.

NeilBrown