Re: Stuck in md_write_start because MD_SB_CHANGE_PENDING can't be cleared

On 09/05/2017 09:36 AM, NeilBrown wrote:
On Mon, Sep 04 2017, Xiao Ni wrote:


In function handle_stripe:
        if (s.handle_bad_blocks ||
            test_bit(MD_SB_CHANGE_PENDING, &conf->mddev->sb_flags)) {
                set_bit(STRIPE_HANDLE, &sh->state);
                goto finish;
        }

Because MD_SB_CHANGE_PENDING is set, the stripes can't be handled.
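
To illustrate the effect of that check, here is a minimal userspace sketch. It is
not kernel code: SB_CHANGE_PENDING, STRIPE_NEEDS_HANDLING, struct model_stripe and
model_handle_stripe() are hypothetical stand-ins for MD_SB_CHANGE_PENDING,
STRIPE_HANDLE, the stripe head and handle_stripe(). It only shows that while the
pending flag is set, every pass re-queues the stripe and makes no progress.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel flag bits; not the real definitions. */
#define SB_CHANGE_PENDING     (1UL << 0)  /* models MD_SB_CHANGE_PENDING */
#define STRIPE_NEEDS_HANDLING (1UL << 0)  /* models STRIPE_HANDLE */

/* Hypothetical, heavily reduced model of a stripe. */
struct model_stripe {
        unsigned long state;   /* per-stripe flag word */
        bool completed;        /* set once the stripe has been fully handled */
};

/*
 * Models only the early-exit check quoted above: if a superblock change is
 * pending, mark the stripe for another pass and do nothing else.
 * Returns true if the stripe made progress, false if it was re-queued.
 */
static bool model_handle_stripe(struct model_stripe *sh, unsigned long sb_flags)
{
        assert(sh != NULL);

        if (sb_flags & SB_CHANGE_PENDING) {
                sh->state |= STRIPE_NEEDS_HANDLING;  /* try again later */
                return false;
        }

        sh->state &= ~STRIPE_NEEDS_HANDLING;
        sh->completed = true;
        return true;
}
```

In this model a stripe can only complete once something else clears the pending
flag, which in the real driver means the metadata write has to happen first.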

Right, of course.  I see what is happening now.

- raid5d cannot complete stripes until the metadata is written
- the metadata cannot be written until raid5d gets the mddev_lock
- mddev_lock is held by the write to suspend_hi
- the write to suspend_hi is waiting for raid5_quiesce
- raid5_quiesce is waiting for some stripes to complete.
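
The five steps above form a closed wait-for chain. As a rough sketch (plain
userspace C, with hypothetical enum labels that do not exist in the kernel), the
chain can be written as a table where each party waits on exactly one other, and
walking the table from any entry leads back to the start:

```c
#include <assert.h>

/* Hypothetical labels for the five parties in the list above. */
enum waiter {
        STRIPE_COMPLETION,  /* raid5d completing stripes */
        METADATA_WRITE,     /* the superblock (metadata) write */
        MDDEV_LOCK,         /* raid5d obtaining mddev_lock */
        SUSPEND_HI_WRITE,   /* the write to suspend_hi, holding mddev_lock */
        RAID5_QUIESCE,      /* raid5_quiesce waiting for stripes */
        NR_WAITERS
};

/* waits_on[x] is the one thing x is blocked on, following the list order. */
static const enum waiter waits_on[NR_WAITERS] = {
        [STRIPE_COMPLETION] = METADATA_WRITE,
        [METADATA_WRITE]    = MDDEV_LOCK,
        [MDDEV_LOCK]        = SUSPEND_HI_WRITE,
        [SUSPEND_HI_WRITE]  = RAID5_QUIESCE,
        [RAID5_QUIESCE]     = STRIPE_COMPLETION,
};

/*
 * Follow the wait-for edges from 'start'. Returns the number of steps needed
 * to get back to 'start', or 0 if that does not happen within NR_WAITERS
 * steps (i.e. 'start' is not part of a cycle).
 */
static int cycle_length(enum waiter start)
{
        enum waiter cur = start;
        int steps;

        assert((int)start < NR_WAITERS);

        for (steps = 1; steps <= NR_WAITERS; steps++) {
                cur = waits_on[cur];
                if (cur == start)
                        return steps;
        }
        return 0;
}
```

A non-zero result for every starting point means no party can make progress
unless one of the edges is removed, for example by not holding the lock across
->quiesce(, 1) or by doing the metadata update before waiting.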

We could declare that ->quiesce(, 1) cannot be called while holding the
lock.
We could possibly allow it but only if md_update_sb() is called first,
though that might still be racy.

->quiesce(, 1) is currently called from:
  mddev_suspend
  suspend_lo_store
  suspend_hi_store
  __md_stop_writes
  mddev_detach
  set_bitmap_file
  update_array_info (when setting/removing internal bitmap)
  md_do_sync

and most of those are called with the lock held, or take the lock.

Maybe we should *require* that mddev_lock is held when calling
->quiesce() and have ->quiesce() do the metadata update.

Something like the following maybe.  Can you test it?

Hi Neil

Thanks for the analysis. I need to think about it for a while :)
I have already applied the patch and the test is running now. It usually
takes more than 5 hours to reproduce this problem, so I'll let it run for
more than 24 hours and report the test result later.

Regards
Xiao