On 12/9/21 12:35 AM, Heinz Mauelshagen wrote:
NACK, see details below.
On Wed, Dec 8, 2021 at 3:24 PM Guoqing Jiang <guoqing.jiang@xxxxxxxxx> wrote:
On 12/1/21 1:27 AM, Paul Menzel wrote:
>
>>>>>>> diff --git a/drivers/md/dm-raid.c b/drivers/md/dm-raid.c
>>>>>>> index cab12b2..0c4cbba 100644
>>>>>>> --- a/drivers/md/dm-raid.c
>>>>>>> +++ b/drivers/md/dm-raid.c
>>>>>>> @@ -3668,7 +3668,7 @@ static int raid_message(struct dm_target *ti, unsigned int argc, char **argv,
>>>>>>> if (!strcasecmp(argv[0], "idle") || !strcasecmp(argv[0], "frozen")) {
>>>>>>> if (mddev->sync_thread) {
>>>>>>> set_bit(MD_RECOVERY_INTR, &mddev->recovery);
>>>>>>> - md_reap_sync_thread(mddev);
>>>>>>> + md_reap_sync_thread(mddev, false);
>>>>>
>>>>> I think we can add mddev_lock() and mddev_unlock() here, and then
>>>>> we don't need the extra parameter?
>>>>
>>>> I thought so too, but I would prefer to get input from the DM people
>>>> first.
>>>>
>>>> @ Mike or Alasdair
>>>
>>> Hi Mike and Alasdair,
>>>
>>> Could you please comment on this option: adding mddev_lock() and
>>> mddev_unlock() to raid_message() around md_reap_sync_thread()?
Adding Heinz and Jonathan: could you comment on this? Thanks.
>>
>> The issue is unfortunately still unresolved (at least as of Linux 5.10.82).
>> How can we move forward?
If changing dm-raid is not an option, an alternative could look
like this:
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -9409,8 +9409,12 @@ void md_reap_sync_thread(struct mddev *mddev)
sector_t old_dev_sectors = mddev->dev_sectors;
bool is_reshaped = false;
+ if (mddev_is_locked(mddev))
+ mddev_unlock(mddev);
/* resync has finished, collect result */
md_unregister_thread(&mddev->sync_thread);
+ if (mddev_is_locked(mddev))
+ mddev_lock(mddev);
if (!test_bit(MD_RECOVERY_INTR, &mddev->recovery) &&
!test_bit(MD_RECOVERY_REQUESTED, &mddev->recovery) &&
mddev->degraded != mddev->raid_disks) {
diff --git a/drivers/md/md.h b/drivers/md/md.h
index 53ea7a6961de..96a88b7681d6 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -549,6 +549,11 @@ static inline int mddev_trylock(struct mddev *mddev)
}
extern void mddev_unlock(struct mddev *mddev);
+static inline int mddev_is_locked(struct mddev *mddev)
+{
+ return mutex_is_locked(&mddev->reconfig_mutex);
+}
+
The patch is bogus with respect to the proposed mddev_unlock()/mddev_lock()
logic in md.c around md_unregister_thread(): it fails to take the mutex
again if it was held on entry, because it calls mddev_is_locked() a second
time after the mutex has just been unlocked before the
md_unregister_thread() call.
If that patch to md.c holds up under further analysis, it has to save the
mddev_is_locked() result and unlock/lock conditionally based on that
saved result.
Yes, that was my intention too, thanks for pointing it out.
@@ -9407,10 +9407,16 @@ void md_reap_sync_thread(struct mddev *mddev)
{
struct md_rdev *rdev;
sector_t old_dev_sectors = mddev->dev_sectors;
- bool is_reshaped = false;
+ bool is_reshaped = false, is_locked = false;
/* resync has finished, collect result */
+ if (mddev_is_locked(mddev)) {
+ is_locked = true;
+ mddev_unlock(mddev);
+ }
md_unregister_thread(&mddev->sync_thread);
+ if (is_locked)
+ mddev_lock(mddev);
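
To illustrate the difference between the two variants, here is a minimal
userspace sketch. It is not kernel code: struct mock_mddev and all mock_*
helpers are hypothetical stand-ins for struct mddev, mddev_is_locked(),
mddev_lock(), mddev_unlock() and md_unregister_thread(), and the mutex is
modelled as a plain flag. Note that the real mutex_is_locked() only reports
that someone holds the mutex, not that the caller does, which this
single-threaded mock cannot show.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct mddev; not the kernel structure. */
struct mock_mddev {
	bool locked;       /* models the state of reconfig_mutex */
	bool thread_alive; /* models mddev->sync_thread being registered */
};

static bool mock_is_locked(const struct mock_mddev *m) { return m->locked; }
static void mock_lock(struct mock_mddev *m) { m->locked = true; }
static void mock_unlock(struct mock_mddev *m) { m->locked = false; }
static void mock_unregister_thread(struct mock_mddev *m) { m->thread_alive = false; }

/* First variant: tests the lock state again after dropping the lock. */
static void reap_retest(struct mock_mddev *m)
{
	if (mock_is_locked(m))
		mock_unlock(m);
	mock_unregister_thread(m);
	/* In this mock the flag was just cleared, so the lock is never retaken. */
	if (mock_is_locked(m))
		mock_lock(m);
}

/* Revised variant: remembers whether the lock was held on entry. */
static void reap_saved(struct mock_mddev *m)
{
	bool is_locked = false;

	if (mock_is_locked(m)) {
		is_locked = true;
		mock_unlock(m);
	}
	mock_unregister_thread(m);
	if (is_locked)
		mock_lock(m);
}
```

With the lock held on entry, reap_retest() returns with the lock dropped,
while reap_saved() returns with it held again; with the lock not held on
entry, reap_saved() leaves it untouched.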
Thanks,
Guoqing