Re: [PATCH] super-intel: ensure suspended region is removed when reshape completes.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



NeilBrown <neilb@xxxxxxxx> writes:
> On Fri, Feb 19 2016, Jes Sorensen wrote:
>
>> NeilBrown <neilb@xxxxxxxx> writes:
>>> A recent commit removed a call to abort_reshape() when IMSM reshape
>>> completed.  An unanticipated result of this is that the suspended
>>> region is not cleared as it should be.
>>> So after a reshape, a region of the array will cause all IO to block.
>>>
>>> Re-instate the required updates to suspend_{lo,hi} coped from
>>> abort_reshape().
>>>
>>> This is caught (sometimes) by the test suite.
>>>
>>> Also fix a couple of typos found while exploring the code.
>>>
>>> Reported-by: Ken Moffat <zarniwhoop@xxxxxxxxxxxx>
>>> Cc: Artur Paszkiewicz <artur.paszkiewicz@xxxxxxxxx>
>>> Fixes: 2139b03c2080 ("imsm: don't call abort_reshape() in imsm_manage_reshape()")
>>> Signed-off-by: NeilBrown <neilb@xxxxxxxx>
>>> ---
>>>  super-intel.c | 7 +++++--
>>>  1 file changed, 5 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/super-intel.c b/super-intel.c
>>> index 90b7b6dee5d0..80b48d0fdd47 100644
>>> --- a/super-intel.c
>>> +++ b/super-intel.c
>>> @@ -10465,7 +10465,7 @@ int check_degradation_change(struct mdinfo *info,
>>>   * Function:	imsm_manage_reshape
>>>   * Description:	Function finds array under reshape and it manages reshape
>>>   *		process. It creates stripes backups (if required) and sets
>>> - *		checheckpoits.
>>> + *		checkpoints.
>>>   * Parameters:
>>>   *	afd		: Backup handle (nattive) - not used
>>>   *	sra		: general array info
>>> @@ -10595,7 +10595,7 @@ static int imsm_manage_reshape(
>>>  
>>>  		start = current_position * 512;
>>>  
>>> -		/* allign reading start to old geometry */
>>> +		/* align reading start to old geometry */
>>>  		start_buf_shift = start % old_data_stripe_length;
>>>  		start_src = start - start_buf_shift;
>>>  
>>> @@ -10700,6 +10700,9 @@ static int imsm_manage_reshape(
>>>  	ret_val = 1;
>>>  abort:
>>>  	free(buf);
>>> +	sysfs_set_num(sra, NULL, "suspend_lo", 0x7FFFFFFFFFFFFFFFULL);
>>> +	sysfs_set_num(sra, NULL, "suspend_hi", 0);
>>> +	sysfs_set_num(sra, NULL, "suspend_lo", 0);
>>>  
>>>  	return ret_val;
>>>  }
>>
>> This does indeed match the behavior of abort_reshape(), however looking
>> through git history, I cannot find any explanation as to why the code
>> sets suspend_lo twice.
>>
>> Any chance you can enlighten me why this is necessary?
>
> This is what I never got when I was maintainer - people insisting
> (rightly) that I explain the details...  Thanks!
>
>
> Prior to
> Commit: 23ddff3792f6 ("md: allow suspend_lo and suspend_hi to decrease
> as well as increase.")
> you could only increase suspend_{lo,hi} unless the region they covered
> was empty.  So to reset to 0, you need to push suspend_lo up past
> suspend_hi first.
> So to maximize the chance of mdadm working on all kernels, we want to keep
> doing that.
>
> Maybe you could add that to the change comment?  Or should there be a
> comment in abort_reshape()??

Applied with a one-line refererence to the documentation added in
abort_reshape().

Thanks!
Jes
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux RAID Wiki]     [ATA RAID]     [Linux SCSI Target Infrastructure]     [Linux Block]     [Linux IDE]     [Linux SCSI]     [Linux Hams]     [Device Mapper]     [Device Mapper Cryptographics]     [Kernel]     [Linux Admin]     [Linux Net]     [GFS]     [RPM]     [git]     [Yosemite Forum]


  Powered by Linux