Re: [PATCH 7/7] Hold all write bios when errors are handled

Takahiro Yasui <tyasui@xxxxxxxxxx> · Mon, 23 Nov 2009 12:54:47 -0500

On 11/23/09 00:58, malahal@xxxxxxxxxx wrote:
> Mikulas Patocka [mpatocka@xxxxxxxxxx] wrote:
>> @@ -730,10 +719,25 @@ static void do_failures(struct mirror_se
>>  		if (!ms->log_failure) {
>>  			ms->in_sync = 0;
>>  			dm_rh_mark_nosync(ms->rh, bio);
>> +		}
>> +		/*
>> +		 * If all the legs are dead, fail the I/O.
>> +		 *
>> +		 * If we are not using dmeventd, we pretend that the I/O
>> +		 * succeeded. This is wrong (the failed leg might come online
>> +		 * again after reboot and it would be replicated back to
>> +		 * the good leg) but it is consistent with current behavior.
>> +		 * For proper behavior, dm-raid1 shouldn't be used without
>> +		 * dmeventd at all.
>> +		 *
>> +		 * If we use dmeventd, hold the bio until dmeventd does its job.
>> +		 */
>> +		if (!get_valid_mirror(ms))
>> +			bio_endio(bio, -EIO);
>> +		else if (!errors_handled(ms))
>>  			bio_endio(bio, 0);
>> -		} else {
>> +		else
>>  			hold_bio(ms, bio);
> 
> You seem to be holding the I/O that has failed, but what about writes
> that are issued after this failure. They seem to go through just fine.
> Did I miss something? I think do_writes() needs to check for a leg
> failure (just like it does for log_failure already), and put them on
> failure/hold list, right?

On 09/09/09 16:18, Takahiro Yasui wrote:
>  - As I mentioned before, bios which are sent to out-of-sync regions also
>    need to be blocked because bios to out-of-sync regions are processed by
>    generic_make_request() and would return the "success" to upper layer
>    without dm-raid1 notices it. This might cause data corruption.
>    https://www.redhat.com/archives/dm-devel/2009-July/msg00118.html

I think you are right. I also commented above two months ago, but
I haven't got comments from Mikulas.

> Also, we do need to do the above work only if "primary" leg fails. We
> can continue to work just like the old code if "secondary" legs fail,
> right? Not sure if this is worth optimizing though, but I would like to
> see it implemented as it is just a few extra checks. We can have
> primary_failure field like log_failure field.

Good idea. This data corruption we are talking could happen only if
the primary leg fails. Keeping system running in case of non-primary
legs looks reasonable.

Taka

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel