On 02/02/10 08:00, Mikulas Patocka wrote: > On Mon, 1 Feb 2010, Takahiro Yasui wrote: > >>> This fixes bug https://bugzilla.redhat.com/show_bug.cgi?id=555197 >>> >>> Please submit this patch before 2.6.33 goes out. It fixes a bug when old >>> LVM (<= 2.02.51) is used, that doesn't pass errors_handled flag to >>> dm-raid1. >>> >>> It doesn't need to be backported to RHEL 5.5, because lvm always passes a >>> flag to handle errors there. >> >> Don't we need to backport it to RHEL 5.5? If lvm is the only user of dm-raid1, >> we don't need to backport it to RHEL 5.5. But if not, we need to. >> >>> Don't lose writes if errors are not handled and log fails >>> >>> If the log fails and errors are not handled by dmeventd, the writes >>> are successfully finished without being actually written to the device. >>> >>> This code path is taken: >>> do_writes: >>> bio_list_merge(&ms->failures, &sync); >>> do_failures: >>> if (!get_valid_mirror(ms)) (false) >>> else if (errors_handled(ms)) (false) >>> else bio_endio(bio, 0); >>> >>> The logic in do_failures is based on presuming that the write was already >>> tried --- if it succeeded at least on one leg and errors are not handled, >>> it is reported as success. >>> >>> However, bio can be added to the failures queue without being submitted, in >>> do_writes. >>> >>> This patch changes it so that bios are added to the failures list only if >>> errors are handled --- then, they will be held with hold_bio() called from >>> do_failures. >> >> I agree that bios should be issued by do_write() when ms->log_failures is set, >> but do we need to add bios to the failures queue? As you mentioned, the failures >> queue should be used to bios which are already handled. Therefore, I think >> bios are better to handled directly by hold_bio() instead of adding them to >> the failures queue as bios for nosync regions are done. >> >> - if (unlikely(ms->log_failure)) { >> - spin_lock_irq(&ms->lock); >> - bio_list_merge(&ms->failures, &sync); >> - spin_unlock_irq(&ms->lock); >> - wakeup_mirrord(ms); >> - } else >> - while ((bio = bio_list_pop(&sync))) >> + while ((bio = bio_list_pop(&sync))) >> + if (unlikely(ms->log_failure) && errors_handled(ms)) >> + hold_bio(ms, bio); >> + else >> do_write(ms, bio); > > I thought about this too, but I'd decided to put the bios on the failures > queue rather than holding them for this reason: if all the legs fail, it > is better to terminate the bio with -EIO than to hold it. If all the legs > fail, you can't save anything anyway and the less things you are holding, > the less possibility for deadlocks exists. Thank you for the explanation. With your following patch, we can keep consistency of the policy for sync and nosync region. https://www.redhat.com/archives/dm-devel/2010-February/msg00014.html Reviewed-by: Takahiro Yasui <tyasui@xxxxxxxxxx> Thanks, Taka > So I put the bios to the failure queue and do_failures will terminate them > with -EIO if all the legs failred and hold them if we use dmeventd and > there is at least one live leg. >> https://www.redhat.com/archives/dm-devel/2009-December/msg00211.html >> >> Thanks, >> Taka >> -- dm-devel mailing list dm-devel@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/dm-devel