On Mon, 2 Jul 2012 11:16:26 +0800 Shaohua Li <shli@xxxxxxxxxx> wrote: > > Then I could see what is being added and what is being removed all in the one > > patch and I can be sure that they balance. > > reworked the patch 3-5 to two patches as you suggested, and sent to you. please check. Thanks. That's looking really good. However I think we can do better. I've been looking more closely at the code and I think that the only things that we need stripe_lock to protect are ->toread and ->towrite, together with the following bios. e.g. ->toread->bi_next etc. ->read and ->written don't need stripe_lock protection, as they are only manipulated by the handle_stripe state machine which uses STRIPE_ACTIVE and refcounts for exclusion. So add_stripe_bio need to take the lock while adding a bio to the ->toread and ->towrite lists, and ops_run_biodrain() and ops_run_biofill need to take the lock while the move the list from ->to{read,write} to ->{read,written}. But we don't need it anywhere else. e.g. analyse_stripe shouldn't need the lock at all. Any change that could happen during the loop could equally happen after the lock was released so we don't lose by not having the lock. There is another current user of the lock, but I think that should be discarded as a false optimisation. We currently try to optimise out extra calls to bitmap_startwrite and bitmap_endwrite when we see back-to-back writes to the one stripe. However I suspect that is extremely unlikely and it just imposes and pointless need for synchronisation in raid5. We just simply call bitmap_startwrite whenever ->towrite changes from NULL to non-NULL, and call bitmap_endwrite whenever we clear ->written, reguardless what value ->towrite now has. Would you like to experiment with that? If I haven't described it well enough I can write a patch to show what I mean. Thanks, NeilBrown
Attachment:
signature.asc
Description: PGP signature