On Tue, Jun 29 2010 at 7:40am -0400,
Alasdair G Kergon <agk@xxxxxxxxxx> wrote:

> On Thu, May 27, 2010 at 08:47:48AM -0400, Mike Snitzer wrote:
> > Introduce 'type_lock' in mapped_device structure and use it to protect
> > md->type access.  table_load() sets md->type without concern for:
>
> > - another table_load() racing to set conflicting md->type.
>
> which leads to non-deterministic behaviour that is of no practical use and
> so there is no need to support it.
>
> > - do_resume() making a conflicting table live.
>
> Any table being made live must already have passed through table_load so I
> don't see how there can be conflicting types.

As we talked about on irc: there is potential for do_resume() to destage
the "inactive" table slot, in the process of making the table "live", and
in parallel another table load can arrive to stage another "inactive"
table (which can have a conflicting type).  Without a critical section
there is room for this to occur.

And the critical section must span more than setting md->type, because
md->queue initialization may fail (elevator memory allocation fails with
ENOMEM).  So a competing table_load needs protection from concurrent
queue initialization.

(I do understand that you're commenting on this patch header in
isolation, but I figured I'd give context for "why the mutex".)

I have an updated patch header too:

  Before table_load() stages an inactive table, check if its type
  conflicts with a previously established device type (md->type).

  Allowed md->type transitions:
    DM_TYPE_NONE => DM_TYPE_BIO_BASED
    DM_TYPE_NONE => DM_TYPE_REQUEST_BASED

  Once a table load occurs the DM device's type is completely immutable.
  This prevents a table reload from switching the inactive table directly
  to a conflicting type (even if the table is explicitly cleared before
  load).

  Introduce 'type_lock' in the mapped_device structure and use it to
  protect md->type access.
  Use of a mutex prepares for follow-on DM device queue initialization
  changes that will allocate memory while md->type_lock is held.

Signed-off-by: Mike Snitzer <snitzer@xxxxxxxxxx>
Acked-by: Kiyoshi Ueda <k-ueda@xxxxxxxxxxxxx>

> The mutex here seems to be overkill: instead of protecting one field to
> cope with a useless race, we should be making the output of the race
> deterministic (e.g. perhaps giving an error on the parallel table_load
> attempt).

We're protecting 2 fields in the end (md->type and md->queue), once 2/2
is applied.

I agree 100% on improving the DM core so that it errors out on competing
table loads (at a higher level).  Given the parallelism that we have
right now in DM operations (across devices), we get this awkward
inability to _know_ the state transitions of:

  no table -> inactive table -> live table

As you pointed out, that is an unwanted side-effect of supporting
operations we do want.

> Anyway, I can take this patch for now and we can deal with the
> mutex/race differently later.

OK, sounds good.

Mike

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel