On Fri, May 22, 2009 at 5:03 PM, NeilBrown <neilb@xxxxxxx> wrote: > On Fri, May 22, 2009 7:46 pm, SandeepKsinha wrote: >> Hi Neil, >> >> >> On Fri, May 22, 2009 at 2:28 PM, NeilBrown <neilb@xxxxxxx> wrote: >>> On Fri, May 22, 2009 6:36 pm, Andre Noll wrote: >>>> On 13:48, SandeepKsinha wrote: >>>>> Signed-off-by: Sandeep K Sinha <sandeepksinha@xxxxxxxxx> >>>>> Date: Fri May 22 13:46:56 2009 +0530 >>>>> >>>>> Protecting mddev with barriers to avoid races. >>>>> >>>>> diff --git a/drivers/md/linear.c b/drivers/md/linear.c >>>>> index 9ad6ec4..e2dbda8 100644 >>>>> --- a/drivers/md/linear.c >>>>> +++ b/drivers/md/linear.c >>>>> @@ -28,10 +28,12 @@ >>>>> static inline dev_info_t *which_dev(mddev_t *mddev, sector_t sector) >>>>> { >>>>> int lo, mid, hi; >>>>> - linear_conf_t *conf = mddev_to_conf(mddev); >>>>> + linear_conf_t *conf; >>>>> >>>>> lo = 0; >>>>> hi = mddev->raid_disks - 1; >>>>> + barrier(); >>>>> + conf = mddev_to_conf(mddev); >>>> >>>> Are you sure that a compiler barrier is sufficient here? Maybe we >>>> need smp_mb(), but I'm not familar with all the different memory >>>> barrier types that exist in the kernel. >>> >>> I think we just want a read barrier here: >>> smb_rmb(); >>> because it is a barrier between reading mddev->raid_disk and >>> reading mddev->private. >>> >>> In linear_add, we want a write barrier >>> smb_wmb(); >>> before setting mdev->raid_disks, and I think we also want one >>> before setting mddev->private to make sure that all the assignments >>> to newconf->whatever are stable. >> >> Do we really need one before setting mddev->private? Will newconf be >> used by anyone before its assigned to mddev->private? >> No. Then what makes you put a write barrier before it? > > I'm not 100% certain we do need it, but the logic goes like that: > > One processor does > newconf->disks[0].end_sector = X; > mddev->private = newconf; > > and another processor does > conf = mddev->private; > foo = conf->disk[0].end_sector; > > Due to the lack of memory ordering guarantees, foo could end up > with the value of end_sector *before* X was assigned to it. > Not sure but can be a possibility. But I feel if thats the case, we might have to revisit the whole md code. I see similar instances at lot many places. > In fact, rcu_assign_pointer (in include/linux/rcupdate.h) has > exactly this purpose: place an smp_wmb before assigning the address > of a structure to a pointer, to ensure that the content of the > structure is globally visible before the address of the structure > become visible. > So I would probably do > > rcu_assign_pointer(mddev->private, newconf); > smb_wmb(); > mddev->raid_disks ++; > > An alternative (which actually might be an improvement, I'm not > sure) is to have a conf->raid_disks field and use that value in > which_dev. Then it would be more like > > mddev->raid_disk++; > newconf->raid_disk = mddev->raid_disks; > rcu_assign_pointer(mddev->private, newconf); > > and we only have one memory barrier. > The which_dev code then becomes: > > conf = rcu_dereference(mddev->private); > hi = conf->raid_disks = 1; > ..... > I like this approach more. And this will surely keep consistency throughout the code. > and again we have only one memory barrier. The significant benefit > of this is that instead of using the low level smp_[rw]mb() operations, > we use the higher level rcu_* interfaces which make it a lot more > obvious what we are doing. > > It would then be a very small step use use rcu_dereference for > all mddev->private accesses in linear.c, wrap them in rcu_read_{,un}lock, > and then use call_rcu to arrange for the old conf_t to be freed. > I am going to try this. Will keep you posted. > NeilBrown > > -- Regards, Sandeep. “To learn is to change. Education is a process that changes the learner.” -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html