On Sat, 2009-05-30 at 08:15 -0700, Eric W. Biederman wrote: > James Bottomley <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx> writes: > > > On Sat, 2009-05-30 at 06:07 -0700, Eric W. Biederman wrote: > >> Tejun Heo <tj@xxxxxxxxxx> writes: > >> > >> > Eric W. Biederman wrote: > >> >> Also, I'm quite uncomfortable with these things > >> >>> being done in non-atomic manner. It can be made to work but things > >> >>> like this can lead to subtle race conditions and with the kind of > >> >>> layering we put on top of sysfs (kobject, driver model, driver > >> >>> midlayers and so on), it isn't all that easy to verify what's going > >> >>> on, so NACK for this one. > >> >> > >> >> Total nonsense. > >> >> > >> >> Mucking about with sysfs after we start deleting a directory is a bug. > >> >> At worst my change makes a buggy race slightly less deterministic. > >> >> > >> >> I am not ready to consider keeping the current unnecessary atomic > >> >> removal step. That unnecessary atomicity makes the following patches > >> >> more difficult, and requires a lot of unnecessary retesting. > >> >> > >> >> What do you think the extra unnecessary atomicity helps protect? > >> > > >> > It's just not a clean API. When people are trying to code things way > >> > up in the stack, they aren't likely to look up the code to see what > >> > assumptions are being made especially when the stack is deep and > >> > complex and sysfs is near the bottom of the tall stack. IMHO > >> > implementing the usually expected semantics at this depth is worth > >> > every effort. It's just good implementation style which might look > >> > like wasted effort but will harden the stack in the long run. Plus, > >> > it's not like making it atomic is difficult or anything. > >> > >> I guess we are going to have to disagree on this one. > >> > >> My take is simply that a correct user has to wait until no one else > >> can find the kobject before calling kobject_del. At which point > >> races are impossible, and it doesn't matter if sysfs_mutex is held > >> across the entire operation. > > > > I'm afraid this one isn't a valid assumption. If you look in SCSI, > > you'll see we do get objects after they've been removed from visibility. > > We use it as part of the state model for how our objects work (objects > > removed from visibility are dying, but we still need them to be findable > > (and gettable). > > I was not precise enough. It appears I overlooked the fact that > kobject_del is not always called from kobject_put by way of > kobject_release. OK ... just so you understand, I'm thinking about the device model rather than kobjects. device_del() can't be called from release methods because they're often called from interrupt context and the mutex requirements in device_del() mean it needs user context. > Strictly the requirement is that after kobject_del we don't add, > remove or otherwise manipulate sysfs attributes. That is we don't > call any of: > > sysfs_add_file > sysfs_create_file > sysfs_create_bin_file > sysfs_remove_file > sysfs_remove_bin_file > sysfs_create_link > sysfs_remove_link > sysfs_create_group > sysfs_remove_group > sysfs_create_subdir > sysfs_remove_subdir > > > Those all either oops or BUG today if you try it. So I can't see how > a subsystem could depend on those working. It doesn't; you've altered your requirement. We can fully buy into this new relaxed one. > Also there is sysfs_remove_dir (on a subdirectory) aka kobject_del on > a child object after kobject_del on the parent object. > > As best I can tell that only works by fluke today. Yes, that's an artifact of the fact that the reference counted lifecycle is on release ... del just happens at a certain point in it. We don't hold any counters that tell us what the visibility of our children are, so it's possible to make a parent invisible by calling del simply because you don't know. > > Now, this could be altered as part of an object lifetime rewrite of SCSI > > (and I suspect a few other subsystems) but it's certainly an open > > question of whether the pain is worth the gain. > > I won't tell you that sysfs, the kobject layer, or the device layer > are the best thing since sliced bread. I'm just trying to simplify > the code and get the bugs out. James -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html