> either merge it > before -rc1 and fix it up by -rc3 or so, or fix it during the > window and merge at around -rc2 - I'm fine with either > variant. We've kept a list we made from all those mail messages so we could check off things we've tried to address, I was looking at it yesterday and I know it is not up-to-date, but we'll work to get it that way. The second option might be safer unless you help us again, I don't want to sign a rubber check to Linus. -Mike On Fri, Mar 11, 2016 at 4:47 PM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote: > On Fri, Mar 11, 2016 at 03:18:57PM -0500, Mike Marshall wrote: >> Greetings... >> >> The Orangefs for-next tree is: >> >> git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux.git >> for-next >> >> I did a test merge (just locally, not pushed out) of Orangefs:for-next >> and v4.5-rc7 so I could test out how I think I need to patch for >> the follow_link -> get_link change, the diff is below. >> >> On Monday next, assuming that v4.5 is finalized this weekend, >> I plan to do a actual merge with v4.5, apply the get_link patch >> and push that to Orangefs:for-next. >> >> Hi Al <g>... might we get an ACK this time around? > > You do realize that it will mean fun few weeks post-merge fixing the rest of > problems, right? FWIW, I think that right now it *is* at the state where it > such fixing is feasible, so modulo that... > > As far as I can see, waiting-related logics should be solid by now, ditto > for lifetime rules; sanitizing the input... listxattr still does need fixing > (feed it a negative in ->downcall.resp.listxattr.lengths[0] and watch Bad > Things(tm) happen; no idea why would anyone go for > fs/orangefs/downcall.h:82: __s32 lengths[ORANGEFS_MAX_XATTR_LISTLEN]; > for representing string lengths in the first place, but that's what you've > got there and no sanity checks are done on it beyond > if (total + new_op->downcall.resp.listxattr.lengths[i] > size) > goto done; > which is not enough - not with total and size being ssize_t and ...lengths[] - > signed 32bit). > > The logics around maintaining the list of orangefs superblocks (add/remove/ > traverse) needs fixing; right now ioctl(..., ORANGEFS_DEV_REMOUNT_ALL) will > walk through it with only request_mutex held. Both insertion and removal > are protected only by orangefs_superblocks_lock, and removal is insane - > struct list_head *tmp_safe = NULL; \ > struct orangefs_sb_info_s *orangefs_sb = NULL; \ > \ > spin_lock(&orangefs_superblocks_lock); \ > list_for_each_safe(tmp, tmp_safe, &orangefs_superblocks) { \ > orangefs_sb = list_entry(tmp, \ > struct orangefs_sb_info_s, \ > list); \ > if (orangefs_sb && (orangefs_sb->sb == sb)) { \ > gossip_debug(GOSSIP_SUPER_DEBUG, \ > "Removing SB %p from orangefs superblocks\n", \ > orangefs_sb); \ > list_del(&orangefs_sb->list); \ > break; \ > } \ > } \ > spin_unlock(&orangefs_superblocks_lock); \ > list_entry is never NULL, for starters, and since there is a pointer back > from superblock to that orangefs_sb_info, there's no reason to walk the entire > list to find one. BTW, both add_orangefs_sb() and remove_orangefs_sb() should > be taken to their sole users. > > Sanity aside, there's really no lock in common for list modifiers and list > walker I'd mentioned above. FWIW, I would make orangefs_remount() > take struct orangefs_sb_info instead of struct super_block and flipped the > order of operations in orangefs_kill_sb() - kill_anon_super() *first*, then > remove from the list, then tell the userland that it's going away (i.e. > call orangefs_unmount_sb()). request_mutex in the last one would, at least, > prevent freeing the sucker before orangefs_remount() is done with it. > > Walking the list and calling orangefs_remount() on everything would still need > care - you'd need to hold orangefs_superblocks_lock, drop it for actual calls > of orangefs_remount() and have list removal preserve the forward pointer. > > That's probably the worst remaining locking issue I see in there. Doable, > if not pleasant... > > IIRC, there also had been some unpleasantness with getattr messing ->i_mode > halfway through... <checks> Yes - copy_attributes_to_inode() will be called, > and do > inode->i_mode = orangefs_inode_perms(attrs); > ... > inode->i_mode |= S_IFLNK; > ... > strncpy(orangefs_inode->link_target, > symname, > ORANGEFS_NAME_MAX); > If nothing else, *another* stat(2) racing with this one could pick the > intermediate value of ->i_mode and proceed to report it to userland. > Another problem is overwriting the symlink body; that can get very > unpleasant, since it might be traversed by another syscall right at that > moment. Any change of a symlink body means "we'd missed it going stale"; > there is no way to change a symlink contents without removing it and > creating a new one. Should anything other than orangefs_iget() even bother > copying it? The same goes for inode type changes, of course (regular > vs. directory vs. symlink, etc.). > > Speaking of orangefs_iget(), orangefs_set_inode() is pointlessly paranoid. > Not a bug per se, but > struct orangefs_inode_s *orangefs_inode = NULL; > > /* Make sure that we have sane parameters */ > if (!data || !inode) > return 0; > orangefs_inode = ORANGEFS_I(inode); > if (!orangefs_inode) > return 0; > is all wrong - 'data' is the last argument passed to iget5_locked (i.e. 'ref' > of orangefs_iget()) and that's always an address of either a local variable > or of a field in a large structure, and not even the first one; 'inode' > is never NULL - it's the address of struct inode the caller is about to > insert into the hash chain; ORANGEFS_I() is container_of(), so it's not > going to be NULL either. > > I'll need to look through the archived threads to see if there's anything > else left; IIRC, debugfs-related code had seriously nasty issues in case of > allocation failures, but those were fairly isolated. I'll read through the > archive tomorrow and see if there's anything else mentioned and not dealt > with; I don't remember anything really bad, but it had been well over > a hundred mails starting about half a year ago; I sure as hell do not > remember every tangential subthread in all of that, so I'll need to recheck. > > I _think_ that all remaining issues can be quickly dealt with, and the code > has zero impact on the rest of the kernel. I wouldn't risk putting it into > -final without fixups, but as for the merge schedule... either merge it > before -rc1 and fix it up by -rc3 or so, or fix it during the window and > merge at around -rc2 - I'm fine with either variant. -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html