Hi Al (and everyone)...

git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux.git for-next

... is updated to v4.5, it has the follow_link -> get_link change.

I worked today to clean up the debugfs (and sysfs) problems, and
we'll keep ticking things off the list. Perhaps I should be posting
patches here for review instead of automatically updating for-next
when we work on an issue...

We're working on putting "the list" in a place that can be viewed
by everyone and edited by both Martin and myself...

-Mike

On Fri, Mar 11, 2016 at 5:35 PM, Mike Marshall <hubcap@xxxxxxxxxxxx> wrote:
>> either merge it
>> before -rc1 and fix it up by -rc3 or so, or fix it during the
>> window and merge at around -rc2 - I'm fine with either
>> variant.
>
> We've kept a list we made from all those mail messages
> so we could check off things we've tried to address, I
> was looking at it yesterday and I know it is not up-to-date,
> but we'll work to get it that way. The second option
> might be safer unless you help us again, I don't want
> to sign a rubber check to Linus.
>
> -Mike
>
> On Fri, Mar 11, 2016 at 4:47 PM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
>> On Fri, Mar 11, 2016 at 03:18:57PM -0500, Mike Marshall wrote:
>>> Greetings...
>>>
>>> The Orangefs for-next tree is:
>>>
>>> git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux.git
>>> for-next
>>>
>>> I did a test merge (just locally, not pushed out) of Orangefs:for-next
>>> and v4.5-rc7 so I could test out how I think I need to patch for
>>> the follow_link -> get_link change, the diff is below.
>>>
>>> On Monday next, assuming that v4.5 is finalized this weekend,
>>> I plan to do an actual merge with v4.5, apply the get_link patch
>>> and push that to Orangefs:for-next.
>>>
>>> Hi Al <g>... might we get an ACK this time around?
>>
>> You do realize that it will mean a fun few weeks post-merge fixing the rest of
>> the problems, right?
>> FWIW, I think that right now it *is* at the state where
>> such fixing is feasible, so modulo that...
>>
>> As far as I can see, waiting-related logic should be solid by now, ditto
>> for lifetime rules; sanitizing the input... listxattr still does need fixing
>> (feed it a negative in ->downcall.resp.listxattr.lengths[0] and watch Bad
>> Things(tm) happen; no idea why anyone would go for
>> fs/orangefs/downcall.h:82:      __s32 lengths[ORANGEFS_MAX_XATTR_LISTLEN];
>> for representing string lengths in the first place, but that's what you've
>> got there and no sanity checks are done on it beyond
>>                 if (total + new_op->downcall.resp.listxattr.lengths[i] > size)
>>                         goto done;
>> which is not enough - not with total and size being ssize_t and ...lengths[] -
>> signed 32bit).
>>
>> The logic around maintaining the list of orangefs superblocks (add/remove/
>> traverse) needs fixing; right now ioctl(..., ORANGEFS_DEV_REMOUNT_ALL) will
>> walk through it with only request_mutex held.  Both insertion and removal
>> are protected only by orangefs_superblocks_lock, and removal is insane -
>>         struct list_head *tmp_safe = NULL;                              \
>>         struct orangefs_sb_info_s *orangefs_sb = NULL;                  \
>>                                                                         \
>>         spin_lock(&orangefs_superblocks_lock);                          \
>>         list_for_each_safe(tmp, tmp_safe, &orangefs_superblocks) {      \
>>                 orangefs_sb = list_entry(tmp,                           \
>>                                       struct orangefs_sb_info_s,        \
>>                                       list);                            \
>>                 if (orangefs_sb && (orangefs_sb->sb == sb)) {           \
>>                         gossip_debug(GOSSIP_SUPER_DEBUG,                \
>>                             "Removing SB %p from orangefs superblocks\n", \
>>                         orangefs_sb);                                   \
>>                         list_del(&orangefs_sb->list);                   \
>>                         break;                                          \
>>                 }                                                       \
>>         }                                                               \
>>         spin_unlock(&orangefs_superblocks_lock);                        \
>> list_entry is never NULL, for starters, and since there is a pointer back
>> from superblock to that orangefs_sb_info, there's no reason to walk the entire
>> list to find one.  BTW, both add_orangefs_sb() and remove_orangefs_sb() should
>> be taken to their sole users.
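
To make the listxattr remark above concrete, here is a minimal user-space sketch of a stricter length check. It is an illustration only, not code from fs/orangefs or from anyone in this thread: the helper name sum_xattr_lengths() and its parameters are hypothetical, int32_t stands in for the kernel's __s32, and rejecting the whole reply (rather than stopping early, as the quoted "goto done" does) is an assumed policy. The idea is to refuse negative lengths up front and to compare each length against the remaining room, so the running total can never wrap or exceed the buffer size.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not the orangefs code): add up `count` signed
 * 32-bit lengths, as a reply from userspace might supply them.
 * Returns 0 and stores the sum in *total on success; returns -1 if any
 * length is negative or the sum would exceed `size`.
 */
static int sum_xattr_lengths(const int32_t *lengths, size_t count,
                             size_t size, size_t *total)
{
    size_t sum = 0;

    for (size_t i = 0; i < count; i++) {
        /* A negative length is never valid; reject it outright. */
        if (lengths[i] < 0)
            return -1;
        /*
         * sum <= size holds on every iteration, so size - sum cannot
         * underflow, and comparing against the remaining room avoids
         * overflowing the addition itself.
         */
        if ((size_t)lengths[i] > size - sum)
            return -1;
        sum += (size_t)lengths[i];
    }

    *total = sum;
    return 0;
}
```

With lengths {3, 4, 5} and a 12-byte buffer the helper succeeds; a leading -1, or lengths {8, 8} against the same buffer, is rejected.
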
>>
>> Sanity aside, there's really no lock in common for list modifiers and list
>> walker I'd mentioned above.  FWIW, I would make orangefs_remount()
>> take struct orangefs_sb_info instead of struct super_block and flip the
>> order of operations in orangefs_kill_sb() - kill_anon_super() *first*, then
>> remove from the list, then tell the userland that it's going away (i.e.
>> call orangefs_unmount_sb()).  request_mutex in the last one would, at least,
>> prevent freeing the sucker before orangefs_remount() is done with it.
>>
>> Walking the list and calling orangefs_remount() on everything would still need
>> care - you'd need to hold orangefs_superblocks_lock, drop it for actual calls
>> of orangefs_remount() and have list removal preserve the forward pointer.
>>
>> That's probably the worst remaining locking issue I see in there.  Doable,
>> if not pleasant...
>>
>> IIRC, there also had been some unpleasantness with getattr messing ->i_mode
>> halfway through... <checks>  Yes - copy_attributes_to_inode() will be called,
>> and do
>>                 inode->i_mode = orangefs_inode_perms(attrs);
>>                 ...
>>                         inode->i_mode |= S_IFLNK;
>>                 ...
>>                                 strncpy(orangefs_inode->link_target,
>>                                         symname,
>>                                         ORANGEFS_NAME_MAX);
>> If nothing else, *another* stat(2) racing with this one could pick the
>> intermediate value of ->i_mode and proceed to report it to userland.
>> Another problem is overwriting the symlink body; that can get very
>> unpleasant, since it might be traversed by another syscall right at that
>> moment.  Any change of a symlink body means "we'd missed it going stale";
>> there is no way to change a symlink's contents without removing it and
>> creating a new one.  Should anything other than orangefs_iget() even bother
>> copying it?  The same goes for inode type changes, of course (regular
>> vs. directory vs. symlink, etc.).
>>
>> Speaking of orangefs_iget(), orangefs_set_inode() is pointlessly paranoid.
>> Not a bug per se, but
>>         struct orangefs_inode_s *orangefs_inode = NULL;
>>
>>         /* Make sure that we have sane parameters */
>>         if (!data || !inode)
>>                 return 0;
>>         orangefs_inode = ORANGEFS_I(inode);
>>         if (!orangefs_inode)
>>                 return 0;
>> is all wrong - 'data' is the last argument passed to iget5_locked (i.e. 'ref'
>> of orangefs_iget()) and that's always an address of either a local variable
>> or of a field in a large structure, and not even the first one; 'inode'
>> is never NULL - it's the address of struct inode the caller is about to
>> insert into the hash chain; ORANGEFS_I() is container_of(), so it's not
>> going to be NULL either.
>>
>> I'll need to look through the archived threads to see if there's anything
>> else left; IIRC, debugfs-related code had seriously nasty issues in case of
>> allocation failures, but those were fairly isolated.  I'll read through the
>> archive tomorrow and see if there's anything else mentioned and not dealt
>> with; I don't remember anything really bad, but it had been well over
>> a hundred mails starting about half a year ago; I sure as hell do not
>> remember every tangential subthread in all of that, so I'll need to recheck.
>>
>> I _think_ that all remaining issues can be quickly dealt with, and the code
>> has zero impact on the rest of the kernel.  I wouldn't risk putting it into
>> -final without fixups, but as for the merge schedule... either merge it
>> before -rc1 and fix it up by -rc3 or so, or fix it during the window and
>> merge at around -rc2 - I'm fine with either variant.
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
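
On the remark that ORANGEFS_I() is container_of() and so cannot yield NULL: the small user-space sketch below shows why, under stated assumptions. The struct names, the DEMO_I() helper and the simplified container_of macro are all hypothetical stand-ins (the kernel's own macro adds type checking), and this is not the orangefs definition. container_of only subtracts a fixed member offset from the pointer it is given, so for a pointer to an embedded member of a real object the result is the address of the enclosing object, and a NULL test on it tells you nothing.

```c
#include <stddef.h>

/* Hypothetical stand-in for the VFS inode. */
struct demo_inode {
    int mode;
};

/* Hypothetical filesystem-private inode that embeds the VFS one. */
struct demo_fs_inode {
    long handle;
    struct demo_inode vfs_inode;   /* deliberately not the first member */
};

/* Simplified user-space version of container_of: pointer minus offset. */
#define demo_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Analogue of an FOO_I() accessor: map the embedded inode to its container. */
static struct demo_fs_inode *DEMO_I(struct demo_inode *inode)
{
    return demo_container_of(inode, struct demo_fs_inode, vfs_inode);
}
```

Given a struct demo_fs_inode object node, DEMO_I(&node.vfs_inode) is simply &node; there is no code path on which the accessor can return NULL for a valid embedded inode.
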