On Tue, Apr 21, 2015 at 10:20:07PM +0100, Al Viro wrote:

> I agree that unlazy_walk() attempted when walking a symlink ought to fail
> with -ECHILD; we can't legitimize the symlink itself, so once we are out
> of RCU mode, there's nothing to keep the inode of symlink (and its body)
> from getting freed.  Solution is wrong though; for example, when
> nested symlink occurs in the middle of a trailing one, we should *not*
> remove the flag upon leaving the nested symlink.
>
> Another unpleasant thing is that ->follow_link() saying "can't do that in
> RCU mode" ends up with restart from scratch - that actually risks being
> worse than the mainline; there we would attempt unlazy_walk() and normally
> it would've succeeded.
>
> AFAICS, the real rule is "can't unlazy if nd->last.name points into a symlink
> body and we might still need to access it"...

Actually, I'm not sure anymore.  What if we have unlazy_walk() legitimize all
the symlinks we are traversing?  They are visible in nd->stack, after all...
It would mean more complex unlazy_walk(), but not terribly so - succeeding
legitimize_mnt() won't block and we already deal with the possibility of
having vfsmount legitimized, only to be dropped afterwards.

The real unpleasantness here is different - it's the need to keep ->d_seq of
those dentries to tell if they can be grabbed.  That's 4 more bytes per level
plus the fun with alignment.  OTOH, it both avoids the fun with getting the
logic of when to bail out right *and* avoids the guaranteed restarts when
running into a symlink we can't deal with in RCU mode - we could simply
unlazy and continue in such a situation.

Hell knows... it probably means going all the way wrt dynamic (on demand)
allocation, though.  Say, keeping a couple of levels on stack and allocating
when we need more; the interesting part is in not freeing that sucker too
early.
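
Purely as an illustration of the "couple of levels on stack, allocate when we
need more" idea, here is a userspace model in C.  It is a sketch under
assumptions, not kernel code: struct walk_state, struct saved_link and the
walk_*() / all_still_valid() helpers are hypothetical names that do not exist
in fs/namei.c, malloc() stands in for whatever allocator would be used, and
the seq field merely stands in for a sampled ->d_seq.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define INLINE_LEVELS 2	/* "a couple of levels on stack" (assumed value) */

/* Hypothetical per-level record for one symlink being traversed. */
struct saved_link {
	const char *body;	/* stands in for the symlink body */
	unsigned seq;		/* stands in for the sampled ->d_seq */
};

/* Hypothetical walk state; must not be copied, since stack may point at internal[]. */
struct walk_state {
	struct saved_link *stack;	/* internal[] or a heap array */
	struct saved_link internal[INLINE_LEVELS];
	unsigned depth;
	unsigned capacity;
};

static void walk_init(struct walk_state *ws)
{
	ws->stack = ws->internal;
	ws->depth = 0;
	ws->capacity = INLINE_LEVELS;
}

/* Record one more nesting level; allocate only when the inline slots run out. */
static int walk_push(struct walk_state *ws, const char *body, unsigned seq)
{
	assert(ws->depth <= ws->capacity);
	if (ws->depth == ws->capacity) {
		unsigned ncap = ws->capacity * 2;
		struct saved_link *p = malloc(ncap * sizeof(*p));

		if (!p)
			return -1;
		memcpy(p, ws->stack, ws->depth * sizeof(*p));
		if (ws->stack != ws->internal)
			free(ws->stack);
		ws->stack = p;
		ws->capacity = ncap;
	}
	ws->stack[ws->depth].body = body;
	ws->stack[ws->depth].seq = seq;
	ws->depth++;
	return 0;
}

/* Restart in another mode: forget the recorded levels, keep the storage. */
static void walk_restart(struct walk_state *ws)
{
	ws->depth = 0;
}

/* Only the very end of the lookup gives the storage back. */
static void walk_done(struct walk_state *ws)
{
	if (ws->stack != ws->internal)
		free(ws->stack);
	walk_init(ws);
}

/*
 * Model of "legitimize everything we are traversing": succeed only if
 * every recorded level still has the sequence number we sampled.
 * now[i] is the (assumed) current value for level i.
 */
static int all_still_valid(const struct walk_state *ws, const unsigned *now)
{
	for (unsigned i = 0; i < ws->depth; i++)
		if (ws->stack[i].seq != now[i])
			return 0;
	return 1;
}
```

In this model a caller would run walk_init() once, walk_push() per symlink
entered, walk_restart() when falling back from one lookup mode to the next,
and walk_done() only when the whole lookup is finished.  Since walk_restart()
keeps whatever array is already there, a retry that reaches the same nesting
depth does not allocate or free anything.
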
At the very least, we don't want the progression through
RCU/normal/revalidate-everything modes to trigger allocation/freeing on each
step; the nesting depth is going to be the same every time.  That's not hard
to do...

I'm about to fall asleep right now, so all of the above might very well be
complete hogwash; I'll look into it when I wake up.  If anyone has any
comments (including "Al, you are nuts", but something more specific would be
more interesting), please reply.