On 2/24/24 18:18, Antonio SJ Musumeci wrote:
> On 2/22/24 05:09, Miklos Szeredi wrote:
>> On Thu, 22 Feb 2024 at 02:26, Antonio SJ Musumeci <trapexit@xxxxxxxxxx> wrote:
>>
>>> I'll try it when I get some cycles in the next week or so but... I'm not
>>> sure I see how this would address it. Is this not still marking the
>>> inode bad. So while it won't forget it perhaps it will still error out.
>>> How does this keep ".." of root being looked up?
>>>
>>> I don't know the code well but I'd have thought the reason for the
>>> forget was because the lookup of the parent fails.
>>
>> It shouldn't be looking up the parent of root. Root should always be
>> there, and the only way I see root disappearing is by marking it bad.
>>
>> If the patch makes a difference, then you need to find out why the
>> root is marked bad, since the filesystem will still fail in that case.
>> But at least the kernel won't do stupid things.
>>
>> I think the patch is correct and is needed regardless of the outcome
>> of your test. But there might be other kernel bugs involved, so
>> definitely need to see what happens.
>>
>> Thanks,
>> Miklos
>
> With the patch it doesn't issue forget(nodeid=1) anymore. Nor requesting
> parent of nodeid=1.
>
> However, I'm seeing different issues.
>
> I instrumented FUSE to print when it tags an inode bad.
>
> After it gets into the bad state I'm seeing nfsd hammering the mount
> even when I've umounted the nfs share and killed the FUSE server. nfsd
> is pegging a CPU core and the kernel log is filled with
> fuse_stale_inode(nodeid=1) fuse_make_bad(nodeid=1) calls. Have to reboot.
>
> What's triggering the flagging the inode as bad seems to be in
> fuse_iget() at fuse_stale_inode() check. inode->i_generation is 0 while
> the generation value is as I set it originally.
>
> From the FUSE server I see:
>
> lookup(nodeid=3,name=".")
> lookup(nodeid=3,name="..") which returns ino=1 gen=expected_val
> getattr(nodeid=2) inodeid=2 is the file I'm reading in a loop
> forget(nodeid=2)
>
> after which point it's no longer functional.
>

I've resolved the issue, and I believe I know why I couldn't reproduce
it with the current libfuse examples.

The fact that the root node has a generation of 0 is implicit in the
examples, so when a lookup of ".." on a child of root came in, they
would return a generation of 0. In my server, however, I start the
generation of everything at a different non-zero value per instance of
the server, because at one point I read that ensuring different
nodeid + gen pairs for different filesystems was better/needed for NFS
support. So my reply for root carried a non-zero generation, the
kernel's inode for root had i_generation 0, and the stale check failed.

I'm guessing the increase in reports I've had was happenstance of
people upgrading to kernels past 5.14.

In retrospect it makes sense that the nodeid and gen of root are assumed
to be 1 and 0 respectively, and don't change, but given the symptoms I
had it wasn't clicking until I saw the stale check.

I'm not sure there are any changes to the kernel code that would make
sense. A log entry indicating that root was tagged as bad, and why,
would have helped, but I'm not sure it needs more than a note in some
docs, which I'll likely add to libfuse.

Thanks for everyone's help. Sorry for the goose chase.
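
For anyone who finds this thread later, here is a rough toy model of the failure described above. It is a sketch in Python, not kernel or libfuse code. It assumes only what this thread states: the kernel holds root as nodeid=1 with generation 0, and a reply carrying a different generation for an already-known nodeid gets that inode tagged bad. The names ToyKernel, iget and lookup_generation are hypothetical, as is the value 12345. The real check may compare more than the generation.

```python
FUSE_ROOT_ID = 1      # nodeid of root, per this thread
ROOT_GENERATION = 0   # generation the kernel assumes for root, per this thread


class ToyKernel:
    """Hypothetical stand-in for the kernel side; not the real fuse_iget()."""

    def __init__(self):
        # Root exists from mount time with generation 0.
        self.generations = {FUSE_ROOT_ID: ROOT_GENERATION}
        self.bad = set()

    def iget(self, nodeid, generation):
        """Return True if the reply is accepted, False if the inode is tagged bad."""
        known = self.generations.get(nodeid)
        if known is None:
            # First time this nodeid is seen: remember its generation.
            self.generations[nodeid] = generation
            return True
        if known != generation:
            # Generation mismatch: treated as stale and marked bad.
            self.bad.add(nodeid)
            return False
        return True


def lookup_generation(nodeid, instance_generation):
    """Server-side choice of generation for a lookup reply (hypothetical helper).

    Root always gets 0; every other node may use the per-instance value.
    """
    if nodeid == FUSE_ROOT_ID:
        return ROOT_GENERATION
    return instance_generation


if __name__ == "__main__":
    instance_generation = 12345  # arbitrary per-instance starting value

    # What my server was doing: lookup of ".." on a child of root
    # returned ino=1 with the per-instance generation.
    broken = ToyKernel()
    print("broken server accepted:", broken.iget(FUSE_ROOT_ID, instance_generation))

    # With root special-cased, the reply matches what the kernel holds.
    fixed = ToyKernel()
    gen = lookup_generation(FUSE_ROOT_ID, instance_generation)
    print("fixed server accepted:", fixed.iget(FUSE_ROOT_ID, gen))
```

The only change needed on the server side in this model is the special case for root in lookup_generation. Non-root nodes can keep a per-instance generation.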