On 2/22/24 05:09, Miklos Szeredi wrote:
> On Thu, 22 Feb 2024 at 02:26, Antonio SJ Musumeci <trapexit@xxxxxxxxxx> wrote:
>
>> I'll try it when I get some cycles in the next week or so but... I'm not
>> sure I see how this would address it. Is this not still marking the
>> inode bad. So while it won't forget it perhaps it will still error out.
>> How does this keep ".." of root being looked up?
>>
>> I don't know the code well but I'd have thought the reason for the
>> forget was because the lookup of the parent fails.
>
> It shouldn't be looking up the parent of root. Root should always be
> there, and the only way I see root disappearing is by marking it bad.
>
> If the patch makes a difference, then you need to find out why the
> root is marked bad, since the filesystem will still fail in that case.
> But at least the kernel won't do stupid things.
>
> I think the patch is correct and is needed regardless of the outcome
> of your test. But there might be other kernel bugs involved, so
> definitely need to see what happens.
>
> Thanks,
> Miklos

With the patch it no longer issues forget(nodeid=1), nor does it request
the parent of nodeid=1. However, I'm seeing different issues.

I instrumented FUSE to print when it tags an inode bad. After it gets
into the bad state I'm seeing nfsd hammering the mount, even after I've
unmounted the NFS share and killed the FUSE server. nfsd is pegging a CPU
core and the kernel log is filled with

    fuse_stale_inode(nodeid=1)
    fuse_make_bad(nodeid=1)

calls. I have to reboot to recover.

What triggers flagging the inode as bad seems to be the
fuse_stale_inode() check in fuse_iget(): inode->i_generation is 0, while
the generation value passed in is the one I originally set.

From the FUSE server I see:

    lookup(nodeid=3,name=".")
    lookup(nodeid=3,name="..")   which returns ino=1 gen=expected_val
    getattr(nodeid=2)            nodeid=2 is the file I'm reading in a loop
    forget(nodeid=2)

after which point it's no longer functional.
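
The check described above can be sketched as a small userspace model. This is only an illustration of the mismatch, not the kernel code: struct model_inode, model_wrong_type(), model_stale_inode() and model_iget() are hypothetical stand-ins, and the assumption is simply that a cached inode is flagged bad when its stored generation differs from the generation in the server's reply (or the file type changed).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for the cached inode state; this is
 * NOT the kernel's struct inode, only the fields the check looks at. */
struct model_inode {
    uint64_t nodeid;        /* FUSE node id (1 is the root) */
    uint32_t i_generation;  /* generation stored when the inode was set up */
    uint32_t mode;          /* file type bits */
    bool     bad;           /* set once the inode has been flagged bad */
};

#define MODEL_S_IFMT  0170000u  /* mask for the file type bits */
#define MODEL_S_IFDIR 0040000u  /* directory */
#define MODEL_S_IFREG 0100000u  /* regular file */

/* True when the file type in the reply differs from the cached one. */
static bool model_wrong_type(const struct model_inode *inode, uint32_t mode)
{
    return ((inode->mode ^ mode) & MODEL_S_IFMT) != 0;
}

/* Assumed shape of the stale check: generation mismatch or type change. */
static bool model_stale_inode(const struct model_inode *inode,
                              uint32_t generation, uint32_t mode)
{
    return inode->i_generation != generation ||
           model_wrong_type(inode, mode);
}

/* Model of a lookup reply hitting an already cached inode.
 * Returns true if the reply matches; otherwise flags the inode bad
 * and returns false. */
static bool model_iget(struct model_inode *cached,
                       uint32_t reply_generation, uint32_t reply_mode)
{
    assert(cached != NULL);
    if (model_stale_inode(cached, reply_generation, reply_mode)) {
        cached->bad = true;
        return false;
    }
    return true;
}
```

With a cached root whose i_generation is 0 and a lookup of ".." that returns ino=1 with a nonzero server-chosen generation, model_iget() reports a mismatch and marks the root bad, which is the sequence the instrumented log shows. A reply carrying the same generation as the cached inode passes the check.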