Luis Henriques wrote on Wed, May 12, 2021 at 12:58:58PM +0100:
> <...>-20591 [000] ...2 67.538644: fscache_cookie: GET prn c=000000003080d900 u=50 p=0000000042542ee5 Nc=48 Na=1 f=22
> <...>-20591 [000] ...1 67.538645: fscache_acquire: c=0000000011fa06b1 p=000000003080d900 pu=50 pc=49 pf=22 n=9p.inod
> <...>-20599 [003] .N.2 67.542180: 9p_fscache_cookie: v9fs_drop_inode cookie: 0000000097476aaa
> [...]
>
> So, this is... annoying, I guess.

Oh, this actually looks different from what I had in mind.

So if I'm reading this right, the dup acquire happens before the drop on
another thread, meaning iget5_locked somehow returned an inode with I_NEW
on the same i_ino as that of the inode that is dropped later?...

How much trust can we actually put in trace ordering off different cpus?
My theory would really have wanted just that drop before the acquire :D

Anyway, I think there's no room for doubt that it's possible to get a new
inode for the same underlying file before the evict finished, which
leaves room for a few questions:

 - as David brought up on IRC (#linuxfs@OFTC), what about the flushing of
   dirty data that happens in evict()? Wouldn't it be possible for
   operations on the new inode to read stale data while the old inode is
   being flushed?
   I think that warrants asking someone who understands this better than
   me, as it's probably not 9p specific even if 9p makes it easier to get
   a new inode in such a racy way...

 - for 9p in particular, Christian Schoenebeck (helping with 9p in qemu)
   brought up that we evict inodes too fast too often, so I think it'd
   help to have some sort of inode lifetime management and keep inodes
   alive for a bit.
   As a network filesystem with no coherency built into the protocol I
   don't think we can afford to keep inodes cached too long, and I know
   some servers have trouble if we keep too many fids open, but it would
   be nice to have a few knobs to just keep inodes around a bit longer...
   This won't solve the fundamental problem, but if the inode isn't
   evicted at a point where it's likely to be used again then this
   particular problem should be much harder to hit (like other
   filesystems, actually :P)

I'm not sure how that works though, and won't have much time to work on
it short term anyway, but it's an idea :/

--
Dominique