[+Cc Al] On Wed, Sep 09, 2020 at 12:51:02PM -0400, Jeff Layton wrote: > > > No, more like: > > > > > > Syscall Workqueue > > > ------------------------------------------------------------------------------ > > > 1. allocate an inode > > > 2. determine we can do an async create > > > and allocate an inode number for it > > > 3. hash the inode (must set I_CREATING > > > if we allocated with new_inode()) > > > 4. issue the call to the MDS > > > 5. finish filling out the inode() > > > 6. MDS reply comes in, and workqueue thread > > > looks up new inode (-ESTALE) > > > 7. unlock_new_inode() > > > > > > > > > Because 6 happens before 7 in this case, we get an ESTALE on that > > > lookup. > > > > How is ESTALE at (6) possible? (3) will set I_NEW on the inode when inserting > > it into the inode hash table. Then (6) will wait for I_NEW to be cleared before > > returning the inode. (7) will clear I_NEW and I_CREATING together. > > > > Long call chain, but: > > ceph_fill_trace > ceph_get_inode > iget5_locked > ilookup5(..._nowait, etc) > find_inode_fast > > > ...and find_inode_fast does: > > if (unlikely(inode->i_state & I_CREATING)) { > spin_unlock(&inode->i_lock); > return ERR_PTR(-ESTALE); > } Why does ilookup5() not wait for I_NEW to be cleared if the inode has I_NEW|I_CREATING, but it does wait for I_NEW to be cleared if I_NEW is set its own? That seems like a bug. - Eric