Re: [RFC PATCH v2 01/18] vfs: export new_inode_pseudo

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, 2020-09-09 at 11:49 -0700, Eric Biggers wrote:
> [+Cc Al]
> 
> On Wed, Sep 09, 2020 at 12:51:02PM -0400, Jeff Layton wrote:
> > > > No, more like:
> > > > 
> > > > Syscall					Workqueue
> > > > ------------------------------------------------------------------------------
> > > > 1. allocate an inode
> > > > 2. determine we can do an async create
> > > >    and allocate an inode number for it
> > > > 3. hash the inode (must set I_CREATING
> > > >    if we allocated with new_inode()) 
> > > > 4. issue the call to the MDS
> > > > 5. finish filling out the inode()
> > > > 6.					MDS reply comes in, and workqueue thread
> > > > 					looks up new inode (-ESTALE)
> > > > 7. unlock_new_inode()
> > > > 
> > > > 
> > > > Because 6 happens before 7 in this case, we get an ESTALE on that
> > > > lookup.
> > > 
> > > How is ESTALE at (6) possible?  (3) will set I_NEW on the inode when inserting
> > > it into the inode hash table.  Then (6) will wait for I_NEW to be cleared before
> > > returning the inode.  (7) will clear I_NEW and I_CREATING together.
> > > 
> > 
> > Long call chain, but:
> > 
> > ceph_fill_trace
> >    ceph_get_inode
> >       iget5_locked
> >          ilookup5(..._nowait, etc)
> >             find_inode_fast
> > 
> > 
> > ...and find_inode_fast does:
> > 
> >                 if (unlikely(inode->i_state & I_CREATING)) {                                        
> >                         spin_unlock(&inode->i_lock);                                                
> >                         return ERR_PTR(-ESTALE);                                                    
> >                 }                                                                                   
> 
> Why does ilookup5() not wait for I_NEW to be cleared if the inode has
> I_NEW|I_CREATING, but it does wait for I_NEW to be cleared if I_NEW is set its
> own?  That seems like a bug.
> 

Funny, I asked Al the same thing on IRC the other day:

23:28 < jlayton> viro: wondering if there is a potential race with I_CREATING in find_inode. 
                 Seems like you could have 2 tasks racing in calls to iget5_locked for the 
                 same inode. One creates an inode and starts instantiating it, and the second 
                 one gets NULL back because I_CREATING is set.
23:30 < viro> jlayton: where would I_CREATING come from?
23:30 < viro> it's set on insert_inode_locked() and similar paths
23:31 < viro> where you want iget5_locked() to fuck off and eat ESTALE
23:31 < jlayton> ok, right -- I was trying to pass an inode from new_inode to inode_insert5
23:32 < viro> seeing that it's been asked for an inode number that did _not_ exist until just 
              now (we'd just allocated it)

The assumption is that we'll never go looking for an inode until after
I_NEW is cleared. In the case of an asynchronous create in ceph though,
we may do exactly that if the reply comes back very quickly.

-- 
Jeff Layton <jlayton@xxxxxxxxxx>




[Index of Archives]     [CEPH Users]     [Ceph Large]     [Ceph Dev]     [Information on CEPH]     [Linux BTRFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux