Re: D_LOOKUP in REAL_LOOKUP?

A recent article on the subject (whose comments are therefore still
valid for the present kernel) is:

http://lwn.net/Articles/315501/

On Thu, Apr 2, 2009 at 10:47 AM, Peter Teoh <htmldeveloper@xxxxxxxxx> wrote:
> On Tue, Mar 31, 2009 at 2:01 PM, krushnaal pai <krisonearth@xxxxxxxxx> wrote:
>> In kernel 2.6.11, to retrieve a particular file the kernel calls
>> d_lookup to check the cache, but if the file is not found in the
>> dcache, the kernel calls real_lookup to search the hard disk.
>> However, inside real_lookup the kernel calls d_lookup again. Can
>> someone please explain why?
>>
>
> There are some differences between them, and the purpose is
> performance, because these functions are in the kernel's hot path.
>
> The key is to avoid locking whenever possible, to use RCU (optimized
> for read operations) where it applies, and to take the mutex only as
> a last resort.
>
> So the fastest path is __d_lookup() first, then d_lookup(), and
> finally real_lookup().
>
> Normally the kernel will attempt __d_lookup() or d_lookup() first:
>
> struct dentry * d_lookup(struct dentry * parent, struct qstr * name)
> {
>        struct dentry * dentry = NULL;
>        unsigned long seq;
>
>        /* a hit is returned at once; a miss is retried if a rename ran */
>        do {
>                seq = read_seqbegin(&rename_lock);
>                dentry = __d_lookup(parent, name);
>                if (dentry)
>                        break;
>        } while (read_seqretry(&rename_lock, seq));
>        return dentry;
> }
>
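
To make the retry loop in d_lookup() above easier to see in isolation,
here is a minimal userspace sketch of the same pattern using C11
atomics. It is not kernel code: rename_seq, cached_name, probe(),
write_rename() and seq_lookup() are all hypothetical names I made up
for the illustration, the "cache" is a single slot, and only one
writer at a time is assumed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for rename_lock: even means no writer is active. */
static atomic_uint rename_seq;

/* Hypothetical one-slot "cache" standing in for the dentry hash. */
static const char *_Atomic cached_name;

/* Like read_seqbegin(): wait for an even sequence value and remember it. */
static unsigned read_begin(void)
{
    unsigned seq;

    while ((seq = atomic_load_explicit(&rename_seq,
                                       memory_order_acquire)) & 1u)
        ;   /* a writer is in the middle of an update */
    return seq;
}

/* Like read_seqretry(): nonzero if a writer ran since read_begin(). */
static int read_retry(unsigned seq)
{
    atomic_thread_fence(memory_order_acquire);
    return atomic_load_explicit(&rename_seq, memory_order_relaxed) != seq;
}

/* Writer side (single writer assumed): bump, update, bump again. */
static void write_rename(const char *new_name)
{
    atomic_fetch_add(&rename_seq, 1);       /* odd: update in progress */
    atomic_store(&cached_name, new_name);
    atomic_fetch_add(&rename_seq, 1);       /* even: update finished */
}

/* Lockless probe, playing the role of __d_lookup(). */
static const char *probe(const char *name)
{
    const char *cur = atomic_load(&cached_name);

    return (cur && strcmp(cur, name) == 0) ? cur : NULL;
}

/* Same shape as d_lookup(): a hit is returned immediately, a miss is
 * trusted only if no rename happened while we were looking. */
static const char *seq_lookup(const char *name)
{
    const char *hit;
    unsigned seq;

    do {
        seq = read_begin();
        hit = probe(name);
        if (hit)
            break;
    } while (read_retry(seq));
    return hit;
}
```

The point to notice is that only a miss is re-validated: a hit cannot
be a false negative caused by a rename, so the loop breaks out without
checking the sequence again.
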
> And the comment above d_lookup() reads:
>
>  * __d_lookup is dcache_lock free. The hash list is protected using RCU.
>  * Memory barriers are used while updating and doing lockless traversal.
>  * To avoid races with d_move while rename is happening, d_lock is used.
>  *
>
> (Notice that since the directory is not locked here (that is done
> only in real_lookup()), it may have been renamed in the meantime, so
> what was read from the dcache may no longer be valid.)
>
> The rename_lock above protects against (comment extracted):
>
>        /*
>         * Need rcu_readlock to protect against the d_parent trashing
>         * due to d_move
>         */
>
> And dcache_lock is not taken at all in __d_lookup(), so the
> underlying linked list may change while the lookup runs. As you can
> see in fs/dcache.c, there are many places that take the dcache_lock
> spinlock before making dcache changes.
>
> struct dentry * __d_lookup(struct dentry * parent, struct qstr * name)
> {
>        rcu_read_lock();
>
>        hlist_for_each_entry_rcu(dentry, node, head, d_hash) {
>
>                spin_lock(&dentry->d_lock);
>
> The logic behind RCU is quite complex. It allows read-only access
> without locking because the object is duplicated: a writer's changes
> go to a new copy, while a reader that started earlier keeps reading
> the old copy until it is done.
>
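
The copy-then-publish idea described above can be sketched in
userspace with C11 atomics. This only illustrates the general RCU
pattern, not how fs/dcache.c actually updates a dentry, and struct
entry, reader_get() and writer_rename() are hypothetical names. The
hard part of real RCU, waiting for a grace period before freeing the
old copy, is left to the caller here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical object that readers look at without any lock. */
struct entry {
    char name[32];
};

static struct entry *_Atomic current_entry;

/* Reader: a single acquire load, no lock taken. */
static const struct entry *reader_get(void)
{
    return atomic_load_explicit(&current_entry, memory_order_acquire);
}

/*
 * Writer: duplicate the object, change the copy, then publish it.
 * Returns 0 on success and hands the previous version back through
 * old_out. The caller must not free it until every reader that may
 * still hold it has finished (the grace period, not modelled here).
 */
static int writer_rename(const char *new_name, struct entry **old_out)
{
    struct entry *old = atomic_load_explicit(&current_entry,
                                             memory_order_relaxed);
    struct entry *copy = malloc(sizeof *copy);

    if (!copy)
        return -1;
    if (old)
        *copy = *old;                   /* start from the old contents */
    else
        memset(copy, 0, sizeof *copy);
    strncpy(copy->name, new_name, sizeof copy->name - 1);
    copy->name[sizeof copy->name - 1] = '\0';

    /* Publish: readers arriving from now on see the new copy. */
    atomic_store_explicit(&current_entry, copy, memory_order_release);
    *old_out = old;
    return 0;
}
```

A reader that fetched its pointer before the writer published still
sees the old name, which is exactly the "you will always be reading
the old copy" behaviour.
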
> real_lookup() is the worst case (the slowest path). But it is
> protected against the directory being moved/renamed, which
> d_lookup() does not guarantee. More importantly, since the mutex is
> always required, once it is acquired you can be sure that from then
> on the dcache will not be modified/updated for this directory.
>
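> static struct dentry * real_lookup(struct dentry * parent,
>                struct qstr * name, struct nameidata *nd)
> {
>        struct dentry * result;
>        struct inode *dir = parent->d_inode;
>
>        mutex_lock(&dir->i_mutex);
>        /* recheck the dcache now that the directory is locked */
>        result = d_lookup(parent, name);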
>
> Since mutex_lock() can block, it is possible that while we were
> blocked the file was read into the dcache by someone else. So this
> d_lookup() is the final check of the dcache before going to disk.
>
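
This "check again after taking the lock" step is a general pattern,
so here is a minimal userspace sketch of it with a pthread mutex. The
names cache, cache_find(), slow_read() and locked_lookup() are
hypothetical, and the append-only array is only a stand-in for the
dcache, chosen so that the unlocked fast path stays safe.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define CACHE_SLOTS 8

/* Hypothetical append-only cache: slots are never changed once used. */
static const char *_Atomic cache[CACHE_SLOTS];
static atomic_size_t cache_used;
static pthread_mutex_t dir_mutex = PTHREAD_MUTEX_INITIALIZER;
static int slow_reads;      /* counts trips to the "disk" */

/* Lock-free search, playing the role of d_lookup(). */
static const char *cache_find(const char *name)
{
    size_t n = atomic_load_explicit(&cache_used, memory_order_acquire);

    for (size_t i = 0; i < n; i++) {
        const char *e = atomic_load_explicit(&cache[i],
                                             memory_order_relaxed);
        if (strcmp(e, name) == 0)
            return e;
    }
    return NULL;
}

/* Stand-in for asking the filesystem; the string must outlive the cache. */
static const char *slow_read(const char *name)
{
    slow_reads++;
    return name;
}

static const char *locked_lookup(const char *name)
{
    const char *hit = cache_find(name);     /* fast path, no lock */

    if (hit)
        return hit;

    pthread_mutex_lock(&dir_mutex);         /* may block for a while */
    hit = cache_find(name);                 /* someone may have filled it */
    if (!hit) {
        size_t n = atomic_load_explicit(&cache_used, memory_order_relaxed);

        hit = slow_read(name);
        if (n < CACHE_SLOTS) {
            atomic_store_explicit(&cache[n], hit, memory_order_relaxed);
            atomic_store_explicit(&cache_used, n + 1,
                                  memory_order_release);
        }
    }
    pthread_mutex_unlock(&dir_mutex);
    return hit;
}
```

Without the second cache_find(), two tasks that both missed on the
fast path would each go to the slow path and insert the same name
twice.
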
> In general, the more fine-grained the locking, the better the
> performance (because concurrency improves), but the more complex the
> operation. Here we have dcache_lock and rename_lock as the locks for
> the dcache subsystem, and then the filesystem objects have their own
> separate locks, etc.
>
> --
> Regards,
> Peter Teoh
>



-- 
Regards,
Peter Teoh
