Re: [PATCH 01/02/RFC] implement a stat cache

On Sun, Apr 20, 2008 at 04:07:35PM -0700, Linus Torvalds wrote:
>
> Junio, what was the logic for that whole "has_symlink_leading_path()"
> thing? I forget. Whatever, it's broken.

===
commit f859c846e90b385c7ef873df22403529208ade50
Author: Junio C Hamano <junkio@xxxxxxx>
Date:   Fri May 11 22:11:07 2007 -0700

    Add has_symlink_leading_path() function.

    When we are applying a patch that creates a blob at a path, or
    when we are switching from a branch that does not have a blob at
    the path to another branch that has one, we need to make sure
    that there is nothing at the path in the working tree, as such a
    file is a local modification made by the user that would be lost
    by the operation.

    Normally, lstat() on the path and making sure ENOENT is returned
    is good enough for that purpose.  However there is a twist.  We
    may be creating a regular file arch/x86_64/boot/Makefile, while
    removing an existing symbolic link at arch/x86_64/boot that
    points at existing ../i386/boot directory that has Makefile in
    it.  We always first check without touching filesystem and then
    perform the actual operation, so when we verify the new file,
    arch/x86_64/boot/Makefile, does not exist, we haven't removed
    the arch/x86_64/boot symbolic link yet.  lstat() on
    the file sees through the symbolic link and reports the file is
    there, which is not what we want.

    The has_symlink_leading_path() function takes a path,
    and sees if any of the leading directory components is a
    symbolic link.

    When files in a new directory are created, we tend to process
    them together because both index and tree are sorted.  The
    function takes advantage of this and allows the caller to cache
    and reuse which symbolic link on the filesystem caused the
    function to return true.

    The calling sequence would be:

        char last_symlink[PATH_MAX];

        *last_symlink = '\0';
        for each index entry {
                if (!lose)
                        continue;
                if (lstat(it))
                        if (errno == ENOENT)
                                ; /* happy */
                        else
                                error;
                else if (has_symlink_leading_path(it, last_symlink))
                        ; /* happy */
                else
                        error; /* would lose local changes */
                unlink_entry(it, last_symlink);
        }
===

And there are some cases where stat() on the path is desirable:
http://www.spinics.net/lists/git/msg63988.html

So while stat information for regular files is cached in the index,
stat information for directories is not cached, and that appears to
be wrong. Maybe Lucano's cache makes sense if it stores stat
information only for directories.

IIRC, some time ago, an otherwise reasonable patch for .gitignore was
rejected just because it would drive up the number of calls to lstat(),
as these calls on directories are not cached in the index.

Dmitry