Re: [PATCH v6 27/32] prune: strategies for linked checkouts

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Jul 9, 2014 at 3:33 AM, Nguyễn Thái Ngọc Duy <pclouds@xxxxxxxxx> wrote:
> (alias R=$GIT_COMMON_DIR/repos/<id>)
>
>  - linked checkouts are supposed to keep its location in $R/gitdir up
>    to date. The use case is auto fixup after a manual checkout move.
>
>  - linked checkouts are supposed to update mtime of $R/gitdir. If
>    $R/gitdir's mtime is older than a limit, and it points to nowhere,
>    repos/<id> is to be pruned.
>
>  - If $R/locked exists, repos/<id> is not supposed to be pruned. If
>    $R/locked exists and $R/gitdir's mtime is older than a really long
>    limit, warn about old unused repo.
>
>  - "git checkout --to" is supposed to make a hard link named $R/link
>    pointing to the .git file on supported file systems to help detect
>    the user manually deleting the checkout. If $R/link exists and its
>    link count is greated than 1, the repo is kept.
>
> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@xxxxxxxxx>
> ---
> diff --git a/builtin/prune.c b/builtin/prune.c
> index 144a3bd..6db6bcc 100644
> --- a/builtin/prune.c
> +++ b/builtin/prune.c
> @@ -112,6 +112,70 @@ static void prune_object_dir(const char *path)
>         }
>  }
>
> +static const char *prune_repo_dir(const char *id, struct stat *st)
> +{
> +       char *path;
> +       int fd, len;
> +       if (file_exists(git_path("repos/%s/locked", id)))
> +               return NULL;
> +       if (stat(git_path("repos/%s/gitdir", id), st)) {
> +               st->st_mtime = expire;
> +               return _("gitdir does not exist");

If a plain file exists in 'repos' for some reason, it will be caught
by this case. Would it make sense, however, to handle that specially
and report a more accurate message, such as "not a repo" or some such?

> +       }
> +       fd = open(git_path("repos/%s/gitdir", id), O_RDONLY);

If 'gitdir' fails to open for some reason (lack of permissions, it's a
directory, etc.), the subsequent read_in_full() will crash.

> +       len = st->st_size;
> +       path = xmalloc(len + 1);
> +       read_in_full(fd, path, len);
> +       close(fd);

strbuf_readfile() might make this a bit cleaner (though has higher overhead).

> +       while (path[len - 1] == '\n' || path[len - 1] == '\r')
> +               len--;

If, for some reason, 'gitdir' content is empty or consists only of CR
and LF, this will access memory outside of the allocated region.
Probably want:

    while (len > 0 && (... || ...))

> +       path[len] = '\0';
> +       if (!file_exists(path)) {

What happens if 'path' ends up empty (due to hand-editing of 'gitdir'
by the user, for instance)? Does this case deserve an appropriate
diagnostic ("corrupt 'gitdir' file" or something)?

> +               struct stat st_link;
> +               free(path);
> +               /*
> +                * the repo is moved manually and has not been
> +                * accessed since?
> +                */
> +               if (!stat(git_path("repos/%s/link", id), &st_link) &&
> +                   st_link.st_nlink > 1)
> +                       return NULL;
> +               return _("gitdir points to non-existing file");

s/existing/existent/
s/file/location/

> +       }
> +       free(path);
> +       return NULL;
> +}
> +
> +static void prune_repos_dir(void)
> +{
> +       const char *reason;
> +       DIR *dir = opendir(git_path("repos"));
> +       struct dirent *d;
> +       int removed = 0;
> +       struct stat st;
> +       if (!dir)
> +               return;
> +       while ((d = readdir(dir)) != NULL) {
> +               if (!strcmp(d->d_name, ".") || !strcmp(d->d_name, ".."))
> +                       continue;
> +               if ((reason = prune_repo_dir(d->d_name, &st)) != NULL &&
> +                   st.st_mtime <= expire) {
> +                       struct strbuf sb = STRBUF_INIT;
> +                       if (show_only || verbose)
> +                               printf(_("Removing repos/%s: %s\n"), d->d_name, reason);
> +                       if (show_only)
> +                               continue;
> +                       strbuf_addstr(&sb, git_path("repos/%s", d->d_name));
> +                       remove_dir_recursively(&sb, 0);

What happens if this entry in 'repos' is a plain file (or other
non-directory)? Based upon my reading of remove_dir_recursively(), it
won't be deleted, yet the logic of prune_repo_dir() implies that such
an entry should be pruned. Perhaps handle this case specially with
unlink()?

> +                       strbuf_release(&sb);
> +                       removed = 1;
> +               }
> +       }
> +       closedir(dir);
> +       if (removed)
> +               rmdir(git_path("repos"));

This works, but at first glance it seems strange not to be checking
'show_only' before calling destructive rmdir().

However, stepping back, it's not quite clear what the intent is.
Ignoring the return value of rmdir() implies that you trust it to Do
The Right Thing: succeed when 'repos' is empty and fail when not. This
assumption of behavior applies regardless of whether or not any
content of 'repos' was removed, so the 'removed' flag does not seem
beneficial.

Moreover, the 'removed' flag actively prevents the 'repos' directory
from being pruned in the corner case where the user manually emptied
the content of 'repos' before invoking "git prune". Therefore, it
might be simpler to drop the 'removed' variable altogether and
rephrase as:

    if (!show_only)
        rmdir(git_path("repos"));

> +}
> +
>  /*
>   * Write errors (particularly out of space) can result in
>   * failed temporary packs (and more rarely indexes and other
> @@ -138,10 +202,12 @@ int cmd_prune(int argc, const char **argv, const char *prefix)
>  {
>         struct rev_info revs;
>         struct progress *progress = NULL;
> +       int prune_repos = 0;
>         const struct option options[] = {
>                 OPT__DRY_RUN(&show_only, N_("do not remove, show only")),
>                 OPT__VERBOSE(&verbose, N_("report pruned objects")),
>                 OPT_BOOL(0, "progress", &show_progress, N_("show progress")),
> +               OPT_BOOL(0, "repos", &prune_repos, N_("prune .git/repos/")),
>                 OPT_EXPIRY_DATE(0, "expire", &expire,
>                                 N_("expire objects older than <time>")),
>                 OPT_END()
> @@ -154,6 +220,14 @@ int cmd_prune(int argc, const char **argv, const char *prefix)
>         init_revisions(&revs, prefix);
>
>         argc = parse_options(argc, argv, prefix, options, prune_usage, 0);
> +
> +       if (prune_repos) {
> +               if (argc)
> +                       die(_("--repos does not take extra arguments"));
> +               prune_repos_dir();
> +               return 0;
> +       }
> +
>         while (argc--) {
>                 unsigned char sha1[20];
>                 const char *name = *argv++;
> diff --git a/setup.c b/setup.c
> index 8f90bc3..da2d669 100644
> --- a/setup.c
> +++ b/setup.c
> @@ -390,6 +390,17 @@ static int check_repository_format_gently(const char *gitdir, int *nongit_ok)
>         return ret;
>  }
>
> +static void update_linked_gitdir(const char *gitfile, const char *gitdir)
> +{
> +       struct strbuf path = STRBUF_INIT;
> +       struct stat st;
> +
> +       strbuf_addf(&path, "%s/gitfile", gitdir);
> +       if (stat(path.buf, &st) || st.st_mtime + 24 * 3600 < time(NULL))
> +               write_file(path.buf, 0, "%s\n", gitfile);
> +       strbuf_release(&path);
> +}
> +
>  /*
>   * Try to read the location of the git directory from the .git file,
>   * return path to git directory if found.
> @@ -438,6 +449,8 @@ const char *read_gitfile(const char *path)
>
>         if (!is_git_directory(dir))
>                 die("Not a git repository: %s", dir);
> +
> +       update_linked_gitdir(path, dir);
>         path = real_path(dir);
>
>         free(buf);
> --
> 1.9.1.346.ga2b5940
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]