Re: [WIP RFC PATCH v2 5/5] clone: use dir-iterator to avoid explicit dir traversal

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Feb 26 2019, Duy Nguyen wrote:

> On Tue, Feb 26, 2019 at 12:18 PM Matheus Tavares
> <matheus.bernardino@xxxxxx> wrote:
>>
>> Replace usage of opendir/readdir/closedir API to traverse directories
>> recursively, at copy_or_link_directory function, by the dir-iterator
>> API. This simplifies the code and avoid recursive calls to
>> copy_or_link_directory.
>>
>> This process also makes copy_or_link_directory call die() in case of an
>> error on readdir or stat, inside dir_iterator_advance. Previously it
>> would just print a warning for errors on stat and ignore errors on
>> readdir, which isn't nice because a local git clone would end up
>> successfully even though the .git/objects copy didn't fully succeeded.
>>
>> Signed-off-by: Matheus Tavares <matheus.bernardino@xxxxxx>
>> ---
>> I can also make the change described in the last paragraph in a separate
>> patch before this one, but I would have to undo it in this patch because
>> dir-iterator already implements it. So, IMHO, it would be just noise
>> and not worthy.
>>
>>  builtin/clone.c | 45 +++++++++++++++++++++++----------------------
>>  1 file changed, 23 insertions(+), 22 deletions(-)
>>
>> diff --git a/builtin/clone.c b/builtin/clone.c
>> index fd580fa98d..b23ba64c94 100644
>> --- a/builtin/clone.c
>> +++ b/builtin/clone.c
>> @@ -23,6 +23,8 @@
>>  #include "transport.h"
>>  #include "strbuf.h"
>>  #include "dir.h"
>> +#include "dir-iterator.h"
>> +#include "iterator.h"
>>  #include "sigchain.h"
>>  #include "branch.h"
>>  #include "remote.h"
>> @@ -411,42 +413,37 @@ static void mkdir_if_missing(const char *pathname, mode_t mode)
>>  }
>>
>>  static void copy_or_link_directory(struct strbuf *src, struct strbuf *dest,
>> -                                  const char *src_repo, int src_baselen)
>> +                                  const char *src_repo)
>>  {
>> -       struct dirent *de;
>> -       struct stat buf;
>>         int src_len, dest_len;
>> -       DIR *dir;
>> -
>> -       dir = opendir(src->buf);
>> -       if (!dir)
>> -               die_errno(_("failed to open '%s'"), src->buf);
>> +       struct dir_iterator *iter;
>> +       int iter_status;
>> +       struct stat st;
>> +       unsigned flags;
>>
>>         mkdir_if_missing(dest->buf, 0777);
>>
>> +       flags = DIR_ITERATOR_PEDANTIC | DIR_ITERATOR_FOLLOW_SYMLINKS;
>> +       iter = dir_iterator_begin(src->buf, flags);
>> +
>>         strbuf_addch(src, '/');
>>         src_len = src->len;
>>         strbuf_addch(dest, '/');
>>         dest_len = dest->len;
>>
>> -       while ((de = readdir(dir)) != NULL) {
>> +       while ((iter_status = dir_iterator_advance(iter)) == ITER_OK) {
>>                 strbuf_setlen(src, src_len);
>> -               strbuf_addstr(src, de->d_name);
>> +               strbuf_addstr(src, iter->relative_path);
>>                 strbuf_setlen(dest, dest_len);
>> -               strbuf_addstr(dest, de->d_name);
>> -               if (stat(src->buf, &buf)) {
>> -                       warning (_("failed to stat %s\n"), src->buf);
>> -                       continue;
>> -               }
>> -               if (S_ISDIR(buf.st_mode)) {
>> -                       if (!is_dot_or_dotdot(de->d_name))
>> -                               copy_or_link_directory(src, dest,
>> -                                                      src_repo, src_baselen);
>> +               strbuf_addstr(dest, iter->relative_path);
>> +
>> +               if (S_ISDIR(iter->st.st_mode)) {
>> +                       mkdir_if_missing(dest->buf, 0777);
>
> I wonder if this mkdir_if_missing is sufficient. What if you have to
> create multiple directories?
>
> Let's say the first advance, we hit "a". The the second advance we hit
> directory "b/b/b/b", we would need to mkdir recursively and something
> like safe_create_leading_directories() would be a better fit.
>
> I'm not sure if it can happen though. I haven't re-read dir-iterator
> code carefully.

This part isn't a problem. It iterates one level at a time. So given a
structure like a/b/c/d/e/f/g/h/i/j/k/some-l you'll find that if you
instrument the loop in clone.c you get:

    dir = a
    dir = a/b
    dir = a/b/c
    dir = a/b/c/d
    dir = a/b/c/d/e
    dir = a/b/c/d/e/f
    dir = a/b/c/d/e/f/g
    dir = a/b/c/d/e/f/g/h
    dir = a/b/c/d/e/f/g/h/i
    dir = a/b/c/d/e/f/g/h/i/j
    dir = a/b/c/d/e/f/g/h/i/j/k
    dir = a/b/c/d/e/f/g/h/i/j/k/some-l

So it's like the old implementation in that way. It readdir()'s and
walks directories one level at a time.



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux