Re: [WIP RFC PATCH v2 1/5] dir-iterator: add flags parameter to dir_iterator_begin

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Feb 26, 2019 at 9:02 AM Duy Nguyen <pclouds@xxxxxxxxx> wrote:
>
> On Tue, Feb 26, 2019 at 12:18 PM Matheus Tavares
> <matheus.bernardino@xxxxxx> wrote:
> >  int dir_iterator_advance(struct dir_iterator *dir_iterator)
> >  {
> >         struct dir_iterator_int *iter =
> >                 (struct dir_iterator_int *)dir_iterator;
> > +       int ret;
> >
> >         while (1) {
> >                 struct dir_iterator_level *level =
> > @@ -71,6 +75,8 @@ int dir_iterator_advance(struct dir_iterator *dir_iterator)
> >
> >                         level->dir = opendir(iter->base.path.buf);
> >                         if (!level->dir && errno != ENOENT) {
> > +                               if (iter->flags & DIR_ITERATOR_PEDANTIC)
> > +                                       goto error_out;
> >                                 warning("error opening directory %s: %s",
> >                                         iter->base.path.buf, strerror(errno));
> >                                 /* Popping the level is handled below */
> > @@ -122,6 +128,8 @@ int dir_iterator_advance(struct dir_iterator *dir_iterator)
> >                         if (!de) {
> >                                 /* This level is exhausted; pop up a level. */
> >                                 if (errno) {
> > +                                       if (iter->flags & DIR_ITERATOR_PEDANTIC)
> > +                                               goto error_out;
> >                                         warning("error reading directory %s: %s",
> >                                                 iter->base.path.buf, strerror(errno));
> >                                 } else if (closedir(level->dir))
> > @@ -138,11 +146,20 @@ int dir_iterator_advance(struct dir_iterator *dir_iterator)
> >                                 continue;
> >
> >                         strbuf_addstr(&iter->base.path, de->d_name);
> > -                       if (lstat(iter->base.path.buf, &iter->base.st) < 0) {
> > -                               if (errno != ENOENT)
> > +
> > +                       if (iter->flags & DIR_ITERATOR_FOLLOW_SYMLINKS)
> > +                               ret = stat(iter->base.path.buf, &iter->base.st);
> > +                       else
> > +                               ret = lstat(iter->base.path.buf, &iter->base.st);
> > +
> > +                       if (ret < 0) {
> > +                               if (errno != ENOENT) {
> > +                                       if (iter->flags & DIR_ITERATOR_PEDANTIC)
> > +                                               goto error_out;
> >                                         warning("error reading path '%s': %s",
> >                                                 iter->base.path.buf,
> >                                                 strerror(errno));
> > +                               }
> >                                 continue;
> >                         }
> >
> > @@ -159,6 +176,10 @@ int dir_iterator_advance(struct dir_iterator *dir_iterator)
> >                         return ITER_OK;
> >                 }
> >         }
> > +
> > +error_out:
> > +       dir_iterator_abort(dir_iterator);
>
> Should this function call this or leaveit to the caller? The
> description says "free resources" which to me sounds like something
> the caller should be aware of. Although if double dir_iterator_abort()
> has no bad side effects then I guess we can even leave it here.
>

The documentation at dir-iterator.h says that in case of errors (or
iteration exhaustion) at dir_iterator_advance(), the dir-iterator and
associated resources will be freed, so dir_iterator_abort() must be
called here, not left to the user. (Also, the use-case example at the
top of dir-iterator.h confirms this.)

> PS. files-backend.c does call dir_iterator_abort() unconditionally.
> Which sounds like this double-abort pattern should be dealt with even
> if that call site does not use the pedantic flag (it could later on,
> who knows; don't leave traps behind).
>

I have read the code at files-backend.c again but couldn't find where
it could call dir_iterator_abort() after dir_iterator_advance() have
already called it. Could you please point me out that?

A double-abort would certainly lead to 'double free or corruption'
error, so API users must only call it to abort iteration for some
reason when treating the iteration files, not because of iteration
errors. Also, this behaviour is not only for the pedantic flag. When
an iteration is complete, for example, a call to
dir_iterator_advance() will free the resources and return ITER_DONE.

> > +       return ITER_ERROR;
> >  }
> >
> > diff --git a/dir-iterator.h b/dir-iterator.h
> > index 970793d07a..fe9eb9a04b 100644
> > --- a/dir-iterator.h
> > +++ b/dir-iterator.h
> > @@ -6,9 +6,10 @@
> >  /*
> >   * Iterate over a directory tree.
> >   *
> > - * Iterate over a directory tree, recursively, including paths of all
> > - * types and hidden paths. Skip "." and ".." entries and don't follow
> > - * symlinks except for the original path.
> > + * With no flags to modify behaviour, iterate over a directory tree,
>
> Nit but I think we can just skip "With no flags to modify behavior". It's given.
>

Ok!

> > + * recursively, including paths of all types and hidden paths. Skip
> > + * "." and ".." entries and don't follow symlinks except for the
> > + * original path.
> >   *
> >   * Every time dir_iterator_advance() is called, update the members of
> >   * the dir_iterator structure to reflect the next path in the
> > @@ -19,7 +20,7 @@
> >   * A typical iteration looks like this:
> >   *
> >   *     int ok;
> > - *     struct iterator *iter = dir_iterator_begin(path);
> > + *     struct iterator *iter = dir_iterator_begin(path, 0);
> >   *
> >   *     while ((ok = dir_iterator_advance(iter)) == ITER_OK) {
> >   *             if (want_to_stop_iteration()) {
> > @@ -40,6 +41,20 @@
> >   * dir_iterator_advance() again.
> >   */
> >
> > +/*
> > + * Flags for dir_iterator_begin:
> > + *
> > + * - DIR_ITERATOR_PEDANTIC: override dir-iterator's default behavior
> > + *   in case of an error while trying to fetch the next entry, which is
> > + *   to emit a warning and keep going. With this flag, resouces are
> > + *   freed and ITER_ERROR is return immediately.
> > + *
> > + * - DIR_ITERATOR_FOLLOW_SYMLINKS: make dir-iterator follow symlinks to
> > + *   directories, i.e., iterate over linked directories' contents.
>
> Do we really need this flag? If dir-iterator does not follow symlinks,
> the caller _can_ check stat data to detect symlinks and look inside
> anyway. So this flag is more about convenience (_if_ it has more than
> one call site, convenience for one call site is just not worth it).
>

I don't think is just a convenience. If the API user wants to follow
symlinks a simple check at stat data to detect symlinks, wouldn't be
enough, since the user would want to traverse through all contents of
a symlinked directory and its subdirectories, recursively. Without
this flag, the caller would need to manually detect symlinks to
directories and start a new iteration on them (probably using a new
dir-iterator?). As that could lead to a more unnecessarily complex
code, and being this flag implementation so simple, I think it is
worthy.

> Or is there something else I'm missing?
>
> > + */
> > +#define DIR_ITERATOR_PEDANTIC (1 << 0)
> > +#define DIR_ITERATOR_FOLLOW_SYMLINKS (1 << 1)
> > +
> --
> Duy



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux