Re: [PATCH v4 1/6] kbuild: add a tool to generate a list of files ignored by git

On Thu, Feb 02, 2023 at 12:37:11PM +0900 Masahiro Yamada wrote:
> In short, the motivation of this commit is to build a source package
> without cleaning the source tree.
> 
> The deb-pkg and (src)rpm-pkg targets first run 'make clean' before
> creating a source tarball. Otherwise build artifacts such as *.o,
> *.a, etc. would be included in the tarball. Yet, the tarball ends up
> containing several garbage files since 'make clean' does not clean
> everything.
> 
> Cleaning the tree every time is annoying since it makes the incremental
> build impossible. It is desirable to create a source tarball without
> cleaning the tree.
> 
> In fact, there are some ways to achieve this.
> 
> The easiest way is 'git archive'. Actually, 'make perf-tar*-src-pkg'
> does this way, but I do not like it because it works only when the source
> tree is managed by git, and all files you want in the tarball must be
> committed in advance.
> 
> I want to make it work without relying on git. We can do this.
> 
> Files that are not tracked by git are generated files. We can list them
> out by parsing the .gitignore files. Of course, .gitignore does not cover
> all the cases, but it works well enough.
> 
> tar(1) claims to support it:
> 
>   --exclude-vcs-ignores
> 
>     Exclude files that match patterns read from VCS-specific ignore files.
>     Supported files are: .cvsignore, .gitignore, .bzrignore, and .hgignore.
> 
> The best scenario would be to use 'tar --exclude-vcs-ignores', but this
> option does not work. --exclude-vcs-ignores does not understand any of
> the negation (!), preceding slash, following slash, etc. So, this option
> is just useless.
> 
> Hence, I wrote this gitignore parser. The previous version [1], written
> in Python, was so slow. This version is implemented in C, so it works
> much faster.
> 
> This tool traverses the source tree, parsing the .gitignore files. It
> prints the file paths that are not tracked by git. The output can be
> used for tar's --exclude-from= option.
> 
> [How to test this tool]
> 
>   $ git clean -dfx
>   $ make -s -j$(nproc) defconfig all                       # or allmodconfig or whatever
>   $ git archive -o ../linux1.tar --prefix=./ HEAD
>   $ tar tf ../linux1.tar | LANG=C sort > ../file-list1     # files emitted by 'git archive'
>   $ make scripts_exclude
>     HOSTCC  scripts/gen-exclude
>   $ scripts/gen-exclude --prefix=./ -o ../exclude-list
>   $ tar cf ../linux2.tar --exclude-from=../exclude-list .
>   $ tar tf ../linux2.tar | LANG=C sort > ../file-list2     # files emitted by 'tar'
>   $ diff  ../file-list1 ../file-list2 | grep -E '^(<|>)'
>   < ./Documentation/devicetree/bindings/.yamllint
>   < ./drivers/clk/.kunitconfig
>   < ./drivers/gpu/drm/tests/.kunitconfig
>   < ./drivers/gpu/drm/vc4/tests/.kunitconfig
>   < ./drivers/hid/.kunitconfig
>   < ./fs/ext4/.kunitconfig
>   < ./fs/fat/.kunitconfig
>   < ./kernel/kcsan/.kunitconfig
>   < ./lib/kunit/.kunitconfig
>   < ./mm/kfence/.kunitconfig
>   < ./net/sunrpc/.kunitconfig
>   < ./tools/testing/selftests/arm64/tags/
>   < ./tools/testing/selftests/arm64/tags/.gitignore
>   < ./tools/testing/selftests/arm64/tags/Makefile
>   < ./tools/testing/selftests/arm64/tags/run_tags_test.sh
>   < ./tools/testing/selftests/arm64/tags/tags_test.c
>   < ./tools/testing/selftests/kvm/.gitignore
>   < ./tools/testing/selftests/kvm/Makefile
>   < ./tools/testing/selftests/kvm/config
>   < ./tools/testing/selftests/kvm/settings
> 
> The source tarball contains most of the files that are tracked by git. You
> see some diffs, but that is just because some .gitignore files are wrong.
> 
>   $ git ls-files -i -c --exclude-per-directory=.gitignore
>   Documentation/devicetree/bindings/.yamllint
>   drivers/clk/.kunitconfig
>   drivers/gpu/drm/tests/.kunitconfig
>   drivers/hid/.kunitconfig
>   fs/ext4/.kunitconfig
>   fs/fat/.kunitconfig
>   kernel/kcsan/.kunitconfig
>   lib/kunit/.kunitconfig
>   mm/kfence/.kunitconfig
>   tools/testing/selftests/arm64/tags/.gitignore
>   tools/testing/selftests/arm64/tags/Makefile
>   tools/testing/selftests/arm64/tags/run_tags_test.sh
>   tools/testing/selftests/arm64/tags/tags_test.c
>   tools/testing/selftests/kvm/.gitignore
>   tools/testing/selftests/kvm/Makefile
>   tools/testing/selftests/kvm/config
>   tools/testing/selftests/kvm/settings
> 
> [1]: https://lore.kernel.org/all/20230128173843.765212-1-masahiroy@xxxxxxxxxx/
> 
> Signed-off-by: Masahiro Yamada <masahiroy@xxxxxxxxxx>
> ---
> 
> (no changes since v3)
> 
> Changes in v3:
>  - Various code refactoring: remove struct gitignore, remove next: label etc.
>  - Support --extra-pattern option
> 
> Changes in v2:
>  - Reimplement in C
> 
>  Makefile              |   4 +
>  scripts/.gitignore    |   1 +
>  scripts/Makefile      |   2 +-
>  scripts/gen-exclude.c | 623 ++++++++++++++++++++++++++++++++++++++++++
>  4 files changed, 629 insertions(+), 1 deletion(-)
>  create mode 100644 scripts/gen-exclude.c
> 
> diff --git a/Makefile b/Makefile
> index 2faf872b6808..35b294cc6f32 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -1652,6 +1652,10 @@ distclean: mrproper
>  %pkg: include/config/kernel.release FORCE
>  	$(Q)$(MAKE) -f $(srctree)/scripts/Makefile.package $@
>  
> +PHONY += scripts_exclude
> +scripts_exclude: scripts_basic
> +	$(Q)$(MAKE) $(build)=scripts scripts/gen-exclude
> +
>  # Brief documentation of the typical targets used
>  # ---------------------------------------------------------------------------
>  
> diff --git a/scripts/.gitignore b/scripts/.gitignore
> index 6e9ce6720a05..7f433bc1461c 100644
> --- a/scripts/.gitignore
> +++ b/scripts/.gitignore
> @@ -1,5 +1,6 @@
>  # SPDX-License-Identifier: GPL-2.0-only
>  /asn1_compiler
> +/gen-exclude
>  /generate_rust_target
>  /insert-sys-cert
>  /kallsyms
> diff --git a/scripts/Makefile b/scripts/Makefile
> index 32b6ba722728..5dcd7f57607f 100644
> --- a/scripts/Makefile
> +++ b/scripts/Makefile
> @@ -38,7 +38,7 @@ HOSTCFLAGS_sorttable.o += -DMCOUNT_SORT_ENABLED
>  endif
>  
>  # The following programs are only built on demand
> -hostprogs += unifdef
> +hostprogs += gen-exclude unifdef
>  
>  # The module linker script is preprocessed on demand
>  targets += module.lds
> diff --git a/scripts/gen-exclude.c b/scripts/gen-exclude.c
> new file mode 100644
> index 000000000000..5c4ecd902290
> --- /dev/null
> +++ b/scripts/gen-exclude.c
> @@ -0,0 +1,623 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +//
> +// Traverse the source tree, parsing all .gitignore files, and print file paths
> +// that are not tracked by git.
> +// The output is suitable to the --exclude-from option of tar.
> +// This is useful until the --exclude-vcs-ignores option gets working correctly.
> +//
> +// Copyright (C) 2023 Masahiro Yamada <masahiroy@xxxxxxxxxx>
> +
> +#include <dirent.h>
> +#include <errno.h>
> +#include <fcntl.h>
> +#include <fnmatch.h>
> +#include <getopt.h>
> +#include <stdarg.h>
> +#include <stdbool.h>
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <string.h>
> +#include <sys/stat.h>
> +#include <sys/types.h>
> +#include <unistd.h>
> +
> +#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))
> +
> +// struct pattern - represent an ignore pattern (a line in .gitignore)
> +// @negate:          negate the pattern (prefixing '!')
> +// @dir_only:        only matches directories (trailing '/')
> +// @path_match:      true if the glob pattern is a path instead of a file name
> +// @double_asterisk: true if the glob pattern contains double asterisks ('**')
> +// @glob:            glob pattern
> +struct pattern {
> +	bool negate;
> +	bool dir_only;
> +	bool path_match;
> +	bool double_asterisk;
> +	char glob[];
> +};
> +
> +struct pattern **patterns;

Is there a reason why 'patterns' is not static?  (sparse asked)

> +static int nr_patterns, alloced_patterns;
> +
> +// Remember the number of patterns at each directory level
> +static int *nr_patterns_at;
> +// Track the current/max directory level;
> +static int depth, max_depth;
> +static bool debug_on;
> +static FILE *out_fp;
> +static char *prefix = "";
> +static char *progname;
> +
> +static void __attribute__((noreturn)) perror_exit(const char *s)
> +{
> +	perror(s);
> +
> +	exit(EXIT_FAILURE);
> +}
> +
> +static void __attribute__((noreturn)) error_exit(const char *fmt, ...)
> +{
> +	va_list args;
> +
> +	fprintf(stderr, "%s: error: ", progname);
> +
> +	va_start(args, fmt);
> +	vfprintf(stderr, fmt, args);
> +	va_end(args);
> +
> +	exit(EXIT_FAILURE);
> +}
> +
> +static void debug(const char *fmt, ...)
> +{
> +	va_list args;
> +	int i;
> +
> +	if (!debug_on)
> +		return;
> +
> +	fprintf(stderr, "[DEBUG]");
> +
> +	for (i = 0; i < depth * 2; i++)
> +		fputc(' ', stderr);
> +
> +	va_start(args, fmt);
> +	vfprintf(stderr, fmt, args);
> +	va_end(args);
> +}
> +
> +static void *xrealloc(void *ptr, size_t size)
> +{
> +	ptr = realloc(ptr, size);
> +	if (!ptr)
> +		perror_exit(progname);
> +
> +	return ptr;
> +}
> +
> +static void *xmalloc(size_t size)
> +{
> +	return xrealloc(NULL, size);
> +}
> +
> +static char *xstrdup(const char *s)
> +{
> +	char *new = strdup(s);
> +
> +	if (!new)
> +		perror_exit(progname);
> +
> +	return new;
> +}
> +
> +static bool simple_match(const char *string, const char *pattern)
> +{
> +	return fnmatch(pattern, string, FNM_PATHNAME) == 0;
> +}
> +
> +// Handle double asterisks ("**") matching.
> +// FIXME:
> +//  This function does not work if double asterisks appear multiple times,
> +//  like "foo/**/bar/**/baz".
> +static bool double_asterisk_match(const char *path, const char *pattern)
> +{
> +	bool result = false;
> +	int slash_diff = 0;
> +	char *modified_pattern, *q;
> +	const char *p;
> +	size_t len;
> +
> +	for (p = path; *p; p++)
> +		if (*p == '/')
> +			slash_diff++;
> +
> +	for (p = pattern; *p; p++)
> +		if (*p == '/')
> +			slash_diff--;
> +
> +	len = strlen(pattern) + 1;
> +
> +	if (slash_diff > 0)
> +		len += slash_diff * 2;
> +	modified_pattern = xmalloc(len);
> +
> +	q = modified_pattern;
> +	for (p = pattern; *p; p++) {
> +		if (!strncmp(p, "**/", 3)) {
> +			// "**/" means zero or more sequences of "*/".
> +			// "foo**/bar" matches "foobar", "foo*/bar",
> +			// "foo*/*/bar", etc.
> +			while (slash_diff-- > 0) {
> +				*q++ = '*';
> +				*q++ = '/';
> +			}
> +
> +			if (slash_diff == 0) {
> +				*q++ = '*';
> +				*q++ = '/';
> +			}
> +
> +			if (slash_diff < 0)
> +				slash_diff++;
> +
> +			p += 2;
> +		} else if (!strcmp(p, "/**")) {
> +			// A trailing "/**" matches everything inside.

In v2 you also checked against "*(p + 3) == '\0'".  Is the explicit check
against end-of-string really not needed here?  (pattern = "whatever/**/*.tmp"?)
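
For comparison, here is a minimal sketch of the two forms of the check.  The
helper names are hypothetical and not part of the patch.  As far as I can
tell, strcmp() only returns 0 when the whole remainder of the pattern is
exactly "/**", so it already implies the end-of-string test, while a
strncmp()-based prefix compare would need it spelled out:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

// True only if the remainder of the pattern starting at p is exactly "/**",
// i.e. the "/**" sits at the very end of the pattern.
static bool is_trailing_double_asterisk(const char *p)
{
	return strcmp(p, "/**") == 0;
}

// Equivalent explicit form: prefix compare plus end-of-string check.
static bool is_trailing_double_asterisk_explicit(const char *p)
{
	return strncmp(p, "/**", 3) == 0 && *(p + 3) == '\0';
}

// Both forms agree on a trailing and on a mid-pattern "/**".
static void check_trailing_double_asterisk(void)
{
	const char *trailing = "whatever/**";
	const char *middle = "whatever/**/*.tmp";

	// Offset 8 is the first '/' in both patterns.
	assert(is_trailing_double_asterisk(trailing + 8));
	assert(is_trailing_double_asterisk_explicit(trailing + 8));
	assert(!is_trailing_double_asterisk(middle + 8));
	assert(!is_trailing_double_asterisk_explicit(middle + 8));
}
```

With "whatever/**/*.tmp", neither form takes the trailing "/**" branch at the
first slash, so that part of the pattern is left to the "**/" branch.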

> +			while (slash_diff-- >= 0) {
> +				*q++ = '/';
> +				*q++ = '*';
> +			}
> +
> +			p += 2;
> +		} else {
> +			// Copy other patterns as-is.
> +			// Other consecutive asterisks are considered regular
> +			// asterisks. fnmatch() already handles them like that.
> +			*q++ = *p;
> +		}
> +	}
> +
> +	*q = '\0';
> +
> +	result = simple_match(path, modified_pattern);
> +
> +	free(modified_pattern);
> +
> +	return result;
> +}
> +
> +// Return true if the given path is ignored by git.
> +static bool is_ignored(const char *path, const char *name, bool is_dir)
> +{
> +	int i;
> +
> +	// Search the patterns in the reverse order because the last matching
> +	// pattern wins.
> +	for (i = nr_patterns - 1; i >= 0; i--) {
> +		struct pattern *p = patterns[i];
> +
> +		if (!is_dir && p->dir_only)
> +			continue;
> +
> +		if (!p->path_match) {
> +			// If the pattern has no slash at the beginning or
> +			// middle, it matches against the basename. Most cases
> +			// fall into this and work well with double asterisks.
> +			if (!simple_match(name, p->glob))
> +				continue;
> +		} else if (!p->double_asterisk) {
> +			// Unless the pattern has double asterisks, it is still
> +			// simple but matches against the path instead.
> +			if (!simple_match(path, p->glob))
> +				continue;
> +		} else {
> +			// Double asterisks with a slash. Complex, but rare.
> +			if (!double_asterisk_match(path, p->glob))
> +				continue;
> +		}
> +
> +		debug("%s: matches %s%s%s\n", path, p->negate ? "!" : "",
> +		      p->glob, p->dir_only ? "/" : "");
> +
> +		return !p->negate;
> +	}
> +
> +	debug("%s: no match\n", path);
> +
> +	return false;
> +}
> +
> +// Return the length of the initial segment of the string that does not contain
> +// the unquoted sequence of the given character. Similar to strcspn() in libc.

I stumbled over this comment, and it took me quite some time to match it to
strcspn_trailer()'s behaviour.  I expect the function to strip all unescaped
occurrences of c at the end of str and return the resulting strlen.  After
reading the comment several times, I can see the match.  I _think_ the main
confusion came from my (quite imperfect) English:

  "one two  "
   ^^^         initial segment of string not containing unquoted c ??

   ^^^^^^^     substr that is considered by strcspn_trailer

But this is just about a comment and I'm sure I understand what is intended.
No action required.
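
To double-check my reading, here is a small standalone sketch.  The function
body is copied from the patch (I only added a cast to avoid a signed/unsigned
comparison); the check function and its inputs are my own:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

// Copy of strcspn_trailer() from the patch, with one cast added.
static size_t strcspn_trailer(const char *str, char c)
{
	bool quoted = false;
	size_t len = strlen(str);
	size_t spn = len;
	const char *s;

	for (s = str; *s; s++) {
		if (!quoted && *s == c) {
			// Remember where the current run of c started.
			if ((size_t)(s - str) < spn)
				spn = s - str;
		} else {
			// Any other character ends the run.
			spn = len;

			if (!quoted && *s == '\\')
				quoted = true;
			else
				quoted = false;
		}
	}

	return spn;
}

static void check_strcspn_trailer(void)
{
	assert(strcspn_trailer("one two  ", ' ') == 7);	// trailing run is cut
	assert(strcspn_trailer("one two", ' ') == 7);	// inner space is kept
	assert(strcspn_trailer("foo\\ ", ' ') == 5);	// escaped space is kept
	assert(strcspn_trailer("foo\\  ", ' ') == 5);	// only the unescaped one is cut
	assert(strcspn_trailer("dir//", '/') == 3);	// same for trailing slashes
}
```

So the return value is the length of the string with the unescaped trailing
run of c removed, which is what I expected.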

> +static size_t strcspn_trailer(const char *str, char c)
> +{
> +	bool quoted = false;
> +	size_t len = strlen(str);
> +	size_t spn = len;
> +	const char *s;
> +
> +	for (s = str; *s; s++) {
> +		if (!quoted && *s == c) {
> +			if (s - str < spn)
> +				spn = s - str;
> +		} else {
> +			spn = len;

Is this really intended?  Or 'spn = str - s + 1'?

> +
> +			if (!quoted && *s == '\\')
> +				quoted = true;
> +			else
> +				quoted = false;
> +		}
> +	}
> +
> +	return spn;
> +}
> +
> +// Add a gitignore pattern.
> +static void add_pattern(char *s, const char *dirpath)
> +{
> +	bool negate = false;
> +	bool dir_only = false;
> +	bool path_match = false;
> +	bool double_asterisk = false;
> +	char *e = s + strlen(s);
> +	struct pattern *p;
> +	size_t len;
> +
> +	// Skip comments
> +	if (*s == '#')
> +		return;
> +
> +	// Trailing spaces are ignored unless they are quoted with backslash.
> +	e = s + strcspn_trailer(s, ' ');
> +	*e = '\0';
> +
> +	// The prefix '!' negates the pattern
> +	if (*s == '!') {
> +		s++;
> +		negate = true;
> +	}
> +
> +	// If there is slash(es) that is not escaped at the end of the pattern,
> +	// it matches only directories.

Are escaped slashes allowed in file names in git?  I think use of original
strcspn() would have been enough.

> +	len = strcspn_trailer(s, '/');
> +	if (s + len < e) {
> +		dir_only = true;
> +		e = s + len;
> +		*e = '\0';
> +	}
> +
> +	// Skip if the line gets empty
> +	if (*s == '\0')
> +		return;
> +
> +	// Double asterisk is tricky. Mark it to handle it specially later.
> +	if (strstr(s, "**/") || strstr(s, "/**"))
> +		double_asterisk = true;
> +
> +	// If there is a slash at the beginning or middle, the pattern
> +	// is relative to the directory level of the .gitignore.
> +	if (strchr(s, '/')) {
> +		if (*s == '/')
> +			s++;
> +		path_match = true;
> +	}
> +
> +	len = e - s;
> +
> +	// We need more room to store dirpath and '/'
> +	if (path_match)
> +		len += strlen(dirpath) + 1;
> +
> +	p = xmalloc(sizeof(*p) + len + 1);
> +	p->negate = negate;
> +	p->dir_only = dir_only;
> +	p->path_match = path_match;
> +	p->double_asterisk = double_asterisk;
> +	p->glob[0] = '\0';

(bike-shedding)
  *p = (struct pattern) {
	.negate = negate,
	.dir_only = dir_only,
	.path_match = path_match,
	.double_asterisk = double_asterisk,
  };
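
Spelled out as a standalone sketch, assuming the struct layout from the patch.
new_pattern() is a hypothetical helper for illustration only; the patch does
this inline in add_pattern() and uses xmalloc():

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

struct pattern {
	bool negate;
	bool dir_only;
	bool path_match;
	bool double_asterisk;
	char glob[];
};

// Hypothetical helper: allocate a pattern and set the flags in one go.
static struct pattern *new_pattern(const char *glob, bool negate, bool dir_only)
{
	struct pattern *p = malloc(sizeof(*p) + strlen(glob) + 1);

	if (!p)
		return NULL;

	// p is a pointer, so the assignment goes through *p.
	// Members that are not named are zero-initialized.
	*p = (struct pattern) {
		.negate = negate,
		.dir_only = dir_only,
	};

	// The assignment does not initialize the flexible array member,
	// so glob still has to be filled in afterwards.
	strcpy(p->glob, glob);

	return p;
}
```

The explicit "p->glob[0] = '\0'" (or a copy as above) is still needed after
the assignment, so this saves less than it may look like.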


> +
> +	if (path_match) {
> +		strcat(p->glob, dirpath);
> +		strcat(p->glob, "/");
> +	}
> +
> +	strcat(p->glob, s);
> +
> +	debug("Add pattern: %s%s%s\n", negate ? "!" : "", p->glob,
> +	      dir_only ? "/" : "");
> +
> +	if (nr_patterns >= alloced_patterns) {
> +		alloced_patterns += 128;
> +		patterns = xrealloc(patterns,
> +				    sizeof(*patterns) * alloced_patterns);
> +	}
> +
> +	patterns[nr_patterns++] = p;
> +}
> +
> +static void *load_gitignore(const char *dirpath)
> +{
> +	struct stat st;
> +	char path[PATH_MAX], *buf;
> +	int fd, ret;
> +
> +	ret = snprintf(path, sizeof(path), "%s/.gitignore", dirpath);
> +	if (ret >= sizeof(path))
> +		error_exit("%s: too long path was truncated\n", path);
> +
> +	// If .gitignore does not exist in this directory, open() fails.
> +	// It is ok, just skip it.
> +	fd = open(path, O_RDONLY);
> +	if (fd < 0)
> +		return NULL;

Why don't you check against errno == 2 (ENOENT)?  I assume no other
errno value is expected, but to me it feels a bit odd not to check it
and exit loudly if something (unlikely) like EMFILE causes open() to
fail.
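
Roughly what I have in mind, as a minimal sketch.  open_optional() is a
hypothetical name; in the patch this would use perror_exit() inside
load_gitignore() instead of the plain perror()/exit() pair:

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

// Hypothetical helper: open a file that is allowed to be absent.
// Returns the file descriptor, or -1 if the file does not exist.
// Any other open() failure (EMFILE, EACCES, ...) is treated as fatal.
static int open_optional(const char *path)
{
	int fd = open(path, O_RDONLY);

	if (fd < 0) {
		if (errno == ENOENT)
			return -1;

		perror(path);
		exit(EXIT_FAILURE);
	}

	return fd;
}
```

A missing .gitignore is then skipped quietly, while an unexpected error stops
the run instead of silently producing an incomplete exclude list.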

> +
> +	if (fstat(fd, &st) < 0)
> +		perror_exit(path);
> +
> +	buf = xmalloc(st.st_size + 1);
> +	if (read(fd, buf, st.st_size) != st.st_size)
> +		perror_exit(path);
> +
> +	buf[st.st_size] = '\0';
> +	if (close(fd))
> +		perror_exit(path);
> +
> +	return buf;
> +}
> +
> +// Parse '.gitignore' in the given directory.
> +static void parse_gitignore(const char *dirpath)
> +{
> +	char *buf, *s, *next;
> +
> +	buf = load_gitignore(dirpath);
> +	if (!buf)
> +		return;
> +
> +	debug("Parse %s/.gitignore\n", dirpath);
> +
> +	for (s = buf; *s; s = next) {
> +		next = s;
> +
> +		while (*next != '\0' && *next != '\n')

Not relevant for in-tree use: git does not complain about a '\0' in a
.gitignore and still handles the remaining part of the file.
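
If that ever mattered, the loop could walk the buffer by length instead of
stopping at the first '\0'.  A minimal sketch under that assumption follows;
split_lines() is a hypothetical helper, and I have not checked how git itself
treats the bytes that follow a '\0' within a line:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Hypothetical helper: split buf[0..size) into lines, terminating each one
// in place. An embedded '\0' only cuts the line it appears in short; the
// following lines are still visited.
// The caller must guarantee buf[size] == '\0', as load_gitignore() does.
// Returns the number of line start pointers stored in lines[].
static size_t split_lines(char *buf, size_t size, char **lines, size_t max)
{
	char *s = buf, *end = buf + size;
	size_t nr = 0;

	while (s < end && nr < max) {
		char *nl = memchr(s, '\n', (size_t)(end - s));

		if (nl)
			*nl = '\0';

		lines[nr++] = s;
		s = nl ? nl + 1 : end;
	}

	return nr;
}

static void check_split_lines(void)
{
	// Second line starts with a NUL byte.
	char buf[] = "*.o\n\0junk\n*.a\n";
	char *lines[4];
	size_t nr = split_lines(buf, sizeof(buf) - 1, lines, 4);

	assert(nr == 3);
	assert(strcmp(lines[0], "*.o") == 0);
	assert(lines[1][0] == '\0');
	assert(strcmp(lines[2], "*.a") == 0);
}
```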

> +			next++;
> +
> +		if (*next != '\0') {
> +			*next = '\0';
> +			next++;
> +		}
> +
> +		add_pattern(s, dirpath);
> +	}
> +
> +	free(buf);
> +}
> +
> +// Save the current number of patterns and increment the depth
> +static void increment_depth(void)
> +{
> +	if (depth >= max_depth) {
> +		max_depth += 1;
> +		nr_patterns_at = xrealloc(nr_patterns_at,
> +					  sizeof(*nr_patterns_at) * max_depth);
> +	}
> +
> +	nr_patterns_at[depth] = nr_patterns;
> +	depth++;
> +}
> +
> +// Decrement the depth, and free up the patterns of this directory level.
> +static void decrement_depth(void)
> +{
> +	depth--;
> +	if (depth < 0)
> +		error_exit("BUG\n");
> +
> +	while (nr_patterns > nr_patterns_at[depth])
> +		free(patterns[--nr_patterns]);
> +}
> +
> +// If we find an ignored path, print it.
> +static void print_path(const char *path)
> +{
> +	// The path always starts with "./". If not, it is a bug.
> +	if (strlen(path) < 2)
> +		error_exit("BUG\n");
> +
> +	// Replace the root directory with the prefix you like.
> +	// This is useful for the tar command.
> +	fprintf(out_fp, "%s%s\n", prefix, path + 2);
> +}
> +
> +// Traverse the entire directory tree, parsing .gitignore files.
> +// Print file paths that are not tracked by git.
> +//
> +// Return true if all files under the directory are ignored, false otherwise.
> +static bool traverse_directory(const char *dirpath)
> +{
> +	bool all_ignored = true;
> +	DIR *dirp;
> +
> +	debug("Enter[%d]: %s\n", depth, dirpath);
> +	increment_depth();
> +
> +	// We do not know whether .gitignore exists in this directory or not.
> +	// Anyway, try to open it.
> +	parse_gitignore(dirpath);
> +
> +	dirp = opendir(dirpath);
> +	if (!dirp)
> +		perror_exit(dirpath);
> +
> +	while (1) {
> +		char path[PATH_MAX];
> +		struct dirent *d;
> +		int ret;
> +
> +		errno = 0;
> +		d = readdir(dirp);
> +		if (!d) {
> +			// readdir() returns NULL on the end of the directory
> +			// stream, and also on an error. To distinguish them,
> +			// errno should be checked.
> +			if (errno)
> +				perror_exit(dirpath);
> +			break;
> +		}
> +
> +		if (!strcmp(d->d_name, "..") || !strcmp(d->d_name, "."))
> +			continue;
> +
> +		ret = snprintf(path, sizeof(path), "%s/%s", dirpath, d->d_name);
> +		if (ret >= sizeof(path))
> +			error_exit("%s: too long path was truncated\n", path);
> +
> +		if (is_ignored(path, d->d_name, d->d_type & DT_DIR)) {
> +			debug("Ignore: %s\n", path);
> +			print_path(path);
> +		} else {
> +			if ((d->d_type & DT_DIR) && !(d->d_type & DT_LNK)) {
> +				if (!traverse_directory(path))
> +					all_ignored = false;
> +			} else {
> +				all_ignored = false;
> +			}
> +		}
> +	}
> +
> +	if (closedir(dirp))
> +		perror_exit(dirpath);
> +
> +	// If all the files under this directory are ignored, let's ignore this
> +	// directory as well in order to avoid empty directories in the tarball.
> +	if (all_ignored) {
> +		debug("Ignore: %s (due to all files inside ignored)\n", dirpath);
> +		print_path(dirpath);
> +	}
> +
> +	decrement_depth();
> +	debug("Leave[%d]: %s\n", depth, dirpath);
> +
> +	return all_ignored;
> +}
> +
> +// Register hard-coded ignore patterns.
> +static void add_fixed_patterns(void)
> +{
> +	const char * const fixed_patterns[] = {
> +		".git/",
> +	};
> +	int i;
> +
> +	for (i = 0; i < ARRAY_SIZE(fixed_patterns); i++) {
> +		char *s = xstrdup(fixed_patterns[i]);
> +
> +		add_pattern(s, ".");
> +		free(s);
> +	}
> +}
> +
> +static void usage(void)
> +{
> +	fprintf(stderr,
> +		"usage: %s [options]\n"
> +		"\n"
> +		"Print files that are not ignored by git\n"
> +		"\n"
> +		"options:\n"
> +		"  -d, --debug                   print debug messages to stderr\n"
> +		"  -e, --extra-pattern PATTERN   Add extra ignore patterns. This behaves like it is prepended to the top .gitignore\n"
> +		"  -h, --help                    show this help message and exit\n"
> +		"  -o, --output FILE             output to a file (default: '-', i.e. stdout)\n"
> +		"  -p, --prefix PREFIX           prefix added to each path (default: empty string)\n"
> +		"  -r, --rootdir DIR             root of the source tree (default: current working directory):\n",
> +		progname);
> +}
> +
> +int main(int argc, char *argv[])
> +{
> +	const char *output = "-";
> +	const char *rootdir = ".";
> +
> +	progname = strrchr(argv[0], '/');
> +	if (progname)
> +		progname++;
> +	else
> +		progname = argv[0];
> +
> +	while (1) {
> +		static struct option long_options[] = {
> +			{"debug",         no_argument,       NULL, 'd'},
> +			{"extra-pattern", required_argument, NULL, 'e'},
> +			{"help",          no_argument,       NULL, 'h'},
> +			{"output",        required_argument, NULL, 'o'},
> +			{"prefix",        required_argument, NULL, 'p'},
> +			{"rootdir",       required_argument, NULL, 'r'},
> +			{},
> +		};
> +
> +		int c = getopt_long(argc, argv, "de:ho:p:r:", long_options, NULL);
> +
> +		if (c == -1)
> +			break;
> +
> +		switch (c) {
> +		case 'd':
> +			debug_on = true;
> +			break;
> +		case 'e':
> +			add_pattern(optarg, ".");
> +			break;
> +		case 'h':
> +			usage();
> +			exit(0);
> +		case 'o':
> +			output = optarg;
> +			break;
> +		case 'p':
> +			prefix = optarg;
> +			break;
> +		case 'r':
> +			rootdir = optarg;
> +			break;
> +		case '?':
> +			usage();
> +			/* fallthrough */
> +		default:
> +			exit(EXIT_FAILURE);
> +		}
> +	}
> +
> +	if (chdir(rootdir))
> +		perror_exit(rootdir);
> +
> +	if (strcmp(output, "-")) {
> +		out_fp = fopen(output, "w");
> +		if (!out_fp)
> +			perror_exit(output);
> +	} else {
> +		out_fp = stdout;
> +	}
> +
> +	add_fixed_patterns();
> +
> +	traverse_directory(".");
> +
> +	if (depth != 0)
> +		error_exit("BUG\n");
> +
> +	while (nr_patterns > 0)
> +		free(patterns[--nr_patterns]);
> +	free(patterns);
> +	free(nr_patterns_at);
> +
> +	fflush(out_fp);
> +	if (ferror(out_fp))
> +		error_exit("not all data was written to the output\n");
> +
> +	if (fclose(out_fp))
> +		perror_exit(output);
> +
> +	return 0;
> +}
> -- 
> 2.34.1

I like the idea of gen-exclude.

Testing with some strange patterns reveals a few gaps.  This should not be
a problem in practice, as nobody wants to write such .gitignore patterns,
but for completeness:

  $ mkdir -p test/foo/bar
  $ touch test/foo/bar/baz.tmp
  $ cat <<-eof >test/.gitignore
  **/*.tmp
  **/baz.tmp
  foo/**/*.tmp
  **/bar/baz.tmp
  /**/*.tmp
  eof
  $ cd test
  $ ../scripts/gen-exclude --debug
  [DEBUG]Add pattern: .git/
  [DEBUG]Enter[0]: .
  [DEBUG]  ./test: no match
  [DEBUG]  Enter[1]: ./test
  [DEBUG]    Parse ./test/.gitignore
  [DEBUG]    Add pattern: ./test/**/*.tmp
  [DEBUG]    Add pattern: ./test/**/baz.tmp
  [DEBUG]    Add pattern: ./test/foo/**/*.tmp
  [DEBUG]    Add pattern: ./test/**/bar/baz.tmp
  [DEBUG]    Add pattern: ./test/**/*.tmp
  [DEBUG]    ./test/.gitignore: no match
  [DEBUG]    ./test/foo: no match
  [DEBUG]    Enter[2]: ./test/foo
  [DEBUG]      ./test/foo/bar: no match
  [DEBUG]      Enter[3]: ./test/foo/bar
  [DEBUG]        ./test/foo/bar/baz.tmp: no match
  [DEBUG]      Leave[3]: ./test/foo/bar
  [DEBUG]    Leave[2]: ./test/foo
  [DEBUG]  Leave[1]: ./test
  [DEBUG]Leave[0]: .

Thus, no match.  Everything else I tested did what I expected.
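
For reference, a small sketch of what fnmatch() with FNM_PATHNAME accepts for
this path.  simple_match() is copied from the patch; the expanded pattern
strings are my own assumption about what a "**/" expansion would have to look
like here.  My guess is that the expansion comes up one "*/" short, but I have
not verified that in double_asterisk_match() itself:

```c
#include <assert.h>
#include <fnmatch.h>
#include <stdbool.h>

// Same as simple_match() in the patch.
static bool simple_match(const char *string, const char *pattern)
{
	return fnmatch(pattern, string, FNM_PATHNAME) == 0;
}

static void check_expansions(void)
{
	const char *path = "./test/foo/bar/baz.tmp";

	// One "*/" is not enough: with FNM_PATHNAME, '*' never crosses a '/'.
	assert(!simple_match(path, "./test/*/*.tmp"));

	// "**/" has to stand in for "foo/bar/", i.e. two "*/" sequences.
	assert(simple_match(path, "./test/*/*/*.tmp"));
	assert(simple_match(path, "./test/*/*/baz.tmp"));

	// Here "**/" only has to cover a single directory level.
	assert(simple_match(path, "./test/foo/*/*.tmp"));
	assert(simple_match(path, "./test/*/bar/baz.tmp"));
}
```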

Reviewed-by: Nicolas Schier <nicolas@xxxxxxxxx>
Tested-by: Nicolas Schier <nicolas@xxxxxxxxx>

Kind regards,
Nicolas
