Re: [RFC PATCH util-linux] text-utils: add bits command

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Thomas,

Thomas Weißschuh, Oct 16, 2024 at 23:00:
On 2024-10-16 22:26:22+0200, Robin Jarry wrote:
Add a new test utility to convert between bit masks in various formats.

"test" or "text" utility?

That was a typo, I meant "text" :)

This can be handy to avoid parsing affinity masks in one's head and/or
to interact with the kernel in a more human friendly way.

This is a rewrite in C of the bits command from my linux-tools python
package so that it can be more widely available.

Here is an example:

 ~# cat /sys/kernel/debug/tracing/tracing_cpumask
 fffffff,ffffffff,ffffffff,ffffffff
 ~# bits -l ,$(cat /sys/kernel/debug/tracing/tracing_cpumask)
 0-128
 ~# bits -g 58,59,120,123
 9000000,00000000,0c000000,00000000
 ~# bits -g 58,59 > /sys/kernel/debug/tracing/tracing_cpumask
 ~# echo 1 > /sys/kernel/debug/tracing/tracing_on

Add man page and basic tests.

Link: https://git.sr.ht/~rjarry/linux-tools#bits
Signed-off-by: Robin Jarry <robin@xxxxxxxx>
---
 bash-completion/bits                  |  21 ++
 meson.build                           |  11 ++
 tests/commands.sh                     |   1 +
 tests/expected/misc/bits              |   0
 tests/expected/misc/bits-and          |   1 +
 tests/expected/misc/bits-binary       |   1 +
 tests/expected/misc/bits-default      |   1 +
 tests/expected/misc/bits-grouped-mask |   1 +
 tests/expected/misc/bits-list         |   1 +
 tests/expected/misc/bits-mask         |   1 +
 tests/expected/misc/bits-not          |   1 +
 tests/expected/misc/bits-or           |   1 +
 tests/expected/misc/bits-overflow     |   1 +
 tests/expected/misc/bits-parse-mask   |   1 +
 tests/expected/misc/bits-parse-range  |   1 +
 tests/expected/misc/bits-stdin        |   1 +
 tests/expected/misc/bits-xor          |   1 +
 tests/ts/misc/bits                    |  82 ++++++++
 text-utils/Makemodule.am              |   6 +
 text-utils/bits.1.adoc                | 147 ++++++++++++++
 text-utils/bits.c                     | 265 ++++++++++++++++++++++++++
 text-utils/meson.build                |   4 +
 22 files changed, 550 insertions(+)
 create mode 100644 bash-completion/bits
 create mode 100644 tests/expected/misc/bits
 create mode 100644 tests/expected/misc/bits-and
 create mode 100644 tests/expected/misc/bits-binary
 create mode 100644 tests/expected/misc/bits-default
 create mode 100644 tests/expected/misc/bits-grouped-mask
 create mode 100644 tests/expected/misc/bits-list
 create mode 100644 tests/expected/misc/bits-mask
 create mode 100644 tests/expected/misc/bits-not
 create mode 100644 tests/expected/misc/bits-or
 create mode 100644 tests/expected/misc/bits-overflow
 create mode 100644 tests/expected/misc/bits-parse-mask
 create mode 100644 tests/expected/misc/bits-parse-range
 create mode 100644 tests/expected/misc/bits-stdin
 create mode 100644 tests/expected/misc/bits-xor
 create mode 100755 tests/ts/misc/bits
 create mode 100644 text-utils/bits.1.adoc
 create mode 100644 text-utils/bits.c

...

+++ b/tests/ts/misc/bits
@@ -0,0 +1,82 @@
+#!/bin/bash
+
+#
+# Copyright (c) 2024 Robin Jarry
+#
+# This file is part of util-linux.
+#
+# This file is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This file is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.

New files should use SPDX tags.

Ack, will change in v2.

diff --git a/text-utils/bits.c b/text-utils/bits.c
new file mode 100644
index 000000000000..6dd0db81a5de
--- /dev/null
+++ b/text-utils/bits.c
@@ -0,0 +1,265 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * Copyright (c) 2024 Robin Jarry
+ *
+ * bits - convert bit masks from/to various formats
+ */
+
+#include <errno.h>
+#include <getopt.h>
+#include <sched.h>
+#include <stdbool.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <string.h>
+#include <unistd.h>
+
+#include "c.h"
+#include "closestream.h"
+#include "cpuset.h"
+#include "nls.h"
+#include "strutils.h"
+#include "strv.h"
+
+typedef enum {
+	OP_OR,
+	OP_AND,
+	OP_NOT,
+	OP_XOR,
+} bitvise_op_t;

"bitwise"?
Also avoid using the _t suffix as it's reserved.
"enum bitwise_op" seems clear enough.

Yes, bitwise is the correct spelling. I will probably remove that in v2 as you suggested below.

+
+static int parse_mask_or_list(const char *cmdline_arg, cpu_set_t *all_bits)
+{
+	bitvise_op_t op = OP_OR;
+	cpu_set_t bits, copy;

This seems very geared towards cpu masks.
What happens if a user wants to use it with very large non-CPU bitmasks,
as the tool is advertised as a generic utility?

Call the utility "cpumask"?

The default cpu_set_t size can hold up to 1024 bits (128 bytes). I assumed it was more than enough for common use cases. I added a basic test that checks that the program does not crash when specifying a too large mask.

I also didn't want to call that tool "*mask" because it can also deal with lists. But I don't have a strong opinion on the matter.


+	char buf[BUFSIZ];

Only use BUFSIZ in cases where the actual value does not matter,
for example loops.
On musl BUFSIZ == 1024 and this could not be enough fairly soon.

Good point. I wasn't sure what is a reasonable maximum argument size. execve(2) says MAX_ARG_STRLEN (32 pages) which seems excessive. I could hardcode 8192 and be done with it. What do you think?


+	char *arg = buf;
+
+	/* copy to allow modifying the argument contents */
+	strlcpy(buf, cmdline_arg, sizeof(buf));
+	buf[sizeof(buf) - 1] = '\0';

You can just use xstrdup().

But then, I'd have to free it. I wanted to avoid this.


+
+	/* strip optional operator first */
+	if (startswith(arg, "&")) {
+		op = OP_AND;
+		arg++;
+	} else if (startswith(arg, "^")) {
+		op = OP_XOR;
+		arg++;
+	} else if (startswith(arg, "~")) {
+		op = OP_NOT;
+		arg++;
+	} else if (startswith(arg, "|")) {
+		op = OP_OR;
+		arg++;
+	}

The whole op enum could also be replaced by the actual character,
this is all in a single function.

op = '&';

...

switch (op) {
case '&':
}

Good point. I'll do that in v2.


+
+	if (startswith(arg, ",") || startswith(arg, "0x")) {
+		if (cpumask_parse(arg, &bits, sizeof(bits)) < 0)
+			return -EINVAL;

As this is a commandline tool you can call errx() here to kill the
process and provide better error information.
It would also make the call sequence slightly more simple.

Ack.


+	} else {
+		if (cpulist_parse(arg, &bits, sizeof(bits), 1) < 0)
+			return -EINVAL;
+	}
+
+	switch (op) {
+	case OP_AND:
+		copy = *all_bits;
+		CPU_AND(all_bits, &copy, &bits);
+		break;
+	case OP_OR:
+		copy = *all_bits;
+		CPU_OR(all_bits, &copy, &bits);
+		break;
+	case OP_XOR:
+		copy = *all_bits;
+		CPU_XOR(all_bits, &copy, &bits);
+		break;
+	case OP_NOT:
+		for (int i = 0; i < CPU_SETSIZE; i++) {
+			if (CPU_ISSET(i, &bits))
+				CPU_CLR(i, all_bits);
+		}
+		break;
+	}
+
+	return 0;
+}
+
+typedef enum {
+	MODE_BINARY,
+	MODE_GROUPED_MASK,
+	MODE_LIST,
+	MODE_MASK,
+} output_mode_t;
+
+static void print_bits(cpu_set_t *bits, output_mode_t mode)
+{
+	bool started = false;
+	char buf[BUFSIZ];
+	ssize_t n = 0;
+
+	buf[0] = '\0';

Is this necessary?

It may have been at some point but it isn't anymore. I'll remove it.


+
+	switch (mode) {
+	case MODE_MASK:
+		cpumask_create(buf, sizeof(buf), bits, sizeof(*bits));
+
+		/* strip leading zeroes */
+		while (buf[n] == '0')
+			n++;
+		if (buf[n] == '\0')
+			printf("0x0\n");
+		else
+			printf("0x%s\n", buf + n);
+		break;
+
+	case MODE_GROUPED_MASK:
+		cpumask_create(buf, sizeof(buf), bits, sizeof(*bits));
+
+		/* strip leading zeroes */
+		while (buf[n] == '0')
+			n++;
+
+		while (buf[n] != '\0') {
+			if (started && (n % 8) == 0)
+				printf(",");
+			if (buf[n] != '0')
+				started = true;
+			printf("%c", buf[n]);
+			n++;
+		}
+		printf("\n");
+		break;
+
+	case MODE_BINARY:
+		printf("0b");
+		for (n = CPU_SETSIZE; n >= 0; n--) {
+			if (started && ((n + 1) % 4) == 0)
+				printf("_");
+			if (CPU_ISSET(n, bits)) {
+				started = true;
+				printf("1");
+			} else if (started) {
+				printf("0");
+			}
+		}
+		printf("\n");
+		break;
+
+	case MODE_LIST:
+		cpulist_create(buf, sizeof(buf), bits, sizeof(*bits));

This could use the return values of cpulist_create() and
cpumask_create().

Since I am printing into a stack buffer, is there a benefit?

+		printf("%s\n", buf);
+		break;
+	}
+
+}
+
+static void __attribute__((__noreturn__)) usage(void)
+{
+	fputs(USAGE_HEADER, stdout);
+	fprintf(stdout, _(" %s [options] [<mask_or_list>...]\n"), program_invocation_short_name);
+
+	fputs(USAGE_SEPARATOR, stdout);
+	fputsln(_("Convert bit masks from/to various formats."), stdout);
+
+	fputs(USAGE_ARGUMENTS, stdout);
+	fputsln(_(" <mask_or_list>      A set of bits specified as a hex mask value (e.g. 0xeec2)\n"
+		"                     or as a comma-separated list of bit IDs.\n"
+		"                     If not specified, arguments will be read from stdin."), stdout);
+
+	fputs(USAGE_OPTIONS, stdout);
+	fprintf(stdout, USAGE_HELP_OPTIONS(21));
+
+	fputs(_("\nMode:\n"), stdout);
+	fputsln(_(" -m, --mask          Print the combined args as a hex mask value (default).\n"), stdout);
+	fputsln(_(" -g, --grouped-mask  Print the combined args as a hex mask value in 32bit\n"
+			"                     comma separated groups."), stdout);
+	fputsln(_(" -b, --binary        Print the combined args as a binary mask value."), stdout);
+	fputsln(_(" -l, --list          Print the combined args as a compressed list of bit IDs."), stdout);
+
+	fputs(USAGE_SEPARATOR, stdout);
+	fprintf(stdout, USAGE_MAN_TAIL("bits(1)"));
+	exit(EXIT_SUCCESS);
+}
+
+int main(int argc, char **argv)
+{
+	output_mode_t mode = MODE_MASK;
+	char **stdin_lines = NULL;
+	cpu_set_t bits;
+	int c;
+
+	static const struct option longopts[] = {
+		{ "version",      no_argument, NULL, 'V'},
+		{ "help",         no_argument, NULL, 'h'},
+		{ "mask",         no_argument, NULL, 'm'},
+		{ "grouped-mask", no_argument, NULL, 'g'},
+		{ "binary",       no_argument, NULL, 'b'},
+		{ "list",         no_argument, NULL, 'l'},
+		{ NULL, 0, NULL, 0 }
+	};
+
+	setlocale(LC_ALL, "");
+	bindtextdomain(PACKAGE, LOCALEDIR);
+	textdomain(PACKAGE);
+	close_stdout_atexit();
+
+	while ((c = getopt_long(argc, argv, "Vhmgbl", longopts, NULL)) != -1)
+		switch (c) {
+		case 'm':
+			mode = MODE_MASK;
+			break;
+		case 'g':
+			mode = MODE_GROUPED_MASK;
+			break;
+		case 'b':
+			mode = MODE_BINARY;
+			break;
+		case 'l':
+			mode = MODE_LIST;
+			break;
+		case 'V':
+			print_version(EXIT_SUCCESS);
+		case 'h':
+			usage();
+		default:
+			errtryhelp(EXIT_FAILURE);
+		}
+
+	argc -= optind;
+	argv += optind;
+	if (argc == 0) {
+		/* no arguments provided, read lines from stdin */
+		char buf[BUFSIZ];
+
+		while (fgets(buf, sizeof(buf), stdin)) {
+			/* strip LF, CR, CRLF, LFCR */
+			buf[strcspn(buf, "\r\n")] = '\0';

rtrim_whitespace()?

Ack.


+			strv_push(&stdin_lines, strdup(buf));
+		}
+
+		argc = strv_length(stdin_lines);
+		argv = stdin_lines;
+	}
+
+	CPU_ZERO(&bits);
+
+	for (; argc > 0; argc--, argv++)
+		if (parse_mask_or_list(*argv, &bits) < 0) {
+			fprintf(stderr, _("error: invalid argument: %s\n"), *argv);
+			errtryhelp(EXIT_FAILURE);
+		}
+
+	strv_free(stdin_lines);

No need to free this here. The process will exit anyways.

I wanted to make ASAN happy. Is this not necessary?


+
+	print_bits(&bits, mode);
+
+	return EXIT_SUCCESS;
+}

Thanks for the review!






[Index of Archives]     [Netdev]     [Ethernet Bridging]     [Linux Wireless]     [Kernel Newbies]     [Security]     [Linux for Hams]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux RAID]     [Linux Admin]     [Samba]

  Powered by Linux