On Fri, 2024-09-06 at 11:06 +0300, Jani Nikula wrote: > On Fri, 06 Sep 2024, Roberto Sassu <roberto.sassu@xxxxxxxxxxxxxxx> wrote: > > On Thu, 2024-09-05 at 20:30 +0300, Jani Nikula wrote: > > > On Thu, 05 Sep 2024, Roberto Sassu <roberto.sassu@xxxxxxxxxxxxxxx> wrote: > > > > From: Roberto Sassu <roberto.sassu@xxxxxxxxxx> > > > > > > > > Add a parser of a generic Type-Length-Value (TLV) format: > > > > > > > > +-----------------+------------------+-----------------+ > > > > > data type (u64) | num fields (u64) | total len (u64) | # header > > > > +--------------+--+---------+--------+---------+-------+ > > > > > field1 (u64) | len1 (u64) | value1 (u8 len1) | > > > > +--------------+------------+------------------+ > > > > > ... | ... | ... | # data > > > > +--------------+------------+------------------+ > > > > > fieldN (u64) | lenN (u64) | valueN (u8 lenN) | > > > > +--------------+------------+------------------+ > > > > > > Okay, take this with a grain of salt. I'm actually not interested in > > > your use case, but the generic part here. But hear me out. > > > > > > Why do you need to have num fields in the header? I'd think the generic > > > TLV would have tag/length/value, where value may contain more TLV, or > > > not, depending on the use case specific tag. The same parser can parse > > > everything recursively, with no special handling for headers. To me, > > > that's the great part about TLV. > > > > Hi Jani > > > > the purpose of the number of entries is that the kernel uses it to > > correctly size the hash table, before adding digests to it. > > > > It also allows to group digests together and define attributes for all > > of them (e.g. the hash algorithm). > > > > You can also have multiple data blocks in the same file, and the header > > allows to go to the ones of interest. > > Or you could have a TLV entry to define how many fields follow. With no > dedicated header entries. You can choose what tags mean. Ok, I guess it is possible to do this way too. > Oh, another angle is that you could have e.g. the lowest bit in the tag > indicate whether the value is TLV too. So you can validate the TLV all > the way down to the non-TLV values. Not sure I follow, the parser associated to a TLV type knows if it is a nested TLV. Look at parse_digest_list_entry() in patch 7/14. The field parser is calling tlv_parse() inside. The parsers are user-defined and invoked through callbacks, the TLV parser itself remains the same regardless of how data are structured. > > > Also, making generic TLV have u64 tag and length is huge waste in most > > > use cases. Saving one byte requires 16 bytes of tag and length. You > > > could encode tag and length with UTF-8. Sure, it's wasteful if you need > > > an enormous amount of tags or huge lengths, but it's efficient for most > > > use cases. > > > > You are right, it is a huge size. You can introduce new fields, but not > > modify the tag and length size for retrocompatibility. > > > > Ok, I see your point for UTF-8. Let's see, I like how simple is the > > parser now. > > If you like simplicity, you should not have a different header entry. ;) Good point. > See utf8encode() and utf8decode() in fs/unicode/mkutf8data.c for an > implementation. Of course, this severely limits the max length from u64, > to put it mildly, max being 0x10ffff. Maybe that's a limitation you > don't want. But fixing at u64 seems bad too. I can reduce to u32, I doubt that more will be necessary. Thanks Roberto > BR, > Jani. > > > > > Thanks > > > > Roberto > > > > > Anyway, just my thoughts. > > > > > > > > > BR, > > > Jani. > > > > > > > > > > > [same as above, repeated N times] > > > > > > > > Each adopter can define its own data types and fields. The TLV parser does > > > > not need to be aware of those, but lets the adopter obtain the data and > > > > decide how to continue. > > > > > > > > After parsing each TLV header, call the header callback function with the > > > > callback data provided by the adopter. The latter can return 0, to skip > > > > processing of the TLV data, 1 to process the TLV data, or a negative value > > > > to stop processing the TLV data. > > > > > > > > After processing a TLV data entry, call the data callback function also > > > > with the callback data provided by the adopter. The latter can decide how > > > > to interpret the TLV data entry depending on the field ID. > > > > > > > > Nesting TLVs is also possible, the data callback function can call > > > > tlv_parse() to parse the inner structure. > > > > > > > > Signed-off-by: Roberto Sassu <roberto.sassu@xxxxxxxxxx> > > > > --- > > > > MAINTAINERS | 8 ++ > > > > include/linux/tlv_parser.h | 48 +++++++ > > > > include/uapi/linux/tlv_parser.h | 62 +++++++++ > > > > lib/Kconfig | 3 + > > > > lib/Makefile | 2 + > > > > lib/tlv_parser.c | 221 ++++++++++++++++++++++++++++++++ > > > > lib/tlv_parser.h | 17 +++ > > > > 7 files changed, 361 insertions(+) > > > > create mode 100644 include/linux/tlv_parser.h > > > > create mode 100644 include/uapi/linux/tlv_parser.h > > > > create mode 100644 lib/tlv_parser.c > > > > create mode 100644 lib/tlv_parser.h > > > > > > > > diff --git a/MAINTAINERS b/MAINTAINERS > > > > index 8766f3e5e87e..ba8d5c137bef 100644 > > > > --- a/MAINTAINERS > > > > +++ b/MAINTAINERS > > > > @@ -23055,6 +23055,14 @@ W: http://sourceforge.net/projects/tlan/ > > > > F: Documentation/networking/device_drivers/ethernet/ti/tlan.rst > > > > F: drivers/net/ethernet/ti/tlan.* > > > > > > > > +TLV PARSER > > > > +M: Roberto Sassu <roberto.sassu@xxxxxxxxxx> > > > > +L: linux-kernel@xxxxxxxxxxxxxxx > > > > +S: Maintained > > > > +F: include/linux/tlv_parser.h > > > > +F: include/uapi/linux/tlv_parser.h > > > > +F: lib/tlv_parser.* > > > > + > > > > TMIO/SDHI MMC DRIVER > > > > M: Wolfram Sang <wsa+renesas@xxxxxxxxxxxxxxxxxxxx> > > > > L: linux-mmc@xxxxxxxxxxxxxxx > > > > diff --git a/include/linux/tlv_parser.h b/include/linux/tlv_parser.h > > > > new file mode 100644 > > > > index 000000000000..6d9a655d9ec9 > > > > --- /dev/null > > > > +++ b/include/linux/tlv_parser.h > > > > @@ -0,0 +1,48 @@ > > > > +/* SPDX-License-Identifier: GPL-2.0 */ > > > > +/* > > > > + * Copyright (C) 2023-2024 Huawei Technologies Duesseldorf GmbH > > > > + * > > > > + * Author: Roberto Sassu <roberto.sassu@xxxxxxxxxx> > > > > + * > > > > + * Header file of TLV parser. > > > > + */ > > > > + > > > > +#ifndef _LINUX_TLV_PARSER_H > > > > +#define _LINUX_TLV_PARSER_H > > > > + > > > > +#include <uapi/linux/tlv_parser.h> > > > > + > > > > +/** > > > > + * typedef hdr_callback - Callback after parsing TLV header > > > > + * @callback_data: Opaque data to supply to the header callback function > > > > + * @data_type: TLV data type > > > > + * @num_entries: Number of TLV data entries > > > > + * @total_len: Total length of TLV data > > > > + * > > > > + * This callback is invoked after a TLV header is parsed. > > > > + * > > > > + * Return: 0 to skip processing, 1 to do processing, a negative value on error. > > > > + */ > > > > +typedef int (*hdr_callback)(void *callback_data, __u64 data_type, > > > > + __u64 num_entries, __u64 total_len); > > > > + > > > > +/** > > > > + * typedef data_callback - Callback after parsing TLV data entry > > > > + * @callback_data: Opaque data to supply to the data callback function > > > > + * @field: TLV field ID > > > > + * @field_data: Data of a TLV data field > > > > + * @field_len: Length of @field_data > > > > + * > > > > + * This callback is invoked after a TLV data entry is parsed. > > > > + * > > > > + * Return: 0 on success, a negative value on error. > > > > + */ > > > > +typedef int (*data_callback)(void *callback_data, __u64 field, > > > > + const __u8 *field_data, __u64 field_len); > > > > + > > > > +int tlv_parse(hdr_callback hdr_callback, void *hdr_callback_data, > > > > + data_callback data_callback, void *data_callback_data, > > > > + const __u8 *data, size_t data_len, const char **data_types, > > > > + __u64 num_data_types, const char **fields, __u64 num_fields); > > > > + > > > > +#endif /* _LINUX_TLV_PARSER_H */ > > > > diff --git a/include/uapi/linux/tlv_parser.h b/include/uapi/linux/tlv_parser.h > > > > new file mode 100644 > > > > index 000000000000..fbd4fc403ac7 > > > > --- /dev/null > > > > +++ b/include/uapi/linux/tlv_parser.h > > > > @@ -0,0 +1,62 @@ > > > > +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ > > > > +/* > > > > + * Copyright (C) 2023-2024 Huawei Technologies Duesseldorf GmbH > > > > + * > > > > + * Author: Roberto Sassu <roberto.sassu@xxxxxxxxxx> > > > > + * > > > > + * Implement the user space interface for the TLV parser. > > > > + */ > > > > + > > > > +#ifndef _UAPI_LINUX_TLV_PARSER_H > > > > +#define _UAPI_LINUX_TLV_PARSER_H > > > > + > > > > +#include <linux/types.h> > > > > + > > > > +/* > > > > + * TLV format: > > > > + * > > > > + * +-----------------+------------------+-----------------+ > > > > + * | data type (u64) | num fields (u64) | total len (u64) | # header > > > > + * +--------------+--+---------+--------+---------+-------+ > > > > + * | field1 (u64) | len1 (u64) | value1 (u8 len1) | > > > > + * +--------------+------------+------------------+ > > > > + * | ... | ... | ... | # data > > > > + * +--------------+------------+------------------+ > > > > + * | fieldN (u64) | lenN (u64) | valueN (u8 lenN) | > > > > + * +--------------+------------+------------------+ > > > > + * > > > > + * [same as above, repeated N times] > > > > + * > > > > + */ > > > > + > > > > +/** > > > > + * struct tlv_hdr - Header of TLV format > > > > + * @data_type: Type of data to parse > > > > + * @num_entries: Number of data entries provided > > > > + * @_reserved: Reserved for future use (must be equal to zero) > > > > + * @total_len: Total length of the data blob, excluding the header > > > > + * > > > > + * This structure represents the header of the TLV data format. > > > > + */ > > > > +struct tlv_hdr { > > > > + __u64 data_type; > > > > + __u64 num_entries; > > > > + __u64 _reserved; > > > > + __u64 total_len; > > > > +} __attribute__((packed)); > > > > + > > > > +/** > > > > + * struct tlv_data_entry - Data entry of TLV format > > > > + * @field: Data field identifier > > > > + * @length: Data length > > > > + * @data: Data > > > > + * > > > > + * This structure represents a TLV data entry. > > > > + */ > > > > +struct tlv_data_entry { > > > > + __u64 field; > > > > + __u64 length; > > > > + __u8 data[]; > > > > +} __attribute__((packed)); > > > > + > > > > +#endif /* _UAPI_LINUX_TLV_PARSER_H */ > > > > diff --git a/lib/Kconfig b/lib/Kconfig > > > > index b38849af6f13..9141dcfc1704 100644 > > > > --- a/lib/Kconfig > > > > +++ b/lib/Kconfig > > > > @@ -777,3 +777,6 @@ config POLYNOMIAL > > > > > > > > config FIRMWARE_TABLE > > > > bool > > > > + > > > > +config TLV_PARSER > > > > + bool > > > > diff --git a/lib/Makefile b/lib/Makefile > > > > index 322bb127b4dc..c6c3614c4293 100644 > > > > --- a/lib/Makefile > > > > +++ b/lib/Makefile > > > > @@ -392,6 +392,8 @@ obj-$(CONFIG_USERCOPY_KUNIT_TEST) += usercopy_kunit.o > > > > obj-$(CONFIG_GENERIC_LIB_DEVMEM_IS_ALLOWED) += devmem_is_allowed.o > > > > > > > > obj-$(CONFIG_FIRMWARE_TABLE) += fw_table.o > > > > +obj-$(CONFIG_TLV_PARSER) += tlv_parser.o > > > > +CFLAGS_tlv_parser.o += -I lib > > > > > > > > # FORTIFY_SOURCE compile-time behavior tests > > > > TEST_FORTIFY_SRCS = $(wildcard $(src)/test_fortify/*-*.c) > > > > diff --git a/lib/tlv_parser.c b/lib/tlv_parser.c > > > > new file mode 100644 > > > > index 000000000000..5d54844ab8d7 > > > > --- /dev/null > > > > +++ b/lib/tlv_parser.c > > > > @@ -0,0 +1,221 @@ > > > > +// SPDX-License-Identifier: GPL-2.0 > > > > +/* > > > > + * Copyright (C) 2023-2024 Huawei Technologies Duesseldorf GmbH > > > > + * > > > > + * Author: Roberto Sassu <roberto.sassu@xxxxxxxxxx> > > > > + * > > > > + * Implement the TLV parser. > > > > + */ > > > > + > > > > +#define pr_fmt(fmt) "tlv_parser: "fmt > > > > +#include <tlv_parser.h> > > > > + > > > > +/** > > > > + * tlv_parse_hdr - Parse TLV header > > > > + * @hdr_callback: Callback function to call after parsing header > > > > + * @hdr_callback_data: Opaque data to supply to the header callback function > > > > + * @data: Data to parse (updated) > > > > + * @data_len: Length of @data (updated) > > > > + * @parsed_num_entries: Parsed number of data entries (updated) > > > > + * @parsed_total_len: Parsed length of TLV data, excluding the header (updated) > > > > + * @data_types: Array of data type strings > > > > + * @num_data_types: Number of elements of @data_types > > > > + * > > > > + * Parse the header of the TLV data format, move the data pointer to the TLV > > > > + * data part, decrease the data length by the length of the header, and provide > > > > + * the number of entries and the total data length extracted from the header. > > > > + * > > > > + * Before returning, call the header callback to let the callback supplier > > > > + * decide whether or not to process the subsequent TLV data. > > > > + * > > > > + * Return: 1 to process the data entries, 0 to skip, a negative value on error. > > > > + */ > > > > +static int tlv_parse_hdr(hdr_callback hdr_callback, void *hdr_callback_data, > > > > + const __u8 **data, size_t *data_len, > > > > + __u64 *parsed_num_entries, __u64 *parsed_total_len, > > > > + const char **data_types, __u64 num_data_types) > > > > +{ > > > > + __u64 parsed_data_type; > > > > + struct tlv_hdr *hdr; > > > > + > > > > + if (*data_len < sizeof(*hdr)) { > > > > + pr_debug("Data blob too short, %lu bytes, expected %lu\n", > > > > + *data_len, sizeof(*hdr)); > > > > + return -EBADMSG; > > > > + } > > > > + > > > > + hdr = (struct tlv_hdr *)*data; > > > > + > > > > + *data += sizeof(*hdr); > > > > + *data_len -= sizeof(*hdr); > > > > + > > > > + parsed_data_type = __be64_to_cpu(hdr->data_type); > > > > + if (parsed_data_type >= num_data_types) { > > > > + pr_debug("Invalid data type %llu, max: %llu\n", > > > > + parsed_data_type, num_data_types - 1); > > > > + return -EBADMSG; > > > > + } > > > > + > > > > + *parsed_num_entries = __be64_to_cpu(hdr->num_entries); > > > > + > > > > + if (hdr->_reserved != 0) { > > > > + pr_debug("_reserved must be zero\n"); > > > > + return -EBADMSG; > > > > + } > > > > + > > > > + *parsed_total_len = __be64_to_cpu(hdr->total_len); > > > > + if (*parsed_total_len > *data_len) { > > > > + pr_debug("Invalid total length %llu, expected: %lu\n", > > > > + *parsed_total_len, *data_len); > > > > + return -EBADMSG; > > > > + } > > > > + > > > > + pr_debug("Header: type: %s, num entries: %llu, total len: %lld\n", > > > > + data_types[parsed_data_type], *parsed_num_entries, > > > > + *parsed_total_len); > > > > + > > > > + return hdr_callback(hdr_callback_data, parsed_data_type, > > > > + *parsed_num_entries, *parsed_total_len); > > > > +} > > > > + > > > > +/** > > > > + * tlv_parse_data - Parse TLV data > > > > + * @data_callback: Callback function to call to parse the data entries > > > > + * @data_callback_data: Opaque data to supply to the data callback function > > > > + * @num_entries: Number of data entries to parse > > > > + * @data: Data to parse > > > > + * @data_len: Length of @data > > > > + * @fields: Array of field strings > > > > + * @num_fields: Number of elements of @fields > > > > + * > > > > + * Parse the data part of the TLV data format and call the supplied callback > > > > + * function for each data entry, passing also the opaque data pointer. > > > > + * > > > > + * The data callback function decides how to process data depending on the > > > > + * field. > > > > + * > > > > + * Return: 0 on success, a negative value on error. > > > > + */ > > > > +static int tlv_parse_data(data_callback data_callback, void *data_callback_data, > > > > + __u64 num_entries, const __u8 *data, size_t data_len, > > > > + const char **fields, __u64 num_fields) > > > > +{ > > > > + const __u8 *data_ptr = data; > > > > + struct tlv_data_entry *entry; > > > > + __u64 parsed_field, len, i, max_num_entries; > > > > + int ret; > > > > + > > > > + max_num_entries = data_len / sizeof(*entry); > > > > + > > > > + /* Possibly lower limit on num_entries loop. */ > > > > + if (num_entries > max_num_entries) > > > > + return -EBADMSG; > > > > + > > > > + for (i = 0; i < num_entries; i++) { > > > > + if (data_len < sizeof(*entry)) > > > > + return -EBADMSG; > > > > + > > > > + entry = (struct tlv_data_entry *)data_ptr; > > > > + data_ptr += sizeof(*entry); > > > > + data_len -= sizeof(*entry); > > > > + > > > > + parsed_field = __be64_to_cpu(entry->field); > > > > + if (parsed_field >= num_fields) { > > > > + pr_debug("Invalid field %llu, max: %llu\n", > > > > + parsed_field, num_fields - 1); > > > > + return -EBADMSG; > > > > + } > > > > + > > > > + len = __be64_to_cpu(entry->length); > > > > + > > > > + if (data_len < len) > > > > + return -EBADMSG; > > > > + > > > > + pr_debug("Data: field: %s, len: %llu\n", fields[parsed_field], > > > > + len); > > > > + > > > > + if (!len) > > > > + continue; > > > > + > > > > + ret = data_callback(data_callback_data, parsed_field, data_ptr, > > > > + len); > > > > + if (ret < 0) { > > > > + pr_debug("Parsing of field %s failed, ret: %d\n", > > > > + fields[parsed_field], ret); > > > > + return ret; > > > > + } > > > > + > > > > + data_ptr += len; > > > > + data_len -= len; > > > > + } > > > > + > > > > + if (data_len) { > > > > + pr_debug("Excess data: %lu bytes\n", data_len); > > > > + return -EBADMSG; > > > > + } > > > > + > > > > + return 0; > > > > +} > > > > + > > > > +/** > > > > + * tlv_parse - Parse data in TLV format > > > > + * @hdr_callback: Callback function to call after parsing header > > > > + * @hdr_callback_data: Opaque data to supply to the header callback function > > > > + * @data_callback: Callback function to call to parse the data entries > > > > + * @data_callback_data: Opaque data to supply to the data callback function > > > > + * @data: Data to parse > > > > + * @data_len: Length of @data > > > > + * @data_types: Array of data type strings > > > > + * @num_data_types: Number of elements of @data_types > > > > + * @fields: Array of field strings > > > > + * @num_fields: Number of elements of @fields > > > > + * > > > > + * Parse data in TLV format and call tlv_parse_data() each time tlv_parse_hdr() > > > > + * returns 1. > > > > + * > > > > + * Return: 0 on success, a negative value on error. > > > > + */ > > > > +int tlv_parse(hdr_callback hdr_callback, void *hdr_callback_data, > > > > + data_callback data_callback, void *data_callback_data, > > > > + const __u8 *data, size_t data_len, const char **data_types, > > > > + __u64 num_data_types, const char **fields, __u64 num_fields) > > > > +{ > > > > + __u64 parsed_num_entries, parsed_total_len; > > > > + const __u8 *data_ptr = data; > > > > + int ret = 0; > > > > + > > > > + pr_debug("Start parsing data blob, size: %lu\n", data_len); > > > > + > > > > + while (data_len) { > > > > + ret = tlv_parse_hdr(hdr_callback, hdr_callback_data, &data_ptr, > > > > + &data_len, &parsed_num_entries, > > > > + &parsed_total_len, data_types, > > > > + num_data_types); > > > > + switch (ret) { > > > > + case 0: > > > > + /* > > > > + * tlv_parse_hdr() already checked that > > > > + * parsed_total_len <= data_len. > > > > + */ > > > > + data_ptr += parsed_total_len; > > > > + data_len -= parsed_total_len; > > > > + continue; > > > > + case 1: > > > > + break; > > > > + default: > > > > + goto out; > > > > + } > > > > + > > > > + ret = tlv_parse_data(data_callback, data_callback_data, > > > > + parsed_num_entries, data_ptr, > > > > + parsed_total_len, fields, num_fields); > > > > + if (ret < 0) > > > > + goto out; > > > > + > > > > + data_ptr += parsed_total_len; > > > > + data_len -= parsed_total_len; > > > > + } > > > > +out: > > > > + pr_debug("End of parsing data blob, ret: %d\n", ret); > > > > + return ret; > > > > +} > > > > diff --git a/lib/tlv_parser.h b/lib/tlv_parser.h > > > > new file mode 100644 > > > > index 000000000000..8fa8127bd13e > > > > --- /dev/null > > > > +++ b/lib/tlv_parser.h > > > > @@ -0,0 +1,17 @@ > > > > +/* SPDX-License-Identifier: GPL-2.0 */ > > > > +/* > > > > + * Copyright (C) 2023-2024 Huawei Technologies Duesseldorf GmbH > > > > + * > > > > + * Author: Roberto Sassu <roberto.sassu@xxxxxxxxxx> > > > > + * > > > > + * Header file of TLV parser. > > > > + */ > > > > + > > > > +#ifndef _LIB_TLV_PARSER_H > > > > +#define _LIB_TLV_PARSER_H > > > > + > > > > +#include <linux/kernel.h> > > > > +#include <linux/err.h> > > > > +#include <linux/tlv_parser.h> > > > > + > > > > +#endif /* _LIB_TLV_PARSER_H */ > > > > > >