This series improves the packing library with a new API for packing or unpacking a large number of fields at once with minimal code footprint. The API is then used to replace bespoke packing logic in the ice driver, preparing it to handle unpacking in the future. Finally, the ice driver has a few other cleanups related to the packing logic. The pack_fields and unpack_fields functions have the following improvements over the existing pack() and unpack() API: 1. Packing or unpacking a large number of fields takes significantly less code. This significantly reduces the .text size for an increase in the .data size which is much smaller. 2. The unpacked data can be stored in sizes smaller than u64 variables. This reduces the storage requirement both for runtime data structures, and for the rodata defining the fields. This scales with the number of fields used. 3. Most of the error checking is done at compile time, rather than runtime, via additional checks added to modpost. This saves wasted computation time *and* catches errors in the field definitions early, rather than only after the offending code is executed. The actual packing and unpacking code still uses the u64 size variables. However, these are converted to the appropriate field sizes when storing or reading the data from the buffer. The compile time checks do add some complexity to modpost. An attempt was made originally to implement the checks with C macros. Unfortunately, the C pre-processor cannot generate loops. Without loops, drivers are responsible for calling the correct CHECK macro. In addition, this required several thousand lines of macro to provide variants for the sizes expected to be used. While this was possible to be generated, it still meant a lot of mess spreading to the drivers, Makefile, and build process. To enable modpost checks, the packed field arrays are placed into a special section of the binary, using DECLARE_PACKED_FIELDS_(S|M) macros. This makes it relatively easy for drivers to enable. We can also easily implement checkpatch or cocinelle scripts to detect raw usage of the packed fields structures if this becomes a problem. The modpost logic is implemented by scanning the symbols to find entries in the relevant ".rodata.packed_field_s" or ".rodata.packed_field_m" sections. This must be extended if we ever add support for a "packed_field_l" or similar, though this seems unlikely. In order to enforce buffer size, modpost must have access to the size of the target buffer. This is achieved by having DECLARE_PACKED_FIELDS* also generate a const variable with the buffer size. I tried a couple of other methods of obtaining the size, but this was the simplest one. I think it may be possible to have the unused symbols discarded, but I did not attempt this. I'm in favor of this modpost implementation vs the macros attempted in previous iterations. Pros: * Significantly less code to implement the checks, even if we ignore the generated portions in <generated/packing-checks.h> * The modpost checks are able to work in arbitrarily sized arrays, without needing to add any modification in the future. The macro implementation could require adding additional macros if a driver ever needed >50 fields. * Checks are relatively self contained to modpost. We don't need to edit the top level build Kbuild, we don't need to add 50 extra config options, and we don't need to leave behind a 20K+ LOC generated header file. * The fields are required to be fully ordered, by detecting whether the first two fields indicate ascending or descending order. This enables driver authors to order the fields in which ever ordering is most natural for their data. The macro checks did a full overlap check but did not enforce order. * Drivers simply DECLARE_PACKED_FIELDS_(S|M) and get modpost warnings, without requiring a manual call to CHECK_PACKED_FIELDS*. This should make it easier to review, and less likely that checks are skipped. Cons: * modpost doesn't seem to have a way to track the symbols back to line numbers, so we can't report that info to the user. * modpost errors are reported slightly later in the compile process. * To maintain the full set of checks, we need to export the size of the target buffer to modpost, which I implemented with an additional variable that is otherwise entirely unused. * We have to place the packed field arrays in special sections to enable modpost. This is handled by DECLARE macros, so we have to ensure all users use these macros. I did not attempt to add a check for that in this series, but it could be done in principle with cocinelle or checkpatch. * A fair amount of boilerplate code is required to extract the relevant data from the symbol definitions, including special casing to handle the potential endian issues when using packed_field_m. * These type of checks do feel somewhat ancillary to the original purpose of modpost. The fact that the mess of checking the fields is fairly self contained and avoids the mess of the CHECK_* and config options is a big win, and I think the modpost option is significantly better for this. Signed-off-by: Jacob Keller <jacob.e.keller@xxxxxxxxx> --- Changes in v3: - Replace macro-based C pre-processor checks with checks implemented in modpost. - Move structure definitions into <linux/packing_types.h> to enable reuse within modpost. - Add DECLARE_PACKED_FIELDS_S and DECLARE_PACKED_FIELDS_M to enable automatically generating the buffer size constants and the section attributes. - Add additional unit tests for the pack_fields and unpack_fields APIs. - Update documentation with an explanation of the new API as well as some example code. - Link to v2: https://lore.kernel.org/r/20241025-packing-pack-fields-and-ice-implementation-v2-0-734776c88e40@xxxxxxxxx Changes in v2: - Add my missing sign-off to the first patch - Update the descriptions for a few patches - Only generate CHECK_PACKED_FIELDS_N when another module selects it - Add a new patch introducing wrapper structures for the packed Tx and Rx queue context, suggested by Vladimir. - Drop the now unnecessary macros in ice, thanks to the new types - Link to v1: https://lore.kernel.org/r/20241011-packing-pack-fields-and-ice-implementation-v1-0-d9b1f7500740@xxxxxxxxx --- Jacob Keller (6): ice: remove int_q_state from ice_tlan_ctx ice: use structures to keep track of queue context size ice: use <linux/packing.h> for Tx and Rx queue context data ice: reduce size of queue context fields ice: move prefetch enable to ice_setup_rx_ctx ice: cleanup Rx queue context programming functions Vladimir Oltean (3): lib: packing: create __pack() and __unpack() variants without error checking lib: packing: demote truncation error in pack() to a warning in __pack() lib: packing: add pack_fields() and unpack_fields() drivers/net/ethernet/intel/ice/ice_adminq_cmd.h | 11 +- drivers/net/ethernet/intel/ice/ice_common.h | 5 +- drivers/net/ethernet/intel/ice/ice_lan_tx_rx.h | 49 +--- include/linux/packing.h | 29 +++ include/linux/packing_types.h | 54 +++++ scripts/mod/modpost.h | 19 ++ drivers/net/dsa/sja1105/sja1105_static_config.c | 8 +- drivers/net/ethernet/intel/ice/ice_base.c | 6 +- drivers/net/ethernet/intel/ice/ice_common.c | 297 +++++------------------- lib/packing.c | 285 +++++++++++++++++------ lib/packing_test.c | 61 +++++ scripts/mod/modpost.c | 18 +- scripts/mod/packed_fields.c | 294 +++++++++++++++++++++++ Documentation/core-api/packing.rst | 55 +++++ MAINTAINERS | 2 + drivers/net/ethernet/intel/Kconfig | 1 + scripts/mod/Makefile | 4 +- 17 files changed, 828 insertions(+), 370 deletions(-) --- base-commit: a84e8c05f58305dfa808bc5465c5175c29d7c9b6 change-id: 20241004-packing-pack-fields-and-ice-implementation-b17c7ce8e373 Best regards, -- Jacob Keller <jacob.e.keller@xxxxxxxxx>