Re: [PATCH v2 1/2] check-uapi: Introduce check-uapi.sh

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 3/4/2023 8:22 PM, Masahiro Yamada wrote:
On Wed, Mar 1, 2023 at 4:54 PM John Moon <quic_johmoo@xxxxxxxxxxx> wrote:

While the kernel community has been good at maintaining backwards
compatibility with kernel UAPIs, it would be helpful to have a tool
to check if a commit introduces changes that break backwards
compatibility.

To that end, introduce check-uapi.sh: a simple shell script that
checks for changes to UAPI headers using libabigail.

libabigail is "a framework which aims at helping developers and
software distributors to spot some ABI-related issues like interface
incompatibility in ELF shared libraries by performing a static
analysis of the ELF binaries at hand."

The script uses one of libabigail's tools, "abidiff", to compile the
changed header before and after the commit to detect any changes.

abidiff "compares the ABI of two shared libraries in ELF format. It
emits a meaningful report describing the differences between the two
ABIs."

The script also includes the ability to check the compatibilty of
all UAPI headers across commits. This allows developers to inspect
the stability of the UAPIs over time.


Let's see more test cases.


[Case 1]

I think d759be8953febb6e5b5376c7d9bbf568864c6e2d
is a trivial/good cleanup.
Apparently, it still exports equivalent headers,
but this tool reports "incorrectly removed".



$ ./scripts/check-uapi.sh -b d759be8953
Saving current tree state... OK
Installing sanitized UAPI headers from d759be8953... OK
Installing sanitized UAPI headers from d759be8953^1... OK
Restoring current tree state... OK
Checking changes to UAPI headers starting from d759be8953
error - UAPI header arch/alpha/include/uapi/asm/poll.h was incorrectly removed
error - UAPI header arch/ia64/include/uapi/asm/poll.h was incorrectly removed
error - UAPI header arch/x86/include/uapi/asm/poll.h was incorrectly removed
/tmp/tmp.ixUIBlntUP/d759be8953/x86/usr/include/asm/Kbuild does not
exist - cannot compare ABI
/tmp/tmp.ixUIBlntUP/d759be8953/alpha/usr/include/asm/Kbuild does not
exist - cannot compare ABI
/tmp/tmp.ixUIBlntUP/d759be8953/ia64/usr/include/asm/Kbuild does not
exist - cannot compare ABI
error - 6/6 UAPI headers modified between d759be8953^1 and d759be8953
are not backwards compatible
error - UAPI header ABI check failed
Failure summary saved to /home/masahiro/ref/linux/abi_error_log.txt



This is an interesting test case. Thanks for bringing it up. I don't know if there's a way for the script to filter out these kinds of changes, so it may just need to be noted under possible false positives in the document.

It also reveals that the script isn't filtering out non-headers from the git diffs... I'll fix that in v3.


[Case 2]

This tool compiles only changed headers.
Does it detect ABI change?

I believe the users of the headers must be compiled.



Think about this case.


$ cat foo-typedef.h
typedef int foo_cap_type;


$ cat foo.h
#include "foo-typedef.h"

struct foo {
        foo_cap_type capability;
};



Then, change the first header to
   typedef long long foo_cap_type;

abidiff will never notice the ABI change
until it compiles "foo.h" instead of "foo-typedef.h"
 >

For testing, I applied the following patch.


  --- a/include/uapi/linux/types.h
  +++ b/include/uapi/linux/types.h
  @@ -52,7 +52,7 @@ typedef __u32 __bitwise __wsum;
   #define __aligned_be64 __be64 __attribute__((aligned(8)))
   #define __aligned_le64 __le64 __attribute__((aligned(8)))

  -typedef unsigned __bitwise __poll_t;
  +typedef unsigned short __bitwise __poll_t;

   #endif /*  __ASSEMBLY__ */
   #endif /* _UAPI_LINUX_TYPES_H */




I believe this is an ABI change because this will change
'struct epoll_event' in the include/uapi/linux/eventpoll.h
but the tool happily reports it is backwards compatible.


$ ./scripts/check-uapi.sh
Saving current tree state... OK
Installing sanitized UAPI headers from HEAD... OK
Installing sanitized UAPI headers from HEAD^1... OK
Restoring current tree state... OK
Checking changes to UAPI headers starting from HEAD
No ABI differences detected in include/uapi/linux/types.h from HEAD^1 -> HEAD
All 1 UAPI headers modified between HEAD^1 and HEAD are backwards compatible!



You're correct, this case is missed when only checking modified headers. With the same patch, if I pass "-a" to the script, it does catch the change:

% ./scripts/check-uapi.sh -a
--- snip ---
error - 1/1328 UAPI headers modified between HEAD and dirty tree are not backwards compatible
error - UAPI header ABI check failed


% cat abi_error_log.txt
Generated by "./scripts/check-uapi.sh -a" from git ref 94b1166f7954f1136f307dafbaad5f9d871b73bf

!!! ABI differences detected in include/uapi/linux/eventpoll.h from HEAD -> dirty tree !!!

    [C] 'struct epoll_event' changed:
      type size changed from 96 to 80 (in bits)
      2 data member changes:
        type of '__poll_t events' changed:
          underlying type 'unsigned int' changed:
            type name changed from 'unsigned int' to 'unsigned short int'
            type size changed from 32 to 16 (in bits)
        '__u64 data' offset changed from 32 to 16 (in bits) (by -16 bits)

Perhaps in the next revision we could add some way to detect these dependencies (e.g. if foo.h includes bar.h, and bar.h was modified, we should check foo.h). However, the time savings may not be worth the complicated and potentially fragile dependency detection.

For now, "-a" should catch this, and it only took about 1 minute to run through all the headers on my 8-core machine, so it should be a resonable test step for a CI system.




I would not use such a tool that contains both false positives
and false negatives, but you may notice this is more difficult
than you had expected.


Right, it certainly has its shortcomings. I appreciate you helping us find and address them! Even in its current state, I believe the script has value for developers and reviewers. :)

I do not know if further review is worthwhile since this does not work
but I added some more in-line comments.






+
+# Some UAPI headers require an architecture-specific compiler to build properly.
+ARCH_SPECIFIC_CC_NEEDED=(
+       "arch/hexagon/include/uapi/asm/sigcontext.h"
+       "arch/ia64/include/uapi/asm/intel_intrin.h"
+       "arch/ia64/include/uapi/asm/setup.h"
+       "arch/ia64/include/uapi/asm/sigcontext.h"
+       "arch/mips/include/uapi/asm/bitfield.h"
+       "arch/mips/include/uapi/asm/byteorder.h"
+       "arch/mips/include/uapi/asm/inst.h"
+       "arch/sparc/include/uapi/asm/fbio.h"
+       "arch/sparc/include/uapi/asm/uctx.h"
+       "arch/xtensa/include/uapi/asm/byteorder.h"
+       "arch/xtensa/include/uapi/asm/msgbuf.h"
+       "arch/xtensa/include/uapi/asm/sembuf.h"
+)


Yes, arch/*/include/ must be compiled by the target compiler.
If you compile them by the host compiler, it is unpredictable (i.e. wrong).

BTW, was this blacklist detected on a x86 host?


Yes.

If you do this on an ARM/ARM64 host, some headers
under arch/x86/include/uapi/ might be blacklisted?


Good point - I missed those!



+# Compile the simple test app
+do_compile() {
+       local -r inc_dir="$1"
+       local -r header="$2"
+       local -r out="$3"
+       printf "int main(void) { return 0; }\n" | \
+               "${CC:-gcc}" -c \
+                 -o "$out" \
+                 -x c \
+                 -O0 \
+                 -std=c90 \
+                 -fno-eliminate-unused-debug-types \
+                 -g \
+                 "-I${inc_dir}" \
+                 -include "$header" \
+                 -
+}
+
+# Print the list of incompatible headers from the usr/include Makefile
+get_no_header_list() {
+       {
+               # shellcheck disable=SC2016
+               printf 'all: ; @echo $(no-header-test)\n'
+               cat "usr/include/Makefile"

You must pass SRCARCH=$arch.

Otherwise,

ifeq ($(SRCARCH),...)
   ...
endif

are all skipped.



Thanks for the tip, that explains it. Should be able to address this in v3.




+       } | make -f - | tr " " "\n" | grep -v "asm-generic"
+
+       # One additional header file is not building correctly
+       # with this method.
+       # TODO: why can't we build this one?
+       printf "asm-generic/ucontext.h\n"


Answer - it is not intended for standalone compiling in the first place.

<asm-generic/*.h> should be included from <asm/*.h>.

Userspace never ever includes <asm-generic/*.h> directly.
(If it does, it is a bug in the userspace program)

I am afraid you read user/include/Makefile wrongly.



Understood. I think I had misinterpreted one of your comments on v1, but now I'm clear. Will address in v3.



+
+# Install headers for every arch and ref we need
+install_headers() {
+       local -r check_all="$1"
+       local -r base_ref="$2"
+       local -r ref="$3"
+
+       local arch_list=()
+       while read -r status file; do
+               if arch="$(printf "%s" "$file" | grep -o 'arch/.*/uapi' | cut -d '/' -f 2)"; then
+                       # shellcheck disable=SC2076
+                       if ! [[ " ${arch_list[*]} " =~ " $arch " ]]; then
+                               arch_list+=("$arch")
+                       fi
+               fi
+       done < <(get_uapi_files "$check_all" "$base_ref" "$ref")
+
+       deviated_from_current_tree="false"
+       for inst_ref in "$base_ref" "$ref"; do
+               if [ -n "$inst_ref" ]; then
+                       if [ "$deviated_from_current_tree" = "false" ]; then
+                               save_tree_state
+                               trap 'rm -rf "$tmp_dir"; restore_tree_state;' EXIT
+                               deviated_from_current_tree="true"
+                       fi
+                       git checkout --quiet "$(git rev-parse "$inst_ref")"


I might be wrong, but I was worried when I looked at this line
because git-checkout may change the running code
if check-uapi.sh is changed between ref and base_ref.

If bash always loads all code into memory before running
it is safe but I do not know how it works.


If this is safe, some comments might be worthwhile:

     # 'git checkout' may update this script itself while running,
     # but it is OK because ...


Yes, my understanding is that since the script is all encapsulated in functions, the shell has loaded all of the functions before execution starts. My testing has shown this to be safe as well. Will add a comment in v3.





+
+# Make sure we have the tools we need
+check_deps() {
+       export ABIDIFF="${ABIDIFF:-abidiff}"
+
+       if ! command -v "$ABIDIFF" > /dev/null 2>&1; then
+               eprintf "error - abidiff not found!\n"
+               eprintf "Please install abigail-tools (version 1.7 or greater)\n"
+               eprintf "See: https://sourceware.org/libabigail/manual/libabigail-overview.html\n";
+               exit 1
+       fi
+
+       read -r abidiff_maj abidiff_min _unused < <("$ABIDIFF" --version | cut -d ' ' -f 2 | tr '.' ' ')
+       if [ "$abidiff_maj" -lt 1 ] || { [ "$abidiff_maj" -eq 1 ] && [ "$abidiff_min" -lt 7 ]; }; then


This is up to you, but I think "sort -V" would be cleaner.
(see Documentation/devicetree/bindings/Makefile for example)



Noted.



+       fi
+
+       if [ ! -x "scripts/unifdef" ]; then
+               if ! make -f /dev/null scripts/unifdef; then

Previously, I wanted to point out that using Make is meaningless,
and using gcc directly is better.


But, is this still necessary?

V2 uses 'make headers_install' to install all headers.
scripts/unifdef is not used anywhere in this script.



Ah, you're right it is not necessary. Previously, we were calling headers_install.sh directly, so make wasn't there to supply the unifdef dependency. Will remove this in v3.





+
+       abi_error_log="${abi_error_log:-${KERNEL_SRC}/abi_error_log.txt}"
+
+       check_deps
+
+       tmp_dir=$(mktemp -d)
+       trap 'rm -rf "$tmp_dir"' EXIT
+
+       # Set of UAPI directories to check by default
+       UAPI_DIRS=(include/uapi arch/*/include/uapi)
+
+       if ! git rev-parse --is-inside-work-tree > /dev/null 2>&1; then
+               eprintf "error - this script requires the kernel tree to be initialized with Git\n"
+               exit 1
+       fi
+
+       # If there are no dirty UAPI files, use HEAD as base_ref
+       if [ -z "$base_ref" ] && [ "$(get_uapi_files "" "" | wc -l)" -eq 0 ]; then
+               base_ref="HEAD"
+       fi
+
+       if [ -z "$ref_to_check" ]; then
+               if [ -n "$base_ref" ]; then
+                       ref_to_check="${base_ref}^1"
+               else
+                       ref_to_check="HEAD"
+               fi
+       fi


I think this is because I am not good at English, but
I was so confused between 'base_ref' vs 'ref_to_check'.
I do not get which one is the ancestor from the names.

I thought 'ref_a' and 'ref_b' would be less confusing,
but I hope somebody will come up with better naming
than that.


Agreed, I think this is a confusing case for native English-speakers too. :)

I want to indicate that one ref has to come after the other in the Git tree. Maybe "base_ref" and "past_ref"? We'll think on it.




--
Best Regards





Masahiro Yamada

Thank you again for the detailed review!



[Index of Archives]     [Linux&nblp;USB Development]     [Linux Media]     [Video for Linux]     [Linux Audio Users]     [Yosemite Secrets]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux