On Wed, 6 Sept 2023 at 12:36, Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote: > > There are a few other issues that I have with this, and Christoph did > mention a big one: it's not been in linux-next. Oh, and the fact that it hasn't been in linux-next became immediately clear, and it's a real practical problem. I get a compile error when just doing a basic built-test: fs/bcachefs/btree_key_cache.c: In function ‘btree_key_cache_create’: fs/bcachefs/btree_key_cache.c:356:56: error: implicit conversion from ‘enum btree_node_locked_type’ to ‘enum six_lock_type’ [-Werror=enum-conversion] 356 | mark_btree_node_locked(trans, path, 0, BTREE_NODE_UNLOCKED); | ^~~~~~~~~~~~~~~~~~~ and that compile error is actually a sign of a pretty serious bug. BTREE_NODE_UNLOCKED is of the wrong enum type, and has a value (-1) that simply cannot be represented in a 'enum six_lock_type'. I'm pretty sure that the compiler is actually allowed to assume it is in the range of the source enum. Yes, the inline function then does a cast to 'enum btree_node_locked_type', but because the compiler may assume that the incoming enum has values 0..2, it's not clear that it will necessarily cast the value -1 to the expected value. It probably works in practice on x86 (without -fshort-enum, I think gcc will always use an 'int'), but that code really is wrong. On other platforms, gcc defaults -fshort-enums on, and in that case a 'enum six_lock_type' is actually of underlying type 'unsigned char'. And guess what happens when you have (unsigned char)-1? It does *not* cast back to -1. I don't know whether any of the architectures we support has that "default to -fshort-enums" behaviour, and at least hexagon uses "-fno-short-enums" to avoid this issue, but it really is a serious bug. A Clang build also results in objtool complaints about bcachefs, with fs/bcachefs/buckets.o: warning: objtool: bch2_trans_mark_metadata_bucket() falls through to next function bch2_trans_mark_dev_sb() fs/bcachefs/fsck.o: warning: objtool: write_inode() falls through to next function hash_check_key() and sure enough, the code generation ends up being movq %rbx, %rdi movl %r12d, %esi callq bch2_trans_restart_error .Lfunc_end25: .size write_inode, .Lfunc_end25-write_inode and the issue is that bch2_trans_restart_error() is marked __noreturn, but objtool hasn't been told about it, so objtool is very unhappy about that "code seems to fall off the side of the earth" kind of issue. Both of these are very much signs of "this has never been tested in other environments", and would presumably have been found immediately in linux-next. As it is, I'm not convinced I want to even continue to go through this all, when it fails at the very first hurdle. A clean build is not some "would be nice" feature, and it is big part of why we have automation for new code. Linus