Hello all, I'm an undergrad student working on Google Summer of Code'19 Project[1] to apply clang thread safety analysis feature on linux kernel to find bugs related to concurrency/race condtions with Lukas & clangbuiltlinux community. Since sparse has similar context checking feature, I started investigating by looking the source and some other resources such as LWN[2] about the internals. `-Wcontext` is my prime focus for now and currently we have: himanshu@himanshu-Vostro-3559:~/linux-next$ make C=2 CF="-Wcontext" 2>&1 >/dev/null | grep -w 'context' | wc -l 772 o Why do we have so many open warnings for context imbalance ? Or Why did we stop at some point annotating the codebase ? I tried to play around with a simple C program: ------------------------------------------------- himanshu@himanshu-Vostro-3559:~/sparse$ cat sparse_context_check.c # define __acquires(x) __attribute__((context(x,0,1))) # define __releases(x) __attribute__((context(x,1,0))) # define __acquire(x) __context__(x,1) # define __release(x) __context__(x,-1) static int lock1, lock2; static void lockfn (int i) __acquires(lock1) { __acquire(lock1); i++; } static void unlockfn (int i) __releases(lock1) { __release(lock1); i--; } static void unlockfn2 (int i) __releases(lock2) { __release(lock2); i--; } static void bad_difflocks (int j) { lockfn(j); unlockfn2(j); } static void bad_onlyunlock (void) { int check = 0; unlockfn(check); } static void bad_onlylock (void) { int check = 0; lockfn(check); } ------------------------------------------------- Results: ------------------------------------------------- himanshu@himanshu-Vostro-3559:~/sparse$ sparse sparse_context_check.c sparse_context_check.c:29:13: warning: context imbalance in 'bad_onlyunlock' - unexpected unlock sparse_context_check.c:34:13: warning: context imbalance in 'bad_onlylock' - wrong count at exit o Does sparse stores some sort of context counter since we didn't get any warnings for `bad_difflocks` which locks 'lock1' and unlocks 'lock2' ? +1 for any lock, -1 for any lock => fails to distinguish two different locks. himanshu@himanshu-Vostro-3559:~/sparse$ ./test-linearize sparse_context_check.c <snip> bad_difflocks: .L6: <entry-point> call lockfn, %arg1 context 1 call unlockfn2, %arg1 context -1 ret </snip> In case anyone is interested in clang's ast-dump: https://gist.github.com/himanshujha199640/ed8359b78b9e6e0ca00ef871e0d77a0c o What exactly the usage of `__acquire/__release` ? I have used it to shut up the warning for lockfn & unlockfn above. o How exactly is `context` defined in sparse codebase which is understood by the compiler when using __attribute__((context(lock1,1,0))) on functions ? According to man sparse: -Wcontext Warn about potential errors in synchronization or other delimited contexts. Sparse supports several means of designating functions or statements that delimit contexts, such as synchronization. Functions with the extended attribute __attribute__((context(expression,in_context,out_context)) require the context expression (for instance, a lock) to have the value in_context (a constant nonnegative integer) when called, and return with the value out_context (a constant nonnegative integer). For APIs defined via macros, use the statement form __context__(expression,in_value,out_value) in the body of the macro. With -Wcontext Sparse will warn when it sees a function change the context without indicating this with a context attribute, either by decreasing a context below zero (such as by releasing a lock without acquiring it), or returning with a changed context (such as by acquiring a lock without releasing it). Sparse will also warn about blocks of code which may potentially execute with different contexts. Sparse issues these warnings by default. To turn them off, use -Wno-context. --------- My understanding is that lock1 in `bad_onlyunlock()` should have a total zero lock counter sum at the end of codeblock. So, as when `unlockfn(check);` is called we have: * lock1 with 0 initial value(from previous `bad_difflocks()`) * returns with 1 final value. Proceeding to next function `bad_onlylock()` and reaching `lockfn(check)` we have: * according to semantics of `acquires` we should have a intial 0 value for lock1 but we have 1 value from above `bad_onlyunlock()` leading to context imbalance and thus the warning. Please let me know if I'm wrong somewhere. So, clang thread safety analysis[3] follows a different mechanism to overcome what we have observed above. I did small analysis on a C program[4] and a device driver[5]. Clang analysis has many annotations available to suitable annotate the codebase which can be found in the documentation[3]. Quite surprisingly, Philipp proposed[6] `__protected_by` feature which is very similar to `guarded_by`[7] feature implemented in Clang. Similarly, Johannes proposed[8] the same with a different implementation. Questions from both you: o Why was it not deployed in sparse ? o Does the lock protecting the data should be a global variable ? ie., struct foo { struct mutex lock; int balance __protected_by(lock); } Can this be done ? Or lock should be global ? Because clang analysis wants it to be global! There are other attribute restrictions as well for clang analysis: https://github.com/llvm-mirror/clang/blob/master/test/Sema/attr-capabilities.c *Most Important* Could you please point me some critical data examples that you know in the kernel source which should be protected. This would help us a lot! I'm familiar with IIO subsystem but couldn't find any solid example to try other than the tmp007.c driver I tried[5]. Annotating device driver is not the best example and we require other high priority critical data. We discussed about this before and conclusion is same as "annotating if branches with likely()/unlikely() in a device driver" We are reporting here: https://github.com/ClangBuiltLinux/thread-safety-analysis and for now analysing the tools available for detecting concurrency problems. Currently there are ~300 warnings with clang thread safety analysis in defconfig. There is no visible C codebase that has implemneted this feature where we could refer, therefore it is more of an expermenta. Lastly, this project is under the umbrella of recently announced project by Linux Foundation called "Enabling Linux In Safety Applications(ELISA)". Any feedback is highly appreciated! [1] https://summerofcode.withgoogle.com/projects/#5358212549705728 [2] https://lwn.net/Articles/689907/ [3] https://clang.llvm.org/docs/ThreadSafetyAnalysis.html [4] https://gist.github.com/himanshujha199640/efac981a31b0a2317f8c0c61a9803447 [5] https://gist.github.com/himanshujha199640/1154f65b2762a943fdcd7b375a7d0954 [6] https://marc.info/?l=linux-sparse&m=120696665813430&w=2 [7] https://clang.llvm.org/docs/ThreadSafetyAnalysis.html#guarded-by-c-and-pt-guarded-by-c [8] https://marc.info/?l=linux-sparse&m=120895542318572&w=2 Thanks! -- Himanshu Jha Undergraduate Student Department of Electronics & Communication Guru Tegh Bahadur Institute of Technology