On 6/17/20 5:23 AM, Petr Mladek wrote:
On Mon 2020-06-15 13:27:54, Joe Lawrence wrote:
The dmesg utility already comes with a command line switch to omit
kernel timestamps, let's use it instead of applying an extra regex to
filter them out.
Now without the '[timestamp]: ' prefix at the beginning of the log
entry, revise the filtering regex to search for the 'livepatch:'
subsystem prefix at the beginning of the line.
I wanted to push this patchset and run full test after each patch.
Suddenly the tests started to fail, for example:
Hi Petr,
Thank you for running additional tests on your end. I ran this on
x86_64, ppc64le and s390 across a bunch of hosts and VMs, but never
repeatedly so I never saw this interesting combination of issues.
$/tools/testing/selftests/livepatch> ./test-livepatch.sh
TEST: basic function patching ... ok
TEST: multiple livepatches ... not ok
--- expected
+++ result
@@ -1,3 +1,9 @@
+% echo 0 > /sys/kernel/livepatch/test_klp_livepatch/enabled
+livepatch: 'test_klp_livepatch': initializing unpatching transition
+livepatch: 'test_klp_livepatch': starting unpatching transition
+livepatch: 'test_klp_livepatch': completing unpatching transition
+livepatch: 'test_klp_livepatch': unpatching complete
+% rmmod test_klp_livepatch
% modprobe test_klp_livepatch
livepatch: enabling patch 'test_klp_livepatch'
livepatch: 'test_klp_livepatch': initializing patching transition
@@ -20,9 +26,3 @@ livepatch: 'test_klp_atomic_replace': co
livepatch: 'test_klp_atomic_replace': unpatching complete
% rmmod test_klp_atomic_replace
test_klp_livepatch: this has been live patched
-% echo 0 > /sys/kernel/livepatch/test_klp_livepatch/enabled
-livepatch: 'test_klp_livepatch': initializing unpatching transition
-livepatch: 'test_klp_livepatch': starting unpatching transition
-livepatch: 'test_klp_livepatch': completing unpatching transition
-livepatch: 'test_klp_livepatch': unpatching complete
-% rmmod test_klp_livepatch
ERROR: livepatch kselftest(s) failed
The problem is a combination of:
+ 1st patch that causes that old kernel messages are not cleared
+ 2nd patch that removes time stamps from the diff
+ lost the oldest messages because internal kernel log buffer overflow
+ run the same tests more times
As a result, the diff might match with an incomplete log from
the previous run.
D'oh. The referenced commit f131d9edc29d uses dmesg without any
options, so it didn't suffer this gotcha.
Everything works when this 2nd patch is not commited. The timestamp
helps to distinguish old and new messages. The lost messages
are ignored thanks to the diff parameters:
--changed-group-format='%>' --unchanged-group-format=''
If you agree, I'll solve this problem by not committing this patch
into livepatch.git repo.
Very good catch and explanation. I'd be okay w/skipping the 2nd patch,
hopefully the others don't conflict too much by removing it.
It would be great to add a comment that the timestamp is actually
important. But it might be done in a followup patch.
Yeah, something that subtle should probably have a comment to that
effect. We all thought this was the "easy" change in the set, but never
thought it through :)
-- Joe