"On Tue, Dec 13, 2016 at 9:09 PM, Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote: > On Tue, 13 Dec 2016, Dmitry Vyukov wrote: > >> On Tue, Dec 13, 2016 at 7:38 PM, Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote: >> > On Tue, 13 Dec 2016, Dmitry Vyukov wrote: >> > >> >> On Tue, Dec 13, 2016 at 4:52 PM, Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote: >> >> > On Tue, 13 Dec 2016, Dmitry Vyukov wrote: >> >> > >> >> >> >> > If it is >> >> >> >> > not a bug in kernel source code, then it must not produce a WARNING. >> >> >> > >> >> >> > What about a memory allocation failure? The memory management part of >> >> >> > the kernel produces a WARNING message if an allocation fails and the >> >> >> > caller did not specify __GFP_NOWARN. >> >> >> > >> >> >> > There is no way for a driver to guarantee that a memory allocation >> >> >> > request will succeed -- failure is always an option. But obviously >> >> >> > memory allocation failures are not bugs in the kernel. >> >> >> > >> >> >> > Are you saying that mm/page_alloc.c:warn_alloc() should produce >> >> >> > something other than a WARNING? >> >> >> >> >> >> >> >> >> The main thing I am saying is that we absolutely need a way for a >> >> >> human or a computer program to be able to determine if there is >> >> >> anything wrong with kernel or not. >> >> > Doesn't it also produce a WARNING under other circumstances? >> >> >> >> No. >> >> >> >> OOM is not a WARNING and is easily distinguishable from BUG/WARNING. >> > >> >> Memory allocator does not print WARNINGs on allocation failures. >> > >> > Do you count dev_warn the same as WARN or WARN_ON? What about dev_WARN >> > or pr_warn() or printk(KERN_WARNING...)? Maybe we're not talking about >> > the same messages. >> > >> > The USB subsystem has got tons of dev_warn() and dev_err() calls. >> > Relatively few (if any) of them are for kernel bugs. >> >> >> I grep for "WARNING:". It is not possible to understand what function >> printed messages on console. >> Here are my current regexps: >> https://github.com/google/syzkaller/blob/master/report/report.go#L29 > > Ah, okay. > > So the take-home message is that we should use WARN* or dev_WARN or > related functions only when reporting an actual kernel bug, whereas in > other circumstances we should avoid mentioning "WARNING" or "BUG" in > log output. In addition, memory allocations where the size is given by > the user (and not limited) should always use the __GFP_NOWARN flag. Yes, such simple convention that allows to "grep" for bugs would be awesome! > I can audit the parts of the USB stack that I'm familiar with for these > things. Anything else? What about "ERROR"? Your regexps don't appear > to search for that. Well, I don't know. Are there messages that are prefixed with ERROR and indicate kernel bugs? I can't possibly manually inspect all generated kernel output. I add errors on case-by-case basis. BUG: and WARNING: where obvious starting points. Then I discovered "INFO:", "Unable to handle kernel paging request", "general protection fault:", "Kernel panic", "kernel BUG", "Kernel BUG", "divide error:", "invalid opcode:", "unreferenced object" and "UBSAN:". There are chances that there are some bugs with ERROR: that I am just missing. For example my dmesg is full of: [2536674.524238] vmwrite error: reg 2800 value ffffffffffffffff (err -255) [2536674.524256] Call Trace: [2536674.524260] [<ffffffff8172a4bb>] dump_stack+0x64/0x82 [2536674.524267] [<ffffffffa1988ce2>] vmwrite_error+0x2c/0x2e [kvm_intel] [2536674.524272] [<ffffffffa197d83f>] vmcs_writel+0x1f/0x30 [kvm_intel] [2536674.524278] [<ffffffffa19847e0>] free_nested.part.64+0x70/0x170 [kvm_intel] [2536674.524283] [<ffffffffa1984963>] vmx_free_vcpu+0x33/0x70 [kvm_intel] [2536674.524295] [<ffffffffa0eced24>] kvm_arch_vcpu_free+0x44/0x50 [kvm] [2536674.524308] [<ffffffffa0ecf892>] kvm_arch_destroy_vm+0xf2/0x1f0 [kvm] [2536674.524311] [<ffffffff810c9b3d>] ? synchronize_srcu+0x1d/0x20 [2536674.524323] [<ffffffffa0eb7f16>] kvm_put_kvm+0x106/0x1c0 [kvm] [2536674.524334] [<ffffffffa0eb8031>] kvm_vm_release+0x21/0x30 [kvm] [2536674.524337] [<ffffffff811c2f34>] __fput+0xe4/0x260 [2536674.524340] [<ffffffff811c30fe>] ____fput+0xe/0x10 [2536674.524343] [<ffffffff8108aa2c>] task_work_run+0xac/0xd0 [2536674.524346] [<ffffffff8106be10>] do_exit+0x2b0/0xa60 [2536674.524349] [<ffffffff8106c63f>] do_group_exit+0x3f/0xa0 [2536674.524352] [<ffffffff8107c740>] get_signal_to_deliver+0x1d0/0x6f0 [2536674.524356] [<ffffffff81014418>] do_signal+0x48/0xa10 [2536674.524359] [<ffffffff8161449e>] ? SYSC_sendto+0x17e/0x1c0 [2536674.524362] [<ffffffff811d49b0>] ? do_vfs_ioctl+0x2e0/0x4c0 [2536674.524365] [<ffffffff81014e49>] do_notify_resume+0x69/0xb0 [2536674.524368] [<ffffffff8173b31a>] int_signal+0x12/0x17 But I have no idea if it is a bug or not. It's kind of printed with pr_err. But on the other hand it is not printed WARN or BUG. Who knows?.. -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html