Dear developers,

First of all, thanks for all your work!

kernel version: 3.9.9-t23
kernel config: https://paste.debian.net/27351/
lspci -nvv output is attached.

I merged two kernel issues into one mail to make it easier to spot relations between them. As both appeared only once, I did not invest more time in trying newer kernels, but I will do so for testing patches. Please give me pointers on where to dig in to reproduce them.

1) jbd2_journal_dirty_metadata

I reported this in #linuxfs and was told to forward it here.

12:55 < kardan:#linuxfs> it seems like my hdd is hanging (hdd led turned. jbd is buzy for over an hour now:
1326 be/3 root 0.00 B 16.00 K 0.00 % 98.47 % [jbd2/sda1-8]

The load was caused by iceape (or something stacked below)
#1 0xb764680e in wait4 () at ../sysdeps/unix/syscall-template.S:81
#2 0xb76467e7 in __wait3 (stat_loc=..., options=0, usage=0x0) at ../sysdeps/unix/bsd/bsd4.4/wait3.c:3312:55

This led to several jbd-related kernel bugs and, in the end, a kernel panic. I attached the jbd-schedulings-bugs to avoid wrapping issues.
jbd2_journal_dirty_metadata+0x162/0x188
kmem_cache_alloc+0x26/0x9f
spin_unlock.isra.6+0x1e/0x1e
ext4_file_open+0x13e/0x1b2
spin_lock.isra.7+0xa/0xb
__d_instantiate+0x59/0x63
fsnotify_perm+0x4d/0x58
__schedule_bug+0x39/0x49
__schedule+0x54/0x4e4
ttwu_do_wakeup.constprop.111+0x39/0x56
try_to_wake_up+0xe7/0xef
autoremove_wake_function+0xd/0x29
activate_page+0xae/0xfc
__cond_resched+0xf/0x19
_cond_resched+0x10/0x18
Aug 15 18:06:10 delight unmap_single_vma+0x3fc/0x49c
unmap_vmas+0x30/0x4d
exit_mmap+0x68/0xcb
get_signal_to_deliver+0x202/0x4d1
kmem_cache_alloc+0x26/0x9f
spin_unlock.isra.6+0x1e/0x1e
ext4_file_open+0x13e/0x1b2
fsnotify+0x1fa/0x22c
__d_instantiate+0x59/0x63
__schedule_bug+0x39/0x49
_schedule+0x54/0x4e4
blk_peek_request+0x16f/0x1a4
scsi_request_fn+0x35d/0x3fe
activate_page+0xae/0xfc
__cond_resched+0xf/0x19
_cond_resched+0x10/0x18
unmap_single_vma+0x3fc/0x49c
unmap_vmas+0x30/0x4d
exit_mmap+0x68/0xcb
get_signal_to_deliver+0x202/0x4d1
__schedule_bug+0x39/0x49
__schedule+0x54/0x4e4
__free_one_page+0xeb/0x1c4
free_pcppages_bulk+0xbb/0x103
__cond_resched+0xf/0x19
_cond_resched+0x10/0x18
unmap_single_vma+0x3fc/0x49c
unmap_vmas+0x30/0x4d
exit_mmap+0x68/0xcb
get_signal_to_deliver+0x202/0x4d1
__schedule_bug+0x39/0x49
__schedule+0x54/0x4e4
smp_apic_timer_interrupt+0x58/0x60
apic_timer_interrupt+0x34/0x3c
activate_page+0xae/0xfc
__cond_resched+0xf/0x19
_cond_resched+0x10/0x18
unmap_single_vma+0x3fc/0x49c
unmap_vmas+0x30/0x4d
exit_mmap+0x68/0xcb
__schedule_bug+0x39/0x49
__schedule+0x54/0x4e4
vm_acct_memory+0x26/0x3c
__cache_free.isra.57+0xf/0x8f
percpu_counter_add.constprop.21+0x26/0x3e
spin_lock.isra.7+0xa/0xb
dput+0x11/0x96
spin_unlock.isra.11+0xa/0x1e
__fput+0x15f/0x17e
mnt_add_count.isra.16+0x1c/0x34
__cond_resched+0xf/0x19
_cond_resched+0x10/0x18
task_work_run+0x4f/0x5a
do_exit+0x2c6/0x796
kmsg_dump+0x1d/0xcc
oops_end+0x86/0x8a
do_bounds+0x4c/0x4c

Full log: https://paste.debian.net/27347/

2) unable to handle kernel paging request

INFO: task kswapd0:21 blocked for more than 120 seconds.
[289200.502665] [<c10b4258>] ? kmem_cache_alloc+0x2f/0x9f
[289200.502677] [<c108b07f>] ? mempool_alloc+0x3b/0xee
[289200.502690] [<c104c01f>] ? timekeeping_get_ns.constprop.
[289200.502703] [<c13310e9>] ? io_schedule+0x34/0x47
[289200.502715] [<c117e062>] ? get_request+0x416/0x4ae
[289200.502728] [<c1005a8f>] ? native_sched_clock+0x48/0x94
[289200.502741] [<c11811f1>] ? ioc_lookup_icq+0x41/0x49
[289800.503037] INFO: task kswapd0:21 blocked for more than 120 seconds.
[289800.503126] [<c104300b>] ? sched_slice.isra.36+0x67/0x85
[289800.503139] [<c104c01f>] ? timekeeping_get_ns.constprop.
[289800.503153] [<c13310e9>] ? io_schedule+0x34/0x47
[289800.503165] [<c117e062>] ? get_request+0x416/0x4ae
[289800.503178] [<c11811f1>] ? ioc_lookup_icq+0x41/0x49
[289800.503189] [<c1038faa>] ? abort_exclusive_wait+0x64/0x64
[289800.503199] [<c117f938>] ? blk_queue_bio+0x185/0x26d

This issue dates back some weeks, sorry for not reporting it earlier. I had two occurrences of this with several days in between. One week before the first occurrence, a new RAM bank and a PCMCIA card USB hub were added to the laptop. Some days ago I saw a lot of I/O errors once; they did not reappear.

On #linux-fs it was said that the first one looks like a use-after-free or some other type of software-induced memory corruption.

"Those tend to be nasty problems that can take months to track down some of the crazy-looking problems end up as bad hardware. have you also experienced crashes of userspace programs?"

kswapd/kworker were followed by Xorg, iceweasel, claws and Xorg. Awesome was unresponsive afterwards and I needed to restart lightdm. In a new X session, parts of old windows reappeared; this was reproducible. Log is attached.

-- 
Kardan <kardan@xxxxxxxxxx>
Encrypt your email: http://gnupg.org/documentation
Public GPG key 9D6108AE58C06558 at hkp://pool.sks-keyservers.net
fpr: F72F C4D9 6A52 16A1 E7C9 AE94 9D61 08AE 58C0 6558

Attachment: kernel-paging-bug
Description: Binary data

Attachment: lspci
Description: Binary data

Attachment: signature.asc
Description: PGP signature