On Wed 08-04-15 08:37:00, Peter Hurley wrote:
> [ + Al Viro, linux-fsdevel ]
>
> On 04/08/2015 07:12 AM, Tobias Hoffmann wrote:
> > Hi,
> >
> > after updating from 3.19.0-rc4 to 4.0.0-rc6 I've experienced the appended two similar oopses.
> > In both cases they occurred without obvious cause after less than 2 days uptime, and caused Xorg to hang - requiring a manual reboot (init 6 via ssh did not run to completion).
> > The only other thing I updated was userspace libdrm + xorg-video-nouveau, but that should not cause oopses, right?
> >
> > With 3.19.0-rc4 I had uptime > 40 days -- and then a general protection fault at __d_lookup (also appended) which seems unrelated to the __destroy_inode oopses.
> > I'm now back at 3.19.
> >
> > Tobias
> >
> > PS: please CC.
> >
> > ---
> > BUG: unable to handle kernel paging request at ffffffffff3cffff
> > IP: [<ffffffff8115a1c7>] __destroy_inode+0x77/0xd0
> > PGD 16b8067 PUD 16ba067 PMD 17f0067 PTE 0
> > Oops: 0002 [#1] PREEMPT SMP
> > Modules linked in: snd_hrtimer snd_usb_audio snd_usbmidi_lib ipt_REJECT nf_reject_ipv4 iptable_filter xt_REDIRECT nf_nat_redirect xt_tcpudp iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack ip_tables x_tables nfsd auth_rpcgss oid_registry exportfs nfs_acl nfs lockd grace sunrpc ppdev lp snd_hda_codec_realtek snd_hda_codec_generic hid_multitouch snd_hda_intel snd_hda_controller snd_hda_codec snd_hwdep snd_pcm_oss snd_mixer_oss snd_pcm snd_mpu401 snd_seq_dummy snd_mpu401_uart snd_seq_oss snd_seq_midi snd_rawmidi nouveau wmi video ttm drm_kms_helper drm snd_seq_midi_event snd_seq cfbfillrect cfbimgblt snd_seq_device snd_timer cfbcopyarea evdev snd psmouse i2c_algo_bit parport_pc soundcore ns558 button parport i2c_nforce2 gameport acpi_cpufreq
> > CPU: 1 PID: 472 Comm: kswapd0 Not tainted 4.0.0-rc6-00188-gf8b3d8a-dirty #32
> > Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./ALiveNF5-eSATA2+., BIOS P2.10 04/09/2008
> > task: ffff8801aa0f3250 ti: ffff8801aaa64000 task.ti: ffff8801aaa64000
> > RIP: 0010:[<ffffffff8115a1c7>] [<ffffffff8115a1c7>] __destroy_inode+0x77/0xd0
> > RSP: 0000:ffff8801aaa67bd8 EFLAGS: 00210286
> > RAX: ffffffffff3cfffe RBX: ffff88010238d978 RCX: 00000000000024c0
> > RDX: 0000000000000001 RSI: ffff88010238da08 RDI: ffffffffff3cffff
> > RBP: ffff8801aaa67be8 R08: ffffffff8115b3d0 R09: ffff8801aaa67d40
> > R10: 0000000000000400 R11: 0000000000000000 R12: ffff88010238d9f8
> > R13: ffffffff815210e0 R14: 0000000000000000 R15: 00000000000000a9
> > FS: 0000000000000000(0000) GS:ffff8801b1c80000(0000) knlGS:00000000f1604b40
> > CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> > CR2: ffffffffff3cffff CR3: 00000000c0e95000 CR4: 00000000000006e0
> > Stack:
> > 0000000000000003 ffff88010238d978 ffff8801aaa67c08 ffffffff8115a7d1
> > ffff88010238d978 ffff88010238d978 ffff8801aaa67c38 ffffffff8115a922
> > ffff8801aaa67c38 ffff8801aaa67c78 ffff8800cda4e800 ffff8800cda4eb40
> > Call Trace:
> > [<ffffffff8115a7d1>] destroy_inode+0x21/0x60
> > [<ffffffff8115a922>] evict+0x112/0x180
> > [<ffffffff8115a9c9>] dispose_list+0x39/0x50
> > [<ffffffff8115b825>] prune_icache_sb+0x45/0x50
> > [<ffffffff811447e3>] super_cache_scan+0x153/0x1a0
> > [<ffffffff811105a3>] shrink_slab.part.55.constprop.60+0x1a3/0x250
> > [<ffffffff811129c1>] shrink_zone+0xa1/0xb0
> > [<ffffffff81112dbf>] kswapd+0x3ef/0x700
> > [<ffffffff811129d0>] ? shrink_zone+0xb0/0xb0
> > [<ffffffff810aaf04>] kthread+0xc4/0xe0
> > [<ffffffff810aae40>] ? kthread_freezable_should_stop+0x60/0x60
> > [<ffffffff814f6588>] ret_from_fork+0x58/0x90
> > [<ffffffff810aae40>] ? kthread_freezable_should_stop+0x60/0x60
> > Code: 48 8b 7b 10 48 8d 47 ff 48 83 f8 fd 77 0a 48 85 ff 74 05 f0 ff 0f 74 3c 48 8b 7b 18 48 8d 47 ff 48 83 f8 fd 77 0a 48 85 ff 74 05 <f0> ff 0f 74 14 65 48 ff 0d c4 3d eb 7e 48 83 c4 08 5b 5d c3 0f
> > RIP [<ffffffff8115a1c7>] __destroy_inode+0x77/0xd0
> > RSP <ffff8801aaa67bd8>
> > CR2: ffffffffff3cffff

So we are very likely oopsing on atomic_dec_and_test() in
posix_acl_release() called on inode->i_default_acl. The value of
i_default_acl is in RDI - ffffffffff3cffff - which looks very much like a
corrupted value of ACL_NOT_CACHED, which is -1. So this is likely a random
memory corruption where someone wrote 0x3c into your inode. Very likely a
kernel bug but impossible to debug without more info...

								Honza
-- 
Jan Kara <jack@xxxxxxx>
SUSE Labs, CR