It turns out that I did this to myself with the launder_page callback that implemented. Originally I would be able to solve the occasional page left over with PG_fscache mark. - Milosz On Tue, Jul 9, 2013 at 10:07 AM, Milosz Tanski <milosz@xxxxxxxxx> wrote: > David, > > I have another one for you... I though maybe I caused this somehow so > I spent like 8 hours tracking. But now I'm not so sure. I got this > error when running this "stress test" > > root@pbnode-58019a39:/mnt/cluster/petabucket/prod# find -type f | > xargs -P 16 cat | pv > /dev/null > ^C83GB 0:00:52 [ 0B/s] [ > <=> > > Any ideas where to look for the issue? > > - Milosz > > [4123376.919781] BUG: unable to handle kernel NULL pointer dereference > at 0000000000000040 > [4123376.919799] > [4123376.919800] CacheFiles: Error: Unexpected object collision > [4123376.919803] object: OBJ35bf > [4123376.919806] objstate=LOOK_UP_OBJECT fl=8 wbusy=2 ev=0[0] > [4123376.919807] ops=0 inp=0 exc=0 > [4123376.919808] parent=ffff880ecafec180 > [4123376.919809] cookie=ffff880bc9cb5f00 [pr=ffff880ece306000 > nd=ffff880eaadeb648 fl=3] > [4123376.919820] key=[16] '1d86000000010000feffffffffffffff' > [4123376.919821] xobject: OBJ35bc > [4123376.919824] xobjstate=WAIT_FOR_CLEARANCE fl=30 wbusy=0 ev=0[10] > [4123376.919826] xops=1 inp=1 exc=0 > [4123376.919826] xparent=ffff880ecafec180 > [4123376.919828] xcookie=ffff880bc9cb5d80 [pr=ffff880ece306000 nd= > (null) fl=10] > [4123376.919861] IP: [<ffffffffa01aed62>] > __fscache_read_or_alloc_pages+0x122/0x400 [fscache] > [4123376.919882] PGD bc9eaa067 PUD d7a501067 PMD 0 > [4123376.919892] Oops: 0000 [#1] SMP > [4123376.919901] Modules linked in: ceph libceph cachefiles > ghash_clmulni_intel aesni_intel ablk_helper cryptd lrw gf128mul > glue_helper aes_x86_64 microcode auth_rpcgss oid_registry nfsv4 nfs > fscache lockd sunrpc raid10 raid456 async_pq async_xor async_memcpy > async_raid6_recov async_tx raid1 multipath linear btrfs raid6_pq > lzo_compress raid0 xor zlib_deflate libcrc32c > [4123376.919966] CPU: 12 PID: 2523 Comm: cat Not tainted 3.10.0-virtual #22 > [4123376.919973] task: ffff880ecbfdae40 ti: ffff880bc2cda000 task.ti: > ffff880bc2cda000 > [4123376.919982] RIP: e030:[<ffffffffa01aed62>] [<ffffffffa01aed62>] > __fscache_read_or_alloc_pages+0x122/0x400 [fscache] > [4123376.919998] RSP: e02b:ffff880bc2cdb928 EFLAGS: 00010206 > [4123376.920004] RAX: 0000000000000000 RBX: ffff880bc9cb5d80 RCX: > ffff880f1b68c000 > [4123376.920012] RDX: ffff880bc9cb5d80 RSI: 0000000000000000 RDI: > ffff880bc9cb5d90 > [4123376.920019] RBP: ffff880bc2cdb978 R08: f020000000000000 R09: > 0ecac6e278100000 > [4123376.920027] R10: f1173cd039a89e04 R11: 0000000000302135 R12: > ffff880bc9c79200 > [4123376.920036] R13: ffff880ecac6e240 R14: ffff880bc9c79280 R15: > ffff880bc2cdba64 > [4123376.920050] FS: 00007fc8b53bb700(0000) GS:ffff880f1b580000(0000) > knlGS:0000000000000000 > [4123376.920059] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b > [4123376.920065] CR2: 0000000000000040 CR3: 0000000bc9c61000 CR4: > 0000000000002660 > [4123376.920073] DR0: 0000000000000000 DR1: 0000000000000000 DR2: > 0000000000000000 > [4123376.920081] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: > 0000000000000400 > [4123376.920090] Stack: > [4123376.920095] ffff880eca660a00 0000000000000000 ffff880bc9cb5d90 > ffff880bc2cdbb38 > [4123376.920108] ffff880bc2cdb978 ffff880eaadebb20 ffff880bc2cdbb38 > ffff880eaadeb9a8 > [4123376.920121] ffff880bc2cdba64 0000000000000016 ffff880bc2cdb9c8 > ffffffffa02f3a93 > [4123376.920133] Call Trace: > [4123376.920147] [<ffffffffa02f3a93>] > __ceph_readpages_from_fscache+0x93/0xc0 [ceph] > [4123376.920160] [<ffffffffa02e1f35>] ceph_readpages+0x45/0x400 [ceph] > [4123376.920173] [<ffffffff8115868a>] ? alloc_pages_current+0xba/0x170 > [4123376.920184] [<ffffffff81116547>] ? __page_cache_alloc+0xb7/0xd0 > [4123376.920193] [<ffffffff81132939>] ? zone_statistics+0x99/0xc0 > [4123376.920204] [<ffffffff811224f2>] __do_page_cache_readahead+0x1b2/0x260 > [4123376.920216] [<ffffffff812a9ecb>] ? radix_tree_lookup+0xb/0x10 > [4123376.920225] [<ffffffff81122aa9>] ondemand_readahead+0x189/0x230 > [4123376.920233] [<ffffffff81122c31>] page_cache_sync_readahead+0x31/0x50 > [4123376.920242] [<ffffffff81118188>] generic_file_aio_read+0x4a8/0x720 > [4123376.920253] [<ffffffffa02e7d2b>] ? ceph_get_caps+0xcb/0x1e0 [ceph] > [4123376.920265] [<ffffffffa02dfca1>] ceph_aio_read+0x531/0x5c0 [ceph] > [4123376.920275] [<ffffffff81190f28>] ? __mnt_want_write+0x58/0x70 > [4123376.920285] [<ffffffff8118bcf7>] ? file_update_time+0xa7/0x100 > [4123376.920296] [<ffffffff81171a4a>] do_sync_read+0x7a/0xb0 > [4123376.920304] [<ffffffff81172920>] vfs_read+0xb0/0x180 > [4123376.920313] [<ffffffff81172ba2>] SyS_read+0x52/0xa0 > [4123376.920323] [<ffffffff8155de99>] system_call_fastpath+0x16/0x1b > [4123376.920329] Code: 89 e7 e8 e2 d4 ff ff 85 c0 0f 88 f2 00 00 00 48 > 8b 7d c0 e8 41 66 3a e1 f0 ff 05 96 59 00 00 49 8b 56 18 49 8b 75 70 > 48 8b 42 48 <48> 8b 40 40 48 85 c0 74 06 48 8b 7a 58 ff d0 4c 89 e7 48 > c7 c1 > [4123376.920399] RIP [<ffffffffa01aed62>] > __fscache_read_or_alloc_pages+0x122/0x400 [fscache] > [4123376.920412] RSP <ffff880bc2cdb928> > [4123376.920416] CR2: 0000000000000040 > [4123376.920422] ---[ end trace c2ccf1289b7570af ]--- > [4123376.920425] BUG: unable to handle kernel NULL pointer dereference > at 0000000000000020 > [4123376.920431] IP: [<ffffffffa02d09b5>] > __cachefiles_printk_object+0xe9/0x161 [cachefiles] > [4123376.920436] PGD bca510067 PUD bdea09067 PMD 0 > [4123376.920438] Oops: 0000 [#2] SMP > [4123376.920439] Modules linked in: ceph libceph cachefiles > ghash_clmulni_intel aesni_intel ablk_helper cryptd lrw gf128mul > glue_helper aes_x86_64 microcode auth_rpcgss oid_registry nfsv4 nfs > fscache lockd sunrpc raid10 raid456 async_pq async_xor async_memcpy > async_raid6_recov async_tx raid1 multipath linear btrfs raid6_pq > lzo_compress raid0 xor zlib_deflate libcrc32c > [4123376.920488] CPU: 4 PID: 1027 Comm: kworker/u32:2 Tainted: G > D 3.10.0-virtual #22 > [4123376.920500] Workqueue: fscache_object fscache_object_work_func [fscache] > [4123376.920509] task: ffff880eca469720 ti: ffff880ecd21e000 task.ti: > ffff880ecd21e000 > [4123376.920517] RIP: e030:[<ffffffffa02d09b5>] [<ffffffffa02d09b5>] > __cachefiles_printk_object+0xe9/0x161 [cachefiles] > [4123376.920531] RSP: e02b:ffff880ecd21fbc8 EFLAGS: 00010286 > [4123376.920537] RAX: 0000000000000000 RBX: ffff880bc9cb5d80 RCX: > 0000000000002322 > [4123376.920546] RDX: 0000000000000200 RSI: ffff880ecaddc000 RDI: > 0000000000000000 > [4123376.920554] RBP: ffff880ecd21fc08 R08: 00000000430e73cc R09: > ffffffff81c228a4 > [4123376.923888] [kworke] unexpected submission OP111e11 [OBJ35c1 > LOOK_UP_OBJECT] > [4123376.923890] [kworke] objstate=LOOK_UP_OBJECT [LOOK_UP_OBJECT] > [4123376.923890] [kworke] objflags=a > [4123376.923891] [kworke] objevent=0 [0] > [4123376.923892] [kworke] ops=0 inp=0 exc=0 > [4123376.923895] CPU: 1 PID: 99 Comm: kworker/1:1 Tainted: G D > 3.10.0-virtual #22 > [4123376.923905] Workqueue: ceph-msgr con_work [libceph] > [4123376.923909] ffff880ecc8c0700 ffff880eccccfb28 ffffffff8154f6cc > ffff880eccccfb78 > [4123376.923911] ffffffffa01ac6b2 ffff880bc9c78268 ffff880ecff4c9b8 > 0000000000000000 > [4123376.923913] ffff880bc9cb5c00 ffffea002eb8cd80 ffff880bc9c78200 > ffff880ecc8c0700 > [4123376.923913] Call Trace: > [4123376.923922] [<ffffffff8154f6cc>] dump_stack+0x19/0x1b > [4123376.923930] [<ffffffffa01ac6b2>] fscache_submit_op+0x492/0x500 [fscache] > [4123376.923934] [<ffffffffa01ad8a2>] > __fscache_write_page+0x302/0x4f0 [fscache] > [4123376.923941] [<ffffffffa02f3aee>] > __ceph_readpage_to_fscache+0x2e/0x60 [ceph] > [4123376.923945] [<ffffffffa02e2e5e>] finish_read+0x9e/0x160 [ceph] > [4123376.923950] [<ffffffffa02645fd>] dispatch+0x3fd/0x730 [libceph] > [4123376.923954] [<ffffffffa025d7e4>] con_work+0x1294/0x2150 [libceph] > [4123376.923960] [<ffffffff81555455>] ? _raw_spin_unlock_irq+0x15/0x20 > [4123376.923964] [<ffffffff810855e3>] ? finish_task_switch+0x63/0xd0 > [4123376.923968] [<ffffffff810736ed>] process_one_work+0x17d/0x490 > [4123376.923970] [<ffffffff810747eb>] worker_thread+0x11b/0x370 > [4123376.923972] [<ffffffff810746d0>] ? manage_workers.isra.21+0x2e0/0x2e0 > [4123376.923975] [<ffffffff8107af88>] kthread+0xd8/0xe0 > [4123376.923977] [<ffffffff8107aeb0>] ? flush_kthread_worker+0xe0/0xe0 > [4123376.923980] [<ffffffff8155ddec>] ret_from_fork+0x7c/0xb0 > [4123376.923982] [<ffffffff8107aeb0>] ? flush_kthread_worker+0xe0/0xe0 > [4123377.122015] R10: 0000000000000068 R11: 000000000003b4d8 R12: > ffffffffa02d1fbb > [4123377.122024] R13: ffff880bc9c79230 R14: ffff880ecaddc000 R15: > 0000000000000000 > [4123377.122035] FS: 00007f14a1362700(0000) GS:ffff880f1b480000(0000) > knlGS:0000000000000000 > [4123377.122044] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b > [4123377.122050] CR2: 0000000000000020 CR3: 0000000bc9e6c000 CR4: > 0000000000002660 > [4123377.122057] DR0: 0000000000000000 DR1: 0000000000000000 DR2: > 0000000000000000 > [4123377.122065] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: > 0000000000000400 > [4123377.122072] Stack: > [4123377.122076] 0000000000000010 00000010c9c79350 ffff880ecd21fc38 > ffff880bc9c79200 > [4123377.122090] ffff880bc9c78900 ffff880ecaddc000 ffff880bc9c79200 > ffff880bc9c79350 > [4123377.122102] ffff880ecd21fc38 ffffffffa02d0a89 0000000000000606 > ffff880bc9c78900 > [4123377.122114] Call Trace: > [4123377.122128] [<ffffffffa02d0a89>] > cachefiles_printk_object+0x5c/0x5d3 [cachefiles] > [4123377.122140] [<ffffffffa02ccaf2>] > cachefiles_walk_to_object+0xbd2/0xda0 [cachefiles] > [4123377.122153] [<ffffffff810d51fe>] ? irq_get_irq_data+0xe/0x10 > [4123377.122163] [<ffffffff81010640>] ? xen_smp_send_reschedule+0x10/0x20 > [4123377.122174] [<ffffffff81087760>] ? resched_task+0x60/0x70 > [4123377.122185] [<ffffffffa02ca40c>] > cachefiles_lookup_object+0x6c/0x180 [cachefiles] > [4123377.122200] [<ffffffffa01aa4a9>] > fscache_look_up_object+0xe9/0x370 [fscache] > [4123377.122212] [<ffffffffa01aac47>] > fscache_object_work_func+0x107/0x4b0 [fscache] > [4123377.122221] [<ffffffff810736ed>] process_one_work+0x17d/0x490 > [4123377.122227] [<ffffffff810747eb>] worker_thread+0x11b/0x370 > [4123377.122232] [<ffffffff810746d0>] ? manage_workers.isra.21+0x2e0/0x2e0 > [4123377.122238] [<ffffffff8107af88>] kthread+0xd8/0xe0 > [4123377.122243] [<ffffffff8107aeb0>] ? flush_kthread_worker+0xe0/0xe0 > [4123377.122251] [<ffffffff8155ddec>] ret_from_fork+0x7c/0xb0 > [4123377.122256] [<ffffffff8107aeb0>] ? flush_kthread_worker+0xe0/0xe0 > [4123377.122260] Code: 4c 8b 4b 70 48 89 da 4c 89 e6 4c 8b 43 58 45 31 > ff e8 15 b3 27 e1 4d 85 f6 74 2d 48 8b 43 48 ba 00 02 00 00 4c 89 f6 > 48 8b 7b 58 <ff> 50 20 44 0f b7 f8 eb 14 4c 89 e6 48 c7 c7 92 1f 2d a0 > 31 c0 > [4123377.122307] RIP [<ffffffffa02d09b5>] > __cachefiles_printk_object+0xe9/0x161 [cachefiles] > [4123377.122315] RSP <ffff880ecd21fbc8> > [4123377.122318] CR2: 0000000000000020 > [4123377.122349] ---[ end trace c2ccf1289b7570b0 ]--- > [4123377.122415] BUG: unable to handle kernel paging request at ffffffffffffffc8 > [4123377.122436] IP: [<ffffffff8107afd1>] kthread_data+0x11/0x20 > [4123377.122442] PGD 1a0f067 PUD 1a11067 PMD 0 > [4123377.122447] Oops: 0000 [#3] SMP > [4123377.122452] Modules linked in: ceph libceph cachefiles > ghash_clmulni_intel aesni_intel ablk_helper cryptd lrw gf128mul > glue_helper aes_x86_64 microcode auth_rpcgss oid_registry nfsv4 nfs > fscache lockd sunrpc raid10 raid456 async_pq async_xor async_memcpy > async_raid6_recov async_tx raid1 multipath linear btrfs raid6_pq > lzo_compress raid0 xor zlib_deflate libcrc32c > [4123377.122495] CPU: 4 PID: 1027 Comm: kworker/u32:2 Tainted: G > D 3.10.0-virtual #22 > [4123377.122511] task: ffff880eca469720 ti: ffff880ecd21e000 task.ti: > ffff880ecd21e000 > [4123377.122516] RIP: e030:[<ffffffff8107afd1>] [<ffffffff8107afd1>] > kthread_data+0x11/0x20 > [4123377.122523] RSP: e02b:ffff880ecd21f7b0 EFLAGS: 00010046 > [4123377.122527] RAX: 0000000000000000 RBX: 0000000000000000 RCX: > ffffffff81c62a80 > [4123377.321869] RDX: 0000000000000000 RSI: 0000000000000004 RDI: > ffff880eca469720 > [4123377.321876] RBP: ffff880ecd21f7c8 R08: 000000000a046368 R09: > 0000000000000000 > [4123377.321881] R10: ffffffff81287372 R11: ffff880ecbe72a08 R12: > 0000000000000004 > [4123377.321888] R13: ffff880eca469ae8 R14: ffff880ed00e8000 R15: > ffff880eca4699f0 > [4123377.321899] FS: 00007f14a1362700(0000) GS:ffff880f1b480000(0000) > knlGS:0000000000000000 > [4123377.321907] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b > [4123377.321913] CR2: 0000000000000028 CR3: 0000000bc9e6c000 CR4: > 0000000000002660 > [4123377.321921] DR0: 0000000000000000 DR1: 0000000000000000 DR2: > 0000000000000000 > [4123377.321929] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: > 0000000000000400 > [4123377.321937] Stack: > [4123377.321941] ffffffff81074c66 ffff880f1b494200 0000000000000004 > ffff880ecd21f848 > [4123377.321955] ffffffff81554048 ffff880ecd21f7f8 ffff880ecbde59d8 > ffff880eca469720 > [4123377.321968] ffff880ecd21ffd8 ffff880ecd21ffd8 ffff880ecd21ffd8 > ffff880ecbde59d8 > [4123377.321981] Call Trace: > [4123377.321990] [<ffffffff81074c66>] ? wq_worker_sleeping+0x16/0x90 > [4123377.322002] [<ffffffff81554048>] __schedule+0x5c8/0x810 > [4123377.322010] [<ffffffff81554349>] schedule+0x29/0x70 > [4123377.322022] [<ffffffff8105ab20>] do_exit+0x6e0/0xa60 > [4123377.322032] [<ffffffff8154bd02>] ? printk+0x4d/0x4f > [4123377.322043] [<ffffffff8100afc2>] ? check_events+0x12/0x20 > [4123377.322054] [<ffffffff815566c0>] oops_end+0xb0/0xf0 > [4123377.322063] [<ffffffff8154b5c0>] no_context+0x276/0x285 > [4123377.322071] [<ffffffff8154b7a2>] __bad_area_nosemaphore+0x1d3/0x1f2 > [4123377.322080] [<ffffffff8154b7d4>] bad_area_nosemaphore+0x13/0x15 > [4123377.322088] [<ffffffff81559502>] __do_page_fault+0x3b2/0x550 > [4123377.322100] [<ffffffff810196b9>] ? sched_clock+0x9/0x10 > [4123377.322111] [<ffffffff8108cd8d>] ? sched_clock_cpu+0xbd/0x110 > [4123377.322121] [<ffffffff8108ce14>] ? local_clock+0x34/0x40 > [4123377.322131] [<ffffffff81555380>] ? _raw_spin_unlock_irqrestore+0x20/0x30 > [4123377.322142] [<ffffffff81080336>] ? down_trylock+0x36/0x50 > [4123377.322152] [<ffffffff815596ae>] do_page_fault+0xe/0x10 > [4123377.322161] [<ffffffff81555b18>] page_fault+0x28/0x30 > [4123377.322171] [<ffffffffa02d09b5>] ? > __cachefiles_printk_object+0xe9/0x161 [cachefiles] > [4123377.322183] [<ffffffffa02d09a0>] ? > __cachefiles_printk_object+0xd4/0x161 [cachefiles] > [4123377.322195] [<ffffffffa02d0a89>] > cachefiles_printk_object+0x5c/0x5d3 [cachefiles] > [4123377.322207] [<ffffffffa02ccaf2>] > cachefiles_walk_to_object+0xbd2/0xda0 [cachefiles] > [4123377.322218] [<ffffffff810d51fe>] ? irq_get_irq_data+0xe/0x10 > [4123377.322228] [<ffffffff81010640>] ? xen_smp_send_reschedule+0x10/0x20 > [4123377.322237] [<ffffffff81087760>] ? resched_task+0x60/0x70 > [4123377.322247] [<ffffffffa02ca40c>] > cachefiles_lookup_object+0x6c/0x180 [cachefiles] > [4123377.322261] [<ffffffffa01aa4a9>] > fscache_look_up_object+0xe9/0x370 [fscache] > [4123377.322274] [<ffffffffa01aac47>] > fscache_object_work_func+0x107/0x4b0 [fscache] > [4123377.322285] [<ffffffff810736ed>] process_one_work+0x17d/0x490 > [4123377.322293] [<ffffffff810747eb>] worker_thread+0x11b/0x370 > [4123377.322303] [<ffffffff810746d0>] ? manage_workers.isra.21+0x2e0/0x2e0 > [4123377.322311] [<ffffffff8107af88>] kthread+0xd8/0xe0 > [4123377.322320] [<ffffffff8107aeb0>] ? flush_kthread_worker+0xe0/0xe0 > [4123377.322329] [<ffffffff8155ddec>] ret_from_fork+0x7c/0xb0 > [4123377.322337] [<ffffffff8107aeb0>] ? flush_kthread_worker+0xe0/0xe0 > [4123377.322344] Code: 48 89 e5 5d 48 8b 40 b8 48 c1 e8 02 83 e0 01 c3 > 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 48 8b 87 70 03 00 00 55 > 48 89 e5 5d <48> 8b 40 c8 c3 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 > 90 55 > [4123377.322414] RIP [<ffffffff8107afd1>] kthread_data+0x11/0x20 > [4123377.322426] RSP <ffff880ecd21f7b0> > [4123377.322431] CR2: ffffffffffffffc8 > [4123377.322437] ---[ end trace c2ccf1289b7570b1 ]--- > [4123377.322443] Fixing recursive fault but reboot is needed! > > On Tue, Jul 9, 2013 at 6:33 AM, David Howells <dhowells@xxxxxxxxxx> wrote: >> Milosz Tanski <milosz@xxxxxxxxx> wrote: >> >>> It looks like both the cifs and NFS code do not bother with any >>> locking around cifs_fscache_set_inode_cookie. Is there no concern over >>> multiple open() calls racing to create the cookie in those >>> filesystems? >> >> Yeah... That's probably wrong. AFS obviates the need for special locking by >> doing it in afs_iget(). >> >> Hmmm... I think I've just spotted what might be the cause of pages getting >> marked PG_fscache whilst belonging to the allocator. >> >> void nfs_fscache_set_inode_cookie(struct inode *inode, struct file *filp) >> { >> if (NFS_FSCACHE(inode)) { >> nfs_fscache_inode_lock(inode); >> if ((filp->f_flags & O_ACCMODE) != O_RDONLY) >> nfs_fscache_disable_inode_cookie(inode); >> else >> nfs_fscache_enable_inode_cookie(inode); >> nfs_fscache_inode_unlock(inode); >> } >> } >> >> can release the cookie whilst reads are in progress on it when an inode being >> read suddenly changes to an inode being written. We need some sort of >> synchronisation on that there. >> >> David -- To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html