Re: NFS/d_splice_alias breakage

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 6/2/16, 18:46, "linux-nfs-owner@xxxxxxxxxxxxxxx on behalf of Oleg Drokin" <linux-nfs-owner@xxxxxxxxxxxxxxx on behalf of green@xxxxxxxxxxxxxx> wrote:

>Hello!
>
>   I just came across a bug (trying to run some Lustre test scripts against NFS, while hunting for another nfsd bug)
>   that seems to be present since at least 2014 that lets users crash nfs client locally.
>
>   Here's some interesting comment quote first from d_obtain_alias:
>
>>  * Cluster filesystems may call this function with a negative, hashed dentry.
>>  * In that case, we know that the inode will be a regular file, and also this
>>  * will only occur during atomic_open. So we need to check for the dentry
>>  * being already hashed only in the final case.
>>  */
>> struct dentry *d_splice_alias(struct inode *inode, struct dentry *dentry)
>> {
>>         if (IS_ERR(inode))
>>                 return ERR_CAST(inode);
>> 
>>         BUG_ON(!d_unhashed(dentry));
>        ^^^^^^^^^^^^^^ - This does not align well with the quote above.
>
>It got imported here by commit b5ae6b15bd73e35b129408755a0804287a87e041
>
>Removing the BUG_ON headon is not going to work since the d_rehash of old
>is now folded into __d_add and we might not want to move that condition there.
>
>doing an
>if (d_unhashed(dentry))
>   __d_add(dentry, inode);
>else
>   d_instantiate(dentry, inode);
>
>also does not look super pretty and who knows how all of the previous code
>like _d_find_any_alias would react.
>
>Al, I think you might want to chime in here on how to better handle this?
>
>The problem was there at least since 3.10 it appears where the fs/nfs/dir.c code
>was calling d_materialise_unique() that did require the dentry to be unhashed.
>
>Not sure how this was not hit earlier. The crash looks like this (I added
>a printk to ensure this is what is going on indeed and not some other weird race):
>
>[   64.489326] Calling into d_splice_alias with hashed dentry, dentry->d_inode (null) inode ffff88010f500c70
>[   64.489549] ------------[ cut here ]------------
>[   64.489642] kernel BUG at /home/green/bk/linux/fs/dcache.c:2989!
>[   64.489750] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
>[   64.489853] Modules linked in: loop rpcsec_gss_krb5 joydev pcspkr acpi_cpufreq i2c_piix4 tpm_tis tpm nfsd drm_kms_helper ttm drm serio_raw virtio_blk
>[   64.491111] CPU: 6 PID: 7125 Comm: file_concat.sh Not tainted 4.7.0-rc1-vm-nfs+ #99
>[   64.492069] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
>[   64.492489] task: ffff880110a801c0 ti: ffff8800c283c000 task.ti: ffff8800c283c000
>[   64.493159] RIP: 0010:[<ffffffff8124292f>]  [<ffffffff8124292f>] d_splice_alias+0x31f/0x480
>[   64.493866] RSP: 0018:ffff8800c283fb20  EFLAGS: 00010282
>[   64.494238] RAX: 0000000000000067 RBX: ffff88010f500c70 RCX: 0000000000000000
>[   64.494625] RDX: 0000000000000067 RSI: ffff8800d08d2ed0 RDI: ffff88010f500c70
>[   64.495021] RBP: ffff8800c283fb78 R08: 0000000000000001 R09: 0000000000000000
>[   64.495407] R10: 0000000000000001 R11: 0000000000000001 R12: ffff8800d08d2ed0
>[   64.495804] R13: 0000000000000000 R14: ffff8800d4a56f00 R15: ffff8800d0bb8c70
>[   64.496199] FS:  00007f94ae25c700(0000) GS:ffff88011f580000(0000) knlGS:0000000000000000
>[   64.496859] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>[   64.497235] CR2: 000055e5d3fb46a4 CR3: 00000000ce364000 CR4: 00000000000006e0
>[   64.497626] Stack:
>[   64.497961]  ffffffff8132154e ffff8800c283fb68 ffffffff81325916 0000000000000000
>[   64.498765]  0000000000000000 ffff8801109459c0 0000000000000000 ffff8800d0bb8c70
>[   64.499578]  ffff8800c283fba0 ffff8800d08d2ed0 ffff8800cb927080 ffff8800c283fc18
>[   64.500385] Call Trace:
>[   64.500727]  [<ffffffff8132154e>] ? nfs_lookup+0x17e/0x320
>[   64.501103]  [<ffffffff81325916>] ? __put_nfs_open_context+0xc6/0xf0
>[   64.501477]  [<ffffffff8132252b>] nfs_atomic_open+0x8b/0x430
>[   64.501850]  [<ffffffff81236def>] lookup_open+0x29f/0x7a0
>[   64.502222]  [<ffffffff812377ce>] path_openat+0x4de/0xfc0
>[   64.502591]  [<ffffffff8123935e>] do_filp_open+0x7e/0xe0
>[   64.502964]  [<ffffffff8124877c>] ? __alloc_fd+0xbc/0x170
>[   64.503347]  [<ffffffff817f4977>] ? _raw_spin_unlock+0x27/0x40
>[   64.503719]  [<ffffffff8124877c>] ? __alloc_fd+0xbc/0x170
>[   64.504097]  [<ffffffff81226896>] do_sys_open+0x116/0x1f0
>[   64.504465]  [<ffffffff8122698e>] SyS_open+0x1e/0x20
>[   64.504831]  [<ffffffff817f5176>] entry_SYSCALL_64_fastpath+0x1e/0xad
>[   64.505218] Code: 01 e8 46 20 5b 00 85 db 74 0b 4c 89 ff 4c 63 fb e8 87 d8 ff ff 4c 89 e7 e8 2f 3c 00 00 4c 89 f8 e9 5e fe ff ff 0f 0b 48 89 f8 c3 <0f> 0b 48 8b 43 40 4c 8b 78 58 49 8d 8f 58 03 00 00 eb 02 f3 90 
>[   64.508754] RIP  [<ffffffff8124292f>] d_splice_alias+0x31f/0x480
>[   64.509176]  RSP <ffff8800c283fb20>

That would have to be a really tight race, since the code in _nfs4_open_and_get_state() currently reads:

                d_drop(dentry);
                alias = d_exact_alias(dentry, state->inode);
                if (!alias)
                        alias = d_splice_alias(igrab(state->inode), dentry);

IOW: something would have to be acting between the d_drop() and d_splice_alias() above...

Al, I’ve been distracted by personal matters in the past 6 months. What is there to guarantee exclusion of the readdirplus dentry instantiation and the NFSv4 atomic open in the brave new world of VFS, June 2016 vintage?

Cheers
  Trond

��.n��������+%������w��{.n�����{��w���jg��������ݢj����G�������j:+v���w�m������w�������h�����٥




[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux