Hello Al and Eric, Thanks your comments, please see the below comments. > -----Original Message----- > From: Al Viro [mailto:viro@xxxxxxxxxxxxxxxx] On Behalf Of Al Viro > Sent: Monday, August 26, 2013 7:30 PM > To: Liu, Chuansheng > Cc: linux-fsdevel@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx > Subject: Re: [PATCH] Fix the race between the fget() and close() > > On Tue, Aug 27, 2013 at 12:12:49AM +0800, Chuansheng Liu wrote: > > > > When one thread is calling sys_ioctl(), and another thread is calling > > sys_close(), current code has protected most cases. > > > > But for the below case, it will cause issue: > > T1 T2 > T3 > > sys_close(oldfile) sys_open(newfile) > sys_ioctl(oldfile) > > -> __close_fd() > > lock file_lock > > assign NULL file > > put fd to be unused > > unlock file_lock > > get new fd is same as old > > assign newfile to same fd > > fget_flight() > > > get the newfile!!! > > decrease file->f_count > > file->f_count == 0 > > --> try to release file > > > > The race is when T1 try to close one oldFD, T3 is trying to ioctl the oldFD, > > if currently the new T2 is trying to open a newfile, it maybe get the newFD is > > same as oldFD. > > > > And normal case T3 should get NULL file pointer due to released by T1, but T3 > > get the newfile pointer, and continue the ioctl accessing. > > > > It maybe causes unexpectable error, we hit one system panic at > do_vfs_ioctl(). > > > > Here we can fix it that putting "put_unused_fd()" after filp_close(), > > it can avoid this case. > > NAK. T3 getting the new file is valid (think what happens if T1 returns from > close() before T2 enters open() and T3 hits ioctl() after both of those), > the userland code is, at the very least, racy and no, moving put_unused_fd() > around is not going to solve any problems - it might shift the race window, > but that's it. Yes, the fix is really insane, I realized it after sent it. But I think the race is there, right? > > It certainly does not affect the possibility of panics in do_vfs_ioctl() > you are seeing and I would really like to see the details on those instead of > this kind of voodoo "fixes". The whole panic stack is below, and my kernel code is: int do_vfs_ioctl(struct file *filp, unsigned int fd, unsigned int cmd, unsigned long arg) { int error = 0; int __user *argp = (int __user *)arg; struct inode *inode = filp->f_path.dentry->d_inode; === > Panic here, due to filp->f_path.dentry == NULL, then accessing 0x20 failed. Could you share some scenario for this case? Thanks. <1>[ 392.448004] BUG: unable to handle kernel NULL pointer dereference at 00000020 <1>[ 392.456626] IP: [<c132c043>] do_vfs_ioctl+0x23/0x570 <4>[ 392.462545] *pde = 00000000 <4>[ 392.466101] Oops: 0000 [#1] PREEMPT SMP <4>[ 392.471284] Modules linked in: atomisp lm3554 ov2722 imx1x5 atmel_mxt_ts vxd392 videobuf_vmalloc videobuf_core hdmi_audio bcm_bt_lpm bcm43241(O) kct_daemon(O) <4>[ 392.489564] <4>[ 392.491397] Pid: 3625, comm: Binder_4 Tainted: G O 3.4.43-187546-g85c9adb #1 <4>[ 392.501069] EIP: 0060:[<c132c043>] EFLAGS: 00010296 CPU: 0 <4>[ 392.507379] EIP is at do_vfs_ioctl+0x23/0x570 <4>[ 392.512539] EAX: 00000000 EBX: e7fb8cc0 ECX: c0186201 EDX: c0186201 <4>[ 392.519721] ESI: 00000009 EDI: 79022b9c EBP: db30df90 ESP: db30df1c <4>[ 392.527040] DS: 007b ES: 007b FS: 00d8 GS: 003b SS: 0068 <4>[ 392.533253] CR0: 80050033 CR2: 00000020 CR3: 2765a000 CR4: 001007d0 <4>[ 392.540558] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000 <4>[ 392.547752] DR6: ffff0ff0 DR7: 00000400 <0>[ 392.552336] Process Binder_4 (pid: 3625, ti=db30c000 task=e7602c00 task.ti=db30c000) <0>[ 392.561177] Stack: <4>[ 392.563707] db30df00 c14d8344 00000000 00000001 00000001 f1204910 00000000 c7db3320 <4>[ 392.573881] 76f4feb4 00000000 00000002 00000000 00000000 00000000 00000000 e7a1bd88 <4>[ 392.584192] 00000002 00000001 00000000 76f4feb4 db30dfac c128aad0 00000000 e7fb8cc0 <0>[ 392.594362] Call Trace: <4>[ 392.597272] [<c14d8344>] ? security_file_permission+0x24/0xb0 <4>[ 392.603979] [<c128aad0>] ? sys_futex+0x80/0x130 <4>[ 392.609428] [<c131dd14>] ? fget_light+0x44/0xe0 <4>[ 392.614764] [<c132c5f8>] sys_ioctl+0x68/0x80 <4>[ 392.619930] [<c1a89331>] syscall_call+0x7/0xb <0>[ 392.625066] Code: ff ff f3 90 eb 8e 66 90 55 89 e5 83 ec 74 89 5d f4 89 75 f8 89 7d fc 3e 8d 74 26 00 89 c3 8b 40 0c 81 f9 52 54 00 00 89 d6 89 ca <8b> 78 20 0f 84 74 03 00 00 76 72 81 f9 77 58 04 c0 0f 84 c6 02 <0>[ 392.656788] EIP: [<c132c043>] do_vfs_ioctl+0x23/0x570 SS:ESP 0068:db30df1c <4>[ 392.664976] CR2: 0000000000000020 <1>[ 392.669816] BUG: unable to handle kernel NULL pointer dereference at 00000020 <1>[ 392.678055] IP: [<c131c8a7>] vfs_read+0x97/0x160 <4>[ 392.683460] *pde = 00000000 <4>[ 392.686835] Oops: 0000 [#2] PREEMPT SMP <4>[ 392.691545] Modules linked in: atomisp lm3554 ov2722 imx1x5 atmel_mxt_ts vxd392 videobuf_vmalloc videobuf_core hdmi_audio bcm_bt_lpm bcm43241(O) kct_daemon(O) <4>[ 392.708638] <4>[ 392.710379] Pid: 345, comm: adbd Tainted: G O 3.4.43-187546-g85c9adb #1 <4>[ 392.719179] EIP: 0060:[<c131c8a7>] EFLAGS: 00010206 CPU: 0 <4>[ 392.725449] EIP is at vfs_read+0x97/0x160 <4>[ 392.730015] EAX: 00000018 EBX: f108a6c0 ECX: 00000000 EDX: 00000000 <4>[ 392.737154] ESI: 00000001 EDI: 00000018 EBP: f0e23f8c ESP: f0e23f6c <4>[ 392.744251] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 <4>[ 392.750424] CR0: 8005003b CR2: 00000020 CR3: 30cc6000 CR4: 001007d0 <4>[ 392.757517] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000 <4>[ 392.764658] DR6: ffff0ff0 DR7: 00000400 <0>[ 392.769022] Process adbd (pid: 345, ti=f0e22000 task=f1806880 task.ti=f0e22000) <0>[ 392.777329] Stack: <4>[ 392.779657] f0e23f9c 00000000 f108a6c0 c17df320 c131dd14 f108a6c0 080b7440 00000018 <4>[ 392.789056] f0e23fac c131c9ad f0e23f9c 00000001 00000000 00000000 00000006 080b7440 <4>[ 392.798400] f0e22000 c1a89331 00000006 080ba17c 00000018 080b7440 00000018 40310d48 <0>[ 392.807831] Call Trace: <4>[ 392.810647] [<c17df320>] ? mtp_ioctl+0x410/0x410 <4>[ 392.816034] [<c131dd14>] ? fget_light+0x44/0xe0 <4>[ 392.821282] [<c131c9ad>] sys_read+0x3d/0x70 <4>[ 392.826145] [<c1a89331>] syscall_call+0x7/0xb <4>[ 392.831246] [<c1a80000>] ? uncompress_inline.isra.42+0x86/0xff <0>[ 392.837950] Code: 89 f2 8b 40 08 89 45 ec 85 c0 8b 45 08 89 04 24 89 d8 0f 84 b4 00 00 00 8b 75 ec ff d6 89 c7 85 ff 7e 6c 8b 53 0c be 01 00 00 00 <8b> 42 20 89 45 f0 0f b7 00 66 25 00 f0 66 3d 00 40 b8 01 00 00 <0>[ 392.864457] EIP: [<c131c8a7>] vfs_read+0x97/0x160 SS:ESP 0068:f0e23f6c <4>[ 392.872059] CR2: 0000000000000020 <4>[ 392.877014] ---[ end trace ed88a6d4a6a2b4b7 ]--- <0>[ 392.882285] Kernel panic - not syncing: Fatal exception -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html