Re: [PATCH v13 00/17] Introduce the Counter character device interface

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Jul 28, 2021 at 05:51:45PM +0900, William Breathitt Gray wrote:
> On Tue, Jul 13, 2021 at 06:53:04PM +0900, William Breathitt Gray wrote:
> > Suppose I open the chrdev from a userspace application and keep it open,
> > but then remove the respective driver and Counter subsystem module from
> > my system. The devm_counter_release() and counter_exit() functions will
> > be called as expected; the counter_chrdev_release() function will not be
> > called yet, but that is expected because the chrdev is still open by
> > userspace. If I try to break out of my userspace application, I expect
> > counter_chrdev_release() to finally be called, but this does not happen.
> > Instead, my userspace application stalls and I see the following error
> > in my dmesg:
> > 
> > [  172.859570] BUG: unable to handle page fault for address: ffffffffc09ae298
> > [  172.859594] #PF: supervisor read access in kernel mode
> > [  172.859598] #PF: error_code(0x0000) - not-present page
> > [  172.859603] PGD 23615067 P4D 23615067 PUD 23617067 PMD 1029ad067 PTE 0
> > [  172.859623] Oops: 0000 [#1] SMP NOPTI
> > [  172.859629] CPU: 2 PID: 2485 Comm: counter_example Not tainted 5.13.0+ #1
> > [  172.859640] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS d55cb5a 04/01/2014
> > [  172.859645] RIP: 0010:filp_close+0x29/0x70
> > [  172.859662] Code: 90 0f 1f 44 00 00 55 48 89 e5 41 56 41 55 41 54 48 8b 47 38 48 85 c0 0f 84 b7 40 86 00 48 8b 47 28 49 89 fc 49 89 f5 45 31 f6 <48> 8b 40 78 48 85 c0 74 08 ff d0 0f 1f 00 41 89 c6 41 f6 44 24 45
> > [  172.859669] RSP: 0018:ffffad31c0ee7cb0 EFLAGS: 00010246
> > [  172.859675] RAX: ffffffffc09ae220 RBX: 0000000000000001 RCX: 0000000000000001
> > [  172.859680] RDX: ffff9a43829708e0 RSI: ffff9a4382970840 RDI: ffff9a43821f4f00
> > [  172.859684] RBP: ffffad31c0ee7cc8 R08: 0000000000000001 R09: 0000000000000001
> > [  172.859687] R10: ffffffffffff4d00 R11: ffff9a43933c6e10 R12: ffff9a43821f4f00
> > [  172.859691] R13: ffff9a4382970840 R14: 0000000000000000 R15: 0000000000000003
> > [  172.859694] FS:  0000000000000000(0000) GS:ffff9a44b7d00000(0000) knlGS:0000000000000000
> > [  172.859699] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [  172.859704] CR2: ffffffffc09ae298 CR3: 0000000023610001 CR4: 0000000000370ee0
> > [  172.859713] Call Trace:
> > [  172.859730]  put_files_struct+0x73/0xd0
> > [  172.859738]  exit_files+0x49/0x50
> > [  172.859743]  do_exit+0x33b/0xa20
> > [  172.859751]  do_group_exit+0x3b/0xb0
> > [  172.859758]  get_signal+0x16f/0x8b0
> > [  172.859766]  ? _copy_to_user+0x20/0x30
> > [  172.859774]  ? put_timespec64+0x3d/0x60
> > [  172.859784]  arch_do_signal_or_restart+0xf3/0x850
> > [  172.859794]  ? hrtimer_nanosleep+0x9f/0x120
> > [  172.859802]  ? __hrtimer_init+0xd0/0xd0
> > [  172.859808]  exit_to_user_mode_prepare+0x122/0x1b0
> > [  172.859816]  syscall_exit_to_user_mode+0x27/0x50
> > [  172.859825]  do_syscall_64+0x48/0xc0
> > [  172.859831]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> > [  172.859839] RIP: 0033:0x7f07f8b9951a
> > [  172.859850] Code: Unable to access opcode bytes at RIP 0x7f07f8b994f0.
> > [  172.859853] RSP: 002b:00007ffc0d12c230 EFLAGS: 00000246 ORIG_RAX: 00000000000000e6
> > [  172.859860] RAX: fffffffffffffdfc RBX: ffffffffffffff01 RCX: 00007f07f8b9951a
> > [  172.859863] RDX: 00007ffc0d12c2b0 RSI: 0000000000000000 RDI: 0000000000000000
> > [  172.859867] RBP: 0000000000000000 R08: 0000000000000000 R09: 00007ffc0d12c1c6
> > [  172.859871] R10: 00007ffc0d12c2b0 R11: 0000000000000246 R12: 00007ffc0d12c2b0
> > [  172.859874] R13: 00007ffc0d12c2b0 R14: 0000000000000000 R15: 0000000000000000
> > [  172.859886] Modules linked in: intel_rapl_msr intel_rapl_common kvm_intel kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel nls_iso8859_1 aesni_intel crypto_simd cryptd rapl drm_ttm_helper ttm uvcvideo drm_kms_helper videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common input_leds syscopyarea videodev sysfillrect sysimgblt fb_sys_fops cec rc_core joydev mc drm serio_raw mac_hid qemu_fw_cfg sch_fq_codel msr parport_pc ppdev lp parport virtio_rng ip_tables x_tables autofs4 hid_generic usbhid hid virtio_net psmouse net_failover i2c_piix4 virtio_blk failover pata_acpi floppy [last unloaded: counter]
> > [  172.859995] CR2: ffffffffc09ae298
> > [  172.860009] ---[ end trace e7d3d7da1a73b8f4 ]---
> > [  172.860013] RIP: 0010:filp_close+0x29/0x70
> > [  172.860021] Code: 90 0f 1f 44 00 00 55 48 89 e5 41 56 41 55 41 54 48 8b 47 38 48 85 c0 0f 84 b7 40 86 00 48 8b 47 28 49 89 fc 49 89 f5 45 31 f6 <48> 8b 40 78 48 85 c0 74 08 ff d0 0f 1f 00 41 89 c6 41 f6 44 24 45
> > [  172.860027] RSP: 0018:ffffad31c0ee7cb0 EFLAGS: 00010246
> > [  172.860031] RAX: ffffffffc09ae220 RBX: 0000000000000001 RCX: 0000000000000001
> > [  172.860034] RDX: ffff9a43829708e0 RSI: ffff9a4382970840 RDI: ffff9a43821f4f00
> > [  172.860038] RBP: ffffad31c0ee7cc8 R08: 0000000000000001 R09: 0000000000000001
> > [  172.860041] R10: ffffffffffff4d00 R11: ffff9a43933c6e10 R12: ffff9a43821f4f00
> > [  172.860044] R13: ffff9a4382970840 R14: 0000000000000000 R15: 0000000000000003
> > [  172.860047] FS:  0000000000000000(0000) GS:ffff9a44b7d00000(0000) knlGS:0000000000000000
> > [  172.860052] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [  172.860056] CR2: ffffffffc09ae298 CR3: 0000000023610001 CR4: 0000000000370ee0
> > [  172.860073] Fixing recursive fault but reboot is needed!
> > 
> > It looks like faults in filp_close() before counter_chrdev_release() is
> > called. Is this issue manifesting because counter_exit() was called
> > earlier while the chrdev was still open?
> 
> After giving this some more thought I realized that once the counter
> module is unloaded, the counter_fops callbacks are lost and thus the
> fops for the chrdev are no longer be valid. This means that
> counter_chrdev_release will need to be called before the counter module
> is unloaded. How can I guarantee this if a userspace application may
> still have the chrdev indefinitely open? In counter_chrdev_remove(),
> should I walk through the open chrdevs and release each one?
> 
> William Breathitt Gray

Sorry for all the noise, I had simply forgotten to define the owner
member of counter_fops structure to THIS_MODULE. I'll make that fix and
also add in some conditional checks to protect the callbacks so they
return -ENODEV if the counter has been unregistered. With that I expect
to have a v14 submission ready some time this coming week.

William Breathitt Gray

Attachment: signature.asc
Description: PGP signature


[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Input]     [Linux Kernel]     [Linux SCSI]     [X.org]

  Powered by Linux