Re: [PATCH v2] vfio: Follow a strict lifetime for struct iommu_group

On Tue, 2022-10-04 at 14:22 -0400, Matthew Rosato wrote:
> On 10/4/22 1:36 PM, Christian Borntraeger wrote:
> > 
> > 
> > On 04.10.22 at 18:28, Jason Gunthorpe wrote:
> > > On Tue, Oct 04, 2022 at 05:44:53PM +0200, Christian Borntraeger wrote:
> > > 
> > > > > Does some userspace have the group FD open when it gets stuck like this,
> > > > > e.g. what does fuser say?
> > > > 
> > > > /proc/<virtnodedevd>/fd
> > > > 51480 0 dr-x------. 2 root root  0  4. Okt 17:16 .
> > > > 43593 0 dr-xr-xr-x. 9 root root  0  4. Okt 17:16 ..
> > > > 65252 0 lr-x------. 1 root root 64  4. Okt 17:42 0 -> /dev/null
> > > > 65253 0 lrwx------. 1 root root 64  4. Okt 17:42 1 -> 'socket:[51479]'
> > > > 65261 0 lrwx------. 1 root root 64  4. Okt 17:42 10 -> 'anon_inode:[eventfd]'
> > > > 65262 0 lrwx------. 1 root root 64  4. Okt 17:42 11 -> 'socket:[51485]'
> > > > 65263 0 lrwx------. 1 root root 64  4. Okt 17:42 12 -> 'socket:[51487]'
> > > > 65264 0 lrwx------. 1 root root 64  4. Okt 17:42 13 -> 'socket:[51486]'
> > > > 65265 0 lrwx------. 1 root root 64  4. Okt 17:42 14 -> 'anon_inode:[eventfd]'
> > > > 65266 0 lrwx------. 1 root root 64  4. Okt 17:42 15 -> 'socket:[60421]'
> > > > 65267 0 lrwx------. 1 root root 64  4. Okt 17:42 16 -> 'anon_inode:[eventfd]'
> > > > 65268 0 lrwx------. 1 root root 64  4. Okt 17:42 17 -> 'socket:[28008]'
> > > > 65269 0 l-wx------. 1 root root 64  4. Okt 17:42 18 -> /run/libvirt/nodedev/driver.pid
> > > > 65270 0 lrwx------. 1 root root 64  4. Okt 17:42 19 -> 'socket:[28818]'
> > > > 65254 0 lrwx------. 1 root root 64  4. Okt 17:42 2 -> 'socket:[51479]'
> > > > 65271 0 lr-x------. 1 root root 64  4. Okt 17:42 20 -> '/dev/vfio/3 (deleted)'
> > > 
> > > Seems like a userspace bug to keep the group FD open after the /dev/
> > > file has been deleted :|
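
For anyone wondering how an FD ends up pointing at a "(deleted)" path, here is a minimal userspace sketch, assuming a Linux system with /proc mounted. It uses an ordinary temporary file rather than a vfio group node, and the helper names and the /tmp path are made up for illustration. It only demonstrates the generic behaviour that an open descriptor outlives the name it was opened through.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Create a temporary file, unlink it while keeping the descriptor open,
 * and copy the /proc/self/fd/<fd> symlink target into buf.
 * Returns the still-open fd on success, -1 on failure.
 * (Hypothetical helper, not taken from libvirt or the kernel.)
 */
static int open_then_unlink(char *buf, size_t len)
{
	char path[] = "/tmp/fd-demo-XXXXXX";	/* illustrative path only */
	char link[64];
	ssize_t n;
	int fd;

	assert(len > 1);

	fd = mkstemp(path);
	if (fd < 0)
		return -1;

	/* Remove the name; the open descriptor keeps the inode alive. */
	if (unlink(path) < 0) {
		close(fd);
		return -1;
	}

	snprintf(link, sizeof(link), "/proc/self/fd/%d", fd);
	n = readlink(link, buf, len - 1);
	if (n < 0) {
		close(fd);
		return -1;
	}
	buf[n] = '\0';
	return fd;
}

/* True if a /proc fd symlink target carries the " (deleted)" suffix. */
static bool target_is_deleted(const char *target)
{
	const char *suffix = " (deleted)";
	size_t tlen = strlen(target), slen = strlen(suffix);

	return tlen >= slen && strcmp(target + tlen - slen, suffix) == 0;
}
```

The descriptor stays usable until it is closed, which matches fd 20 in the listing above still being held after /dev/vfio/3 went away.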
> > > 
> > > What do you think about this?
> > > 
> > > commit a54a852b1484b1605917a8f4d80691db333b25ed
> > > Author: Jason Gunthorpe <jgg@xxxxxxxx>
> > > Date:   Tue Oct 4 13:14:37 2022 -0300
> > > 
> > >      vfio: Make the group FD disassociate from the iommu_group
> > > 
> > >      Allow the vfio_group struct to exist with a NULL iommu_group pointer. When
> > >      the pointer is NULL the vfio_group users promise not to touch the
> > >      iommu_group. This allows a driver to be hot unplugged while userspace is
> > >      keeping the group FD open.
> > > 
> > >      SPAPR mode is excluded from this behavior because of how it wrongly hacks
> > >      part of its iommu interface through KVM. Due to this we lose control over
> > >      what it is doing and cannot revoke the iommu_group usage in the IOMMU
> > >      layer via vfio_group_detach_container().
> > > 
> > >      Thus, for SPAPR the group FDs must still be closed before a device can be
> > >      hot unplugged.
> > > 
> > >      This fixes a userspace regression where we learned that virtnodedevd
> > >      leaves a group FD open even though the /dev/ node for it has been deleted
> > >      and all the drivers for it unplugged.
> > > 
> > >      Fixes: ca5f21b25749 ("vfio: Follow a strict lifetime for struct iommu_group")
> > >      Reported-by: Christian Borntraeger <borntraeger@xxxxxxxxxxxxx>
> > >      Signed-off-by: Jason Gunthorpe <jgg@xxxxxxxxxx>
> > 
> > Almost :-)
> > 
> > drivers/vfio/vfio_main.c: In function 'vfio_file_is_group':
> > drivers/vfio/vfio_main.c:1606:47: error: expected ')' before ';' token
> >  1606 |         return (file->f_op == &vfio_group_fops;
> >       |                ~                              ^
> >       |                                               )
> > drivers/vfio/vfio_main.c:1606:48: error: expected ';' before '}' token
> >  1606 |         return (file->f_op == &vfio_group_fops;
> >       |                                                ^
> >       |                                                ;
> >  1607 | }
> >       | ~
> > 
> > 
> > With that fixed I get:
> > 
> > ERROR: modpost: "vfio_file_is_group" [drivers/vfio/pci/vfio-pci-core.ko] undefined!
> > 
> > With that worked around (m -> y)
> 
> 
> Looks like this can be solved with
> EXPORT_SYMBOL_GPL(vfio_file_is_group);
> 
> Also:
> 
> arch/s390/kvm/../../../virt/kvm/vfio.c:64:28: warning: ‘kvm_vfio_file_iommu_group’ defined but not used [-Wunused-function]
>    64 | static struct iommu_group *kvm_vfio_file_iommu_group(struct file *file)
>       |                            ^~~~~~~~~~~~~~~~~~~~~~~~~
> 
> kvm_vfio_file_iommu_group looks like it is now SPAPR-only
> 
> > 
> > 
> > Tested-by: Christian Borntraeger <borntraeger@xxxxxxxxxxxxx>
> > 
> > At least the vfio-ap part
> 

I can reproduce the problem with vfio-ccw as well, and it also traces back to this
patch. With the changes described above, things work as they did before.
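
To spell out what I mean by "the changes described above": a rough userspace mock of the two build fixes from the quoted text (the missing closing parenthesis and the export Matthew suggested). The struct definitions and the EXPORT_SYMBOL_GPL stub here are stand-ins I made up so the fragment compiles outside the kernel. They are not the real kernel definitions, and this is not the actual hunk from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel types; NOT the real definitions. */
struct file_operations { int unused; };
struct file { const struct file_operations *f_op; };

/*
 * In the kernel this macro makes the symbol available to GPL modules
 * (such as vfio-pci-core.ko). Here it expands to a harmless forward
 * declaration so the mock builds as ordinary C.
 */
#define EXPORT_SYMBOL_GPL(sym) struct export_stub_##sym

static const struct file_operations vfio_group_fops = { 0 };

/* True if the file was opened through the vfio group file operations. */
bool vfio_file_is_group(struct file *file)
{
	/* Mock-only sanity check; not part of the kernel helper. */
	assert(file != NULL);

	return file->f_op == &vfio_group_fops;
}
EXPORT_SYMBOL_GPL(vfio_file_is_group);
```

Without the export, building vfio-pci-core as a module fails at modpost as Christian saw, which is why building it in (m -> y) worked around the problem.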

Tested-by: Eric Farman <farman@xxxxxxxxxxxxx> # vfio-ccw, vfio-ap

I can try a v2 when the below gets addressed.

> Nope, with this s390 vfio-pci at least breaks:
> 
> [  132.943389] kernel BUG at lib/list_debug.c:53!
> [  132.943406] monitor event: 0040 ilc:2 [#1] SMP 
> [  132.943410] Modules linked in: vfio_pci kvm vfio_pci_core irqbypass vfio_virqfd vhost_vsock vmw_vsock_virtio_transport_common vsock vhost vhost_iotlb nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables nfnetlink sunrpc mlx5_ib ism smc ib_uverbs ib_core uvdevice s390_trng tape_3590 tape tape_class eadm_sch vfio_ccw mdev vfio_iommu_type1 vfio zcrypt_cex4 sch_fq_codel configfs ghash_s390 prng chacha_s390 libchacha mlx5_core aes_s390 des_s390 libdes sha3_512_s390 nvme sha3_256_s390 sha512_s390 sha256_s390 nvme_core sha1_s390 sha_common zfcp scsi_transport_fc pkey zcrypt rng_core autofs4 [last unloaded: vfio_pci]
> [  132.943457] CPU: 12 PID: 4991 Comm: nose2 Tainted: G        W          6.0.0-rc4 #40
> [  132.943460] Hardware name: IBM 3931 A01 782 (LPAR)
> [  132.943462] Krnl PSW : 0704c00180000000 00000000cbc90568 (__list_del_entry_valid+0xd8/0xf0)
> [  132.943469]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
> [  132.943474] Krnl GPRS: 8000000000000001 0000000900000027 000000000000004e 00000000ccc1ffe0
> [  132.943477]            00000000fffeffff 00000009fc290000 0000000000000000 0000000000000080
> [  132.943480]            00000000acc86438 0000000000000000 00000000acc86420 00000000a1492800
> [  132.943483]            00000000922a0000 000003ffb9dce260 00000000cbc90564 0000038004a6b9f8
> [  132.943489] Krnl Code: 00000000cbc90558: c0200045eff3        larl    %r2,00000000cc54e53e
> [  132.943489]            00000000cbc9055e: c0e50022c7d9        brasl   %r14,00000000cc0e9510
> [  132.943489]           #00000000cbc90564: af000000            mc      0,0
> [  132.943489]           >00000000cbc90568: b9040032            lgr     %r3,%r2
> [  132.943489]            00000000cbc9056c: c0200045efd4        larl    %r2,00000000cc54e514
> [  132.943489]            00000000cbc90572: c0e50022c7cf        brasl   %r14,00000000cc0e9510
> [  132.943489]            00000000cbc90578: af000000            mc      0,0
> [  132.943489]            00000000cbc9057c: 0707                bcr     0,%r7
> [  132.943510] Call Trace:
> [  132.943512]  [<00000000cbc90568>] __list_del_entry_valid+0xd8/0xf0
> [  132.943515] ([<00000000cbc90564>] __list_del_entry_valid+0xd4/0xf0)
> [  132.943518]  [<000003ff8011a1b8>] vfio_group_detach_container+0x88/0x170 [vfio]
> [  132.943524]  [<000003ff801176c0>] vfio_device_remove_group.isra.0+0xb0/0x1e0 [vfio]
> [  132.943529]  [<000003ff804f9e54>] vfio_pci_core_unregister_device+0x34/0x80 [vfio_pci_core]
> [  132.943535]  [<000003ff804ae1c4>] vfio_pci_remove+0x2c/0x40 [vfio_pci]
> [  132.943539]  [<00000000cbd58c3c>] pci_device_remove+0x3c/0x98
> [  132.943542]  [<00000000cbdbdbce>] device_release_driver_internal+0x1c6/0x288
> [  132.943545]  [<00000000cbd4e284>] pci_stop_bus_device+0x94/0xc0
> [  132.943549]  [<00000000cbd4e570>] pci_stop_and_remove_bus_device_locked+0x30/0x48
> [  132.943552]  [<00000000cb55d980>] zpci_bus_remove_device+0x68/0xa8
> [  132.943555]  [<00000000cb556e82>] zpci_deconfigure_device+0x3a/0xe0
> [  132.943558]  [<00000000cbd65d04>] power_write_file+0x7c/0x130
> [  132.943561]  [<00000000cb8fbc90>] kernfs_fop_write_iter+0x138/0x210
> [  132.943565]  [<00000000cb837344>] vfs_write+0x194/0x2e0
> [  132.943568]  [<00000000cb8376fa>] ksys_write+0x6a/0xf8
> [  132.943571]  [<00000000cc0f918c>] __do_syscall+0x1d4/0x200
> [  132.943575]  [<00000000cc107e42>] system_call+0x82/0xb0
> [  132.943577] Last Breaking-Event-Address:
> [  132.943579]  [<00000000cc0e955c>] _printk+0x4c/0x58
> [  132.943585] Kernel panic - not syncing: Fatal exception: panic_on_oops
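
As background on the BUG above: __list_del_entry_valid is the list-debug sanity check that runs before a list entry is unlinked. The trace does not tell us which of its checks fired or why, so the following is only a simplified userspace sketch of that kind of check, loosely modelled on the idea and not copied from lib/list_debug.c. The poison values and function names are arbitrary choices for the sketch. It shows one way such a check can trip, namely deleting the same entry twice. Whether that is what happens in vfio_group_detach_container() here is an open question.

```c
#include <stdbool.h>
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

/* Arbitrary non-NULL markers for this sketch; the kernel defines its own. */
#define POISON_NEXT ((struct list_head *)0x100)
#define POISON_PREV ((struct list_head *)0x122)

static void list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Insert entry right after head. */
static void list_add(struct list_head *entry, struct list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/* An entry may be unlinked only if both neighbours still point back at it. */
static bool list_del_entry_valid(const struct list_head *entry)
{
	if (entry->next == NULL || entry->prev == NULL)
		return false;
	if (entry->next == POISON_NEXT || entry->prev == POISON_PREV)
		return false;
	return entry->prev->next == entry && entry->next->prev == entry;
}

/* Unlink entry; returns false where the debug kernel would hit BUG(). */
static bool list_del_checked(struct list_head *entry)
{
	if (!list_del_entry_valid(entry))
		return false;
	entry->next->prev = entry->prev;
	entry->prev->next = entry->next;
	entry->next = POISON_NEXT;	/* mark as already removed */
	entry->prev = POISON_PREV;
	return true;
}
```

In this sketch the first list_del_checked() on an entry succeeds and a second call on the same entry is rejected, because the entry was poisoned by the first removal.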




