Re: [PATCH] platform/x86: ISST: Fix the KASAN report slab-out-of-bounds bug

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, 2024-09-20 at 00:22 +0800, Zach Wade wrote:
> 
> 
> 
Hi Wade,


...
...

> This issue was discovered when the hard drive was passed directly to
> the 
> virtual machine and the driver was automatically loaded.
> 
> The virtual machine executes lsblk internally and can see the
> physical 
> hard disk I am directly using:
> lsblk -S
> NAME HCTL       TYPE VENDOR   MODEL                           REV
> SERIAL 
>              TRAN
> sda  32:0:0:0   disk ATA      SAMSUNG MZ7LH240HAHQ-00005 HXT7904Q 
> S45RNC0T166451
> sdb  32:0:1:0   disk ATA      INTEL SSDSC2KG480G8        XCV10100 
> BTYG84910BR5480BGN
> 
So seems these devices are using same PCI Intel vendor device ID as SST
device after emulation.
These are not SST devices.

What is 
sudo lspci -vvv

Also cat /proc/cpuinfo?

Thanks,
Srinivas

> 
> > I don't think lspci in VM will show this device.
> > Can you send lspci -k?
> 
> lspci -k
> 00:00.0 Host bridge: Intel Corporation 440BX/ZX/DX - 82443BX/ZX/DX
> Host 
> bridge (rev 01)
>          Subsystem: VMware Virtual Machine Chipset
> 00:01.0 PCI bridge: Intel Corporation 440BX/ZX/DX - 82443BX/ZX/DX AGP
> bridge (rev 01)
> 00:07.0 ISA bridge: Intel Corporation 82371AB/EB/MB PIIX4 ISA (rev
> 08)
>          Subsystem: VMware Virtual Machine Chipset
> 00:07.1 IDE interface: Intel Corporation 82371AB/EB/MB PIIX4 IDE (rev
> 01)
>          Subsystem: VMware Virtual Machine Chipset
>          Kernel driver in use: ata_piix
>          Kernel modules: pata_acpi, ata_generic
> 00:07.3 Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 08)
>          Subsystem: VMware Virtual Machine Chipset
>          Kernel modules: i2c_piix4
> 00:07.7 System peripheral: VMware Virtual Machine Communication 
> Interface (rev 10)
>          Subsystem: VMware Virtual Machine Communication Interface
>          Kernel driver in use: vmw_vmci
>          Kernel modules: vmw_vmci
> 00:0f.0 VGA compatible controller: VMware SVGA II Adapter
>          Subsystem: VMware SVGA II Adapter
>          Kernel driver in use: vmwgfx
>          Kernel modules: vmwgfx
> 02:00.0 Serial Attached SCSI controller: VMware PVSCSI SCSI
> Controller 
> (rev 02)
>          DeviceName: SCSI0
>          Subsystem: VMware PVSCSI SCSI Controller
>          Kernel driver in use: vmw_pvscsi
>          Kernel modules: vmw_pvscsi
> 02:01.0 USB controller: VMware USB1.1 UHCI Controller
>          DeviceName: usb
>          Subsystem: VMware Device 1976
>          Kernel driver in use: uhci_hcd
> 02:02.0 Ethernet controller: VMware VMXNET3 Ethernet Controller (rev
> 01)
>          DeviceName: Ethernet0
>          Subsystem: VMware VMXNET3 Ethernet Controller
>          Kernel driver in use: vmxnet3
>          Kernel modules: vmxnet3
> 02:03.0 USB controller: VMware USB2 EHCI Controller
>          DeviceName: ehci
>          Subsystem: VMware USB2 EHCI Controller
>          Kernel driver in use: ehci-pci
> 02:04.0 SATA controller: VMware SATA AHCI controller
>          DeviceName: sata0
>          Subsystem: VMware SATA AHCI controller
>          Kernel driver in use: ahci
> 
> > 
This is not complete list.


> > I want to make sure somehow your other VM PCI device is using same
> > ID
> > as this device.
> > 
> 
> Are you referring to transferring this hard drive to another virtual 
> machine? The action is too big, unfortunately I cannot do it this
> way.
> This physical hard drive is only directly connected to this virtual 
> machine for use.
> 
> > 
> > > > > Here on the virtualized environment, seems the
> > > > > topology_physical_package_id() (from CPU APIC ID in non
> > > > > virtualized
> > > > > case) is assigned some big value, which is more than max
> > > > > packages
> > > > > in
> > > > > the system.
> > > > > 
> > > > > But your fix is good as topology_logical_package_id() should
> > > > > be
> > > > > less
> > > > > than value returned by topology_max_packages() and hence
> > > > > avoid
> > > > > this
> > > > > issue.
> > > > > 
> > > > > Can you confirm the value returned by
> > > > > topology_logical_package_id() and
> > > > > topology_physical_package_id()?
> > > 
> > > cat /proc/cpuinfo | grep "physical id"
> > > physical id     : 0
> > > physical id     : 2
> > > physical id     : 4
> > > ......
> > > physical id     : 58
> > > physical id     : 60
> > > physical id     : 62
> > 
> > > 
> > > I calculated topology_max_packages() * sizeof (* isst_pkg.info)
> > > in
> > > isst_if_cpu_info_init, and focused on pkg_id and bus_no in
> > > _isst_if_get_pci_dev.
> > > The printk printed result is as follows:
> > > [   51.879700] Allocated size: 512
> > 
> > Here topology_max_packages() returned 32.
> 
> yes.
> 
> > 
> > > [   51.880148] pkg_id: 0, bus_no: 0
> > > [   51.881242] pkg_id: 0, bus_no: 1
> > > [   51.884209] pkg_id: 2, bus_no: 0
> > > [   51.884571] pkg_id: 2, bus_no: 1
> > > [   51.884931] pkg_id: 4, bus_no: 0
> > > [   51.885313] pkg_id: 4, bus_no: 1
> > > ......
> > > [   51.899134] pkg_id: 28, bus_no: 0
> > > [   51.899511] pkg_id: 28, bus_no: 1
> > > [   51.899909] pkg_id: 30, bus_no: 0
> > > [   51.901012] pkg_id: 30, bus_no: 1
> > > [   51.902160]
> > > =================================================================
> > > =
> > > [   51.902936] BUG: KASAN: slab-out-of-bounds in
> > > _isst_if_get_pci_dev.cold+0xde/0xe4 [isst_if_common]
> > > [   51.982707]
> > > =================================================================
> > > =
> > Package ID is 32, so it will overflow. There seems to be only 32
> > packages.
> > If you print topology_logical_package_id(), you will have no gaps,
> > and
> > will be 0-31.
> > 
> > Can you also print topology_logical_package_id() to confirm.
> 
> After adding printk to _isst_if_get_pci_dev, the changes in pkd_id
> can 
> be seen as follows:
> [   18.078652] pkg_id:0
> [   18.078669] pkg_id:0
> [   18.079215] pkg_id:1
> [   18.080920] pkg_id:1
> [   18.081847] pkg_id:2
> [   18.082756] pkg_id:2
> [   18.088928] pkg_id:3
> [   18.089839] pkg_id:3
> ......
> [   18.185462] pkg_id:30
> [   18.185471] pkg_id:30
> [   18.185561] pkg_id:31
> [   18.185569] pkg_id:31
> 
> > 
> > topology_max_packages() returns max __max_logical_packages, so
> > topology_logical_package_id() will be better here.
> > 
> > > [   51.985453] pkg_id: 32, bus_no: 0
> > > [   51.986569] pkg_id: 32, bus_no: 1
> > > [   51.988501] pkg_id: 34, bus_no: 0
> > > [   51.989616] pkg_id: 34, bus_no: 1
> > > ......
> > > [   52.059749] pkg_id: 58, bus_no: 0
> > > [   52.062331] pkg_id: 58, bus_no: 1
> > > [   52.066039] pkg_id: 60, bus_no: 0
> > > [   52.068503] pkg_id: 60, bus_no: 1
> > > [   52.072018] pkg_id: 62, bus_no: 0
> > > [   52.074375] pkg_id: 62, bus_no: 1
> > > 
> > > > > 
> > > > > We can change commit description based on that.
> > > > > 
> > > > > Thanks,
> > > > > Srinivas
> > > > > 
> > > 
> > > I think the changes are minor, so no more content was added to
> > > the
> > > patch.
> > > If you think it needs to be added, I am happy for you to help
> > > supplement
> > > it.
> > I just want to be clear how to reproduce this issue.
> > 
> > "
> > Attaching SST PCI device to VM causes "BUG: KASAN: slab-out-of-
> > bounds".
> > 
> > Then you can add the kasan bug report.
> > 
> > The reason for this error is physical_package_ids assigned by VMM
> > have
> > holes. This will cause value returned by
> > topology_physical_package_id()
> > to be more than topology_max_packages(). The allocation uses
> > topology_max_packages() to allocate memory. topology_max_packages()
> > returns maximum logical package IDs. Hence use
> > topology_logical_package_id() instead of
> > topology_physical_package_id().
> > "
> 
> Thanks, I will make the necessary modifications in v2.
> 
> > 
> > Also we should add a check
> > 
> > pkg_id = topology_logical_package_id(cpu);
> > if (pkg_id >= topology_max_packages())
> >         return NULL;
> > 
> > May be VMM has holes in logical IDs also, then atleast it will not
> > cause BUG.
> 
> Great suggestion, I will add it in the new v2.
> 
> Thanks,
> Srinivas
> 
> > 
> > Thanks,
> > Srinivas
> > 
> > 
> > > Thanks,
> > > Zach
> > > 
> > > > >    after loading the
> > > > > isst_if_common and isst_if_mbox_msr modules on the 64 core,
> > > > > the
> > > > > kasan
> > > > > report was triggered.
> > > > > After consulting the kernel manual
> > > > > (Documentation/arch/x86/topology.rst),
> > > > > I think in _isst_if_get_pci_dev, topology_physical_package_id
> > > > > should
> > > > > be
> > > > > replaced with topology_logical_package_id.
> > > > > 
> > > > > kasan bug report:
> > > > > [   19.411889]
> > > > > =============================================================
> > > > > ====
> > > > > =
> > > > > [   19.413702] BUG: KASAN: slab-out-of-bounds in
> > > > > _isst_if_get_pci_dev+0x3d5/0x400 [isst_if_common]
> > > > > [   19.415634] Read of size 8 at addr ffff888829e65200 by
> > > > > task
> > > > > cpuhp/16/113
> > > > > [   19.417368]
> > > > > [   19.418627] CPU: 16 PID: 113 Comm: cpuhp/16 Tainted: G
> > > > > E      6.9.0 #10
> > > > > [   19.420435] Hardware name: VMware, Inc. VMware20,1/440BX
> > > > > Desktop
> > > > > Reference Platform, BIOS VMW201.00V.20192059.B64.2207280713
> > > > > 07/28/2022
> > > > > [   19.422687] Call Trace:
> > > > > [   19.424091]  <TASK>
> > > > > [   19.425448]  dump_stack_lvl+0x5d/0x80
> > > > > [   19.426963]  ? _isst_if_get_pci_dev+0x3d5/0x400
> > > > > [isst_if_common]
> > > > > [   19.428694]  print_report+0x19d/0x52e
> > > > > [   19.430206]  ? __pfx__raw_spin_lock_irqsave+0x10/0x10
> > > > > [   19.431837]  ? _isst_if_get_pci_dev+0x3d5/0x400
> > > > > [isst_if_common]
> > > > > [   19.433539]  kasan_report+0xf0/0x170
> > > > > [   19.435019]  ? _isst_if_get_pci_dev+0x3d5/0x400
> > > > > [isst_if_common]
> > > > > [   19.436709]  _isst_if_get_pci_dev+0x3d5/0x400
> > > > > [isst_if_common]
> > > > > [   19.438379]  ? __pfx_sched_clock_cpu+0x10/0x10
> > > > > [   19.439910]  isst_if_cpu_online+0x406/0x58f
> > > > > [isst_if_common]
> > > > > [   19.441573]  ? __pfx_isst_if_cpu_online+0x10/0x10
> > > > > [isst_if_common]
> > > > > [   19.443263]  ? ttwu_queue_wakelist+0x2c1/0x360
> > > > > [   19.444797]  cpuhp_invoke_callback+0x221/0xec0
> > > > > [   19.446337]  cpuhp_thread_fun+0x21b/0x610
> > > > > [   19.447814]  ? __pfx_cpuhp_thread_fun+0x10/0x10
> > > > > [   19.449354]  smpboot_thread_fn+0x2e7/0x6e0
> > > > > [   19.450859]  ? __pfx_smpboot_thread_fn+0x10/0x10
> > > > > [   19.452405]  kthread+0x29c/0x350
> > > > > [   19.453817]  ? __pfx_kthread+0x10/0x10
> > > > > [   19.455253]  ret_from_fork+0x31/0x70
> > > > > [   19.456685]  ? __pfx_kthread+0x10/0x10
> > > > > [   19.458114]  ret_from_fork_asm+0x1a/0x30
> > > > > [   19.459573]  </TASK>
> > > > > [   19.460853]
> > > > > [   19.462055] Allocated by task 1198:
> > > > > [   19.463410]  kasan_save_stack+0x30/0x50
> > > > > [   19.464788]  kasan_save_track+0x14/0x30
> > > > > [   19.466139]  __kasan_kmalloc+0xaa/0xb0
> > > > > [   19.467465]  __kmalloc+0x1cd/0x470
> > > > > [   19.468748]  isst_if_cdev_register+0x1da/0x350
> > > > > [isst_if_common]
> > > > > [   19.470233]  isst_if_mbox_init+0x108/0xff0
> > > > > [isst_if_mbox_msr]
> > > > > [   19.471670]  do_one_initcall+0xa4/0x380
> > > > > [   19.472903]  do_init_module+0x238/0x760
> > > > > [   19.474105]  load_module+0x5239/0x6f00
> > > > > [   19.475285]  init_module_from_file+0xd1/0x130
> > > > > [   19.476506]  idempotent_init_module+0x23b/0x650
> > > > > [   19.477725]  __x64_sys_finit_module+0xbe/0x130
> > > > > [   19.476506]  idempotent_init_module+0x23b/0x650
> > > > > [   19.477725]  __x64_sys_finit_module+0xbe/0x130
> > > > > [   19.478920]  do_syscall_64+0x82/0x160
> > > > > [   19.480036]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> > > > > [   19.481292]
> > > > > [   19.482205] The buggy address belongs to the object at
> > > > > ffff888829e65000
> > > > >    which belongs to the cache kmalloc-512 of size 512
> > > > > [   19.484818] The buggy address is located 0 bytes to the
> > > > > right
> > > > > of
> > > > >    allocated 512-byte region [ffff888829e65000,
> > > > > ffff888829e65200)
> > > > > [   19.487447]
> > > > > [   19.488328] The buggy address belongs to the physical
> > > > > page:
> > > > > [   19.489569] page: refcount:1 mapcount:0
> > > > > mapping:0000000000000000
> > > > > index:0xffff888829e60c00 pfn:0x829e60
> > > > > [   19.491140] head: order:3 entire_mapcount:0
> > > > > nr_pages_mapped:0
> > > > > pincount:0
> > > > > [   19.492466] anon flags:
> > > > > 0x57ffffc0000840(slab|head|node=1|zone=2|lastcpupid=0x1fffff)
> > > > > [   19.493914] page_type: 0xffffffff()
> > > > > [   19.494988] raw: 0057ffffc0000840 ffff88810004cc80
> > > > > 0000000000000000 0000000000000001
> > > > > [   19.496451] raw: ffff888829e60c00 0000000080200018
> > > > > 00000001ffffffff 0000000000000000
> > > > > [   19.497906] head: 0057ffffc0000840 ffff88810004cc80
> > > > > 0000000000000000 0000000000000001
> > > > > [   19.499379] head: ffff888829e60c00 0000000080200018
> > > > > 00000001ffffffff 0000000000000000
> > > > > [   19.500844] head: 0057ffffc0000003 ffffea0020a79801
> > > > > ffffea0020a79848 00000000ffffffff
> > > > > [   19.502316] head: 0000000800000000 0000000000000000
> > > > > 00000000ffffffff 0000000000000000
> > > > > [   19.503784] page dumped because: kasan: bad access
> > > > > detected
> > > > > [   19.505058]
> > > > > [   19.505970] Memory state around the buggy address:
> > > > > [   19.507172]  ffff888829e65100: 00 00 00 00 00 00 00 00 00
> > > > > 00
> > > > > 00 00
> > > > > 00 00 00 00
> > > > > [   19.508599]  ffff888829e65180: 00 00 00 00 00 00 00 00 00
> > > > > 00
> > > > > 00 00
> > > > > 00 00 00 00
> > > > > [   19.510013] >ffff888829e65200: fc fc fc fc fc fc fc fc fc
> > > > > fc
> > > > > fc fc
> > > > > fc fc fc fc
> > > > > [   19.510014]                    ^
> > > > > [   19.510016]  ffff888829e65280: fc fc fc fc fc fc fc fc fc
> > > > > fc
> > > > > fc fc
> > > > > fc fc fc fc
> > > > > [   19.510018]  ffff888829e65300: fc fc fc fc fc fc fc fc fc
> > > > > fc
> > > > > fc fc
> > > > > fc fc fc fc
> > > > > [   19.515367]
> > > > > =============================================================
> > > > > ====
> > > > > =
> > > > > 
> > > > > Fixes: 9a1aac8a96dc ("platform/x86: ISST: PUNIT device
> > > > > mapping
> > > > > with
> > > > > Sub-NUMA clustering")
> > > > > Signed-off-by: Zach Wade <zachwade.k@xxxxxxxxx>
> > > > > ---
> > > > >    drivers/platform/x86/intel/speed_select_if/isst_if_common.
> > > > > c | 2
> > > > > +-
> > > > >    1 file changed, 1 insertion(+), 1 deletion(-)
> > > > > 
> > > > > diff --git
> > > > > a/drivers/platform/x86/intel/speed_select_if/isst_if_common.c
> > > > > b/drivers/platform/x86/intel/speed_select_if/isst_if_common.c
> > > > > index 10e21563fa46..80654aacd5bd 100644
> > > > > ---
> > > > > a/drivers/platform/x86/intel/speed_select_if/isst_if_common.c
> > > > > +++
> > > > > b/drivers/platform/x86/intel/speed_select_if/isst_if_common.c
> > > > > @@ -316,7 +316,7 @@ static struct pci_dev
> > > > > *_isst_if_get_pci_dev(int
> > > > > cpu, int bus_no, int dev, int fn
> > > > >               cpu >= nr_cpu_ids || cpu >=
> > > > > num_possible_cpus())
> > > > >                   return NULL;
> > > > >    
> > > > > -       pkg_id = topology_physical_package_id(cpu);
> > > > > +       pkg_id = topology_logical_package_id(cpu);
> > > > >    
> > > > >           bus_number = isst_cpu_info[cpu].bus_info[bus_no];
> > > > >           if (bus_number < 0)
> > > > 
> > > 
> > 
> 






[Index of Archives]     [Linux Kernel Development]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux