Marc Zyngier <maz@xxxxxxxxxx> writes: > On Sun, 22 Jan 2023 09:00:04 +0000, > Leon Romanovsky <leon@xxxxxxxxxx> wrote: >> >> On Thu, Jan 19, 2023 at 07:06:32PM +0200, Alexander Shishkin wrote: >> > A malicious device can change its MSIX table size between the table >> > ioremap() and subsequent accesses, resulting in a kernel page fault in >> > pci_write_msg_msix(). >> > >> > To avoid this, cache the table size observed at the moment of table >> > ioremap() and use the cached value. This, however, does not help drivers >> > that peek at the PCIE_MSIX_FLAGS register directly. >> > >> > Signed-off-by: Alexander Shishkin <alexander.shishkin@xxxxxxxxxxxxxxx> >> > Reviewed-by: Mika Westerberg <mika.westerberg@xxxxxxxxxxxxxxx> >> > Cc: stable@xxxxxxxxxxxxxxx >> > --- >> > drivers/pci/msi/api.c | 7 ++++++- >> > drivers/pci/msi/msi.c | 2 +- >> > include/linux/pci.h | 1 + >> > 3 files changed, 8 insertions(+), 2 deletions(-) >> >> I'm not security expert here, but not sure that this protects from anything. >> 1. Kernel relies on working and not-malicious HW. There are gazillion ways >> to cause crashes other than changing MSI-X. >> 2. Device can report large table size, kernel will cache it and >> malicious device will reduce it back. It is not handled and will cause >> to kernel crash too. >> > > Indeed, this was my exact reaction reading this patch. This only makes > sure the same (potentially wrong) value is used at all times. So while > this results in a consistent use, this doesn't give much guarantee. It guarantees that the MSIX table is big enough to fit all the vectors, so it should prevent the page faults from out-of-bounds accesses. > The only way to deal with this is to actually handle the resulting > fault, similar to what the kernel does when accessing userspace. Not > sure how possible this is with something like PCIe. Do you mean replacing MMIO accesses with exception handling accessors? That seems like a monumental effort. And then we'd have to figure out how to handle errors in the __pci_write_msi_msg() path. Preventing page faults from happening in the first place seems like a more reasonable solution, or what do you think? Thanks, -- Alex