On 2023/6/7 09:42, bibo, mao wrote:
>
>
> On 2023/6/7 04:45, Bjorn Helgaas wrote:
>> On Tue, Jun 06, 2023 at 06:06:04PM +0800, bibo, mao wrote:
>>> Huacai,
>>>
>>> Although I posted this patch, I think this should be arch-specific
>>> rather than a general problem. x86 has solved this problem, and arm64
>>> with 64K page size is not popular. However, LoongArch has this
>>> problem, since its page size is 16K rather than 4K. It is a problem of
>>> LoongArch systems rather than a generic issue.
>>
>> I think this is only related to the page size, not to any
>> LoongArch-specific details.  If that's the case, I don't think the
>> change should be arch-specific.
>>
>>> There was such a discussion before:
>>> https://patchwork.kernel.org/project/linux-pci/patch/22400b8828ad44ddbccb876cc5ca3b11@xxxxxxxxxxxxxxxxxxxxxxx/#19319457
>>>
>>> The UEFI BIOS sets PCI memory space 4K-aligned, however the LoongArch
>>> kernel rescans the PCI bus and reassigns the PCI memory resources. So
>>> it looks strange like this; here is the PCI memory info on my 7A2000
>>> board:
>>>
>>> root@user-pc:~# lspci -vvv | grep Region
>>>         Region 5: Memory at e003526e800 (32-bit, non-prefetchable) [size=1K]
>>>         Region 0: Memory at e003526ec00 (64-bit, non-prefetchable) [size=1K]
>>
>> I guess these BARs are from different devices, and the problem you're
>> pointing out is that both BARs end up in the same 16K page (actually
>> even the same 4K page):
>>
>>   (gdb) p/x 0xe003526e800 >> 12
>>   $1 = 0xe003526e
>>   (gdb) p/x 0xe003526ec00 >> 12
>>   $2 = 0xe003526e
>>
>> I agree that's a potential issue because a user mmap of the first
>> device also gets access to the BAR of the second device.  But it
>> doesn't seem to be a *new* or LoongArch-specific issue.
>>
>> So I *think* the point of this patch is to ensure that BARs of
>> different devices never share a page.  Why is that suddenly an issue
>> for LoongArch?  Is it only because you see more sharing with 16K pages
>> than other arches do with 4K pages?
> The UEFI BIOS has assigned the PCI devices with minimal 4K alignment; I
> guess the Linux kernel on x86/ARM uses those resources directly rather
> than reassigning them. So the problem does exist for hot-plug devices;
> however, most hot-plug devices are used in server systems and are
> high-speed devices whose PCI resource size is larger than 4K, so it is
> not obvious on x86/ARM systems.
>
> However, on LoongArch systems with 16K page size the problem is
> obvious, since PCI devices with a 4K resource size are popular.
>>
>>> On 2023/6/6 16:45, Bibo Mao wrote:
>>>> Some PCI devices have only a 4K memory space size; that is normal on
>>>> general machines and aligned with the page size. However, some
>>>> architectures support different page sizes: the default page size on
>>>> LoongArch is 16K, and ARM64 supports page sizes varying from 4K to
>>>> 64K. On machines where a larger page size is used, the memory space
>>>> regions of two different PCI devices may end up in one page. That is
>>>> not safe with MMU protection; also, the VFIO PCI device driver
>>>> requires the base address of PCI memory space to be page aligned, so
>>>> that it can be memory-mapped to QEMU user space when the device is
>>>> passed through to a VM.
>>
>> The minimum memory BAR per spec is 128 bytes (not 4K).  You just
>> demonstrated that even with 4K pages, different devices can share a
>> page.
>>
>>>> The page size alignment requirement consumes more PCI memory
>>>> resource; that should not be a problem on 64-bit systems.
>>>>
>>>> Signed-off-by: Bibo Mao <maobibo@xxxxxxxxxxx>
>>>> ---
>>>>  drivers/pci/setup-res.c | 8 ++++++++
>>>>  1 file changed, 8 insertions(+)
>>>>
>>>> diff --git a/drivers/pci/setup-res.c b/drivers/pci/setup-res.c
>>>> index 967f9a758923..55440ae0128d 100644
>>>> --- a/drivers/pci/setup-res.c
>>>> +++ b/drivers/pci/setup-res.c
>>>> @@ -339,6 +339,14 @@ int pci_assign_resource(struct pci_dev *dev, int resno)
>>>>  		return -EINVAL;
>>>>  	}
>>>>
>>>> +#ifdef CONFIG_64BIT
>>>> +	/*
>>>> +	 * force minimum page alignment for vfio pci usage
>>>> +	 * supposing there is enough pci memory resource on 64bit system
>>>> +	 */
>>>> +	if (res->flags & IORESOURCE_MEM)
>>>> +		align = max_t(resource_size_t, PAGE_SIZE, align);
>>>> +#endif
>>
>> Why is this under CONFIG_64BIT?  That doesn't have anything to do with
>> the BAR size.  The only reason I can see is that with CONFIG_64BIT=y,
>> we *might* have more MMIO space to use for BARs.
> Yes, I assume that there is enough PCI memory resource on 64-bit
> systems; after all, this consumes more memory resource and brings a
> negative effect. As for UIO and SR-IOV requirements, those are mostly
> 64-bit server systems, which can afford to waste some PCI memory
> resource.

Can we add an extra config option and let each architecture select
whether to enable it? Here is a piece of code:

#ifdef CONFIG_PCI_MEMRES_PAGE_ALIGN
	/*
	 * force minimum page alignment for vfio pci usage
	 */
	if (res->flags & IORESOURCE_MEM)
		align = max_t(resource_size_t, PAGE_SIZE, align);
#endif

Regards
Bibo, Mao

>
> Regards
> Bibo, Mao
>>
>>>>  	size = resource_size(res);
>>>>  	ret = _pci_assign_resource(dev, resno, size, align);
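
As an illustration of Bjorn's arithmetic, here is a minimal stand-alone
C sketch (not taken from the thread) extending the page-number check to
16K pages; the two addresses are the BARs from the lspci output quoted
above, and both shifts yield the same page number:

/* Do the two quoted BARs fall into the same page?  Shift 12 models
 * 4K pages, shift 14 models LoongArch's 16K pages. */
#include <stdio.h>

int main(void)
{
	unsigned long bar_a = 0xe003526e800UL;	/* Region 5, first device  */
	unsigned long bar_b = 0xe003526ec00UL;	/* Region 0, second device */

	printf("4K  page numbers: %#lx %#lx\n", bar_a >> 12, bar_b >> 12);
	printf("16K page numbers: %#lx %#lx\n", bar_a >> 14, bar_b >> 14);
	return 0;
}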
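
And a sketch of the user-side exposure Bjorn describes: a user mapping
of one device's 1K BAR through sysfs cannot be finer-grained than one
page, so anything else sharing that page (here, the second device's
BAR) is mapped along with it.  The sysfs device path below is
hypothetical:

/* Map a 1K BAR via its sysfs resource file; the mapping is a whole
 * page (16K on LoongArch), so a neighbouring device's BAR in the
 * same page becomes accessible too.  The device path is made up. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
	long page = sysconf(_SC_PAGESIZE);
	int fd = open("/sys/bus/pci/devices/0000:00:06.1/resource5", O_RDWR);

	if (fd < 0)
		return 1;

	/* One page is the smallest possible mapping granularity. */
	void *bar = mmap(NULL, page, PROT_READ | PROT_WRITE,
			 MAP_SHARED, fd, 0);
	if (bar == MAP_FAILED) {
		close(fd);
		return 1;
	}

	printf("1K BAR mapped with %ld-byte granularity\n", page);
	munmap(bar, page);
	close(fd);
	return 0;
}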