On 16.03.2023 14:53, Alex Deucher wrote:
> On Thu, Mar 16, 2023 at 9:48 AM Juergen Gross <jgross@xxxxxxxx> wrote:
>>
>> On 16.03.23 14:45, Alex Deucher wrote:
>>> On Thu, Mar 16, 2023 at 3:50 AM Jan Beulich <jbeulich@xxxxxxxx> wrote:
>>>>
>>>> On 16.03.2023 00:25, Stefano Stabellini wrote:
>>>>> On Wed, 15 Mar 2023, Jan Beulich wrote:
>>>>>> On 15.03.2023 01:52, Stefano Stabellini wrote:
>>>>>>> On Mon, 13 Mar 2023, Jan Beulich wrote:
>>>>>>>> On 12.03.2023 13:01, Huang Rui wrote:
>>>>>>>>> Xen PVH is the paravirtualized mode and takes advantage of hardware
>>>>>>>>> virtualization support when possible. It will using the hardware IOMMU
>>>>>>>>> support instead of xen-swiotlb, so disable swiotlb if current domain is
>>>>>>>>> Xen PVH.
>>>>>>>>
>>>>>>>> But the kernel has no way (yet) to drive the IOMMU, so how can it get
>>>>>>>> away without resorting to swiotlb in certain cases (like I/O to an
>>>>>>>> address-restricted device)?
>>>>>>>
>>>>>>> I think Ray meant that, thanks to the IOMMU setup by Xen, there is no
>>>>>>> need for swiotlb-xen in Dom0. Address translations are done by the IOMMU
>>>>>>> so we can use guest physical addresses instead of machine addresses for
>>>>>>> DMA. This is a similar case to Dom0 on ARM when the IOMMU is available
>>>>>>> (see include/xen/arm/swiotlb-xen.h:xen_swiotlb_detect, the corresponding
>>>>>>> case is XENFEAT_not_direct_mapped).
>>>>>>
>>>>>> But how does Xen using an IOMMU help with, as said, address-restricted
>>>>>> devices? They may still need e.g. a 32-bit address to be programmed in,
>>>>>> and if the kernel has memory beyond the 4G boundary not all I/O buffers
>>>>>> may fulfill this requirement.
>>>>>
>>>>> In short, it is going to work as long as Linux has guest physical
>>>>> addresses (not machine addresses, those could be anything) lower than
>>>>> 4GB.
>>>>>
>>>>> If the address-restricted device does DMA via an IOMMU, then the device
>>>>> gets programmed by Linux using its guest physical addresses (not machine
>>>>> addresses).
>>>>>
>>>>> The 32-bit restriction would be applied by Linux to its choice of guest
>>>>> physical address to use to program the device, the same way it does on
>>>>> native. The device would be fine as it always uses Linux-provided <4GB
>>>>> addresses. After the IOMMU translation (pagetable setup by Xen), we
>>>>> could get any address, including >4GB addresses, and that is expected to
>>>>> work.
>>>>
>>>> I understand that's the "normal" way of working. But whatever the swiotlb
>>>> is used for in baremetal Linux, that would similarly require its use in
>>>> PVH (or HVM) aiui. So unconditionally disabling it in PVH would look to
>>>> me like an incomplete attempt to disable its use altogether on x86. What
>>>> difference of PVH vs baremetal am I missing here?
>>>
>>> swiotlb is not usable for GPUs even on bare metal. They often have
>>> hundreds or megs or even gigs of memory mapped on the device at any
>>> given time. Also, AMD GPUs support 44-48 bit DMA masks (depending on
>>> the chip family).
>>
>> But the swiotlb isn't per device, but system global.
>
> Sure, but if the swiotlb is in use, then you can't really use the GPU.
> So you get to pick one.

Yet that "pick one" then can't be an unconditional disable in the source
code. If there's no way to avoid swiotlb on a per-device basis, then users
will need to be told to arrange for this via command line option when they
want to use the GPU in certain ways.

Jan
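
For illustration only, the address-restriction argument in this thread can be
modelled with a small Python sketch. This is not kernel code: the helper name
needs_bounce and the example addresses are hypothetical, and the check is a
simplification of what a DMA layer does when it decides whether a buffer can
be handed to a device directly or has to go through a low bounce buffer such
as the swiotlb. The addresses are whatever the device gets programmed with
(guest physical addresses in the PVH-with-IOMMU case described above).

```python
GIB = 1 << 30


def needs_bounce(addr: int, size: int, dma_mask_bits: int) -> bool:
    """Return True if any byte of [addr, addr + size) is beyond the DMA mask.

    addr          -- address the device would be programmed with (hypothetical)
    size          -- buffer length in bytes, must be positive
    dma_mask_bits -- number of address bits the device can drive (e.g. 32, 44)
    """
    if size <= 0:
        raise ValueError("size must be positive")
    dma_mask = (1 << dma_mask_bits) - 1   # highest address the device can reach
    last_byte = addr + size - 1           # highest address the buffer touches
    return last_byte > dma_mask


if __name__ == "__main__":
    # 32-bit-only device, buffer placed at 5 GiB: not directly addressable.
    print(needs_bounce(5 * GIB, 4096, 32))          # True
    # Same device, buffer below 4 GiB: fine without bouncing.
    print(needs_bounce(1 * GIB, 4096, 32))          # False
    # Buffer straddling the 4 GiB boundary still needs bouncing.
    print(needs_bounce(4 * GIB - 2048, 4096, 32))   # True
    # Device with a 44-bit mask: the 5 GiB buffer is reachable.
    print(needs_bounce(5 * GIB, 4096, 44))          # False
```

The first and third cases are the ones the question above is about: as long as
the kernel owns memory beyond 4G, some buffers fail this check for a 32-bit
device, regardless of what translation happens behind the IOMMU afterwards.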