On Tue, 24 May 2022, Oleksandr wrote:
> > On Mon, 23 May 2022, Oleksandr wrote:
> > > > > On Thu, 19 May 2022, Oleksandr wrote:
> > > > > > > On Wed, May 18, 2022 at 5:06 PM Oleksandr <olekstysh@xxxxxxxxx>
> > > > > > > wrote:
> > > > > > > > On 18.05.22 17:32, Arnd Bergmann wrote:
> > > > > > > > > On Sat, May 7, 2022 at 7:19 PM Oleksandr Tyshchenko
> > > > > > > > > <olekstysh@xxxxxxxxx> wrote:
> > > > > > > > > > This would mean having a device
> > > > > > > > > > node for the grant-table mechanism that can be referred to
> > > > > > > > > > using the 'iommus'
> > > > > > > > > > phandle property, with the domid as an additional argument.
> > > > > > > > I assume, you are speaking about something like the following?
> > > > > > > >
> > > > > > > >
> > > > > > > > xen_dummy_iommu {
> > > > > > > > compatible = "xen,dummy-iommu";
> > > > > > > > #iommu-cells = <1>;
> > > > > > > > };
> > > > > > > >
> > > > > > > > virtio@3000 {
> > > > > > > > compatible = "virtio,mmio";
> > > > > > > > reg = <0x3000 0x100>;
> > > > > > > > interrupts = <41>;
> > > > > > > >
> > > > > > > > /* The device is located in Xen domain with ID 1 */
> > > > > > > > iommus = <&xen_dummy_iommu 1>;
> > > > > > > > };
> > > > > > > Right, that's the idea,
> > > > > > thank you for the confirmation
> > > > > >
> > > > > >
> > > > > >
> > > > > > > except I would not call it a 'dummy'.
> > > > > > > From the perspective of the DT, this behaves just like an
> > > > > > > IOMMU,
> > > > > > > even if the exact mechanism is different from most hardware IOMMU
> > > > > > > implementations.
> > > > > > well, agree
> > > > > >
> > > > > >
> > > > > > > > > It does not quite fit the model that Linux currently uses for
> > > > > > > > > iommus,
> > > > > > > > > as that has an allocator for dma_addr_t space
> > > > > > > > yes (# 3/7 adds grant-table based allocator)
> > > > > > > >
> > > > > > > >
> > > > > > > > > , but I would think it's
> > > > > > > > > conceptually close enough that it makes sense for the binding.
> > > > > > > > Interesting idea. I am wondering, do we need any extra actions
> > > > > > > > for this to work in a Linux guest (dummy IOMMU driver, etc.)?
> > > > > > > It depends on how closely the guest implementation can be made to
> > > > > > > resemble a normal iommu. If you do allocate dma_addr_t addresses,
> > > > > > > it may actually be close enough that you can just turn the
> > > > > > > grant-table
> > > > > > > code into a normal iommu driver and change nothing else.
> > > > > > Unfortunately, I failed to find a way to use grant references at the
> > > > > > iommu_ops level (I mean, to fully pretend that we are an IOMMU
> > > > > > driver). I am not too familiar with that, so what is written below
> > > > > > might be wrong or at least imprecise.
> > > > > >
> > > > > > The normal IOMMU driver in Linux doesn’t allocate DMA addresses by
> > > > > > itself; it just maps (IOVA-PA) what the upper layer requested to be
> > > > > > mapped. The DMA address allocation is done by the upper layer
> > > > > > (DMA-IOMMU, which is the glue layer between the DMA API and the
> > > > > > IOMMU API, allocates an IOVA for a PA?). But all we need here is to
> > > > > > allocate our specific grant-table based DMA addresses (DMA address =
> > > > > > grant reference + offset in the page), so let’s say we need an
> > > > > > entity that takes a physical address as a parameter and returns a
> > > > > > DMA address (which is actually what commit #3/7 does), and that’s
> > > > > > all. So, working at the dma_ops layer, we get exactly what we need
> > > > > > with minimal changes to the guest infrastructure. In our case, Xen
> > > > > > itself acts as an IOMMU.
> > > > > >
> > > > > > Assume that we want to reuse the IOMMU infrastructure somehow for
> > > > > > our needs. I think that in that case we will likely need to
> > > > > > introduce a new specific IOVA allocator (alongside the generic one)
> > > > > > to be hooked up by the DMA-IOMMU layer if we run on top of Xen. But
> > > > > > even with a specific IOVA allocator that returns what we actually
> > > > > > need (DMA address = grant reference + offset in the page), we will
> > > > > > still need a specific, minimal IOMMU driver to be present in the
> > > > > > system in order to track the mappings(?) and do nothing with them,
> > > > > > returning success (this specific IOMMU driver should have all
> > > > > > mandatory callbacks implemented).
> > > > > >
> > > > > > I completely agree, it would be really nice to reuse the generic
> > > > > > IOMMU bindings rather than introducing a Xen-specific property, if
> > > > > > what we are trying to implement in the current patch series more or
> > > > > > less fits the usage of "iommus" in Linux. But if we have to add more
> > > > > > complexity/more components to the code for the sake of reusing the
> > > > > > device tree binding, this raises the question of whether that’s
> > > > > > worthwhile.
> > > > > >
> > > > > > Or did I really miss something?
> > > > > I think Arnd was primarily suggesting to reuse the IOMMU Device Tree
> > > > > bindings, not necessarily the IOMMU drivers framework in Linux
> > > > > (although
> > > > > that would be an added bonus.)
> > > > >
> > > > > I know from previous discussions with you that making the grant table
> > > > > fit in the existing IOMMU drivers model is difficult, but just reusing
> > > > > the Device Tree bindings seems feasible?
> > > > I started experimenting with that. As I wrote in a separate email, I got a
> > > > deferred probe timeout after inserting the required nodes into the guest
> > > > device tree, which seems to be a consequence of the unavailability of the
> > > > IOMMU. I will continue to investigate this question.
> > >
> > > I have experimented with that. Yes, just reusing the Device Tree bindings
> > > is technically feasible (and we are able to do this by only touching
> > > grant-dma-ops.c), although the deferred probe timeout still stands (as
> > > there is actually no IOMMU driver present).
> > >
> > > [ 0.583771] virtio-mmio 2000000.virtio: deferred probe timeout, ignoring dependency
> > > [ 0.615556] virtio_blk virtio0: [vda] 4096000 512-byte logical blocks (2.10 GB/1.95 GiB)
> > >
> > >
> > > Below is the working diff (on top of the current series):
> > >
> > > diff --git a/drivers/xen/grant-dma-ops.c b/drivers/xen/grant-dma-ops.c
> > > index da9c7ff..6586152 100644
> > > --- a/drivers/xen/grant-dma-ops.c
> > > +++ b/drivers/xen/grant-dma-ops.c
> > > @@ -272,17 +272,24 @@ static const struct dma_map_ops xen_grant_dma_ops = {
> > >
> > > bool xen_is_grant_dma_device(struct device *dev)
> > > {
> > > + struct device_node *iommu_np;
> > > + bool has_iommu;
> > > +
> > > /* XXX Handle only DT devices for now */
> > > if (!dev->of_node)
> > > return false;
> > >
> > > - return of_property_read_bool(dev->of_node, "xen,backend-domid");
> > > + iommu_np = of_parse_phandle(dev->of_node, "iommus", 0);
> > > + has_iommu = iommu_np && of_device_is_compatible(iommu_np, "xen,grant-dma");
> > > + of_node_put(iommu_np);
> > > +
> > > + return has_iommu;
> > > }
> > >
> > > void xen_grant_setup_dma_ops(struct device *dev)
> > > {
> > > struct xen_grant_dma_data *data;
> > > - uint32_t domid;
> > > + struct of_phandle_args iommu_spec;
> > >
> > > data = find_xen_grant_dma_data(dev);
> > > if (data) {
> > > @@ -294,16 +301,30 @@ void xen_grant_setup_dma_ops(struct device *dev)
> > > if (!dev->of_node)
> > > goto err;
> > >
> > > - if (of_property_read_u32(dev->of_node, "xen,backend-domid", &domid)) {
> > > - dev_err(dev, "xen,backend-domid property is not present\n");
> > > + if (of_parse_phandle_with_args(dev->of_node, "iommus", "#iommu-cells",
> > > + 0, &iommu_spec)) {
> > > + dev_err(dev, "Cannot parse iommus property\n");
> > > + goto err;
> > > + }
> > > +
> > > + if (!of_device_is_compatible(iommu_spec.np, "xen,grant-dma") ||
> > > + iommu_spec.args_count != 1) {
> > > + dev_err(dev, "Incompatible IOMMU node\n");
> > > + of_node_put(iommu_spec.np);
> > > goto err;
> > > }
> > >
> > > + of_node_put(iommu_spec.np);
> > > +
> > > data = devm_kzalloc(dev, sizeof(*data), GFP_KERNEL);
> > > if (!data)
> > > goto err;
> > >
> > > - data->backend_domid = domid;
> > > + /*
> > > + * The endpoint ID here means the ID of the domain where the corresponding
> > > + * backend is running
> > > + */
> > > + data->backend_domid = iommu_spec.args[0];
> > >
> > > if (xa_err(xa_store(&xen_grant_dma_devices, (unsigned long)dev, data,
> > > GFP_KERNEL))) {
> > >
> > >
> > >
> > > Below are the nodes generated by the Xen toolstack:
> > >
> > > xen_grant_dma {
> > > compatible = "xen,grant-dma";
> > > #iommu-cells = <0x01>;
> > > phandle = <0xfde9>;
> > > };
> > >
> > > virtio@2000000 {
> > > compatible = "virtio,mmio";
> > > reg = <0x00 0x2000000 0x00 0x200>;
> > > interrupts = <0x00 0x01 0xf01>;
> > > interrupt-parent = <0xfde8>;
> > > dma-coherent;
> > > iommus = <0xfde9 0x01>;
> > > };
> > Not bad! I like it.
>
>
> Good.
>
>
>
> >
> > > I am wondering, what would be the proper solution to eliminate the
> > > deferred probe timeout issue in our particular case (without introducing
> > > an extra IOMMU driver)?
> > In reality I don't think there is a way to do that. I would create an
> > empty skeleton IOMMU driver for xen,grant-dma.
>
> Ok, I found yet another option for avoiding the deferred probe timeout issue.
> I am not sure whether it will be welcome, but it doesn't really require
> introducing a stub IOMMU driver or other changes in the guest. The idea is to
> make the IOMMU device unavailable (status = "disabled"); this way
> of_iommu_configure() will also treat that as a success condition.
>
> https://elixir.bootlin.com/linux/v5.18/source/drivers/iommu/of_iommu.c#L31
> https://elixir.bootlin.com/linux/v5.18/source/drivers/iommu/of_iommu.c#L149
>
> xen_grant_dma {
> compatible = "xen,grant-dma";
> #iommu-cells = <0x01>;
> phandle = <0xfde9>;
> status = "disabled";
> };
> virtio@2000000 {
> compatible = "virtio,mmio";
> reg = <0x00 0x2000000 0x00 0x200>;
> interrupts = <0x00 0x01 0xf01>;
> interrupt-parent = <0xfde8>;
> dma-coherent;
> iommus = <0xfde9 0x01>;
> };
>
> I have checked, this "fixes" deferred probe timeout issue.
>
>
> Or we do indeed need to introduce a stub IOMMU driver (I placed it in
> drivers/xen instead of drivers/iommu; we could even squash it into
> grant-dma-ops.c?). This stub driver also results in the NO_IOMMU condition
> (as the "of_xlate" callback is not implemented).
>
> diff --git a/drivers/xen/Kconfig b/drivers/xen/Kconfig
> index a7bd8ce..35b91b9 100644
> --- a/drivers/xen/Kconfig
> +++ b/drivers/xen/Kconfig
> @@ -335,6 +335,10 @@ config XEN_UNPOPULATED_ALLOC
> having to balloon out RAM regions in order to obtain physical memory
> space to create such mappings.
>
> +config XEN_GRANT_DMA_IOMMU
> + bool
> + select IOMMU_API
> +
> config XEN_GRANT_DMA_OPS
> bool
> select DMA_OPS
> @@ -343,6 +347,7 @@ config XEN_VIRTIO
> bool "Xen virtio support"
> depends on VIRTIO
> select XEN_GRANT_DMA_OPS
> + select XEN_GRANT_DMA_IOMMU
> help
> Enable virtio support for running as Xen guest. Depending on the
> guest type this will require special support on the backend side
> diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile
> index 1a23cb0..c0503f1 100644
> --- a/drivers/xen/Makefile
> +++ b/drivers/xen/Makefile
> @@ -40,3 +40,4 @@ xen-privcmd-y := privcmd.o privcmd-buf.o
> obj-$(CONFIG_XEN_FRONT_PGDIR_SHBUF) += xen-front-pgdir-shbuf.o
> obj-$(CONFIG_XEN_UNPOPULATED_ALLOC) += unpopulated-alloc.o
> obj-$(CONFIG_XEN_GRANT_DMA_OPS) += grant-dma-ops.o
> +obj-$(CONFIG_XEN_GRANT_DMA_IOMMU) += grant-dma-iommu.o
> diff --git a/drivers/xen/grant-dma-iommu.c b/drivers/xen/grant-dma-iommu.c
> new file mode 100644
> index 00000000..b8aad8a
> --- /dev/null
> +++ b/drivers/xen/grant-dma-iommu.c
> @@ -0,0 +1,76 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Stub IOMMU driver which does nothing.
> + * The main purpose of it being present is to reuse generic device-tree IOMMU
> + * bindings by Xen grant DMA-mapping layer.
> + */
> +
> +#include <linux/iommu.h>
> +#include <linux/of.h>
> +#include <linux/platform_device.h>
> +
> +struct grant_dma_iommu_device {
> + struct device *dev;
> + struct iommu_device iommu;
> +};
> +
> +/* Nothing is really needed here */
> +static const struct iommu_ops grant_dma_iommu_ops;
> +
> +static const struct of_device_id grant_dma_iommu_of_match[] = {
> + { .compatible = "xen,grant-dma" },
> + { },
> +};
> +
> +static int grant_dma_iommu_probe(struct platform_device *pdev)
> +{
> + struct grant_dma_iommu_device *mmu;
> + int ret;
> +
> + mmu = devm_kzalloc(&pdev->dev, sizeof(*mmu), GFP_KERNEL);
> + if (!mmu)
> + return -ENOMEM;
> +
> + mmu->dev = &pdev->dev;
> +
> + ret = iommu_device_register(&mmu->iommu, &grant_dma_iommu_ops, &pdev->dev);
> + if (ret)
> + return ret;
> +
> + platform_set_drvdata(pdev, mmu);
> +
> + return 0;
> +}
> +
> +static int grant_dma_iommu_remove(struct platform_device *pdev)
> +{
> + struct grant_dma_iommu_device *mmu = platform_get_drvdata(pdev);
> +
> + platform_set_drvdata(pdev, NULL);
> + iommu_device_unregister(&mmu->iommu);
> +
> + return 0;
> +}
> +
> +static struct platform_driver grant_dma_iommu_driver = {
> + .driver = {
> + .name = "grant-dma-iommu",
> + .of_match_table = grant_dma_iommu_of_match,
> + },
> + .probe = grant_dma_iommu_probe,
> + .remove = grant_dma_iommu_remove,
> +};
> +
> +static int __init grant_dma_iommu_init(void)
> +{
> + struct device_node *iommu_np;
> +
> + iommu_np = of_find_matching_node(NULL, grant_dma_iommu_of_match);
> + if (!iommu_np)
> + return 0;
> +
> + of_node_put(iommu_np);
> +
> + return platform_driver_register(&grant_dma_iommu_driver);
> +}
> +subsys_initcall(grant_dma_iommu_init);
>
> I have checked, this also "fixes" deferred probe timeout issue.
>
> Personally I would prefer the first option, but I would also be happy to use
> the second option in order to unblock the series.
>
> What do the maintainers think?
I don't think it is a good idea to mark the fake IOMMU as disabled,
because that implies there is no need to use it (no need to use
dma_ops), which is a problem.
If we don't want the skeleton driver, then Rob's suggestion of having a
skip list for deferred probe is better.
I think the skeleton driver is also totally fine.
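
For reference, the address scheme mentioned earlier in the thread (DMA
address = grant reference + offset in the page) can be sketched in plain C
as below. This is only a userspace illustration and not the code from
patch #3/7: the sketch_* names, the 4 KiB page size, and the exact bit
layout are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed for illustration only: 4 KiB pages, hypothetical names. */
#define SKETCH_PAGE_SHIFT 12u
#define SKETCH_PAGE_SIZE  (1u << SKETCH_PAGE_SHIFT)
#define SKETCH_PAGE_MASK  ((uint64_t)SKETCH_PAGE_SIZE - 1)

typedef uint32_t sketch_grant_ref_t;  /* stand-in for a grant reference */
typedef uint64_t sketch_dma_addr_t;   /* stand-in for dma_addr_t */

/* Pack a grant reference and an in-page offset into one "DMA address". */
static sketch_dma_addr_t sketch_grant_to_dma(sketch_grant_ref_t ref,
                                             uint32_t offset)
{
    /* The offset must stay within a single page. */
    assert(offset < SKETCH_PAGE_SIZE);
    return ((sketch_dma_addr_t)ref << SKETCH_PAGE_SHIFT) | offset;
}

/* Recover the grant reference from a packed address. */
static sketch_grant_ref_t sketch_dma_to_grant(sketch_dma_addr_t dma)
{
    return (sketch_grant_ref_t)(dma >> SKETCH_PAGE_SHIFT);
}

/* Recover the in-page offset from a packed address. */
static uint32_t sketch_dma_to_offset(sketch_dma_addr_t dma)
{
    return (uint32_t)(dma & SKETCH_PAGE_MASK);
}
```

With this layout the backend could recover both parts from the address it
receives, which is why no IOVA allocator or page-table mapping is needed
on the guest side in this model.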