On Thu, Feb 08, 2024 at 01:22:47AM +0000, Robin Murphy wrote: > > So, if everything still works and something else is calling > > of_dma_configure() prior to using the struct device for any DMA > > operations (eg because a driver is always probed?) then we should just > > delete this call. > > I've considered that before - arguably it could have been removed when Mikko > implemented the full bus model, but now I kind of don't want to since it's > one of the remaining few that *is* still in the right place where it's > always meant to be. The correct fix to meet the original expectations > *should* be simply: It is a bit jarring to hear you call a rare place, where it doesn't even actually work right, "meet the original expectations". :\ It is pretty clear the function doesn't do what it was originally expected too anymore, and there are now so many call sites doing different things I don't think that past is so relevant. > ----->8----- > diff --git a/drivers/gpu/host1x/bus.c b/drivers/gpu/host1x/bus.c > index 84d042796d2e..6cab950690a0 100644 > --- a/drivers/gpu/host1x/bus.c > +++ b/drivers/gpu/host1x/bus.c > @@ -455,11 +455,11 @@ static int host1x_device_add(struct host1x *host1x, > device->dev.dma_mask = &device->dev.coherent_dma_mask; > dev_set_name(&device->dev, "%s", driver->driver.name); > device->dev.release = host1x_device_release; > - device->dev.bus = &host1x_bus_type; > device->dev.parent = host1x->dev; > > of_dma_configure(&device->dev, host1x->dev->of_node, true); > > + device->dev.bus = &host1x_bus_type; > device->dev.dma_parms = &device->dma_parms; > dma_set_max_seg_size(&device->dev, UINT_MAX); > > -----8<----- > > ...except you've now also broken that at the same time by removing the > dev->bus check from of_iommu_configure() :( Umm. Is there examples in the tree of this ordering? I looked for a while and I only found a bunch of counter examples.. Trying to order it before the bus is set feels like a big hack, something so subtle should not be part of a driver facing API. At this point, if it is to stay here, the functionality it needs probably wants a new function name and no entanglement with the iommu layer. > Frankly, for all you protest whenever I call you out for demonstrating a > lack of understanding of this tangled mess, I sure do seem to spend a lot of > time explaining it to you... :/ You never do actually explain it though. You throw some insults and make a few statements and often don't answer any followup questions. The code itself is not enlightening, there are no comments explaining these entanglements, these patterns you point to as "right" are not really followed. The designs you explain as "right" don't fully work. There are lots of little accumulated hacks to trip on. It is a huge red flag that something so important as probing the iommu is so completely baroque. > I've never claimed it's *not* a horrible mess, but at the risk of repeating > myself, it *is* fragile, and the consequences of mucking about with it are > tricky to reason about even when one does understand all the history of how > it's intended to work vs. what actually happens. To coin a phrase I find > enjoyable, this is definitely "F around and find out" code; as this and > other threads show, now you're well into the finding out part. Well sure, fragile is how this kind of kernel work usually goes. How many times has Thomas broke Xen refactoring x86 stuff? This is normal. It is impossible to know what every crazy driver has done, and stuff is getting broken. The point is to slowly make progress toward a robust and understandable design in the core layer and facing the drivers, especially one that releases the drivers of any complex understanding. We've made good progress on this front. The iommu drivers especially are slowly getting more uniform. Here we got a log message that points to something really WTF going on. John's case is different and seems to have identified a real race. I think we are doing pretty good on this. It would be nice to come to conclusion on what should be done here that improves and makes the driver facing API more robust. But it is just a log message so maybe it is fine to leave it. > > Was the above issue fixed in commit 07397df29e57 ("dma-mapping: move > > dma configuration to bus infrastructure") ? > > Um, no. That commit was no more than code movement essentially separating > PCI from non-PCI configuration; merely host1x didn't need to share the full > path which ended up as platform_dma_configure() since host1x doesn't support > ACPI. Once again the fact that I've had to explain that drives me to utter > despair... Okay, I didn't go through the whole patch, I was only looking at the hunks in drivers/gpu/host1x/bus.c that added a new of_dma_configure() call. Regards, Jason