On Mon, 2023-05-01 at 17:12 +0100, Jonathan Cameron wrote: > On Tue, 18 Apr 2023 15:36:35 +0200 > Nuno Sá <noname.nuno@xxxxxxxxx> wrote: > > > On Sun, 2023-04-16 at 14:33 +0100, Jonathan Cameron wrote: > > > On Fri, 14 Apr 2023 11:11:43 +0200 > > > Nuno Sá <noname.nuno@xxxxxxxxx> wrote: > > > > > > > Hi Jonathan, > > > > > > > > Thanks for the feedback. Definitely helpful... > > > > > > > > On Mon, 2023-04-10 at 13:21 +0100, Jonathan Cameron wrote: > > > > > On Fri, 31 Mar 2023 16:40:44 +0200 > > > > > Nuno Sá <noname.nuno@xxxxxxxxx> wrote: > > > > > > > > > > > Hi Jonathan, > > > > > > > > > > > > > > > > Hmm. Complex topic, so I'd definitely like others to weigh in with > > > > > opinions . > > > > > > > > > > > There are some major issues with the implementation we have upstream > > > > > > for the adi-axi-adc driver. The only user is the ad9467 driver. In the > > > > > > rest of the email I will refer to ad9467 as the converter device and > > > > > > axi-adc as the IP core. > > > > > > > > > > > > Let me start to state that these designs, in a very a basic way, have a > > > > > > converter connected to a IP core that typically lives in a FPGA. This > > > > > > core connects to a DMA core (which is used by the DMA buffer > > > > > > implementation) so we can keep up with the high rates of these > > > > > > converters. So there's a link between these devices and we try to > > > > > > implement that so far looking at them as one single IIO device. > > > > > > > > > > > > Let's go to the major issues now: > > > > > > > > > > > > 1) There is a circular dependency between the two device. And when > > > > > > compiled as modules for example, we cannot really rmmod the modules > > > > > > anymore: > > > > > > > > > > > > "root@analog:~# rmmod ad9467 > > > > > > rmmod: ERROR: Module ad9467 is in use > > > > > > > > > > > > root@analog:~# rmmod adi-axi-adc.ko > > > > > > rmmod: ERROR: Module adi_axi_adc is in use by: ad9467 > > > > > > " > > > > > > > > > > > > This easy to understand as the ad9467 driver as a direct symbol > > > > > > dependency on the axi-adc one. And this one calls 'module_get()' as > > > > > > soon as it attaches to a "client" > > > > > > > > > > Ouch. module_get() never works for this. Long time ago I thought it > > > > > did :( An unbind will bypass that (as well any real hot unplug paths). > > > > > > > > > > > > > Yeps, it bypasses it but I just wanted to point out the flaw in the current > > > > design :) > > > > > > It was a surprise to me when Lars pointed this out years ago. > > > All we can do is treat it as a hint for something that makes > > > little sense for a user to deliberately do. > > > > > > > > > > > > > > > > > > > > > > > > 2) It's fairly easy to crash the kernel: > > > > > > > > > > > > " > > > > > > root@analog:/sys/bus/spi/drivers/ad9467# echo spi0.0 > unbind > > > > > > root@analog:/sys/bus/iio/devices/iio:device2# echo 1 > > > > > > > buffer0/in_voltage0_en > > > > > > root@analog:/sys/bus/iio/devices/iio:device2# echo 1 > buffer0/enable > > > > > > [ 132.349133] 8<--- cut here --- > > > > > > [ 132.352193] Unable to handle kernel paging request at virtual > > > > > > address e0940000 when read > > > > > > [ 132.360333] [e0940000] *pgd=0208b811, *pte=00000000, *ppte=00000000 > > > > > > [ 132.366668] Internal error: Oops: 7 [#1] SMP ARM > > > > > > [ 132.371289] Modules linked in: ad9467 adi_axi_adc ad9517 > > > > > > [ 132.376609] CPU: 1 PID: 444 Comm: bash Not tainted 6.2.9-dirty #3 > > > > > > [ 132.382701] Hardware name: Xilinx Zynq Platform > > > > > > [ 132.387223] PC is at adi_axi_adc_update_scan_mode+0x34/0x88 > > > > > > [adi_axi_adc] > > > > > > [ 132.394020] LR is at arm_heavy_mb+0x1c/0x38 > > > > > > [ 132.398212] pc : [<bf0060c4>] lr : [<c031820c>] psr: a0000013 > > > > > > [ 132.404392] sp : e0929e30 ip : deaddead fp : c4430270 > > > > > > [ 132.409678] r10: c8a0bc18 r9 : deaddeac r8 : c89b1c00 > > > > > > [ 132.414895] r7 : c4430340 r6 : c45db7c0 r5 : e093ffc0 r4 : > > > > > > 000003f0 > > > > > > [ 132.421413] r3 : 00010000 r2 : e0940000 r1 : 00000000 r0 : > > > > > > 00000000 > > > > > > " > > > > > > > > > > It's nasty but generally that sort of path can be prevented with some > > > > > careful locking and checking of indicators that a component has gone away. > > > > > So whenever you unbind one aspect it notifies the other one that it has > > > > > gone > > > > > away. I'm not sure where to currently look for best practice in this. > > > > > > > > > > There are a lot of similar situations - basically anywhere a set of > > > > > drivers end up hanging off two buses. v4l and media drivers in general > > > > > end up doing this a lot. > > > > > > > > > > One I'm familiar with is how some of the CXL tear down happens and > > > > > that lead me to device_release_driver() which is also used by usb etc. > > > > > I've not looked at this for a while but it may provide the tear down > > > > > needed > > > > > if the right dance is used. I think that will work if your convertor > > > > > driver is using services off the IP core driver and someone unbinds > > > > > that ip-core driver. > > > > > > > > > > > > > Yes, CCF does it with refcounting and some dummy clock ops for the case a > > > > clock > > > > provider is gone (with active consumers). > > > > > > That must be 'exciting' on occasion as no way to know if the clock disappeared > > > or reset and hence the device interface might lock up. > > > > > > > > > > > > > > > > > > > > > > > > 3) This solution does not really scale. There are designs where we have > > > > > > multiple instances of the IP core attached to the client (typically if > > > > > > a converter like this has more than one channel, we have one instance > > > > > > of the core per channel). The way we have it know, that would not me > > > > > > feasable. > > > > > > > > > > > > > > ... > > > > > > > > > > > > > > > > > > > > However, one thing that > > > > > > I'm not sure it will be ideal with having two devices is synchronizing > > > > > > accesses to the IP core. For example, the axi-dac driver (that I also > > > > > > intend to bring upstream as part of this work) also has an userspace > > > > > > interface so we need to be carefull. > > > > > > > > > > I'm a bit lost. I think we'd only expect to see a userspace interface > > > > > on the 'front end' driver. Is there something on the driver handling the > > > > > IP core as well (beyond debug which you can do whatever you like with :) > > > > > > > > > > > > > See below... > > > > > > > > > > But I guess this is no different > > > > > > for devices which are accessed by the in-kernel IIO interface. > > > > > > > > > > > > I'm way more inclined to go with 2). > > > > > > > > > > I'm not sure I fully follow what option 2) is! Perhaps a bit > > > > > of ascii art? > > > > > > > > > > > > > So option 2 is pretty much treating the IP core also as a separate IIO > > > > device. > > > > This is not so visible in axi-adc but on the dac counterpart, the settings > > > > to generate tones (frequency/phase) are on the core. So, this could map to > > > > IIO ABI. > > > > > > Those are of interest to which bit of analog circuitry? The big beyond the > > > front end? If so fine to provide that via the IIO interfaces, but the > > > front end device should proxy them (using the IIO in kernel consumer > > > interfaces > > > - which probably needs some work to make them safe to the provider going away. > > > From below, seems that a more focused interface for this particular device > > > representation may make sense. > > > > > > > Not sure I'm following your question but I'll try... The settings are on the IP > > core beyond the front end but I would say they are of interest of the front end > > device. For instance, we might generate a tone on the IP core (setting phase and > > frequency) but these bits are to be transmitted trough a dac/dds (our front end > > device) analog, digital front end. > > hehe. I avoided opening your email for a few weeks because I thought it might > drag me down a rat hole. Then I find there is just a nice clean reply to > my question that would have taken me 2 mins to read. > Ehehe. nice surprise then :) > That makes complete sense. To my mind only the front end needs to be an IIO > device, but if it's convenient to represent the back end as one we can probably > make that work as well. > > Alright, I'll go then the one IIO device route for v1 (at least start working with that goal :)) so we can see how it looks like... Thanks for your feedback on this! - Nuno Sá