Re: adi-axi-adc issues and how to properly support this designs

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, 18 Apr 2023 15:36:35 +0200
Nuno Sá <noname.nuno@xxxxxxxxx> wrote:

> On Sun, 2023-04-16 at 14:33 +0100, Jonathan Cameron wrote:
> > On Fri, 14 Apr 2023 11:11:43 +0200
> > Nuno Sá <noname.nuno@xxxxxxxxx> wrote:
> >   
> > > Hi Jonathan,
> > > 
> > > Thanks for the feedback. Definitely helpful...
> > > 
> > > On Mon, 2023-04-10 at 13:21 +0100, Jonathan Cameron wrote:  
> > > > On Fri, 31 Mar 2023 16:40:44 +0200
> > > > Nuno Sá <noname.nuno@xxxxxxxxx> wrote:
> > > >     
> > > > > Hi Jonathan,
> > > > >     
> > > > 
> > > > Hmm. Complex topic, so I'd definitely like others to weigh in with
> > > > opinions .
> > > >     
> > > > > There are some major issues with the implementation we have upstream
> > > > > for the adi-axi-adc driver. The only user is the ad9467 driver. In the
> > > > > rest of the email I will refer to ad9467 as the converter device and
> > > > > axi-adc as the IP core. 
> > > > > 
> > > > > Let me start to state that these designs, in a very a basic way, have a
> > > > > converter connected to a IP core that typically lives in a FPGA. This
> > > > > core connects to a DMA core (which is used by the DMA buffer
> > > > > implementation) so we can keep up with the high rates of these
> > > > > converters. So there's a link between these devices and we try to
> > > > > implement that so far looking at them as one single IIO device.
> > > > > 
> > > > > Let's go to the major issues now:
> > > > > 
> > > > > 1) There is a circular dependency between the two device. And when
> > > > > compiled as modules for example, we cannot really rmmod the modules
> > > > > anymore:
> > > > > 
> > > > > "root@analog:~# rmmod ad9467
> > > > > rmmod: ERROR: Module ad9467 is in use
> > > > > 
> > > > > root@analog:~# rmmod adi-axi-adc.ko
> > > > > rmmod: ERROR: Module adi_axi_adc is in use by: ad9467
> > > > > "
> > > > > 
> > > > > This easy to understand as the ad9467 driver as a direct symbol
> > > > > dependency on the axi-adc one. And this one calls 'module_get()' as
> > > > > soon as it attaches to a "client"    
> > > > 
> > > > Ouch. module_get() never works for this.  Long time ago I thought it
> > > > did :(  An unbind will bypass that (as well any real hot unplug paths).
> > > >     
> > > 
> > > Yeps, it bypasses it but I just wanted to point out the flaw in the current
> > > design :)  
> > 
> > It was a surprise to me when Lars pointed this out years ago.
> > All we can do is treat it as a hint for something that makes
> > little sense for a user to deliberately do.
> >   
> > >   
> > > >     
> > > > > 
> > > > > 2) It's fairly easy to crash the kernel:
> > > > > 
> > > > > "
> > > > > root@analog:/sys/bus/spi/drivers/ad9467# echo spi0.0 > unbind
> > > > > root@analog:/sys/bus/iio/devices/iio:device2# echo 1 >
> > > > > buffer0/in_voltage0_en
> > > > > root@analog:/sys/bus/iio/devices/iio:device2# echo 1 > buffer0/enable
> > > > > [  132.349133] 8<--- cut here ---
> > > > > [  132.352193] Unable to handle kernel paging request at virtual
> > > > > address e0940000 when read
> > > > > [  132.360333] [e0940000] *pgd=0208b811, *pte=00000000, *ppte=00000000
> > > > > [  132.366668] Internal error: Oops: 7 [#1] SMP ARM
> > > > > [  132.371289] Modules linked in: ad9467 adi_axi_adc ad9517
> > > > > [  132.376609] CPU: 1 PID: 444 Comm: bash Not tainted 6.2.9-dirty #3
> > > > > [  132.382701] Hardware name: Xilinx Zynq Platform
> > > > > [  132.387223] PC is at adi_axi_adc_update_scan_mode+0x34/0x88
> > > > > [adi_axi_adc]
> > > > > [  132.394020] LR is at arm_heavy_mb+0x1c/0x38
> > > > > [  132.398212] pc : [<bf0060c4>]    lr : [<c031820c>]    psr: a0000013
> > > > > [  132.404392] sp : e0929e30  ip : deaddead  fp : c4430270
> > > > > [  132.409678] r10: c8a0bc18  r9 : deaddeac  r8 : c89b1c00
> > > > > [  132.414895] r7 : c4430340  r6 : c45db7c0  r5 : e093ffc0  r4 :
> > > > > 000003f0
> > > > > [  132.421413] r3 : 00010000  r2 : e0940000  r1 : 00000000  r0 :
> > > > > 00000000
> > > > > "    
> > > > 
> > > > It's nasty but generally that sort of path can be prevented with some
> > > > careful locking and checking of indicators that a component has gone away.
> > > > So whenever you unbind one aspect it notifies the other one that it has
> > > > gone
> > > > away. I'm not sure where to currently look for best practice in this.
> > > > 
> > > > There are a lot of similar situations - basically anywhere a set of 
> > > > drivers end up hanging off two buses.  v4l and media drivers in general
> > > > end up doing this a lot.
> > > > 
> > > > One I'm familiar with is how some of the CXL tear down happens and
> > > > that lead me to device_release_driver() which is also used by usb etc.
> > > > I've not looked at this for a while but it may provide the tear down
> > > > needed
> > > > if the right dance is used.  I think that will work if your convertor
> > > > driver is using services off the IP core driver and someone unbinds
> > > > that ip-core driver.
> > > >     
> > > 
> > > Yes, CCF does it with refcounting and some dummy clock ops for the case a
> > > clock
> > > provider is gone (with active consumers).  
> > 
> > That must be 'exciting' on occasion as no way to know if the clock disappeared
> > or reset and hence the device interface might lock up. 
> >   
> > >   
> > > >     
> > > > > 
> > > > > 3) This solution does not really scale. There are designs where we have
> > > > > multiple instances of the IP core attached to the client (typically if
> > > > > a converter like this has more than one channel, we have one instance
> > > > > of the core per channel). The way we have it know, that would not me
> > > > > feasable.
> > > > >     
> > > 
> > > ...
> > >   
> > > > 
> > > >     
> > > > > However, one thing that
> > > > > I'm not sure it will be ideal with having two devices is synchronizing
> > > > > accesses to the IP core. For example, the axi-dac driver (that I also
> > > > > intend to bring upstream as part of this work) also has an userspace
> > > > > interface so we need to be carefull.    
> > > > 
> > > > I'm a bit lost. I think we'd only expect to see a userspace interface
> > > > on the 'front end' driver. Is there something on the driver handling the
> > > > IP core as well (beyond debug which you can do whatever you like with :)
> > > >     
> > > 
> > > See below... 
> > >   
> > > > > But I guess this is no different
> > > > > for devices which are accessed by the in-kernel IIO interface.
> > > > > 
> > > > > I'm way more inclined to go with 2).    
> > > > 
> > > > I'm not sure I fully follow what option 2) is!  Perhaps a bit
> > > > of ascii art?
> > > >     
> > > 
> > > So option 2 is pretty much treating the IP core also as a separate IIO
> > > device.
> > > This is not so visible in axi-adc but on the dac counterpart, the settings
> > > to generate tones (frequency/phase) are on the core. So, this could map to
> > > IIO ABI.  
> > 
> > Those are of interest to which bit of analog circuitry?  The big beyond the
> > front end?  If so fine to provide that via the IIO interfaces, but the 
> > front end device should proxy them (using the IIO in kernel consumer
> > interfaces
> > - which probably needs some work to make them safe to the provider going away.
> > From below, seems that a more focused interface for this particular device
> > representation may make sense.
> >   
> 
> Not sure I'm following your question but I'll try... The settings are on the IP
> core beyond the front end but I would say they are of interest of the front end
> device. For instance, we might generate a tone on the IP core (setting phase and
> frequency) but these bits are to be transmitted trough a dac/dds (our front end
> device) analog, digital front end.

hehe. I avoided opening your email for a few weeks because I thought it might
drag me down a rat hole.  Then I find there is just a nice clean reply to
my question that would have taken me 2 mins to read.

That makes complete sense.  To my mind only the front end needs to be an IIO
device, but if it's convenient to represent the back end as one we can probably
make that work as well.

Jonathan

> 
> - Nuno Sá
> 
> >   





[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Input]     [Linux Kernel]     [Linux SCSI]     [X.org]

  Powered by Linux