On Fri, 03 May 2024 08:09:29 +0200 Nuno Sá <noname.nuno@xxxxxxxxx> wrote: Resend as the to / cc entries were garbled. No idea why! > On Thu, 2024-05-02 at 20:14 +0100, Jonathan Cameron wrote: > > On Thu, 02 May 2024 13:31:55 +0200 > > Nuno Sá <noname.nuno@xxxxxxxxx> wrote: > > > > > On Mon, 2024-04-29 at 20:40 +0100, Jonathan Cameron wrote: > > > > On Mon, 29 Apr 2024 13:17:42 +0000 > > > > "Gradinariu, Ramona" <Ramona.Gradinariu@xxxxxxxxxx> wrote: > > > > > > > > > > -----Original Message----- > > > > > > From: Nuno Sá <noname.nuno@xxxxxxxxx> > > > > > > Sent: Monday, April 29, 2024 10:59 AM > > > > > > To: Jonathan Cameron <jic23@xxxxxxxxxx>; Ramona Gradinariu > > > > > > <ramona.bolboaca13@xxxxxxxxx> > > > > > > Cc: linux-kernel@xxxxxxxxxxxxxxx; linux-iio@xxxxxxxxxxxxxxx; linux- > > > > > > doc@xxxxxxxxxxxxxxx; devicetree@xxxxxxxxxxxxxxx; corbet@xxxxxxx; > > > > > > conor+dt@xxxxxxxxxx; krzysztof.kozlowski+dt@xxxxxxxxxx; > > > > > > robh@xxxxxxxxxx; > > > > > > Gradinariu, Ramona <Ramona.Gradinariu@xxxxxxxxxx>; Sa, Nuno > > > > > > <Nuno.Sa@xxxxxxxxxx> > > > > > > Subject: Re: [PATCH 4/5] iio: adis16480: add support for adis16545/7 > > > > > > families > > > > > > > > > > > > [External] > > > > > > > > > > > > On Sun, 2024-04-28 at 16:25 +0100, Jonathan Cameron wrote: > > > > > > > On Tue, 23 Apr 2024 11:42:09 +0300 > > > > > > > Ramona Gradinariu <ramona.bolboaca13@xxxxxxxxx> wrote: > > > > > > > > > > > > > > > The ADIS16545 and ADIS16547 are a complete inertial system that > > > > > > > > includes a triaxis gyroscope and a triaxis accelerometer. > > > > > > > > The serial peripheral interface (SPI) and register structure > > > > > > > > provide a > > > > > > > > simple interface for data collection and configuration control. > > > > > > > > > > > > > > > > These devices are similar to the ones already supported in the > > > > > > > > driver, > > > > > > > > with changes in the scales, timings and the max spi speed in burst > > > > > > > > mode. > > > > > > > > Also, they support delta angle and delta velocity readings in > > > > > > > > burst > > > > > > > > mode, for which support was added in the trigger handler. > > > > > > > > > > > > > > > > Signed-off-by: Nuno Sá <nuno.sa@xxxxxxxxxx> > > > > > > > > > > > > > > What is Nuno's relationship to this patch? You are author and the > > > > > > > sender > > > > > > > which is fine, but in that case you need to make Nuno's involvement > > > > > > > explicit. > > > > > > > Perhaps a Co-developed-by or similar is appropriate? > > > > > > > > > > > > > > > Signed-off-by: Ramona Gradinariu <ramona.gradinariu@xxxxxxxxxx> > > > > > > > A few comments inline. Biggest one is I'd like a clear statement of > > > > > > > why you > > > > > > > can't do a burst of one type, then a burst of other. My guess is > > > > > > > that the > > > > > > > transition is very time consuming? If so, that is fine, but you > > > > > > > should be > > > > > > > able > > > > > > > to let available_scan_masks handle the disjoint channel sets. > > > > > > > > > > > > Yeah, the burst message is a special spi transfer that brings you all > > > > > > of the > > > > > > channels data at once but for the accel/gyro you need to explicitly > > > > > > configure > > > > > > the chip to either give you the "normal vs "delta" readings. Re- > > > > > > configuring the > > > > > > chip and then do another burst would destroy performance I think. We > > > > > > could > > > > > > do > > > > > > the manual readings as we do in adis16475 for chips not supporting > > > > > > burst32. > > > > > > But > > > > > > in the burst32 case those manual readings should be minimal while in > > > > > > here we > > > > > > could have to do 6 of them which could also be very time consuming... > > > > > > > > > > > > Now, why we don't use available_scan_masks is something I can't really > > > > > > remember > > > > > > but this implementation goes in line with what we have in the > > > > > > adis16475 > > > > > > driver. > > > > > > > > > > > > - Nuno Sá > > > > > > > > > > > > > > > > Thank you Nuno for all the additional explanations. > > > > > Regarding the use of available_scan_masks, the idea is to have any > > > > > possible > > > > > combination for accel, gyro, temp and timestamp channels or delta angle, > > > > > delta > > > > > velocity, temp and timestamp channels. There are a lot of combinations > > > > > for > > > > > this and it does not seem like a good idea to write them all manually. > > > > > That is > > > > > why adis16480_update_scan_mode is used for checking the correct channels > > > > > selection. > > > > > > > > If you are using bursts, the data is getting read anyway - which is the > > > > main > > > > cost here. The real question becomes what are you actually saving by > > > > supporting all > > > > the combinations of the the two subsets of channels in the pollfunc? > > > > Currently you have to pick the channels out and repack them, if pushing > > > > them all > > > > looks to me like a mempcy and a single value being set (unconditionally). > > > > > > > Then it's a question of what the overhead of the channel demux in the core > > > > is. > > > > What you pass out of the driver via iio_push_to_buffers*() > > > > is not what ends up in the buffer if you allow the IIO core to do data > > > > demuxing > > > > for you - that is enabled by providing available_scan_masks. At buffer > > > > start up the demux code computes a fairly optimal set of copies to repack > > > > the incoming data to match with what channels the consumer (here probably > > > > the kfifo on the way to userspace) is expecting. > > > > > > > > That demux adds a small overhead but it should be small as long > > > > as the channels wanted aren't pathological (i.e. every other one). > > > > > > > > Advantage is the driver ends up simpler and in the common case of turn > > > > on all the channels (why else did you buy a device with those measurements > > > > if you didn't want them!) the demux is zerocopy so effectively free which > > > > is not going to be the case for the bitmap walk and element copy in the > > > > driver. > > > > > > > > > > Maybe my younger me was smarter but reading again the validation of the scan > > > mask > > > code (when available_scan_masks is available), I'm not sure why we're not > > > using them. > > > I think that having one mask with delta values + temperature and another one > > > with > > > normal + temperature would be enough for what we want in here. The code in > > > adis16480_update_scan_mode() could then be simpler I think. > > > > > > Now, what I'm still not following is the straight memcpy(). I may be missing > > > something but the demux code only appears to kick in when we have compound > > > masks > > > resulting of multiple buffers being enabled. So I'm not seeing how we can > > > get away > > > without picking the channels and place them correctly in the buffer passed > > > to IIO? > > > > It runs whenever the enabled mask requested (the channels that are enabled) is > > different from the active_scan_mask. It only sends channels in one > > direction if there is only one user but it only sends the ones enabled by that > > consumer. > > It's called unconditionally from iio_enable_buffers() > > > > That iterates over all enabled buffers (often there is only 1) > > > > and then checks if the active scan mask is the same as the one for this > > buffer. > > https://elixir.bootlin.com/linux/v6.9-rc6/source/drivers/iio/industrialio-buffer.c#L1006 > > > > The setup for whether to find a superset from available_scan_masks is here > > https://elixir.bootlin.com/linux/v6.9-rc6/source/drivers/iio/industrialio-buffer.c#L928 > > > > Key is that if it happens to match, then we don't actually do anything in the > > demux. > > It just gets passed straight on through. That does the fast path you mention > > below. > > Ahh got it... May failure was not realizing that iio_scan_mask_match() returns > the available masks so I was assuming that the bitmap_equal() check would only > differ when multiple buffers are enabled. > > Oh well, I think then this should work... I'm not sure it will be more > performant for the case where we don't enable all the channels because the demux > is a linked list which is far from being a performance friend (maybe we can do > some trials with something like xarray...). Still, for this to work the buffer > we pass into IIO has to be bigger than it needs to be (so bigger memset()) > because, AFAIU, we will have to pass on all the scan elements and, as I said, > the burst data will either have delta or normal measurements (not both). I guess > the code will still look simpler so if there's no visible difference in > performance I'm fine with the change. I guess Ramona can give it a try to see > how it looks like. Would be interesting to look at the performance of that code, but the real question is what channels do users typically enabled. If they enabled a full set (and I suspect most do) then that code doesn't nothing at all. Original thinking was that the non common case didn't really matter much. If it is worth optimizing I'd suggest a double pass (or cheat - see later) 1st pass works out number of elements, 2nd just saves them in a nice linear array. It's typically small however, so maybe just allocate a linear array big enough for the worst case. Jonathan > > - Nuno Sá > > > >