On Fri, 2024-05-03 at 09:42 +0100, Jonathan Cameron wrote: > On Fri, 03 May 2024 08:09:29 +0200 > Nuno Sá <noname.nuno@xxxxxxxxx> wrote: > > Resend as the to / cc entries were garbled. No idea why! > > > On Thu, 2024-05-02 at 20:14 +0100, Jonathan Cameron wrote: > > > On Thu, 02 May 2024 13:31:55 +0200 > > > Nuno Sá <noname.nuno@xxxxxxxxx> wrote: > > > > > > > On Mon, 2024-04-29 at 20:40 +0100, Jonathan Cameron wrote: > > > > > On Mon, 29 Apr 2024 13:17:42 +0000 > > > > > "Gradinariu, Ramona" <Ramona.Gradinariu@xxxxxxxxxx> wrote: > > > > > > > > > > > > -----Original Message----- > > > > > > > From: Nuno Sá <noname.nuno@xxxxxxxxx> > > > > > > > Sent: Monday, April 29, 2024 10:59 AM > > > > > > > To: Jonathan Cameron <jic23@xxxxxxxxxx>; Ramona Gradinariu > > > > > > > <ramona.bolboaca13@xxxxxxxxx> > > > > > > > Cc: linux-kernel@xxxxxxxxxxxxxxx; linux-iio@xxxxxxxxxxxxxxx; > > > > > > > linux- > > > > > > > doc@xxxxxxxxxxxxxxx; devicetree@xxxxxxxxxxxxxxx; corbet@xxxxxxx; > > > > > > > conor+dt@xxxxxxxxxx; krzysztof.kozlowski+dt@xxxxxxxxxx; > > > > > > > robh@xxxxxxxxxx; > > > > > > > Gradinariu, Ramona <Ramona.Gradinariu@xxxxxxxxxx>; Sa, Nuno > > > > > > > <Nuno.Sa@xxxxxxxxxx> > > > > > > > Subject: Re: [PATCH 4/5] iio: adis16480: add support for > > > > > > > adis16545/7 > > > > > > > families > > > > > > > > > > > > > > [External] > > > > > > > > > > > > > > On Sun, 2024-04-28 at 16:25 +0100, Jonathan Cameron wrote: > > > > > > > > On Tue, 23 Apr 2024 11:42:09 +0300 > > > > > > > > Ramona Gradinariu <ramona.bolboaca13@xxxxxxxxx> wrote: > > > > > > > > > > > > > > > > > The ADIS16545 and ADIS16547 are a complete inertial system > > > > > > > > > that > > > > > > > > > includes a triaxis gyroscope and a triaxis accelerometer. > > > > > > > > > The serial peripheral interface (SPI) and register structure > > > > > > > > > provide a > > > > > > > > > simple interface for data collection and configuration > > > > > > > > > control. > > > > > > > > > > > > > > > > > > These devices are similar to the ones already supported in the > > > > > > > > > driver, > > > > > > > > > with changes in the scales, timings and the max spi speed in > > > > > > > > > burst > > > > > > > > > mode. > > > > > > > > > Also, they support delta angle and delta velocity readings in > > > > > > > > > burst > > > > > > > > > mode, for which support was added in the trigger handler. > > > > > > > > > > > > > > > > > > Signed-off-by: Nuno Sá <nuno.sa@xxxxxxxxxx> > > > > > > > > > > > > > > > > What is Nuno's relationship to this patch? You are author and > > > > > > > > the > > > > > > > > sender > > > > > > > > which is fine, but in that case you need to make Nuno's > > > > > > > > involvement > > > > > > > > explicit. > > > > > > > > Perhaps a Co-developed-by or similar is appropriate? > > > > > > > > > > > > > > > > > Signed-off-by: Ramona Gradinariu > > > > > > > > > <ramona.gradinariu@xxxxxxxxxx> > > > > > > > > A few comments inline. Biggest one is I'd like a clear > > > > > > > > statement of > > > > > > > > why you > > > > > > > > can't do a burst of one type, then a burst of other. My guess > > > > > > > > is > > > > > > > > that the > > > > > > > > transition is very time consuming? If so, that is fine, but you > > > > > > > > should be > > > > > > > > able > > > > > > > > to let available_scan_masks handle the disjoint channel > > > > > > > > sets. > > > > > > > > > > > > > > Yeah, the burst message is a special spi transfer that brings you > > > > > > > all > > > > > > > of the > > > > > > > channels data at once but for the accel/gyro you need to > > > > > > > explicitly > > > > > > > configure > > > > > > > the chip to either give you the "normal vs "delta" readings. Re- > > > > > > > configuring the > > > > > > > chip and then do another burst would destroy performance I think. > > > > > > > We > > > > > > > could > > > > > > > do > > > > > > > the manual readings as we do in adis16475 for chips not supporting > > > > > > > burst32. > > > > > > > But > > > > > > > in the burst32 case those manual readings should be minimal while > > > > > > > in > > > > > > > here we > > > > > > > could have to do 6 of them which could also be very time > > > > > > > consuming... > > > > > > > > > > > > > > Now, why we don't use available_scan_masks is something I can't > > > > > > > really > > > > > > > remember > > > > > > > but this implementation goes in line with what we have in the > > > > > > > adis16475 > > > > > > > driver. > > > > > > > > > > > > > > - Nuno Sá > > > > > > > > > > > > > > > > > > > Thank you Nuno for all the additional explanations. > > > > > > Regarding the use of available_scan_masks, the idea is to have any > > > > > > possible > > > > > > combination for accel, gyro, temp and timestamp channels or delta > > > > > > angle, > > > > > > delta > > > > > > velocity, temp and timestamp channels. There are a lot of > > > > > > combinations > > > > > > for > > > > > > this and it does not seem like a good idea to write them all > > > > > > manually. > > > > > > That is > > > > > > why adis16480_update_scan_mode is used for checking the correct > > > > > > channels > > > > > > selection. > > > > > > > > > > If you are using bursts, the data is getting read anyway - which is > > > > > the > > > > > main > > > > > cost here. The real question becomes what are you actually saving by > > > > > supporting all > > > > > the combinations of the the two subsets of channels in the pollfunc? > > > > > Currently you have to pick the channels out and repack them, if > > > > > pushing > > > > > them all > > > > > looks to me like a mempcy and a single value being set > > > > > (unconditionally). > > > > > > > > > Then it's a question of what the overhead of the channel demux in the > > > > > core > > > > > is. > > > > > What you pass out of the driver via iio_push_to_buffers*() > > > > > is not what ends up in the buffer if you allow the IIO core to do data > > > > > demuxing > > > > > for you - that is enabled by providing available_scan_masks. At > > > > > buffer > > > > > start up the demux code computes a fairly optimal set of copies to > > > > > repack > > > > > the incoming data to match with what channels the consumer (here > > > > > probably > > > > > the kfifo on the way to userspace) is expecting. > > > > > > > > > > That demux adds a small overhead but it should be small as long > > > > > as the channels wanted aren't pathological (i.e. every other one). > > > > > > > > > > Advantage is the driver ends up simpler and in the common case of turn > > > > > on all the channels (why else did you buy a device with those > > > > > measurements > > > > > if you didn't want them!) the demux is zerocopy so effectively free > > > > > which > > > > > is not going to be the case for the bitmap walk and element copy in > > > > > the > > > > > driver. > > > > > > > > > > > > > Maybe my younger me was smarter but reading again the validation of the > > > > scan > > > > mask > > > > code (when available_scan_masks is available), I'm not sure why we're > > > > not > > > > using them. > > > > I think that having one mask with delta values + temperature and another > > > > one > > > > with > > > > normal + temperature would be enough for what we want in here. The code > > > > in > > > > adis16480_update_scan_mode() could then be simpler I think. > > > > > > > > Now, what I'm still not following is the straight memcpy(). I may be > > > > missing > > > > something but the demux code only appears to kick in when we have > > > > compound > > > > masks > > > > resulting of multiple buffers being enabled. So I'm not seeing how we > > > > can > > > > get away > > > > without picking the channels and place them correctly in the buffer > > > > passed > > > > to IIO? > > > > > > It runs whenever the enabled mask requested (the channels that are > > > enabled) is > > > different from the active_scan_mask. It only sends channels in one > > > direction if there is only one user but it only sends the ones enabled by > > > that > > > consumer. > > > It's called unconditionally from iio_enable_buffers() > > > > > > That iterates over all enabled buffers (often there is only 1) > > > > > > and then checks if the active scan mask is the same as the one for this > > > buffer. > > > https://elixir.bootlin.com/linux/v6.9-rc6/source/drivers/iio/industrialio-buffer.c#L1006 > > > > > > The setup for whether to find a superset from available_scan_masks is here > > > https://elixir.bootlin.com/linux/v6.9-rc6/source/drivers/iio/industrialio-buffer.c#L928 > > > > > > Key is that if it happens to match, then we don't actually do anything in > > > the > > > demux. > > > It just gets passed straight on through. That does the fast path you > > > mention > > > below. > > > > Ahh got it... May failure was not realizing that iio_scan_mask_match() > > returns > > the available masks so I was assuming that the bitmap_equal() check would > > only > > differ when multiple buffers are enabled. > > > > Oh well, I think then this should work... I'm not sure it will be more > > performant for the case where we don't enable all the channels because the > > demux > > is a linked list which is far from being a performance friend (maybe we can > > do > > some trials with something like xarray...). Still, for this to work the > > buffer > > we pass into IIO has to be bigger than it needs to be (so bigger memset()) > > because, AFAIU, we will have to pass on all the scan elements and, as I > > said, > > the burst data will either have delta or normal measurements (not both). I > > guess > > the code will still look simpler so if there's no visible difference in > > performance I'm fine with the change. I guess Ramona can give it a try to > > see > > how it looks like. > > Would be interesting to look at the performance of that code, but the > real question is what channels do users typically enabled. If they enabled > a full set (and I suspect most do) then that code doesn't nothing at all. > The only channel that makes me doubt is the temperature one but yeah, I would also expect most users just enable them all... > Original thinking was that the non common case didn't really matter much. > If it is worth optimizing I'd suggest a double pass (or cheat - see later) > 1st pass works out number of elements, 2nd just saves them in a nice > linear array. It's typically small however, so maybe just allocate a linear > array big enough for the worst case. > Yeah, linear array should also be fine and likely simpler. Maybe if I'm bored at some point I'll run some experiments :) - Nuno Sá