Hi Jonathan

On Thu, 15 Dec 2022 at 12:45, Jonathan Cameron
<Jonathan.Cameron@xxxxxxxxxx> wrote:
>
> On Thu, 15 Dec 2022 11:11:40 +0200
> Laurent Pinchart <laurent.pinchart@xxxxxxxxxxxxxxxx> wrote:
>
> > Hi Ricardo,
> >
> > On Thu, Dec 15, 2022 at 11:08:05AM +0200, Laurent Pinchart wrote:
> > > On Thu, Dec 15, 2022 at 08:59:14AM +0100, Ricardo Ribalda wrote:
> > > > On Thu, 15 Dec 2022 at 02:15, Sergey Senozhatsky wrote:
> > > > >
> > > > > On (22/12/14 14:37), Ricardo Ribalda wrote:
> > > > > [..]
> > > > > > +struct uvc_status_streaming {
> > > > > > +	u8 button;
> > > > > > +} __packed;
> > > > > > +
> > > > > > +struct uvc_status_control {
> > > > > > +	u8 bSelector;
> > > > > > +	u8 bAttribute;
> > > > > > +	u8 bValue[11];
> > > > > > +} __packed;
> > > > > > +
> > > > > > +struct uvc_status {
> > > > > > +	u8 bStatusType;
> > > > > > +	u8 bOriginator;
> > > > > > +	u8 bEvent;
> > > > > > +	union {
> > > > > > +		struct uvc_status_control control;
> > > > > > +		struct uvc_status_streaming streaming;
> > > > > > +	};
> > > > > > +} __packed;
> > > > > > +
> > > > > >  struct uvc_device {
> > > > > >  	struct usb_device *udev;
> > > > > >  	struct usb_interface *intf;
> > > > > > @@ -559,7 +579,7 @@ struct uvc_device {
> > > > > >  	/* Status Interrupt Endpoint */
> > > > > >  	struct usb_host_endpoint *int_ep;
> > > > > >  	struct urb *int_urb;
> > > > > > -	u8 *status;
> > > > > > +
> > > > > >  	struct input_dev *input;
> > > > > >  	char input_phys[64];
> > > > > >
> > > > > > @@ -572,6 +592,12 @@ struct uvc_device {
> > > > > >  	} async_ctrl;
> > > > > >
> > > > > >  	struct uvc_entity *gpio_unit;
> > > > > > +
> > > > > > +	/*
> > > > > > +	 * Ensure that status is aligned, making it safe to use with
> > > > > > +	 * non-coherent DMA.
> > > > > > +	 */
> > > > > > +	struct uvc_status status __aligned(ARCH_KMALLOC_MINALIGN);
> > > > >
> > > > > ____cacheline_aligned ?
> > > > >
> > > > > I don't see anyone using ARCH_KMALLOC_MINALIGN except for slab.h
> > > >
> > > > Seems like cacheline is not good enough:
> > > >
> > > > https://github.com/torvalds/linux/commit/12c4efe3509b8018e76ea3ebda8227cb53bf5887
> > > > https://lore.kernel.org/all/20220405135758.774016-1-catalin.marinas@xxxxxxx/
> > > >
> > > > and ARCH_KMALLOC_MINALIGN is what we have today and is working...
> > > >
> > > > But yeah, the name for that define is not the nicest :)
> > > >
> > > > I added Jonathan Cameron, on cc, as he had to deal with something
> > > > similar for iio in case we are missing something
> > >
> > > I'd like to get feedback on this from DMA and USB experts. Expanding the
> > > CC list of the original patch would help (especially including the
> > > linux-usb mailing list).
> >
> > Also, do we need the allocation change ? It doesn't seem to simplify the
> > code that much, neither in terms of lines of code
> >
> > > 2 files changed, 48 insertions(+), 49 deletions(-)
> >
> > nor in terms of complexity. Maybe we could keep the union and offsetof
> > changes, and drop the allocation change ? In any case, those are two
> > different changes, so I'd split them in two patches at least.
> >
> > > > ps: and I thought this was an easy change :P
> >
> +CC Catalin who is driving effort to change what we should do here to avoid
> wasting space on systems where ARCH_KMALLOC_MINALIGN is currently 128 bytes.
>
> I don't know the precise requirements for this particular allocation, but
> if it's about ensuring the data doesn't share a cacheline with anything else in
> the structure then the problem is that ____cacheline_aligned is the
> size of a line in the L1 cache. It's not uncommon for microarchitectures to have
> a larger cacheline size for L3 and above.
> Most of the time that doesn't
> matter as they maintain correct coherence (all the ARM servers are fine
> I think - ours has 128 byte cachelines in L3, Fujitsu have parts with
> 256 byte cachelines in L3), but guess what, there are Qualcomm(?) parts where the
> L1 cacheline is 64 bytes, but the L3 cacheline is 128 bytes and which don't
> deal with the hardware coherence issues. For those we need to ensure that
> a DMA safe buffer is in its own 128 byte cacheline, but ____cacheline_aligned
> on arm64 only does 64 bytes. Currently ARCH_KMALLOC_MINALIGN enforces the
> larger guarantee and is available on all architectures unlike
> ARCH_DMA_MINALIGN which is not yet.
>
> Catalin is working to replace this, so the required guarantees may change,
> but we still need something backportable.
>
> When I sent a bunch of fixes for Input Dmitry asked for a general
> ___dma_minalign (naming to be bikeshedded) define. So far there are a few
> subsystems carrying their own local equivalent (IIO moved to
> IIO_DMA_MINALIGN define) in the interests of reducing the pain of
> changing this in future. A central definition is another option.
>

Thanks a lot for the explanation!

> Jonathan
>
>

--
Ricardo Ribalda
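
For anyone following the alignment argument, a minimal userspace C11
sketch (not kernel code) of the layout effect under discussion. It
assumes 128 bytes as the DMA-safe minimum, standing in for
ARCH_KMALLOC_MINALIGN on the affected arm64 systems, and uses C11
alignas in place of the kernel's __aligned(). DEMO_DMA_MINALIGN,
struct demo_status, struct demo_device and demo_buffer_is_isolated()
are hypothetical names chosen for illustration only.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 128 stands in for the DMA-safe minimum alignment. */
#define DEMO_DMA_MINALIGN 128

/* Same field layout as struct uvc_status in the quoted patch. */
struct demo_status {
	uint8_t bStatusType;
	uint8_t bOriginator;
	uint8_t bEvent;
	union {
		struct {
			uint8_t bSelector;
			uint8_t bAttribute;
			uint8_t bValue[11];
		} control;
		struct {
			uint8_t button;
		} streaming;
	};
};

/* Hypothetical device structure: CPU-owned fields first, DMA buffer last. */
struct demo_device {
	void *int_urb;
	char input_phys[64];
	alignas(DEMO_DMA_MINALIGN) struct demo_status status;
};

/*
 * Returns 1 when the buffer starts on a line_size boundary and the
 * structure size is a multiple of line_size. Because status is the last
 * member, the tail padding then fills the rest of its line, provided the
 * structure itself is allocated with at least that alignment.
 */
static inline int demo_buffer_is_isolated(size_t line_size)
{
	return offsetof(struct demo_device, status) % line_size == 0 &&
	       sizeof(struct demo_device) % line_size == 0;
}

/* Compile-time check that the member really starts on the boundary. */
static_assert(offsetof(struct demo_device, status) % DEMO_DMA_MINALIGN == 0,
	      "status must start on a DMA_MINALIGN boundary");
```

With a 64-byte alignment instead, status would still start on an L1
line boundary but could share a 128-byte L3 line with input_phys,
which is the case described above for parts without hardware
coherence.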