Hello Linux experts,

I'm almost sure my understanding is correct, but I hope someone can confirm it.

Suppose a user program wants to pass a device some data structures that are linked to each other by pointers, and the device processes the data using DMA (meaning it accesses memory on its own). Here is a simple example: I want to add the corresponding elements of arrays a and b and store the results in array c, and I pass the driver the address of a struct that contains the addresses of the input and output arrays. (This code was written just for this question and may have some bugs, but it should be enough to illustrate it.)

    struct _myargs {
        int *a;  // input pointer
        int *b;  // input pointer
        int *c;  // output pointer
    } myargs;

    int main()
    {
        int *a, *b, *c;

        a = malloc(100*sizeof(int));
        b = malloc(100*sizeof(int));
        c = malloc(100*sizeof(int));
        ... fill inputs ...
        myargs.a = a;
        myargs.b = b;
        myargs.c = c;
        ...
        ioctl(fd, AddIt, &myargs);
        ...
    }

Then, in the driver, I allocate the input and output buffers in kernel space and use copy_from_user to copy the input data into the kernel-space input buffers (and later use copy_to_user for buffer c).

    // dev:  the struct device of the hardware (not shown)
    // regs: ioremap'ed register base of the device (not shown)
    static long my_ioctl(struct file *file, unsigned int cmd, unsigned long args_usr)
    {
        struct _myargs *args_ker;
        dma_addr_t args_dev;
        int *a_ker, *b_ker, *c_ker;
        dma_addr_t a_dev, b_dev, c_dev;

        args_ker = dma_alloc_coherent(dev, sizeof(struct _myargs), &args_dev, GFP_KERNEL);
        switch (cmd) {
        case AddIt:
            copy_from_user(args_ker, (void __user *)args_usr, sizeof(struct _myargs));
            a_ker = dma_alloc_coherent(dev, 100*sizeof(int), &a_dev, GFP_KERNEL);
            b_ker = dma_alloc_coherent(dev, 100*sizeof(int), &b_dev, GFP_KERNEL);
            c_ker = dma_alloc_coherent(dev, 100*sizeof(int), &c_dev, GFP_KERNEL);
            // args_ker->a and args_ker->b still hold the user-space pointers here
            copy_from_user(a_ker, (void __user *)args_ker->a, 100*sizeof(int));
            copy_from_user(b_ker, (void __user *)args_ker->b, 100*sizeof(int));
            // from here on the struct holds device (DMA) addresses, not CPU pointers
            args_ker->a = (int *)(uintptr_t)a_dev;
            args_ker->b = (int *)(uintptr_t)b_dev;
            args_ker->c = (int *)(uintptr_t)c_dev;
            writel(args_dev, regs + REG_ARGS);  // set args (assumes a 32-bit DMA address)
            writel(1, regs + REG_TRIGGER);      // trigger hardware
            ...
        }
    }

So arrays a and b have been copied to the kernel-space buffers a_ker and b_ker, but in the args struct that I pass to the device I use the I/O virtual addresses (dma_addr_t) a_dev, b_dev and c_dev. I want the device hardware to read the args struct from memory and use the addresses stored in it to do the array addition.

My understanding is that the IOMMU for the device (if correctly set up and initialized) will translate these I/O virtual addresses to physical (bus) addresses, because the IOMMU subsystem will have set up the page tables for both the MMU (for kernel virtual addresses) and the IOMMU (for I/O virtual addresses), so that both mappings of each buffer point to the same physical (or bus) address.

I'm almost sure this will work, but I can't test it right now. Is my understanding correct? Could someone well versed in Linux drivers confirm that this approach is correct?
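To make my mental model concrete, here is a minimal user-space sketch of it. It is not driver code and uses no kernel API: every name in it (toy_phys_mem, toy_page_table, toy_translate, toy_device_add, toy_run) and the single-level page-table layout are made up for illustration. It only shows the idea I am asking about: the CPU and the device reach the same physical pages through two different tables and two different addresses, and the args block handed to the device contains device-side addresses only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SIZE  4096u                   /* bytes per page */
#define TOY_PAGE_WORDS (TOY_PAGE_SIZE / 4u)    /* 32-bit words per page */
#define TOY_NPAGES     8u

/* Toy "physical memory", shared by the CPU side and the device side. */
static uint32_t toy_phys_mem[TOY_NPAGES * TOY_PAGE_WORDS];

/* Hypothetical single-level page table: index = virtual page, value = physical page. */
struct toy_page_table { uint32_t ppn[TOY_NPAGES]; };

/* Args block as the device sees it: I/O virtual addresses, not CPU pointers. */
struct toy_dev_args { uint32_t a, b, c; };

/* Translate a (word-aligned) virtual address through the given table. */
static void *toy_translate(const struct toy_page_table *pt, uint32_t addr)
{
    uint32_t vpn = addr / TOY_PAGE_SIZE;
    uint32_t off = addr % TOY_PAGE_SIZE;

    assert(vpn < TOY_NPAGES && pt->ppn[vpn] < TOY_NPAGES && off % 4u == 0);
    return &toy_phys_mem[pt->ppn[vpn] * TOY_PAGE_WORDS + off / 4u];
}

/* "Device": every access goes through the IOMMU table, never the CPU table. */
static void toy_device_add(const struct toy_page_table *iommu,
                           uint32_t args_iova, size_t n)
{
    const struct toy_dev_args *args = toy_translate(iommu, args_iova);
    const int32_t *a = toy_translate(iommu, args->a);
    const int32_t *b = toy_translate(iommu, args->b);
    int32_t *c = toy_translate(iommu, args->c);

    for (size_t i = 0; i < n; i++)
        c[i] = a[i] + b[i];
}

/* "Driver": fills the inputs through the CPU table, hands IOVAs to the device,
 * then reads element idx of the result back through the CPU table. */
static int32_t toy_run(size_t n, size_t idx)
{
    /* Physical pages: 4 = args, 5 = a, 6 = b, 7 = c.
     * The CPU sees them at virtual pages 0..3, the device at I/O pages 3..0. */
    const struct toy_page_table cpu   = { .ppn = { 4, 5, 6, 7 } };
    const struct toy_page_table iommu = { .ppn = { 7, 6, 5, 4 } };
    const uint32_t args_cpu = 0 * TOY_PAGE_SIZE, args_iova = 3 * TOY_PAGE_SIZE;
    const uint32_t a_cpu    = 1 * TOY_PAGE_SIZE, a_iova    = 2 * TOY_PAGE_SIZE;
    const uint32_t b_cpu    = 2 * TOY_PAGE_SIZE, b_iova    = 1 * TOY_PAGE_SIZE;
    const uint32_t c_cpu    = 3 * TOY_PAGE_SIZE, c_iova    = 0 * TOY_PAGE_SIZE;

    assert(n <= TOY_PAGE_WORDS && idx < n);

    /* Fill the inputs through the CPU mapping (stands in for copy_from_user). */
    int32_t *a = toy_translate(&cpu, a_cpu);
    int32_t *b = toy_translate(&cpu, b_cpu);
    for (size_t i = 0; i < n; i++) {
        a[i] = (int32_t)i;
        b[i] = 2 * (int32_t)i;
    }

    /* Write device-side addresses into the args block via the CPU mapping. */
    struct toy_dev_args *args = toy_translate(&cpu, args_cpu);
    args->a = a_iova;
    args->b = b_iova;
    args->c = c_iova;

    /* "Trigger the hardware", passing only the IOVA of the args block. */
    toy_device_add(&iommu, args_iova, n);

    /* Read the result through the CPU mapping (stands in for copy_to_user). */
    const int32_t *c = toy_translate(&cpu, c_cpu);
    return c[idx];
}
```

For example, toy_run(100, 99) fills a[i] = i and b[i] = 2*i through the CPU table and returns 297, the c[99] that the "device" wrote through the IOMMU table. In the real driver I assume dma_alloc_coherent plays the role of setting up both tables for each buffer, which is exactly the part I would like confirmed.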